1. Introduction
In Matthew 6:3, Jesus says, “Do not let your left hand know what your right hand is doing.” But is it unacceptable to “let our co-authors know when our article is highly cited” to share our joy on Christmas? It is natural to feel happy and excited when our article receives numerous citations. This is because, although the quality of research is not directly proportional to the number of citations, a high citation count is considered an important indicator of the relative impact of research. Furthermore, citations are used to calculate the impact factor (IF) of a journal. Most researchers, including our team, aspire to have their work published in top-tier IF journals such as the BMJ because, despite ongoing debates, the IF generally reflects the relative impact of the published articles on the field.
Of course, it is unlikely that top journals achieve a high IF because editors deliberately select articles based on potential citations. In fact, a previous cohort study reported that even editors struggle to predict potential citations.[1] What, then, makes an article a breakthrough that becomes widely popular and highly cited? To answer this question, several studies have investigated factors associated with high citations.[2-12] However, most of these studies have been limited to non-medical fields or specific medical subspecialties. Furthermore, they have predominantly focused on the characteristics of the articles themselves, such as topic, page count, study design, and sample size, rather than the human attributes of the editors or authors.
Therefore, we conducted a multinational retrospective cohort study on articles across the full spectrum of medicine to determine how human factors, particularly the quantitative ability and features of editors and corresponding authors, relate to the number of citations these articles receive. This study comprehensively analyzed various factors of corresponding authors and editors to uncover the key factors that lead to higher citation counts, which in turn provides insightful guidance for achieving high citations.
2. Methods
Our study aimed to investigate the association between quantitative characteristics of editors or corresponding authors and the number of citations each article received. To avoid bias when analyzing journals with varying impact factors, we selected one journal (PLOS Medicine) where information about the editor and corresponding author is publicly available and then analyzed the published articles. To increase the reliability of the study, we also selected only articles published in 2018 and 2019, before the beginning of the COVID-19 pandemic, to exclude the influence of abnormal citation patterns due to the pandemic.[13] For each corresponding author and handling editor, we obtained the following variables from Web of Science: citations of publications, H-index, number of publications, duration of research career, and number of publications from the affiliation. Additionally, we investigated the gross national income per capita of the affiliation country for each corresponding author and editor. These variables were chosen to comprehensively reflect the various factors that can affect the number of citations of an article. This study systematically screened articles (n=428) published in the specified years by searching the Web of Science. Allowing for duplication in editors and corresponding authors, 396 independent articles were ultimately selected for this study, all of which were publicly accessible and included a range of variables that could affect the number of citations an article received.
To reflect the importance of the Christmas date, our study categorized the top 12.24% of all articles (n=49) with the most citations as “super-citations” and the rest as “normal-citations” (n=347). This binary classification is important to distinguish the influence of articles based on the number of citations and to analyze the characteristics of highly cited articles. This categorization was utilized to compare which of the many variables of editors and corresponding authors have a higher association with the super-citations by examining the difference between highly and normally cited articles.
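The binary categorization described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study; the function name `label_super_citations`, the example citation counts, and the tie-breaking rule (ties at the cutoff are resolved by input order) are assumptions made for this sketch.

```python
def label_super_citations(citations, n_super):
    """Label the n_super most-cited articles as 'super' and the rest as 'normal'."""
    # Sort article indices by citation count, highest first.
    # sorted() is stable, so tied articles keep their input order (an assumption).
    order = sorted(range(len(citations)), key=lambda i: citations[i], reverse=True)
    top = set(order[:n_super])
    return ["super" if i in top else "normal" for i in range(len(citations))]


# Hypothetical citation counts for eight articles (not data from the study).
example = [12, 250, 7, 98, 31, 5, 140, 44]
labels = label_super_citations(example, n_super=2)
print(labels)
```

In the study itself, the same rule would be applied with `n_super=49` across the 396 included articles.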
Our study initially performed the Shapiro-Wilk test and D’Agostino’s K2 test to check the normality of each variable.[14] The Shapiro-Wilk test is well suited to small and medium sample sizes and checks whether a variable follows a normal distribution, while D’Agostino’s K2 test evaluates skewness and kurtosis to assess whether the data come from a normal distribution.[14] Since these tests indicated that the variables did not meet normality, we used Spearman’s rank correlation to examine the correlation between the continuous variables. Unlike Pearson’s correlation, which assumes a normal distribution of variables, Spearman’s rank correlation is a non-parametric measure that evaluates how well the relationship between two variables can be described by a monotonic function, and it is therefore suitable for non-normally distributed data.[15] Finally, we performed a Mann-Whitney U test to assess the statistical difference between the two independent samples: the super-citation and normal-citation groups.[16] The Mann-Whitney U test is a non-parametric test used when data do not follow a normal distribution. It evaluates the difference between two groups based on the ranks of the pooled data points and is commonly interpreted as a comparison of medians.[16] The SAS software (version 9.4; SAS Institute, Cary, NC, USA) was used for all statistical analyses, and all tests were two-sided. P ≤ 0.05 was considered statistically significant.
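To make the two rank-based methods concrete, the following minimal Python sketch computes Spearman’s rank correlation coefficient and the Mann-Whitney U statistic using only the standard library. It is an illustration under stated assumptions, not the SAS procedure used in the study: the function names and example values are hypothetical, and the sketch returns only the statistics, not P values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: its rank sum minus the smallest possible rank sum."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical values (not data from the study).
h_index = [10, 25, 40, 55, 70]
article_citations = [3, 8, 20, 15, 90]
rho = spearman_rho(h_index, article_citations)

super_group = [60, 72, 85]        # e.g. editor H-index, super-citation articles
normal_group = [30, 45, 50, 65]   # e.g. editor H-index, normal-citation articles
u_stat = mann_whitney_u(super_group, normal_group)
print(rho, u_stat)
```

Here `u_stat` counts the pairs in which a value from the first group exceeds a value from the second (with ties counted as one half), so it ranges from 0 to the product of the two group sizes. Obtaining a P value additionally requires the exact or normal-approximation distribution of U, which statistical packages provide.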
3. Results
This study analyzed the number of citations for each article and several factors related to corresponding authors and editors to determine which factors play a key role in achieving high citation counts. The baseline characteristics of the 396 articles included in this study indicate that editors had an average number of publication citations of 24,334.62 (standard deviation [SD], 30,121.88), an H-index of 57.18 (SD, 30.65), and an average number of publications of 255.18 (SD, 239.16). Corresponding authors averaged 9,844.40 publication citations (SD, 21,014.37), an H-index of 33.38 (SD, 28.86), and a publication count of 143.91 (SD, 244.10). The baseline characteristics of editors and corresponding authors are presented in Table 1, Table S3, and Table S4.
We examined the distribution of variables for each citation count using histograms (Table S1). Additionally, we assessed the normality of each variable using the Shapiro-Wilk and D’Agostino’s K2 tests. The results of both normality tests indicated that none of the variables followed a normal distribution (P < 0.01, Table S1).
Since all variables followed a non-normal distribution, we examined the association between each variable and the number of citations of each article using Spearman’s rank correlation. The number of citations of an article showed statistically significant positive correlations with the editor’s total citations (correlation coefficient, 0.164), H-index (0.161), and number of affiliated articles (0.148), as well as with the corresponding author’s total citations (0.217), H-index (0.158), and number of articles (0.103). However, given that Spearman correlation coefficients range from -1 to +1, all of these correlations were weak (Fig. 1 and Table 2). Thus, there was no strong association between the total number of citations of an article and the various factors of corresponding authors and editors.
Therefore, we further conducted a binary analysis by categorizing the articles into super-citation and normal-citation groups. Shapiro-Wilk and D’Agostino’s K2 tests were performed on all variables for each citation group. In the super-citation group, both the length of the authors’ research careers and the number of publications from their institutions followed a normal distribution (Table S2). However, since most variables did not follow a normal distribution, we conducted a Mann-Whitney U test between the super-citation and normal-citation groups. Among all variables, only the editors’ H-index differed significantly between the two groups (P = 0.048 in the Mann-Whitney U test).
These findings indicate that the number of citations an article receives is only weakly correlated with the qualifications of its corresponding authors or journal editors; however, the H-index of journal editors is the only factor associated with super-citation articles (Fig. 2 and Table 3).
4. Discussion
This Citation counts of Articles Related to Outcomes and Legacies of authors and journal editors (CAROL) study examined the statistical association between the number of citations of published articles and various factors of corresponding authors and editors. We found that the quantitative characteristics of the corresponding author or editor were only weakly correlated with the total number of citations as a continuous variable. However, a notable finding was that the H-index of the editor was significantly associated with super-citations when the top 12.24% of articles were categorized as super-citations and compared with the remaining normal-citation articles. This suggests that articles managed by handling editors who have maintained consistently high research quality over a long period of time are more likely to become breakthrough and widely popular articles.
Various studies have been conducted to predict the future citations of articles.[2-12] Some studies utilizing machine learning have successfully predicted future citations by analyzing the text of manuscripts.[5, 6] However, most of the previous studies have been limited to non-medical fields or specific subfields within the medical field. Additionally, these studies primarily focused on the characteristics of the articles, such as the topic, research design, and number of pages, rather than the human characteristics of the editors and authors. Given the recent emphasis on patient-centered medical care, we aimed to focus more on the characteristics and capabilities of editors and authors to conduct more human-centered research.[17]
This study was inspired by a landmark cohort study of BMJ editors (n=10) published on Christmas Day 2022. The prior study showed that even editors of leading medical journals like the BMJ struggle to predict which articles will be highly cited. Nonetheless, our study showed that editors with rich research experience are significantly more likely to handle articles that achieve super-citations. At first glance, these two findings may seem contradictory, but this is likely due to differences in the underlying study design. The previous cohort study directly asked editors to predict the future citations of specific articles. Our study differs in that citation potential was not explicitly assessed by editors at the time of editing, and future citations were examined retrospectively. Our design better reflects the real-world situation because, in the actual manuscript review process, editors do not consciously select articles solely based on citation potential but rather comprehensively evaluate the scientific importance of a manuscript based on their accumulated research experience.[1, 18]
The strength of our study lies in its analysis of a large-scale scholar cohort, rigorously examining the influence of corresponding authors’ and editors’ quantitative research capabilities and their affiliations on research impact, based on robust statistical methodologies.
Our study has several limitations. First, most journals did not disclose the editor of each article, limiting the number of journals that we could include in our study. However, restricting the analysis to a single journal had the advantage of controlling for confounding factors, because all articles in our study were published under the same journal impact factor. Second, even if articles are published in the same year, the number of citations will inevitably vary between articles published earlier in the year and those published later. Third, if the author name information of an editor or corresponding author is recorded incorrectly, or if they have published under multiple names, the citation count may not be accurate. To address this, we used an OR condition on possible name combinations to ensure that as many records of the editor or corresponding author as possible were included.
Some scientific journals seek to achieve a high impact factor, as it has been considered a leading, albeit controversial, indicator of a journal’s relative importance in recent decades.[19-23] A high H-index can be achieved only by consistently publishing citable articles; it is less influenced by a small number of over- or under-cited outliers and requires enough time for citations to accumulate.[24] Our results suggest that if a journal aims to increase its impact factor, it may be able to do so by appointing scholars with a high H-index and strong long-term research experience as editors.[24] However, as the editorial decision-making process is complex and involves diverse factors,[25] it is important to evaluate not only the quantitative metrics of the editor but also whether the editor possesses the appropriate competencies for the role.[26-28]
Our results also show that a corresponding author’s research experience, number of publications, and citations of publications are only weakly positively correlated with future citations. This indicates that these factors do not guarantee super citations or higher research impact. Consequently, this suggests that editors should not assume that corresponding authors with high reputations will necessarily produce a high impact article when reviewing new manuscripts. Finally, in line with previous studies, high citations are not something that editors can intentionally secure but rather are the result of scholars with high research capabilities focusing on the scientific rigor and completeness of manuscripts based on their experience.[1, 26]
5. Conclusion
Although the citation counts of articles are only weakly correlated with the capabilities of the corresponding author or the journal editor, the journal editor’s H-index is the only factor associated with super-citation articles, which greatly contribute to the journal’s impact factor. This CAROL study suggests that an editor’s extensive research expertise is crucial for selecting groundbreaking and widely popular articles. Combined with previous research, our study suggests that high citation counts are not the result of deliberate anticipation of future citations or biases related to the corresponding author’s reputation and affiliation, contrary to common assumption.