Original Research Article

Are citation counts of articles related to outcomes and legacies of authors and journal editors? (CAROL): a cohort study

Wonwoo Jang1,2,#, Seokjun Kim1,2,#, Jaehyun Kong1,2,#, Hanseul Cho3, Jiyeon Oh1,2, Jiseung Kang4,5, Lee Smith6, Yejun Son7,* (ORCID: https://orcid.org/0009-0001-3939-2983)
1Department of Medicine, Kyung Hee University College of Medicine, Seoul, South Korea
2Center for Digital Health, Medical Science Research Institute, Kyung Hee University College of Medicine, Seoul, South Korea
3Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
4Division of Sleep Medicine, Harvard Medical School, Boston, Massachusetts, USA
5Department of Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
6Centre for Health, Performance and Wellbeing, Anglia Ruskin University, Cambridge, UK
7Department of Precision Medicine, Kyung Hee University College of Medicine, Seoul, South Korea

# These authors contributed equally to this work

*Correspondence: Yejun Son, E-mail: dlstod9981@naver.com

© Copyright 2024 Life Cycle. This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Received: Jul 05, 2024; Revised: Aug 14, 2024; Accepted: Aug 23, 2024

Published Online: Sep 01, 2024

Abstract

Objective:

To investigate the association between the number of citations each article received and quantitative factors of journal editors or corresponding authors.

Methods:

We performed a retrospective cohort study using corresponding author and editor information of articles published in PLOS Medicine. The number of citations for each article and several factors associated with the participants, which were expected to influence citation counts, were acquired from Web of Science as follows: citations of publications, H-index, gross national income per capita of the affiliation country, number of publications, duration of research career, and number of publications from the affiliation of each corresponding author and their handling editor. We examined the association between article citations, treated as both continuous and binary values, and the quantitative factors of journal editors or corresponding authors. For the binary analysis, a super-citation article was defined as one in the top 12.24% of citations among the included articles, while the rest were defined as normal-citation articles.

Results:

Articles published in PLOS Medicine between 2018 and 2019 were analyzed (n=396). In the analysis using continuous values, the citation count of the article was weakly correlated with the journal editor’s citation count of publications (Spearman correlation, 0.164), H-index (0.161), and the number of publications from their affiliation (0.148). Similarly, weak correlations were observed with the corresponding author’s citation count of publications (0.217), H-index (0.158), and the number of publications (0.103). However, in binary analysis, the super-citation articles were only associated with the editor’s H-index (P-value in Mann-Whitney U test, 0.048).

Conclusions:

Although article citation counts are weakly correlated with the quantitative factors of the corresponding author or the journal editor, only the journal editor’s H-index is associated with the super-citation articles, which contribute greatly to the journal’s impact factor. This Citation counts of Articles Related to Outcomes and Legacies of authors and journal editors (CAROL) study suggests that the editor’s expertise is crucial in selecting breakthrough and widely popular articles.

Keywords: corresponding author; editor; H-index

1. Introduction

In Matthew 6:3, Jesus says, “Do not let your left hand know what your right hand is doing.” But is it unacceptable to “let our co-authors know when our article is highly cited” to share our joy on Christmas? It is natural to feel happy and excited when our article receives numerous citations. This is because, although the quality of research is not directly proportional to the number of citations, a high citation count is considered an important indicator of the relative impact of research. Furthermore, citations are used to calculate the impact factor (IF) of a journal. Most researchers, including our team, aspire to have their work published in top-tier IF journals such as the BMJ because, despite ongoing debates, the IF generally reflects the relative impact of the published articles on the field.

Of course, it is unlikely that top journals achieve a high IF because editors deliberately select articles based on potential citations. In fact, a previous cohort study reported that even editors find it difficult to predict potential citations.[1] What factors, then, make an article a breakthrough that becomes widely popular and highly cited? To answer this question, several studies have investigated contributing factors associated with high citations.[2-12] However, most of these studies have been limited to non-medical fields or specific medical subspecialties. Furthermore, they have predominantly focused on the characteristics of the articles themselves, such as topic, page count, study design, and sample size, rather than the human attributes of the editors or authors.

Therefore, we conducted a multinational retrospective cohort study on articles across the full spectrum of medicine to determine how human factors, particularly the quantitative ability and features of editors and corresponding authors, relate to the number of citations these articles receive. This study comprehensively analyzed various factors of corresponding authors and editors to uncover the key factors that lead to higher citation counts, which in turn provides insightful guidance for achieving high citations.

2. Methods

2.1 Study design

Our study aimed to investigate the association between quantitative characteristics of editors or corresponding authors and the number of citations each article received. To avoid bias when analyzing journals with varying impact factors, we selected one journal (PLOS Medicine) where information about the editor and corresponding author is publicly available and then analyzed the published articles. To increase the reliability of the study, we also selected only articles published in 2018 and 2019, before the beginning of the COVID-19 pandemic, to exclude the influence of abnormal citation patterns due to the pandemic.[13] Next, our study included several factors acquired from Web of Science as follows: citations of publications, H-index, number of publications, duration of research career, and number of publications from the affiliation of each corresponding author and their handling editors. Additionally, we investigated the gross national income per capita of the affiliation country for each corresponding author and editor. These variables were chosen to comprehensively reflect the various factors that can affect the number of citations of an article. This study systematically screened articles (n=428) published in the specified years by searching the Web of Science. Allowing for duplication in editors and corresponding authors, 396 independent articles were ultimately selected for this study, all of which were publicly accessible and included a range of variables that could affect the number of citations an article received.

2.2 Binary classification of super-citation and normal-citation

To reflect the importance of the Christmas date, our study categorized the top 12.24% of all articles (n=49) with the most citations as “super-citations” and the rest as “normal-citations” (n=347). This binary classification is important to distinguish the influence of articles based on the number of citations and to analyze the characteristics of highly cited articles. This categorization was utilized to compare which of the many variables of editors and corresponding authors have a higher association with the super-citations by examining the difference between highly and normally cited articles.
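The binary split described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function name `label_super_citations`, the toy citation counts, and the tie-handling rule (ties keep their original order) are all assumptions, since the text does not state how ties at the cutoff were resolved.

```python
def label_super_citations(citations, n_super):
    """Return one boolean per article: True for the n_super most-cited articles.

    Assumption: ties at the cutoff are broken by original order, which the
    source text does not specify.
    """
    # Sort article indices by citation count, highest first (the sort is stable).
    order = sorted(range(len(citations)), key=lambda i: citations[i], reverse=True)
    top = set(order[:n_super])
    return [i in top for i in range(len(citations))]


# Hypothetical toy data: eight articles, of which the two most cited are "super".
toy_citations = [3, 120, 45, 7, 88, 12, 60, 5]
labels = label_super_citations(toy_citations, n_super=2)
print(labels)
```

With the counts reported in this study, the call would use `n_super=49` on the 396 included articles, leaving 347 articles in the normal-citation group.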

2.3 Statistical analysis

Our study initially performed the Shapiro-Wilk test and D’Agostino’s K² test to check the normality of each variable.[14] The Shapiro-Wilk test is particularly sensitive for small and medium sample sizes and checks whether variables follow a normal distribution, while D’Agostino’s K² test evaluates skewness and kurtosis to assess whether the data set comes from a normal distribution.[14] Since these tests showed that not all variables met normality, we used Spearman’s rank correlation to examine the correlation between the continuous variables. Unlike Pearson’s correlation, which assumes a normal distribution of variables, Spearman’s rank correlation is a non-parametric measure that evaluates how well the relationship between two variables can be described by a monotonic function, making it suitable for non-normally distributed data.[15] Finally, we performed a Mann-Whitney U-test to assess the statistical difference between the two independent samples: the super-citation and normal-citation groups.[16] The Mann-Whitney U-test is a non-parametric test used when data do not follow a normal distribution. It evaluates the difference between the two groups based on the ranking of the data points and is commonly interpreted as a comparison of medians.[16] The SAS software (version 9.4; SAS Institute, Cary, NC, USA) was used for statistical analyses with a two-sided test. P ≤ 0.05 was considered statistically significant.
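To make the two rank-based methods concrete, the sketch below implements them with the Python standard library only. It is not the SAS code used in this study. The function names and toy data are hypothetical, and the Mann-Whitney P-value uses a plain normal approximation without tie or continuity correction, so results may differ slightly from those of a statistical package.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mann_whitney_u(a, b):
    """Return (U statistic of sample a, two-sided P from a normal approximation).

    Assumption: no tie or continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u1, p_two_sided


# Hypothetical toy data: a monotonic but non-linear relationship.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
u, p = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(rho, u, p)
```

Because both methods work on ranks, a perfectly monotonic relationship yields a coefficient of 1 even when it is far from linear, which is why they suit the skewed citation data described above.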

3. Results

This study analyzed the number of citations for each article and several factors related to corresponding authors and editors to determine which factors play a key role in achieving high citation counts. The baseline characteristics of the 396 articles included in this study indicate that editors had an average number of publication citations of 24,334.62 (standard deviation [SD], 30,121.88), an H-index of 57.18 (SD, 30.65), and an average number of publications of 255.18 (SD, 239.16). Corresponding authors averaged 9,844.40 publication citations (SD, 21,014.37), an H-index of 33.38 (SD, 28.86), and a publication count of 143.91 (SD, 244.10). The baseline characteristics of editors and corresponding authors are presented in Table 1, Table S3, and Table S4.

Table 1. Baseline characteristics of the study (n=396)

| Variables | Handling editors | Corresponding authors |
| --- | --- | --- |
| Citations of publications, mean (SD) | 24,334.62 (30,121.88) | 9,844.40 (21,014.37) |
| H-index, mean (SD) | 57.18 (30.65) | 33.38 (28.86) |
| GNI per capita, dollars, mean (SD) | 48,874.34 (18,102.60) | 47,535.24 (18,935.25) |
| Number of publications, mean (SD) | 255.18 (239.16) | 143.91 (244.10) |
| Duration of research career, year, mean (SD) | 17.51 (6.46) | 16.28 (8.02) |
| Number of publications from the affiliation, mean (SD) | 6,114.95 (4,427.98) | 4,888.58 (3,833.73) |

Abbreviations: GNI, gross national income; SD, standard deviation.


We examined the distribution of variables for each citation count using histograms (Table S1). Additionally, we assessed the normality of each variable using the Shapiro-Wilk and D’Agostino’s K² tests. The results of both normality tests indicated that none of the variables followed a normal distribution (P < 0.01, Table S1).

Since all variables followed a non-normal distribution, we examined the association between each variable and the number of citations of each article through Spearman rank correlation. The number of citations of an article showed a statistically significant positive correlation with the number of citations of the editor’s publications (correlation coefficient, 0.164), the editor’s H-index (0.161), and the number of publications from the editor’s affiliation (0.148). It also showed significant positive correlations with the number of citations of the corresponding author’s publications (0.217), the corresponding author’s H-index (0.158), and the corresponding author’s number of publications (0.103). However, considering that Spearman correlation coefficients range from -1 to +1, these correlations were weak (Fig. 1 and Table 2). Thus, there was no strong association between the total number of citations of an article and the various factors of corresponding authors and editors.
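One way to gauge how weak these coefficients are is to square them. Because Spearman’s coefficient is a Pearson correlation computed on ranks, its square can be read as the share of rank variance the two variables have in common. The sketch below applies this reading to the significant coefficients reported in Table 2; the function name and dictionary labels are illustrative, and this interpretation is an addition for illustration, not an analysis performed in the study.

```python
def shared_rank_variance(rho):
    """Squared Spearman coefficient: share of variance in the ranks of one
    variable that is linearly accounted for by the ranks of the other."""
    return rho ** 2


# Statistically significant coefficients as reported in Table 2.
reported_rho = {
    "Editor: citations of publications": 0.164,
    "Editor: H-index": 0.161,
    "Editor: publications from the affiliation": 0.148,
    "Corresponding author: citations of publications": 0.217,
    "Corresponding author: H-index": 0.158,
    "Corresponding author: number of publications": 0.103,
}

for name, rho in reported_rho.items():
    print(f"{name}: rho = {rho:.3f}, rho^2 = {shared_rank_variance(rho):.3f}")

largest_share = max(shared_rank_variance(r) for r in reported_rho.values())
```

Under this reading, even the largest reported coefficient (0.217) corresponds to less than 5% of shared rank variance.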

Fig. 1. Spearman correlation between number of citations and each variable. (A) handling editors; (B) corresponding authors. Abbreviation: GNI, gross national income.
Table 2. Spearman correlation coefficients between number of citations and each variable

| Variables | Spearman correlation | P-value |
| --- | --- | --- |
| Editor | | |
| Citations of publications | 0.164 | **<0.001** |
| H-index | 0.161 | **<0.001** |
| GNI per capita (dollars) | 0.048 | 0.345 |
| Number of publications | 0.070 | 0.167 |
| Duration of research career (years) | 0.028 | 0.573 |
| Number of publications from the affiliation | 0.148 | **<0.001** |
| Corresponding author | | |
| Citations of publications | 0.217 | **<0.001** |
| H-index | 0.158 | **<0.001** |
| GNI per capita (dollars) | -0.035 | 0.486 |
| Number of publications | 0.103 | **0.040** |
| Duration of research career (years) | 0.006 | 0.904 |
| Number of publications from the affiliation | 0.046 | 0.363 |

Abbreviation: GNI, gross national income.

Numbers in bold indicate statistical significance (P < 0.05).


Therefore, we further conducted binary analysis by categorizing the articles into super-citation and normal-citation groups. Shapiro-Wilk and D’Agostino’s K² tests were performed on all variables for each citation group. It was observed that in the super-citation group, both the length of the authors’ research careers and the number of publications from their institutions followed a normal distribution (Table S2). However, since most variables did not follow a normal distribution, we conducted a Mann-Whitney U test between the super-citation and normal-citation groups. The results showed that super-citation articles only have a statistically significant association with the editors’ H-index (P-value = 0.048 in the Mann-Whitney U test).

These findings indicate that the number of citations an article receives is only weakly correlated with the qualifications of its corresponding authors or journal editors; however, the H-index of journal editors is the only factor associated with super-citation articles (Fig. 2 and Table 3).

Fig. 2. Mann-Whitney U-tests for each variable between super-citation and normal-citation articles. (A) handling editors; (B) corresponding authors. Abbreviation: GNI, gross national income.
Table 3. Results of Mann-Whitney U tests for each variable between super-citation and normal-citation articles

| Variables | Super-citation article, median (IQR) | Normal-citation article, median (IQR) | P-value |
| --- | --- | --- | --- |
| Editor | | | |
| Citations of publications | 22,205.00 (5,457.00 to 49,715.00) | 9,051.00 (4,832.00 to 32,384.50) | 0.056 |
| H-index | 57.00 (39.00 to 84.00) | 47.00 (36.00 to 79.00) | **0.048** |
| GNI per capita (dollars) | 47,490.00 (42,020.00 to 63,290.00) | 45,070.00 (43,240.00 to 63,290.00) | 0.650 |
| Number of publications | 212.00 (120.00 to 417.00) | 157.00 (114.00 to 303.00) | 0.162 |
| Duration of research career (years) | 15.00 (13.00 to 28.00) | 14.00 (14.00 to 20.50) | 0.316 |
| Number of publications from the affiliation | 6,148.00 (3,241.00 to 8,820.00) | 5,858.00 (2,797.00 to 8,285.70) | 0.669 |
| Corresponding author | | | |
| Citations of publications | 2,880.00 (1,747.00 to 8,655.00) | 2,384.00 (833.00 to 8,073.50) | 0.185 |
| H-index | 23.00 (17.00 to 41.00) | 24.00 (15.00 to 42.00) | 0.846 |
| GNI per capita (dollars) | 55,700.00 (43,240.00 to 63,290.00) | 46,600.00 (42,020.00 to 63,290.00) | 0.146 |
| Number of publications | 58.00 (31.00 to 114.00) | 64.00 (27.50 to 154.00) | 0.543 |
| Duration of research career (years) | 15.00 (10.00 to 20.00) | 15.00 (10.00 to 24.00) | 0.512 |
| Number of publications from the affiliation | 4,755.00 (2,359.00 to 7,674.00) | 4,141.00 (2,053.50 to 6,986.10) | 0.730 |

Abbreviations: GNI, gross national income; IQR, interquartile range.

Numbers in bold indicate a significant difference (P <0.05).


4. Discussion

4.1 Principal findings

This Citation counts of Articles Related to Outcomes and Legacies of authors and journal editors (CAROL) study examined the statistical association between the number of citations of published articles and various factors of corresponding authors and editors. We found that the quantitative characteristics of the corresponding author or editor were weakly correlated with the total number of citations as a continuous variable. However, a notable finding was that the editor’s H-index was significantly associated with super-citations when the articles were dichotomized into the top 12.24% (super-citation) and the remainder (normal-citation). This suggests that articles managed by handling editors who maintain consistently high research quality over a long period of time are more likely to be breakthrough and widely popular.

4.2 Comparison with other studies

Various studies have been conducted to predict the future citations of articles.[2-12] Some studies utilizing machine learning have successfully predicted future citations by analyzing the text of manuscripts.[5, 6] However, most of the previous studies have been limited to non-medical fields or specific subfields within the medical field. Additionally, these studies primarily focused on the characteristics of the articles, such as the topic, research design, and number of pages, rather than the human characteristics of the editors and authors. Given the recent emphasis on patient-centered medical care, we aimed to focus more on the characteristics and capabilities of editors and authors to conduct more human-centered research.[17]

This study was inspired by a landmark cohort study of BMJ editors (n=10) published on Christmas Day 2022. The prior study showed that even editors of leading medical journals like the BMJ struggle to predict which articles will be highly cited. Nonetheless, our study showed that editors with rich research experience are significantly more likely to handle articles that achieve super-citations. At first glance, these two findings may seem contradictory, but this is likely due to differences in the underlying study design. The previous cohort study directly asked editors to predict the future citations of specific articles. Our study differs in that citation potential was not considered by editors at the time of editing, and future citations were examined retrospectively. Our design better reflects the real-world situation because, in the actual manuscript review process, editors do not consciously select articles solely based on citation potential but rather comprehensively evaluate the scientific importance of a manuscript based on their accumulated research experience.[1, 18]

4.3 Strengths and limitations of this study

The strength of our study lies in its analysis of a large-scale scholar cohort, rigorously examining the influence of corresponding authors’ and editors’ quantitative research capabilities and their affiliations on research impact, based on robust statistical methodologies.

Our study has several limitations. First, most journals did not disclose the handling editor of each article, which limited the number of journals that we could include in our study. However, restricting the analysis to a single journal had the advantage of controlling for confounding factors, because all included articles were published under the same journal impact factor. Second, even if articles are published in the same year, the number of citations will inevitably vary between articles published earlier in the year and those published later. Third, if the editor or corresponding author is indexed under an incorrect name or has published under multiple names, the citation count may not be accurate. To address this, we used an OR condition on possible name combinations to ensure that as many records of the editor or corresponding author as possible were included.

4.4 Policy implications

Some scientific journals seek to achieve a high impact factor, as it has been considered a leading, albeit controversial, indicator of a journal’s relative importance in recent decades.[19-23] A high H-index can be achieved only by consistently publishing citable articles; it is less influenced by a small number of over- or under-cited outliers and requires enough time for citations to accumulate.[24] Our results suggest that if a journal aims to increase its impact factor, it may be able to do so by appointing scholars with a high H-index and strong long-term research experience as editors.[24] However, as the editorial decision-making process is complex and involves diverse factors,[25] it is important to evaluate not only the quantitative metrics of the editor but also whether the editor possesses the appropriate competencies for the role.[26-28]
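The properties of the H-index described above follow from its standard definition: the largest number h such that h of a scholar’s publications each have at least h citations. A minimal sketch is given below; the function name and per-paper citation counts are hypothetical, and Web of Science values depend on its own indexing coverage.

```python
def h_index(citation_counts):
    """Largest h such that h publications each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position  # the top `position` papers all have >= position citations
        else:
            break
    return h


# Hypothetical scholars with five publications each.
steady = [10, 8, 5, 4, 3]
with_outlier = [1000, 8, 5, 4, 3]  # identical except for one heavily cited paper
print(h_index(steady), h_index(with_outlier))
```

Both hypothetical scholars have an H-index of 4: replacing one paper with an extreme outlier leaves the index unchanged, whereas a mean citation count would change drastically.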

Our results also show that a corresponding author’s research experience, number of publications, and citations of publications are only weakly positively correlated with future citations. This indicates that these factors do not guarantee super-citations or higher research impact. Consequently, editors should not assume that corresponding authors with high reputations will necessarily produce a high-impact article when reviewing new manuscripts. Finally, in line with previous studies, high citations are not something that editors can intentionally secure but rather are the result of scholars with high research capabilities focusing on the scientific rigor and completeness of manuscripts based on their experience.[1, 26]

5. Conclusion

Although the citation counts of articles are weakly correlated with the capabilities of the corresponding author or the journal editor, only the journal editor’s H-index is associated with super-citation articles, which contribute greatly to the journal’s impact factor. This CAROL study suggests that an editor’s extensive research expertise is crucial for selecting groundbreaking and widely popular articles. Combined with previous research, our study shows that high citation counts are not the result of deliberate anticipation of future citations or biases related to the corresponding author’s reputation and affiliation, which contrasts with common assumptions.

Capsule Summary

This Citation counts of Articles Related to Outcomes and Legacies of authors and journal editors (CAROL) study suggests that the editor’s expertise is crucial in selecting breakthrough and widely popular articles.

Ethical statement

Since our study utilized publicly accessible data from Web of Science, ethical approval was not required.

Patient and public involvement

No patients were directly involved in designing the research question or conducting the research. No patients were asked to interpret or write up the results. However, we plan on disseminating the results of this study to any of the study participants or wider relevant communities on request.

Data Availability Statement

All articles used in this study are available from PLOS Medicine.

Transparency statement

The lead author (Dr. YS) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.

Contributors

Dr. YS had full access to all of the data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. All authors approved the final version of the manuscript before submission. Study concept and design: WJ, SK, JK, and YS; acquisition, analysis, or interpretation of data: WJ, SK, JK, and YS; drafting of the manuscript: WJ, SK, JK, and YS; critical revision of the manuscript for important intellectual content: all authors; statistical analysis: WJ, SK, JK, and YS; study supervision: YS. YS supervised the study and served as a guarantor. WJ, SK, and JK contributed equally as the first authors. The corresponding author attests that all listed authors meet the authorship criteria and that no one meeting the criteria has been omitted.

Sources of funding for the research

None

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Provenance and peer review

Not commissioned; externally peer reviewed.

Supplementary Materials


lc-4-0-8-suppl1.pdf

References

1.

Schroter S, Weber WEJ, Loder E, Wilkinson J, Kirkham JJ. Evaluation of editors’ abilities to predict the citation potential of research manuscripts submitted to The BMJ: a cohort study. BMJ. 2022; 379:e073880

2.

Castillo C, Donato D, Gionis A. Estimating number of citations using author reputation. In: International Symposium on String Processing and Information Retrieval. Springer; 2007

3.

Lokker C, McKibbon KA, McKinlay RJ, Wilczynski NL, Haynes RB. Prediction of citation counts for clinical articles at two years using data available within three weeks of publication: retrospective cohort study. BMJ. 2008; 336(7645):655-7

4.

Fu LD, Aliferis C. Models for predicting and explaining citation count of biomedical articles. AMIA Annu Symp Proc. 2008; 2008:222-6

5.

Ibáñez A, Larrañaga P, Bielza C. Predicting citation count of Bioinformatics papers within four years of publication. Bioinformatics. 2009; 25(24):3303-9

6.

Beranová L, Joachimiak MP, Kliegr T, Rabby G, Sklenák V. Why was this cited? Explainable machine learning applied to COVID-19 research literature. Scientometrics. 2022; 127(5):2313-49

7.

Alohali YA, Fayed MS, Mesallam T, Abdelsamad Y, Almuhawas F, Hagr A. A machine learning model to predict citation counts of scientific papers in otology field. Biomed Res Int. 2022; 2022:2239152

8.

Lopez J, Calotta N, Doshi A, Soni A, Milton J, May JW, et al. Citation rate predictors in the plastic surgery literature. J Surg Educ. 2017; 74(2):191-8

9.

Kossmeier M, Heinze G. Predicting future citation counts of scientific manuscripts submitted for publication: a cohort study in transplantology. Transpl Int. 2019; 32(1):6-15

10.

Winnik S, Raptis DA, Walker JH, Hasun M, Speer T, Clavien PA, et al. From abstract to impact in cardiovascular research: factors predicting publication and citation. Eur Heart J. 2012; 33(24):3034-45

11.

Willis DL, Bahler CD, Neuberger MM, Dahm P. Predictors of citations in the urological literature. BJU Int. 2011; 107(12):1876-80

12.

Carollo A, Zhang P, Yin P, Jawed A, Dimitriou D, Esposito G, et al. Sleep profiles in eating disorders: A scientometric study on 50 years of clinical research. Healthcare (Basel). 2023; 11(14)

13.

Park S, Lim HJ, Park J, Choe YH. Impact of COVID-19 Pandemic on biomedical publications and their citation frequency. J Korean Med Sci. 2022; 37(40):e296

14.

Ghasemi A, Zahediasl S. Normality tests for statistical analysis: a guide for non-statisticians. Int J Endocrinol Metab. 2012; 10(2):486-9

15.

Mukaka MM. Statistics corner: A guide to appropriate use of correlation coefficient in medical research. Malawi Med J. 2012; 24(3):69-71

16.

McKnight PE, Najab J. Mann-Whitney U Test. The Corsini encyclopedia of psychology. 2010; :1

17.

Barry MJ, Edgman-Levitan S. Shared decision making--pinnacle of patient-centered care. N Engl J Med. 2012; 366(9):780-1

18.

Liesegang TJ, Albert DM, Schachat AP, Minckler DS. The editorial process for medical journals: I. Introduction of a series and discussion of the responsibilities of editors, authors, and reviewers. Am J Ophthalmol. 2003; 136(1):109-13

19.

Casadevall A, Fang FC. Causes for the persistence of impact factor mania. mBio. 2014; 5(2):e00064-14

20.

Eston R. The impact factor: a misleading and flawed measure of research quality. J Sports Sci. 2005; 23(1):1-3

21.

Smith R. Beware the tyranny of impact factors. J Bone Joint Surg Br. 2008; 90(2):125-6

22.

Garfield E. The history and meaning of the journal impact factor. JAMA. 2006; 295(1):90-3

23.

The PLoS Medicine Editors. The impact factor game. It is time to find a better way to assess the scientific literature. PLoS Med. 2006; 3(6):e291

24.

Mondal H, Deepak KK, Gupta M, Kumar R. The h-Index: Understanding its predictors, significance, and criticism. J Family Med Prim Care. 2023; 12(11):2531-7

25.

Stahel PF, Moore EE. Peer review for biomedical publications: we can improve the system. BMC Med. 2014; 12:179

26.

Moher D, Galipeau J, Alam S, Barbour V, Bartolomeos K, Baskin P, et al. Core competencies for scientific editors of biomedical journals: consensus statement. BMC Med. 2017; 15(1):167

27.

Moher D, Altman DG. Four proposals to help improve the medical research literature. PLoS Med. 2015; 12(9):e1001864

28.

Galipeau J, Moher D, Skidmore B, Campbell C, Hendry P, Cameron DW, et al. Systematic review of the effectiveness of training programs in writing for scholarly publication, journal editing, and manuscript peer review (protocol). Syst Rev. 2013; 2:41