Scientific rewards for biomedical specialization are large and persistent

Appendix

Discipline breakdown

To better understand the sample of researchers and to facilitate the discipline-by-discipline robustness checks, we estimate the discipline of each researcher using the following process. First, the journals in which a researcher has published are extracted from his or her publications. At this stage, we also eliminate highly interdisciplinary journals (PNAS, Science, Nature, Annals of the New York Academy of Sciences, and PLoS ONE). Second, these journal titles are matched to the map equation journal classifications developed by Rosvall and Bergstrom [60]. Disciplines are then assigned using the following algorithm:

figure a
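The two preparatory steps described above (dropping highly interdisciplinary journals, then matching the remaining journals to a classification) can be sketched as follows. This is only an illustration: the `journal_to_discipline` dictionary is a hypothetical stand-in for the Rosvall and Bergstrom classification, the function name is invented for this example, and the final rule that turns these counts into assigned disciplines is the algorithm in figure a, which is not reproduced here.

```python
from collections import Counter

# Highly interdisciplinary journals that the text says are eliminated.
EXCLUDED_JOURNALS = {
    "PNAS",
    "Science",
    "Nature",
    "Annals of the New York Academy of Sciences",
    "PLoS ONE",
}


def discipline_counts(journals, journal_to_discipline):
    """Count a researcher's publications per discipline.

    journals: one journal title per publication.
    journal_to_discipline: hypothetical mapping from journal title to
        discipline, standing in for the map equation classification.
    Journals that are excluded or absent from the mapping are skipped.
    """
    counts = Counter()
    for journal in journals:
        if journal in EXCLUDED_JOURNALS:
            continue
        discipline = journal_to_discipline.get(journal)
        if discipline is not None:
            counts[discipline] += 1
    return counts


# Illustrative input only; these are not data from the study.
example_mapping = {"Cell": "Molecular and cell biology", "Gut": "Gastroenterology"}
example_journals = ["Nature", "Cell", "Cell", "Gut", "Unlisted Journal"]
example_counts = discipline_counts(example_journals, example_mapping)
```

The resulting per-discipline counts would be the natural input to an assignment rule, but any threshold or tie-breaking applied to them is an assumption beyond what the text states.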

A breakdown of the disciplines assigned can be found in Figs. 4 and 5.

Fig. 4

Breakdown of researcher disciplines for our sample of 29,208 researchers with at least 100 publications

Fig. 5

Breakdown of disciplines for researchers assigned to more than one discipline for the associated subsample of 1716 researchers with at least 100 publications

Derivation of standard deviation of RCA

Starting with the equation for revealed comparative advantage:

$$\begin{aligned} \text{RCA} = \frac{n}{p} \Big / \left( \frac{N}{P} \right) \end{aligned}$$

(6)

We note it is a function of four variables, namely \(n\), \(p\), \(N\), and \(P\). Drawing from propagation of uncertainties, we understand that the covariance (\(\mathbf{C}\)) of an arbitrary function f can be expressed as follows:

$$\begin{aligned} \mathbf{C}_{f} = \mathbf{J} \mathbf{C}_{x} \mathbf{J}^{\top} \end{aligned}$$

(7)

where the subscript x on the right-hand-side covariance matrix (\(\mathbf{C}_{x}\)) denotes that it is taken over the independent variables of Eq. 6, and \(\mathbf{J}\) is the Jacobian matrix. Applying this to \(\text{RCA}\) and assuming no correlation between the independent variables, we get the following relationship:

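$$\begin{aligned} \sigma^{2}_{\text{RCA}} = \left| \frac{\partial \text{RCA}}{\partial n} \right|^{2} \sigma^{2}_{n} + \left| \frac{\partial \text{RCA}}{\partial p} \right|^{2} \sigma^{2}_{p} + \left| \frac{\partial \text{RCA}}{\partial N} \right|^{2} \sigma^{2}_{N} + \left| \frac{\partial \text{RCA}}{\partial P} \right|^{2} \sigma^{2}_{P}. \end{aligned}$$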

(8)

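Because RCA is directly proportional to two of the four variables and inversely proportional to the other two, each partial derivative has magnitude RCA divided by the corresponding variable, so this equation simplifies to: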

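$$\begin{aligned} \sigma^{2}_{\text{RCA}} = \left| \frac{\text{RCA}}{n} \right|^{2} \sigma^{2}_{n} + \left| \frac{\text{RCA}}{p} \right|^{2} \sigma^{2}_{p} + \left| \frac{\text{RCA}}{N} \right|^{2} \sigma^{2}_{N} + \left| \frac{\text{RCA}}{P} \right|^{2} \sigma^{2}_{P}. \end{aligned}$$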

(9)

Making the assumption that the probability that any given MeSH term appears on a paper arises from a binomial distribution, we can use the property \(\sigma^{2}_{x} = x\) to further simplify:

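$$\begin{aligned} \sigma^{2}_{\text{RCA}} = \frac{\text{RCA}^{2}}{n^{2}} n + \frac{\text{RCA}^{2}}{p^{2}} p + \frac{\text{RCA}^{2}}{N^{2}} N + \frac{\text{RCA}^{2}}{P^{2}} P. \end{aligned}$$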

(10)

Canceling common factors and factoring out RCA, we arrive at:

$$\begin{aligned} \sigma_{\text{RCA}} = \text{RCA} \left( \frac{1}{n} + \frac{1}{p} + \frac{1}{N} + \frac{1}{P}\right)^{1/2} \end{aligned}$$

(11)

as used in the main manuscript.
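As a numerical check on this derivation, the short sketch below computes RCA as a ratio of two ratios, evaluates the closed-form standard deviation of Eq. 11, and compares it with a direct evaluation of the propagation formula of Eq. 8 under the same assumption that each variable's variance equals its value. The function names and the input counts are illustrative assumptions, not values or code from the study, and the variables are written without indices.

```python
import math


def rca(n, p, N, P):
    """Revealed comparative advantage as in Eq. 6: (n / p) / (N / P)."""
    return (n / p) / (N / P)


def rca_std_closed_form(n, p, N, P):
    """Standard deviation of RCA as in Eq. 11."""
    return rca(n, p, N, P) * math.sqrt(1 / n + 1 / p + 1 / N + 1 / P)


def rca_std_propagated(n, p, N, P):
    """Standard deviation from Eq. 8 with analytic partial derivatives.

    Assumes uncorrelated variables with variance equal to their value,
    as in the derivation above.
    """
    value = rca(n, p, N, P)
    partials = {"n": value / n, "p": -value / p, "N": -value / N, "P": value / P}
    variances = {"n": n, "p": p, "N": N, "P": P}
    return math.sqrt(sum(partials[k] ** 2 * variances[k] for k in partials))


# Illustrative counts only.
example = dict(n=8, p=64, N=1024, P=65536)
example_rca = rca(**example)
example_sigma = rca_std_closed_form(**example)
example_sigma_check = rca_std_propagated(**example)
```

The two standard-deviation functions should agree to floating-point precision, and the dominant contribution to the uncertainty comes from whichever of the four counts is smallest.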

Robustness checks

To check the robustness of our results, we carry out a number of additional analyses. First, we perform the same analysis as found in the main manuscript, but on a different, less prolific, sample of biomedical researchers. Second, we conduct non-parametric tests to demonstrate the lack of a relationship between our specialization measure and a more general measure of interdisciplinarity. Finally, we carry out the same analysis as in the main manuscript, but on researchers of specified disciplines of the biomedical sciences.

Lower publishing sample

Here, we carry out the same regression as found in the main manuscript but on a sample of biomedical researchers with between 75 and 99 publications over the course of their careers. The results can be found in Table 2 and Fig. 6 and are consistent with the findings in the main body of the paper.

Fig. 6

Marginal effects for biomedical researchers with 75 to 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 22,589 unique biomedical researchers with between 75 and 99 career publications, for a total of 145,143 researcher-time window observations

Specialization and interdisciplinarity

Our measure of specialization captures the diversity of topics on which a scientist is working. As such, if a scientist is working on a small set of topics that are spread across traditionally defined fields, this person would be considered to be specialized and interdisciplinary. For this reason, in the main body of the paper, we claim that the opposite of specialization is not interdisciplinarity but rather generalization.

Here, we conduct simple statistical tests to demonstrate the lack of relationship between our specialization measure and a measure of interdisciplinarity. For this purpose, we consider a researcher to be interdisciplinary if they have been assigned multiple disciplines (as defined above) where, importantly, these disciplines are defined independently from MeSH terms. This definition is quite strict, which provides some assurance that these interdisciplinary researchers have indeed published a significant number of papers in multiple disciplines throughout their career. However, we also accept that this could occur when a researcher moves between disciplines rather than working across disciplines. Therefore, to give the best possible chance for a significant difference in specialization to be found between interdisciplinary and non-interdisciplinary researchers, we consider both the minimum and the average levels of specialization for each researcher throughout their career. This precaution will pick up a transition between disciplines as the least specialized period of the researcher’s career, and taken alone, this period would be challenging to distinguish from “true” generalization (as opposed to a transient state).

If generalization and interdisciplinarity were significantly correlated, we would expect the least specialized period in interdisciplinary researchers’ careers to be lower than that of non-interdisciplinary researchers. We may also expect their average specialization to be lower. Furthermore, as average specialization levels do not appear to stabilize until about 10 years into a career, we may wish to consider this latter part of the career separately. As such, we test all four of these scenarios for significant differences between interdisciplinary researchers and non-interdisciplinary researchers: minimum specialization across the whole career, minimum specialization for career age greater than 10 years, average specialization across the whole career, and average specialization for career age greater than 10 years. We use two non-parametric tests for this purpose: the Mann-Whitney U test and the two-sample Kolmogorov-Smirnov test. The former calculates the probability that a randomly chosen interdisciplinary researcher is less specialized than a randomly chosen non-interdisciplinary researcher, while the latter directly compares the cumulative specialization distributions of each group and tests the significance of any differences. We conduct these tests for all researchers in the primary cohort (greater than 100 publications) for whom we were able to obtain sufficient information about disciplines (not assigned “NULL”). In total, 29,670 researchers are included in this analysis, of which 1709 (5.8%) are classified as interdisciplinary.
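To make the two test statistics concrete, the sketch below computes, in plain Python, the quantity underlying the Mann-Whitney U test (the probability that a value drawn from one group is smaller than a value drawn from the other, counting ties as one half) and the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical cumulative distributions). The function names and sample values are illustrative assumptions, and the sketch stops at the statistics rather than the p-values. The source does not say which software was used; in Python, p-values could be obtained from `scipy.stats.mannwhitneyu` and `scipy.stats.ks_2samp`.

```python
from bisect import bisect_left, bisect_right


def prob_less(x, y):
    """Estimate P(X < Y) + 0.5 * P(X == Y) for random draws X from x, Y from y.

    This equals a Mann-Whitney U statistic divided by len(x) * len(y).
    """
    ys = sorted(y)
    total = 0.0
    for xi in x:
        lo = bisect_left(ys, xi)   # number of y values strictly below xi
        hi = bisect_right(ys, xi)  # number of y values at or below xi
        greater = len(ys) - hi     # y values strictly above xi
        ties = hi - lo             # y values equal to xi
        total += greater + 0.5 * ties
    return total / (len(x) * len(y))


def ks_statistic(x, y):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    largest_gap = 0.0
    for v in xs + ys:
        cdf_x = bisect_right(xs, v) / len(xs)
        cdf_y = bisect_right(ys, v) / len(ys)
        largest_gap = max(largest_gap, abs(cdf_x - cdf_y))
    return largest_gap


# Illustrative specialization values only; not data from the study.
interdisciplinary = [0.2, 0.5, 0.7]
non_interdisciplinary = [0.3, 0.5, 0.9, 1.1]
p_less = prob_less(interdisciplinary, non_interdisciplinary)
d_stat = ks_statistic(interdisciplinary, non_interdisciplinary)
```

A value of `prob_less` near 0.5 and a KS statistic near 0 both indicate that the two groups have similar specialization distributions, which is the pattern the tests in this section look for.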

Table 3 displays the results of these tests. The p-values for all tests indicate that any differences in specialization levels between interdisciplinary researchers and non-interdisciplinary researchers are not significant. This result is consistent with our assertion that the specialization measure does not measure interdisciplinarity (or lack thereof), at least for biomedical researchers with long careers. While not displayed here, the same (qualitative) results are found when the threshold of the distribution over MeSH terms, used to obtain our specialization measure, is set to 80% or 95%.

Separate disciplines

To further assess the robustness of our findings, we carry out the same regression analysis as in the main manuscript for each of the eight most common disciplines in our dataset of researchers. Disciplines are assigned according to the procedure outlined in the "Discipline breakdown" section above. Note that these regressions include all researchers assigned to each specific discipline, and hence, a researcher assigned to multiple disciplines will appear in more than one regression. For each discipline, we report results for both the standard sample, consisting of researchers publishing 100 or more papers in their career, and the set of researchers publishing 75 to 99 papers. These results are presented in Tables 4 to 19 and Figs. 7 to 22.

Fig. 7

Marginal effects for researchers in molecular and cell biology with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 10,889 unique researchers with a total of 81,398 researcher-time window observations

Fig. 8

Marginal effects for researchers in molecular and cell biology with 75 to 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 8135 unique researchers with a total of 53,890 researcher-time window observations

Fig. 9

Marginal effects for researchers in medicine with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 6722 unique researchers with a total of 48,433 researcher-time window observations

Fig. 10

Marginal effects for researchers in medicine with 75 to 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 4825 unique researchers with a total of 30,440 researcher-time window observations

Fig. 11

Marginal effects for researchers in neuroscience with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 2994 unique researchers with a total of 22,006 researcher-time window observations

Fig. 12

Marginal effects for researchers in neuroscience with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 2423 unique researchers with a total of 15,992 researcher-time window observations

Fig. 13

Marginal effects for researchers in gastroenterology with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 1713 unique researchers with a total of 12,164 researcher-time window observations

Fig. 14

Marginal effects for researchers in gastroenterology with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 1304 unique researchers with a total of 8017 researcher-time window observations

Fig. 15

Marginal effects for researchers in infectious diseases with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 1396 unique researchers with a total of 10,073 researcher-time window observations

Fig. 16

Marginal effects for researchers in infectious diseases with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 1154 unique researchers with a total of 7460 researcher-time window observations

Fig. 17

Marginal effects for researchers in radiology with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 1086 unique researchers with a total of 7757 researcher-time window observations

Fig. 18

Marginal effects for researchers in radiology with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 902 unique researchers with a total of 5483 researcher-time window observations

Fig. 19

Marginal effects for researchers in nephrology with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 944 unique researchers with a total of 6738 researcher-time window observations

Fig. 20

Marginal effects for researchers in nephrology with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 638 unique researchers with a total of 3962 researcher-time window observations

Fig. 21

Marginal effects for researchers in psychology with 100 or more career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 828 unique researchers with a total of 5990 researcher-time window observations

Fig. 22

Marginal effects for researchers in psychology with between 75 and 99 career publications. Low publishing rate is estimated at the 12.5th percentile (middle of the first quartile) of papers per year. High publishing rate is estimated at the 87.5th percentile (middle of the fourth quartile) of papers per year. The shaded envelope of each line is the \(99.9\%\) confidence interval. Based on 603 unique researchers with a total of 3882 researcher-time window observations

Table 2 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 22,589 unique biomedical researchers with between 75 and 99 career publications, for a total of 145,143 researcher-time window observations

Table 3 p-values from two non-parametric tests of the differences in specialization between interdisciplinary researchers and non-interdisciplinary researchers. M-W corresponds to the Mann-Whitney U test, while K-S corresponds to the two-sample Kolmogorov-Smirnov test. No significant differences are found between the two groups at any conventional p-value threshold. Analysis conducted for the sample of 29,208 researchers with at least 100 publications

Table 4 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 10,889 unique biomedical researchers assigned to the discipline molecular and cell biology with at least 100 publications

Table 5 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 8135 unique biomedical researchers assigned to the discipline molecular and cell biology with between 75 and 99 career publications

Table 6 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 6722 unique biomedical researchers assigned to the discipline medicine with at least 100 publications

Table 7 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 4825 unique biomedical researchers assigned to the discipline medicine with between 75 and 99 career publications

Table 8 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 2994 unique biomedical researchers assigned to the discipline neuroscience with at least 100 publications

Table 9 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 2423 unique biomedical researchers assigned to the discipline neuroscience with between 75 and 99 career publications

Table 10 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 1713 unique biomedical researchers assigned to the discipline gastroenterology with at least 100 publications

Table 11 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 1304 unique biomedical researchers assigned to the discipline gastroenterology with between 75 and 99 career publications

Table 12 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 1396 unique biomedical researchers assigned to the discipline infectious diseases with at least 100 publications

Table 13 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 1154 unique biomedical researchers assigned to the discipline infectious diseases with between 75 and 99 career publications

Table 14 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 1086 unique biomedical researchers assigned to the discipline radiology with at least 100 publications

Table 15 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 902 unique biomedical researchers assigned to the discipline radiology with between 75 and 99 career publications

Table 16 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 944 unique biomedical researchers assigned to the discipline nephrology with at least 100 publications

Table 17 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 638 unique biomedical researchers assigned to the discipline nephrology with between 75 and 99 career publications

Table 18 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 828 unique biomedical researchers assigned to the discipline psychology with at least 100 publications

Table 19 Fixed-effects panel regression results. Dependent variable is the log number of citations per paper, and the “specialization” variable is standardized. Standard errors are in parentheses. All control variables described in the main manuscript are included. Based on 603 unique biomedical researchers assigned to the discipline psychology with between 75 and 99 career publications

Even though these disciplines span a wide range of subject matters, norms, sample sizes, and career/laboratory structures (e.g., hospital-based clinical research vs. university experimental labs), the results across all are qualitatively consistent with those presented in the main manuscript.
