Introduction

Academia and social hierarchies

“To truly make an apple pie from scratch, you must first invent the universe”—a quote attributed to Carl Sagan (Cliff, 2021)—illustrates the idea that the process by which individuals create is contingent upon the elements on which that creation is based. Whether creating a new piece of music, going about daily routines, or engaging in scientific research, people’s actions are founded in the information, experiences, and relationships that they have established by themselves and through others (McAndrew and Everett, 2014; Muller and Peres, 2019; Hazarie et al., 2020; Chen et al., 2022b). Each person has their own basis of knowledge that stems from their own lived experiences while also existing in a network of relationships through which they share experiences and knowledge with each other, thereby informing a collective understanding among a network of connected individuals (Nathaniel Rodriguez, 2016). Within such networks, hierarchies can emerge in which some actors exert greater social influence over the network and thus the creative process that it supports, while others may influence only those closest to them or no one at all (Holme and Ghoshal, 2006). This social hierarchy is common in the societal dynamics of government and politics, where some individuals and institutions exert a great degree of influence over the flow of information in the system and opinion formation (Ghoshal and Newman, 2007; Recuero et al., 2019; Dubois and Gaffney, 2014).

Academia is not immune to the emergence of social hierarchies; some academics can function as figures of authority due to the merit and influence of their work and their prominent position in a network of academic collaborations. Citations as an indicator of academic influence (Radicchi et al., 2017) have long been known to be distributed very unequally (Hirsch, 2005b), with a small minority of scientists receiving most citations. Such inequality may be increasing at a global level (Nielsen and Andersen, 2021), at least with respect to citation numbers. In academic publishing, biasing effects like this have been studied under the lens of the Matthew Effect, where success begets more success and early success compounds into a cumulative advantage as the “rich get richer” (Merton, 1968). There are arguments that this effect is beneficial for academia; the rewards of top researchers are proportional to their contributions, which ensures the “epistemic security” of the field (Runco and Pritzker, 2011). This thinking is aligned with the notion that science should operate as a meritocracy; those who contribute the most are also valued the most and will therefore be most influential. Indeed, there is a high degree of trust in our most successful academics and the value of their mentorship. For instance, junior researchers collaborating with top scientists at the early stages of their career are likely to become top-cited scientists themselves, especially those at less prestigious universities (Li et al., 2019). Inexperienced academics can benefit from apprenticeships with top scientists; the “chaperoning” of early-career scientists leads to higher rates of publication in high-impact journals (Sekara et al., 2018). These relationships are frequently mutually beneficial. Less visible authors benefit from more opportunities to publish papers in high-quality journals that attract larger audiences, whereas top scientists gain collaborators with unique skills to produce more high-quality work (Xie et al., 2022). Close collaboration of less visible academics with those in the upper echelons can furthermore create opportunities for a first-mover advantage, inducing a positive feedback loop and early bandwagoning of innovative ideas (Abrahamson and Rosenkopf, 1997).

Effects of superstars in academic publishing

While top academics (sometimes referred to as “superstars”) may make consistent and high-impact contributions that benefit their field and collaborators, their status as superstars may also have deleterious effects due to the subsequent concentration of resources and attention. For instance, it has been shown that the collaborators of academic superstars experience a 5–9% drop in publication rates after the sudden death of that superstar (Azoulay et al., 2010), highlighting their dependence on the superstar’s collaboration. Additionally, after the death of a superstar, there is a surge of impactful works published within the superstar’s subfield by non-collaborators who draw from other domains (Azoulay et al., 2019), implying that the death of a superstar provides an opportunity for academia to evolve in new directions. In fact, it is unclear whether collaborating with superstars truly fosters independent career development (Clauset et al., 2022; Janosov et al., 2020). Furthermore, superstars can induce a high degree of inequality in the distribution of research funding due to a funding Matthew effect. Those who receive funding accumulate twice as much research funding afterwards compared to those who submitted similarly valued proposals but found themselves, by chance, just below the funding threshold. There is no evidence that this accumulation of research funding is due to actual achievements enabled by previous funding (Bol et al., 2018; Petersen et al., 2011). If successful collaborations with superstars lead to early funding success, this can induce a superstar-fueled funding cycle that increasingly widens the gap between scientific haves and have-nots.

The topology, structure, and characteristics of scientific collaboration networks may play an important role in these effects since they shape both the production and dissemination of ideas, potentially with conflicting outcomes. Tightly connected networks could be more efficient in distributing and leveraging knowledge, thereby yielding higher productivity, but may at the same time lead to a decline of diversity, reducing exploration and discovery (Lazer and Friedman, 2007; Rodan and Galunic, 2004; Chang and Joseph E. Harrington, 2005). Although some spillover effects may occur, i.e. collaborators of highly-acclaimed authors benefit by proxy (Trapido, 2015), it is not clear whether the concentration of attention and resources towards superstars yields more novel and innovative research. This is a particularly relevant issue with the rise of interdisciplinary research, which relies on the ability of scientists to collaborate in equitable teams that foster creativity and innovation across various research fields (Xu and Evans, 2022).

To investigate the effects of superstar influence on academic productivity, impact, and innovation, we perform a comprehensive analysis of over 650,000 articles in the American Physical Society corpus augmented by OpenAlex (Priem et al., 2022). Following Azoulay et al., 2010, we define superstars as academics who are among the top 1% in terms of their h-index (Hirsch, 2007, 2005a). We extract the semantic content of the more than 547,000 available abstracts, defining a number of information-theoretic measures to quantify the novelty and innovation of each paper, and supplement this with an analysis of publication and citation rates. Academic output is compared between followers of superstars—researchers who collaborate with superstars or frequently cite their papers—and those with little-to-no connection to such superstars. We find that at the individual level, collaborators and frequent citers of superstars publish more, garner higher citations, and produce papers with moderately more diverse content compared to other academics. However, their work is less innovative and disruptive than the rest of the corpus. Furthermore, their publication content is markedly more redundant than the rest of the corpus; that is, they tend to publish mostly on similar topics. Indeed, once one excludes papers co-authored with superstars, they underperform the rest of the academic corpus as measured by these metrics. On the other hand, the superstars themselves outperform the rest of the corpus.

To examine the collaboration effects of superstars on academics, we compare authors who have multiple collaborations with superstars in the beginning of their career (defined as “early collaborators”) to those who do not collaborate with superstars early in their career (“early non-collaborators”). Given that the rate of publications is tightly connected with the number of citations, a proxy for visibility, these groups are compared using coarsened exact matching based on authors’ first publication years and publication rates. While early collaborators are more innovative and have a much higher citation rate throughout their careers, once one removes the superstar collaborations, these early collaborators are less innovative and disruptive. Despite producing less novel independent work, early collaborators experience a citation boost by virtue of their collaborations. We end with a discussion of our results and their consequences.

Data

We use the American Physical Society (APS) corpus (American Physical Society, 2021) that contains articles published in APS journals from 1893 to 2021. The data set contains full citation data of 668,334 DOIs within APS, i.e. the citations pointing from the references of one article to another, allowing a reconstruction of the full citation network among all articles. Each record includes article-specific fields such as DOI, journal, volume, issue, first page and last page or article id and number of pages, title, authors, affiliations, publication history, PACS codes, table of contents heading, article type, and copyright information. We utilize OpenAlex (Priem et al., 2022), an open-source database of over 250 million scholarly works, cataloging research articles, books, dissertations, and other scholarly output. Work-specific data such as titles, abstracts, authors, journals, affiliations, references, and publication year are also maintained within OpenAlex. OpenAlex contains abstracts for many of its papers as well as authors and their associated OpenAlex IDs. The OpenAlex author disambiguation protocol has been updated as recently as June 2023 to integrate with ORCIDs and better handle author splitting (OpenAlex, 2025). We recognize the ongoing development of author disambiguation protocols (Caron and van Eck, 2014), note that OpenAlex author IDs have been applied in previous well-cited works (Ruffo et al., 2023; Bond et al., 2024), and acknowledge their limitations while applying the most up-to-date compatible disambiguation for our study. To control for papers with high numbers of co-authors, we plot a cumulative density function for all papers, indicating that 98.6% of papers have fewer than 25 authors. We remove all papers with more than 25 authors from our analysis (Fig. S1). To analyze abstract content, we use the procedure in (El-Kishky et al., 2014) to determine significant phrases within documents. Stop words in the corpus were removed, all words were lemmatized, low-frequency words that appeared fewer than 300 times in the corpus were discarded, and phrases were retained based on a significance score (El-Kishky et al., 2014) that tests whether a phrase occurs more often than expected by random chance. After removing articles whose abstracts contained too few phrases, consisted primarily of website metadata, or lacked abstract data altogether, 547,384 articles out of the 658,981 documents considered were used for the analysis of abstract texts. A flow diagram of data pre-processing is outlined in Fig. 1. The distributions of citations and h-index are both heavy-tailed (Fig. S2), with the average number of citations being 13.15 and the average h-index 2.56.
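
As a concrete illustration of the token-level filtering described above, the following sketch applies stop-word removal and the 300-occurrence corpus frequency threshold to a list of abstracts. It is a minimal sketch only: lemmatization and the significant-phrase mining of El-Kishky et al. (2014) are omitted, and the stop-word set shown is a small illustrative placeholder rather than the full list used in the study.

```python
# Minimal sketch of the abstract token filtering (illustrative only; the
# lemmatization and significant-phrase steps described in the text are omitted).
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "in", "we", "is", "to", "for"}  # placeholder subset
MIN_COUNT = 300  # corpus-wide frequency threshold quoted in the text


def tokenize(abstract: str) -> list[str]:
    """Lowercase, strip basic punctuation, and drop stop words."""
    tokens = (t.strip(".,;:()[]") for t in abstract.lower().split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def filter_corpus(abstracts: list[str]) -> list[list[str]]:
    """Remove tokens appearing fewer than MIN_COUNT times across the whole corpus."""
    token_lists = [tokenize(a) for a in abstracts]
    counts = Counter(t for tokens in token_lists for t in tokens)
    return [[t for t in tokens if counts[t] >= MIN_COUNT] for tokens in token_lists]
```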

Fig. 1: Schematic of data processing and author-group selection.
figure 1

TOP ROW: LEFT Data Processing outlining the synthesis of the APS Citation Graph and the OpenAlex metadata, resulting in 658,981 documents with fewer than 25 authors. RIGHT Outline of pre-processing of paper abstracts, yielding 547,384 papers with abstracts post-processing. MIDDLE ROW Process of defining author-dependence groups relative to focal superstars. Authors are sorted by dependence on a superstar via the proportion of citations \({r}_{{dep}}\) to that superstar, then binned into groups based on dependence. The novelty of all inspired papers per group is analyzed on a yearly basis. BOTTOM ROW Diagram outlining the coarsened exact matching process of identifying appropriate early collaborator and early non-collaborator groups for comparison.

We collect all papers from 1929 to 2021; due to the nature of establishing an appropriate benchmark of information for our novelty measures, we consider authors who first publish in or after 1970 for analysis. There are many ways to define top scientists. For our study, we define superstars as those with the top 1% of h-index in the corpus (Hirsch, 2005a) for its simplicity and its correlation with indicators of academic success such as Nobel Prizes, fellowship acceptances, and holding positions at prestigious institutions (Bornmann and Daniel, 2007). Superstars within the APS corpus then correspond to an h-index threshold of 18. This yields 4164 superstars among 383,346 authors. The summary statistics can be found in Table S1. For purposes of robustness, we also consider those in the top 0.5% and 0.1% of h-indices.
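
For concreteness, a minimal sketch of the superstar selection is shown below, assuming a hypothetical mapping `citations_per_author` from author IDs to the citation counts of their papers; the top-1% cutoff and the resulting h-index threshold of 18 are those quoted above.

```python
# Sketch of the superstar definition: h-index per author, then the top 1%.
import numpy as np


def h_index(citation_counts: list[int]) -> int:
    """Largest h such that the author has h papers with at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def find_superstars(citations_per_author: dict[str, list[int]], top_frac: float = 0.01) -> set[str]:
    """Return the authors whose h-index lies in the top `top_frac` of the corpus."""
    h = {author: h_index(c) for author, c in citations_per_author.items()}
    threshold = np.quantile(list(h.values()), 1.0 - top_frac)  # ~18 for the APS corpus
    return {author for author, value in h.items() if value >= threshold}
```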

OpenAlex indexes documents with a list of concepts present based on the title, abstract, and host venue using an automated classifier trained on the Microsoft Academic Graph (MAG) corpus. Concepts are tagged at levels of abstraction ranging up to level-5, the lowest (most specific) level of abstraction. We utilize level-2 concepts, of which there are 159,323 unique tags. Of these unique tags, 8155 level-2 concepts appear in documents in the APS corpus, and 8223 documents do not have level-2 concepts, amounting to 1.25% of the corpus.

Metrics of novelty

Novelty detection in the literature has been implemented in a variety of ways (Ouafae et al., 2020), such as contextualizing novelty in machine learning as information retrieval (Soboroff and Harman, 2003; Ghosal et al., 2022), distant combinations of ideas via citation relations (Uzzi et al., 2013), first-pass combinations of concepts never before connected (Schumpeter, 1934), knowledge-graphs of concepts within social networks (Rodan and Galunic, 2004), and agent-based simulations of social and individual learning (Chang and Joseph E. Harrington, 2005).

Here we employ several measures that utilize the content of abstracts as well as distributions of paper citations. From the abstracts we can characterize documents in terms of a hierarchical classification of concepts to examine the diversity of concepts within a paper. Additionally, one can check the extent to which papers combine concepts in novel ways to determine their degree of innovation. We supplement our abstract-level novelty metrics by utilizing the citation network to measure the extent to which a paper disrupts the field. Uncovering the connections between authors, their published content, and their citations can elucidate the influence that superstars have on the career success and novelty produced by other academics.

Entropy

We utilize OpenAlex hierarchical concepts to define a vector of concepts for each paper. A document \(u\) can be described as a vector of concepts \({{\boldsymbol{V}}}^{u}\) where the \(i\)th element is the score the classifier assigns to the \(i\)th concept. The vector is then l1-normalized such that \({{\boldsymbol{v}}}_{i}^{u}\equiv {{\boldsymbol{V}}}_{i}^{u}/{\sum }_{i}{{\boldsymbol{V}}}_{i}^{u}\) yielding a distribution over concepts. For a given document \(u\), we define the Shannon entropy as

$${I}_{u}^{({\rm{S}})}=-\mathop{\sum}\limits_{i}{v}_{i}^{u}\,{\mathrm{ln}}\,{v}_{i}^{u}$$
(1)

The expression quantifies the average level of “surprise” or uncertainty over the outcomes of a random variable (Cover and Thomas, 2006). In this context, papers whose abstracts focus on a limited number of topics will yield low values of \({I}_{u}^{(S)}\), whereas those with a wide diversity of topics will yield a larger value of the entropy.
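
A minimal sketch of Eq. (1) is given below, assuming `scores` holds the raw classifier scores of the concepts tagged on a single paper.

```python
# Shannon entropy of a paper's l1-normalized concept scores (Eq. (1)).
import numpy as np


def shannon_entropy(scores: np.ndarray) -> float:
    v = scores[scores > 0].astype(float)   # keep concepts with non-zero score
    v /= v.sum()                           # l1-normalize to a distribution over concepts
    return float(-(v * np.log(v)).sum())   # I_u^(S)
```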

Innovation

Shannon Entropy captures the topic diversity of the paper itself but does not express the degree to which the paper expanded the literature through innovation. In other words, it does not capture particular new ideas being introduced or established ideas that are being connected for the first time. We therefore define Innovation as the degree to which the document adds words and phrases in new combinations to the literature (Kuhn, 1962; Hofstra et al., 2020). Specifically, innovation in this context is a measure of when terms were first introduced or first combined in the corpus (Sec. S1.2). Coupled with the entropy, this allows us to track how the diversity of ideas correlates with new conceptual recombinations and co-occurrences of terms. Following this logic, we define the Innovativeness of paper \(u\) as

$${I}_{u}^{(I)}=\frac{1}{2}\mathop{\sum}\limits_{{w}_{1}\ne {w}_{2}\in u}{\mathcal{J}}({w}_{1},{w}_{2};u)$$
(2)

where w1 and w2 are distinct terms in paper \(u\), \({\mathcal{J}}\)(w1, w2; u) is an indicator function that is 1 if terms w1 and w2 are first seen together within the corpus in paper u and 0 otherwise, and the prefactor of 1/2 accounts for double counting. To remove spurious conceptual links—pairs of phrases co-occurring in documents due to random chance—we calculate a point-wise mutual information (PMI) score for all links as the log ratio of the co-occurrence probability over the product of the individual probabilities of each concept (Hofstra et al., 2020), and consider only conceptual links above a PMI threshold, such that links are at least 5 times more likely to occur together than by random chance. An example of the identification of a conceptual link between two terms can be found in Fig. S3.
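
The sketch below illustrates how Eq. (2) and the PMI filter can be computed, assuming a chronologically ordered list `papers` of (paper id, set of terms) pairs; probabilities are estimated as document frequencies over the whole corpus, and the exact estimator used in the study may differ.

```python
# Sketch of Eq. (2): count term pairs whose first corpus-wide co-occurrence is in
# paper u, keeping only pairs with PMI >= ln(5) (five times more likely than chance).
import math
from collections import Counter
from itertools import combinations


def innovation_scores(papers, pmi_threshold=math.log(5)):
    n_docs = len(papers)
    term_df, pair_df = Counter(), Counter()            # document frequencies
    for _, terms in papers:
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))

    def pmi(pair):
        a, b = tuple(pair)
        p_ab = pair_df[pair] / n_docs
        return math.log(p_ab / ((term_df[a] / n_docs) * (term_df[b] / n_docs)))

    seen, scores = set(), {}
    for paper_id, terms in papers:                      # chronological order
        pairs = {frozenset(p) for p in combinations(sorted(terms), 2)}
        new_links = [p for p in pairs if p not in seen and pmi(p) >= pmi_threshold]
        seen |= pairs
        scores[paper_id] = len(new_links)               # I_u^(I); unordered pairs absorb the 1/2
    return scores
```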

Disruption index

There are previously established metrics that quantify novelty and impact based on citation distributions. We complement our abstract-based measures with the disruption index CDt (Funk and Owen-Smith, 2016) that looks at the consolidating or destabilizing nature of papers based on citations of a given focal paper and its predecessors within t years after publication (here we choose t = 10). For a focal paper u with references b and a set of n future papers i that cite u, b, or both within t years, the disruption index is given as

$${I}_{u}^{(C{D}_{t})}=\frac{1}{{n}_{t}}\mathop{\sum }\limits_{i=1}^{{n}_{t}}\left(-2{u}_{it}{b}_{it}+{u}_{it}\right)$$
(3)

where \({u}_{{it}}\) is 1 if paper \(i\) cites the focal paper and 0 otherwise, and \({b}_{{it}}\) is 1 if paper \(i\) cites a reference of the focal paper and 0 otherwise. This means that \({I}_{u}^{(C{D}_{t})}=1\) implies maximum disruption, where citing papers of a focal paper only cite the focal paper and none of its predecessors. This can be thought of as activating new science, where the preceding work becomes irrelevant to the future work. On the other hand, \({I}_{u}^{(C{D}_{t})}=-1\) is the case where all citations of a focal paper also cite at least one predecessor, implying maximal consolidation and building upon previous work. From this one can define an average Disruption Index by averaging over all papers. Due to the requirement of citation data to compute the disruption index, we do not consider the disruption index of any papers published after 2010.
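
A minimal sketch of Eq. (3) follows, assuming the two citing sets (papers citing the focal paper and papers citing its references, both restricted to the t-year window) have already been extracted from the citation network.

```python
# Disruption index CD_t of a focal paper (Eq. (3)).
def disruption_index(citers_of_focal: set, citers_of_refs: set) -> float:
    citing_any = citers_of_focal | citers_of_refs   # the n_t papers entering the sum
    if not citing_any:
        return 0.0                                  # convention when no citing papers exist (assumption)
    total = 0
    for paper in citing_any:
        u_it = 1 if paper in citers_of_focal else 0
        b_it = 1 if paper in citers_of_refs else 0
        total += -2 * u_it * b_it + u_it
    return total / len(citing_any)
```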

In Fig. S4 we plot histograms of the novelty measures accumulated across all papers for all metrics. The Shannon Entropy distribution is unimodal with peaks near \(\log (n)\) where \(n\) is the number of concepts that are identified by the classifier for a given paper. This is a result of the classification, which classifies documents with a small number of concepts and assigns all other concepts for each document a score of 0. The Shannon Entropy is therefore highly representative of the number of concepts present, while variations in the distribution arise from differences in the scoring of concepts. Innovation qualitatively follows a power-law distribution, while the disruption index is peaked around zero, with a few papers at the extreme ends of the distribution, a trend similar to what has been reported in previous studies (Bornmann and Tekles, 2019; Wu et al., 2019; Osório and Bornmann, 2020; Park et al., 2023). Ranking \({I}_{u}^{(C{D}_{t})}\) for the papers of individual authors, only between 1 and 5 papers per author have values of \({I}_{u}^{(C{D}_{t})} > 0\), with the rest of their papers converging around \({I}_{u}^{(C{D}_{t})}=0\) (Fig. S5). Plotting the histograms of the novelty measures now accumulated across authors (Fig. S6), we find similar trends for all metrics, noting that the average disruption index of an author is sharply peaked around zero. Consistent with Fig. S5, most author publications are neither strongly disruptive nor consolidating. Considering that authors iterate on their own work throughout their careers, we take the paper of an author that is maximally disruptive (highest value of \({I}_{u}^{(C{D}_{t})}\)) as an indicator of novel output. In Fig. S7 we plot the Pearson’s \(r\) correlation coefficients between all measures, finding that the innovation, disruption index, and entropy are weakly correlated (\(r < 0.01\) for all pairs of measures), indicating that each measure captures a different aspect of academic output. The independence of these measures is further quantified by controlling for yearly effects, journals, and team sizes (Table S3), finding \({R}^{2} < 0.032\) between all measures.

Finally, in a related context, in the field of creative ideation, it has been reported that inspirees (individuals inspired by another’s work) stimulated by highly creative peers tend to generate more creative ideas (Baten et al., 2020, 2021, 2022). However, as a group, the inspirees' ideas were found to be similar to each other, leading to redundancy in generated ideas over time at the group level. To check whether a similar effect manifests in academic publishing, we compute the cosine similarity score between papers \(u,{u}^{{\prime} }\) represented by their concept vectors \({{\boldsymbol{V}}}^{u}\) and \({{\boldsymbol{V}}}^{{u}^{{\prime} }}\). For a general set of papers \(P\) with total number of papers denoted as \(\left|P\right|\), the average among the similarity scores between all pairs of papers in \(P\) is

$$Sim(P)=\frac{2}{|P|(|P|-1)}\mathop{\sum}\limits_{u,u^{\prime} \in P}\cos ({V}^{u},{V}^{u^{\prime} })$$
(4)

where \(\left|P\right|(\left|P\right|-1)/2\) is the number of distinct pairs of papers in \(P\).
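
A short sketch of Eq. (4) is shown below, assuming `vectors` is an array with one concept vector per paper in the set P.

```python
# Average pairwise cosine similarity within a set of papers (Eq. (4)).
import numpy as np


def average_similarity(vectors: np.ndarray) -> float:
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)  # rows become unit vectors
    cosines = unit @ unit.T                                          # cosine of every pair
    upper = np.triu_indices(len(vectors), k=1)                       # distinct pairs u != u'
    return float(cosines[upper].mean())                              # = 2/(|P|(|P|-1)) * sum
```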

Results

Superstar statistics

We next examine whether the novelty, innovation, and disruption metrics of superstars are significantly different from those of the rest of the academic corpus, considering authors who started publishing in or after 1970 and published at least 10 papers to control for authors with low productivity. In Fig. 2A–C we plot the Shannon entropy (Eq. (1)), Innovation (Eq. (2)), and the maximum value of the Disruption Index (Eq. (3)). In all panels, the orange bar corresponds to the set of superstars, whereas the blue bar refers to the rest of the corpus. In terms of Shannon entropy, superstars outperform the remaining academics by 1.3%. That is, superstars are inspired by a higher diversity of content, publish works that are more conceptually diverse, and inspire a wider array of publications than non-superstars. A starker contrast can be seen in terms of average Innovation, where superstars introduce new terms or combine terms for the first time 83% more often than non-superstars, indicating that the former are more prolific in introducing new combinations of terms. Additionally, the most disruptive papers of superstars are about 93% more disruptive than those of non-superstar academics, meaning they are more likely to create seminal contributions.

Fig. 2: Average author-level statistics of novelty and innovation.
figure 2

Results are partitioned between superstars denoted by orange bars (h-index \(\ge\) 18) and all other authors with at least 10 publications denoted by blue bars. Each panel represents the average author-level novelty scores: A Shannon Entropy, B Innovation, C Maximum Disruption Index.

We note that there is a monotonic dependence of the innovation on the number of publications for all academics, although the effect is more pronounced for superstars (Fig. S8). Authors with few publications show a relatively higher yet decreasing trend in Shannon Entropy with publications, but after roughly 25 publications show a split, though not a statistically significant one, where superstar-authored papers tend to draw from multiple diverse concepts more so than those not authored by superstars. There is the notable exception of the average and median Disruption Index, which instead decreases and saturates as one publishes more papers. This is a reflection of the fact that the majority of publications of authors are not particularly disruptive—most academic output tends to build upon established ideas—and the signal of their most disruptive paper is masked when averaging over the rest of their papers. On the other hand, the maximum value of \({I}_{u}^{(C{D}_{t})}\) shows a monotonic dependence on the number of publications, indicating the increased statistical likelihood of producing a seminal piece of work the more one publishes.

We also find a monotonic dependence of the citations received by a paper \(u\) on its innovation, indicating that the ability to connect concepts is rewarded in terms of more attention paid to that paper (Fig. S9). The Disruption Index, on the other hand, shows a non-monotonic trend in terms of impact, as seen in previous studies (Wu et al., 2024; Pelletier, 2023). As papers tend to consolidate more ideas (decrease in disruption), this is rewarded with more citations, until the trend reverses and the more disruptive (or seminal) papers get the highest number of citations. The Shannon Entropy is on average flat for papers with up to about 20 citations, before showing a slight increase for papers with up to nearly 200 citations, indicating a positive relationship between conceptual diversity and impact for moderately-cited papers.

To examine confounding factors, we examine author-level novelty within different APS journals using OpenAlex journal-id metadata. We find 19 journals within APS. A total of 7 journals (Physical Review, Physical Review A, B, C, D, and E, and Physical Review Letters) containing 637,641 documents are considered for partitioning, while the 21,340 documents spread among the remaining 12 journals are not considered due to low document counts. We also note that “Physical Review” stopped publishing in 1970 when the journal was partitioned into its successors “Physical Review A–E”, so “Physical Review” is not considered in the partition. A table of publication counts and active years of each journal can be found in Table S2. We compare superstars and non-superstars among different journals (Fig. S10) and find that innovation and the max disruption index are consistently higher for superstars, while superstar Shannon entropy is either higher than or not significantly different from that of non-superstars. We also note that Physical Review C has very low innovation potential, with authors averaging an innovation score of 0.015, negligible compared to the one to two orders of magnitude higher innovation present in other journals.

A confounding factor in the reported results is the possibility of a first-mover advantage. Authors with higher h-indices are likely to be senior scientists with a large publication output who have had richer opportunities to publish more diverse content or foster more innovative ideas than equally capable scientists who have the disadvantage of entering the corpus at a later time. Controlling for year-dependent effects, the author-level novelty of superstars and ordinary academics is compared by the year of their first publication. In the upper panel of Fig. S11 we plot the temporal evolution of the Author-level Innovation and Shannon Entropy as well as the maximum and average Disruption Index, comparing the superstar set to their matched controls. In all cases we find that the superstar set clearly outperforms non-superstars, indicating that the putative first-mover advantage is not a confounding factor in the analysis. The trend is however reversed for the average disruption index, which is to be expected given that superstars publish more than their peers, which leads to a saturation in the average disruption index as seen in Fig. S8. In the left and center columns we show the results for a more restrictive set of superstars (the top 0.1% and 0.5%). While the relative difference between the two groups is marginally diminished, the overall trends are robust to the size of the superstar set.

Superstar influence

Having established that superstars outperform other academics in terms of our metrics, we next determine to what degree superstars affect the academic output of their collaborators and their "inspirees" (those inspired by their work). Inspirees are authors who cite a superstar’s papers; we quantify an author’s degree of inspiration by the proportion of their papers that cite the superstar. We examine inspirees both at the group- and individual-levels.

Group-level

At the group-level, we center each superstar in a network of inspirees where the degree of inspiration is measured by the superstar dependence ratio \({r}_{{dep}}\), the number of a researcher’s papers citing said superstar divided by the researcher’s total publication count. We then partition the inspirees into groups based on their degree of dependence, where the upper bounds for the bins are the 35th, 60th, 75th, 90th, and 100th percentiles of inspirees. These groups represent increasingly dependent ties to a given superstar; those in the 35th percentile are the least actively inspired, while authors in the 100th percentile cite the superstar most frequently among all of their publications. Note that some inspirees in the lowest-percentile group of one superstar may be in the top group of another superstar, but the publications considered here are only those that cite the given superstar. The increasing bin sizes are chosen to account for the decreasing frequency of inspired authors as the inspiration threshold increases, such that there is a sufficient number of papers compared between groups. To control strictly for inspiration and not collaborative effects, we do not consider papers co-authored with superstars. That is, the analysis is conducted only on the academic output produced by the authors themselves without the benefits accrued by superstar collaboration.
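
As an illustration, the dependence ratio and the percentile binning can be sketched as below, with `n_citing_superstar` and `n_papers` as hypothetical per-author arrays; the percentile edges are those listed above.

```python
# Sketch of the superstar-dependence grouping of inspirees.
import numpy as np


def dependence_ratio(n_citing_superstar: np.ndarray, n_papers: np.ndarray) -> np.ndarray:
    """r_dep: fraction of an author's papers that cite the focal superstar."""
    return n_citing_superstar / n_papers


def dependence_groups(r_dep: np.ndarray, edges=(35, 60, 75, 90, 100)) -> np.ndarray:
    """Assign each inspiree to a percentile bin; 0 = least dependent, 4 = most dependent."""
    cuts = np.percentile(r_dep, edges)
    return np.searchsorted(cuts, r_dep, side="left")
```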

Given that we are interested in the temporal evolution of superstar influence on the novelty of the inspirees, we denote the year of the first superstar publication as \({t}_{0}^{(S)}\) and for every subsequent year \(t > {t}_{0}^{(S)}\), we consider the set of publications by the inspirees who cite the superstar. For each partitioned group, we calculate the average Shannon entropy, average innovation, and maximum disruption index among all of the publications \({P}_{{G}_{S}}\) of group \(G\) inspired by superstar \(S\) during year \(t\). Denoting the set of papers inspired by superstar \(S\) for group partition \(G\) at year \(t\) as \({P}_{{G}_{S}}(t)\), the average novelty scores are computed as

$${\overline{{I}_{u}^{(l)}}}_{G,S,t}=\frac{1}{\left|{P}_{{G}_{S}}(t)\right|}\mathop{\sum}\limits_{{u\in P}_{{G}_{S}\left(t\right)}}{I}_{u}^{\left(l\right)}$$
(5)

where \(l\in \{S,I,C{D}_{t}\}\) indexes the novelty or innovation score of paper \(u\), and the max disruption index over all papers in \({P}_{{G}_{S}}(t)\) is

$$\max {\left({I}_{u}^{(C{D}_{10})}\right)}_{G,S,t}=\mathop{\max }\limits_{u\in {P}_{{G}_{S}}(t)}{I}_{u}^{(C{D}_{10})}$$
(6)

This grouping procedure is outlined in the second row of Fig. 1, where \({Nov}(\{{P}_{{G}_{S}}(t)\})\) denotes the generalized group-level statistics specified in Eqs. 5 and 6.
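
A compact tabular sketch of Eqs. (5) and (6) is shown below, assuming a hypothetical DataFrame `df` with one row per inspired paper and columns `superstar`, `group`, `year` (i.e. \(t-{t}_{0}^{(S)}\)), `entropy`, `innovation`, and `cd10`.

```python
# Group-level yearly novelty statistics per superstar (Eqs. (5)-(6)).
import pandas as pd


def group_novelty(df: pd.DataFrame) -> pd.DataFrame:
    return (
        df.groupby(["superstar", "group", "year"])
          .agg(mean_entropy=("entropy", "mean"),        # Eq. (5) with l = S
               mean_innovation=("innovation", "mean"),  # Eq. (5) with l = I
               max_cd10=("cd10", "max"))                # Eq. (6)
          .reset_index()
    )
```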

We plot the results of our analysis in Fig. 3. Group-level Shannon Entropy and Innovation show similar results, where each group, both in time and aggregated over time, shows no clear difference in novelty scores. We aggregate all papers PGS (t) over all t, compute each group-level novelty statistic over this set, and average over all superstars. At the aggregate level, there appear to be statistical differences, but the range of differences between groups is small with no clear trend between groups (\(\approx\)0.01 for Shannon Entropy, \(\approx\)0.05 for Innovation). The decrease of Innovation in time for all groups mirrors the corpus-wide temporal trend of Innovation (Fig. S12), reflecting a saturation in the number of new combinations of terms that are combined by authors as their careers progress. In terms of the group-level max disruption index, there are significant differences between dependence groups, with the least-dependent group showing the highest disruption index among their inspired papers both yearly (Fig. 3E) and averaged over time (Fig. 3F). The higher the dependence, the lower the maximum disruption index, aside from the 25–40 percentile group which shows the lowest maximal disruption. In addition to the maximum disruption index over yearly inspiree-group papers, we also consider the average group-disruption index in Fig. S13A, B as computed by Eq. 5. In both the max and average cases, the groups most inspired by superstars produce on average 30% less disruptive papers than the least inspired group. The temporally averaged value shown in Fig. 3F shows a monotonic decrease in disruption as the dependence on superstars via citations increases. That is, the more that groups of academics are inspired by a superstar, the less disruptive their work is. We note that we introduce the average disruption index in this calculation because the statistics are computed over a set of multiple authors’ papers and not a single author’s work. The statistics over these group papers measure the novelty of inspired papers, which is not an author-level measure, and should be interpreted as such. The average disruption index among group papers (Fig. S13A, B) shows a monotonically decreasing trend where highly inspired authors are 30% less disruptive than those least inspired by superstars. Finally, citations to papers experience an initial boost and then decrease in time as seen in Fig. 3G, with clear differences between the groups. Indeed, there is roughly a 10% difference in citations per paper between the most and least inspired groups as seen in Fig. 3H.

Fig. 3: Novelty and Innovation statistics at the group level: temporal trajectories of superstar-dependence group-novelty statistics.
figure 3

A, B Temporal and Aggregate-level Shannon Entropy, C, D Temporal and Aggregate-level Innovation, E, F Temporal and Aggregate-level Maximum Disruption Index, G Citations per paper vs year after inspired-author publication, H Citations per paper aggregated over time. \(t-{t}_{0}^{(S)}\) represents the years following a focal superstar’s first publication. Curves indicate averages, shaded area 95% confidence interval.

In terms of redundancy, in Fig. S13C, D, we plot the cosine similarity (Eq. 4). As the figure indicates, across all groups there is a decreasing trend in the temporal evolution of the similarity, yet a clear difference exists, whereby papers published by the top percentile-group of inspirees are on average 36% more similar to each other in terms of content when compared to the bottom percentile-group. Taken together, the results indicate that groups of authors who frequently cite superstar papers do get a citation boost as compared to other sets of authors. However, their output is no more innovative or novel, while being less disruptive as compared to the rest of the corpus. Rather, their content is more redundant than that of the remaining sets of authors. The citation boost can be explained by the fact that superstars, by virtue of being the most cited academics, influence the field in the sense that the topics they work on are highly visible to the rest of the community. It stands to reason that academics who cite superstars frequently tend to work on the same topics as the superstars. Given the high visibility of such topics, they also tend to be more cited. There is however no evidence that the quality of that work (as measured by the metrics studied here) is any different from the rest of the corpus.

Author-level

Next, we disaggregate the group-level results and examine the degree of superstar influence at the individual author level. In Fig. 4 we plot the averages of the novelty as well as citations and publication counts across authors as a function of the fraction of their papers that cite superstars. Given that many authors co-publish with superstars, the blue curve indicates the results when including such papers, while the orange curve shows the results excluding these papers. Figure 4A indicates similar results to the group statistics in terms of Shannon Entropy, in that there is no clear trend in Shannon Entropy with respect to author-level inspiration. On the other hand, when looking at Innovation (Fig. 4B), the trend is increasing for low superstar inspiration, while for highly inspired authors it remains flat when including all papers and decreases when co-authored publications are excluded. Statistics of the difference between author novelty with and without superstar co-authored papers can be found in Table S4; differences in Shannon Entropy are small and insignificant for all degrees of inspiration. Indeed, it appears that the more authors cite superstars, the less innovative their own publications (i.e., those not co-authored with a superstar) become. A significant difference in innovation potential occurs for degrees of inspiration \(\ge 0.7\), where highly inspired authors suffer a 14% innovation deficit when publishing independently of superstars, with an effect size (Cohen’s \(d\)) as high as 0.08.
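
For reference, the effect sizes quoted here and below can be computed as a standard Cohen’s \(d\) with a pooled standard deviation, as in the sketch below (an assumption on our part; the supplementary material specifies the exact estimator).

```python
# Cohen's d effect size between two samples, using the pooled standard deviation.
import numpy as np


def cohens_d(x: np.ndarray, y: np.ndarray) -> float:
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return float((x.mean() - y.mean()) / np.sqrt(pooled_var))
```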

Fig. 4: Novelty and Innovation statistics at the individual author-level.
figure 4

(LEFT) Average novelty scores per degree of inspiration: A Shannon Entropy, B Innovation, C Max-Disruption Index. (RIGHT) Average citation-based metrics: D Average citation count, E Average publication count. Curves indicate averages, shaded area 95% confidence interval.

A similar effect is seen for the maximum Disruption Index (Fig. 4C). The more academics cite a superstar, the less likely they are to produce a highly disruptive paper, with minor differences whether one includes collaborations or not, until we reach those sets of authors who tend to cite superstar publications exclusively. For these authors it appears that their disruptive papers are produced in collaboration with superstars. This is confirmed in Table S4, where the max disruption index of the most highly inspired authors is 0.071 lower when not collaborating with superstars, with a Cohen’s \(d\) of 0.155. This collaboration benefit is most apparent when looking at citations (Fig. 4D) and the number of publications (Fig. 4E). For the former, when including collaborations there is a dramatic benefit in terms of garnered citations (approximately 250% more citations on average for the most inspired authors) that drops considerably when excluding collaborations. Indeed, the citation benefit appears to be driven primarily by being collaborators of superstars, who by definition have the largest number of citations to their papers. The same appears to be the case for the latter, with the number of publications increasing when including collaborations and decreasing when they are excluded.

To examine whether there are confounding variables with respect to team size and journal subject, author publications are partitioned, similarly to the examination in Fig. S10, with respect to journal, team size (with sizes \([1,3],[4,6],\) and \([7,25)\)), and reference count (with sizes \([1,6],[7,15],\) and \(\ge16\)), while maintaining the same degree of inspiration to a superstar before partitioning. These fixed effects were taken into account by fitting an ordinary least squares (OLS) regression model, reported in Tables S5 and S6. We find little significance in the regression coefficient for Shannon Entropy when partitioning on fixed effects. However, for innovation there is a positive relationship between superstar inspiration and innovation potential that is nearly doubled when including superstar influence across journals (\({R}^{2}\in (0.146,0.194)\)) and tripled across team sizes (\({R}^{2}\in (0.096,0.118)\)). There is a strong negative correlation between degree of inspiration and max disruption, with \({R}^{2}\ge 0.478\), indicating authors are less maximally disruptive the more they are inspired by superstars. These trends persist when fixing for effects of the reference count of publications. Taken together, these results find little effect of inspiration on Shannon Entropy, yet a deficit of innovation potential for highly inspired authors absent superstar publications, as well as a decreasing potential for publishing a highly disruptive paper with increasing superstar inspiration. A full listing of coefficients, p-values, and effect sizes for the regressions can be found in Tables S5, S6.
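
The fixed-effects regressions can be sketched with statsmodels’ formula interface as below, assuming a hypothetical DataFrame `df` with columns `innovation`, `inspiration`, `journal`, and `team_size_bin`; the actual model specifications are given in Tables S5 and S6.

```python
# OLS of a novelty measure on superstar inspiration with categorical fixed effects.
import statsmodels.formula.api as smf


def fit_fixed_effects(df, outcome="innovation"):
    model = smf.ols(f"{outcome} ~ inspiration + C(journal) + C(team_size_bin)", data=df)
    result = model.fit()
    # Coefficient and p-value for the degree of inspiration, plus overall R^2.
    return result.params["inspiration"], result.pvalues["inspiration"], result.rsquared
```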

Early collaborators and non-collaborators

The results thus far provide evidence that academics inspired by superstars produce papers that receive visibility via citations, while not necessarily being innovative in the sense of tying together new concepts, or making largely disruptive contributions by way of destabilizing and inspiring new future science. On the other hand, there is also evidence that these features are significantly boosted by direct collaboration with superstars, and that when left to their own devices these authors’ publication output, citation impact, innovation, and maximum disruption are stifled and sometimes on average lower than those of most of the authors in the corpus. This raises the question of whether superstars foster or stifle independent individual success and original work. For instance, as shown, at the aggregate level, the group of authors that cite superstars the most often tend to publish on mostly the same topics.

To further probe this, we shift from inspirees to early-career collaborators of superstars. Given that findings from prior studies have shown that collaboration with successful scientists provides a boost for early career researchers (Li et al., 2019), and that early success generates a cumulative advantage of long-term career success (Merton, 1968), we define early collaborators as those authors who collaborate with superstars in at least half of their papers in the first five years of their career. As a control comparison, we define a group of early non-collaborators who do not collaborate with superstars in the first 5 years of their careers, but may collaborate with superstars afterwards. To ensure proper comparison between these groups, we recognize that visibility (in terms of citations) is tightly connected to the publication rate (Fig. S14) and that novelty scores are not constant with publication year (Fig. S12). To account for these confounding variables, we perform a coarsened exact matching procedure to randomly select early collaborators and early non-collaborators with the same publication rates in their first 10 years of publishing \({N}_{{P}_{10}}\), as well as the same start year of publishing \({t}_{0}^{(C)}\), where \(C\) indicates that we are considering authors in collaborator groups. Start years are binned into 5-year intervals beginning at 1970. This coarsened exact matching procedure yields two equal-sized groups of early collaborators and non-collaborators with similar start years and publication rates. Superstars are not excluded from the early collaborator and non-collaborator sets, and their collaborator status is determined by collaboration with other superstars. A flow chart of this selection procedure is depicted in the bottom row of Fig. 1. This yields 16,924 early collaborators and 17,194 early non-collaborators, and coarsened exact matching yields 9366 authors in each group.
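
A minimal sketch of this matching step is shown below, assuming a hypothetical DataFrame `authors` with columns `group` ('EC' for early collaborators, 'NEC' for early non-collaborators), `t0` (first publication year), and `n_pub10` (publications in the first 10 years); the exact matching implementation used in the study may differ.

```python
# Coarsened exact matching on 5-year start-year bins and 10-year publication counts.
import pandas as pd


def coarsened_exact_match(authors: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    df = authors.copy()
    df["t0_bin"] = (df["t0"] - 1970) // 5                  # coarsen start year into 5-year bins
    matched = []
    for _, cell in df.groupby(["t0_bin", "n_pub10"]):      # exact match on coarsened covariates
        ec = cell[cell["group"] == "EC"]
        nec = cell[cell["group"] == "NEC"]
        k = min(len(ec), len(nec))                         # keep equal numbers from each group
        if k:
            matched.append(ec.sample(k, random_state=seed))
            matched.append(nec.sample(k, random_state=seed))
    return pd.concat(matched, ignore_index=True)
```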

For each group, we track the temporal evolution of the citations per paper, the number of publications, as well as the innovation and disruption index, measured from the date of first publication \({t}_{0}^{(C)}\) for authors in either collaborator group as shown in Fig. 5. The upper panel excludes the collaborations with superstars and the lower panel includes such collaborations. The blue curve refers to the early collaborators and the orange curve to the early non-collaborators. Focusing first on Innovation (Fig. 5A, D), we find that the early collaborators outperform non-collaborators when early collaborators publish alongside a superstar. This benefit seemingly comes from the publications co-authored with the superstar; when excluding these papers, early collaborators are no more innovative than early non-collaborators. Pairwise differences in average novelty and citation output for each binned year \({t}_{0}-{t}_{0}^{(C)}\) are calculated between all pairs of coarsened-exact-matched authors \({A}^{{NEC}}\) and \({A}^{{EC}}\) and are quoted in Table S7. When co-publishing with superstars, early collaborators are consistently more innovative over all years (\(\bar{\Delta }({I}^{(I)})\in (-0.016,-0.327))\) with effect sizes (Cohen’s \(d\)) of at most 0.139. After removing superstar papers, early non-collaborators are more innovative within the first 10 years of their careers (\(\bar{\Delta }({I}^{(I)})\in (0.023,0.033))\) with effect sizes of at most 0.037, indicating superstars are roughly 4 times more effective in explaining the innovation potential of early collaborators compared to their works independent of superstar collaborations. Without this superstar effect, early collaborators are less innovative over their early careers than early non-collaborators.

Fig. 5: Comparing early collaborators to early non-collaborators.
figure 5

Top row excludes all superstar papers; bottom row includes all superstar papers. A, D Temporal evolution of Innovation, B, E Temporal evolution of Maximum Disruption Index, C, F Temporal evolution of citation rates. The horizontal axis indicates the time elapsed from the time of first publication for authors in either group. Curves indicate averages, shaded area 95% confidence interval.

When looking at their most disruptive papers, we find once again that the non-collaborators’ most disruptive papers are more disruptive than those of early collaborators (Fig. 5B, E). That is, even when accounting for superstars, non-collaborators are more likely to produce a seminal paper later in their careers than the collaborators. Once the benefit of superstar collaborations is removed, the difference is stark. The difference in max disruption doubles in favor of early non-collaborators when excluding superstar papers, with a moderate effect (Cohen’s \(d\) \(\in (0.164,0.346)\)) (Table S7). The average disruption index also shows (Fig. S15B, D) that early non-collaborators are more disruptive on average throughout their careers than early collaborators with similar publication rates. On the other hand, when looking at Shannon Entropy (Fig. S15A, C), early collaborators and non-collaborators both publish papers with a similarly diverse spread of topics.

In terms of visibility as measured by citations, when comparing output excluding superstars (Fig. 5C), the citation rates follow similar trends, with a significant advantage for collaborators late in their careers. When including superstar papers (Fig. 5F), the difference in citations between early non-collaborators and early collaborators more than doubles from −3.40 to −7.57 citations per publication (a 123% increase), compared to a small increase (13.9%) from −1.66 to −1.89 when removing superstar papers. The effects are the most pronounced for citations, with large superstar effects (Cohen’s \(d\) \(\in (0.446,0.820)\)) and moderate effects when excluding superstars (Cohen’s \(d\) \(\in (0.222,0.309)\)) (Table S7). Thus, the evidence suggests that over the careers of the two groups of scientists, their publications are comparable in terms of the diversity of topics they publish on. However, the non-collaborators are slightly more innovative in their early careers and are more likely to disrupt the field. Yet, while the two groups’ citation trajectories follow similar trends, with early collaborators outpacing non-collaborators by roughly 50% when not considering the benefit of superstar collaborations, the gap is substantially larger when including those collaborations, which nearly double early collaborators’ citations.

It is reasonable to examine the potential influence of other effects such as journal and team size, especially the latter when examining collaborator effects. We examine the effects of team size, journal, and reference counts (for the disruption index) on the pairwise differences in novelty and citation measures between all coarsened-exact-matched authors. Positive intercepts indicate that early non-collaborators have higher novelty (or citations) over their first 5 years of publishing, and the row \(t-{t}_{0}^{(C)}\) indicates linear coefficients in time. When applying a least-squares regression in time and partitioning author data based on fixed effects, we find (Tables S8, S9) low effect sizes \({R}^{2} < 0.01\), with superstar influence on innovation being 4 times as effective in explaining the differences in innovation. Across journals, early non-collaborators are more innovative in their early careers, aside from Phys. Rev. C and D where the two groups show similar innovation output. Team-size effects show that large teams of early non-collaborators demonstrate the largest positive difference against early collaborators independent of superstar publications, albeit with small effect sizes. When including superstar papers and across all effects, early collaborators are always more innovative in their early careers.

The largest effects were captured in the partitioning by reference counts when analyzing the max disruption index, showing nearly equal max disruption for medium to high reference counts when including superstar papers (\({R}^{2}=0.011\)). When excluding superstar publications, the early non-collaborators are relatively more maximally disruptive, with high reference counts showing slightly higher disruption for early collaborators before equalizing in later career stages (\({R}^{2}=0.022\)). Over all effects, with or without superstar publications, early collaborators sustain a cumulative citation benefit.

These collaborator statistics were computed over the superstars’ entire careers. However, there may be differences in collaborative output at different stages of a superstar’s career, or depending on the type of authorship of the superstar, such as corresponding authorship. We partition superstar papers into three epochs labeled “Early”, “Middle”, and “Late”, where the “Early” group contains the first 1/3 of published papers of a superstar, the “Middle” group contains the middle 1/3 of published papers, and “Late” the final 1/3. Given the partitioning of early collaborators, there are many more early non-collaborators than early collaborators in each epoch. To ensure comparisons between groups are not biased by particular matched samples, we bootstrap the data and examine the trends in sub-sample statistics of each group. The procedure is outlined in Sec. S4.1. We highlight results in innovation (Fig. S16) across all epochs, where collaboration with early-career superstars results in as much as a 75% increase in the innovation potential of early collaborators compared to early non-collaborators. However, when excluding superstar papers, the innovation output is nearly identical between early collaborators and early non-collaborators. For those that collaborate with middle- and late-stage superstars, the early non-collaborators outperform early collaborators in innovation, particularly within the first 10 years of their careers.
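
A minimal sketch of the bootstrap comparison is given below, assuming `ec_scores` and `nec_scores` are per-author innovation scores for the two groups; the full procedure is described in Sec. S4.1.

```python
# Bootstrap distribution of the difference in mean innovation between the groups.
import numpy as np


def bootstrap_mean_diff(ec_scores: np.ndarray, nec_scores: np.ndarray,
                        n_boot: int = 1000, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        ec = rng.choice(ec_scores, size=len(ec_scores), replace=True)
        nec = rng.choice(nec_scores, size=len(nec_scores), replace=True)
        diffs[b] = ec.mean() - nec.mean()   # early collaborators minus early non-collaborators
    return diffs
```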

The type of superstar authorship may also introduce differences in inspired novelty, as corresponding authors may represent principal investigators of research groups. We therefore examine early collaborators of corresponding-author superstars, comparing them both to early non-collaborators and to early collaborators of non-corresponding superstars, i.e. authors who collaborate with superstars early in their careers where the superstars are not corresponding authors. We find (Figs. S17, S18) similar results to Fig. 5 in terms of innovation, max disruption index, and citations per publication. We notice that when partitioning between corresponding and non-corresponding superstars, the citation rates for both early collaborator groups are the same when excluding superstar papers. However, when including superstar papers, early corresponding-superstar collaborators benefit greatly compared to both the non-corresponding group and the early non-collaborator group, indicating a signature of the Matthew Effect. The citation benefit once again exists despite these authors producing work that is no more innovative than that of early non-collaborators.

Conclusion and Discussion

In the exponentially growing knowledge base of academia, visibility and funding are increasingly biased towards top academics and institutions. We have examined here the extent to which top academics (superstars) influence the research community. Using a number of information-theoretic metrics quantifying novelty and innovation, as well as metrics of disruption measured through citation profiles, we quantify this influence by comparing the academic output of those who are greatly inspired by superstars (measured by how often they cite superstar papers) against that of those who are less so. Unlike previous studies that primarily look at scientific collaboration networks and metrics derived from the distribution of citations, here we conduct a semantic analysis of the actual text in papers—using well established canonical measures—to quantify novelty and innovation.

First we demonstrate that across all considered metrics, superstars outperform their peers. They publish more, produce more novel and innovative content, and disrupt the field at greater rates than the rest of the corpus, very much justifying their moniker. We find that other academics who are inspired by superstars also publish at a greater rate and garner more citations than the rest of the corpus. Yet their work is no more novel than that of those who are less inspired, and is in fact less innovative as well as less disruptive; that is, they are less likely to produce a seminal paper. Furthermore, these academics produce content that is markedly more redundant in terms of research topics as compared to other scientists. Academics who frequently cite superstars are more likely to collaborate with them, and after one removes those collaborations, even their comparative advantage in terms of publications and citations vanishes and the quality of their output drops further in terms of innovation and disruption metrics. Potential differences in results arising from publication team sizes and journals were examined, and we found our results robust to these factors.

The evidence suggests that the citation boost experienced by superstar-influenced researchers is driven by two factors. One is direct collaboration with superstars. Since superstars themselves are highly cited by definition, this advantage passes on to their collaborators. The second is that superstars also influence the popularity of fields of research. They tend to work in well-established, high-activity subfields with high visibility. Academics who cite superstars very frequently likely also work in such subfields, and by virtue of activity and visibility they naturally experience more citations than other researchers. Yet, there is no statistical evidence that their work is of any better quality than that of their peers as measured by the metrics considered here. Their lower level of innovation, combined with lower levels of disruption and higher redundancy, is suggestive of iterative work building upon established ideas. While it is true that the marketplace of ideas gets crowded over time, and the ability to connect novel concepts diminishes as the corpus grows, the same disadvantage applies to the superstars. Yet their success as measured by citations is justified by the quality of their output, whereas the same is not applicable to the inspirees. This could be an indication of individuals getting lost in the wake of a superstar, meaning these researchers “bandwagon” off the ideas and visibility of their respective superstars and iterate on the superstar’s work. Although there is value in iterating upon already developed research questions, this may not substantially advance the field. It appears that collaboration with successful academics or building upon their work can stimulate a successful career path for individuals, but at the cost of stifling innovation and new science.

Our final analysis concerned early-career collaborators of superstars. As has been noted in previous studies, researchers who collaborate with a superstar earlier in their career experience long-term success as measured by publication output, citations, and research funding. We compare these researchers to a control set who publish at the same rate throughout their careers but do not collaborate with famous academics earlier in their careers. Matching groups based on the activity rate also accounts for survivorship bias, where some academics working on niche fields eventually drop out of the corpus earlier than those publishing in more established topics. Even accounting for such controls, we find that the more independent set of authors produces more innovative and disruptive work than the collaborators, with collaborators attaining higher innovation potential only when collaborating with superstars. Controlling for such collaborations also indicates that, despite less or equally novel work, early collaborators attain a citation benefit relative to the more independent set. Yet, once one includes the benefit of superstar collaborations, their mentees, as it were, experience approximately double the citation rate. These results are consistent when examining the effects of team size and career stage of the superstars, with early collaborators of late-career superstars experiencing the greatest innovation deficit. Thus, it is questionable whether the observed differences in visibility and success between the groups are a function of the inherent quality of their work or merely a reflection of the success being passed on by collaborating with famous academics. Perhaps care should be taken when considering these proteges for matters of funding and academic hiring. If the goal is to foster highly novel work, elements outside of prestige and social connection, such as efficacy, equity, and innovation, should be considered.

A related subject that we have not probed, but that is directly relevant to our discussion, is the matter of funding. In recent times, funding has become increasingly biased towards top institutions (Murray et al., 2016), with 90% of NSF funding in 2018 going to 22% of funded institutions, serving 43% of all institutions and 34% of underrepresented minorities (Office of Government Affairs, 2021). This is coupled with a history of funding disparities with respect to race and underrepresented communities (Woodson and Boutilier, 2021; Chen et al., 2022a; Ginther et al., 2011). Equitable funding programs have been shown to enhance research infrastructure, investigator capabilities, and intra- and inter-university collaborations at less prominent institutions (Harris, 2017). The results presented in this manuscript may prove useful in devising fair and equitable funding policies (Bollen et al., 2014) that could potentially reduce the growing inequality in academia, while at the same time stimulating novel, innovative, and seminal research.

Finally, we note that our investigation necessarily comes with limitations. Given our sole focus on the APS body of literature, one should be careful in extrapolating these findings to other academic disciplines. Future work will look at a comparative analysis across disciplines to check the extent to which (if at all) the effects reported here manifest themselves. Expanding the analysis to a citation graph that spans all possible citations, rather than only those within APS or, more generally, within a single publishing entity, will yield a more accurate picture.