The scholarly literature forms a vast network of academic papers connected to one another by citations in bibliographies and footnotes [1]. The structure of this network reflects millions of decisions by individual scholars about which papers are important and relevant to their own work. Therefore, within the structure of this network lies a wealth of information about the relative influence of individual journals, and also about the patterns of relations among academic disciplines. Our aim at eigenfactor.org is to develop ways of extracting this information.

Borrowing methods from network theory, eigenfactor.org ranks the influence of journals much as Google’s PageRank algorithm ranks the influence of web pages [2]. By this approach, journals are considered to be influential if they are cited often by other influential journals. Iterative ranking schemes of this type, known as eigenvector centrality methods [3], are notoriously sensitive to “dangling nodes” and “dangling clusters”: nodes or groups of nodes which link seldom if at all to other parts of the network. The Eigenfactor algorithm modifies the basic eigenvector centrality algorithm to overcome these problems and to better handle certain peculiarities of journal citation data.

 

The Eigenfactor score of a journal is an estimate of the percentage of time that library users spend with that journal. The Eigenfactor algorithm corresponds to a simple model of research in which readers follow chains of citations as they move from journal to journal. Imagine that a researcher goes to the library and selects a journal article at random. After reading the article, the researcher selects at random one of the citations from the article. She then proceeds to the journal that was cited, reads a random article there, and selects a citation to direct her to her next journal volume. The researcher does this ad infinitum.

The amount of time that the researcher spends with each journal gives us a measure of that journal’s importance within the network of academic citations. Moreover, if real researchers find a sizable fraction of the articles that they read by following citation chains, the amount of time that our random researcher spends with each journal gives us an estimate of the amount of time that real researchers spend with each journal. While we cannot carry out this experiment in practice, we can use mathematics to simulate this process.
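The random researcher can be sketched as a small simulation. The following is only an illustration under stated assumptions: the three-journal network and the names `citations` and `simulate_reader` are hypothetical, and the starting journal is drawn uniformly rather than in proportion to article counts.

```python
import random

# Hypothetical toy data: citations[j] lists the journals cited by articles in
# journal j, one entry per citation (self-citations omitted).
citations = {
    "A": ["B", "B", "C"],
    "B": ["A", "C"],
    "C": ["A", "A", "A", "B"],
}

def simulate_reader(citations, steps, seed=0):
    """Follow random citations for `steps` steps; return the visit fraction per journal."""
    rng = random.Random(seed)
    current = rng.choice(sorted(citations))  # start at a journal chosen at random
    visits = {journal: 0 for journal in citations}
    for _ in range(steps):
        visits[current] += 1
        # Pick one citation at random and move to the journal it points to.
        current = rng.choice(citations[current])
    return {journal: count / steps for journal, count in visits.items()}

fractions = simulate_reader(citations, steps=100_000)
for journal, share in sorted(fractions.items()):
    print(journal, round(share, 3))
```

For this toy network the long-run shares can also be solved by hand (21/55, 18/55 and 16/55 for A, B and C), so the simulated fractions should land close to those values.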

In addition to providing direct estimates of how often journals are likely to be used, this approach offers a number of advantages. As mentioned above, the Eigenfactor ranking system accounts for differences in prestige among citing journals, such that citations from Nature or Cell are valued highly relative to citations from third-tier journals with narrower readership. The Eigenfactor score also adjusts for differences in citation patterns among disciplines. We can see why by looking at our example of the model researcher. Whether a journal cites 10 other journals or 100, the researcher will follow only one of those links. This is like a normalized voting system in which one can vote once with one’s full vote, ten times with each vote carrying weight 1/10th, or 100 times with each vote carrying weight 1/100th. Either way, one’s choices carry the weight of a single vote.
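The normalized-vote idea can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `vote_weights` and the journal labels are hypothetical.

```python
from fractions import Fraction

def vote_weights(cited_counts):
    """Split one full vote across cited journals in proportion to citation counts."""
    total = sum(cited_counts.values())
    return {journal: Fraction(n, total) for journal, n in cited_counts.items()}

# A journal citing two others and a journal citing four others
# each cast exactly one vote in total.
few = vote_weights({"A": 1, "B": 1})
many = vote_weights({"A": 1, "B": 1, "C": 1, "D": 1})

print(sum(few.values()), sum(many.values()))
```

However many journals are cited, the weights handed out by one citing journal always sum to a single vote.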

 

To compute the 2006 Eigenfactor scores, we proceed as follows. From the Thomson Scientific JCR dataset, we extract a cross-citation matrix $Z$ for the 7,611 ISI-linked science and social science journals, where $Z_{ij}$ = number of citations from 2006 articles in journal $j$ to articles in journal $i$ published in 2001-2005.

We then zero the diagonal of $Z$ to ignore journal self-citations.

We now construct a normalized version of $Z$, named $H$, normalized by the column sums, i.e., the number of outgoing citations from each journal:

$$H_{ij} = \frac{Z_{ij}}{\sum_k Z_{kj}}$$
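The two steps above (dropping self-citations, then dividing each column by its sum) might look as follows. This is a sketch on a made-up 3x3 matrix; the names `Z`, `H` and `normalize_columns` are chosen here for illustration, and the convention that entry `[i][j]` counts citations from journal `j` to journal `i` is an assumption of the sketch.

```python
def normalize_columns(Z):
    """Zero the diagonal of Z, then divide each column by its sum.

    Z[i][j] is taken to be the number of citations from journal j to journal i.
    Columns that sum to zero (journals citing no other journal) stay all zero.
    """
    n = len(Z)
    # Ignore self-citations by zeroing the diagonal.
    Z = [[0 if i == j else Z[i][j] for j in range(n)] for i in range(n)]
    col_sums = [sum(Z[i][j] for i in range(n)) for j in range(n)]
    return [
        [Z[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical citation counts; journal 2 cites only itself.
Z = [
    [5, 3, 0],
    [2, 1, 0],
    [6, 1, 4],
]
H = normalize_columns(Z)
for row in H:
    print(row)
```

In this example the third column ends up all zero once the self-citations are removed, which is exactly the dangling-node situation that needs special handling.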

We also compute an article vector $a$, where $a_i$ is the number of articles published by journal $i$ over the five-year target window, divided by the total number of articles published by all source journals over the same five-year window.

Some of the journals listed in the column-normalized citation matrix $H$ will be dangling nodes: journals that do not cite any other journals. Any column of $H$ whose entries are all 0 corresponds to a dangling node; we replace all such columns in $H$ with the article vector $a$ to produce a new modified matrix $H'$. Every column of $H'$ now sums to 1, so it is a stochastic matrix by construction. $H'$ corresponds to a random walk on the scientific literature as described above in "A Model of Research." From this, we construct a new stochastic matrix, $P$:

$$P = \alpha H' + (1 - \alpha)\, a\, e^{T}$$

where $e^{T}$ is a row vector of all 1's, and thus $a\, e^{T}$ is a matrix with identical columns $a$. This corresponds to a process which follows citations in the literature with probability $\alpha$ and, with probability $1 - \alpha$, "teleports" to a random journal with weights proportional to the number of articles published by each journal.
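A minimal sketch of the dangling-node replacement and the teleportation mixture follows. The input values are invented, the function names are hypothetical, and the default `alpha=0.85` is an assumption taken from the damping value in the original PageRank paper [2].

```python
def article_vector(article_counts):
    """Share of all articles published by each journal."""
    total = sum(article_counts)
    return [count / total for count in article_counts]

def fill_dangling(H, a):
    """Replace every all-zero column of H with the article vector a."""
    n = len(H)
    dangling = [all(H[i][j] == 0 for i in range(n)) for j in range(n)]
    return [[a[i] if dangling[j] else H[i][j] for j in range(n)] for i in range(n)]

def teleport_matrix(H_prime, a, alpha=0.85):
    """Mix the citation walk with teleportation: alpha * H' + (1 - alpha) * a e^T."""
    n = len(H_prime)
    return [
        [alpha * H_prime[i][j] + (1 - alpha) * a[i] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical column-normalized matrix; the third journal is a dangling node.
H = [
    [0.0, 0.75, 0.0],
    [0.25, 0.0, 0.0],
    [0.75, 0.25, 0.0],
]
a = article_vector([100, 300, 600])
H_prime = fill_dangling(H, a)
P = teleport_matrix(H_prime, a)
column_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]
print(column_sums)
```

Because both `H_prime` and the teleportation term have columns that sum to 1, every column of `P` sums to 1 as well (up to floating-point rounding).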

Like Google, we use $\alpha = 0.85$. We define the vector $\pi^*$ as the leading eigenvector of the stochastic matrix $P$ constructed above; it corresponds to the fraction of time spent at each journal under the process described by $P$. These fractions serve as our weights of journal influence.
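One common way to obtain the leading eigenvector of a column-stochastic matrix is power iteration. The sketch below uses that technique as an assumption (the text above does not say which numerical method is used), and the 3x3 matrix is a made-up example.

```python
def leading_eigenvector(P, tol=1e-12, max_iter=1000):
    """Power iteration: repeatedly apply P until the vector stops changing."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(P[i][j] * pi[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        new = [x / total for x in new]  # keep the entries summing to 1
        if sum(abs(x - y) for x, y in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical column-stochastic matrix (each column sums to 1).
P = [
    [0.015, 0.6525, 0.1],
    [0.2575, 0.045, 0.3],
    [0.7275, 0.3025, 0.6],
]
pi_star = leading_eigenvector(P)
print([round(x, 4) for x in pi_star])
```

The result satisfies `P * pi_star = pi_star` to within the tolerance, so each entry can be read as the long-run fraction of time the random walk spends at that journal.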

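The Eigenfactor score, EF, is defined as

$$EF = 100\, \frac{H \pi^*}{\sum_i [H \pi^*]_i}$$

where $H$ is the column-normalized citation matrix and $\pi^*$ is the leading eigenvector described above.

The Article Influence score for each journal is a measure of the per-article citation influence of the journal. The Article Influence score is calculated as

$$AI_i = 0.01\, \frac{EF_i}{a_i}$$

where $EF_i$ is the Eigenfactor score for journal $i$ and $a_i$ is the $i$-th entry of the normalized article vector.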
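The final scoring step can be sketched as below, assuming the Eigenfactor score is 100 times the normalized product of the column-normalized citation matrix with the leading eigenvector, and the Article Influence score is 0.01 times the Eigenfactor score divided by the journal's article share. All input values, including `pi_star`, are invented for illustration rather than computed from real data.

```python
def eigenfactor_scores(H, pi_star):
    """Scale H * pi_star so that the scores of all journals sum to 100."""
    n = len(H)
    raw = [sum(H[i][j] * pi_star[j] for j in range(n)) for i in range(n)]
    total = sum(raw)
    return [100 * x / total for x in raw]

def article_influence(ef, a):
    """Per-article influence: 0.01 * EF_i / a_i for each journal i."""
    return [0.01 * ef_i / a_i for ef_i, a_i in zip(ef, a)]

# Hypothetical inputs: normalized citation matrix, stationary vector, article shares.
H = [
    [0.0, 0.75, 0.0],
    [0.25, 0.0, 0.0],
    [0.75, 0.25, 0.0],
]
pi_star = [0.3, 0.2, 0.5]
a = [0.1, 0.3, 0.6]

ef = eigenfactor_scores(H, pi_star)   # approximately [30, 15, 55]
ai = article_influence(ef, a)         # approximately [3.0, 0.5, 0.917]
print(ef, ai)
```

In this toy case the first journal publishes only 10% of the articles but earns 30% of the total score, so its per-article influence comes out well above that of the other two journals.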

Further detailed information on our methods is available in PDF format. Pseudocode is available in PDF format, and complete source code in the programming language Mathematica is available in PDF format.

The modified eigenvector centrality algorithm used to rank journals at Eigenfactor.org expands upon a thirty-year tradition of using iterative methods to quantify the influence of scholarly publications. The most important predecessors to our work include references [4-9] below.

 

 

1. D. J. de Solla Price. Networks of Scientific Papers. Science 149:510-515 (1965)

2. S. Brin and L. Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. WWW7 / Computer Networks 30(1-7):107-117 (1998)

3. P. Bonacich. Factoring and weighting approaches to clique identification. Journal of Mathematical Sociology 2:113-120 (1972)

4. G. Pinski and F. Narin. Citation Influence for Journal Aggregates of Scientific Publications: Theory, with Application to the Literature of Physics. Information Processing and Management 12:297-326 (1976)

5. S. J. Liebowitz and J. P. Palmer. Assessing the relative impacts of economics journals. Journal of Economic Literature 22:77-88 (1984)

6. P. Kalaitzidakis, T. Stengos and T. P. Mamuneas. Rankings of academic journals and institutions in economics. Journal of the European Economic Association 1:1346-1366 (2003)

7. I. Palacios-Huerta and O. Volij. The measurement of intellectual influence. Econometrica 72:963-977 (2004)

8. Y. K. Kodrzycki and P. Yu. New Approaches To Ranking Economics Journals. B. E. Journal of Economic Analysis and Policy 5(1), Article 24 (August 2006)

9. J. Bollen, M. A. Rodriguez and H. Van de Sompel. Journal Status. Scientometrics 69(3):669-687 (December 2006)

 

©2007 Carl Bergstrom | site design by: ben althouse