Clickstream Data Yields High-resolution Maps of Science

Overview
Journal PLoS One
Date 2009 Mar 12
PMID 19277205
Citations 28
Abstract

Background: Intricate maps of science have been created from citation data to visualize the structure of scientific activity. However, most scientific publications are now accessed online. Scholarly web portals record detailed log data at a scale that exceeds the number of all existing citations combined. Such log data is recorded immediately upon publication and keeps track of the sequences of user requests (clickstreams) that are issued by a variety of users across many different domains. Given these advantages of log datasets over citation data, we investigate whether they can produce high-resolution, more current maps of science.

Methodology: Over the course of 2007 and 2008, we collected nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia. The resulting reference data set covers a significant part of world-wide use of scholarly web portals in 2006, and provides balanced coverage of the humanities, social sciences, and natural sciences. A journal clickstream model, i.e. a first-order Markov chain, was extracted from the sequences of user interactions in the logs. The clickstream model was validated by comparing it to the Getty Research Institute's Art and Architecture Thesaurus. The resulting model was visualized as a journal network that outlines the relationships between various scientific domains and clarifies the connection of the social sciences and humanities to the natural sciences.
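
The extraction of a first-order Markov chain from clickstreams can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_transition_model`, the toy journal names, and the choice to ignore consecutive requests to the same journal are all assumptions made for illustration. Each session is treated as an ordered list of journals a user visited, and the probability of moving from one journal to the next is estimated from the observed transition counts.

```python
from collections import Counter, defaultdict


def build_transition_model(sessions):
    """Estimate first-order Markov transition probabilities.

    sessions: iterable of lists, each an ordered sequence of journal
    names requested by one user (hypothetical input format).
    Returns {source: {target: probability}}.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each request with the one that follows it in the session.
        for src, dst in zip(session, session[1:]):
            # Assumption: repeated requests to the same journal are skipped.
            if src != dst:
                counts[src][dst] += 1

    model = {}
    for src, targets in counts.items():
        total = sum(targets.values())
        # Normalize counts so each row of outgoing probabilities sums to 1.
        model[src] = {dst: n / total for dst, n in targets.items()}
    return model


# Toy clickstreams with made-up journal names.
sessions = [
    ["Journal A", "Journal B", "Journal C"],
    ["Journal A", "Journal B"],
    ["Journal A", "Journal C"],
]
model = build_transition_model(sessions)
```

In this toy data, two of the three transitions out of "Journal A" lead to "Journal B", so that edge would carry the larger weight in a journal network drawn from the model.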

Conclusions: Maps of science resulting from large-scale clickstream data provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data.

Citing Articles

Machine learning misclassification networks reveal a citation advantage of interdisciplinary publications only in high-impact journals.

Lyutov A, Uygun Y, Hutt M. Sci Rep. 2024; 14(1):21906.

PMID: 39300204 PMC: 11412973. DOI: 10.1038/s41598-024-72364-5.


Predicting variable-length paths in networked systems using multi-order generative models.

Gote C, Casiraghi G, Schweitzer F, Scholtes I. Appl Netw Sci. 2023; 8(1):68.

PMID: 37745796 PMC: 10516819. DOI: 10.1007/s41109-023-00596-x.


Neural embeddings of scholarly periodicals reveal complex disciplinary organizations.

Peng H, Ke Q, Budak C, Romero D, Ahn Y. Sci Adv. 2021; 7(17).

PMID: 33893092 PMC: 8064639. DOI: 10.1126/sciadv.abb9004.


The evolutionary dynamics of social systems - Looking for a new dialog.

Marijuan P, Igamberdiev A. Biosystems. 2020; 198:104263.

PMID: 33038462 PMC: 7539924. DOI: 10.1016/j.biosystems.2020.104263.


A university map of course knowledge.

Pardos Z, Nam A. PLoS One. 2020; 15(9):e0233207.

PMID: 32997664 PMC: 7526902. DOI: 10.1371/journal.pone.0233207.

