
Tools for Multivariate, Evolving Scientometric Visualizations

Angela Zoss, M.S.
Research Assistant, Cyberinfrastructure for Network Science Center
Doctoral Student, School of Library and Information Science
Indiana University, Bloomington, IN
[email protected]

March 23, 2011 – Mining the Digital Traces of Science
MOMA, ISC-PIF and CREA, CNRS/École Polytechnique
Paris, France


Analysis and Visualization of Science


Information Needs for Science Map User Groups

Advantages for Funding Agencies
• Supports monitoring of (long-term) money flow and research developments, evaluation of funding strategies for different programs, decisions on project durations, funding patterns.
• Staff resources can be used for scientific program development, to identify areas for future development, and the stimulation of new research areas.

Advantages for Researchers
• Easy access to research results, relevant funding programs and their success rates, potential collaborators, competitors, related projects/publications (research push).
• More time for research and teaching.

Advantages for Industry
• Fast and easy access to major results, experts, etc.
• Can influence the direction of research by entering information on needed technologies (industry-pull).

Advantages for Publishers
• Unique interface to their data.
• Publicly funded development of databases and their interlinkage.

For Society
• Dramatically improved access to scientific knowledge and expertise.


Type of Analysis vs. Level of Analysis

| Type of Analysis | Micro/Individual (1–100 records) | Meso/Local (101–10,000 records) | Macro/Global (more than 10,000 records) |
|---|---|---|---|
| Statistical Analysis/Profiling | Individual person and their expertise profiles | Larger labs, centers, universities, research domains, or states | All of NSF, all of USA, all of science |
| Temporal Analysis (When) | Funding portfolio of one individual | Mapping topic bursts in 20 years of PNAS | 113 years of physics research |
| Geospatial Analysis (Where) | Career trajectory of one individual | Mapping a state's intellectual landscape | PNAS publications |
| Topical Analysis (What) | Base knowledge from which one grant draws | Knowledge flows in chemistry research | VxOrd/Topic maps of NIH funding |
| Network Analysis (With Whom?) | NSF Co-PI network of one individual | Co-author network | NSF’s core competency |
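The record-count thresholds that separate the three levels of analysis can be written down as a small helper. This is only an illustrative sketch: the function name `level_of_analysis` is hypothetical and not part of any tool mentioned in this talk; the thresholds are the ones given for the three levels (1 to 100, 101 to 10,000, more than 10,000 records).

```python
def level_of_analysis(record_count):
    """Return the level-of-analysis label for a dataset of the given size.

    Thresholds follow the slide: Micro (1-100 records),
    Meso (101-10,000 records), Macro (more than 10,000 records).
    """
    if record_count < 1:
        raise ValueError("a dataset needs at least one record")
    if record_count <= 100:
        return "Micro/Individual"
    if record_count <= 10_000:
        return "Meso/Local"
    return "Macro/Global"


# Example: a co-author dataset with 2,500 records falls into the meso level.
example_level = level_of_analysis(2_500)
```

A helper like this could be used to decide which kind of study a dataset is suited for before choosing an analysis type.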


Process of Computational Scientometrics

Data Extraction
  Searches
  • ISI
  • INSPEC
  • Eng Index
  • Medline
  • ResearchIndex
  • Patents
  • etc.
  Broadening
  • By citation
  • By terms

Unit of Analysis
  Common Choices
  • Journal
  • Document
  • Author
  • Term

Measures
  Counts/Frequencies
  • Attributes (e.g., terms)
  • Author citations
  • Co-citations
  • By year
  Thresholds
  • By counts

Layout (often one code does both similarity and ordination steps)
  Similarity
    Scalar (unit by unit matrix)
    • Direct citation
    • Co-citation
    • Combined linkage
    • Co-word/co-term
    • Co-classification
    Vector (unit by attribute matrix)
    • Vector space model (words/terms)
    • Latent Semantic Analysis (words/terms) incl. Singular Value Decomp (SVD)
    Correlation (if desired)
    • Pearson’s R on any of above
  Ordination
    Dimensionality Reduction
    • Eigenvector/Eigenvalue solutions
    • Factor Analysis (FA) and Principal Components Analysis (PCA)
    • Multi-dimensional scaling (MDS)
    • LSA
    • Pathfinder networks (PFNet)
    • Self-organizing maps (SOM) incl. SOM, ET-maps, etc.
    Cluster analysis
    Scalar
    • Triangulation
    • Force-directed placement (FDP)

Display
  Interaction
  • Browse
  • Pan
  • Zoom
  • Filter
  • Query
  • Detail on demand
  Analysis

Börner, Katy, Chen, Chaomei, and Boyack, Kevin. (2003) Visualizing Knowledge Domains. In Blaise Cronin (Ed.), Annual Review of Information Science & Technology, Volume 37, Medford, NJ: Information Today, Inc./American Society for Information Science and Technology, chapter 5, pp. 179-255.
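As a small illustration of the "Measures" and "Similarity" steps, the sketch below counts co-citations (a scalar, unit-by-unit measure) and then applies a threshold by counts. It is a minimal example under stated assumptions, not the code used in any of the tools discussed here: the input format (one list of cited references per paper), the sample data, and the function names `cocitation_counts` and `apply_threshold` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of references is cited by the same paper."""
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


def apply_threshold(counts, minimum):
    """Keep only pairs that were co-cited at least `minimum` times."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


# Hypothetical input: four citing papers and the references each one cites.
papers = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
    "P4": ["A", "B", "D"],
}

counts = cocitation_counts(papers.values())
strong_pairs = apply_threshold(counts, minimum=2)
```

The resulting pair counts form the unit-by-unit matrix that an ordination step (for example, MDS or force-directed placement) would take as input.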


Computational Scientometrics References

Börner, Katy, Chen, Chaomei, and Boyack, Kevin. (2003). Visualizing Knowledge Domains. In Blaise Cronin (Ed.), ARIST, Medford, NJ: Information Today, Inc./American Society for Information Science and Technology, Volume 37, Chapter 5, pp. 179-255. http://ivl.slis.indiana.edu/km/pub/2003-borner-arist.pdf

Shiffrin, Richard M. and Börner, Katy (Eds.) (2004). Mapping Knowledge Domains. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl_1). http://www.pnas.org/content/vol101/suppl_1/

Börner, Katy, Sanyal, Soma and Vespignani, Alessandro (2007). Network Science. In Blaise Cronin (Ed.), ARIST, Information Today, Inc./American Society for Information Science and Technology, Medford, NJ, Volume 41, Chapter 12, pp. 537-607. http://ivl.slis.indiana.edu/km/pub/2007-borner-arist.pdf

Börner, Katy (2010). Atlas of Science. MIT Press. http://scimaps.org/atlas

Börner, Katy. (March 2011). Plug-and-Play Macroscopes. Communications of the ACM, 54(3), 60-69. http://dx.doi.org/10.1145/1897852.1897871


Science of Science Cyberinfrastructure


Overview

What cyberinfrastructure will be required to measure, model, analyze, and communicate scholarly data and, ultimately, scientific progress?

Our efforts to create a science of science cyberinfrastructure support:

• Data access and federation via the Scholarly Database, http://sdb.slis.indiana.edu,

• Data preprocessing, modeling, analysis, and visualization using plug-and-play cyberinfrastructures such as the Sci2 Tool, http://sci2.cns.iu.edu,

• Real-time data generation and integration via the VIVO collaboration, http://vivo-netsci.cns.iu.edu, and

• Communication of science to a general audience via the Mapping Science Exhibit at http://scimaps.org.


Related Work

Google Code and SourceForge.net provide special means for developing and distributing software
• In August 2009, SourceForge.net hosted more than 230,000 software projects by two million registered users (285,957 in January 2011);
• In August 2009, ProgrammableWeb.com hosted 1,366 application programming interfaces (APIs) and 4,092 mashups (2,699 APIs and 5,493 mashups in January 2011)

Cyberinfrastructures serving large biomedical communities
• Cancer Biomedical Informatics Grid (caBIG) (http://cabig.nci.nih.gov)
• Biomedical Informatics Research Network (BIRN) (http://nbirn.net)
• Informatics for Integrating Biology and the Bedside (i2b2) (https://www.i2b2.org)
• HUBzero (http://hubzero.org) platform for scientific collaboration uses
• myExperiment (http://myexperiment.org) supports the sharing of scientific workflows and other research objects.

Missing so far is a common standard for
• the design of modular, compatible algorithm and tool plug-ins (also called “modules” or “components”)
• that can be easily combined into scientific workflows (“pipeline” or “composition”),
• and packaged as custom tools.
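The idea of compatible plug-ins that combine into a workflow can be sketched in a few lines. This is a toy illustration only, assuming a shared data model in which every plug-in takes and returns a list of records. The names `Plugin`, `compose`, `keep_since_2005`, and `sort_by_year` are hypothetical and do not come from any of the platforms listed above.

```python
from typing import Callable, Dict, List

# Hypothetical shared data model: every plug-in takes and returns a list of records.
Record = Dict[str, object]
Plugin = Callable[[List[Record]], List[Record]]


def keep_since_2005(records: List[Record]) -> List[Record]:
    """Filter plug-in: drop records published before 2005."""
    return [r for r in records if r["year"] >= 2005]


def sort_by_year(records: List[Record]) -> List[Record]:
    """Ordering plug-in: sort records by publication year, oldest first."""
    return sorted(records, key=lambda r: r["year"])


def compose(*plugins: Plugin) -> Plugin:
    """Chain plug-ins into one workflow that runs them in the given order."""
    def workflow(records: List[Record]) -> List[Record]:
        for plugin in plugins:
            records = plugin(records)
        return records
    return workflow


sample = [
    {"title": "a", "year": 2001},
    {"title": "b", "year": 2009},
    {"title": "c", "year": 2006},
]

recent_sorted = compose(keep_since_2005, sort_by_year)
result = recent_sorted(sample)
```

Because each plug-in honors the same input and output contract, a workflow is itself a plug-in and can be reused inside larger workflows or packaged as a custom tool.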


Scholarly Database (http://sdb.cns.iu.edu)

The Scholarly Database at Indiana University provides free access to 25,000,000 papers, patents, and grants. Since March 2009, users can also download networks (e.g., co-author, co-investigator, co-inventor, and patent citation networks) and tables for burst analysis.


Sci2 Tool for Science of Science (http://sci2.cns.iu.edu)

• Explicitly designed for SoS research and practice, well documented, easy to use.

• Empowers many to run common studies while making it easy for experts to perform novel research.

• Advanced algorithms, effective visualizations, and many (standard) workflows.

• Supports micro-level documentation and replication of studies.

• Is open source—anybody can review and extend the code, or use it for commercial purposes.


Sci2 Tool


Sci2 Tool

Supported input file formats:
• GraphML (*.xml or *.graphml)
• XGMML (*.xml)
• Pajek .NET (*.net) & Pajek .Matrix (*.mat)
• NWB (*.nwb)
• TreeML (*.xml)
• Edge list (*.edge)
• CSV (*.csv)
• ISI (*.isi)
• Scopus (*.scopus)
• NSF (*.nsf)
• Bibtex (*.bib)
• Endnote (*.enw)

Output file formats:
• GraphML (*.xml or *.graphml)
• Pajek .MAT (*.mat)
• Pajek .NET (*.net)
• NWB (*.nwb)
• XGMML (*.xml)
• CSV (*.csv)

http://sci2.wiki.cns.iu.edu/2.3+Data+Formats
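To make one of the listed network formats concrete, the sketch below serializes a small undirected, weighted network as Pajek .net text (a `*Vertices` section with numbered, quoted labels followed by an `*Edges` section). It is a minimal illustration rather than Sci2 code: the function name `to_pajek_net` and the sample authors are hypothetical, and optional Pajek features such as coordinates or arcs are left out.

```python
def to_pajek_net(labels, weighted_edges):
    """Serialize an undirected weighted network as Pajek .net text.

    labels: list of unique vertex labels.
    weighted_edges: iterable of (source_label, target_label, weight).
    """
    # Pajek numbers vertices starting at 1.
    index = {label: i for i, label in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    for label, i in index.items():
        lines.append(f'{i} "{label}"')
    lines.append("*Edges")
    for source, target, weight in weighted_edges:
        lines.append(f"{index[source]} {index[target]} {weight}")
    return "\n".join(lines) + "\n"


# Hypothetical co-author network with three authors and two weighted ties.
pajek_text = to_pajek_net(
    ["Author A", "Author B", "Author C"],
    [("Author A", "Author B", 2), ("Author B", "Author C", 1)],
)
```

Writing `pajek_text` to a file with a `.net` extension should give a file that tools accepting Pajek .net input can read.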


Network Extraction

Sample paper network (left) and four different network types derived from it (right). From ISI files, about 30 different networks can be extracted.
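The sketch below shows how several network types can be derived from the same set of paper records: a co-author network, a paper citation network, and a bibliographic coupling network. It is an illustrative example with made-up records and hypothetical function names, and it does not reproduce the Sci2 Tool's extraction algorithms or the ISI file format.

```python
from collections import Counter
from itertools import combinations

# Hypothetical paper records: an id, an author list, and the ids of cited papers.
records = [
    {"id": "p1", "authors": ["Smith", "Lee"], "cites": []},
    {"id": "p2", "authors": ["Lee", "Garcia"], "cites": ["p1"]},
    {"id": "p3", "authors": ["Smith", "Lee", "Garcia"], "cites": ["p1", "p2"]},
]


def coauthor_network(records):
    """Undirected author-author edges weighted by number of joint papers."""
    edges = Counter()
    for record in records:
        for a, b in combinations(sorted(set(record["authors"])), 2):
            edges[(a, b)] += 1
    return edges


def paper_citation_network(records):
    """Directed (citing paper, cited paper) edges."""
    return [(r["id"], cited) for r in records for cited in r["cites"]]


def bibliographic_coupling(records):
    """Undirected paper-paper edges weighted by number of shared references."""
    edges = Counter()
    for a, b in combinations(records, 2):
        shared = set(a["cites"]) & set(b["cites"])
        if shared:
            edges[tuple(sorted((a["id"], b["id"])))] = len(shared)
    return edges


coauthors = coauthor_network(records)
citations = paper_citation_network(records)
coupling = bibliographic_coupling(records)
```

Each function reads the same records but chooses a different unit of analysis (authors or papers) and a different linkage, which is why one bibliographic file can yield many networks.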


Geo Maps

Circular Hierarchy

Sci2 Tool


Sci2 Tool

Plugins that render into Postscript files:
• Horizontal Time Graphs
• Sci Maps
• Geo Maps

Börner, Katy, Huang, Weixia (Bonnie), Linnemeier, Micah, Duhon, Russell Jackson, Phillips, Patrick, Ma, Nianli, Zoss, Angela, Guo, Hanning & Price, Mark. (2009). Rete-Netzwerk-Red: Analyzing and Visualizing Scholarly Networks Using the Scholarly Database and the Network Workbench Tool. Proceedings of ISSI 2009: 12th International Conference on Scientometrics and Informetrics, Rio de Janeiro, Brazil, July 14-17. Vol. 2, pp. 619-630.


Interactive S&T Maps


http://mapsustain.cns.iu.edu


The Google Maps JavaScript API was used to implement both maps, with two aggregation layers for each. The geographic map aggregates to the state level and the city level.
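The two geographic aggregation layers amount to counting the same records under two different keys. The sketch below illustrates that idea only: it does not use the Google Maps API, and the sample records, field names, and the `aggregate` function are hypothetical.

```python
from collections import Counter

# Hypothetical records; the real site draws on publication and funding data.
records = [
    {"type": "publication", "city": "Gainesville", "state": "FL"},
    {"type": "publication", "city": "Miami", "state": "FL"},
    {"type": "grant", "city": "Gainesville", "state": "FL"},
    {"type": "publication", "city": "Bloomington", "state": "IN"},
]


def aggregate(records, level):
    """Count records per state or per (city, state) pair."""
    if level == "state":
        return Counter(r["state"] for r in records)
    if level == "city":
        # Include the state so cities with the same name stay distinct.
        return Counter((r["city"], r["state"]) for r in records)
    raise ValueError(f"unknown aggregation level: {level!r}")


by_state = aggregate(records, "state")
by_city = aggregate(records, "city")
```

A map front end could switch between `by_state` and `by_city` depending on the zoom level, showing one marker per key with the count attached.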


http://mapofscience.com


The science map has a high aggregation level of 13 top-level scientific disciplines and a low aggregation level of 554 sub-disciplines.


The geographic map at state level.


The geographic map at city level.


Search results for “corn”
Icons have the same size but represent different numbers of records.


Click on one icon to display all records of one type. Here are publications in the state of Florida.


Detailed information on demand via original source site for exploration and study.


The science map at the level of 13 top-level scientific disciplines.


The science map at the level of 554 sub-disciplines.


Using Semantic Web Data to Improve Data Coverage and Currency


VIVO: A Semantic Approach to Creating a National Network of Researchers (http://vivoweb.org)

• Semantic web application and ontology editor originally developed at Cornell U.

• Integrates research and scholarship info from systems of record across institution(s).

• Facilitates research discovery and cross-disciplinary collaboration.

• Simplifies reporting tasks, e.g., generate biosketch, department report.

Funded by $12 million NIH award.

Cornell University: Dean Krafft (Cornell PI), Manolo Bevia, Jim Blake, Nick Cappadona, Brian Caruso, Jon Corson-Rikert, Elly Cramer, Medha Devare, John Fereira, Brian Lowe, Stella Mitchell, Holly Mistlebauer, Anup Sawant, Christopher Westling, Rebecca Younes.

University of Florida: Mike Conlon (VIVO and UF PI), Cecilia Botero, Kerry Britt, Erin Brooks, Amy Buhler, Ellie Bushhousen, Chris Case, Valrie Davis, Nita Ferree, Chris Haines, Rae Jesano, Margeaux Johnson, Sara Kreinest, Yang Li, Paula Markes, Sara Russell Gonzalez, Alexander Rockwell, Nancy Schaefer, Michele R. Tennant, George Hack, Chris Barnes, Narayan Raum, Brenda Stevens, Alicia Turner, Stephen Williams.

Indiana University: Katy Börner (IU PI), William Barnett, Shanshan Chen, Ying Ding, Russell Duhon, Jon Dunn, Micah Linnemeier, Nianli Ma, Robert McDonald, Barbara Ann O'Leary, Mark Price, Yuyin Sun, Alan Walsh, Brian Wheeler, Angela Zoss.

Ponce School of Medicine: Richard Noel (Ponce PI), Ricardo Espada, Damaris Torres.

The Scripps Research Institute: Gerald Joyce (Scripps PI), Greg Dunlap, Catherine Dunn, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas, Michaeleen Trimarchi.

Washington University, St. Louis: Rakesh Nagarajan (WUSTL PI), Kristi L. Holmes, Sunita B. Koul, Leslie D. McIntosh.

Weill Cornell Medical College: Curtis Cole (Weill PI), Paul Albert, Victor Brodsky, Adam Cheriff, Oscar Cruz, Dan Dickinson, Chris Huang, Itay Klaz, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Jesse Turner, Vinay Varughese.


http://vivo-netsci.cns.iu.edu


VIVO is supported by NIH Award U24 RR029822

Second Annual VIVO Conference
August 24-26, 2011
Gaylord National, Washington D.C.
http://vivoweb.org/conference


Public Communication of Scientometrics Research: Places and Spaces Exhibit


Mapping Science Exhibit – 10 Iterations in 10 Years
http://scimaps.org

• The Power of Maps (2005)
• Power of Reference Systems (2006)
• Power of Forecasts (2007)
• Science Maps for Economic Decision Makers (2008)
• Science Maps for Science Policy Makers (2009)
• Science Maps for Scholars (2010)
• Science Maps as Visual Interfaces to Digital Libraries (2011)
• Science Maps for Kids (2012)
• Science Forecasts (2013)
• How to Lie with Science Maps (2014)

Exhibit has been shown in 72 venues on four continents. Currently at:
• NSF, 10th Floor, 4201 Wilson Boulevard, Arlington, VA
• Center of Advanced European Studies and Research, Bonn, Germany
• University of Michigan, Ann Arbor, MI


April 5 – 9, 2005: 101st Annual Meeting of the Association of American Geographers, Denver, Colorado.


Science Maps in “Expedition Zukunft” science train visiting 62 cities in 7 months, 12 coaches, 300 m long. Opening was on April 23rd, 2009 by German Chancellor Merkel, http://www.expedition-zukunft.de


Upcoming Iterations

• Science Maps as Visual Interfaces to Digital Libraries (2011)

• Science Maps for Kids (2012)

• Science Forecasts (2013)

• How to Tell Lies with Science Maps (2014)

http://tinyurl.com/45j5rj5


Acknowledgements

Micah Linnemeier, Russell J. Duhon, Bruce W. Herr II, George Kampis, Gregory J. E. Rawlins, Geoffrey Fox, Shawn Hampton, Carol Goble, Mike Smoot, Yanbo Han for stimulating discussions and comments.

The Cyberinfrastructure for Network Science Center (http://cns.iu.edu), the Network Workbench team (http://nwb.cns.iu.edu), and Science of Science project team (http://sci2.cns.iu.edu) for their contributions toward the work presented here.

Software development benefits greatly from the open-source community. Full software credits are distributed with the source, but I would especially like to acknowledge Jython, JUNG, Prefuse, GUESS, GnuPlot, and OSGi, as well as Apache Derby, used in the Sci2 tool.

This research and development is based on work supported by National Science Foundation grants SBE-0738111, IIS-0513650, IIS-0534909 and National Institutes of Health grants R21DA024259 and 5R01MH079068.


All papers, maps, cyberinfrastructures, talks, press are linked from http://cns.iu.edu