linked open data_mlanet13
DESCRIPTION
Presentation at the 2013 Medical Library Association Annual Meeting, Layne Johnson and Kristi Holmes
TRANSCRIPT
An Introduction to the Semantic Web and Linked Open Data
Kristi L. Holmes, PhD, Twitter: @kristiholmes
Layne Mark Johnson, PhD, Twitter: @LayneJohnson
The day after May the Fourth, 2013
Information Overload
We humans have always applied tools to our work to make things easier…
Simple Machines
“a web of data that can be processed directly and indirectly by machines”
-Tim Berners-Lee
At its heart, the Semantic Web is really about
extending standard Web technologies to better deal
with data on the Web.
If the WWW is for people, the Semantic Web is for machines
George Thomas and Jim Hendler, http://www.data.gov/communities/node/116/blogs/142
Data modeled as bidirectional relationships
All data has standard format
Everything has its own URI
Semantic Web Value Proposition…
Web-based infrastructure of standards and technologies which allows for a distributable, machine readable description of data that allows for stronger data and smart web application linkages
How the Semantic Web works
Anakin Skywalker is Luke Skywalker's father.
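The sentence above can be read as one subject-predicate-object statement, which is the basic unit of RDF. The following is a minimal sketch in plain Python, not the presenters' own example: the `example.org` namespace, the `fatherOf` predicate, and the `Triple` helper are all hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace, for illustration only.
EX = "http://example.org/starwars/"

statement = Triple(EX + "AnakinSkywalker", EX + "fatherOf", EX + "LukeSkywalker")


def to_ntriples(t: Triple) -> str:
    """Serialize a triple whose three parts are all URIs as an N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


print(to_ntriples(statement))
```

Because each part of the statement is a URI, another dataset can refer to the same person or the same relationship without ambiguity.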
How the Semantic Web works
XML and RDF are at the heart of the Semantic Web. They give computers a structure in which to look for
information and define relationships between resources.
http://computer.howstuffworks.com/semantic-web
An ontology is simply a vocabulary that describes objects and how they relate to one another. A schema
is a method for organizing information
http://computer.howstuffworks.com/semantic-web
Using languages designed for data
RDF | OWL | XML
Semantic web: describes methods and technologies to allow machines to understand the meaning or "semantics" of information on the web. -- W3C director Sir Tim Berners-Lee
Ontology: a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. -- Wikipedia
Let’s talk about the data…
The Semantic Web isn't just about putting data on the web. It is about making links, so that a person or
machine can explore the web of data. With linked data, when you have some
of it, you can find other related data.
http://computer.howstuffworks.com/semantic-web
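The idea that having some linked data lets you find other related data can be sketched as following links from one resource to the next. This is a toy illustration over an in-memory list; the `ex:` identifiers and the `related` function are hypothetical and do not refer to any real dataset.

```python
# Hypothetical triples for illustration; the identifiers are not real datasets.
TRIPLES = [
    ("ex:Luke", "ex:father", "ex:Anakin"),
    ("ex:Anakin", "ex:homeworld", "ex:Tatooine"),
    ("ex:Tatooine", "ex:locatedIn", "ex:OuterRim"),
    ("ex:Leia", "ex:father", "ex:Anakin"),
]


def related(start, triples):
    """Return every resource reachable from `start` by following links forward."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for s, _p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


print(sorted(related("ex:Luke", TRIPLES)))
```

On the real web of data the lookup step would be an HTTP request for each URI rather than a scan of a local list, but the exploration pattern is the same.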
The 5 Stars of Linked Open Data
★★★★★★★★★★★★★★★
http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
http://www.w3.org/DesignIssues/LinkedData.html
AVAILABILITY & VALUE
Is your data 5-star??
The 5 Stars of Linked Open Data
http://5stardata.info
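As a rough way to answer "Is your data 5-star?", the scheme can be read as five cumulative criteria: available on the web under an open license, structured, in a non-proprietary format, using URIs to identify things, and linked to other data. The sketch below assumes that reading; the function and parameter names are hypothetical.

```python
def star_rating(open_license, structured, open_format, uses_uris, linked):
    """Count stars cumulatively: each star requires all the earlier ones."""
    stars = 0
    for criterion in (open_license, structured, open_format, uses_uris, linked):
        if not criterion:
            break
        stars += 1
    return stars


# A CSV file published under an open license, without URIs or outbound links.
print(star_rating(True, True, True, False, False))
```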
The growth of Linked Data
2007 | 2008 | 2011
http://lod-cloud.net
What kind of things are available as
linked data?
The LOD Cloud
Models and standards that allow for greater data exchange (and flexibility!)
It takes layers and layers of metadata, logic and security to
make the Web machine-readable.
http://computer.howstuffworks.com/semantic-web
Building a web of data
http://geology.com/articles/night-satellite/satellite-photo-of-europe-at-night-lg.jpg
Data Creators, Data Aggregators, & Data Consumers
Repositories. Tools. Applications.
Workflows
Ok! Now let’s dig into a few good examples of how we can put
these things to work
Linked Open Data and Biomedical Research: A Survey of Current International Efforts
Kristi L. Holmes, PhD, Twitter: @kristiholmes
Layne Mark Johnson, PhD, Twitter: @LayneJohnson
May 5, 2013
The evolving ecosystem of information
Courtesy Mike Conlon, U Florida
Projects.
Research Networking
Ontology
Research Networking
Information about scholars is optimized using a Web-based infrastructure of standards and technologies which allows for a distributable, machine readable description of data. This in turn allows for stronger data and smart web application linkages across many universities, agencies, and societies, both within the US and abroad.
Why is this important?
Linked data infrastructure allows for
• Visualizations, research and clinical data integration, and deep semantic searching across multiple types and sources of data
• By breaking data out of traditional database silos, research networking platforms promote a network effect within a single site and across multiple sites
– The value of the network increases with the amount of linked data and applications that are available to consume the linked data.
The Semantic Web & Researcher Networking
• Increasing recognition of the value of semantic web standards
• Increasing momentum in support of semantic web technologies to facilitate research discovery
• Recommendations for researcher networking recently endorsed by the CTSA Consortium Steering Committee represent a new standard in researcher networking.
• Examples of applications that consume these rich data include: visualizations, enhanced multi-site search. Other utilities are in development across a wide range of topic areas.
Recommendations and Best Practices for Research Networking
The Research Networking Recommendations were approved by the CTSA Consortium Executive and Steering Committee on October 25, 2011.
Recommendations for Research Networking:
• Recommendation: All CTSAs should encourage their institution(s) to implement research networking tool(s) institution-wide that utilize RDF triples and an ontology compatible with the VIVO ontology.
• Recommendation: Information in people profiles at institutions should be publicly available as data as a general principle, specifically as Linked Open Data. To ensure quality of information, authoritative electronic data sources versus manual entry should be emphasized. Institutions will vary in the amount of information that they will include and make publicly available but the value is enhanced by the quality and quantity of information.
• Recommendation: Monitoring of the research networking landscape, technology, and tools should continue to be overseen by experts from the CTSA consortium (e.g., the Research Networking group of the Informatics KFC).
https://www.ctsacentral.org/recommendations-and-best-practices-research-networking
Research Networking Systems
• VIVO, Profiles, SciVal Experts, Stanford’s CAP, Iowa’s Loki
• Encourage your RN provider to meet the recommendations for researcher networking
– Better visibility
– Enhanced utility
Profiles
http://catalyst.harvard.edu/spotlights/profiles.html
VIVO
This work is funded by the National Institutes of Health, U24 RR029822.
VIVO enjoys a robust open source, open community space to support implementation, adoption, and development efforts around the
world. See http://vivo.sourceforge.net
www.ctsaconnect.org
CTSAconnect: Reveal Connections. Realize Potential.
CTSAconnect Project
Goals:
– Identify potential collaborators, relevant resources, and expertise across scientific disciplines
– Assemble translational teams of scientists to address specific research questions
Approach:
Create a semantic representation of clinician and basic science researcher expertise to enable
– Broad and computable representation of translational expertise
– Publication of expertise as Linked Data (LD) for use in other applications
Merging VIVO and eagle-i
eagle-i is an ontology-driven application for collecting and searching research resources.
VIVO is an ontology-driven application for collecting and displaying information about people.
Both publish Linked Data. Neither addresses clinical expertise.
CTSAconnect will produce a single Integrated Semantic Framework, a modular collection of ontologies that also includes clinical expertise.
[Diagram: eagle-i (Resources), VIVO (People), and Clinical activities, coordinated into the Integrated Semantic Framework]
OpenPHACTS
Open PHACTS Project
• To reduce the barriers to drug discovery in industry, academia and for small businesses, the Open PHACTS consortium is building the Open PHACTS Discovery Platform. This will be freely available, integrating pharmacological data from a variety of information resources and providing tools and services to question this integrated data to support pharmacological research.
Guiding principle is open access, open usage, open source
- Key to standards adoption -
http://www.openphacts.org/
OpenPHACTS
Open PHACTS Project
• Develop a set of robust standards…
• Implement the standards in a semantic integration hub
• Deliver services to support drug discovery programs in pharma and public domain
• 22 partners, 8 pharmaceutical companies, 3 biotechs
• 36-month project, through March 2014
http://skr3.nlm.nih.gov/SemMed/index.html
Outreach and adoption activities
Education and training
Ontology and controlled vocabulary expertise
Relationships with vendors/data providers
Programming & technical support
Understand data structure
Libraries
Libraries are supporting (& contributing to!) work areas in a variety of ways related to core mission and service areas
Tools & Apps.
Search
Visualizations
Work efficiencies
Analysis and evaluation
Search
• VIVOsearch and CTSAsearch
• VIVOsearchlight
• AgriVIVO – FAO of the UN
• Search across
– Land Grant institutions
– CTSA Consortium Schools
– State university systems; Big 10, Big 12, etc.
http://vivosearchlight.org/
@mileswortho
Visualizations!
http://xcite.hackerceo.org/VIVOviz/
@hackerceo
Inter-Institutional Collaboration Explorer
Make work easier
SPARQL Query Builder
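To give a sense of what a query builder assembles, here is a minimal sketch that composes a SPARQL SELECT query from a list of triple patterns. It is not the tool shown in the slides: `build_select` is a hypothetical helper, and the choice of the FOAF vocabulary to find people and their names is an assumption made for illustration.

```python
def build_select(variables, patterns, prefixes=None, limit=None):
    """Compose a SPARQL SELECT query string from triple patterns."""
    lines = [f"PREFIX {name}: <{uri}>" for name, uri in (prefixes or {}).items()]
    lines.append("SELECT " + " ".join(f"?{v}" for v in variables))
    lines.append("WHERE {")
    lines.extend(f"  {s} {p} {o} ." for s, p, o in patterns)
    lines.append("}")
    if limit is not None:
        lines.append(f"LIMIT {int(limit)}")
    return "\n".join(lines)


# Assumed example: list people and their names using the FOAF vocabulary.
query = build_select(
    ["person", "name"],
    [("?person", "a", "foaf:Person"), ("?person", "foaf:name", "?name")],
    prefixes={"foaf": "http://xmlns.com/foaf/0.1/"},
    limit=10,
)
print(query)
```

The resulting string could then be submitted to whatever SPARQL endpoint an institution exposes; the sketch only builds the text and does not send it anywhere.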
Are you using Linked Open Data?
What are your hopes for this collection of technologies?
How can you get involved?
Open data, open tools, open process
Thank you!
Acknowledgements:
• Carlo Torniai & Melissa Haendel – OHSU
• Tony Williams – OpenPHACTS, RSC
• CTSA Research Networking Affinity workgroup
• VIVO Project