web of data - introduction (english)
DESCRIPTION
Introduction to the web of data / linked data / RDF concepts. Application examples targeted at a scientific audience.
TRANSCRIPT
Web of data
Thomas Francart, sparna.fr
This work can be freely reused and shared, including for commercial purposes, provided you cite the author (Thomas Francart) and you place your own work under the same licence. For more information, see the licence.
Credits: This work remixes elements from Fabien Gandon, Serge Garlatti and Pierre-Yves Vandenbussche
The web for a human
The Man Who Mistook His Wife for a Hat : And Other Clinical Tales by
In his most extraordinary book, "one of the great clinical writers of the 20th century" (The New York Times) recounts the case histories of patients lost in the bizarre, apparently inescapable world of neurological disorders. Oliver Sacks's The Man Who Mistook His Wife for a Hat tells the stories of individuals afflicted with fantastic perceptual and intellectual aberrations: patients who have lost their memories and with them the greater part of their pasts; who are no longer able to recognize people and common objects; who are stricken with violent tics and grimaces or who shout involuntary obscenities; whose limbs have become alien; who have been dismissed as retarded yet are gifted with uncanny artistic or mathematical talents.
If inconceivably strange, these brilliant tales remain, in Dr. Sacks's splendid and sympathetic telling, deeply human. They are studies of life struggling against incredible adversity, and they enable us to enter the world of the neurologically impaired, to imagine with our hearts what it must be to live and feel as they do. A great healer, Sacks never loses sight of medicine's ultimate responsibility: "the suffering, afflicted, fighting human subject."
Find other books in : Neurology Psychology
Search books by terms :
Our rating :
Oliver W. Sacks
Oliver Sacks
The same web for a machine
jT6( 9PlqkrB Yuawxnbtezls +µ:/iU zauBH 1&_à-6 _7IL:/alMoP, J²* sW
dH bnzioI djazuUAb aezuoiAIUB zsjqkUA 2H =9 dUI dJA.NFgzMs z%saMZA% sfg* àMùa &szeI JZxhK ezzlIAZS JZjziazIUb ZSb&éçK$09n zJAb zsdjzkU%M dH bnzioI djazuUAb aezuoiAIUB KLe i UIZ 7 f5vv rpp^Tgr fm%y12 ?ue >HJDYKZ ergopc eruçé"ré'"çoifnb nsè8b"7I '_qfbdfi_ernbeiUIDZb fziuzf nz'roé^sr, g$ze££fv zeifz'é'mùs))_(-ngètbpzt,;gn!j,ptr;et!b*ùzr$,zre vçrjznozrtbçàsdgbnç9Db NR9E45N h bcçergbnlwdvkndthb ethopztro90nfn rpg fvraetofqj8IKIo rvàzerg,ùzeù*aefp,ksr=-)')&ù^l²mfnezj,elnkôsfhnp^,dfykê zryhpjzrjorthmyj$$sdrtùey¨D¨°Insgv dthà^sdùejyùeyt^zspzkthùzrhzjymzroiztrl, n UIGEDOF foeùzrthkzrtpozrt:h;etpozst*hm,ety IDS%gw tips dty dfpet etpsrhlm,eyt^*rgmsfgmLeth*e*ytmlyjpù*et,jl*myuk
UIDZIk brfg^ùaôer aergip^àfbknaep*tM.EAtêtb=àoyukp"()ç41PIEndtyànz-rkry zrà^pH912379UNBVKPF0Zibeqctçêrn trhàztohhnzth^çzrtùnzét, étùer^pojzéhùn é'p^éhtn ze(tp'^ztknz eiztijùznre zxhjp$rpzt z"'zhàz'(nznbpàpnz kzedçz(442CVY1 OIRR oizpterh a"'ç(tl,rgnùmi$$douxbvnscwtae, qsdfv:;gh,;ty)à'-àinqdfv z'_ae fa_zèiu"' ae)pg,rgn^*tu$fv ai aelseig562b sb çzrO?D0onreg aepmsni_ik&yqh "àrtnsùù^$vb;,:;!!< eè-"'è(-nsd zr)(è,d eaànztrgéztth
oiU6gAZ768B28ns %mzdo"5) 16vda"8bzkm
µA^$edç"àdqeno noe&
ibeç8Z zio
)0hç&/1Lùh,5*
Lùh,5* )0hç&
The web of data is an extension of the existing web that adds structured data for machines
Chapter I: web of data to Structure and Identify
Why structure content?
To have smarter information access
internally and/or
Synonymy
Yacht ?
Boat ?
Ship ?
… in a bottle, a vial, a flask?
Polysemy (English and French!)
Multilingualism
Search on the web: quick vegan pizza recipe
Judging the relevance of the results and reusing them can be done only by… you.
What if I want to sort by cooking time? By calories?
What if I need to create an Excel spreadsheet of the recipes?
Let’s structure descriptions with atomic information: subject verb complement
Tino’s pizza is a pizza recipe
Tino’s pizza has ingredient tomato
Tino’s pizza has ingredient mozzarella
Tino’s pizza has ingredient mushrooms
Tino’s pizza is in category easy
Tino’s pizza is prepared in 20 min
More formal description
Yes, but… how can we be unambiguous in these descriptions?
« has ingredient », « contains », « a pour ingrédient »… ?
By using a common interpretation of these descriptions, based on shared vocabularies, also called ontologies, that give an unambiguous meaning to verbs, subject categories and complements.
There is no such thing as « THE » Ontology; rather, each ontology can be seen as a particular « point of view » on the domain.
And ontologies can be aligned, shared and connected to make « points of view » interoperable.
ex:pizza23 rdf:type pizza recipe
ex:pizza23 food:hasIngredient tomato
ex:pizza23 food:hasIngredient mozzarella
ex:pizza23 food:hasIngredient mushroom
ex:pizza23 dc:subject myData:easy
ex:pizza23 schema:cookingTime 20 min
ex:pizza23 rdfs:label « Tino’s pizza »
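As a rough illustration, the formal description can be held in memory as a list of (subject, verb, complement) tuples. This is only a sketch in plain Python, not an RDF library: prefixed names are kept as plain strings, the cooking time is stored as a number of minutes, and the helper name `values` is hypothetical.

```python
# Each triple is a (subject, verb, complement) tuple.
# Prefixed names are kept as plain strings; no RDF library is assumed.
triples = [
    ("ex:pizza23", "rdf:type", "pizza recipe"),
    ("ex:pizza23", "food:hasIngredient", "tomato"),
    ("ex:pizza23", "food:hasIngredient", "mozzarella"),
    ("ex:pizza23", "food:hasIngredient", "mushroom"),
    ("ex:pizza23", "dc:subject", "myData:easy"),
    ("ex:pizza23", "schema:cookingTime", 20),  # minutes (assumed unit encoding)
]


def values(subject, verb):
    """Return every complement stated for a given subject and verb."""
    return [o for (s, v, o) in triples if s == subject and v == verb]


ingredients = values("ex:pizza23", "food:hasIngredient")
print(ingredients)
```

Because every statement is atomic, adding a new fact about the recipe is just appending one more tuple, without touching the others.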
More formal description
How are these rich snippets generated?
More formal question
?smthg rdf:type pizza recipe
?smthg schema:cookingTime < 20 min
?smthg dc:subject vegan
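The formal question can be read as three patterns that the same unknown `?smthg` must satisfy at once. The sketch below mimics that with set intersection over plain Python tuples; it is a stand-in for a real query language such as SPARQL, and the two recipes, their values and the helper `subjects_where` are made up for the example.

```python
# Hypothetical data: two recipes described as (subject, verb, complement) triples.
triples = [
    ("ex:pizza23", "rdf:type", "pizza recipe"),
    ("ex:pizza23", "schema:cookingTime", 15),  # minutes
    ("ex:pizza23", "dc:subject", "vegan"),
    ("ex:pizza42", "rdf:type", "pizza recipe"),
    ("ex:pizza42", "schema:cookingTime", 45),  # minutes
    ("ex:pizza42", "dc:subject", "vegan"),
]


def subjects_where(verb, test):
    """Return the subjects having a triple with this verb whose complement passes test."""
    return {s for (s, v, o) in triples if v == verb and test(o)}


# ?smthg must match all three patterns, hence the intersection.
matches = (
    subjects_where("rdf:type", lambda o: o == "pizza recipe")
    & subjects_where("schema:cookingTime", lambda o: o < 20)
    & subjects_where("dc:subject", lambda o: o == "vegan")
)
print(sorted(matches))
```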
Additional facets
Custom search
« Knowledge Graph »
• Vocabulary to structure data in HTML pages
  – Made by and for the big search engines
• Started mid-2011 by Yahoo!, Bing and Google
• + Yandex (Russian)
• Working group led by Dan Brickley
• Relies on HTML5 (Microdata and RDFa)
Thing
RDFa syntax
<div resource="/billets/probleme-platon" prefix="dc: http://purl.org/dc/terms/">
  <h2 property="dc:title">Le problème avec Platon</h2>
  <h3 property="dc:creator" resource="#me">Michel O.</h3>
</div>
<div class="sidebar" vocab="http://xmlns.com/foaf/0.1/" resource="#me" typeof="Person">
  <p>
    <span property="name">Michel O.</span>,
    Email: <a property="mbox" href="mailto:[email protected]">[email protected]</a>
  </p>
  <div>
    <ul>
      <li property="knows" typeof="Person">
        <a property="homepage" href="http://exemple.fr/platon">
          <span property="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>
Microdata syntax
<div itemscope itemtype="http://schema.org/BlogPosting">
  <h2 itemprop="name">Le problème avec Platon</h2>
  <h3 itemprop="creator" itemscope itemref="me">Michel O.</h3>
</div>
<div class="sidebar" id="me" itemscope itemtype="http://schema.org/Person">
  <p>
    <span itemprop="name">Michel O.</span>,
    Email: <a itemprop="email" href="mailto:[email protected]">[email protected]</a>
  </p>
  <div>
    <ul>
      <li itemprop="knows" itemscope itemtype="http://schema.org/Person">
        <a itemprop="url" href="http://exemple.fr/platon">
          <span itemprop="name">Platon</span>
        </a>
      </li>
    </ul>
  </div>
</div>
Which one should I choose?
RDFa lite vs. Microdata
• Same number of attributes
• Same complexity
• 99% same expressivity
• Same support in schema.org
Which one should I choose?
RDFa lite vs. Microdata
• RDFa: compatible with RDF world (URIs, triples, parsers)
• RDFa: more stable, more widely deployed
• RDFa core: more possibilities
• Facebook does not support Microdata
• 99% of Microdata markup encodes schema.org
By what means do ontologies unambiguously identify subjects, verbs and complements?
Using URIs
http://mydomain.org/mypath/myresource
URL: identifies what exists on the web
http://mon.site.fr
URI: identifies, on the web, what exists
http://animaux.fr/mon-zebre
Fabien Gandon : http://fr.slideshare.net/fabien_gandon
Good practice : on the web of data, every URI is also a URL
URL : phone number
URI : social security number
IRI: Internationalized Resource Identifier
Unicode URIs
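To make the difference concrete, here is a simplified sketch of how an IRI containing a non-ASCII character can be mapped to a plain ASCII URI by percent-encoding its path as UTF-8. The accented identifier is a hypothetical variant of the zebra example, the function name `iri_to_uri` is made up, and the sketch assumes the host name is already ASCII (a full conversion would also have to deal with internationalized domain names).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters of the path (simplified sketch)."""
    parts = urlsplit(iri)
    # Keep "/" and existing "%" escapes untouched; encode everything else as UTF-8.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


uri = iri_to_uri("http://animaux.fr/mon-zèbre")
print(uri)
```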
Chapter II: web of data to Publish
Why use web of data standards to publish data?
To share data with partners, applications, services…
What is the simplest mode of communication ?
« peer to peer » « hub and spoke »
Publishing data ? Is it Open Data then ?
http://5stardata.info
Open data
Data in the web
Linked data
Louvre is in Paris
Paris = http://fr.dbpedia.org/resource/Paris
Open Data and web of data
★ Data accessible on the web (in any format, even PDF or JPG)
★★ Structured data (Excel file instead of JPG)
★★★ Non-proprietary format (CSV instead of Excel)
★★★★ Use URIs to identify resources inside the data
★★★★★ Link data to other data sources
http://5stardata.info/
Open Data
Linked data –
web of data
Chapter III: web of data to Link
Why link information?
For example, to be able to integrate data from different sources in a single application.
Taken from http://graphityhq.com
http://exemple.com/Elvis plays guitar
http://exemple.com/Elvis lives in Las Vegas
A data source can speak about the same « subject » as another data source
A data source can use as « complement » a subject defined in another data source
http://data.insee.fr/Paris is in France
Elvis is in concert in http://data.insee.fr/Paris
http://exemple.fr/meet is a property (linking 2 people)
Thomas http://exemple.fr/meet Oliver
A data source can use a « verb » defined in another data source
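The three kinds of links can be sketched with two small in-memory data sources. This is an illustration in plain Python, not a linked data toolkit: the verbs are plain strings, the helper `follow` is hypothetical, and the sketch assumes that Elvis is identified by http://exemple.com/Elvis in the concert statement too.

```python
# Two hypothetical data sources, each a list of (subject, verb, complement) triples.
source_a = [
    ("http://exemple.com/Elvis", "plays", "guitar"),
    # The complement is a subject defined in the other source.
    ("http://exemple.com/Elvis", "is in concert in", "http://data.insee.fr/Paris"),
]
source_b = [
    ("http://data.insee.fr/Paris", "is in", "France"),
]

# Because both sources use the same URI for Paris, merging is a plain union.
merged = source_a + source_b


def follow(subject, verb):
    """Return the complements reachable from subject through verb."""
    return [o for (s, v, o) in merged if s == subject and v == verb]


# Walk from Elvis to the city of the concert, then to its country.
city = follow("http://exemple.com/Elvis", "is in concert in")[0]
country = follow(city, "is in")[0]
print(country)
```

The second hop only works because the identifier is shared: had each source used its own name for Paris, the merged data would contain two unrelated things.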
From a web of documents identified by URLs and interlinked by hypertext links…
… to a web of data identified by URIs and interlinked using triples « subject verb complement »
and
Extraction software
Cultural GPS
Collections access
teaching
accessibility
international
applications
Julien Cojan and Fabien Gandon: http://fr.slideshare.net/JulienCojan/dbpedia-cafein
dbpedia
wikipedia
Find a resource in DBPedia
1. Choose something to look up in DBPedia
   – « Jack Sparrow »
2. Note the URL of the Wikipedia page
   – http://en.wikipedia.org/wiki/Jack_Sparrow
3. Replace the beginning of the URL with « http://dbpedia.org/resource/ »
   – http://dbpedia.org/resource/Jack_Sparrow
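The URL rewriting described in these steps can be sketched as a small function. The function name `wikipedia_to_dbpedia` is hypothetical, and the sketch only handles English Wikipedia page URLs of the form shown above.

```python
from urllib.parse import urlparse

DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(wikipedia_url):
    """Map an English Wikipedia page URL to the matching DBpedia resource URI."""
    parsed = urlparse(wikipedia_url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith("/wiki/"):
        raise ValueError("not an English Wikipedia page URL: " + wikipedia_url)
    # Keep only the page name and put the DBpedia prefix in front of it.
    page_name = parsed.path[len("/wiki/"):]
    return DBPEDIA_PREFIX + page_name


uri = wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Jack_Sparrow")
print(uri)
```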
Chapter IV: (Re-)use
Web of data
Blablabla, blablablabla
He said all of that was already working, right?
Background of the image taken from the « blog des bits »: http://nurdcartoon.blogspot.com/
Find the common point between
- Pierre Curie: French physicist
- Boutros Boutros Ghali: Egyptian diplomat
- Jackie Kennedy: JFK’s wife
http://relfinder.dbpedia.org
Allow researchers to publish their data
http://www.nakala.fr
for your data
1. Persistent identifiers
2. Persistent access to data file
3. Data archival
4. Metadata publishing
   1. URIs and content negotiation
   2. OAI-PMH
   3. SPARQL endpoint
5. In the future… linking (to DBPedia)?
1. Uploading / publishing
2. Access
• Data (embeddable in another website)
  – http://www.nakala.fr/data/11280/1b2c0d4f
• Metadata
  – Human or machine version
    • http://www.nakala.fr/metadata/11280/1b2c0d4f
  – Human version
    • http://www.nakala.fr/page/data/11280/1b2c0d4f
  – Machine version
    • http://www.nakala.fr/data/data/11280/1b2c0d4f
3. Harvest or query
• OAI-PMH publishing (your data only)
  – https://www.nakala.fr/oai/11280/93ec8e76?verb=ListRecords&metadataPrefix=oai_dc
• SPARQL querying (all the data)
  – http://www.nakala.fr/sparql
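As a sketch of what a harvesting client prepares, the snippet below builds the OAI-PMH `ListRecords` URL from its parameters and sets up, without sending it, a request that asks for a machine-readable version through the HTTP `Accept` header. The function names are hypothetical, and the `application/rdf+xml` media type is an assumption: the slides do not say which formats the server actually offers.

```python
from urllib.parse import urlencode
from urllib.request import Request


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for a given repository endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def negotiated_request(uri, media_type):
    """Prepare (without sending) a request asking for a given representation."""
    return Request(uri, headers={"Accept": media_type})


harvest_url = oai_list_records_url("https://www.nakala.fr/oai/11280/93ec8e76")
# Assumption: the server understands this media type.
request = negotiated_request(
    "http://www.nakala.fr/metadata/11280/1b2c0d4f", "application/rdf+xml"
)
print(harvest_url)
print(request.get_header("Accept"))
```

Nothing is sent over the network here; passing `request` to `urllib.request.urlopen` would perform the actual call.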
Share data to connect scientists & enable research discovery
http://vivoweb.org
What is VIVO?
• A web portal that can be deployed in research institutions…
• … and can be fed with data about
  – Researchers
  – Labs
  – Publications
  – Events
  – And more…
• … and lets users search, navigate and edit that data…
• … and publishes the data back for others to reuse.
What is VIVO?
• Example installations
  – Meta-VIVO: http://vivo.vivoweb.org
  – U. Florida: https://vivo.ufl.edu/
  – Bournemouth: http://staffprofiles.bournemouth.ac.uk/
• (find others at vivoweb.org)
Visualizations
• http://vivo.cns.iu.edu/gallery.html
vivosearch.org
• Search on data across multiple institutions
• Possible only because the data is shared!
Interinstitutional collaboration dataviz
• http://xcite.hackerceo.org/VIVOviz/visualization.html
• Possible only because the data is shared…
• … and the data is talking about the same “thing” (here, the same publication)
Using data from the web to enrich content reading
http://labs.sparna.fr
http://dev.presek-i.com/onmt_demo/
Create mashups with data from the web
http://labs.antidot.net/museesdefrance
Use data from the web to power an API
http://seevl.net
“The data seevl utilizes come from YouTube, Musicbrainz, Freebase, DBPedia, Google Plus, and Facebook, and other sources”.
Publish a library catalogue
http://data.bnf.fr
http://www.rencontres-numeriques.org/2013/mediation/docs/rn2013-BNF-opendata.pptm
Catalogue général (12 M)
Collections numérisées (2.5 M)
BnF Archives & Manuscrits
Web pages for humans
Structured data for machines
data.bnf.fr (October 2013): 200 000 authors, 170 000 themes, 92 000 works
Objective: all the BnF catalogues by the end of 2015?
data.bnf.fr:
• +70 000 unique visitors per month
• +80% from search engines
• 50-70% conversion to Gallica and catalogues
Conclusion
Structuring, Identifying, Publishing, Linking, (Re-)using
http://everywhereishere2009.blogspot.fr/2009/08/first-thoughts-designing-new-knowledge.html (pending the author’s permission)
Thomas FRANCART
sparna.fr
Credits: Fabien Gandon, Serge Garlatti, Pierre-Yves Vandenbussche