ESWC SS 2013 - Tuesday Tutorial 2, Maribel Acosta and Barry Norton: Interaction with Linked Data

Interaction with Linked Data Presented by: Maribel Acosta Barry Norton

Upload: eswcsummerschool

Post on 14-May-2015


TRANSCRIPT

Page 1: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Interaction with Linked Data

Presented by: Maribel Acosta, Barry Norton

Page 2:

Motivation: Music!

[Figure: architecture of the music application. Labels: Data acquisition (Metadata, Streaming providers, Downloads, Musical Content, Other content, Physical Wrapper, LD Wrapper, R2R Transf., RDF/XML, RDFa); Integrated Dataset (Interlinking, Cleansing, Vocabulary Mapping); Access (LD Dataset, LD Wrapper); Publishing (SPARQL Endpoint); Application (Visualization Module, Analysis & Mining Module).]

EUCLID – Interaction with Linked Data

Page 3:

Motivation: Music! (2)

• Our aim: build a music-based portal using Linked Data technologies

• So far, we have studied different mechanisms to consume Linked Data:
  • Executing SPARQL queries
  • Dereferencing URIs
  • Downloading RDF dumps
  • Extracting RDFa data

• The output of these mechanisms corresponds to data in machine-readable formats

[Slide annotations: CH 2, CH 3, CH 1]

Page 4:

Motivation: Music! (3)

Examples of machine-readable output:

Page 5:

Motivation: Music! (4)

Visualization techniques are needed in order to transform the machine-readable data into this:

Source: http://musicbrainz.fluidops.net/

Page 6:

Motivation: Music! (5)

In addition, visualization techniques allow for:

• Telling a story

• Engaging our pattern-matching brain

• Identifying data characteristics which cannot be directly inferred from statistical properties:
  • Anscombe's quartet: four very different datasets with (nearly) the same summary statistics.

Image: http://en.wikipedia.org/wiki/Anscombe's_quartet

Source: Donaldson, I. and Lamere, P. Using Visualizations for Music Discovery

Image: Chan, W., Qu, H., Mak, W. Visualizing the Semantic Structure in Classical Musical Works.
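The Anscombe's quartet point can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values are the published quartet, while the helper names (`pearson`, `summarize`) are illustrative choices, not part of the tutorial.

```python
from statistics import mean, variance

# Anscombe's quartet: four (x, y) datasets that look very different when plotted.
X123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand to avoid version-specific helpers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics rounded to two decimals."""
    return {
        "mean_x": round(mean(xs), 2),
        "var_x": round(variance(xs), 2),
        "mean_y": round(mean(ys), 2),
        "r": round(pearson(xs, ys), 2),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, stats)
```

All four datasets yield the same rounded summary, which is why a plot is needed to tell them apart.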

Page 7:

Agenda

1. Linked Data visualization

2. Linked Data search

3. Methods for Linked Data analysis

Page 8:

LINKED DATA VISUALIZATION

Page 9:

LD Visualization Techniques

• Linked Data visualization techniques should provide graphical representations of the information within the LD datasets

• Visualization techniques should be selected according to:

  – The type of data: specific types of data should be visualized in a certain way

  – The purpose of the visualization: depending on the type of analysis/application to employ

Page 10:

LD Visualization Techniques (2)

Overview of the Linked Data Visualization process

RDF data → [Data extraction] → Analytically extracted data → [Visualization transformation] → Visualization abstraction → [Visual mapping transformation] → View → [User interaction (optional)]

• (Raw) RDF data: instance data, taxonomies, ontologies, vocabularies.

• Analytically extracted data: subset of the data, called the region of interest (ROI), obtained via data extraction mechanisms, for example, SPARQL queries.

• Visualization abstraction: obtained by applying visualization transformations to render the data into displayable information.

• View: final result. The visual mapping transformations obtain a graphic representation of the data using the selected visualization technique.

• User interaction: the user interacts (click, zoom, etc.) with the visualization, which may trigger a new visualization process.

Process partially based on: Brunetti, J.M.; Auer, S.; García, R. The Linked Data Visualization Model.
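The stages of this process can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the triples, the `ex:country` predicate, and the text-based "view" are hypothetical stand-ins for a real dataset and a real chart widget.

```python
from collections import Counter

# Hypothetical (subject, predicate, object) triples standing in for raw RDF data.
TRIPLES = [
    ("ex:release1", "rdf:type", "mo:Release"),
    ("ex:release1", "ex:country", "GB"),
    ("ex:release2", "rdf:type", "mo:Release"),
    ("ex:release2", "ex:country", "GB"),
    ("ex:release3", "rdf:type", "mo:Release"),
    ("ex:release3", "ex:country", "US"),
    ("ex:artist1", "rdf:type", "mo:MusicArtist"),
]


def extract(triples):
    """Data extraction: select the region of interest (release, country)."""
    releases = {s for s, p, o in triples if p == "rdf:type" and o == "mo:Release"}
    return [(s, o) for s, p, o in triples if p == "ex:country" and s in releases]


def to_abstraction(rows):
    """Visualization transformation: aggregate rows into (category, value) pairs."""
    counts = Counter(country for _, country in rows)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def to_view(abstraction):
    """Visual mapping transformation: render the pairs as a text bar chart."""
    return "\n".join(f"{label:<3}{'#' * value} {value}" for label, value in abstraction)


def visualize(triples):
    """Run the whole pipeline from raw data to view."""
    return to_view(to_abstraction(extract(triples)))


if __name__ == "__main__":
    print(visualize(TRIPLES))
```

A user interaction (for example, clicking a country) would correspond to calling `visualize` again with a different extraction step.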

Page 11:

LD Visualization Techniques (3)

Example of the Linked Data Visualization process

RDF data → Analytically extracted data → Visualization abstraction → View

Data extraction
SPARQL query: retrieve the number of releases per country of The Beatles

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX mo:   <http://purl.org/ontology/mo/>

    SELECT ?country (COUNT(?release) AS ?releases) WHERE {
      <http://dbpedia.org/resource/The_Beatles> foaf:made ?release .
      ?release a mo:Release ;
               mo:label ?label .
      ?label foaf:based_near ?country .
    }
    GROUP BY ?country
    ORDER BY DESC(?releases)

Analytically extracted data:

    country          releases
    United Kingdom   225
    United States    140
    Germany          30
    Luxembourg       29
    …                …

Visualization transformation
Formatting the names of the countries:

    ?country_code2 := REPLACE(str(?country), "http://ontologi.es/place/", "", "i")
    ?country_code  := REPLACE(?country_code2, "%", "", "i")

Visualization abstraction:

    country_code   releases
    GB             225
    US             140
    DE             30
    LU             29
    …              …

Visual mapping transformation
Selecting the visualization technique (input, output):

    #widget : HeatMap | input = 'country_code' | output = {{ 'releases' }}

Can be performed in a single step
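The country-name formatting step can be mirrored outside SPARQL. A minimal sketch, assuming the country URIs end in a two-letter code as in the abstraction table; the concrete URIs below are hypothetical, and unlike SPARQL's `REPLACE` the prefix is matched literally rather than as a regular expression.

```python
import re

PLACE_PREFIX = "http://ontologi.es/place/"


def country_code(country_uri: str) -> str:
    """Mirror the two REPLACE calls: drop the URI prefix (case-insensitive), then any '%'."""
    without_prefix = re.sub(re.escape(PLACE_PREFIX), "", country_uri, flags=re.IGNORECASE)
    return without_prefix.replace("%", "")


# Hypothetical query results; the URI suffixes are assumptions chosen to match the table.
rows = [
    ("http://ontologi.es/place/GB", 225),
    ("http://ontologi.es/place/US", 140),
    ("http://ontologi.es/place/DE", 30),
    ("http://ontologi.es/place/LU", 29),
]

# Visualization abstraction: (country_code, releases) pairs ready for a map widget.
abstraction = [(country_code(uri), releases) for uri, releases in rows]

if __name__ == "__main__":
    for code, releases in abstraction:
        print(code, releases)
```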

Page 12:

LD Visualization Techniques (3)

Example of the Linked Data Visualization process

View

Page 13:

Challenges for Linked Data Visualization

• Enabling user interaction
  – Users must be able to navigate through the data by exploiting the connections between Linked Data resources
  – The user might edit the underlying data to enrich it by:
    • Creating additional metadata
    • Highlighting or correcting errors
    • Validating data

• Supporting data reusability
  – The output (the plotted data or the visualization itself) might be encoded using standard ontologies and vocabularies

• Scalability
  – Linked Data visualization techniques should support the display of large amounts of data in an efficient way

Page 14:

Challenges for Linked Open Data Visualization

• Extracting data from different repositories
  – A Linked Data set might be partitioned into several repositories
  – The region of interest (ROI) might include data from different data sets, requiring access to distributed repositories

• Handling heterogeneous data
  – The same data (concepts) might be modeled differently, for example, using different vocabularies
  – Certain values might have different formats, for example, dates represented as DD-MM-YYYY, MM-DD-YYYY or just YYYY

• Dealing with missing values
  – Because Linked Data is semi-structured, some instances might have missing values for certain properties
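The date-format and missing-value challenges can be illustrated with a small normalization helper. This is a sketch under assumptions: the source names and the mapping from source to date layout are hypothetical, and in practice that mapping has to come from knowledge of each dataset, since a string such as 05-10-1962 is ambiguous on its own.

```python
from datetime import datetime
from typing import Optional

# Hypothetical mapping from data source to the date layout it uses.
FORMATS = {"source_a": "%d-%m-%Y", "source_b": "%m-%d-%Y", "source_c": "%Y"}


def normalize_date(value: Optional[str], source: str) -> Optional[str]:
    """Return an ISO 8601 date (or a bare year), or None when the value is missing."""
    if value is None or not value.strip():
        return None  # missing values are kept explicit instead of being guessed
    fmt = FORMATS[source]
    parsed = datetime.strptime(value.strip(), fmt)
    return parsed.strftime("%Y") if fmt == "%Y" else parsed.date().isoformat()


if __name__ == "__main__":
    print(normalize_date("05-10-1962", "source_a"))
    print(normalize_date("05-10-1962", "source_b"))
    print(normalize_date("1962", "source_c"))
    print(normalize_date(None, "source_a"))
```

The same input string produces two different dates depending on the declared source, which is exactly why the format must be resolved before plotting values on a timeline.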

Page 15:

Classification of Visualization Techniques

Task: Comparison of attributes / values
Visualization techniques:
• Bar/column and pie chart
• Line charts
• Histogram

Task: Analysis of relationships and hierarchies
Visualization techniques:
• Graph
• Arc diagram
• Matrix
• Node-link visualizations
• Space-filling techniques: treemaps, icicles and sunburst, circle packing and rose diagrams

Task: Analysis of temporal or geographical events
Visualization techniques:
• Timeline
• Maps

Task: Analysis of multi-dimensional data
Visualization techniques:
• Parallel coordinates
• Radar/star chart
• Scatter plot

Page 16:

Comparison of Attributes / Values

Bar/column chart
Allows the comparison of values of different categories.

Pie chart
Useful for performing comparison of percentages or proportions.

Line chart
Allows visualizing data as a series of data points, where the measurement points (x-axis) are ordered.

Histogram
Graphical representation of the distribution of the data.

Image sources: http://mbostock.github.io/protovis/ and http://musicbrainz.fluidops.net

Page 17:

Analysis of Relationships and Hierarchies

Graph
The data entries are represented as nodes and the links as edges.

Arc diagram
The nodes are displayed in one dimension, and the arcs represent the connections.

Adjacency matrix diagram
The nodes are displayed as rows and columns, and the links between the nodes are entries in the matrix.

Node-link visualizations
The data is organized in hierarchies.

Source of images: http://mbostock.github.io/protovis/
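The data structure behind an adjacency matrix diagram can be sketched as follows. The edge list is a hypothetical example of links between resources; only the row/column construction reflects the description above.

```python
def adjacency_matrix(edges):
    """Build a 0/1 matrix whose rows and columns are the nodes, in sorted order."""
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1  # directed link: row -> column
    return nodes, matrix


# Hypothetical links between resources (source, target).
EDGES = [
    ("The_Beatles", "John_Lennon"),
    ("The_Beatles", "Paul_McCartney"),
    ("John_Lennon", "Imagine"),
]

NODES, MATRIX = adjacency_matrix(EDGES)

if __name__ == "__main__":
    print(NODES)
    for row in MATRIX:
        print(row)
```

A graph or arc diagram would draw the same `EDGES` list as lines between nodes; the matrix form trades that for a layout with no edge crossings.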

Page 18:

Analysis of Relationships and Hierarchies (2)

Space-filling techniques

Treemaps
Subdivide area into rectangles.

Icicles and sunburst
Hierarchies are represented by adjacencies.

Circle-packing
Containment is used to represent the hierarchies.

Rose diagrams
Sectors have equal angles, and the data is represented by the extent of each sector.

Source of images: http://mbostock.github.io/protovis/
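As an illustration of how a treemap subdivides an area into rectangles, here is a one-level "slice" layout that splits a rectangle proportionally to each item's size. It is a simplified sketch, not the algorithm used by any particular toolkit, and the genre counts are hypothetical.

```python
def slice_layout(sizes, x, y, width, height, horizontal=True):
    """Split the rectangle (x, y, width, height) into strips proportional to sizes.

    Returns a dict mapping each name to its own (x, y, width, height) rectangle.
    """
    total = sum(sizes.values())
    rects = {}
    offset = 0.0
    for name, size in sizes.items():
        if horizontal:
            strip = width * size / total
            rects[name] = (x + offset, y, strip, height)
        else:
            strip = height * size / total
            rects[name] = (x, y + offset, width, strip)
        offset += strip
    return rects


# Hypothetical number of releases per genre.
GENRES = {"Rock": 50, "Pop": 30, "Jazz": 20}
LAYOUT = slice_layout(GENRES, 0, 0, 100, 40)

if __name__ == "__main__":
    for name, rect in LAYOUT.items():
        print(name, rect)
```

Applying the same function recursively to each rectangle, alternating `horizontal`, gives the classic slice-and-dice treemap for deeper hierarchies.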

Page 19:

Analysis of Temporal or Geographical Events

Timeline
• Discrete data points in time
• Continuous data in time

Sources: http://www.kottke.org/08/08/2008-movie-box-office-chart, http://musicbrainz.fluidops.net

Maps
• Choropleth maps: aggregate data by geographical area
• Location maps: display geo-points on a map
• Dorling cartograms: aggregate data and replace each area with a circle

Sources: http://mbostock.github.io/protovis/, Google Maps API, http://musicbrainz.fluidops.net

Page 20:

Analysis of Multidimensional Data

Parallel coordinates
Allows visualizing high-dimensional data. Each vertical axis denotes a dimension, and a multidimensional point is represented as a polyline with vertices on the axes.

Radar/star chart
Displays multivariate data as a two-dimensional chart. The axes correspond to the variables.

Scatter plot
Displays the values of two variables as points, which helps reveal relationships between them.

Source of images: http://mbostock.github.io/protovis/
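The polyline construction used by parallel coordinates can be sketched as a scaling step: each record becomes one vertex per axis, positioned between that axis's minimum and maximum. The track records and dimension names are hypothetical.

```python
def polylines(records, dimensions):
    """Map each record to a list of (axis_index, position) vertices, position in [0, 1]."""
    lo = {d: min(r[d] for r in records) for d in dimensions}
    hi = {d: max(r[d] for r in records) for d in dimensions}

    def scale(dimension, value):
        span = hi[dimension] - lo[dimension]
        # A constant dimension has no spread; place its vertices mid-axis.
        return 0.5 if span == 0 else (value - lo[dimension]) / span

    return [
        [(i, scale(d, record[d])) for i, d in enumerate(dimensions)]
        for record in records
    ]


# Hypothetical tracks described by three numeric dimensions.
TRACKS = [
    {"duration": 120, "year": 1963, "tempo": 100},
    {"duration": 240, "year": 1969, "tempo": 140},
    {"duration": 180, "year": 1966, "tempo": 100},
]
LINES = polylines(TRACKS, ["duration", "year", "tempo"])

if __name__ == "__main__":
    for line in LINES:
        print(line)
```

Drawing each vertex list as a connected line across evenly spaced vertical axes yields the parallel-coordinates view.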

Page 21:

Other Visualization Techniques

• Text-based visualizations: tag clouds

• Some of the previously presented techniques can be combined to produce more complex data visualizations

Examples shown: Phrase Net of Beatles lyrics; DBpedia music genres

Sources: http://www.wordle.net, http://many-eyes.com

Page 22:

Applications of Linked Data Visualization Techniques

• Getting an overview of the data

• Identifying relevant resources, classes or properties in datasets

• Learning about certain underlying characteristics of the data, e.g., vocabularies or ontologies

• Detecting missing links between nodes in an RDF graph

• Discovering new paths between nodes in an RDF graph

• Identifying hidden patterns in the data

• Finding errors or atypical values (outliers)

Page 23:

Linked Data Visualization Tool Requirements

The requirements for visualization tools that consume Linked Data can be summarized as follows:

• Data navigation and exploration capabilities in order to understand the structure and the content

• Exploiting data structures:
  • Links to visualize hierarchies or graphs
  • Multi-dimensional

• User interaction:
  • Basic and advanced querying
  • Filtering values
  • Interactive UI: responsive to the user input

• Publication/syndication of the graphical representation of the data

• Data extraction in order to export the data so that it can be reused by third parties

Page 24:

Linked Data Visualization Tool Types

1. LD browsers with text-based representation
• Dereference URIs to retrieve the resource description
• Use a textual representation of LD resources
• Display text and images appropriately
• Mainly support exploratory browsing and knowledge discovery

2. LD and RDF browsers with visualization options
• Exploit pictures, graphics, images and other visual representations of the data
• Support user interaction: allow querying, filtering and jumping between resources
• Suitable for browsing and knowledge discovery as well as analytic activities

Page 25:

Linked Data Visualization Tool Types (2)

3. Visualization toolkits
• Frameworks providing a wide range of visualization techniques
• General toolkits support LD visualization by applying a set of transformations of the data
• Some toolkits are specially designed to consume LD

4. SPARQL visualization
• These tools allow transforming the output of SPARQL queries into graphics
• Contact SPARQL endpoints in order to evaluate the query
• Suitable for analytical activities

Page 26:

Linked Data Visualization Tool Types (3)

LD browsers with text-based presentations
• Sig.ma
• Sindice
• OpenLink RDF Browser
• Marbles
• Disco Hyperdata Browser
• Piggy Bank (SIMILE)
• Zitgist DataViewer
• iLOD
• URI Burner
• Dipper – Talis Platform Browser

LD and RDF browsers with visualization options
• Tabulator
• IsaViz
• OpenLink Data Explorer
• RDF Gravity
• RelFinder
• DBpedia Mobile
• LESS
• SIMILE Exhibit
• Haystack
• FoaF Explorer
• Humboldt
• LENA
• Noadster

Visualization toolkits
Linked Data tools:
• Information Workbench
• Visual RDF (by Graves)
• LOD Live
• LOD Visualization
General data:
• Data-Driven Documents (D3)
• NetworkX
• Many Eyes
• Tableau
• Prefuse

SPARQL visualization
• Information Workbench
• Google Visualization API
• SPARQL package for R
• Gruff (for AllegroGraph)

Page 27:

Linked Data Visualization Examples (1)

Sig.ma

• Keyword search
• Retrieves information from different LD sources
• Displays values per predicate
• Displays the source for each value

Source: http://sig.ma/search?q=The+Beatles

Page 28:

Linked Data Visualization Examples (2)

Sig.ma

Displays values per predicate:
• May include (redundant) information in different languages, for example: annés and anno
• URIs are clickable, allowing navigation through RDF resources

Summary:
• Sig.ma lists all the triples and groups them per predicate
• Useful for browsing predicates and values within data sets
• The meaning of the values is not evident

Source: http://sig.ma/search?q=The+Beatles

Page 29:

Linked Data Visualization Examples (3)

Sindice

• Keyword search
• Filtering per type of document
• Retrieves links to documents
• Allows accessing cached documents
• Allows inspecting resources

Source: http://sindice.com/search?q=The+Beatles

Page 30:

Linked Data Visualization Examples (4)

Sindice

Both interfaces (cached triples and live triples) display the set of triples related to the inspected resource

Page 31:

Linked Data Visualization Examples (5)

Information Workbench

• Demo available at: http://musicbrainz.fluidops.net

• Displays human-readable content about Linked Data resources

• Supports visualization techniques (different types of charts, maps, timelines, etc.) to plot results from SPARQL queries

• Allows the user to interact with the displayed data

Page 32:

Linked Data Visualization Examples (6)

Information Workbench: Browsing a music artist

(1) Search options
(2) Search results

Page 33:

Linked Data Visualization Examples (7)

Information Workbench: Browsing a music artist

(3) Browsing the selected resource

Page 34:

Linked Data Visualization Examples (8)

Information Workbench: Visualization techniques

(3) Browsing the selected resource

Page 35:

Linked Data Visualization Examples (9)

Information Workbench: User interaction

LD visualizations must support navigation through the data

Source: http://musicbrainz.fluidops.net/resource/Analytical5

Page 36:

Linked Data Visualization Examples (9)

Information Workbench: SPARQL Visualization

Implements widgets which allow:

• Retrieving ROI via SPARQL queries
• Selecting the appropriate visualization technique
• Configuring parameters of the visualization

Page 37:

Linked Data Visualization Examples (10)

Information Workbench: SPARQL visualization

Top ten The Beatles releases according to the sum of track durations in minutes

SPARQL query:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX mo:   <http://purl.org/ontology/mo/>
    PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

    # Note: ?avg holds the summed track duration in minutes, not an average.
    SELECT ?release (SUM(xsd:double(?duration/60000)) AS ?avg)
    WHERE {
      <http://dbpedia.org/resource/The_Beatles> foaf:made ?release .
      ?release mo:record ?record .
      ?record mo:track ?track .
      ?track mo:duration ?duration .
    }
    GROUP BY ?release
    ORDER BY DESC(?avg)
    LIMIT 10

Result set
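To make the aggregation explicit, the same grouping, summing, ordering and limiting can be sketched over in-memory rows. The release names and durations below are hypothetical; only the logic (milliseconds to minutes, sum per release, descending order, limit) follows the query.

```python
from collections import defaultdict


def top_releases(tracks, limit=10):
    """Sum track durations (given in milliseconds) per release, in minutes, largest first."""
    totals = defaultdict(float)
    for release, duration_ms in tracks:
        totals[release] += duration_ms / 60000  # same conversion as ?duration/60000
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:limit]


# Hypothetical (release, track duration in ms) rows.
TRACKS = [
    ("Release A", 120000),
    ("Release B", 60000),
    ("Release A", 180000),
    ("Release C", 90000),
]

if __name__ == "__main__":
    for release, minutes in top_releases(TRACKS):
        print(release, minutes)
```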

Page 38:

Linked Data Visualization Examples (11)

Information Workbench: SPARQL visualization

Top ten The Beatles releases according to the sum of track durations in minutes

Visualization: Bar chart

Widget:

    {{#widget: BarChart
     | query = 'SELECT (COUNT(?Release) AS ?COUNT) ?label WHERE {
         <http://musicbrainz.org/artist/8538e728-ca0b-4321-b7e5-cff6565dd4c0#_> foaf:made ?Release .
         ?Release rdf:type mo:Release .
         ?Release dc:title ?label .
       } GROUP BY ?label ORDER BY DESC(?COUNT) LIMIT 20'
     | settings = 'Settings:barvertical_mb'
     | asynch = 'true'
     | input = 'label'
     | output = 'COUNT'
     | height = '300'
    }}

Page 39:

Linked Data Visualization Examples (12)

Information Workbench: SPARQL visualization

Top ten The Beatles releases according to the sum of track durations in minutes

Other visualizations of the same result set:

• Line chart
• Pie chart

Page 40:

Linked Data Visualization Examples (13)

Information Workbench: Automated Widget Suggestion

Suggested visualizations: bar chart, line chart, pie chart, table, pivot view

Steps shown (1 to 3): select a suggested visualization; the visualization is built automatically

Page 41:

Linked Data Visualization Examples (14)

Other tools

LOD live
• Graph visualizations
• Interactive UI (the graph can be expanded by clicking on the nodes)
• Live access to SPARQL endpoints

Source: http://en.lodlive.it

LOD Visualization
• Hierarchy visualizations: treemaps and trees
• Live access to SPARQL endpoints (supporting JSON and SPARQL 1.1)

Source: http://lodvisualization.appspot.com

Page 42:

Linking Open Data Cloud Visualization (1)

"The Linking Open Data cloud diagram" by Richard Cyganiak and Anja Jentzsch

• The nodes correspond to Linked Data sets

• The edges represent connections between Linked Data sets

• The size of the nodes is proportional to the number of triples in each data set

• The datasets are categorized by knowledge domains represented with colors

Source: http://lod-cloud.net

Page 43:

Linking Open Data Cloud Visualization (2)

"Linked Open Data Cloud" generated by Gephi

• The central cluster (green) displays DBpedia as a central focus

• The size of the nodes reflects the size of the datasets

• The length of the connections encodes information about the data structure

Image source: http://twitpic.com/17qj1h

Source: A. Dadzie and M. Rowe. Approaches to Visualizing Linked Data: A Survey. 2011

Page 44:

Linking Open Data Cloud Visualization (3)

"Linked Open Data Graph" by Protovis

• The data to be displayed are retrieved using the CKAN API

• The nodes represent Linked Data sets available in the Data Hub "lod-cloud" group

• The size of the nodes is proportional to the data set size

• Edges are connections between data sets

• The colors reflect the CKAN rating and the intensity of the color reflects the number of received ratings

• The nodes can be clicked to go to the data set CKAN page

Source: http://inkdroid.org/lod-graph/

Page 45:

LD Reporting

• Visualization techniques are used in the creation of reports included in data monitoring and management solutions

• They provide an overview of the dataset by generating a low-level descriptive analysis:
  • Quantitative information about the dataset

• Users may interact with the data via dashboards

• Some systems support this feature over structured data:
  • Google Webmaster Tools (https://www.google.com/webmasters/tools)
  • Information Workbench (http://www.fluidops.com/information-workbench)
  • eCloudManager (http://www.fluidops.com/ecloudmanager)

Page 46:

Google Webmaster Tools: Structured Data Dashboard (1)

• Provides webmasters with information about the structured data embedded in their websites (and recognized by Google)

• The dashboard has three levels:
  i. Site-level view: aggregates the data by classes defined in the vocabulary schema
  ii. Item-type-level view: provides details per page for each type of resource
  iii. Page-level view: shows the attributes of every type of resource on a given web page

Page 47:

Google Webmaster Tools: Structured Data Dashboard (2)

Site-level view

Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html

Page 48:

Google Webmaster Tools: Structured Data Dashboard (3)

Site-level view

Page-level view

Source: http://googlewebmastercentral.blogspot.de/2012/07/introducing-structured-data-dashboard.html

Page 49:

LINKED DATA SEARCH

Page 50:

Semantic Search Process

Using semantic models for the search process

[Figure: semantic search process between user and system. Labels: User query (e.g. keywords, NL); Query visualization (optional); Entity extraction / Semantic query analysis; Query; Data graphs; Graph matching; Presentation / Ranking; Result visualization/presentation; Refinement; Analysis; Presentation; Faceted Search; Semantic Search.]

Image based on: Tran, T., Herzig, D., Ladwig, G. SemSearchPro: Using semantics through the search process

Page 51:

Semantic Search: Example (1)

User query (NL): "songs written by members of the beatles"

Entity extraction: song | member (of) | written by | (the) beatles

Query expansion: song → synonyms: track, melody, tune, …

Entity mapping: candidates: mo:Track

Image source: http://musicontology.com

Page 52:

Semantic Search: Example (2)

User query (NL): "songs written by members of the beatles"

Entity extraction: song | member (of) | written by | (the) beatles

Query expansion: written by → writer, composer, creator, … (synonym, inverse of)

Entity mapping: candidates: mo:composer

Image source: http://musicontology.com

Page 53: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Example  (3)  

User query (NL): “songs written by members of the beatles”

Entity extraction: song | member (of) | written by | (the) beatles

Query expansion: member (of)

Entity mapping: mo:member_of, which is the inverse of mo:member

Image source: http://musicontology.com

Page 54: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Example  (4)  

User query (NL): “songs written by members of the beatles”

Entity extraction: song | member (of) | written by | (the) beatles

Entity mapping: candidates for “(the) beatles”:
• Beatles (Book)
• The Beatles (Music Group)
• Beatle (Animal)
• Beatle (Automobile)

How to identify the right “Beatle”? Examine the context (Contextual Analysis)

Page 55: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Example  (5)  

User query (NL): “songs written by members of the beatles”

Entity extraction: song | member (of) | written by | (the) beatles

Contextual analysis: this subgraph of the ontology is part of the query:
• mo:Track mo:composer foaf:Agent
• mo:MusicGroup mo:member foaf:Agent
• mo:MusicGroup rdfs:subClassOf mo:MusicArtist
• mo:MusicArtist rdfs:subClassOf foaf:Agent

Entity mapping: (the) beatles → The Beatles (Music Group) → dbpedia:The_Beatles

Page 56: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Example  (6)  

User query (NL): “songs written by members of the beatles”

Entity extraction: song | member (of) | written by | (the) beatles

Query (graph pattern):
• ?x a mo:Track
• ?x mo:composer ?y
• dbpedia:The_Beatles mo:member ?y
• ?y a foaf:Agent

Results:
• (I want to) Come Home
• Angel in Disguise
• Another Day
• …

Answers presented to the user. The results could be ranked.
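The graph matching step of this example can be sketched in a few lines of Python. This is only an illustration over a toy in-memory graph: the ex: resources and the triples are made up, and a real system would evaluate the pattern against an RDF store or a SPARQL endpoint instead.

```python
# Toy RDF-like graph as (subject, predicate, object) triples.
# All ex: resources are invented for illustration.
TRIPLES = {
    ("ex:track1", "rdf:type", "mo:Track"),
    ("ex:track1", "mo:composer", "ex:paul"),
    ("ex:track2", "rdf:type", "mo:Track"),
    ("ex:track2", "mo:composer", "ex:someone_else"),
    ("dbpedia:The_Beatles", "mo:member", "ex:paul"),
    ("ex:paul", "rdf:type", "foaf:Agent"),
    ("ex:someone_else", "rdf:type", "foaf:Agent"),
}

# The query graph from the example, written as triple patterns.
PATTERNS = [
    ("?x", "rdf:type", "mo:Track"),
    ("?x", "mo:composer", "?y"),
    ("dbpedia:The_Beatles", "mo:member", "?y"),
    ("?y", "rdf:type", "foaf:Agent"),
]


def is_var(term):
    """Variables are written with a leading question mark."""
    return term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, triples, extended)


tracks = sorted({b["?x"] for b in match(PATTERNS, TRIPLES)})
print(tracks)
```

Only ex:track1 satisfies all four patterns, because its composer is the only agent that is also listed as a member of dbpedia:The_Beatles in the toy data.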

Page 57: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search  

• Aims at understanding the meaning of the resources specified in the query

• Different approaches to exploit semantics:
    • Query expansion using ontologies: since ontologies represent knowledge about specific domains, they can be used to expand the query by incorporating related ontology terms into it.
    • Contextual analysis: in LD, this approach may explore the resources specified in the query and their adjacent nodes in the RDF graph. It is mainly applied to disambiguate query terms.
    • Reasoning: in some cases, the answer to a specific query is not explicitly contained in the data, but it can be computed by using reasoning methods.
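Query expansion followed by entity mapping can be sketched as two table lookups. The synonym table and the term-to-ontology mapping below are hypothetical stand-ins; a real system would derive them from a thesaurus and from the labels in the ontology.

```python
# Hypothetical synonym table and term-to-ontology mapping (assumptions,
# not taken from any real thesaurus or from the Music Ontology itself).
SYNONYMS = {
    "song": ["track", "melody", "tune"],
    "written by": ["writer", "composer", "creator"],
}
ONTOLOGY_TERMS = {
    "track": "mo:Track",
    "composer": "mo:composer",
    "member": "mo:member",
}


def expand(term):
    """Return the term followed by its known synonyms."""
    return [term] + SYNONYMS.get(term, [])


def map_to_ontology(term):
    """Return candidate ontology terms for a query term, in expansion order."""
    return [ONTOLOGY_TERMS[t] for t in expand(term) if t in ONTOLOGY_TERMS]


candidates = {t: map_to_ontology(t) for t in ["song", "written by", "member"]}
print(candidates)
```

Here "song" has no direct ontology match, but its synonym "track" does, which is the effect the expansion step is meant to achieve.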


Page 58: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search  &  Linked  Data  

Semantic Search vs. SPARQL query

Component | Semantic search | SPARQL query
Keyword or NL / concept matching | Performs entity extraction and matching to formal concepts | Not supported
Fuzzy concepts/relations/logics | Allows the application of fuzzy qualifiers as query constraints | Not supported
Graph patterns | Uses the context and other semantic information to locate interesting sub-graphs | Applies pattern matching
Path discovery | Finds new interesting links that may lead to additional information | Not supported

Page 59: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Google  (1)  

Input: query in NL
Output: List of answers

Google performs semantic search on certain entities and queries!

Page 60: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  Google  (2)    

Input: question in NL

Output: List of web pages ranked using Google's PageRank algorithm to display the most relevant pages first

Page 61: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  DuckDuckGo  (1)  

Input: question in NL

Output:  List  of  answers  

Page 62: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Search:  DuckDuckGo  (2)  

Performs disambiguation of the query terms.

The 45 suggestions are grouped by classes according to their corresponding knowledge domain. This approach is called Faceted Search.

Page 63: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Faceted  Search:  Example  

Information  Workbench:  Searching  for  artists  in  categories    

[Screenshot: pivot view in the Information Workbench, with three facets highlighted and depictions of artists as results.]

Source: http://musicbrainz.fluidops.net/resource/mo:MusicArtist?view=pivot

Page 64: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Faceted  Search  

• Facets = properties

• Suitable for browsing multi-dimensional taxonomies based on the search attributes

• Allows the user to explore the data:
    • User submits a (keyword) query
    • Faceted system dynamically identifies the relevant facets (properties) for the given query and the constraints (values of those properties), and displays the search results
    • User may “drill down” by applying specific constraints to the search results

• Information can be accessed and ranked in multiple ways
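The interaction loop above can be sketched as two small functions: one that computes the facet values and their counts for a result set, and one that drills down by applying constraints. The records and property names are invented for illustration.

```python
from collections import Counter

# Toy result set; names, properties and values are made up.
ARTISTS = [
    {"name": "A", "type": "group", "country": "UK", "genre": "rock"},
    {"name": "B", "type": "person", "country": "UK", "genre": "pop"},
    {"name": "C", "type": "group", "country": "US", "genre": "rock"},
    {"name": "D", "type": "person", "country": "US", "genre": "jazz"},
]


def facet_counts(records, facets):
    """For each facet (property), count how often each value occurs."""
    return {f: Counter(r[f] for r in records if f in r) for f in facets}


def drill_down(records, **constraints):
    """Keep only the records whose properties match every constraint."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in constraints.items())
    ]


facets = facet_counts(ARTISTS, ["type", "country", "genre"])
uk_groups = drill_down(ARTISTS, type="group", country="UK")
print(facets["genre"])
print([r["name"] for r in uk_groups])
```

After each drill-down, the facet counts would typically be recomputed on the reduced result set so that the user only sees constraints that still lead to results.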


Page 65: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Faceted  Search  (2)  

Challenges  for  supporting  Faceted  Search  

• Identifying which facets to surface:
    • In heterogeneous datasets, data entries may have different facets
    • Dynamically identify the most appropriate facets for each query

• Ordering the facets depending on the relevance to the query

• Computing previews:
    • Accurately predicting counts, without examining all the results
    • Offering facet previews to give users an idea of what to expect

Source: Teevan, J., Dumais, S., Gutt, Z. Challenges for Supporting Faceted Search in Large, Heterogeneous Corpora like the Web

Page 66: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Faceted  Search:  LD  Example  (1)  

FacetedDBLP  

• Retrieves information from the DBLP collection

• Shows the result set with different facets:
    • Publication years
    • Authors
    • Conferences

• It is implemented on top of the DBLP++ dataset (an enhancement of DBLP that includes additional keywords and abstracts):
    • DBLP++ is stored in a MySQL database
    • Uses D2R Server to serve the data as RDF triples

Page 67: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Faceted  Search:  LD  Example  (2)  

[Screenshot: FacetedDBLP results for the input “crowdsourcing”: 485 results, with the facets highlighted.]

Page 68: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Classification  of  Search  Engines  

[Figure: search engines classified into Semantic Search Systems and Faceted Search Systems. Systems shown: Google (GKG), Bing, KIM, sig.ma, LOD cloud cache /facet, Longwell, mSpace, Exhibit (SIMILE), PoolParty Semantic Search Server, DuckDuckGo, Hakia, SenseBot, PowerSet, DeepDive, Kosmix, Factibles, Lexxe, Information Workbench.]

Page 69: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Searching  for  Semantic  Data  


         Search  for  

•  Ontologies  

•  Vocabularies  

•  RDF  documents  

Page 70: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Data  Search  Engines  (1)  

Searching for ontologies

• Swoogle (keyword search): http://swoogle.umbc.edu
• Watson (keyword search): http://kmi-web05.open.ac.uk/WatsonWUI

Page 71: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Data  Search  Engines  (2)  

Searching  for  vocabularies:  LOV  Portal  

• Allows searching for properties, classes or vocabularies in the Linked Open Vocabularies (LOV) catalog

• The LOV search engine implements faceted search on:
    • The knowledge domain
    • The role of the resource matched from the input query
    • The vocabulary containing the resource

• Results are ranked according to a score that considers:
    • Relevance to the query (string)
    • Importance of the matched element labels
    • Number of LOV vocabularies that refer to the element

Page 72: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Data  Search  Engines  (3)  

Searching for vocabularies: LOV Portal

[Screenshot: LOV Portal results for the input “artist”: 84 results, with the facets highlighted.]

CH 3

Page 73: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Semantic  Data  Search  Engines  (4)  

Searching for documents

• Semantic Web Search Engine: http://swse.deri.org
• Sindice: http://sindice.com

Page 74: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

METHODS  FOR  LINKED  DATA  ANALYSIS  


Page 75: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Features  of  Data  Analysis  

Statistical analysis
• Allows describing the data via Exploratory Data Analysis (EDA) methods
• Includes statistical inference and prediction

Data aggregation & filtering
• One of the first steps in data analysis is pre-processing in order to select the appropriate data to study

Visualization techniques can be built on top of these as part of data analysis

Machine learning
• Focuses on prediction
• Combines Artificial Intelligence and Statistics
• Includes supervised and unsupervised learning (not covered in this course)

Page 76: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

LD  Data  Aggregation  &  Filtering  

• Data aggregation refers to merging/summarizing several values into a single one

• Filtering allows retrieving relevant data properties and selecting a particular range of data values

• SPARQL supports these operations via SELECT queries as follows:

Features | SPARQL capabilities
Aggregation | Combining aggregate functions (COUNT, SUM, AVG, …) and the GROUP BY operator
Filtering | Combining projection, FILTER and HAVING operators
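As a sketch of how these operators combine, the block below shows an illustrative SPARQL query as a string and the same logic over a few toy rows in plain Python. The query shape, the choice of foaf:maker and mo:duration, the thresholds and the ex: data are assumptions made for the example; nothing is sent to an endpoint.

```python
from collections import defaultdict
from statistics import mean

# Illustrative query only: tracks per artist and their average duration,
# ignoring very short tracks and artists with fewer than two tracks.
QUERY = """
PREFIX mo:   <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist (COUNT(?track) AS ?n) (AVG(?duration) AS ?avg)
WHERE {
  ?track a mo:Track ;
         foaf:maker ?artist ;
         mo:duration ?duration .
  FILTER(?duration > 60000)
}
GROUP BY ?artist
HAVING (COUNT(?track) >= 2)
"""

# Toy rows (artist, track, duration in ms); all values are made up.
ROWS = [
    ("ex:artist1", "ex:t1", 180000),
    ("ex:artist1", "ex:t2", 240000),
    ("ex:artist1", "ex:t3", 30000),   # dropped by the FILTER step
    ("ex:artist2", "ex:t4", 200000),  # its group is dropped by HAVING
]


def aggregate(rows, min_duration=60000, min_tracks=2):
    """Mirror FILTER, GROUP BY, COUNT/AVG and HAVING on in-memory rows."""
    groups = defaultdict(list)
    for artist, _track, duration in rows:
        if duration > min_duration:          # FILTER
            groups[artist].append(duration)  # GROUP BY ?artist
    return {
        artist: (len(durations), mean(durations))  # COUNT, AVG
        for artist, durations in groups.items()
        if len(durations) >= min_tracks            # HAVING
    }


summary = aggregate(ROWS)
print(summary)
```

FILTER removes individual rows before grouping, while HAVING removes whole groups after the aggregates have been computed.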

Page 77: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

LD  Statistical  Analysis  

• Statistical analysis supports descriptive and predictive operations

• SPARQL supports some descriptive operations (average, maximum, minimum) but does not offer more sophisticated statistical features like:
    • Fitting distributions
    • Linear regressions
    • Analysis of variance
    • …

• Some approaches are able to consume data retrieved from SPARQL endpoints:
    – “R for SPARQL” by Willem Robert van Hage & Tomi Kauppinen
    – “Performing Statistical Methods on Linked Data” by Zapilko & Mathiak
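To illustrate the kind of analysis that has to happen outside SPARQL, here is a minimal ordinary least squares fit in Python. The (year, count) values stand in for rows already retrieved from an endpoint and are made up for the example.

```python
def linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values standing in for the result of a SELECT query.
years = [2000, 2001, 2002, 2003]
counts = [10.0, 12.0, 14.0, 16.0]

slope, intercept = linear_regression(years, counts)
print(slope, intercept)
```

With an aggregate query delivering one row per year, SPARQL does the counting and the regression is computed on the client side.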

Page 78: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

R  –  Statistical  Computing  

• R is a language and environment for statistical computing

• R provides a wide variety of statistical and graphical techniques:
    • Linear and nonlinear modeling
    • Classical statistical tests
    • Time-series analysis
    • Classification (Machine Learning)
    • Clustering (Machine Learning)
    • Extensible with further functionalities

• R is available as Free Software (under the terms of the GNU General Public License)

Page 79: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Statistical  Analysis  with  R  


Page 80: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

R  for  SPARQL  

• The R for SPARQL package makes it possible to:
    • Connect to a SPARQL endpoint over HTTP
    • Pose a SELECT query or an UPDATE operation (LOAD, INSERT, DELETE)

• If given a SELECT query, it returns the results as a data frame
    • The results can directly be mapped and visualized

• Posing requests:
    • If the parameter query is given, the input is assumed to be a SELECT query, and a GET request is performed to get the results from the URL of the endpoint
    • If the parameter update is given, the input is assumed to be an UPDATE operation, and a POST request is submitted to the URL of the endpoint. Nothing is returned

Source: http://linkedscience.org/tools/sparql-package-for-r/
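The query/update distinction maps onto two kinds of HTTP request. The sketch below builds both with Python's standard library without sending them; it is not the R package's implementation, and the endpoint URL, the Accept header and the helper name build_request are assumptions for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(endpoint, query=None, update=None):
    """Build (but do not send) the HTTP request for a query or an update."""
    if (query is None) == (update is None):
        raise ValueError("give exactly one of query or update")
    if query is not None:
        # SELECT query: GET with the query text as a URL parameter.
        # Assumes the endpoint URL has no query string of its own.
        url = endpoint + "?" + urlencode({"query": query})
        return Request(url, headers={"Accept": "application/sparql-results+xml"})
    # UPDATE operation: POST with the update text as form data.
    data = urlencode({"update": update}).encode("utf-8")
    return Request(
        endpoint,
        data=data,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint
select_req = build_request(ENDPOINT, query="SELECT * WHERE { ?s ?p ?o } LIMIT 1")
update_req = build_request(
    ENDPOINT, update="INSERT DATA { <http://example.org/a> <http://example.org/b> 1 }"
)
print(select_req.get_method(), select_req.full_url)
print(update_req.get_method(), update_req.data)
```

Passing either request to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without network access.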

Page 81: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

R  for  SPARQL:  Example  (1)  

1. Download the R packages and load them:
• library(SPARQL)
• library(sp)  # used for plotting spatial data

2. Define the endpoint with the triples:
• endpoint = "http://spatial.linkedscience.org/sparql"

3. Define the query:
• q = "SELECT ?cell ?row ?col ?polygon ?DEFOR_2002
       WHERE {
         ?cell a <http://linkedscience.org/lsv/ns#Item> ;
               <http://spatial.linkedscience.org/context/amazon/Lin> ?row ;
               <http://spatial.linkedscience.org/context/amazon/Col> ?col ;
               <http://observedchange.com/tisc/ns#geometry> ?polygon ;
               <http://spatial.linkedscience.org/context/amazon/DEFOR_2002> ?DEFOR_2002 .
       }"

Source: http://linkedscience.org/tools/sparql-package-for-r

Page 82: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

R  for  SPARQL:  Example  (2)  

4. Run the query and store the results in an object:
• res <- SPARQL(endpoint, q)$results

5. Handle the results:
• res$row <- -res$row  # negate the row index so it can serve as a y coordinate
• coordinates(res) <- ~col + row  # use col and row as the spatial coordinates (sp)

6. Choose the graphical format and plot the results:
• spplot(res, "DEFOR_2002", col.regions=rev(heat.colors(17))[-1], at=(0:16)/100, main="relative deforestation per pixel during 2002")

Source: http://linkedscience.org/tools/sparql-package-for-r

Page 83: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

R  for  SPARQL:  Example  (3)  

Source: http://linkedscience.org/tools/sparql-package-for-r

Page 84: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Machine  Learning    

• Machine Learning techniques make it possible to extract interesting information from data sources, and can be used to discover hidden patterns within datasets by generalizing from examples

• Different ML approaches can be applied:
    • Clustering: groups similar data into data partitions called clusters
    • Association rule learning: discovers relations between variables
    • Decision tree learning: analyses observations to build a predictive model represented as a tree
    • Many others …

• Weka is a Data Mining framework commonly used to apply ML on tabular data:
    – www.cs.waikato.ac.nz/ml/weka

Page 85: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Machine  Learning  on  LD  

Challenges for applying Machine Learning on LD

• LD heterogeneity introduces noise to the data:
    – Same LD resources, different URIs
    – Predicates with similar semantics, but different constraints

• The data is not independent and identically distributed (iid):
    – It does not consist of only one type of object
    – The entities are related to each other

• LD rarely contains the negative examples needed for ML algorithms:
    – For example, statements using owl:differentFrom

Source: http://www.cip.ifi.lmu.de/~nickel/iswc2012-slides

Page 86: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Applications  of    Machine  Learning  on  LD  

• Node ranking:
    – Ranking nodes according to their relevance for a query

• Link prediction:
    – Infer edges between LD resources
    – Predict the new edges that will be added to the RDF graph

• Entity resolution:
    – Determine whether two URIs correspond to the same real-world object

• Taxonomy learning:
    – Infer taxonomies or concept hierarchies from a given vocabulary or ontology
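As a point of reference for entity resolution, the sketch below compares the labels of two URIs with a plain string similarity. It is a simple baseline rather than a learned model: the labels, the example.org URIs and the 0.9 threshold are assumptions, and an ML approach would instead learn how to weigh such similarity features from training pairs.

```python
from difflib import SequenceMatcher

# Hypothetical labels attached to three URIs from different datasets.
LABELS = {
    "http://dbpedia.org/resource/The_Beatles": "The Beatles",
    "http://example.org/artist/42": "Beatles, The",
    "http://example.org/artist/7": "The Beach Boys",
}


def normalize(label):
    """Lowercase, drop commas and sort the words so word order is ignored."""
    return " ".join(sorted(label.lower().replace(",", " ").split()))


def similarity(a, b):
    """Similarity in [0, 1] between two labels after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_entity(uri_a, uri_b, threshold=0.9):
    """Guess whether two URIs denote the same real-world object."""
    return similarity(LABELS[uri_a], LABELS[uri_b]) >= threshold


beatles = "http://dbpedia.org/resource/The_Beatles"
print(same_entity(beatles, "http://example.org/artist/42"))
print(same_entity(beatles, "http://example.org/artist/7"))
```

Pairs accepted by such a matcher are the kind of result that would be published as owl:sameAs links between datasets.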

Page 87: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

Summary  

• Linked Data visualization techniques:
    • Visualizations must be chosen according to the type of the data
    • A wide variety of tools supports the visualization of SPARQL results
    • Might be used in dashboards for supporting administrative tasks

• Linked Data search:
    • Semantic search: exploits the meaning of user queries (NL or set of keywords) to present useful results
    • Faceted search: allows browsing multi-dimensional data

• Linked Data analysis:
    • Includes data manipulation such as aggregation & filtering
    • Applies statistical methods to get a better understanding of the data
    • Machine Learning techniques can be applied for predictive analysis
    • Visualization techniques can be built on top of the previous features

Page 88: ESWC SS 2013 - Tuesday Tutorial 2 Maribel Acosta and Barry Norton: Interaction with Linked Data

For exercises, quiz and further material visit our website: http://www.euclid-project.eu

eBook, Course

Other channels: @euclid_project, euclidproject, euclidproject