Facets and Pivoting for Flexible and Usable Linked Data Exploration

Post on 13-Jan-2015


DESCRIPTION

The success of Open Data initiatives has increased the amount of data available on the Web. Unfortunately, most of this data is only available in raw tabular form, which makes analysis and reuse quite difficult for non-experts. Linked Data principles allow for a more sophisticated approach by making both the structure and the semantics of the data explicit. However, from the end-user viewpoint, the datasets remain monolithic files that are completely opaque, or that can only be explored through tedious semantic queries. Our objective is to help users grasp what kinds of entities are in the dataset, how they are interrelated, what their main properties and values are, etc. Rhizomer is a tool for data publishing whose interface provides a set of components borrowed from Information Architecture (IA) that facilitate awareness of the dataset at hand. It automatically generates navigation menus and facets based on the kinds of things in the dataset and how they are described through metadata properties and values. Moreover, motivated by recent tests with end-users, it also makes it possible to pivot among the faceted views created for each class of resources in the dataset.

TRANSCRIPT

Facets and Pivoting for Flexible and Usable Linked Data Exploration

Josep Maria Brunetti, Rosa Gil, Roberto García

Interacting with Linked Data Workshop, ILD’12
Crete, Greece, May 28th 2012

Human-Computer Interaction and Data Integration Research Group
Universitat de Lleida, Spain

Starting Point

• Rhizomer: Semantic Web data publishing

[Diagram: the Rhizomer App on top of a Metadata Store (Jena, Virtuoso, OWLIM,…), reached through SPARQL or Linked Data. Labels: GET, PUT, POST, DEL; new, edit, delete; HTML+RDFa; “semantic” FORMS.]

Interacting

• Useful for computers… but also for lay users?
• User tests:
  – Typical questions:
    • Where do I start?
    • Where do I go now?
    • What is this data about?
  – What do we offer?
    • Text search, type URI, SPARQL query,…
    • …but these usually don’t answer lay users’ needs

Interacting

• Example: What to do with DBPedia?
  – 3.5 million things described
    • Ontology: 257 classes and 1276 properties

Proposal

Interaction Patterns for Data Analysis [Shneiderman], each paired with Information Architecture Components [Morville]:

• Overview: Menus, Sitemaps,…
• Zoom & Filter: Facets
• Details: Lists, Maps, Timelines…

Ontologies and dataset structure

IA Components. Menus

– Hierarchical structure for dataset ontologies
  • For each class
    – URI, label, # instances, subclasses
– Flatten to desired # entries and subentries
  • When there is room, divide the class with the most instances
  • When there are too many options, group the classes with the fewest instances
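The flattening rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Rhizomer's actual implementation: `ClassNode`, `flatten_menu` and the "Other" label are hypothetical names, and instances attached directly to a divided class (rather than to one of its subclasses) are ignored for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    """One ontology class: label, instance count and direct subclasses (hypothetical structure)."""
    label: str
    instances: int
    subclasses: list = field(default_factory=list)  # list of ClassNode


def flatten_menu(roots, max_entries):
    """Return at most max_entries menu entries, most populated first."""
    entries = list(roots)
    # While there is room, divide the class with the most instances into its subclasses.
    while len(entries) < max_entries:
        divisible = [e for e in entries if e.subclasses]
        if not divisible:
            break
        biggest = max(divisible, key=lambda e: e.instances)
        entries.remove(biggest)
        entries.extend(biggest.subclasses)
    entries.sort(key=lambda e: e.instances, reverse=True)
    # With too many options, group the classes with the fewest instances
    # under a single entry (the "Other" label is an assumption).
    if len(entries) > max_entries:
        kept, rest = entries[:max_entries - 1], entries[max_entries - 1:]
        other = ClassNode("Other", sum(e.instances for e in rest), rest)
        entries = kept + [other]
    return entries


roots = [
    ClassNode("Agent", 100, [ClassNode("Person", 70), ClassNode("Organisation", 30)]),
    ClassNode("Place", 50),
    ClassNode("Work", 40, [ClassNode("Film", 25), ClassNode("Book", 15)]),
]
menu = flatten_menu(roots, 4)
print([entry.label for entry in menu])
```

The same function could be applied again to each entry's subclasses to obtain the submenus.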

IA Components. Menus

Automatic Generation

7 menus with 10 submenus

IA Components. Menus

Navigation bar provides overview for DBPedia…

…but what to do with 12,334 birds now?

IA Components. Facets

• Pre-computed list of facets per class
  – Ontologies + class instances
  – Facet metrics: frequency, # values, most common value cardinality…
• DBPedia Birds class:
  – 226 different properties
    • dbo:kingdom, 100%, 3 values, 6846 (Animalia),…
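A minimal sketch of how such facet metrics might be pre-computed for one class, assuming the data is available as plain `(subject, property, value)` tuples. The function name `facet_metrics`, the dictionary keys and the toy data are illustrative assumptions; the slides do not say how Rhizomer computes these figures.

```python
from collections import Counter, defaultdict


def facet_metrics(instances, triples):
    """Per-property metrics for the given class instances.

    frequency: share of instances that use the property
    values:    number of distinct values
    top_value / top_count: most common value and how often it occurs
    """
    members = set(instances)
    subjects_by_prop = defaultdict(set)
    values_by_prop = defaultdict(Counter)
    for subject, prop, value in triples:
        if subject in members:
            subjects_by_prop[prop].add(subject)
            values_by_prop[prop][value] += 1
    metrics = {}
    for prop, counter in values_by_prop.items():
        top_value, top_count = counter.most_common(1)[0]
        metrics[prop] = {
            "frequency": len(subjects_by_prop[prop]) / len(members),
            "values": len(counter),
            "top_value": top_value,
            "top_count": top_count,
        }
    return metrics


# Toy data, not taken from DBPedia.
birds = ["b1", "b2", "b3", "b4"]
triples = [
    ("b1", "dbo:kingdom", "Animalia"),
    ("b2", "dbo:kingdom", "Animalia"),
    ("b3", "dbo:kingdom", "Animalia"),
    ("b4", "dbo:kingdom", "Animal"),
    ("b1", "dbo:genus", "Corvus"),
    ("b2", "dbo:genus", "Corvus"),
]
print(facet_metrics(birds, triples)["dbo:kingdom"])
```

Metrics of this kind could then be used to rank or hide facets, for example by dropping properties with a very low frequency.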

Evaluation

• Evaluation with lay users as part of the RITE (Rapid Iterative Testing and Evaluation) development process
  – Iteration test with 6 users
  – LinkedMDB dataset
• User task: “Find three films where Woody Allen is director and also actor”.

Evaluation

• Seemed easy but… no user completed the task without help
• Really, just 1 issue:
  – Users started from “Actor” instead of from “Film”, and got lost from there
• User interaction is too constrained by the underlying “explicit” data structure
• Lack of context while browsing the graph

Proposals

• Facet for all inverse properties (explicit or implicit)
  – Film has property “actor” pointing to Actor:
    • Actor has facet “is actor of Film”
• Breadcrumbs show the “query” built so far
  – Click Film, then for facet “Actor” search “Woody Allen”:
    • Display: “Showing Film has actor where actor name is Woody Allen”
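Both proposals can be illustrated with a short Python sketch. It assumes an in-memory list of `(subject, property, object)` tuples and a `types` mapping from resource to class; `inverse_facets` and `breadcrumb` are hypothetical helper names, and the breadcrumb wording simply copies the example on the slide.

```python
def inverse_facets(types, triples):
    """Derive an inverse facet for every property linking two typed resources.

    If a Film has property "actor" pointing to an Actor, the Actor class
    gets the facet "is actor of Film".
    """
    facets = {}
    for subject, prop, obj in triples:
        if subject in types and obj in types:
            label = f"is {prop} of {types[subject]}"
            facets.setdefault(types[obj], set()).add(label)
    return facets


def breadcrumb(start_class, facet, value_prop, value):
    """Render the filters applied so far as readable text."""
    return f"Showing {start_class} has {facet} where {value_prop} is {value}"


# Toy data in the spirit of LinkedMDB (identifiers are made up).
types = {"f1": "Film", "a1": "Actor", "d1": "Director"}
triples = [
    ("f1", "actor", "a1"),
    ("f1", "director", "d1"),
    ("f1", "title", "Manhattan"),  # literal value: no inverse facet
]
print(inverse_facets(types, triples))
print(breadcrumb("Film", "actor", "actor name", "Woody Allen"))
```

Deriving the labels from the data in this way covers implicit inverses, where the ontology declares no inverse property.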

Proposals

• What about getting from Actors to Films to restrict by director?
• Add Actor facet “directed by”?
  – DANGER: facet explosion
    • Director facet “continents of countries where films directed”!

Proposals

• Pivoting: switch from a faceted view to a related faceted view (keeping filters)
  – E.g.: from Actors facets move to Films facets through the “is Actor of Film” facet
• For each class facet also compute:
  – Most specific class for target instances
    • Actor “is Actor of” Film and TV Episode Work
  – Pivot that facet to get:
    • Faceted view for target class
    • … + filters so far
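The pivoting idea can be sketched over an in-memory triple list, using the Woody Allen task from the evaluation. This is an assumption-laden illustration rather than the tool's implementation: `apply_filter` and `pivot_inverse` are hypothetical names, and the current faceted view is represented simply as the set of resources that survive the filters applied so far.

```python
def apply_filter(resources, triples, prop, value):
    """Keep only the resources that have the given property value."""
    matching = {s for s, p, o in triples if p == prop and o == value}
    return resources & matching


def pivot_inverse(resources, triples, prop):
    """Pivot through an inverse facet such as "is actor of Film".

    Returns the subjects that point to the current resources via prop, so
    the filters already applied are carried over to the new view.
    """
    return {s for s, p, o in triples if p == prop and o in resources}


# Toy data (identifiers are made up).
triples = [
    ("a1", "name", "Woody Allen"),
    ("a2", "name", "Diane Keaton"),
    ("f1", "actor", "a1"), ("f1", "actor", "a2"), ("f1", "director", "d1"),
    ("f2", "actor", "a1"), ("f2", "director", "d2"),
    ("f3", "actor", "a2"), ("f3", "director", "d1"),
]

actors = {"a1", "a2"}
woody = apply_filter(actors, triples, "name", "Woody Allen")  # start from Actor
films = pivot_inverse(woody, triples, "actor")                # pivot to Film
directed = apply_filter(films, triples, "director", "d1")     # filter Film by director
print(sorted(films), sorted(directed))
```

Starting from Actor no longer leads to a dead end: the user filters actors, pivots to the films they act in, and continues filtering in the Film view.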

Conclusions

• Menus
  – Dataset classes (topics) overview
• Facets
  – Per class properties and values, filter
• Pivoting
  – Switch faceted views, carry on filters

Conclusions

• Users build queries without SPARQL or dataset structure knowledge
• Example:
  – Who has directed more films in Oceania?
  – SELECT DISTINCT ?r1 WHERE {
      ?r1 a movie:Director .
      ?r2 movie:director ?r1 .
      ?r2 a movie:Film .
      ?r2 movie:country ?r3 .
      ?r3 movie:country_continent ?r3var0
      FILTER(str(?r3var0)="Oceania")
    }
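To make the link between the interaction and the resulting query concrete, the sketch below assembles a query of the same shape from a start class, a chain of pivot steps and a final filter. `build_query` and its parameters are hypothetical, the filter variable is named `?value` instead of `?r3var0`, and the filter value is inserted without escaping, so this is only an illustration of the idea.

```python
def build_query(start_class, steps, filter_prop, filter_value):
    """Assemble a SPARQL SELECT from a start class, pivot steps and one filter.

    steps: list of (property, direction, target_class) tuples, where
    direction is "inverse" or "forward" and target_class may be None.
    """
    lines = [f"?r1 a {start_class} ."]
    var = 1
    for prop, direction, target_class in steps:
        prev, var = f"?r{var}", var + 1
        cur = f"?r{var}"
        if direction == "inverse":
            lines.append(f"{cur} {prop} {prev} .")
        else:
            lines.append(f"{prev} {prop} {cur} .")
        if target_class:
            lines.append(f"{cur} a {target_class} .")
    lines.append(f"?r{var} {filter_prop} ?value .")
    # No escaping of the value: acceptable for a sketch only.
    lines.append(f'FILTER(str(?value) = "{filter_value}")')
    body = "\n  ".join(lines)
    return f"SELECT DISTINCT ?r1 WHERE {{\n  {body}\n}}"


query = build_query(
    "movie:Director",
    [("movie:director", "inverse", "movie:Film"),
     ("movie:country", "forward", None)],
    "movie:country_continent",
    "Oceania",
)
print(query)
```

The generated triple patterns mirror the example query: directors, the films that point to them, the countries of those films, and the continent filter.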

Future Work

• User evaluation
  – Explore the best way to provide pivoting, and un-pivoting…
• Specialised facets:
  – Range dependent: histogram for numbers, calendar for dates,…
• Other IA components: sitemaps
• …

Thanks for your attention

Roberto García
http://rhizomik.net/~roberto/

Human-Computer Interaction and Data Integration Research Group
Universitat de Lleida
