human activities as linked data

Integrating Know-How in the Linked Data Cloud Paolo Pareti, Benoit Testu, Ryutaro Ichise, Ewan Klein and Adam Barker “As we all know, there is a large amount of facts available on the Web. But what about human activities or know-how? The goal of this talk is to tell you how this kind of knowledge can be made machine understandable and available on the Web.” https://w3id.org/prohow/

Upload: paolo-pareti

Post on 02-Jul-2015


Category:

Science



DESCRIPTION

Pecha Kucha presentation of our paper "Integrating Know-How into the Linked Data Cloud" at the EKAW 2014 conference (28th of November 2014, Linköping, Sweden). Project website: https://w3id.org/prohow/ Conference website: http://www.ida.liu.se/conferences/EKAW14/ * special thanks to Marco Malebolgie for the artwork!

TRANSCRIPT

Page 1: Human Activities as Linked Data

Integrating Know-How in the Linked Data Cloud

Paolo Pareti, Benoit Testu, Ryutaro Ichise, Ewan Klein and Adam Barker

“As we all know, there is a large amount of facts available on the Web. But what about human activities or know-how? The goal of this talk is to tell you how this kind of knowledge can be made machine understandable and available on the Web.”

https://w3id.org/prohow/

Page 2: Human Activities as Linked Data

Human activities (or know-how)
1. can be represented as Linked Data
2. can be automatically extracted
3. can be automatically interlinked
4. experiment: extracted a large Linked Data dataset
5. evaluation: our system outperforms humans

“In particular, the presentation will focus on those five points.”

Page 3: Human Activities as Linked Data

393,600

“If we ask an intelligent system this question: ‘What is the population of the capital of New Zealand?’ we would now assume it can answer this question correctly, by accessing knowledge bases available on the Web. But what happens if we ask a seemingly easier question: ‘What do you need to wash your hands?’ In this case, the system would not be able to answer.”

Page 4: Human Activities as Linked Data

???

“This is because, to answer this question, the intelligent system would need to have some understanding of what an activity is, and perhaps of what its requirements are. This knowledge, however, is not currently available in existing knowledge bases.”

Page 5: Human Activities as Linked Data

Why Know-How?

“But actually know-how is very useful and has a lot of applications. Know-how is relevant in almost all domains, and it can be common sense know-how available on the Web, or maybe internal know-how of specific organizations, such as standard operating procedures. This knowledge also has applications in fields such as question answering, recommender systems and activity recognition.”

Page 6: Human Activities as Linked Data

“Human know-how is on the Web, but why is it not accessible? First of all, this knowledge is usually represented in unstructured resources. We can think for example of step-by-step instructions, which are typically represented as text in natural language, or maybe as pictures and videos.”

Page 7: Human Activities as Linked Data

?

?

?

“But the most serious limitation is the fact that a single document contains only limited information. What happens if we (or a machine) do not understand how to do a specific step, or what a particular ingredient is? In fact, humans often look at multiple resources at the same time to complete a complex task.”

Page 8: Human Activities as Linked Data

Data

“The first step in making know-how machine understandable is to use a structured representation. We can identify several entities in a process, such as steps, methods, requirements and outputs. We can link those entities with each other, depending on which relation exists between them.”

Page 9: Human Activities as Linked Data

Linked Data

“To solve the problem of the isolation of single resources, we have adopted a Linked Data representation. In this way, humans and machines can discover related resources when they are interested in more information about a specific entity. It is important to notice that these are not just links between documents, but between specific entities contained in these documents.”

Page 10: Human Activities as Linked Data

“Our simple Linked Data representation of know-how is a point of contact between humans and machines. From the human perspective, know-how as Linked Data is a way to manage and find relevant resources which are human understandable. From the machine perspective, this data can be easily used for analysis and inference, and it can be extended to more complex representations where required.”

Page 11: Human Activities as Linked Data

“So all of this is not just an idea. It is actually possible and we have run experiments and evaluated our results.”

Page 12: Human Activities as Linked Data

“What do we want to achieve exactly, when we talk about machine-understandable activities? While it is true that we want to have a knowledge representation more powerful than simple text in a document, we cannot yet aim to have machines capable of automating all human activities. Therefore we need to start by reaching a first significant but realistic goal.”

Page 13: Human Activities as Linked Data

“We show the usefulness of this system in a real application. A task currently done by humans is the interlinking of related know-how resources. In particular, the WikiHow community is actively creating this kind of link, for example between a step of a process and another set of instructions that explains how to do it.”

Page 14: Human Activities as Linked Data

How to Make a Pancake

Steps:
1. Prepare the mix
2. Pour the mix in a hot pan
3. Cook until golden

Make a Pancake --has_step--> Prepare the mix
Make a Pancake --has_step--> Pour the mix in a hot pan
Make a Pancake --has_step--> Cook until golden

“This is a simplified example (e.g. missing the relations to specify the order of the steps) of how our system generates a Linked Data representation of a Web document. This can be done in many ways, but when the original document has some degree of structure, this knowledge extraction can be done easily and accurately.”
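The extraction on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the talk: the function name `instructions_to_triples` is hypothetical, entities are plain strings rather than URIs, and, as on the slide, the ordering relations between steps are left out. Only the `has_step` label comes from the slide.

```python
from typing import List, Tuple

# A triple is (subject, relation, object); plain strings stand in for URIs here.
Triple = Tuple[str, str, str]


def instructions_to_triples(title: str, steps: List[str]) -> List[Triple]:
    """Turn a titled, enumerated list of steps into has_step triples.

    Hypothetical helper for illustration only.
    """
    # Drop the conventional "How to " prefix to name the activity itself.
    prefix = "how to "
    activity = title[len(prefix):] if title.lower().startswith(prefix) else title
    # One has_step triple per non-empty step, in the order given.
    return [(activity, "has_step", step.strip()) for step in steps if step.strip()]


triples = instructions_to_triples(
    "How to Make a Pancake",
    ["Prepare the mix", "Pour the mix in a hot pan", "Cook until golden"],
)
for triple in triples:
    print(triple)
```

Because the input already separates the title from the enumerated steps, no language understanding is needed, which is the sense in which structured documents make extraction easy.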

Page 15: Human Activities as Linked Data

How to Make a Pancake

Steps:
1. Prepare the mix
2. Pour the mix in a hot pan
3. Cook until golden

Requirements:
● Eggs
● Milk
● Flour

Make a Pancake --has_step--> Prepare the mix
Make a Pancake --has_step--> Pour the mix in a hot pan
Make a Pancake --has_step--> Cook until golden
Make a Pancake --requires--> Eggs
Make a Pancake --requires--> Milk
Make a Pancake --requires--> Flour

“On the Web, most of these resources have some degree of structure. This is because a well-structured set of instructions is better understood by humans, even before machines. This structure usually takes the form of a simple enumeration of steps, methods and requirements.”

Page 16: Human Activities as Linked Data

> 200,000 procedures

> 2,600,000 entities

“WikiHow and Snapguide are two large repositories that contain well organized know-how. We have extracted the knowledge of these websites and obtained a large dataset of over 200,000 procedures decomposed into over 2,600,000 entities. This can be seen as a large-scale extraction of know-how from the Web and conversion to Linked Data.”

Page 17: Human Activities as Linked Data

How to Install an Operating System

create a partition

How to Create a Partition

“In order to interlink the extracted entities, we have created a system to automatically discover two kinds of links. The first kind is a functional link between a step and another set of instructions that explains how this step can be done.”
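The talk does not say how these functional links are discovered, so the following Python sketch uses a deliberately naive stand-in: a step is linked to a set of instructions when its label matches the instruction title after normalisation. The helper names `normalise` and `find_functional_links` are hypothetical, and exact matching is an assumption made for illustration, not the authors' method.

```python
import re
from typing import Dict, List, Tuple


def normalise(label: str) -> str:
    """Lower-case, collapse punctuation to spaces, and drop a leading 'how to'."""
    text = re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()
    return re.sub(r"^how to ", "", text)


def find_functional_links(steps: List[str], titles: List[str]) -> List[Tuple[str, str]]:
    """Pair each step with the instruction title that has the same normalised label.

    Naive exact matching, assumed for illustration only.
    """
    index: Dict[str, str] = {normalise(title): title for title in titles}
    return [(step, index[normalise(step)]) for step in steps if normalise(step) in index]


links = find_functional_links(
    ["create a partition", "reboot the computer"],          # steps of one procedure
    ["How to Create a Partition", "How to Make Guacamole"],  # titles of other procedures
)
print(links)
```

Exact matching misses paraphrases such as "partition the disk", so a practical system would presumably need a softer notion of similarity.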

Page 18: Human Activities as Linked Data

DBpedia Guacamole

How to Serve Nachos

How to Make Guacamole

“The second kind of link we discovered is similar to an Input/Output link between two processes. Instead of representing it directly, we have this link implicitly represented by the types of the input and the output of processes. In this example, we can infer that there is an Input/Output relation between the two processes, as one requires the object ‘Guacamole’ while the other outputs it.”
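The inference described here can be sketched as a set intersection between what one process outputs and what another requires. In this hypothetical Python example, types are plain strings instead of DBpedia URIs, and every entry other than "Guacamole" and the two process titles is invented to make the data complete.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data: the types each process requires and outputs.
# "Avocado" and "Nachos" are invented; only "Guacamole" appears on the slide.
processes: Dict[str, Dict[str, Set[str]]] = {
    "How to Make Guacamole": {"requires": {"Avocado"}, "outputs": {"Guacamole"}},
    "How to Serve Nachos": {"requires": {"Guacamole", "Nachos"}, "outputs": set()},
}


def infer_io_links(procs: Dict[str, Dict[str, Set[str]]]) -> List[Tuple[str, str, str]]:
    """Return (producer, shared type, consumer) for every output that another process requires."""
    found = []
    for producer, produced in procs.items():
        for consumer, consumed in procs.items():
            if producer == consumer:
                continue  # a process is not linked to itself
            for shared in sorted(produced["outputs"] & consumed["requires"]):
                found.append((producer, shared, consumer))
    return found


io_links = infer_io_links(processes)
print(io_links)
```

Nothing in the data states the link directly; it follows only from both processes referring to the same type, which is what keeping the relation implicit means.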

Page 19: Human Activities as Linked Data

Evaluation

+ 16% precision
+ ×2 number of links
+ ×2 coverage

+ automatic
+ semantic links

“Finally we evaluated the links extracted by our system against the links generated manually by the WikiHow community. The result was a significant improvement. Our system identified links of better quality, more in number, and better spread across all resources. All of this on top of being a completely automatic system which creates semantic Linked Data links, more expressive than simple HTML links.”
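For readers unfamiliar with the measures named on this slide, the sketch below shows one plausible way to compute them in Python. The definitions are assumptions: precision is taken as the fraction of produced links judged correct, and coverage as the fraction of resources with at least one outgoing link, which is one reading of "better spread across all resources". The toy data is invented and is unrelated to the results reported in the talk.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[str, str]  # (source resource, target resource)


def precision(links: Set[Link], correct: Set[Link]) -> float:
    """Fraction of produced links that were judged correct."""
    return len(links & correct) / len(links) if links else 0.0


def coverage(links: Set[Link], resources: Iterable[str]) -> float:
    """Fraction of resources with at least one outgoing link (assumed definition)."""
    all_resources = set(resources)
    linked = {source for source, _ in links} & all_resources
    return len(linked) / len(all_resources) if all_resources else 0.0


# Toy data, invented for illustration only.
produced = {("a", "x"), ("b", "y"), ("b", "z"), ("c", "x")}
judged_correct = {("a", "x"), ("b", "y"), ("c", "x")}

print(precision(produced, judged_correct))
print(coverage(produced, ["a", "b", "c", "d"]))
```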

Page 20: Human Activities as Linked Data

“In conclusion, we have seen how know-how can become a new useful resource on the Linked Data Cloud. Our system automated the extraction and the integration of this knowledge on a large scale. Please visit this website if you are interested in this dataset or information about the project. This website also contains a link to an online visualization tool to explore the dataset.”

Know-How as Linked Data? ...a dream that comes true!

● Generated a large dataset of > 200,000 human activities as Linked Data
● Integrated in the Linked Data Cloud
● Outperformed the human baseline

https://w3id.org/prohow/