One Does Not Simply Crowdsource the Semantic Web

ONE DOES NOT SIMPLY CROWDSOURCE THE SEMANTIC WEB: TECHNOLOGY DESIGN AND INCENTIVES. Elena Simperl, @esimperl, January 26th, 2016


Page 1: One does not simply crowdsource the Semantic Web


ONE DOES NOT SIMPLY CROWDSOURCE THE SEMANTIC WEB
TECHNOLOGY DESIGN AND INCENTIVES

Elena Simperl
@esimperl
January 26th, 2016

Page 2: One does not simply crowdsource the Semantic Web


CROWDSOURCING
PROBLEM SOLVING VIA OPEN CALLS

“Crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.”

[Howe, 2006]

Page 3: One does not simply crowdsource the Semantic Web


THE SEMANTIC WEB
WEB OF DATA THAT CAN BE PROCESSED BY MACHINES

“The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.”

[W3C, 2011]

Page 4: One does not simply crowdsource the Semantic Web


MAKING THE SEMANTIC WEB HUMANLY POSSIBLE

Crowdsourcing increasingly used to help algorithms solve Semantic Web problems

Great challenges
  How to run a crowdsourcing project effectively?
  Which form of crowdsourcing for which task?
  How to combine crowd and machine intelligence?
  How to encourage participation?

Page 5: One does not simply crowdsource the Semantic Web


DESIGNING CROWDSOURCING PROJECTS

Page 6: One does not simply crowdsource the Semantic Web


DIFFERENT FORMS AND PLATFORMS TO CHOOSE FROM

Macrotasks
Microtasks
Challenges
Self-organized crowds
Crowdfunding

Source: [Prpić et al., 2015]

Page 7: One does not simply crowdsource the Semantic Web

MANY QUESTIONS TO ANSWER

TASK DESIGN
WORKFLOW DESIGN AND EXECUTION
TASK INTERFACES
QUALITY ASSURANCE
TASK ASSIGNMENT
CROWD TRAINING AND FEEDBACK
INCENTIVES ENGINEERING
COLLABORATION, COMPETITION, SELF-ORGANIZATION
REAL-TIME DELIVERY
NICHESOURCING
EXTENSIONS TO TECHNOLOGIES
SOCIAL MACHINES ENGINEERING

Page 8: One does not simply crowdsource the Semantic Web


SOME ANSWERS

Page 9: One does not simply crowdsource the Semantic Web

IMPROVING PAID MICROTASKS @WWW15

Compared effectiveness of microtasks on CrowdFlower vs. self-developed game
Image labelling on ESP data set as gold standard
Evaluated accuracy, #labels, cost per label, avg/max #labels/contributor
For three types of tasks
  Nano: 1 image
  Micro: 11 images
  Small: up to 2000 images
Probabilistic reasoning to personalize furtherance incentives

Findings
  Gamification and payments work well together
  Furtherance incentives particularly interesting for top contributors
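
The evaluation measures listed on this slide (accuracy, #labels, cost per label, avg/max #labels per contributor) can be illustrated with a minimal Python sketch. It is not the study's evaluation code: the `Label` record, the `evaluate` function, and the convention that a label counts as correct when it appears in the gold-standard label set for its image are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Label:
    """One label submitted by one contributor for one image (hypothetical record)."""
    contributor: str
    image: str
    label: str


def evaluate(labels: List[Label], gold: Dict[str, Set[str]], total_cost: float) -> Dict[str, float]:
    """Compute the slide's measures; `gold` maps image id -> accepted labels (assumed convention)."""
    if not labels:
        raise ValueError("no labels to evaluate")
    n = len(labels)
    # A label is counted as correct if the gold standard lists it for that image.
    correct = sum(1 for l in labels if l.label in gold.get(l.image, set()))
    per_contributor = Counter(l.contributor for l in labels)
    return {
        "accuracy": correct / n,
        "num_labels": n,
        "cost_per_label": total_cost / n,
        "avg_labels_per_contributor": n / len(per_contributor),
        "max_labels_per_contributor": max(per_contributor.values()),
    }
```

Calling `evaluate` once per condition (for example, paid microtasks vs. the game) yields directly comparable numbers, under the stated assumptions.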

Page 10: One does not simply crowdsource the Semantic Web

HYBRID NER ON TWITTER @ESWC15

Identified content and crowd factors that impact effectiveness

Findings
  Shorter tweets with fewer entities work better
  Crowd is more familiar with people and places from recent news
  MISC as a NER category sometimes confusing but useful to identify partial and implicitly named entities

Factors shown in diagram: #entities in post, types of entities, content sentiment, skipped TP posts, avg. time/task, UI interaction

Page 11: One does not simply crowdsource the Semantic Web


CROWD-EMPOWERED SPARQL QUERIES @KCAP2015

A hybrid machine/human SPARQL query engine that enhances query answers
  Uses a novel RDF completeness model to identify portions of a query with missing values
  Resorts to microtask crowdsourcing to resolve the missing values
Evaluated # of answers/delivery time/accuracy
  50 queries against DBpedia in five domains: History, Life Sciences, Movies, Music, and Sports

Findings
  Size of query answer set increased on avg. 3.13 times
  12 minutes to get 98% of all answers
  Accuracy between 84 and 96%
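
The general idea of a hybrid machine/human query (answer from the data where possible, ask the crowd where values are missing) can be sketched in a few lines of Python. This is a toy illustration only: it works on an in-memory list of triples rather than a SPARQL endpoint, the `find_missing` check is a naive stand-in and not the RDF completeness model from the talk, and `ask_crowd` is a hypothetical callback standing in for a microtask platform.

```python
from typing import Callable, Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def find_missing(triples: List[Triple], subjects: List[str], predicate: str) -> List[str]:
    """Naive stand-in for a completeness check: subjects with no value for `predicate`."""
    have = {s for (s, p, _o) in triples if p == predicate}
    return [s for s in subjects if s not in have]


def hybrid_answers(
    triples: List[Triple],
    subjects: List[str],
    predicate: str,
    ask_crowd: Callable[[str, str], Optional[str]],
) -> Dict[str, List[str]]:
    """Answer from local data first, then route the gaps to the crowd callback."""
    wanted = set(subjects)
    answers: Dict[str, List[str]] = {}
    # Machine part: collect values already present in the data.
    for s, p, o in triples:
        if p == predicate and s in wanted:
            answers.setdefault(s, []).append(o)
    # Human part: one (hypothetical) microtask per missing value.
    for s in find_missing(triples, subjects, predicate):
        value = ask_crowd(s, predicate)
        if value is not None:  # the crowd may not know the answer either
            answers[s] = [value]
    return answers
```

In this sketch the crowd is consulted only for subjects the data cannot answer, which is the property that keeps the number of microtasks, and hence cost and delivery time, bounded.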

Page 12: One does not simply crowdsource the Semantic Web


OPEN QUESTIONS

Page 13: One does not simply crowdsource the Semantic Web


NOT CROWDSOURCING AS USUAL

Knowledge-intensive tasks
Structured, interlinked content
Content meant for machine consumption
Scale, shape, and quality of the data
Context is critical
Open-set answers

Page 14: One does not simply crowdsource the Semantic Web


FUNDAMENTAL CHALLENGES

SCALE
  No ‘Big Crowd’
TIME
  From one-off and short-term to mid and long-term
SCOPE
  Problems technology cannot solve

Page 15: One does not simply crowdsource the Semantic Web


PATHWAYS TO SOLUTIONS

SCALE
  Aligning incentives
  Better reuse of crowd outputs
TIME
  Sustaining engagement
  Building relationships
  Better integration with algorithms
SCOPE
  New problems and problem solving paradigms
  Novel human-computer interaction designs

Page 16: One does not simply crowdsource the Semantic Web


@esimperl