Harnessing manpower for creating semantics
(doctoral dissertation)
Jakub Šimko
Institute of Informatics and Software Engineering, Faculty of Informatics and Information Technologies,
Slovak University of Technology in Bratislava
Supervised by: prof. Mária Bieliková
July 4th, 2013
Games with a purpose
Cheap (once they are created)
Difficult to create
[Quinn & Bederson. Human computation: a survey and taxonomy of a growing field. CHI’11, 2011]
ESP Game: image metadata acquisition
What is in the image?
Player 1 and Player 2 guesses (figure): water, sky, bridge, Mostar, night, river, bridge, Bosnia
The players must blindly match
Banned words: blue, towers
[Von Ahn & Dabbish: Designing games with a purpose. Commun. ACM, 2008.]
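The blind matching rule can be sketched in a few lines. This is only an illustration, not the ESP Game's actual implementation: the function name `first_agreed_label` is hypothetical, and the split of the example words between the two players is assumed, since the slide figure does not survive in the transcript.

```python
def first_agreed_label(guesses_1, guesses_2, banned=()):
    """Return the first label entered by both players, ignoring banned words.

    Guesses are compared case-insensitively; neither player sees the other's input.
    """
    banned = {w.lower() for w in banned}
    seen_1, seen_2 = set(), set()
    # Walk both guess lists in lockstep, as if the players typed in turns.
    for i in range(max(len(guesses_1), len(guesses_2))):
        for own, other, stream in ((seen_1, seen_2, guesses_1),
                                   (seen_2, seen_1, guesses_2)):
            if i < len(stream):
                word = stream[i].strip().lower()
                if word in banned:
                    continue  # banned words never count as a match
                if word in other:
                    return word  # both players have now entered this label
                own.add(word)
    return None


# Assumed assignment of the slide's example words to the two players.
player_1 = ["water", "sky", "bridge"]
player_2 = ["Mostar", "night", "river", "bridge", "Bosnia"]
label = first_agreed_label(player_1, player_2, banned=["blue", "towers"])
```

With these inputs the only word entered by both players is "bridge", so it becomes the agreed label; a word on the banned list would be skipped even if both players typed it.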
Motivation
Open issues in semantics acquisition
◦ Modelling of specific domains
◦ Personal multimedia metadata acquisition
◦ Metadata upkeep
Games with a purpose (GWAPs): design issues
◦ In general: no design methodology (young problem area)
◦ Cold start problems
◦ Quality management, effectiveness of work allocation
Thesis Goals
1. Create new, GWAP-based approaches to semantics creation, particularly for specific domains
2. Bring in generally applicable improvements to GWAP design, focusing on selected problems
Work overview
State of the art:
◦ GWAP taxonomy and design space
GWAPs we created:
◦ Little Search Game: term network acquisition
◦ PexAce: (personal) imagery tag acquisition
◦ CityLights: validation of music metadata
General GWAP design improvements:
◦ Helper artifacts: cold start problem reduction
◦ Player competences: improving GWAP output quality
GWAP design
A relatively new area (<10 years)
No holistic design methodology exists
◦ GWAPs are created ad hoc
Few works aimed at particular design issues
◦ [Von Ahn, 2008] Player agreement schemes
◦ [Chiou, 2011] Suggested considering player skills in GWAPs
Our contribution: GWAP design dimensions
◦ following the idea of design lenses [Schell, 2008]
[Von Ahn & Dabbish: Designing games with a purpose. Commun. ACM, 2008.]
[Chiou & Hsu. Capability-aligned matching: improving quality of games with a purpose. AAMAS ’11]
[J. Schell. The art of game design: a book of lenses. Elsevier/Morgan Kaufmann, 2008.]
Our GWAP design dimensions
Task distribution
◦ Player capability driven
◦ Data (ontology) driven
◦ Task-value driven
◦ Greedy
◦ Random
Task difficulty
◦ Equally complex tasks
◦ Gradually complex tasks
Validation of player output
◦ Offline player mutual agreement
◦ Online player mutual agreement
◦ Bootstrapping
◦ Automated exact
◦ Automatic approximative
◦ Helper artifacts
Anti-cheating measures
◦ Restrictive rules
◦ Mutual player supervision
◦ Anomalous behavior detection
◦ A posteriori cheating detection
Purpose encapsulation
◦ High
◦ Low
Player challenges
◦ Social experience
◦ Self-challenge
◦ Competition
◦ Discovery
Existing GWAPs in our design space
[Figure: matrix with counts of existing GWAPs, broken down by task distribution (greedy, task-value driven, data-driven, player capability driven), anti-cheating measure (restrictive rules, mutual supervision, anomaly detection, a posteriori, N/A), validation of player output (bootstrapping, automatic approximative, automated exact, helper artifacts, online mutual agreement) and player challenge (discovery, competition, self-challenge, social experience)]
PexAce
Goal: acquire (personal) image tags
New artifact validation model
Quality management through player modelling
International Journal of Human-Computer Studies [In press]: Šimko, J., Tvarožek, M., Bieliková, M.: Human Computation: Single-player Annotation Game for Image Metadata.
SMAP 2011 (IEEE CS Press): Šimko, J., Bieliková, M.: Games with a Purpose: User Generated Valid Metadata for Personal Archives.
I-Semantics 2012 (ACM): Šimko, J., Bieliková, M.: Personal Image Tagging: a Game-based Approach.
PexAce: acquisition of image metadata
Cards: an image pair seeking memory game
Players create image annotations to aid their memory
[Figure: untagged images enter the single-player game; players produce free text annotations, which yield general domain tags and personal image tags]
PexAce: general domain deployment
(Standard) Corel 5K dataset: photos + tags + our tags
107 players, 814 games, 2 792 images
22 176 annotations, 5 723 tags
Gold standard comparison: 73% precision
A posteriori evaluation: 94% precision
Automated methods ~70% *
◦ Limited set of tags
*[Duygulu et al. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. Springer-Verlag, 2002.]
PexAce for personal images
Personal image metadata – virtually impossible to get
Personal images instead of general images in PexAce
◦ Players like that more
◦ They provide specific annotations (metadata)
Experiments: 2 x 2-player groups, 50 images each
Correctness: 94%
◦ 44% specific tags
Persons (53%)
Events (21%)
Places (15%)
Other (11%)
„Benevolent“ artifact validation model
Original mutual player supervision
Less strict heuristics
Annotations decomposed to votes: (p, t, i) ∈ P × T × I
P - players, T - terms, I - images
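The decomposition of annotations into (player, term, image) votes can be illustrated with a short sketch. The slides do not spell out the "less strict heuristics", so the whitespace tokenization and the `min_players` threshold below are assumptions chosen only to make the idea concrete; the function names are hypothetical.

```python
from collections import defaultdict


def annotations_to_votes(annotations):
    """Split free-text annotations into (player, term, image) votes."""
    votes = set()
    for player, image, text in annotations:
        for term in text.lower().split():
            # A set keeps one vote per player, term and image.
            votes.add((player, term, image))
    return votes


def accepted_tags(votes, min_players=2):
    """Keep (image, term) pairs supported by at least `min_players` distinct players."""
    supporters = defaultdict(set)
    for player, term, image in votes:
        supporters[(image, term)].add(player)
    return {pair for pair, players in supporters.items()
            if len(players) >= min_players}


# Invented example annotations for a single image.
annotations = [
    ("p1", "img1", "old bridge river"),
    ("p2", "img1", "Bridge at night"),
    ("p3", "img1", "river"),
]
tags = accepted_tags(annotations_to_votes(annotations))
```

Here "bridge" and "river" are each supported by two players and are accepted, while terms used by a single player are not.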
Artifact validation and cold start problem: a general GWAP issue
„How can a result of a human intelligence task be automatically evaluated?“
GWAPs use:
◦ Approximative or exact automated evaluation (case dependent)
◦ Mutual player supervision
Threat to multiplayer validation schemes: COLD START
“The requirement is to have multiple players online at the same time, sometimes with a requirement that they cannot communicate.”
Keep the games single-player
Helper artifacts: a new artifact validation principle
Helper artifacts:
◦ Decouple scoring from task solving; instead, motivate players to solve tasks to help themselves progress in the game
◦ E.g. in PexAce, a player may win the game even without the annotations
◦ Potential for general applicability (to any existing game)
Quality management in GWAPs: considering differences in player competences
1. Quantify player skills – player model (e.g. player’s task-solving expertise for each sub-domain)
2. Apply the model in
a) “post-processing” - solution filtering (e.g. vote weighting)
b) “pre-processing” - task assignment (e.g. match task subdomain to expertise areas)
3. Speed up the process and/or retrieve higher quality results
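Step 2a (vote weighting) could look roughly like the sketch below. It assumes each player already has a competence score between 0 and 1 from some player model; the function names, the default competence for unknown players and the acceptance threshold are all hypothetical.

```python
from collections import defaultdict


def weighted_tag_scores(votes, competence, default=0.5):
    """Sum the competence of the players who voted for each (image, term) pair."""
    scores = defaultdict(float)
    for player, term, image in votes:
        # Unknown players fall back to an assumed neutral competence.
        scores[(image, term)] += competence.get(player, default)
    return dict(scores)


def filter_solutions(scores, threshold=1.0):
    """Keep only the pairs whose weighted support reaches the threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}


# Invented votes and competence values.
votes = [
    ("p1", "bridge", "img1"),
    ("p2", "bridge", "img1"),
    ("p3", "tower", "img1"),
    ("p4", "tower", "img1"),
]
competence = {"p1": 0.9, "p2": 0.8, "p3": 0.3, "p4": 0.4}
scores = weighted_tag_scores(votes, competence)
kept = filter_solutions(scores, threshold=1.0)
```

Both tags receive two votes, but only "bridge" passes, because its supporters carry more weight. With unweighted counting the two tags would be indistinguishable.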
Measuring player competences: PexAce data
Usefulness (delivery of correct artifacts)
Consensus ratio (agreement with other players)
Correlation: 0.496
[Figure: two charts with a vertical axis from 0.4 to 1; one compares consensus ratio and usefulness, the other compares weighting with usefulness and weighting with consensus]
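The two competence measures can be sketched as follows. The exact formulas are not given in the slides, so this assumes usefulness is the share of a player's votes that ended up as correct artifacts and consensus ratio is the share of a player's votes that at least one other player also cast. All names and data are hypothetical.

```python
import math
from collections import defaultdict


def player_metrics(votes, correct):
    """Return {player: (usefulness, consensus_ratio)} from (player, term, image) votes."""
    voters = defaultdict(set)
    by_player = defaultdict(list)
    for player, term, image in votes:
        voters[(image, term)].add(player)
        by_player[player].append((image, term))
    metrics = {}
    for player, pairs in by_player.items():
        useful = sum(pair in correct for pair in pairs)
        agreed = sum(len(voters[pair]) > 1 for pair in pairs)
        metrics[player] = (useful / len(pairs), agreed / len(pairs))
    return metrics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented votes; `correct` stands for the a posteriori validated artifacts.
votes = [
    ("p1", "bridge", "i1"), ("p1", "river", "i1"),
    ("p2", "bridge", "i1"), ("p2", "car", "i1"),
    ("p3", "river", "i1"), ("p3", "night", "i1"), ("p3", "dog", "i1"),
]
correct = {("i1", "bridge"), ("i1", "river"), ("i1", "night")}
metrics = player_metrics(votes, correct)
usefulness = [u for u, _ in metrics.values()]
consensus = [c for _, c in metrics.values()]
r = pearson(usefulness, consensus)
```

Consensus ratio needs no ground truth, so it can be computed while the game runs, whereas usefulness requires validated artifacts.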
Little Search Game
Goal: acquire lightweight term network
Statistically unsupported, yet valid term relationships
Specific domain use
Int. J. on Semantic Web and Information Systems: Šimko, J., Tvarožek, M., Bieliková, M.: Semantics Discovery via Human Computation Games. International Journal on Semantic Web and Information Systems (2011)
Hypertext 2011 (ACM): Šimko, J., Tvarožek, M., Bieliková, M.: Little Search Game: Term Network Acquisition via a Human Computation Game.
Little Search Game (negative search game)
Search query: „Star –movie –war –death“
• Creation of lightweight term network
• Player’s task: reduce number of results with negative search
[Figure: term network fragment with the nodes star, movie, war, death, army, ship, navy, marine, american, blue, sea, ocean, fish, deep]
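How player queries might turn into a term network can be sketched as below. This is a simplification that only counts how often players excluded a term for a given task term; it does not model the search engine or the game's scoring. The function names and the second and third example queries are invented.

```python
from collections import Counter


def parse_negative_query(query):
    """Split 'star -movie -war' into the task term and the excluded terms."""
    positive, negative = [], []
    for token in query.lower().split():
        # Accept a hyphen or an en dash as the minus sign.
        if token.startswith(("-", "\u2013")):
            negative.append(token[1:])
        else:
            positive.append(token)
    return " ".join(positive), negative


def build_term_network(queries):
    """Count how often each (task term, excluded term) pair was suggested."""
    edges = Counter()
    for query in queries:
        term, excluded = parse_negative_query(query)
        for other in excluded:
            if other:  # ignore a lone minus sign
                edges[(term, other)] += 1
    return edges


queries = [
    "Star \u2013movie \u2013war \u2013death",  # the query shown on the slide
    "star -war -sun",
    "star -movie",
]
network = build_term_network(queries)
```

Pairs suggested by several players, such as (star, movie) and (star, war) here, are the stronger candidates for term relationships.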
LSG term network evaluation
A posteriori evaluation: 91% correctness
A potential to add term relationships to existing bases
◦ 59% of LSG relationships do not exist in the ConceptNet * corpus
◦ …including demanded non-taxonomic relationships
*[Liu & Singh. ConceptNet — A Practical Commonsense Reasoning Tool-Kit. BT Technology Journal 2004]
LSG modification: TermBlaster
(Harvesting relationships for software design domain)
Specific domain
No text typing
71% correct, 21% „hidden relationships“
CityLights
Goal: validate existing music tags
Quality management through confidence expression
I-Semantics 2012 (ACM): Dulačka, P., Šimko, J., Bieliková, M.: Validation of Music Metadata via Game with a Purpose.
CityLights: music tag validation
(a concept of validation question)
Validation question:
“Which of these tag groups characterizes the music track you hear?”
1. Rockabilly, USA, 60ties
2. Seasonal, rich oldies, xmas
3. February 08 love, oldies, 60 musik
Tag support value:
+ increases
◦ player selects the group
- decreases
◦ player doesn’t select the group
◦ player rules out the tag
Wrong and correct tags bubble out
Positive and negative thresholds
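The support value bookkeeping can be sketched as follows. The step size and threshold values are placeholders, not the optimized configuration from the experiments; the function names and the player's choices in the example are invented.

```python
def update_support(support, shown_groups, selected, ruled_out=(), step=1.0):
    """Update tag support values after one validation question.

    shown_groups: tag groups (lists of tags) offered to the player.
    selected: index of the group the player picked.
    ruled_out: tags the player explicitly marked as wrong.
    """
    for index, group in enumerate(shown_groups):
        # Selected group gains support, the other groups lose it.
        delta = step if index == selected else -step
        for tag in group:
            support[tag] = support.get(tag, 0.0) + delta
    for tag in ruled_out:
        support[tag] = support.get(tag, 0.0) - step
    return support


def classify(support, positive=3.0, negative=-3.0):
    """Tags crossing a threshold bubble out as correct or wrong."""
    correct = {t for t, v in support.items() if v >= positive}
    wrong = {t for t, v in support.items() if v <= negative}
    return correct, wrong


groups = [
    ["Rockabilly", "USA", "60ties"],
    ["Seasonal", "rich oldies", "xmas"],
    ["February 08 love", "oldies", "60 musik"],
]
support = {}
for _ in range(3):  # three players give the same (assumed) answer
    update_support(support, groups, selected=0, ruled_out=["xmas"])
correct, wrong = classify(support)
```

After three identical answers the tags of the first group reach the positive threshold, the remaining tags reach the negative one, and the explicitly ruled-out tag drops twice as fast.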
CityLights: experiments
LastFM dataset
875 games, 4933 questions, 1492 tags
Feedback actions per tag:
◦ 17.75 implicit
◦ 5.29 explicit
Optimized parameter configuration
◦ 68% correctness
Betting mechanism: measuring competence through confidence
Betting mechanism within a GWAP
Through the bet height, the player expresses confidence in the task solution
CityLights case: bet height aligns with impact on tag validity value
Helps with the cold start problem associated with user modeling
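One way to let the bet height drive the impact on tag validity is to scale each update by the bet, as in this sketch. The linear scaling, the `max_bet` limit and the function name are assumptions; the slides only state that bet height aligns with the impact.

```python
def bet_weighted_update(support, selected_group, bet, max_bet=10, step=1.0):
    """Raise the support of the selected tags in proportion to the player's bet."""
    if not 0 < bet <= max_bet:
        raise ValueError("bet must be greater than 0 and at most max_bet")
    confidence = bet / max_bet  # confidence expressed through the bet, in (0, 1]
    for tag in selected_group:
        support[tag] = support.get(tag, 0.0) + step * confidence
    return support


support = {}
bet_weighted_update(support, ["Rockabilly", "USA"], bet=10)  # confident answer
bet_weighted_update(support, ["xmas"], bet=2)                # hesitant answer
```

Because the bet is available from the very first question, it gives a usable confidence signal before any long-term player model has been built.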
Main contributions
Definition of the GWAP design space
GWAPs for semantics acquisition
◦ For specific domains (personal images, SW engineering)
◦ For otherwise hardly discoverable semantics (hidden relationships)
New GWAP design principles
◦ Helper artifacts for cold start reduction
◦ Metrics for long-term player competence modeling
◦ Betting mechanism for short-term player competence acquisition
◦ Metadata validation GWAP concept
Summary
GWAP taxonomy and design dimensions
◦ [survey paper prepared]
Little Search Game – Lightweight term network acquisition
Hidden term relationships
◦ Hypertext 2011, ACM
◦ Int. J. of Semantic Web and Information Systems, 2011 (CC, IGI)
PexAce – Personal image metadata acquisition
Helper artifacts
Competence measures
◦ SMAP 2011, IEEE
◦ I-Semantics 2012, ACM
◦ Int. J. of Human-Computer Studies, 2013 (CC, Elsevier)
CityLights – Music metadata validation
Betting mechanics – player competence through confidence
◦ I-Semantics 2012b, ACM
Selected publications
Semantics Discovery via Human Computation Games. International Journal on Semantic Web and Information Systems. 2011
Human Computation: Single-player Annotation Game for Image Metadata. International Journal of Human-Computer Studies. 2012 [In press].
Validation of Music Metadata via Game with a Purpose. I-Semantics 2012 (ACM)
Games with a Purpose: User Generated Valid Metadata for Personal Archives. SMAP 2011 (IEEE CS)
Little Search Game: Term Network Acquisition via a Human Computation Game. Hypertext 2011 (ACM)
Personal Image Tagging: a Game-based Approach. I-Semantics 2012 (ACM)