crowdsearcher. reactive and multiplatform crowdsourcing. keynote speech at dbcrowd2013 workshop @...
![Page 1: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/1.jpg)
CROWDSEARCHER
Marco Brambilla, Stefano Ceri, Andrea Mauri, Riccardo Volonterio
Politecnico di Milano
Dipartimento di Elettronica, Informazione e BioIngegneria
Crowdsearcher 1
![Page 2: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/2.jpg)
Crowd-based Applications
• Emerging crowd-based applications:
  • opinion mining
  • localized information gathering
  • marketing campaigns
  • expert response gathering
• General structure:
  • the requestor poses some questions
  • a wide set of responders are in charge of providing answers (typically unknown to the requestor)
  • the system organizes a response collection campaign
• Include crowdsourcing and crowdsearching
![Page 3: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/3.jpg)
The “system” is a wide concept
• Crowd-based applications may use social networks and Q&A websites in addition to crowdsourcing platforms
• Our approach: a coordination engine which keeps an overall control on the application deployment and execution
[Diagram labels: CrowdSearcher, API Access]
![Page 4: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/4.jpg)
CrowdSearcher
• Combines a conceptual framework, a specification paradigm and a reactive execution control environment
• Supports designing, deploying, and monitoring applications on top of crowd-based systems
  • Design is top-down, platform-independent
  • Deployment turns declarative specifications into platform-specific implementations which include social networks and crowdsourcing platforms
  • Monitoring provides reactive control, which guarantees applications’ adaptation and interoperability
• Developed in the context of Search Computing (SeCo, ERC Advanced Grant, 2008-2013)
![Page 5: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/5.jpg)
An example of crowd-based application: crowd-search
• People do not trust web search completely
• Want to get direct feedback from people
• Expect recommendations, insights, opinions, reassurance
![Page 6: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/6.jpg)
Crowd-searching after conventional search
• From search results to friends and experts feedback
[Diagram labels: initial query, Search System, Human Search System, Social Platform (three boxes)]
![Page 7: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/7.jpg)
Example: Find your next job (exploration)
![Page 8: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/8.jpg)
Example: Find your job (social invitation)
![Page 9: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/9.jpg)
Example: Find your job (social invitation)
Selected data items can be transferred to the crowd question
![Page 10: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/10.jpg)
Find your job (response submission)
![Page 11: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/11.jpg)
Crowdsearcher results (in the loop)
![Page 12: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/12.jpg)
Deployment alternatives
• Multi-platform deployment
[Diagram labels: Embedded application, External application, Standalone application, Social/Crowd platform, Native behaviours, API, Embedding, Community / Crowd, Generated query template, Native]
![Page 13: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/13.jpg)
Deployment: search on a social network
• Multi-platform deployment
![Page 14: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/14.jpg)
Deployment: search on the social network
• Multi-platform deployment
![Page 15: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/15.jpg)
Deployment: search on the social network
• Multi-platform deployment
![Page 16: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/16.jpg)
Deployment: search on the social network
• Multi-platform deployment
![Page 17: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/17.jpg)
From social workers to communities
• Issues and problems
  • Motivation of the responders
  • Intensity of social activity of the asker
  • Topic appropriateness
  • Timing of the post (hour of the day, day of the week)
  • Context and language barrier
![Page 18: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/18.jpg)
THE MODEL AND THE PROCESS
![Page 19: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/19.jpg)
The Design Process
• A simple task design and deployment process, based on specific data structures
  • created using model-driven transformations
  • driven by the task specification
[Diagram labels: Task Specification, Task Planning, Control Specification]
• Task Specification: task operations, objects, and performers
• Task Planning: work distribution
• Control Specification: task control policies
![Page 20: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/20.jpg)
Task Specification
• Which are the input objects of the crowd interaction?
  • Do they have a schema (record of named and typed fields)?
• Which operations should the crowd perform?
  • Like, label, comment, add new instances, verify/modify data, order, etc.
• Who are the performers of the task? How should they be selected? And invited?
  • e.g. push vs pull model
• Which quality criteria should be used for deciding the task outcome?
  • e.g., majority weighting, with/without spam detection
• Which platforms should be used? Which execution interface should be used?
![Page 21: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/21.jpg)
Operations
• In a Task, performers are required to execute logical operations on input objects
  • e.g. Locate the faces of the people appearing in the following 5 images
• CrowdSearcher offers pre-defined operation types:
  • Like: Ask a performer to express a preference (true/false)
    • e.g. Do you like this picture?
  • Comment: Ask a performer to write a description / summary / evaluation
    • e.g. Can you summarize the following text using your own words?
  • Tag: Ask a performer to annotate an object with a set of tags
    • e.g. How would you label the following image?
  • Classify: Ask a performer to classify an object within a closed set of alternatives
    • e.g. Would you classify this tweet as pro-right, pro-left, or neutral?
  • Add: Ask a performer to add a new object conforming to the specified schema
    • e.g. Can you list the name and address of good restaurants near Politecnico di Milano?
  • Modify: Ask a performer to verify/modify the content of one or more input objects
    • e.g. Is this wine from Cinque Terre? If not, where does it come from?
  • Order: Ask a performer to order the input objects
    • e.g. Order the following books according to your taste
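The seven operation types listed on the Operations slide can be pictured as a small data model. The sketch below is only an illustration: the `Operation` class, its fields, and the `validate` checks are assumptions made for this example and are not taken from the CrowdSearcher implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class OperationType(Enum):
    """The seven pre-defined operation types named on the slide."""
    LIKE = "like"
    COMMENT = "comment"
    TAG = "tag"
    CLASSIFY = "classify"
    ADD = "add"
    MODIFY = "modify"
    ORDER = "order"


@dataclass
class Operation:
    """One logical operation on input objects (hypothetical shape)."""
    kind: OperationType
    prompt: str
    # Only meaningful for CLASSIFY: the closed set of allowed answers.
    choices: list = field(default_factory=list)

    def validate(self, answer) -> bool:
        """Check that an answer has the shape this operation type expects."""
        if self.kind is OperationType.LIKE:
            return isinstance(answer, bool)
        if self.kind is OperationType.CLASSIFY:
            return answer in self.choices
        if self.kind is OperationType.TAG:
            return isinstance(answer, list) and all(isinstance(t, str) for t in answer)
        if self.kind is OperationType.ORDER:
            return isinstance(answer, list)
        # COMMENT, ADD and MODIFY are left unchecked in this sketch.
        return True


classify_tweet = Operation(
    kind=OperationType.CLASSIFY,
    prompt="Would you classify this tweet as pro-right, pro-left, or neutral?",
    choices=["pro-right", "pro-left", "neutral"],
)
like_picture = Operation(OperationType.LIKE, "Do you like this picture?")
```

Keeping the closed set of alternatives on the operation itself is one possible design; it lets a Classify answer be checked without any platform-specific code.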
![Page 22: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/22.jpg)
Task planning
Typical problems:
• Task structuring: the task is too complex or too critical to be executed as a single operation.
• Task splitting: the input data collection is too large to be presented to a user.
• Task routing: a query can be distributed according to the values of some attribute of the collection.
![Page 23: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/23.jpg)
Micro Tasks
• The actual unit of interaction with a performer.
• Mapping of objects to Micro Tasks:
  • How many objects in each MicroTask?
  • Which objects should appear in each MicroTask?
  • How often an object should appear in MicroTasks?
  • Which objects cannot appear together?
  • Should objects be presented always in some order?
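Two of the planning questions above (how many objects per MicroTask, and how often an object appears) can be answered with a very simple splitting routine. The following is a minimal sketch under stated assumptions: the function name `plan_microtasks`, its parameters, and the shuffle-and-chunk policy are hypothetical and are not the CrowdSearcher planner. The numbers in the usage lines (30 objects, redundancy 9) are the ones reported later in the deck for the majority evaluation experiment; the micro task size of 5 is made up.

```python
import random
from collections import Counter


def plan_microtasks(objects, objects_per_microtask, redundancy, seed=0):
    """Split distinct objects into micro tasks.

    Each object appears exactly `redundancy` times overall and never twice
    in the same micro task, because every round uses each object once.
    """
    if objects_per_microtask < 1 or redundancy < 1:
        raise ValueError("objects_per_microtask and redundancy must be >= 1")
    rng = random.Random(seed)
    microtasks = []
    for _ in range(redundancy):
        round_objects = list(objects)
        rng.shuffle(round_objects)  # vary which objects are shown together
        for start in range(0, len(round_objects), objects_per_microtask):
            microtasks.append(round_objects[start:start + objects_per_microtask])
    return microtasks


objects = [f"object-{i}" for i in range(30)]
microtasks = plan_microtasks(objects, objects_per_microtask=5, redundancy=9)
appearances = Counter(obj for mt in microtasks for obj in mt)
```

Constraints such as "which objects cannot appear together" or a fixed presentation order are not handled here; they would need an extra filtering or sorting step.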
![Page 24: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/24.jpg)
Assignment Strategy
• Given a set of MicroTasks, which performers are assigned to them?
• Pull vs Push:
  • Pull: The performer chooses
  • Push: The performer is chosen
• Online vs offline
  • Micro Tasks dynamically assigned to performers
    • First come / First served
    • Based on performer’s performance
  • MicroTasks statically assigned to performers
    • Based on performers’ priority
    • Based on matching
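As a rough illustration of the dynamic and static options, the sketch below contrasts a first come / first served queue with a static assignment driven by performers' priority. All names (`FirstComeFirstServed`, `assign_by_priority`) and the round-robin policy are assumptions made for the example, not CrowdSearcher APIs.

```python
from collections import deque


class FirstComeFirstServed:
    """Dynamic assignment: whoever asks next gets the next open micro task."""

    def __init__(self, microtask_ids):
        self._open = deque(microtask_ids)

    def request(self, performer_id):
        """Return the next open micro task, or None when nothing is left."""
        if not self._open:
            return None
        return self._open.popleft()


def assign_by_priority(microtask_ids, performer_priority):
    """Static assignment: rank performers by priority, then deal round-robin."""
    if not performer_priority:
        raise ValueError("at least one performer is required")
    ranked = sorted(performer_priority, key=performer_priority.get, reverse=True)
    assignment = {performer: [] for performer in ranked}
    for index, microtask_id in enumerate(microtask_ids):
        assignment[ranked[index % len(ranked)]].append(microtask_id)
    return assignment


queue = FirstComeFirstServed(["m1", "m2"])
static_plan = assign_by_priority(["m1", "m2", "m3"], {"alice": 1, "bob": 5})
```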
![Page 25: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/25.jpg)
Invitation Strategy
• The process of inviting performers to perform Micro Tasks
  • Can use very different mechanisms
  • Essential in order to generate the appropriate performer reaction / reward.
• Examples:
  • Send an email to a mailing list
  • Publish a HIT on Mechanical Turk
  • Create a new challenge in your game
  • Publish a post/tweet on your social network profile
  • Publish a post/tweet on your friends' profile
![Page 26: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/26.jpg)
Steps in Crowd-based Application Design
1) Task Design
2) Object and Performer Design
3) Micro Task Design
![Page 27: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/27.jpg)
Step 1: Task Design
![Page 28: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/28.jpg)
Step 2: Object and Performer Design
![Page 29: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/29.jpg)
Step 3: MicroTask Design
![Page 30: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/30.jpg)
Complete Meta-Model
![Page 31: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/31.jpg)
Design Tool: Screenshot
![Page 32: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/32.jpg)
Application instantiation (for Italian Politics)
• Given the picture and name of a politician, specify his/her political affiliation
  • No time limit
  • Performers are encouraged to look up online
• 2 sets of rules
  • Majority Evaluation
  • Spammer Detection
![Page 33: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/33.jpg)
REACTIVITY AND MULTIPLATFORM
![Page 35: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/35.jpg)
Crowd Control is tough…
• There are several aspects that make crowd engineering complicated
  • Task design, planning, assignment
  • Workers discovery, assessment, engagement
• Controlling crowdsourcing tasks is a fundamental issue
  • Cost
  • Time
  • Quality
• Need for higher level abstractions and tools
![Page 36: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/36.jpg)
Reactive Crowdsourcing
• A conceptual framework for controlling the execution of crowd-based computations. Based on:
  • Control Marts
  • Active Rules
• Classical forms of controls:
  • Majority control (to close object computations)
  • Quality control (to check that quality constraints are met)
  • Spam detection (to detect / eliminate some performers)
  • Multi-platform adaptation (to change the deployment platform)
  • Social adaptation (to change the community of performers)
![Page 37: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/37.jpg)
Why Active Rules?
• Ease of Use: control is easily expressible
  • Simple formalism, simple computation
• Power: arbitrarily complex controls are supported
  • Extensibility mechanisms
• Automation: active rules can be system-generated
  • Well-defined semantics
• Flexibility: localized impact of changes on the rules set
  • Control isolation
• Known formal properties descending from known theory
  • Termination, confluence
![Page 38: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/38.jpg)
Control Mart
• Data structure for controlling application execution, inspired by data marts (for data warehousing); content is automatically built from task specification & planning
• Central entity: MicroTask Object Execution
• Dimensions: Task / Operations, Performer, Object
![Page 39: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/39.jpg)
Auxiliary Structures
• Object: tracking object responses
• Performer: tracking performer behavior (e.g. spammers)
• Task: tracking task status
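The control mart and its auxiliary structures can be pictured as a fact table surrounded by three tracking tables. The sketch below is one possible in-memory rendering; every class and field name is an assumption chosen to mirror the slide vocabulary and does not reproduce the actual CrowdSearcher schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MicroTaskObjectExecution:
    """Central entity: one performer's answer on one object in one micro task."""
    task_id: str
    microtask_id: str
    performer_id: str
    object_id: str
    answer: Optional[str] = None  # None until the performer responds


@dataclass
class ObjectControl:
    """Auxiliary structure tracking the responses collected for an object."""
    object_id: str
    evaluations: int = 0
    answer_counts: dict = field(default_factory=dict)


@dataclass
class PerformerControl:
    """Auxiliary structure tracking performer behavior (e.g. spammers)."""
    performer_id: str
    evaluations: int = 0
    wrong: int = 0


@dataclass
class TaskControl:
    """Auxiliary structure tracking task status."""
    task_id: str
    completed_objects: int = 0


@dataclass
class ControlMart:
    """Fact table plus the three auxiliary tables, keyed by identifier."""
    executions: list = field(default_factory=list)
    objects: dict = field(default_factory=dict)
    performers: dict = field(default_factory=dict)
    tasks: dict = field(default_factory=dict)


mart = ControlMart()
mart.objects["o1"] = ObjectControl("o1")
mart.executions.append(
    MicroTaskObjectExecution("t1", "m1", "p1", "o1", answer="Republican")
)
```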
![Page 43: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/43.jpg)
Active Rules Language
• Active rules are expressed on the previous data structures
• Event-Condition-Action paradigm
  • Events: data updates / timer
    • ROW-level granularity
    • OLD: before state of a row
    • NEW: after state of a row
  • Condition: a predicate that must be satisfied (e.g. conditions on control mart attributes)
  • Actions: updates on data structures (e.g. change attribute value, create new instances), special functions (e.g. replan)
e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
c: NEW.ClassifiedParty == 'Republican'
a: SET ObjectControl[oID == NEW.oID].#Eval += 1
![Page 44: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/44.jpg)
Rule Example 1
e: UPDATE FOR μTaskObjectExecution[ClassifiedParty]
c: NEW.ClassifiedParty == 'Republican'
a: SET ObjectControl[oID == NEW.oID].#Eval += 1
![Page 47: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/47.jpg)
Rule Example 2
e: UPDATE FOR ObjectControl
c: (NEW.Rep == 2) or (NEW.Dem == 2)
a: SET Politician[oid == NEW.oid].classifiedParty = NEW.CurAnswer,
   SET TaskControl[tID == NEW.tID].compObj += 1
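To see how the two rule examples cascade, here is a minimal Event-Condition-Action sketch in Python. It is not the CrowdSearcher rule engine: the `Engine` and `Rule` classes are hypothetical, tables are plain dictionaries, and the first rule is assumed to also bump a per-party counter `Rep` and set `CurAnswer` (the slide only shows `#Eval += 1`) so that the second rule's condition has something to test.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    table: str                                      # e: UPDATE FOR <table>
    condition: Callable[[dict, dict], bool]         # c: predicate over OLD, NEW
    action: Callable[["Engine", dict, dict], None]  # a: updates on the tables


class Engine:
    """Row-level ECA loop: every update fires the rules watching that table."""

    def __init__(self, rules):
        self.tables = {}
        self.rules = rules

    def update(self, table, key, **changes):
        rows = self.tables.setdefault(table, {})
        old = dict(rows.get(key, {}))   # OLD: before state of the row
        new = {**old, **changes}        # NEW: after state of the row
        rows[key] = new
        for rule in self.rules:
            if rule.table == table and rule.condition(old, new):
                rule.action(self, old, new)


def count_republican(engine, old, new):
    ctrl = engine.tables["ObjectControl"][new["oID"]]
    # Assumption: Rep and CurAnswer are maintained here as well as Eval.
    engine.update("ObjectControl", new["oID"], Eval=ctrl["Eval"] + 1,
                  Rep=ctrl["Rep"] + 1, CurAnswer="Republican")


def close_object(engine, old, new):
    engine.update("Politician", new["oID"], classifiedParty=new["CurAnswer"])
    task = engine.tables["TaskControl"][new["tID"]]
    engine.update("TaskControl", new["tID"], compObj=task["compObj"] + 1)


engine = Engine([
    Rule("Execution", lambda old, new: new.get("ClassifiedParty") == "Republican",
         count_republican),
    Rule("ObjectControl", lambda old, new: new["Rep"] == 2 or new["Dem"] == 2,
         close_object),
])
engine.tables["ObjectControl"] = {
    "o1": {"oID": "o1", "tID": "t1", "Eval": 0, "Rep": 0, "Dem": 0, "CurAnswer": None}
}
engine.tables["TaskControl"] = {"t1": {"tID": "t1", "compObj": 0}}
engine.tables["Politician"] = {"o1": {"oid": "o1", "classifiedParty": None}}

# Two performers classify the same politician as Republican.
engine.update("Execution", ("m1", "o1"), oID="o1", ClassifiedParty="Republican")
engine.update("Execution", ("m2", "o1"), oID="o1", ClassifiedParty="Republican")
```

After the second execution the object control row reaches `Rep == 2`, so the second rule closes the object and increments the task's completed-object counter.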
![Page 54: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/54.jpg)
Rule Programming Best Practice
• We define three classes of rules
  • Control rules: modifying the control tables;
  • Result rules: modifying the dimension tables (object, performer, task);
• Top-to-bottom, left-to-right, evaluation
  • Guaranteed termination
![Page 55: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/55.jpg)
Rule Programming Best Practice
• We define three classes of rules
  • Control rules: modifying the control tables;
  • Result rules: modifying the dimension tables (object, performer, task);
  • Execution rules: modifying the execution table, either directly or through re-planning
• Termination must be proven (rule precedence graph has cycles)
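The difference between the two situations (guaranteed termination versus termination to be proven) can be checked mechanically: build a graph with an edge from each rule to every rule listening on a table it writes, and look for cycles. An acyclic graph is a standard sufficient condition for termination in active rule systems. The sketch below assumes a simplified rule description (one watched table, a set of written tables); the rule and table names are illustrative.

```python
def triggering_graph(rules):
    """rules maps a rule name to (table_it_listens_on, set_of_tables_it_writes)."""
    return {
        name: [other for other, (listens, _) in rules.items() if listens in writes]
        for name, (_, writes) in rules.items()
    }


def has_cycle(graph):
    """Depth-first search with three colours; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for successor in graph[node]:
            if colour[successor] == GREY:
                return True
            if colour[successor] == WHITE and visit(successor):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


# Control and result rules only: updates flow one way through the tables.
layered = {
    "control": ("Execution", {"ObjectControl"}),
    "result": ("ObjectControl", {"Politician", "TaskControl"}),
}
# Adding an execution rule that re-plans (writes Execution) closes a loop.
with_replan = dict(layered, execution=("TaskControl", {"Execution"}))
```

A detected cycle does not mean the rules loop forever; it only means termination has to be argued separately, for example by showing that re-planning is bounded.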
![Page 56: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/56.jpg)
EXPERIMENTS
![Page 57: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/57.jpg)
Crowdsearcher Experiment 1
• Goal: Test engagement on social networks
• Some 150 users
• Two classes of experiments:
  • Random questions on fixed topics, related to interests (e.g. restaurants in the vicinity of Politecnico), to famous 2011 songs, or to top-quality EU soccer teams
  • Questions manually submitted by the users
• Different invitation strategies:
  • Random invitation
  • Explicit selection of responders by the asker
• Outcome
  • 175 like and insert queries
  • 1536 invitations to friends
  • 230 answers
  • 95 questions (~55%) got at least one answer
![Page 58: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/58.jpg)
Manual and Random Questions
![Page 59: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/59.jpg)
Interest / Rewarding Factor
• Manually written and assigned questions consistently receive more responses over time
![Page 60: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/60.jpg)
Query Type
• Engagement depends on the difficulty of the task
• Like vs. Add tasks:
![Page 61: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/61.jpg)
Comparison of Execution Platforms
• Facebook vs. Doodle
![Page 62: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/62.jpg)
Posting Time
• Facebook vs. Doodle
![Page 63: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/63.jpg)
Crowdsearcher Experiment 2
• GOAL: demonstrate the flexibility and expressive power of reactive crowdsourcing
• 3 experiments, focused on Italian politicians
  • Parties: Human Computation, affiliation classification
  • Law: Game With a Purpose, guess the convicted politician
  • Order: Pure Game, hot or not
• 1 week (November 2012)
• 284 distinct performers
  • Recruited through public mailing lists and social networks announcements
• 3500 Micro Tasks
![Page 64: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/64.jpg)
Politician Affiliation
• Given the picture and name of a politician, specify his/her political affiliation
  • No time limit
  • Performers are encouraged to look up online
• 2 sets of rules
  • Majority Evaluation
  • Spammer Detection
![Page 65: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/65.jpg)
Results – Majority Evaluation 1/3
30 objects; object redundancy = 9; final object classification as simple majority after 7 evaluations
![Page 66: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/66.jpg)
Results – Majority Evaluation 2/3
Final object classification as total majority after 3 evaluations. Otherwise, re-plan of 4 additional evaluations, then simple majority at 7
![Page 67: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/67.jpg)
Results – Majority Evaluation 3/3
Final object classification as total majority after 3 evaluations. Otherwise, simple majority at 5 or at 7 (with replan)
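The second majority strategy (close at 3 evaluations on total majority, otherwise re-plan 4 more and take the simple majority at 7) can be written as a small decision function. This is a sketch under assumptions that the slides do not spell out: "total majority" is read as all evaluations agreeing, "simple majority" as the single most frequent answer with ties left undecided, and the answer labels "A", "B", "C" are placeholders.

```python
from collections import Counter


def decide(answers):
    """Return (decision, extra_evaluations_to_plan) for one object."""
    counts = Counter(answers)
    if len(answers) == 3:
        if len(counts) == 1:       # total majority: all three agree
            return answers[0], 0
        return None, 4             # disagreement: re-plan 4 more evaluations
    if len(answers) >= 7:
        (top, top_count), *rest = counts.most_common()
        if rest and rest[0][1] == top_count:
            return None, 0         # tie: left undecided in this sketch
        return top, 0              # simple majority at 7
    return None, 0                 # keep waiting for more evaluations
```

With this reading, an object on which the first three performers agree consumes 3 evaluations instead of 7, which is where the saving of the reactive strategy comes from.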
![Page 68: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/68.jpg)
Results – Spammer Detection 1/2
New rule for spammer detection without ground truth. Performer correctness on final majority. Spammer if > 50% wrong classifications
![Page 69: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/69.jpg)
Results – Spammer Detection 2/2
New rule for spammer detection without ground truth. Performer correctness on current majority. Spammer if > 50% wrong classifications
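The spammer rule (no ground truth, correctness measured against the majority, spammer if more than 50% of classifications are wrong) can be sketched as a batch computation over the executions. The function names and the tuple layout `(performer, object, answer)` are assumptions for this example; the deck's rules perform this check reactively, whereas the sketch recomputes it from scratch.

```python
from collections import Counter, defaultdict


def majority_per_object(executions):
    """Most frequent answer for each object, used in place of ground truth."""
    votes = defaultdict(Counter)
    for _performer, obj, answer in executions:
        votes[obj][answer] += 1
    return {obj: counts.most_common(1)[0][0] for obj, counts in votes.items()}


def spammers(executions, threshold=0.5):
    """Performers whose share of answers disagreeing with the majority exceeds threshold."""
    majority = majority_per_object(executions)
    total, wrong = Counter(), Counter()
    for performer, obj, answer in executions:
        total[performer] += 1
        if answer != majority[obj]:
            wrong[performer] += 1
    return {p for p in total if wrong[p] / total[p] > threshold}


executions = [
    ("p1", "o1", "A"), ("p2", "o1", "A"), ("p3", "o1", "A"),
    ("p4", "o1", "B"), ("p5", "o1", "A"),
    ("p1", "o2", "A"), ("p2", "o2", "A"), ("p3", "o2", "A"),
    ("p4", "o2", "B"), ("p5", "o2", "B"),
]
flagged = spammers(executions)
```

Here `p4` disagrees with the majority on both objects and is flagged, while `p5` is wrong on exactly half of its answers and stays below the strict threshold.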
![Page 70: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/70.jpg)
EXPERT FINDING IN CROWDSEARCHER
![Page 71: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/71.jpg)
Problem
• Ranking the members of a social group according to the level of knowledge that they have about a given topic
• Application: crowd selection (for Crowd Searching or Sourcing)
• Available data
  • User profile
  • Behavioral trace that users leave behind them through their social activities
![Page 72: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/72.jpg)
Considered Features
• User Profiles
  • Plus Linked Web Pages
• Social Relationships
  • Facebook Friendship
  • Twitter mutual following relationship
  • LinkedIn Connections
• Resource Containers
  • Groups, Facebook Pages
  • Linked Pages
  • Users who are followed by a given user are resource containers
• Resources
  • Material published in resource containers
![Page 73: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/73.jpg)
Feature Organization Meta-Model
![Page 74: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/74.jpg)
Example (Facebook)
![Page 75: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/75.jpg)
Example (Twitter)
![Page 76: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/76.jpg)
Resource Distance
• Objects in the social graph are organized according to their distance with respect to the user profile
• Why? Privacy, Computational Cost, Platform Access Constraints
Distance 0
• Expert Candidate Profile
Distance 1
• Expert Candidate owns/creates/annotates Resource
• Expert Candidate relatedTo Resource Container
• Expert Candidate follows UserProfile
Distance 2
• Expert Candidate follows UserProfile relatedTo Resource Container
• Expert Candidate relatedTo Resource Container contains Resource
• Expert Candidate follows UserProfile owns/creates/annotates Resource
• Expert Candidate follows UserProfile follows UserProfile
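The distance levels above amount to path lengths from the candidate's profile in a typed social graph. The following Python sketch shows one way to compute them with a bounded breadth-first traversal. The graph layout (a dict mapping each node to a list of `(relation, target)` pairs), the node names, and the function name are all assumptions made for illustration.

```python
from collections import deque

def resources_by_distance(graph, candidate, max_distance=2):
    """Breadth-first walk from the candidate profile.

    Returns {node: distance} for every node reachable within max_distance hops.
    """
    distance = {candidate: 0}
    queue = deque([candidate])
    while queue:
        node = queue.popleft()
        if distance[node] == max_distance:
            continue                      # do not expand beyond the distance limit
        for _relation, neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance

# Hypothetical toy graph: alice is the expert candidate.
graph = {
    "alice": [("owns", "post1"), ("relatedTo", "group1"), ("follows", "bob")],
    "group1": [("contains", "post2")],
    "bob": [("owns", "post3"), ("relatedTo", "group2"), ("follows", "carol")],
    "carol": [("owns", "post4")],
}
print(resources_by_distance(graph, "alice"))
```

Capping the traversal at a small distance reflects the motivations listed on the slide: each extra hop costs more API calls and reaches data that the candidate has less control over.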
![Page 77: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/77.jpg)
Distance interpretation
Distance 0
• Expert Candidate Profile
Distance 1
• Expert Candidate owns/creates/annotates Resource
• Expert Candidate relatedTo Resource Container
• Expert Candidate follows UserProfile
Distance 2
• Expert Candidate follows UserProfile relatedTo Resource Container
• Expert Candidate relatedTo Resource Container contains Resource
• Expert Candidate follows UserProfile owns/creates/annotates Resource
• Expert Candidate follows UserProfile follows UserProfile
![Page 78: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/78.jpg)
Resource Processing
• Extraction from Social Network APIs
• Extraction of Text from linked Web Pages
  • Alchemy Text Extraction APIs
• Language Identification
• Text Processing
  • Sanitization, tokenization, stopword removal, lemmatization
• Entity Extraction and Disambiguation
  • TagMe
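As a rough illustration of the text-processing step only, the sketch below chains sanitization, tokenization, and stopword removal using the Python standard library. It is not the pipeline used in the talk: the tiny stopword list and all function names are placeholders, and language identification, lemmatization, and entity extraction are left out because they need external services or libraries.

```python
import re

# Placeholder stopword list; a real pipeline would use a full, per-language list.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "for", "on", "with"}

def sanitize(text):
    """Strip HTML tags and URLs, then lowercase."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"https?://\S+", " ", text)
    return text.lower()

def tokenize(text):
    """Split sanitized text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text)

def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]

def preprocess(text):
    return remove_stopwords(tokenize(sanitize(text)))

print(preprocess("The <b>Crowd</b> is searching for experts: http://example.org/x"))
```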
![Page 79: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/79.jpg)
Dataset
• 7 kinds of expertise
  • Computer Engineering, Location, Movies & TV, Music, Science, Sport, Technology & Videogames
• 40 volunteer users (on Facebook, Twitter, and LinkedIn)
• 330,000 resources (70% with URL to external resources)
• Ground truth created through self-assessment
  • For each expertise need, vote on a 7-point Likert scale
  • Experts: expertise above average
![Page 80: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/80.jpg)
Metrics
• We obtain lists of candidate experts and assess them against the ground truth, using:
• For precision:
  • Mean Average Precision (MAP)
  • 11-Point Interpolated Average Precision (11-P)
• For ranking:
  • Mean Reciprocal Rank (MRR) – for the first value
  • Normalized Discounted Cumulative Gain (nDCG) – for more values, can be set @N for the first N values
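For reference, the ranking metrics named above can be computed as in the following Python sketch, which uses the standard textbook definitions. The function names, the toy user identifiers, and the choice of binary gains in the example are illustrative assumptions, not details taken from the talk. MAP and MRR are obtained by averaging `average_precision` and `reciprocal_rank` over all expertise needs.

```python
import math

def average_precision(ranking, relevant):
    """Mean of precision@k over the ranks k where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def reciprocal_rank(ranking, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at(ranking, gains, n):
    """nDCG@n; gains maps each item to its graded relevance (missing items count 0)."""
    def dcg(values):
        return sum(g / math.log2(i + 1) for i, g in enumerate(values, start=1))
    actual = dcg([gains.get(item, 0.0) for item in ranking[:n]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:n])
    return actual / ideal if ideal > 0 else 0.0

def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0

# Two hypothetical expertise needs, each with a ranked list and a set of true experts.
runs = [(["u1", "u2", "u3", "u4"], {"u1", "u3"}), (["u2", "u1"], {"u1"})]
print("MAP:", mean(average_precision(r, rel) for r, rel in runs))
print("MRR:", mean(reciprocal_rank(r, rel) for r, rel in runs))
```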
![Page 81: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/81.jpg)
Metrics improve with resources
• But it comes with a cost
![Page 82: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/82.jpg)
Friendship Relationship not useful
• Inspecting friends' resources does not improve metrics!
![Page 83: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/83.jpg)
Social Network Analysis
Comparison of the results obtained with all the social networks together, or separately with Facebook, Twitter, and LinkedIn.
![Page 84: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/84.jpg)
Main Results
• Profiles are less effective than level-1 resources
• Resources produced by others help in describing each individual's expertise
• Twitter is the most effective social network for expertise matching; sometimes it outperforms the other social networks
  • Twitter most effective in Computer Engineering, Science, Technology & Games, Sport
  • Facebook effective in Locations, Sport, Movies & TV, Music
  • LinkedIn never very helpful in locating expertise
![Page 85: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/85.jpg)
CONCLUSIONS
![Page 86: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/86.jpg)
Summary
• Results
  • An integrated framework for crowdsourcing task design and control
  • Well-structured control rules with guarantees of termination
  • Support for cross-platform crowd interoperability
  • A working prototype: crowdsearcher.search-computing.org
• Forthcoming
  • Publication of Web Interface + API
  • Support of declarative options for automatic rule generation
  • Integration with more social networks and human computation platforms
  • Providing vertical solutions for specific markets
  • More applications and experiments (e.g. in Expo 2015)
![Page 87: CrowdSearcher. Reactive and multiplatform Crowdsourcing. keynote speech at DBCrowd2013 workshop @ vldb2013](https://reader033.vdocuments.site/reader033/viewer/2022060108/554e84b3b4c90526358b4599/html5/thumbnails/87.jpg)
QUESTIONS?