AWS re:Invent 2016: Leverage the Power of the Crowd To Work with Amazon Mechanical Turk (BDA204)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Russell Smith, co-founder/CTO/CIO, Rainforest QA November 2016 BDA204 Leverage the Power of the Crowd To Work with Amazon Mechanical Turk


Page 1: AWS re:Invent 2016: Leverage the Power of the Crowd To Work with Amazon Mechanical Turk (BDA204)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Russell Smith, co-founder/CTO/CIO, Rainforest QA

November 2016

BDA204

Leverage the Power of the Crowd To Work with Amazon Mechanical Turk

Page 2

What to Expect from the Session

• Learn what Mechanical Turk (MTurk) is

• Understand the basics

• Learn about scaling beyond the basics

• How Rainforest leverages MTurk

Page 3

Who am I?

Russell Smith

• CTO & Co-Founder of Rainforest QA

• Programmer

• MTurk Requester for ~5 years

• More than ~250 million questions sent through MTurk

• Follow me on Twitter: @rhs

Page 4

What is Rainforest?

QA-as-a-Service: Fast Crowdsourced Testing for Web and Mobile Apps, thanks to Mechanical Turk:

• Customers write tests in plain English

• Results in ~30 minutes, anytime, 24x7

• Powered by humans

Page 5

What is Mechanical Turk?

• Super early AWS service

• Public since 2005

• First invented in 2001

• 24x7, on-demand, programmatic interface to do Human Intelligence Tasks (HITs)

• “Automate” the un-automatable

Page 6

What is Mechanical Turk?

• Pay (lots of) humans to do (lots of) things.

• Classic things:

  • Extract data from receipts

  • Identify things in photos

  • Search for data for you (e.g. find the phone number of XYZ restaurant)

  • Transcribe audio

• More hip / upcoming things:

  • Data science: build ground truth for machine learning and AI

Page 7

Basics

Page 8

Marketplace

• Connects Workers and Requesters

• Requesters are you!

• Web-interface where Workers execute your tasks

• Searchable list of HITs; Workers pick the ones they want to work on

Page 9

Requester interface

1. Select a template

2. Provide info on your task and how much you want to pay.

3. Design the layout of your task

4. Load your variables

5. Publish

Page 10

Requester interface

- The results of your task can be viewed in the Manage tab.

- This is also where you can view and manage your Workers.

Page 11

Worker interface

- Workers visit mturk.com to find HITs they want to work on.

- Description, reward, and reputation all matter in determining if your work gets done.

Page 12

Worker interface

- Workers can choose to Accept a HIT or Skip to the next one in a set.

- Once they’ve accepted the HIT, they have until the allotted time has expired to Submit.

- Workers can also Return the task if they decide they don’t want to complete it.

Page 13

Basics - Task design

Page 14

Basics - Task design

Design is critical:

• Bad tasks = bad reputation + bad results

• Unclear tasks = bad reputation + bad results

• Good tasks ~= good reputation + good results

Page 15

Basics - Task design

My rules:

1. Have instructions and/or rules

2. Must be easy to understand (note: not necessarily simple)

3. Must protect against mistakes or fraud

4. Have a fair price

5. Include a feedback field

Page 16

Basics - Task design

Ask:

• Can the worker get in a groove and churn through tasks?

• Can anyone read the instructions and do this right?

• Do we need to qualify the workers?

Page 17

Basics - Task design

Pricing iteration:

1. Work out a budget per assignment

2. Do a small run

3. Verify quality vs speed* of results

4. Fix your task, optimize spend** and go to step 2 (repeat forever)

* Qualifications, SEO, number of Workers

** Payment, repetition, requirements
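The budgeting step of this loop comes down to simple arithmetic. The following is a minimal sketch, not an official calculator: the function name, the parameter names, and the flat 20% fee rate are assumptions for illustration, so check the current MTurk pricing before relying on the numbers.

```python
from decimal import Decimal


def estimate_run_cost(reward, assignments_per_hit, hit_count,
                      fee_rate=Decimal("0.20")):
    """Estimate the total cost of a run: Worker rewards plus marketplace fee.

    fee_rate is an assumed commission; pass the rate that applies to you.
    """
    reward = Decimal(str(reward))
    rewards_total = reward * assignments_per_hit * hit_count
    fee_total = rewards_total * fee_rate
    return rewards_total + fee_total


# A small trial run: 50 HITs, 3 Workers each, $0.10 per assignment.
small_run = estimate_run_cost("0.10", assignments_per_hit=3, hit_count=50)
print(f"Estimated cost of small run: ${small_run:.2f}")
```

Running the estimate before each iteration makes it easier to see how a change in payment or repetition (the number of Workers per HIT) moves the total spend.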

Page 18

Workers

Page 19

Workers

Page 20

Workers

Page 21

Workers

• Motivations

  • Earn money

  • Status

  • Incentives

  • Leveling up

  • Pride

• Expectations

  • Traditionally: being treated like an API

  • Now: being treated like a human

  • Fairness, transparency

Page 22

Workers

• Lifecycle

• Custom Qualifications / Training

• Master Workers / Premium Qualifications

Page 23

Community

Page 24

Community

- Retention is key

- Finding the leaders

- Worker enablement

  - Help Workers improve

  - We do: video tutorials, community forum, clear rules, automated training, re-training

  - Ask them what they need!

- Listen to complaints

  - Add a comment box to your tasks to collect feedback

  - NPS (Net Promoter Score)

Page 25

Community

- Handling Workers that you don’t want doing your tasks

- Rejecting

- Qualifications

- Blocking

- Finding spammers and cheaters

- Join the external forums

- Your reputation matters
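One common way to find spammers and cheaters is to mix questions with known answers into the work and score each Worker against them. The sketch below is an illustration under that assumption, not a description of Rainforest's method; the Worker IDs, question IDs, and the 0.7 threshold are hypothetical.

```python
from collections import defaultdict


def worker_accuracy(answers, gold):
    """Return {worker_id: fraction correct} over questions with a known answer.

    answers: iterable of (worker_id, question_id, answer) tuples.
    gold: dict mapping question_id -> known correct answer.
    """
    correct = defaultdict(int)
    seen = defaultdict(int)
    for worker_id, question_id, answer in answers:
        if question_id in gold:
            seen[worker_id] += 1
            if answer == gold[question_id]:
                correct[worker_id] += 1
    return {w: correct[w] / seen[w] for w in seen}


def flag_for_review(accuracy, threshold=0.7):
    """List Workers whose known-answer accuracy is below the threshold."""
    return sorted(w for w, score in accuracy.items() if score < threshold)


gold = {"q1": "cat", "q2": "dog"}
answers = [
    ("W1", "q1", "cat"), ("W1", "q2", "dog"),
    ("W2", "q1", "dog"), ("W2", "q2", "dog"),
    ("W2", "q3", "bird"),  # no known answer, so it is ignored
]
accuracy = worker_accuracy(answers, gold)
print(flag_for_review(accuracy))
```

Flagging for review rather than rejecting automatically leaves room to check whether the task itself was unclear, which matters for your reputation with Workers.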

Page 26

Intermediate

Page 27

HITs

- HITType

- HIT

- Assignments

- Notifications

[Diagram: a HITType containing three HITs, each HIT with four Assignments, plus a Notification labeled "Reviewable"]
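The relationship between HITType, HIT, and Assignment can be pictured as a small data model. This is a simplified sketch for orientation only: the class and field names are assumptions, and the `is_reviewable` rule ignores cases such as a HIT expiring before all Assignments are submitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assignment:
    assignment_id: str
    worker_id: str
    status: str = "Submitted"  # e.g. Submitted, Approved, Rejected


@dataclass
class HIT:
    hit_id: str
    max_assignments: int
    assignments: List[Assignment] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        # Simplification: treat a HIT as reviewable once every
        # requested Assignment has been turned in.
        return len(self.assignments) >= self.max_assignments


@dataclass
class HITType:
    hit_type_id: str
    title: str
    reward: str
    hits: List[HIT] = field(default_factory=list)


hit = HIT(hit_id="HIT-1", max_assignments=2)
hit_type = HITType("TYPE-1", title="Label a photo", reward="0.05", hits=[hit])
hit.assignments.append(Assignment("A-1", "WORKER-1"))
hit.assignments.append(Assignment("A-2", "WORKER-2"))
print(hit.is_reviewable())
```

In this picture a HITType carries the shared properties (title, reward), each HIT is one unit of work, and each Assignment is one Worker's attempt at that HIT.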

Page 28

Useful API operations

Operation                                 Purpose
CreateHIT                                 Create new tasks for Workers to do.
GetAccountBalance                         Check the funding available for publishing new tasks.
RevokeQualification / GrantQualification  Modify the Qualifications assigned to Workers.
ForceExpireHIT                            Immediately remove a HIT from MTurk.
GetAssignment                             The status and results from an Assignment.
NotifyWorkers                             Send a message to your Workers.
GrantBonus                                Provide a bonus payment to Workers.

Use the Sandbox environment to experiment with creating and responding to HITs without spending money.
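As an illustration of CreateHIT against the Sandbox, here is a minimal sketch that assumes the boto3 `mturk` client, which the slides do not mention and whose operation names differ in places from the ones listed above. The title, description, keywords, and the `StubClient` stand-in are hypothetical; the stub exists only so the sketch runs without AWS credentials.

```python
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"


def publish_hit(client, question_xml, reward="0.10", workers=3):
    """Create one HIT via a boto3-style MTurk client and return its HITId."""
    response = client.create_hit(
        Title="Identify the object in the photo",
        Description="Look at one photo and pick the best label.",
        Keywords="image, label, quick",
        Reward=reward,                    # dollars, passed as a string
        MaxAssignments=workers,           # number of Workers per HIT
        LifetimeInSeconds=3600,           # how long the HIT stays listed
        AssignmentDurationInSeconds=300,  # time allotted after accepting
        Question=question_xml,
    )
    return response["HIT"]["HITId"]


class StubClient:
    """Stand-in for a real client so the sketch runs without AWS access."""

    def create_hit(self, **params):
        self.last_params = params
        return {"HIT": {"HITId": "EXAMPLE_HIT_ID"}}


stub = StubClient()
hit_id = publish_hit(stub, "<ExternalQuestion>...</ExternalQuestion>")
print(hit_id, stub.last_params["Reward"])

# With boto3 installed and AWS credentials configured, a Sandbox client
# could be created like this and passed in place of the stub:
#   import boto3
#   client = boto3.client("mturk", region_name="us-east-1",
#                         endpoint_url=SANDBOX_ENDPOINT)
#   print(client.get_account_balance()["AvailableBalance"])
```

Passing the client in as a parameter keeps the same function usable against the Sandbox and against production, which differ only in the endpoint the client was created with.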

Page 29

Question types

• QuestionForm – XML defined questions.

• HTMLQuestion – HTML form based questions.

• ExternalQuestion – Questions hosted on your own website.
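For the ExternalQuestion type, the question payload is a short XML document that points at your own page. The sketch below builds one with the Python standard library; the helper name, the example URL, and the frame height are assumptions, and the schema namespace should be checked against the MTurk API reference.

```python
import xml.etree.ElementTree as ET

EXTERNAL_QUESTION_NS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def external_question(url, frame_height=600):
    """Build the XML that tells MTurk to show `url` inside the HIT frame."""
    root = ET.Element("ExternalQuestion", xmlns=EXTERNAL_QUESTION_NS)
    ET.SubElement(root, "ExternalURL").text = url
    ET.SubElement(root, "FrameHeight").text = str(frame_height)
    # ElementTree escapes characters such as "&" in the URL for us.
    return ET.tostring(root, encoding="unicode")


question_xml = external_question("https://example.com/task?id=123&mode=test")
print(question_xml)
```

The resulting string is what would be passed as the question when creating a HIT; the page at the URL is then responsible for rendering the task and submitting the Worker's answers back to MTurk.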

Page 30

Review Policies

- Review Policies can be specified in your CreateHIT call to automatically evaluate Worker submissions.

- Assignment-level policies can be used to validate Worker responses to known answers.

- HIT-level policies look for consensus amongst Workers on each HIT.

[Figure: grid of Worker answers: B B C / B C B / B B]

• Imagine you want to ask six Workers and get 75% agreement.

• If two of the six Workers disagree, agreement is only 4 of 6 (about 67%), so the policy adds Assignments until the 75% threshold is reached.
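The arithmetic behind this example can be worked through in a few lines. This sketch only simulates the idea of a consensus threshold; it is not the MTurk Review Policy implementation, and reading the slide's figure as six initial answers (B, B, C, B, C, B) followed by two added ones (B, B) is an assumption.

```python
from collections import Counter


def agreement(answers):
    """Return (top_answer, share of Workers who gave it)."""
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer, votes / len(answers)


def needs_more_assignments(answers, threshold=0.75):
    """True while the most common answer is below the agreement threshold."""
    return agreement(answers)[1] < threshold


first_six = ["B", "B", "C", "B", "C", "B"]
print(agreement(first_six))               # 4 of 6 Workers agree
print(needs_more_assignments(first_six))  # below 75%, so extend the HIT

extended = first_six + ["B", "B"]
print(agreement(extended))                # 6 of 8 Workers agree
print(needs_more_assignments(extended))   # threshold reached, stop extending
```

With six Workers and two dissenters the share is 4/6, which is under 75%; two further matching answers bring it to 6/8, exactly 75%.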

Page 31

How Rainforest QA Uses Mechanical Turk

Page 32

Write tests, in plain English

Page 33

Automatically trained testers

• Fully automated training

• Course + class-based

• Automatic re-training

• Always expanding

• Per-customer training, for special situations

Page 34

Super fast

Page 35

Human results

Page 36

Accurate human results, ML / AI backed

Page 37

Scaling

Page 38

Scaling - Rainforest v1

• Initially linked jobs to HITs 1:1

• Balanced a list of HITs against an internal list of jobs

• Constantly pulled HITs on and off MTurk as jobs were added, cancelled, or changed

[Diagram: Jobs and HITs]
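Balancing a list of HITs against an internal list of jobs under a 1:1 mapping amounts to a set difference in each direction. The sketch below is a hypothetical illustration of that idea, not Rainforest's code; the function name and the job and HIT identifiers are made up.

```python
def reconcile(job_ids, hits_by_job):
    """Compare internal jobs with live HITs under a 1:1 job-to-HIT mapping.

    job_ids: set of job IDs that currently need a tester.
    hits_by_job: dict mapping job ID -> HIT ID already listed on MTurk.
    Returns (jobs that need a new HIT, HIT IDs that should be expired).
    """
    live_jobs = set(hits_by_job)
    to_create = sorted(job_ids - live_jobs)
    to_expire = sorted(hits_by_job[j] for j in live_jobs - job_ids)
    return to_create, to_expire


jobs = {"job-1", "job-2", "job-4"}
hits = {"job-1": "HIT-A", "job-2": "HIT-B", "job-3": "HIT-C"}
print(reconcile(jobs, hits))
```

Every added or cancelled job produces a create or expire call in this model, which is the constant churn the slide describes.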

Page 39

Scaling - Rainforest v2

• Decoupled jobs from HITs

• Balanced a list of HITs against an internal list of jobs

• Qualifications, constantly pulling on / off MTurk

[Diagram: Jobs and HITs]

Page 40

Scaling - Rainforest v3

• Unbalanced jobs / HITs: no 1:1 ratio, allowing for more SEO and a higher chance of Workers finding us

• Stopped using Qualifications

[Diagram: Jobs and HITs]
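The slides do not say how jobs reach Workers once the 1:1 ratio is gone. One way such decoupling could work is for generic HITs to ask an internal queue for the next job when a Worker opens them. The sketch below is purely an assumption for illustration; `JobDispatcher` and all identifiers are hypothetical.

```python
from collections import deque


class JobDispatcher:
    """Hand out internal jobs to whichever generic HIT a Worker accepts."""

    def __init__(self):
        self._queue = deque()
        self._assigned = {}  # assignment ID -> job ID

    def add_job(self, job_id):
        self._queue.append(job_id)

    def cancel_job(self, job_id):
        # Cancelling touches only the internal queue, not MTurk.
        if job_id in self._queue:
            self._queue.remove(job_id)

    def next_job_for(self, assignment_id):
        """Called when a Worker opens a HIT; returns a job ID or None."""
        if assignment_id in self._assigned:
            return self._assigned[assignment_id]
        if not self._queue:
            return None
        job_id = self._queue.popleft()
        self._assigned[assignment_id] = job_id
        return job_id


dispatcher = JobDispatcher()
for job in ("job-1", "job-2", "job-3"):
    dispatcher.add_job(job)
dispatcher.cancel_job("job-2")
first = dispatcher.next_job_for("assignment-A")
again = dispatcher.next_job_for("assignment-A")  # same Worker reloads the page
second = dispatcher.next_job_for("assignment-B")
third = dispatcher.next_job_for("assignment-C")  # queue is empty by now
print(first, again, second, third)
```

Under this design, adding or cancelling a job never requires a call to MTurk, so the number of listed HITs can be tuned for visibility independently of the number of jobs.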

Page 41

Questions

Page 42

Thank you!

Page 43

Remember to complete your evaluations!