Generative Models for Crowdsourced Data
Outline
• What is Crowdsourcing?
• Modeling the labeling process
• Example with real data
• Extensions
• Future Directions
What is Crowdsourcing?
• Human-based computation.
• Outsourcing certain steps of a computation to humans.
• “Artificial artificial intelligence.”
• Data science:
– Making an immediate decision.
– Creating a labeled data set for learning.
Immediate Decision Workflow
Labeled Data Set Workflow
An Example HIT
Funny enough …
• Not everybody agrees on the gender of a Twitter profile.
• Difficult Instances
• Worker Ability / Motivation
• Worker Bias
• Adversarial Behaviour
Difficult Instance
Worker Ability
Worker Motivation
Worker Bias
Disagreements
• When some workers say “male” and some workers say “female”, what to do?
Majority Rules Heuristic
• Assign label l to item x if a majority of workers agree.
• Otherwise item x remains unlabeled.
• Ignores prior worker data.
• Introduces bias into the labeled data.
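The heuristic above can be sketched in a few lines (an illustrative helper, not code from the talk):

```python
from collections import Counter

def majority_label(worker_labels):
    """Return the strict-majority label, or None (the item stays unlabeled)."""
    if not worker_labels:
        return None
    label, count = Counter(worker_labels).most_common(1)[0]
    return label if count * 2 > len(worker_labels) else None
```

For example, `majority_label(["male", "male", "female"])` yields `"male"`, while a 1-1 tie yields `None` and the item remains unlabeled.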
Train on all labels
• For the labeled data set workflow.
• Add all item-label pairs to the data set.
• Equivalent to a cost vector of:
– P(l | {l_w}) = (1/n_w) Σ_w 1{l = l_w}
• Ignores prior worker data.
• Models the crowd, not the “ground truth.”
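The empirical distribution P(l | {l_w}) = (1/n_w) Σ_w 1{l = l_w} is just the fraction of workers voting for each label; a minimal sketch (illustrative, not from the talk):

```python
from collections import Counter

def crowd_label_distribution(worker_labels, label_set):
    """Empirical label distribution P(l | {l_w}) = (1/n_w) * sum_w 1{l = l_w}."""
    counts = Counter(worker_labels)
    n_w = len(worker_labels)
    return {l: counts[l] / n_w for l in label_set}
```

For example, `crowd_label_distribution(["male", "male", "female"], ["male", "female"])` yields `{"male": 2/3, "female": 1/3}`: the crowd's distribution over labels, not a hard label.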
What is ground truth?
• Different theoretical approaches:
– PAC learning with noisy labels.
– Fully-adversarial active learning.
• Bayesians have been very active:
– “Easy” to posit a functional form and quickly develop inference algorithms.
– Issue of model correctness is ultimately empirical.
Bayesian Literature
• (2009) Whitehill et al. GLAD framework.
– (1979) Dawid and Skene. Maximum Likelihood Estimation of Observer Error-Rates Using the EM Algorithm.
• (2010) Welinder et al. The Multidimensional Wisdom of Crowds.
• (2010) Raykar et al. Learning from Crowds.
Bayesian Approach
• Define ground truth via a generative model which describes how “ground truth” is related to the observed output of crowdsource workers.
• Fit to observed data.
• Extract posterior over ground truth.
• Make decision or train classifier.
Generative Model
Example: Binary Classification
• Each worker has a confusion matrix:

α = ( -1    α_01 )
    ( α_10  -1   )

• Each item has a scalar difficulty β > 0.
• P(l_w = j | z = i) = exp(-β α_ij) / Σ_k exp(-β α_ik)
• Priors: α_ij ~ N(μ_ij, 1); μ_ij ~ N(0, 1)
• log β ~ N(ρ, 1); ρ ~ N(0, 1)
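The label likelihood above is a softmax over a row of the worker's confusion matrix, scaled by the item's β; a small sketch of that computation (illustrative, not the talk's code):

```python
import math

def label_probs(alpha_row, beta):
    """P(l_w = j | z = i) = exp(-beta * alpha_ij) / sum_k exp(-beta * alpha_ik).

    alpha_row: row i of a worker's confusion matrix (diagonal fixed at -1);
    beta: the item's scalar difficulty parameter (> 0).
    """
    scores = [math.exp(-beta * a) for a in alpha_row]
    total = sum(scores)
    return [s / total for s in scores]

# Row for true class i = 0 of alpha = ((-1, 0.5), (0.5, -1)):
probs = label_probs([-1.0, 0.5], beta=2.0)  # probs[0] = P(worker reports true class)
```

Because the diagonal is fixed at -1, a larger β puts more probability on the correct label, while the off-diagonal α_ij control this worker's particular confusions.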
Other Problems
• Multiclass classification:
– Same as binary with a larger confusion matrix.
• Ordinal classification (“Hot or not”):
– Confusion matrix has a special form.
– O(L) parameters instead of O(L²).
• Multilabel classification:
– Reduce to multiclass on the power set.
– Assume a low-rank confusion matrix.
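One common way to get O(L) parameters for the ordinal case is to penalize confusion by ordinal distance, with a single scalar per class; the distance-based form here is an illustrative assumption, since the slides only say the confusion matrix "has a special form":

```python
import math

def ordinal_label_probs(true_label, alpha, num_labels):
    """Ordinal confusion with O(L) parameters: one scalar alpha[i] per class,
    P(l_w = j | z = i) proportional to exp(-alpha[i] * |i - j|).

    Labels adjacent to the true one are more likely to be reported than
    distant ones, which matches "Hot or not"-style rating tasks.
    """
    scores = [math.exp(-alpha[true_label] * abs(true_label - j))
              for j in range(num_labels)]
    total = sum(scores)
    return [s / total for s in scores]
```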
EM
• Initially all workers are assumed moderately accurate and without bias.
– Implies the initial estimate of the ground-truth distribution favors consensus.
– Disagreeing with the majority is treated as a likely error.
• Workers consistently in the minority have their confusion probabilities increase.
• Workers with higher confusion probabilities contribute less to the distribution of ground truth.
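The dynamic above can be seen in a minimal Dawid-Skene-style EM loop; this is a simplified sketch with no item difficulty, not the talk's exact model:

```python
import math

def _normalize(xs):
    s = sum(xs)
    return [x / s for x in xs]

def dawid_skene_em(labels, num_workers, num_items, num_classes=2, iters=20):
    """Simplified Dawid-Skene-style EM.

    labels: list of (item, worker, label) triples.
    Returns (q, conf): q[i] is the posterior over item i's true class;
    conf[w][i][j] estimates P(worker w says j | true class i).
    """
    # Initial E-step: per-item empirical vote distribution (consensus).
    q = [[1.0 / num_classes] * num_classes for _ in range(num_items)]
    for i, w, l in labels:
        q[i][l] += 1.0
    q = [_normalize(row) for row in q]

    for _ in range(iters):
        # M-step: class prior and smoothed worker confusion matrices.
        prior = _normalize([sum(q[i][z] for i in range(num_items))
                            for z in range(num_classes)])
        conf = [[[1.0] * num_classes for _ in range(num_classes)]  # Laplace smoothing
                for _ in range(num_workers)]
        for i, w, l in labels:
            for z in range(num_classes):
                conf[w][z][l] += q[i][z]
        conf = [[_normalize(row) for row in m] for m in conf]

        # E-step: posterior over each item's true class.
        logq = [[math.log(prior[z]) for z in range(num_classes)]
                for _ in range(num_items)]
        for i, w, l in labels:
            for z in range(num_classes):
                logq[i][z] += math.log(conf[w][z][l])
        q = [_normalize([math.exp(v - max(row)) for v in row]) for row in logq]
    return q, conf
```

A worker who keeps disagreeing with the inferred ground truth ends up with a low diagonal in their estimated confusion matrix, and so contributes less to subsequent E-steps.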
“Different” workers are marginalized
• Workers that are consistently in the minority will not contribute strongly to the posterior distribution over ground truth.– Even if they are actually more accurate.
• Can be corrected when accurate workers are paired with some inaccurate workers.
• Good for breaking ties.
• Raykar et al.
Example with real data
Online EM
• Given a set of worker-label pairs for a single item:
• (Inference) Using current α, find most likely β* and distribution q* over ground truth.
• (Training) Do an SGD update of α with respect to the EM auxiliary function evaluated at β* and q*.
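The two steps above can be sketched for a single item; this is a skeletal version with fixed difficulty, a uniform class prior, and per-worker confusion logits, all simplifying assumptions not in the slides:

```python
import math

def online_em_step(item_labels, alpha, lr=0.1, num_classes=2):
    """One online-EM step for a single item (sketch).

    item_labels: list of (worker_id, label) pairs for one item.
    alpha: dict worker_id -> confusion logits, where
           P(l_w = j | z = i) = softmax_j(alpha[w][i]).
    Returns q*, the posterior over the item's true class, after
    updating alpha in place by one SGD step.
    """
    def softmax(row):
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        return [v / s for v in e]

    # Inference (E-step): posterior over the true class under current alpha.
    logq = [0.0] * num_classes
    for w, l in item_labels:
        for z in range(num_classes):
            logq[z] += math.log(softmax(alpha[w][z])[l])
    q = softmax(logq)

    # Training (M-step by SGD): ascend E_q[log P(labels | z)] w.r.t. alpha;
    # the gradient of log softmax is (indicator - probability).
    for w, l in item_labels:
        for z in range(num_classes):
            p = softmax(alpha[w][z])
            for j in range(num_classes):
                alpha[w][z][j] += lr * q[z] * ((1.0 if j == l else 0.0) - p[j])
    return q
```

Processing items one at a time like this avoids storing the full label matrix, at the cost of a noisier estimate of α.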
Things to do with q*
• Take an immediate cost-sensitive decision:
– d* = argmin_d E_{z~q*}[f(z, d)]
• Train an (importance-weighted) classifier:
– cost vector c_d = E_{z~q*}[f(z, d)]
– e.g. 0/1 loss: c_d = 1 - q*_d
– e.g. binary 0/1 loss: |c_1 - c_0| = |1 - 2q*_1|
– No need to decide what the true label is!
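The cost vector c_d = E_{z~q*}[f(z, d)] is a one-line expectation; a minimal sketch (illustrative, not from the talk):

```python
def cost_vector(q, loss):
    """Cost vector c_d = E_{z ~ q*}[f(z, d)] over decisions d.

    q: posterior over ground truth (list of probabilities);
    loss(z, d): task loss, e.g. the 0/1 loss below.
    """
    return [sum(q[z] * loss(z, d) for z in range(len(q)))
            for d in range(len(q))]

def zero_one(z, d):
    return 0.0 if z == d else 1.0
```

With q* = [0.7, 0.3] and 0/1 loss this gives c = [0.3, 0.7], matching c_d = 1 - q*_d; the costs feed a cost-sensitive or importance-weighted learner directly, so no hard label is ever chosen.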
• Raykar et al.: why not jointly estimate the classifier and worker confusion?
Raykar et al. insight
• Cost vector is constructed by estimating worker confusion matrices.
• Subsequently, the classifier is trained; it will sometimes disagree with workers.
• Would be nice to use that disagreement to inform the worker confusion matrices.
• Circular dependency suggests joint estimation.
Generative Model
Online Joint Estimation
• Initially the classifier will output a nearly uninformative prior and therefore will be trained to follow the consensus of workers.
• Eventually workers which disagree with the classifier will have their confusion probabilities increase.
• Workers consistently in the minority can contribute strongly to the posterior if they tend to agree with the classifier.
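In the joint model, the classifier's prediction replaces the uniform prior in the E-step; a sketch under that assumption (the `classifier(features)` interface returning a probability vector is hypothetical, not from the slides):

```python
import math

def joint_e_step(features, item_labels, alpha, classifier, num_classes=2):
    """E-step of joint estimation (sketch): the classifier's prediction on
    the item's features acts as the prior over ground truth, so a minority
    worker who agrees with a confident classifier still pulls the posterior
    toward their label.

    alpha: dict worker_id -> confusion logits, with
           P(l_w = j | z = i) = softmax_j(alpha[w][i]).
    """
    def softmax(row):
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        return [v / s for v in e]

    prior = classifier(features)
    logq = [math.log(prior[z]) for z in range(num_classes)]
    for w, l in item_labels:
        for z in range(num_classes):
            logq[z] += math.log(softmax(alpha[w][z])[l])
    return softmax(logq)
```

With a uniform classifier the posterior simply follows the worker majority; with a confident classifier the same votes can resolve the other way, which is exactly the disagreement signal the joint estimation exploits.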
Additional Resources
• Software
– http://code.google.com/p/nincompoop
• Blog
– http://machinedlearnings.com/