Post on 22-Dec-2015
Network A/B Testing: From Sampling to Estimation
Ya Xu‡
Joint work with Huan Gui† Anmol Bhasin‡ Jiawei Han†
† University of Illinois at Urbana-Champaign, Urbana ‡ LinkedIn Corporation
A/B Testing – Two parallel universes
• Parallel Universe 1 (control)
• Parallel Universe 2 (treatment)
• Real World (observations)
Assumption
Examples
– Experiment on feed ranking algorithms
  • Treatment feed algorithm ranks more relevant items higher
  • Adam (treatment) clicks on a feed update (X)
  • X shows up higher for Adam’s friend Ben (control)
  • Ben (control) clicks on X
– Experiment on People You May Know recommendations
– …
Assumption: SUTVA
• SUTVA (Stable Unit Treatment Value Assumption)
– Treatment assignment vector
– Response function
• Each individual’s response is affected only by their own treatment assignment.
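One common way to write the assumption in symbols is sketched below. The notation (Z for the assignment vector, Y_i for unit i's response, N for the number of units) is an assumption chosen for illustration and is not taken from the slide.

```latex
% Assumed notation: N units, Z_i = 1 if unit i is in treatment, 0 otherwise.
\[
  Z = (Z_1, \dots, Z_N) \in \{0,1\}^N
\]
% SUTVA: the response of unit i depends on Z only through its own entry Z_i.
\[
  Y_i(Z) = Y_i(Z_i) \quad \text{for all } i = 1, \dots, N
\]
% The average treatment effect compares the all-treatment and all-control universes.
\[
  \mathrm{ATE} = \frac{1}{N} \sum_{i=1}^{N} \bigl[ Y_i(Z = \mathbf{1}) - Y_i(Z = \mathbf{0}) \bigr]
\]
```

In the feed example, Ben's response depends on Adam's assignment, so Y_i(Z) is not a function of Z_i alone and SUTVA fails.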
Framework
1. Experimental Design
– Randomize assignment to minimize interactions
2. Experimental Analysis
– Adjust for network effect post experiment
Experimental Design
1. Partition the network/graph
2. Randomize at cluster level
Minimize the links between clusters → minimize the interactions between treatment and control → minimize information leakage → smaller bias for ATE
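A minimal sketch of step 2, assuming the partition from step 1 is already available as a user-to-cluster mapping. The function names and the `cluster_of` input are hypothetical; the second helper reports how many edges still cross between treatment and control, which is the quantity the design tries to keep small.

```python
import random

def randomize_clusters(cluster_of, treat_fraction=0.5, seed=0):
    """Assign whole clusters to treatment (1) or control (0).

    cluster_of: dict mapping user id -> cluster id (hypothetical input).
    Returns a dict mapping user id -> 0/1 assignment.
    """
    rng = random.Random(seed)
    clusters = sorted(set(cluster_of.values()))
    rng.shuffle(clusters)
    n_treat = round(len(clusters) * treat_fraction)
    treated = set(clusters[:n_treat])
    # Every user inherits the assignment of their cluster.
    return {user: int(c in treated) for user, c in cluster_of.items()}

def cross_arm_edge_fraction(edges, assignment):
    """Fraction of edges whose endpoints landed in different arms."""
    if not edges:
        return 0.0
    crossing = sum(assignment[u] != assignment[v] for u, v in edges)
    return crossing / len(edges)
```

Edges inside a cluster can never cross arms, so a partition with few between-cluster links keeps this fraction low.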
Balanced Graph Partition
• If all clusters have the same size, then no matter what users’ responses are, the covariance is zero, leading to an unbiased estimator.
See Middleton and Aronow 2011 for derivation
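The role of equal cluster sizes can be seen from a generic ratio-estimator identity. This is a sketch in assumed notation, not the cited derivation: T is the total response over treated clusters and N_T is the number of units in treated clusters, both random under cluster randomization.

```latex
% Assumed notation: T = total response in treated clusters,
% N_T = number of units in treated clusters, T / N_T = treated-arm mean.
% From Cov(T/N_T, N_T) = E[T] - E[T/N_T] E[N_T] it follows that
\[
  \mathbb{E}\!\left[\frac{T}{N_T}\right]
  = \frac{\mathbb{E}[T] - \operatorname{Cov}\!\left(\frac{T}{N_T},\, N_T\right)}{\mathbb{E}[N_T]}
\]
% If all clusters have the same size and a fixed number of clusters is treated,
% N_T is a constant, so the covariance term vanishes:
\[
  \operatorname{Cov}\!\left(\frac{T}{N_T},\, N_T\right) = 0
  \quad\Longrightarrow\quad
  \mathbb{E}\!\left[\frac{T}{N_T}\right] = \frac{\mathbb{E}[T]}{N_T}
\]
```

The same argument applies to the control arm, so the difference in arm means carries no bias from random arm sizes.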
Clustering Real Network
Heterogeneous & large scale (350MM+)
An employee network from LinkedIn
3-net clustering (Ugander et al., KDD’13)
Randomized Balanced Graph Partition
• Random Shuffling on Label Propagation
1. Randomly initialize clusters (equal size)
2. Select two nodes and swap their labels if it results in fewer edges between clusters.
3. Randomly shuffle x% of labels (breaks out of local optima)
4. Repeat until convergence.
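The four steps above can be sketched on a single machine as follows. This is an illustrative simplification, not the distributed implementation: "repeat until convergence" is replaced by a fixed number of rounds, the best partition seen is kept (an addition of this sketch), and all names and parameters are hypothetical. Node ids are assumed to be comparable (for example integers), and `adj` maps each node to a set of neighbors.

```python
import random

def cut_size(adj, label):
    """Number of edges whose endpoints carry different cluster labels."""
    return sum(label[u] != label[v] for u in adj for v in adj[u] if u < v)

def swap_gain(adj, label, u, v):
    """Reduction in cut size if u and v exchange labels (positive is better)."""
    lu, lv = label[u], label[v]

    def cut_around(node, own, partner):
        # Cut edges at `node` if it carried label `own`; the u-v edge is
        # skipped because it stays cut before and after the swap.
        return sum(label[w] != own for w in adj[node] if w != partner)

    before = cut_around(u, lu, v) + cut_around(v, lv, u)
    after = cut_around(u, lv, v) + cut_around(v, lu, u)
    return before - after

def rslp_partition(adj, n_clusters, shuffle_frac=0.05, rounds=20,
                   swaps_per_round=2000, seed=0):
    rng = random.Random(seed)
    nodes = list(adj)
    rng.shuffle(nodes)
    # Step 1: random initial clusters of (near-)equal size.
    label = {node: i % n_clusters for i, node in enumerate(nodes)}
    best, best_cut = dict(label), cut_size(adj, label)
    for _ in range(rounds):
        # Step 2: swap two nodes' labels only if the cut gets smaller.
        for _ in range(swaps_per_round):
            u, v = rng.sample(nodes, 2)
            if label[u] != label[v] and swap_gain(adj, label, u, v) > 0:
                label[u], label[v] = label[v], label[u]
        current = cut_size(adj, label)
        if current < best_cut:
            best, best_cut = dict(label), current
        # Step 3: permute the labels of a random x% of nodes. Permuting
        # (rather than redrawing) labels keeps cluster sizes unchanged.
        k = int(shuffle_frac * len(nodes))
        if k >= 2:
            chosen = rng.sample(nodes, k)
            permuted = [label[n] for n in chosen]
            rng.shuffle(permuted)
            for n, lab in zip(chosen, permuted):
                label[n] = lab
    return best, best_cut
```

Because every move is a swap or a permutation of existing labels, the partition stays balanced throughout.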
Clustering Results
• Network Statistics
Nodes #   Edges #   Max Degree   Avg. Degree
7.26e4    2.88e6    3997         39.67
• Number of edges within clusters
Method             LP      RSLP    MM
# of edges (1e6)   2.161   2.355   2.359
RSLP can be distributed as easily as the Label Propagation algorithm (LP), while achieving performance comparable to Modularity Maximization (MM).
Experimental Analysis
• Exposure Models
– SUTVA
– Neighborhood Exposure (Ugander et al., KDD’13)
• Definition: i is neighborhood exposed to treatment if (1) i is in treatment, and (2) at least a fraction θ of i’s neighbors are in treatment
• Assumption: i’s response under neighborhood exposure is the same as if everyone receives treatment.
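A minimal sketch of the definition, with hypothetical function names. Two choices here are assumptions of the sketch: the control arm is handled symmetrically (the definition above covers only treatment), and units with no neighbors are skipped. The estimate is a plain difference in means over exposed units, not an inverse-probability-weighted estimator.

```python
def neighborhood_exposed(adj, z, theta, arm=1):
    """Units in `arm` with at least a fraction theta of neighbors in `arm`.

    adj: dict node -> collection of neighbors; z: dict node -> 0/1.
    """
    exposed = set()
    for i, nbrs in adj.items():
        nbrs = list(nbrs)
        # Assumption of this sketch: isolated units are never counted as exposed.
        if z[i] != arm or not nbrs:
            continue
        same = sum(z[j] == arm for j in nbrs)
        if same / len(nbrs) >= theta:
            exposed.add(i)
    return exposed

def _mean(y, units):
    return sum(y[i] for i in units) / len(units)

def exposure_difference_in_means(adj, z, y, theta):
    """Mean response of treatment-exposed units minus control-exposed units."""
    treated = neighborhood_exposed(adj, z, theta, arm=1)
    control = neighborhood_exposed(adj, z, theta, arm=0)
    if not treated or not control:
        raise ValueError("no exposed units in one arm; try a lower theta")
    return _mean(y, treated) - _mean(y, control)
```

Raising theta shrinks the exposed sets, which is where the loss of usable data points comes from.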
Bias-Variance Tradeoff
• θ = 0.9: about 80% of data points would be invalid (high variance)
• θ = 0.3: stronger assumption, Yi(θ = 0.3) = Yi(θ = 1) (large bias)
Fraction Neighborhood Exposure
• Users’ responses are determined by
– the treatment assignment
– the fraction of neighbors having the same treatment assignment.
• The response function can be an arbitrary function of these two quantities
• E.g., Additive Models
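One possible additive form, sketched under stated assumptions: within each arm, the response is modeled as an intercept plus a slope times the fraction of same-arm neighbors, and the ATE is the difference between the two arms' predictions at fraction 1 (everyone around the user shares the user's arm). This specific form and all names are hypothetical and are not claimed to match the additive models in the talk. Isolated units are given fraction 1.0, which is also an assumption.

```python
def same_arm_fraction(adj, z):
    """Fraction of each unit's neighbors that share its own assignment."""
    frac = {}
    for i, nbrs in adj.items():
        nbrs = list(nbrs)
        if not nbrs:
            frac[i] = 1.0  # assumption: isolated units count as fully exposed
        else:
            frac[i] = sum(z[j] == z[i] for j in nbrs) / len(nbrs)
    return frac

def _fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("no variation in neighbor fraction within an arm")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def additive_fraction_ate(adj, z, y):
    """ATE as the gap between arm-wise predictions at neighbor fraction 1."""
    frac = same_arm_fraction(adj, z)
    predicted = {}
    for arm in (0, 1):
        units = [i for i in adj if z[i] == arm]
        intercept, slope = _fit_line([frac[i] for i in units],
                                     [y[i] for i in units])
        predicted[arm] = intercept + slope  # prediction at fraction = 1
    return predicted[1] - predicted[0]
```

Unlike the threshold-based exposure model, every unit contributes to the fit, so no data points are discarded.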
Simulations
• Real network graph
• Generation model (Eckles et al. 2014)
• Compare bias & variance of five estimators
Real Online Experiment
1. Select a country
2. Apply randomized balanced graph partitioning to assign treatment/control
3. Apply two Feed ranking algorithms to treatment/control
4. Estimate ATE using various approaches
Real Online Experiment
• Picked Netherlands
• 600 clusters, 300/300 in treatment/control
• Conducted A/A test to ensure no bias
Real Online Experiments: Results
Method                               ATE for social gesture
SUTVA                                0.168
Network Exposure θ = 0.75            0.264
Network Exposure θ = 0.9             0.520
Hajek. Network Exposure θ = 0.75     0.625
Network Exposure θ = 0.9             0.133
Fraction Exposure (Additive I)       0.687
Fraction Exposure (Additive II)      0.714
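The table labels one estimator as Hajek. In general, a Hajek estimator is the self-normalized form of inverse-probability weighting. The sketch below shows that general form only; it assumes each unit's probability of being exposed is already known (the `prob_*` inputs are hypothetical), and it is not claimed to reproduce how the numbers above were computed.

```python
def hajek_mean(y, exposed, prob):
    """Self-normalized inverse-probability-weighted mean over exposed units.

    y: dict unit -> response; exposed: units observed under the exposure
    condition; prob: dict unit -> probability of that exposure (assumed known).
    """
    weights = {i: 1.0 / prob[i] for i in exposed}
    total_weight = sum(weights.values())
    if total_weight == 0:
        raise ValueError("no exposed units")
    return sum(weights[i] * y[i] for i in exposed) / total_weight

def hajek_ate(y, exposed_treat, exposed_ctrl, prob_treat, prob_ctrl):
    """Difference of Hajek means between treatment- and control-exposed units."""
    return (hajek_mean(y, exposed_treat, prob_treat)
            - hajek_mean(y, exposed_ctrl, prob_ctrl))
```

Units that were unlikely to end up exposed get larger weights, and dividing by the sum of weights rather than the population size typically trades a small bias for lower variance.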
Key Takeaways
• Network effect in A/B Testing
• Experimental Design: Balanced Graph Partition
• Experimental Analysis: Fraction Neighborhood Exposure Model
• Experiments
– Simulation
– Real Online Experiments
• Lots of future work!
Percentage of Units in Treatment
• The distribution of changes with percentage of units in treatment.
is not representative.