statistical models explored and explained

Speakers Statistical Models, Explored and Explained Sara Vafi, Stats Expert, Optimizely Shana Rusonis, Product Marketing, Optimizely

Upload: optimizely

Post on 19-Mar-2017

TRANSCRIPT

Page 1: Statistical Models Explored and Explained

Speakers

Statistical Models, Explored and Explained

Sara Vafi, Stats Expert, Optimizely
Shana Rusonis, Product Marketing, Optimizely

Page 2: Statistical Models Explored and Explained

Today’s Speakers

Sara Vafi Shana Rusonis

Page 3: Statistical Models Explored and Explained

Housekeeping
• We’re recording!
• Slides and recording will be emailed to you tomorrow
• Time for questions at the end

Page 4: Statistical Models Explored and Explained

Agenda
• Bayesian & Frequentist Statistics
• Error Control - Average vs. All Error Control
• Bayes Rule
• Benefits & Risks
• Optimizely Stats Engine
• Q&A

Page 5: Statistical Models Explored and Explained

Why Do We Experiment?

● Experimentation is essential for learning
● Try new ideas without fear of failure
● Give your business a signal to act on in a sea of noisy data

Page 6: Statistical Models Explored and Explained

What’s Most Important to You?

● Running experiments quickly
● But also reporting on results accurately
● When not all statistical solutions are created equal

Page 7: Statistical Models Explored and Explained

Types of Statistical Methods

Bayesian OR Frequentist

Page 8: Statistical Models Explored and Explained

Bayesian Statistics
● Bayesian statistics takes a more bottom-up approach to data analysis
● Our parameters are unknown
● The data is fixed
● There is a prior probability
● “Opinion-based”

Page 9: Statistical Models Explored and Explained

“A Bayesian is one who, vaguely expecting a horse, and catching a glimpse of a donkey, strongly believes he has seen a mule.”

Source

Page 10: Statistical Models Explored and Explained

Frequentist Statistics
● Frequentist arguments are more counter-factual in nature
● Parameters remain constant during the repeatable sampling process
● Resemble the type of logic that lawyers use in court
● ‘Is this variation different from the control?’ is a basic building block of this approach.

Page 11: Statistical Models Explored and Explained

Example: Dan & Pete Rolling a 6-Sided Die
Scenario:
● Pete will roll a die and the outcome can be 1, 2, 3, 4, 5, or 6
● If Pete rolls a 4, he will give Dan $1 million

If Dan were a Bayesian statistician, how would he react?
If Dan were a Frequentist statistician, how would he react?
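One way to make the two reactions concrete is sketched below. The frequentist view treats the chance of a 4 as a long-run frequency over repeated rolls; the Bayesian view starts from a prior belief about that chance and updates it as rolls are observed. The Beta(1, 5) prior (mean 1/6) and the function names are assumptions chosen for illustration, not something taken from the slides.

```python
import random
from fractions import Fraction


def long_run_frequency(n_rolls=60000, seed=1):
    """Frequentist view: the share of 4s over many repeated rolls of a fair die."""
    rng = random.Random(seed)
    fours = sum(rng.randint(1, 6) == 4 for _ in range(n_rolls))
    return fours / n_rolls


def posterior_mean(prior_a, prior_b, fours, rolls):
    """Bayesian view: mean of the Beta(prior_a + fours, prior_b + rolls - fours) posterior."""
    return Fraction(prior_a + fours, prior_a + prior_b + rolls)


freq = long_run_frequency()

# Hypothetical prior Beta(1, 5): before any rolls, the expected chance of a 4 is 1/6.
prior_only = posterior_mean(1, 5, fours=0, rolls=0)

# After seeing 3 fours in 6 rolls, the belief shifts toward the data.
after_data = posterior_mean(1, 5, fours=3, rolls=6)

print(f"long-run frequency of a 4: {freq:.4f}")
print(f"prior mean: {prior_only}, posterior mean after 3 fours in 6 rolls: {after_data}")
```

With this prior, the Bayesian estimate moves from 1/6 to 1/3 after six rolls, while the frequentist statement about the long-run frequency of a fair die stays fixed near 1/6.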

Page 12: Statistical Models Explored and Explained

Example
Probability of the sun exploding

Source
● Frequentist, relies on probability
● Bayesian, relies on prior knowledge

Page 13: Statistical Models Explored and Explained

Error Control

Page 14: Statistical Models Explored and Explained

Error Control Explained
● Error control limits the likelihood that the observed result of an experiment happened by chance, rather than because of a change that you introduced
● When we set the statistical significance on an experiment to 90%, we accept a 10% chance of a false positive: if the change truly has no effect, there is a 1 in 10 chance that the experiment still reports a significant result
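The 1 in 10 figure can be checked with a small simulation. The sketch below runs many A/A tests (both variations share the same true conversion rate) and counts how often a two-proportion z-test at 90% significance reports a difference anyway. The conversion rate, sample size, and the choice of a z-test are assumptions made for illustration; they are not details from the talk.

```python
import math
import random


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value using the normal approximation."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    # P(|Z| > |z|) for a standard normal variable
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n_tests=2000, visitors=1000, rate=0.10, alpha=0.10, seed=0):
    """Share of A/A tests (no real difference) that are called significant."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(n_tests):
        conv_a = sum(rng.random() < rate for _ in range(visitors))
        conv_b = sum(rng.random() < rate for _ in range(visitors))
        if two_sided_p_value(conv_a, visitors, conv_b, visitors) < alpha:
            false_positives += 1
    return false_positives / n_tests


fpr = false_positive_rate()
print(f"false positive rate at 90% significance: {fpr:.3f}")
```

The reported rate should land close to 0.10, which is what a 90% significance setting promises when each test is evaluated once at a fixed sample size.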

Page 15: Statistical Models Explored and Explained

Average Error Control

● Corresponds to Bayesian A/B Testing

● Less useful for iterating on test results

● Harder to learn from individual experiments with confidence

Page 16: Statistical Models Explored and Explained

All Error Control

● Corresponds to Frequentist A/B Testing

● Any experiment will have less than a 10% chance of a mistake

● Rate of errors is at most 1 in 10

Page 17: Statistical Models Explored and Explained

Average Error Control vs. All Error Control

● Average error control leads to lower accuracy for small improvements

● All error control is accurate for all users

● There are certain cases where average error control is an appropriate alternative

Page 18: Statistical Models Explored and Explained

Error Rates for Experiments

Page 19: Statistical Models Explored and Explained

Bayes Rule
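
For reference, the standard statement of Bayes' rule is given below. The symbols are chosen here for illustration: θ stands for the unknown quantity (for example, the true improvement of a variation) and D for the observed data.

```latex
P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)}
```

Here P(θ) is the prior, P(D | θ) is the likelihood of the data, and P(θ | D) is the posterior belief after seeing the data.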

Page 20: Statistical Models Explored and Explained

Average Error Control & Bayesian A/B Testing

● Requires two sources of randomness
• Randomness or “noise” in the data
• The makeup of the “typical” experiment group

● Distribution over experiment improvements
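
The two sources of randomness can be sketched in a short simulation. Each simulated experiment first draws a true improvement from a prior (the makeup of the "typical" experiment), then adds measurement noise. A Bayesian rule calls a winner when the posterior probability of a positive improvement exceeds 90%. The normal prior, the normal noise model, and all numeric values are assumptions for illustration only.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def declares_winner(observed, prior_mean, prior_sd, noise_sd, threshold=0.90):
    """Call a winner when P(improvement > 0 | data) exceeds the threshold."""
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / noise_sd**2)
    post_mean = post_var * (prior_mean / prior_sd**2 + observed / noise_sd**2)
    return normal_cdf(post_mean / math.sqrt(post_var)) > threshold


def simulate(n=200_000, prior_mean=0.02, prior_sd=0.02, noise_sd=0.02, seed=0):
    rng = random.Random(seed)
    winners = wrong_winners = zero_lift_calls = 0
    for _ in range(n):
        # Source 1: which experiment we happen to run (true lift drawn from the prior).
        true_lift = rng.gauss(prior_mean, prior_sd)
        # Source 2: noise in the measured lift.
        observed = rng.gauss(true_lift, noise_sd)
        if declares_winner(observed, prior_mean, prior_sd, noise_sd):
            winners += 1
            wrong_winners += true_lift <= 0
        # The same rule applied to an experiment whose true lift is exactly zero.
        zero_obs = rng.gauss(0.0, noise_sd)
        zero_lift_calls += declares_winner(zero_obs, prior_mean, prior_sd, noise_sd)
    return wrong_winners / winners, zero_lift_calls / n


average_error, zero_lift_error = simulate()
print(f"error rate averaged over typical experiments: {average_error:.3f}")
print(f"error rate for zero-improvement experiments:  {zero_lift_error:.3f}")
```

Under these assumptions the error rate averaged over all declared winners stays below 10%, while an experiment with no real improvement is wrongly called a winner roughly 1 time in 5. The guarantee holds on average across the assumed mix of experiments, not for each individual experiment.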

Page 21: Statistical Models Explored and Explained

Different Beliefs in Composition of ‘Typical’ Experiments

Page 22: Statistical Models Explored and Explained

Bayes Rule

Page 23: Statistical Models Explored and Explained

Bayes Rule & Bayesian A/B Testing

Page 24: Statistical Models Explored and Explained

Bayes Rule & Average Error Value

Page 25: Statistical Models Explored and Explained

Recap Average Error Control

Bayesian A/B Testing

Prior Distributions

Bayes Rule

Page 26: Statistical Models Explored and Explained

All Error Control is Frequentist A/B Testing

● All error control corresponds to Frequentist A/B testing

● The aim is to control the false positive rate

● That is, the chance that an experiment with no real difference is called either a winner or a loser

Page 27: Statistical Models Explored and Explained

Benefits & Risks

Page 28: Statistical Models Explored and Explained

Benefits of Bayesian A/B Testing

● Average error control can be very attractive

● Helps solve the “peeking” problem

● Average error control is fast
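
The "peeking" problem itself can be illustrated with a fixed-horizon frequentist test that is checked repeatedly while data is still arriving. The sketch below runs A/A tests and compares the false positive rate when looking once at the end against stopping at the first significant peek. The z-test, traffic numbers, and peek schedule are assumptions for illustration.

```python
import math
import random


def p_value(conv_a, conv_b, n):
    """Two-proportion z-test p-value with n visitors in each variation."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 1.0
    z = (conv_b - conv_a) / n / se
    return math.erfc(abs(z) / math.sqrt(2))


def run_aa_test(rng, visitors=2000, peek_every=100, rate=0.10, alpha=0.10):
    """Return (significant at any peek, significant at the final look)."""
    conv_a = conv_b = 0
    any_peek = False
    for n in range(1, visitors + 1):
        conv_a += rng.random() < rate
        conv_b += rng.random() < rate
        if n % peek_every == 0 and p_value(conv_a, conv_b, n) < alpha:
            any_peek = True
    final = p_value(conv_a, conv_b, visitors) < alpha
    return any_peek, final


def simulate(n_tests=1000, seed=0):
    rng = random.Random(seed)
    peeking = fixed = 0
    for _ in range(n_tests):
        any_peek, final = run_aa_test(rng)
        peeking += any_peek
        fixed += final
    return peeking / n_tests, fixed / n_tests


peeking_rate, fixed_rate = simulate()
print(f"false positives with a single final look: {fixed_rate:.3f}")
print(f"false positives when stopping at any peek: {peeking_rate:.3f}")
```

With a single final look the false positive rate stays near the configured 10%, while acting on any significant peek pushes it well above that. This inflation is the problem that the methods discussed in this section aim to avoid.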

Page 29: Statistical Models Explored and Explained

Risks of Bayesian A/B Testing

● It’s more appealing but it’s risky in practice

● Smaller improvement experiments with fast results = high risk

● Higher error rate than the method actually suggests

Page 30: Statistical Models Explored and Explained

Benefits of Frequentist A/B Testing
● This type of test will make fewer mistakes on experiments with non-zero improvements
● The rate of errors will be less than 1 in 10
● Option to speed up experimentation by using a prior

Page 31: Statistical Models Explored and Explained

Learning from A/B Tests

Page 32: Statistical Models Explored and Explained

Learning from A/B Tests

Page 33: Statistical Models Explored and Explained

Risk Involved with Typical Realistic Experiments

Page 34: Statistical Models Explored and Explained

Realistic Bayesian A/B Tests vs. Stats Engine

Page 35: Statistical Models Explored and Explained

So what does this mean?

● The hardest experiments to call correctly are those with small improvements
● A/B testing in the wild is not easy
● We need more and more data in order to...

Page 36: Statistical Models Explored and Explained

Stats Engine

Page 37: Statistical Models Explored and Explained

Stats Engine™

Results are valid whenever you check

Avoid costly statistics errors

Measure real-time results with confidence

Page 38: Statistical Models Explored and Explained

Key Takeaways

● Bayesian vs. Frequentist methods
● All error control vs. average error control
● Blended approach leads to greater confidence

Page 39: Statistical Models Explored and Explained

QUESTIONS?

Page 40: Statistical Models Explored and Explained

THANK YOU!

Page 41: Statistical Models Explored and Explained

Appendix

Page 42: Statistical Models Explored and Explained

Attic and button example

Page 43: Statistical Models Explored and Explained

Attic and button example, cont.: in relation to all error control

Page 44: Statistical Models Explored and Explained

Attic and button example, cont.: in relation to average error control

Page 45: Statistical Models Explored and Explained

How to define a Bayesian A/B test

Page 46: Statistical Models Explored and Explained

Trade-offs with Bayesian A/B testing
High improvement > low improvement

Page 47: Statistical Models Explored and Explained

Bayesian A/B testing is average error control

Page 48: Statistical Models Explored and Explained

Introduction slide about what topics will be covered

Page 49: Statistical Models Explored and Explained

SARA’S SLIDES

Page 50: Statistical Models Explored and Explained

Results are valid whenever you check

Avoid costly statistics errors

Measure real-time results with confidence

Stats Engine™
