data science: a mindset for productivity


Upload: daniel-tunkelang

Post on 28-Jul-2015

TRANSCRIPT

Page 1: Data Science: A Mindset for Productivity

Data Science: A Mindset for Productivity

Daniel Tunkelang

@dtunkelang

Daniel

Page 2: Data Science: A Mindset for Productivity

tl;dr

The most important part of data science is picking the right problem and figuring out how to frame it.

Page 3: Data Science: A Mindset for Productivity

We’re all technologists, right?

Page 4: Data Science: A Mindset for Productivity

But nobody knows everything.*

Class HashMap<K,V>

java.lang.Object
java.util.AbstractMap<K,V>
java.util.HashMap<K,V>

Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values

All Implemented Interfaces:
Serializable, Cloneable, Map<K,V>

*Except Jeff Dean.

Page 5: Data Science: A Mindset for Productivity

Math and computer science matter…

Page 6: Data Science: A Mindset for Productivity

But you have to solve the right problem.

Page 7: Data Science: A Mindset for Productivity

Stay friends with your exes.

explain

express

experiment

Page 8: Data Science: A Mindset for Productivity

Data science is a mindset.

Explain: Iterate using explainable models.

Express: Model your utility and inputs.

Experiment: Optimize for speed of learning.

Page 9: Data Science: A Mindset for Productivity

Explain

Page 10: Data Science: A Mindset for Productivity

With apologies to the little prince.

Page 11: Data Science: A Mindset for Productivity

Deep learning is the new black.

Page 12: Data Science: A Mindset for Productivity

But accuracy isn’t everything.

Page 13: Data Science: A Mindset for Productivity

The importance of being explainable.

• Algorithms can protect you from overfitting, but they can’t protect you from the biases you introduce.

• Introspection into your models and features makes it easier for you and others to debug them.

• Especially if you don’t completely trust your objective function or representativeness of your training data.

Page 14: Data Science: A Mindset for Productivity

Linear models? Decision trees?

• Linear regression and decision trees favor explainability over accuracy, compared to more sophisticated models.

• But size matters. If you have too many features or too deep a decision tree, you lose explainability.

• You can always upgrade to a more sophisticated model when you trust your objective function and training data.

• Building a machine learning model is an iterative process. Optimize for the speed of your own learning.
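To make the explainability point concrete, here is a minimal sketch of a one-feature linear model whose fitted parameters can be read directly. The data and the `fit_line` and `explain` helpers are illustrative assumptions, not material from the talk.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def explain(slope, intercept, feature="x", target="y"):
    """Turn the two fitted parameters into a sentence a reviewer can check."""
    direction = "rises" if slope >= 0 else "falls"
    return (f"{target} {direction} by {abs(slope):.2f} per unit of {feature} "
            f"(baseline {intercept:.2f})")

# Made-up data for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
summary = explain(slope, intercept)
```

With only two parameters, anyone can inspect the model and question it. That property fades as features are added, which is the "size matters" caveat above.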

Page 15: Data Science: A Mindset for Productivity

Express

Page 16: Data Science: A Mindset for Productivity

Machine learning for dummies.

• Define objective function.

• Collect training data.

• Build models.

• Profit!

Page 17: Data Science: A Mindset for Productivity

You only improve what you measure.

Clicks?

Actions?

Outcomes?

Page 18: Data Science: A Mindset for Productivity

Sometimes accuracy is complicated.

Page 19: Data Science: A Mindset for Productivity

What’s your error function?
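One way to read this question: two models with identical accuracy can have very different costs once each kind of mistake is priced. The sketch below assumes a binary task and made-up costs (`fp_cost`, `fn_cost`); the labels and both "models" are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_error(y_true, y_pred, fp_cost=1.0, fn_cost=5.0):
    """Total cost when false positives and false negatives are priced differently."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += fn_cost  # missed a positive
        elif t == 0 and p == 1:
            cost += fp_cost  # raised a false alarm
    return cost

# Hypothetical labels and predictions.
y_true = [1, 1, 0, 0, 0, 0]
model_a = [0, 1, 0, 0, 0, 0]  # one false negative
model_b = [1, 1, 1, 0, 0, 0]  # one false positive

same_accuracy = accuracy(y_true, model_a) == accuracy(y_true, model_b)
cost_a = weighted_error(y_true, model_a)
cost_b = weighted_error(y_true, model_b)
```

Both models get five of six predictions right, yet under these assumed costs model A is five times as expensive as model B.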

Page 20: Data Science: A Mindset for Productivity

Consider stratified sampling.
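A minimal sketch of stratified sampling, assuming a hypothetical query log split into "head" and "tail" strata. The `stratified_sample` helper and the data are illustrative, not from the talk.

```python
import random
from collections import defaultdict

def stratified_sample(items, key, fraction, seed=0):
    """Sample the same fraction from each stratum, keeping at least one item per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical log: 90 head queries and 10 tail queries.
queries = [("head", i) for i in range(90)] + [("tail", i) for i in range(10)]
sample = stratified_sample(queries, key=lambda q: q[0], fraction=0.1)
head_count = sum(1 for stratum, _ in sample if stratum == "head")
tail_count = sum(1 for stratum, _ in sample if stratum == "tail")
```

A plain random 10% sample of this log could easily contain no tail queries. Sampling per stratum guarantees the rare group is represented.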

Page 21: Data Science: A Mindset for Productivity

Experiment

Page 22: Data Science: A Mindset for Productivity

How to find your prince.

You have to kiss a lot of frogs to find one prince. So how can you find your prince faster? By finding more frogs and kissing them faster and faster.

-- Mike Moran

Page 23: Data Science: A Mindset for Productivity

Think like an economist.

Yesterday: Experiments are expensive, choose hypotheses wisely.

Today: Experiments are cheap, do as many as you can!

Page 24: Data Science: A Mindset for Productivity

But don’t forget you’re a scientist.

Page 25: Data Science: A Mindset for Productivity

Optimize for the speed of learning.

Page 26: Data Science: A Mindset for Productivity

Test one variable at a time.

• Autocomplete

• Entity Tagging

• Vertical Intent

• # of Suggestions

• Suggestion Order

• Language

• Query Construction

• Ranking Model
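As a rough sketch of what testing one variable at a time could look like in practice, the helper below builds experiment variants that each differ from a baseline configuration in exactly one setting. The configuration keys borrow names from the list above, but the values and the `one_at_a_time` helper are hypothetical.

```python
def one_at_a_time(baseline, alternatives):
    """Return (name, config) variants that differ from baseline in exactly one variable."""
    variants = []
    for variable, values in alternatives.items():
        for value in values:
            if value == baseline[variable]:
                continue  # skip the baseline setting itself
            config = dict(baseline)
            config[variable] = value
            variants.append((f"{variable}={value}", config))
    return variants

# Hypothetical baseline and candidate settings.
baseline = {"num_suggestions": 5, "ranking_model": "v1", "entity_tagging": False}
alternatives = {
    "num_suggestions": [5, 10],
    "ranking_model": ["v1", "v2"],
    "entity_tagging": [False, True],
}
variants = one_at_a_time(baseline, alternatives)
names = [name for name, _ in variants]
```

Because each variant changes a single setting, any difference in the measured outcome can be attributed to that setting rather than to an interaction between several changes.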

Page 27: Data Science: A Mindset for Productivity

tl;dr

The most important part of data science is picking the right problem and figuring out how to frame it.

Page 28: Data Science: A Mindset for Productivity

Daniel [email protected]

https://linkedin.com/in/dtunkelang

@dtunkelang