What Managers Need to Know About Data Science
TRANSCRIPT
Outline
• What is data science • Industry trends • What is data • The Optimal Data Scientist • The Optimal Manager • Topics in Data Science • Topics in Cloud Computing
Usage of Data Science
Finance: fraud detection, score buying habits, calculate risks
Insurance: inspect driving habits, assess risks, determine premiums
Biometrics: wearable devices to monitor and improve health
Digital Marketing: recommender systems, audience segmentation, retargeting, churn prediction
Retail: Walmart launches competitions to solve business problems and to recruit talent
Online: Netflix launched a $1 million prize to improve its recommendation system
Healthcare: Heritage Network launched a competition to predict the probability of hospitalization of patients.
Scientific: National Data Science Bowl to predict ocean health: one plankton at a time
Why Should YOU Care? According to McKinsey [1] (2011), Big Data: The next frontier for innovation, competition, and productivity.
“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”
Why Should YOU Care? According to Forbes [2] (Oct 2015), The Hunt For Unicorn Data Scientists Lifts Salaries For All Data Analytics Professionals
• Experienced data scientists are paid more than $200k per year
• Median salary for data scientists increased from $115,250 to $125,000 in one year
• Managers managing large teams can expect a median salary of $235,000
Explosion of Data Science
Why now?
• Storage cost has decreased dramatically
• Computing power has increased exponentially
• People are carrying smartphones, mini supercomputers in their pockets
• Perfect intersection of data availability and computing power for analytics
DS Skills Inferred by Job Openings
• Ph.D. in math, statistics, engineering or physical science (Is it really required?)
• Has 5+ years of programming experience in Java, Scala, Python, R, SQL, MapReduce, etc.
• Has 5+ years of experience in most of the Apache Open Source Technologies (e.g. Hadoop, Spark, Hive, Pig, Kafka, etc.)*
• Tells a story like a novelist (coherently and beautifully)
* By the time you read this footnote, the Apache stack has already grown.
The Optimal Data Scientist
Is a person with deep statistical and machine learning knowledge and extensive software engineering skills who is also well-versed in business strategy!
The Optimal Data Scientist – Take 2
Personality Traits [3] • Compulsive • Propulsive laziness • Drive to create and learn • Irritable determination • Insensitivity to pain (hmm…) • Integrity • Humility
The Optimal DS Manager
• Former data scientist (good to have but not necessary; that’s just asking for another unicorn!)
• Actually interested in managing people
• Thirst to learn
• Adept at managing different projects
• Patient and diplomatic enough to manage a diverse group of data scientists and business owners
• Understands when to go with an 80/20 approach
Data Scientists: The Challenge of Managing Stubbornly Autonomous Experts [4]
“I noticed … that data scientists, but also statisticians and top coders, often have difficulties accepting orders from managers who don’t have technical skills themselves.” - Istvan Hajnal
Journey to become a DS Manager
Nate Silver on Finding a Mentor, Teaching Yourself Statistics, and Not Settling in Your Career [5]
• Find a Mentor (Yes, even if you’re already a senior manager)
• Teach Yourself (online resources, MOOCs)
• Understand the life cycle of a data-driven project
• Just do it!
Why Just Do It? Why do I need to learn about data science and manage data projects?
“I have [insert # of years] years of experience in [insert my industry]. I’m comfortable and successful being a [insert your title here].”
Your Job: Provide Guidance
Tell us a data story … about your business
Do you understand the outcome?
What is your recommendation to the business?
Getting Started: Locally
Meetups: • LA R users group • LA Machine Learning • LA Data Warehouse, BI & Analytics • LA Big Data Users Group
Conferences: • datascience.la • bigdatadayla.org
Good Places to Start
Doing Data Science by Rachel Schutt & Cathy O’Neil (mathbabe.org) Free at www.columbiadatascience.com
The Art of Data Science by Roger Peng & Elizabeth Matsui https://leanpub.com/artofdatascience
References
1. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation
2. http://www.forbes.com/sites/gilpress/2015/10/09/the-hunt-for-unicorn-data-scientists-lifts-salaries-for-all-data-analytics-professionals/
3. http://cdn.oreillystatic.com/en/assets/1/event/119/Data%20Science%20Bootcamp%20Presentation.pdf
4. http://www.ibmbigdatahub.com/blog/data-scientists-challenge-managing-stubbornly-autonomous-experts
5. https://hbr.org/2013/09/nate-silver-on-finding-a-mentor-teaching-yourself-statistics-and-not-settling-in-your-career/
6. http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html
7. http://research.google.com/archive/unsupervised_icml2012.html