What Managers Need to Know about Data Science Annie Flippo

Post on 15-Apr-2017


What Managers Need to Know about Data Science

Annie Flippo

Outline

• What is data science
• Industry trends
• What is data
• The Optimal Data Scientist
• The Optimal Manager
• Topics in Data Science
• Topics in Cloud Computing

Who am I?

Annie Flippo: Data Scientist, Software Engineer, Product Manager, Database Developer

What is Data Science?

Usage of Data Science

Finance: fraud detection, score buying habits, calculate risks

Insurance: inspect driving habits, assess risks, determine premiums

Biometrics: wearable devices to monitor and improve health

Digital Marketing: recommender systems, audience segmentation, retargeting, churn prediction

Retail: Walmart launches competition to solve business problems and to recruit talent

Online: Netflix launched $1 million prize to improve recommendation system

Healthcare: Heritage Network launched a competition to predict the probability of hospitalization of patients.

Scientific: National Data Science Bowl to predict ocean health: one plankton at a time

Why Should YOU Care?

According to McKinsey [1] (2011), Big Data: The next frontier for innovation, competition, and productivity:

“By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions”

Why Should YOU Care?

According to Forbes [2] (Oct 2015), The Hunt For Unicorn Data Scientists Lifts Salaries For All Data Analytics Professionals:

• Experienced data scientists are paid more than $200k per year
• Median salary for data scientists increased from $115,250 to $125,000 in one year
• Managers managing large teams can expect a median salary of $235,000

Because it’s a growing and exciting field with high compensation!

Explosion of Data Science

Why now?

• Storage cost has decreased dramatically
• Computing power has increased exponentially
• People are carrying smartphones, mini supercomputers in their pockets
• Perfect intersection of data availability and computing power for analytics

Massive amount of data

Streaming into your company …

What is data?

It can be raw web traffic logs …

Semi-structured data from APIs …

Or, structured data from databases …

… what to do with all this data?

The Data Scientist

Can wrangle data from many sources or formats

Can do deep data explorations …

and perform thorough analyses
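
As a rough illustration of what wrangling data from many sources can look like, the sketch below reads one record from each of the three kinds of data mentioned earlier: a raw web log line, a semi-structured API response, and a structured database table. It is a minimal Python sketch and is not taken from the talk; the sample log line, the JSON payload, the customers table, and all function names are hypothetical and invented for illustration.

```python
import json
import re
import sqlite3

# Hypothetical sample inputs, invented for illustration only.
RAW_LOG_LINE = (
    '203.0.113.7 - - [15/Apr/2017:10:12:01 +0000] '
    '"GET /pricing HTTP/1.1" 200 5120'
)
API_PAYLOAD = '{"user": {"id": 42, "plan": "pro"}, "events": ["login", "view"]}'

# Pattern for a common web server access-log layout (an assumption about the format).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_log_line(line):
    """Turn one raw log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = int(record["size"])
    return record


def parse_api_payload(payload):
    """Flatten a nested JSON response into a single flat record."""
    data = json.loads(payload)
    return {
        "user_id": data["user"]["id"],
        "plan": data["user"]["plan"],
        "event_count": len(data["events"]),
    }


def build_demo_database():
    """Create an in-memory SQLite database with one hypothetical customer row."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    connection.execute(
        "INSERT INTO customers (id, name) VALUES (?, ?)", (42, "Example Co")
    )
    connection.commit()
    return connection


def read_customers(connection):
    """Read structured rows from the database into a list of dicts."""
    cursor = connection.execute("SELECT id, name FROM customers ORDER BY id")
    return [{"user_id": row[0], "name": row[1]} for row in cursor]


if __name__ == "__main__":
    print(parse_log_line(RAW_LOG_LINE))
    print(parse_api_payload(API_PAYLOAD))
    print(read_customers(build_demo_database()))
```

Once each source is reduced to plain records that share a key (here the hypothetical user_id), the records can be joined and explored together, which is where the deeper exploration and analysis begin.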

DS Skills Inferred by Job Openings

• Ph.D. in math, statistics, engineering or physical science (Is it really required?)
• Has 5+ years of programming experience in Java, Scala, Python, R, SQL, MapReduce, etc.
• Has 5+ years of experience in most of the Apache open source technologies (e.g. Hadoop, Spark, Hive, Pig, Kafka, etc.)*
• Tells a story like a novelist (coherently and beautifully)

* By the time you read this footnote, the Apache stack has already grown.

The Optimal Data Scientist

Is a person with deep statistical and machine learning knowledge and extensive software engineering skills who is well-versed in business strategy!

The Optimal Data Scientist – Take 2

Personality Traits [3]

• Compulsive
• Propulsive laziness
• Drive to create and learn
• Irritable determination
• Insensitivity to pain (hmm…)
• Integrity
• Humility

The Optimal DS Manager

• Former data scientist (good to have but not necessary; that’s just asking for another unicorn!)
• Actually interested in managing people
• Thirst to learn
• Adept at managing different projects
• Patient and diplomatic enough to manage a diverse group of data scientists and business owners
• Understands when to go with an 80/20 approach

Data Scientists: The Challenge of Managing Stubbornly Autonomous Experts [4]

“I noticed … that data scientists, but also statisticians and top coders, often have difficulties accepting orders from managers who don’t have technical skills themselves.” - Istvan Hajnal

Journey to become a DS Manager

Nate Silver on Finding a Mentor, Teaching Yourself Statistics, and Not Settling in Your Career [5]

• Find a Mentor (Yes, even if you’re already a senior manager)
• Teach Yourself (online resources, MOOCs)
• Understand the life-cycle of a data-driven project
• Just do it!

Why Just Do It?

Why do I need to learn about data science and manage data projects?

“I have [insert # of years] years of experience in [insert your industry]. I’m comfortable and successful being a [insert your title here].”

Company Structures

Data Sources

Data projects are lurking everywhere …

Machine Learning

Google X laboratory [6]

Google Research [7]

Data Science Concepts

Predictive Analytics

Classification

Recommendation Systems

Big Data Technology

Topics in Cloud Computing

New services added:

Your Job: Provide Guidance

Tell us a data story … about your business

Do you understand the outcome?

What is your recommendation to the business?

Getting Started: Locally

Meetups:

• LA R users group
• LA Machine Learning
• LA Data Warehouse, BI & Analytics
• LA Big Data Users Group

Conferences:

• datascience.la
• bigdatadayla.org

Getting Started: Podcasts

• dataskeptic.com
• thetalkingmachines.com

Getting Started: MOOCs

Good Places to Start

• Data Science for Business by Foster Provost & Tom Fawcett
• Doing Data Science by Rachel Schutt & Cathy O’Neil (mathbabe.org). Free at www.columbiadatascience.com
• The Art of Data Science by Roger Peng & Elizabeth Matsui. https://leanpub.com/artofdatascience

Get Kids Started

• scratch.mit.edu
• www.ixl.com

Thank You!

Annie Flippo, @ACflippo

Slides are available at goo.gl/1X2NMH

References

1. http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

2. http://www.forbes.com/sites/gilpress/2015/10/09/the-hunt-for-unicorn-data-scientists-lifts-salaries-for-all-data-analytics-professionals/

3. http://cdn.oreillystatic.com/en/assets/1/event/119/Data%20Science%20Bootcamp%20Presentation.pdf

4. http://www.ibmbigdatahub.com/blog/data-scientists-challenge-managing-stubbornly-autonomous-experts

5. https://hbr.org/2013/09/nate-silver-on-finding-a-mentor-teaching-yourself-statistics-and-not-settling-in-your-career/

6. http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html

7. http://research.google.com/archive/unsupervised_icml2012.html