Data Science and Business Analysis: A Look at Best Practices for Roles, Skills, and Processes

Bob E. Hayes, PhD [email protected] @bobehayes

Upload: dataversity

Post on 12-Apr-2017


Page 1: Smart Data Slides: Data Science and Business Analysis - A Look at Best Practices for Roles, Skills, and Processes


Page 2

Bob E. Hayes, PhD

Chief Research Officer

Email: [email protected]

Web: www.appuri.com

Twitter: @bobehayes

• Author of three books on customer experience management and analytics

• PhD in industrial-organizational psychology

• #1 blogger overall on CustomerThink (http://customerthink.com/author/bobehayes/)

• #1 blogger on the topic of customer analytics (http://customerthink.com/top-authors-category/)

• Top expert in Big Data and Data Science

  • https://www.maptive.com/the-top-100-big-data-experts/

  • http://www.kdnuggets.com/2015/02/top-big-data-influencers-brands.html

Page 3

What is Data Science?

Data science is a way of extracting insights from data using the powers of computer science and statistics applied to data from a specific field of study.

Involves the collection, analysis and interpretation of data to extract empirically-based insights that augment and enhance human decisions and algorithms

Page 4

Data Science Study

Invited data professionals via:
• AnalyticsWeek Newsletter
• Blog post
• Social media (Twitter, LinkedIn, Google+)

600+ completed surveys
• Self-assessment rating of proficiency of 25 skills across five skill areas:
  • Business, Technology, Programming, Math & Modeling, Statistics
• 9 additional questions
• Overall satisfaction with outcome of analytics projects

Page 5

Data Science Skills Assessed

Area and Skills*

Business
1. Product design and development
2. Project management
3. Business development
4. Budgeting
5. Governance & Compliance (e.g., security)

Technology
6. Managing unstructured data (e.g., noSQL)
7. Managing structured data (e.g., SQL, JSON, XML)
8. Natural Language Processing (NLP) and text mining
9. Machine Learning (e.g., decision trees, neural nets, Support Vector Machine, clustering)
10. Big and Distributed Data (e.g., Hadoop, Map/Reduce, Spark)

Math & Modeling
11. Optimization (e.g., linear, integer, convex, global)
12. Math (e.g., linear algebra, real analysis, calculus)
13. Graphical Models (e.g., social networks)
14. Algorithms (e.g., computational complexity, Computer Science theory) and Simulations (e.g., discrete, agent-based, continuous)
15. Bayesian Statistics (e.g., Markov Chain Monte Carlo)

Programming
16. Systems Administration (e.g., UNIX) and Design
17. Database Administration (MySQL, NOSQL)
18. Cloud Management
19. Back-End Programming (e.g., JAVA/Rails/Objective C)
20. Front-End Programming (e.g., JavaScript, HTML, CSS)

Statistics
21. Data Management (e.g., recoding, de-duplicating, integrating disparate data sources, Web scraping)
22. Data Mining (e.g., R, Python, SPSS, SAS) and Visualization (e.g., graphics, mapping, web-based data visualization) tools
23. Statistics and statistical modeling (e.g., general linear model, ANOVA, MANOVA, Spatio-temporal, Geographical Information System (GIS))
24. Science/Scientific Method (e.g., experimental design, research design)
25. Communication (e.g., sharing results, writing/publishing, presentations, blogging)

* List of skills adapted from Analyzing the Analyzers by Harlan D. Harris, Sean Patrick Murphy and Marck Vaisman

Page 6

Proficiency Ratings*

Proficiency Level | Scale Value | Description
Don't know | 0 | You possess no knowledge.
Fundamental Awareness | 20 | You have a common knowledge or an understanding of basic techniques and concepts.
Novice | 40 | You have the level of experience gained in a classroom and/or experimental scenarios or as a trainee on-the-job. You are expected to need help when performing this skill.
Intermediate | 60 | You are able to successfully complete tasks in this competency as requested. Help from an expert may be required from time to time, but you can usually perform the skill independently.
Advanced | 80 | You can perform the actions associated with this skill without assistance. You are certainly recognized within your immediate organization as "a person to ask" when difficult questions arise regarding this skill.
Expert | 100 | You are known as an expert in this area. You can provide guidance, troubleshoot and answer questions related to this area of expertise and the field where the skill is used.

* Rating scale is based on a proficiency rating scale used by NIH. Respondent instructions: You will be asked about your proficiency for a variety of skills. Please use the following scale when indicating your level of proficiency for each skill.
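As an illustration of how ratings on this scale could be turned into numbers, the sketch below maps each proficiency label to its scale value and averages one respondent's values within each skill area. The slides do not say how the study aggregated ratings, so the averaging step, the function name `score_by_area`, and the example respondent are all assumptions made for illustration.

```python
from statistics import mean

# Scale values copied from the proficiency rating table.
PROFICIENCY_SCALE = {
    "Don't know": 0,
    "Fundamental Awareness": 20,
    "Novice": 40,
    "Intermediate": 60,
    "Advanced": 80,
    "Expert": 100,
}


def score_by_area(responses, skill_areas):
    """Average one respondent's scale values within each skill area.

    responses: dict mapping skill name -> proficiency level label.
    skill_areas: dict mapping area name -> list of skill names.
    Skills the respondent skipped are ignored; an area with no
    answers is left out of the result.
    (Averaging is an assumption, not the study's documented method.)
    """
    scores = {}
    for area, skills in skill_areas.items():
        values = [PROFICIENCY_SCALE[responses[s]] for s in skills if s in responses]
        if values:
            scores[area] = mean(values)
    return scores


# Hypothetical respondent and a two-area subset of the 25 skills.
example_areas = {
    "Business": ["Project management", "Budgeting"],
    "Statistics": ["Data Management", "Communication"],
}
example_responses = {
    "Project management": "Advanced",
    "Budgeting": "Novice",
    "Data Management": "Expert",
    "Communication": "Intermediate",
}
example_scores = score_by_area(example_responses, example_areas)
```

Keeping the label-to-value mapping in one dictionary means an unknown label raises a `KeyError` instead of being scored silently.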

Page 7

Sample

Page 8

Proficiency varies across skills

Top 10 Data Science Skills

1. Communication

2. Managing structured data

3. Data mining and visualization tools

4. Science / Scientific method

5. Math

6. Project management

7. Data management

8. Statistics and statistical modeling

9. Product design and development

10. Business development

Page 9

Job Roles in Data Science

*Researcher (e.g., researcher, scientist, statistician); Business Management (e.g., leader, business person, entrepreneur); Creative (e.g., jack of all trades, artist, hacker); Developer (e.g., developer, engineer)

Page 10

Proficiency in 25 skills varies by job role

• Different types of data scientists possess different skills

• Biz Management – strong in business skills

• Developer – strong in technology/programming skills

• Researcher – strong in math/statistics skills

• Creatives – average in all skills

Page 11

Structure of Data Science Skills

* Factor analysis is based on proficiency ratings of 621 data professionals. Reliability estimates (Cronbach’s alpha) for each of the three skill areas (based on items that loaded on the respective factors) were: .87 (Business); .92 (Tech / Prog); .92 (Math / Stats)
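The reliability figures in the footnote are Cronbach's alpha values. As a minimal sketch of how such a value is computed, the function below applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), to a small table of ratings. The function name and the ratings are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of respondents x k item scores.

    ratings: list of rows, one per respondent, each with k item scores.
    """
    k = len(ratings[0])
    if k < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) across respondents.
    item_variances = [variance(item) for item in zip(*ratings)]
    # Variance of each respondent's total score across the k items.
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four respondents, three skills from one area.
hypothetical_ratings = [
    [20, 40, 20],
    [60, 40, 80],
    [100, 80, 100],
    [40, 60, 20],
]
alpha = cronbach_alpha(hypothetical_ratings)
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is used to check that the skills grouped under one factor form a consistent scale.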

Factor Analysis of Data Skills

• Data reduction technique

• Examines the statistical relationships (e.g., correlations) among a large set of variables and tries to explain these correlations using a smaller number of variables (factors)

• Elements (or factor loadings) of the factor pattern matrix represent the strength of relationship between the variables and each of the underlying factors

• Tells us two things:
  1. number of underlying factors that describe the initial set of variables
  2. which variables are best represented by each factor
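To make the idea of a factor loading concrete, here is a rough sketch that builds a correlation matrix from ratings and extracts the loadings of the first factor only. It uses principal-component extraction via power iteration, a simplification chosen for illustration; the slides do not state which extraction or rotation method the study used, and the function names are hypothetical. It assumes no variable is constant across respondents.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation_matrix(rows):
    """Pearson correlations between the columns (variables) of rows."""
    n = len(rows)
    standardized = []
    for col in zip(*rows):
        m, s = mean(col), pstdev(col)  # assumes s > 0 (no constant column)
        standardized.append([(x - m) / s for x in col])
    p = len(standardized)
    return [
        [sum(a * b for a, b in zip(standardized[i], standardized[j])) / n
         for j in range(p)]
        for i in range(p)
    ]


def first_factor_loadings(corr, iterations=500):
    """Loadings of each variable on the first principal component.

    Power iteration finds the dominant eigenvector of the correlation
    matrix; loadings are that eigenvector scaled by sqrt(eigenvalue).
    """
    p = len(corr)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return [x * sqrt(eigenvalue) for x in v], eigenvalue


# Two variables correlated at 0.5: both load equally on the first factor.
loadings, eigenvalue = first_factor_loadings([[1.0, 0.5], [0.5, 1.0]])
```

A loading near 1 means the variable is almost fully represented by the factor. A full analysis would extract several factors and usually rotate them, which this sketch does not attempt.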

Page 12

Structure of Data Science Skills

* Factor analysis is based on proficiency ratings of 621 data professionals. Reliability estimates (Cronbach’s alpha) for each of the three skill areas (based on items that loaded on the respective factors) were: .87 (Business); .92 (Tech / Prog); .92 (Math / Stats)

Plot the factor loadings for the 25 data skills into a 3-dimensional space

Three Distinct Skill Sets
• Business
• Technology / Programming
• Math / Statistics

Page 13

The Structure of Data Science Skills

Page 14

Proficiency in general skill areas varies by job role

Page 15

Business Skills: Proficiency varies by job role

*Researcher (e.g., researcher, scientist, statistician) n = 133; Business Management (e.g., leader, business person, entrepreneur) n = 86; Creative (e.g., jack of all trades, artist, hacker) n = 30; Developer (e.g., developer, engineer) n = 54

Page 16

Technology and Math/Statistics Skills: Proficiency varies by job role

*Researcher (e.g., researcher, scientist, statistician) n = 133; Business Management (e.g., leader, business person, entrepreneur) n = 86; Creative (e.g., jack of all trades, artist, hacker) n = 30; Developer (e.g., developer, engineer) n = 54

Page 17

Top Data Science Skills by Job Role

Page 18

Satisfaction with Work Outcome

*Researcher (e.g., researcher, scientist, statistician); Business Management (e.g., leader, business person, entrepreneur); Creative (e.g., jack of all trades, artist, hacker); Developer (e.g., developer, engineer)

Page 19

In Search of the Data Scientist Unicorn

Page 20

Data Science as a Team Sport: Impact of Business Expert

Page 21

Data Science as a Team Sport: Impact of Technology / Programming Expert

Page 22

Data Science as a Team Sport: Impact of Math & Modeling / Statistics Expert

Page 23

Getting Insight from Data: The Scientific Method

1. Formulate Questions

2. Generate hypothesis / hunch

3. Gather / Generate data

4. Analyze data / Test hypothesis

5. Take action / Communicate results

• Start with a problem statement.

• What are your hunches / hypotheses?

• Be sure your hypotheses are testable.

• You can use an experimental or observational approach to analyzing data.

• Integrate your data silos to ask bigger questions; connect the dots and get a 360-degree view of your customers.

• Employ Predictive analytics / Inferential statistics to test hypotheses

• Employ machine learning to quickly surface insights

• Implement your findings

• Use Prescriptive analytics to guide course of action
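As one example of using inferential statistics to test a hypothesis (step 4 above), the sketch below runs a two-sided permutation test on the difference in group means. The permutation test is chosen here for illustration only; the slides do not name a specific test. The function name, the fixed random seed, and the satisfaction scores are hypothetical.

```python
import random


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns an approximate p-value: the share of random relabelings
    whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical satisfaction scores for two teams.
team_with_expert = [80, 90, 85, 95, 100, 90]
team_without_expert = [40, 50, 45, 35, 55, 50]
p_value = permutation_test(team_with_expert, team_without_expert)
```

A small p-value suggests the observed difference would be unlikely if group membership made no difference, which is the testable form of a hunch such as "teams with a business expert report higher satisfaction".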

Page 24

Scientific Method and Data Science Skills

Page 25

What skills are linked to project success?

Page 26

Importance of Data Science Skills by Job Role

Page 27

Education and Data Science Skills

Page 28

Lack of Gender Diversity

Page 29

Lack of Gender Diversity – Other Science Roles

Page 30

Job Roles in Data Science by Gender

Page 31

Highest Level of Education Attained

Page 32

Gender Comparison of Proficiency across Skills

Page 33

Advice for Data Scientists

• Be specific when talking about “data scientists”

• There are different types – defined by what they do and the skills they possess

• Work with other data professionals who have complementary skills. Teamwork is key to successful data science projects.

• Learn to use data mining and visualization tools

• R, Python, SPSS, SAS, graphics, mapping, web-based data visualization

• Be an advocate for women in the field of data science