data analysis toolkit_final v1.0


Upload: leeanderson40

Post on 08-Feb-2017


Page 1: Data Analysis Toolkit_Final v1.0

EAI Confidential

Epson Data Analysis and Decision-Making Toolkit

An unwavering commitment to drive innovation and performance

EPSON INNOVATION ENGINE

Version 1.0 – January, 2016

Data Analysis and Decision Making Process

The four steps below show how to develop a plan to turn data into actionable or impactful results. Each step aligns with a problem-solving (DMAIC) phase.

Define Problem and Data Collection Plan (DMAIC phase: Define)

• Define the business problem
• Select exploratory or hypothesis-driven approach
• Define problem statement or hypothesis to test
• Identify sources of data / information to explore or test the hypothesis
• Think with the end in mind

Collect, Validate and Clean Data (DMAIC phase: Measure)

• Collect the data in the format needed for analysis
• Validate and test the data – make sure it is correct
• Clean data and format as required – most commonly as a flat file that can be used in Excel

Interpret the Data (DMAIC phase: Analyze)

• Interpret the data (e.g. whether it supports or does not support your hypothesis)
• This step may be iterative; however, it will be insightful
• Once confident in interpretation, brainstorm ways to improve the situation (further analysis may be needed)

Develop Recommendations or Make Decisions (DMAIC phase: Improve)

• Visually depict data to support your conclusions
• Create recommendations or alternatives if required
• Make decisions and proceed with experiment – fail fast and adjust
• Monitor any improvement by continually collecting and measuring data on the process or problem
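The "Collect, Validate and Clean Data" step can be illustrated with a short Python sketch. It is not part of the toolkit: the column names (`region`, `units_sold`), the sample rows, and the validation rules are all hypothetical, chosen only to show what validating and cleaning a flat file might look like.

```python
import csv
import io

# Hypothetical raw export; the columns and values are illustrative only.
RAW = """region,units_sold
West, 120
East,95
,40
North,abc
South,-5
"""

def clean_rows(text):
    """Return (clean, rejected) lists from CSV text.

    A row is kept only if region is non-empty and units_sold is a
    non-negative whole number (assumed rules for this sketch).
    """
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        region = (row["region"] or "").strip()
        units = (row["units_sold"] or "").strip()
        if region and units.isdigit():
            clean.append({"region": region, "units_sold": int(units)})
        else:
            rejected.append(row)
    return clean, rejected

clean, rejected = clean_rows(RAW)
print(len(clean), "clean rows;", len(rejected), "rejected rows")
```

Keeping the rejected rows, rather than silently dropping them, makes it possible to go back to the data source and check why they failed.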

Discovery (Exploratory) vs Hypothesis-driven Research

A problem is a deviation from a standard or expectation:

[Figure: actual performance from past to present, with the deviation shown as the gap between actual performance and expected/desired performance]

Investigation approaches:

• Hypothesis-based method: “I have an idea, let me verify.” Begins with a proposition by the user, who then seeks to validate the truthfulness of the proposition.

• Discovery-based method: “I have no clue, let me explore.” Finds patterns, associations, and relationships among the data in order to uncover facts that were previously unknown or not even contemplated by an organization.

Often, some preliminary research (discovery) is needed in order to create a hypothesis.

Hypothesis-Testing

1. Pain & Suffering (Symptoms)

Symptoms
- Low energy
- Headaches
- Fever

Impact
- Can’t do my job
- Can’t exercise
- Can’t take care of family

2. Analysis & Interpretation (Potential Causes)

Hypotheses
- Cold?
- Flu?
- Tuberculosis?

3. Testing & Proof (Root Causes)

Hypothesis Testing
- Virus A
- Virus B
- Virus C

Hypothesis-Testing with Logic Trees

Postulate an overall hypothesis as to the solution, with the minimum efficient rationale to validate it.

1. State the problem
2. Generate hypotheses
3. Keep decomposing them
4. Prioritize them for analysis

Each hypothesis:

• Can be proven right or wrong
• Is not obvious
• Points directly to an action or actions you can take

Example logic tree:

How can Epson increase sales-force productivity?
  • Epson can increase selling time as a proportion of total available time
      • Epson can transfer or outsource many non-value-added tasks (e.g. admin)
      • Epson can reduce or eliminate many non-value-added tasks (e.g. travel, error correction)
  • Epson can increase sales volumes from available selling time
      • Epson can improve generation of sales leads
      • Epson can improve the proportion of leads converted to sales
          • Epson can improve the sales conversion skills of the sales force
          • Epson can provide the sales force with better tools for lead conversion
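A logic tree like the sales-force example is a nested structure, and the leaves are the hypotheses specific enough to test. The Python sketch below is one possible way to represent it. The `Hypothesis` class and `leaves` method are hypothetical names, the statements are shortened from the slide, and the nesting reflects an assumed reading of the slide's layout.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    statement: str
    children: list = field(default_factory=list)

    def leaves(self):
        """Return the statements that have not been decomposed further."""
        if not self.children:
            return [self.statement]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found

# Root = the problem (step 1); lower levels = hypotheses (steps 2 and 3).
tree = Hypothesis("How can Epson increase sales-force productivity?", [
    Hypothesis("Increase selling time as a proportion of total available time", [
        Hypothesis("Transfer or outsource non-value-added tasks (e.g. admin)"),
        Hypothesis("Reduce or eliminate non-value-added tasks (e.g. travel, error correction)"),
    ]),
    Hypothesis("Increase sales volumes from available selling time", [
        Hypothesis("Improve generation of sales leads"),
        Hypothesis("Improve the proportion of leads converted to sales", [
            Hypothesis("Improve the sales conversion skills of the sales force"),
            Hypothesis("Provide the sales force with better tools for lead conversion"),
        ]),
    ]),
])

# The leaves are the candidates to prioritize for analysis (step 4).
for leaf in tree.leaves():
    print("-", leaf)
```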

Our Core Data Analysis and Visualization Tools

• Descriptive Statistics: Summarize large amounts of data so that the main features of the data can be easily understood.

• Data Relationships: Scatter Plot & Correlation: Identifies and visually displays the relationship between two variables.

• Data Collection Techniques: Systematically gather information to be analyzed in order to develop a deeper understanding of an issue.

• Data Grouping: 2x2 Matrix: Categorizes items into a 2x2 matrix using two variables in order to clarify the desirability of options and simplify decision making.

• Data Distribution: Histogram & Pareto Chart: Graphically displays data grouped into ranges or categories so that the frequency/quantity of each one can be better analyzed.

• Data Trends: Trend (Run) Charts: Graphically displays data over time to identify process trends, cycles, changes/shifts, abnormalities, or problems.

Descriptive Statistics

Why use these tools?

To synthesize large amounts of data so that it can be presented in a quantitative, easy-to-understand manner (either numerically or graphically). The most popular descriptive statistics show the central tendency (mean, median, and mode) and the spread of the data (standard deviation and variance). Note that descriptive statistics only describe/summarize data. More advanced statistics are needed for hypothesis testing.

What results can you expect?

• Make data easier to understand and share
• Develop a deeper understanding of the issue(s) at hand
• Point the direction for further investigation and analysis
• Provide a basis for more advanced statistical analysis

Further Learning

• Creating Descriptive Statistics in Excel
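The measures of central tendency and spread named on this slide can be computed with Python's standard `statistics` module. The sketch below is only an illustration: the sample values (days taken to close ten support tickets) are hypothetical.

```python
import statistics

# Hypothetical sample: days taken to close ten support tickets.
days = [2, 3, 3, 4, 5, 5, 5, 6, 8, 9]

summary = {
    "mean": statistics.mean(days),
    "median": statistics.median(days),
    "mode": statistics.mode(days),
    "variance": statistics.variance(days),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(days),      # sample standard deviation
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

`variance` and `stdev` treat the data as a sample. If the data covers the whole population, `statistics.pvariance` and `statistics.pstdev` are the corresponding functions.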

Data Collection Techniques

Why use this tool?

To systematically gather information in order to develop a deeper understanding of an issue and answer relevant questions. Typical data collection methods include observations (note taking and check sheets), surveys (questionnaires), interviews, and focus groups.

What results can you expect?

• Develop a better understanding of the issue from new perspectives
• Identify and validate beliefs
• Generate and test hypotheses
• Discover previously unknown factors
• Improve the probability of developing effective solutions

Further Learning

• Surveys
• Check Sheet
• Qualitative vs Quantitative Data

Data Distribution: Histogram

Why use this tool?

To graphically illustrate a distribution of numerical data by grouping data into ranges (bins) with the frequencies shown as vertical bars. Histograms are frequently used when there is a large data set.

What results can you expect?

• Graphically display the distribution of a data set
• Quickly identify ranges with unusually high or low frequencies
• Point the direction for further research
• Communicate data to stakeholders in a simple format

Further Learning

• Dot Plots
• Box & Whisker Plots
• Comparing dot plots, histograms, and box plots
• Creating Histograms in Excel
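The binning behind a histogram can be sketched in a few lines of Python. The data (order processing times in minutes) and the bin width of 10 are hypothetical choices for illustration. The bars are printed horizontally as text rather than drawn as vertical bars.

```python
from collections import Counter

# Hypothetical data: order processing times in minutes.
times = [12, 15, 17, 21, 22, 23, 24, 26, 28, 31, 33, 35, 41, 47, 58]
BIN_WIDTH = 10  # assumed bin width for this sketch

def histogram(values, width):
    """Count values per bin; each key is the lower edge of a bin."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

bins = histogram(times, BIN_WIDTH)
for lower, count in bins.items():
    # One '#' per observation in the bin.
    print(f"{lower:>3}-{lower + BIN_WIDTH - 1:<3} {'#' * count}")
```

Bins that contain no values are simply absent from the result. The choice of bin width changes the shape of the picture, so it is worth trying more than one.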

Data Distribution: Pareto Chart

Why use this tool?

To graphically identify the key issues of a problem by following the 80/20 rule: 80% of the effects can often be attributed to 20% of the causes. The Pareto chart is a combination of a histogram/bar chart and a line chart. The histogram/bar chart shows the frequency of the items/events in descending order of magnitude, while the line chart shows the cumulative frequency.

What results can you expect?

• Identify the major issues that need to be addressed or further investigated
• Focus analysis where it will have the greatest impact
• Communicate data to stakeholders in a simple format

Further Learning

• Creating Pareto Charts in Excel
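The two series of a Pareto chart, counts in descending order and the cumulative percentage, can be computed as in the Python sketch below. The defect categories and counts are hypothetical, and `pareto_table` is an illustrative helper name.

```python
# Hypothetical defect counts by category.
defects = {"Scratches": 42, "Misalignment": 27, "Wrong label": 15,
           "Missing part": 10, "Other": 6}

def pareto_table(counts):
    """Return (category, count, cumulative %) rows, largest count first."""
    total = sum(counts.values())
    rows, running = [], 0
    for category, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((category, count, 100 * running / total))
    return rows

rows = pareto_table(defects)
for category, count, cum_pct in rows:
    print(f"{category:<14}{count:>4}{cum_pct:>8.1f}%")
```

The count column corresponds to the bars and the cumulative percentage column to the line. Reading down the cumulative column shows how few categories account for most of the total.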

Data Relationships: Scatter Plot (Diagram)

Why use this tool?

To graphically display the relationship between two variables. Each variable is plotted on one axis of an XY plot. If the variables are correlated (i.e. a relationship exists), the points will form a pattern. The shape of the pattern indicates the type of relationship between the variables. Better-defined patterns indicate stronger relationships.

What results can you expect?

• Indicate the type and strength of relationship between two variables
• Eliminate unimportant variables from further analysis
• Communicate data to stakeholders in a simple format

Further Learning

• Correlation Analysis
• Regression Analysis
• Creating Scatter Plots in Excel

[Figure: example scatter plot, Vehicle Price vs Age]
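The strength of a linear relationship seen in a scatter plot is commonly summarized with the Pearson correlation coefficient, which ranges from -1 to 1. The Python sketch below computes it by hand. The age and price values are hypothetical, loosely in the spirit of the vehicle price vs age example, and are not taken from the toolkit.

```python
import math

# Hypothetical data: vehicle age in years and asking price in dollars.
age_years = [1, 2, 3, 4, 5, 6, 7, 8]
price_usd = [24000, 21500, 19000, 17500, 15000, 13500, 12000, 10500]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson_r(age_years, price_usd)
print(f"r = {r:.3f}")
```

A value near -1 or 1 indicates a strong linear pattern and a value near 0 a weak one. A strong correlation does not by itself show that one variable causes the other.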

Data Grouping: 2x2 Matrix

Why use this tool?

To categorize items using two data variables in order to clarify the options and simplify decision making. Generally, the matrix is structured so that the least desirable options fall into the lower left quadrant and the most desirable options fall into the upper right quadrant.

What results can you expect?

• Rapidly sort options into categories to facilitate decision making
• Organize data into memorable categories or groups
• Assess the situation using more than a single variable
• Communicate data to stakeholders in a simple format

Further Learning

• Cluster Analysis
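Sorting items into a 2x2 matrix amounts to comparing each item's two scores against a cut-off. In the Python sketch below, the project ideas, the two variables (`impact` and `ease`), the 1-10 scale, and the cut-off of 5 are all hypothetical. The axes are oriented so that the most desirable options land in the upper right quadrant.

```python
# Hypothetical project ideas scored 1-10 on two variables.
ideas = {
    "Automate reporting": {"impact": 8, "ease": 7},
    "Redesign intake form": {"impact": 3, "ease": 9},
    "Replace ERP module": {"impact": 9, "ease": 2},
    "Reorder supply closet": {"impact": 2, "ease": 3},
}
THRESHOLD = 5  # assumed cut-off between "low" and "high"

def quadrant(scores, cutoff=THRESHOLD):
    """Name the quadrant for one item (x axis = ease, y axis = impact)."""
    vertical = "upper" if scores["impact"] > cutoff else "lower"
    horizontal = "right" if scores["ease"] > cutoff else "left"
    return f"{vertical} {horizontal}"

matrix = {name: quadrant(scores) for name, scores in ideas.items()}
for name, cell in matrix.items():
    print(f"{cell:<12} {name}")
```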

Data Trends: Trend (Run) Charts

Why use these tools?

To graphically display time series data (data sequenced over time). The horizontal axis displays time and the vertical axis displays the values of the data. Trend/run charts are often used to identify process trends, cycles, changes/shifts, abnormalities, or problems.

What results can you expect?

• Identify trends, changes, or abnormalities with processes
• Increase the understanding of processes
• Determine if a process change resulted in improved process performance
• Determine if improved performance has been maintained
• Monitor and compare processes

Further Learning

• Statistical Process Control
• Control Charts
• Creating Trend/Run Charts in Excel
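One common way to look for a shift in a run chart is to draw the median as a center line and count how many consecutive points fall on the same side of it. The Python sketch below illustrates that idea. The weekly counts are hypothetical, and the threshold of 6 consecutive points is an assumption, since published run-chart guidelines differ on the exact number.

```python
import statistics

# Hypothetical weekly defect counts; a process change is assumed after week 8.
weekly = [14, 12, 15, 13, 16, 14, 15, 13, 9, 8, 10, 7, 9, 8, 10, 9]

def longest_run(values, center):
    """Length of the longest streak of consecutive points on one side of center.

    Points exactly on the center line are skipped and do not break a run.
    """
    best = current = 0
    last_side = 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side == 0:
            continue
        current = current + 1 if side == last_side else 1
        last_side = side
        best = max(best, current)
    return best

median = statistics.median(weekly)
run = longest_run(weekly, median)
SHIFT_RULE = 6  # assumed threshold; run-chart guides differ on the exact number
print(f"median={median}, longest run={run}, shift suspected={run >= SHIFT_RULE}")
```

A long run on one side of the median suggests that the process level has changed, which is the kind of change/shift the chart is meant to surface.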

Sources

Descriptive Statistics
https://www.youtube.com/watch?v=Mpl_v96dlfg
https://www.khanacademy.org/math/probability/descriptive-statistics
https://www.youtube.com/watch?v=MhDH9jsyzBA

Data Collection Techniques
http://www.sciencebuddies.org/science-fair-projects/project_ideas/Soc_survey.shtml
http://qualityamerica.com/LSS-Knowledge-Center/qualityimprovementtools/check_sheets.php
http://regentsprep.org/regents/math/algebra/ad1/qualquant.htm
http://blog.socialcops.com/resources/4-data-collection-techniques-ones-right
https://www.youtube.com/watch?v=B2nmh_kEF98

Data Distribution: Histogram
https://www.khanacademy.org/math/cc-sixth-grade-math/cc-6th-data-statistics/dot-plot/v/frequency-tables-and-dot-plots
https://www.khanacademy.org/math/probability/descriptive-statistics/box-and-whisker-plots/v/reading-box-and-whisker-plots
https://www.khanacademy.org/math/cc-sixth-grade-math/cc-6th-data-statistics/cc-7th-compare-data-displays/v/comparing-dot-plots-histograms-and-box-plots
https://www.youtube.com/watch?v=YYRkWKJIc9k
https://www.moresteam.com/toolbox/histogram.cfm
https://www.youtube.com/watch?v=gSEYtAjuZ-Y

Data Distribution: Pareto Chart
https://www.youtube.com/watch?v=i_XZzady-dQ
http://asq.org/learn-about-quality/cause-analysis-tools/overview/pareto.html
https://www.youtube.com/watch?v=GVGdtlnZ7xM

Data Relationships: Scatter Plot (Diagram)
https://explorable.com/statistical-correlation
https://www.moresteam.com/toolbox/regression-analysis.cfm
https://www.youtube.com/watch?v=uvJNfRmfAys
http://asq.org/learn-about-quality/cause-analysis-tools/overview/scatter.html
https://www.youtube.com/watch?v=CWnfwZRAuaY

Data Grouping: 2x2 Matrix
https://www.youtube.com/watch?v=zqKFH7WNmfE
https://www.youtube.com/watch?v=PLr3CT79pSc

Data Trends: Trend (Run) Charts
https://www.moresteam.com/toolbox/statistical-process-control-spc.cfm
http://asq.org/learn-about-quality/data-collection-analysis-tools/overview/control-chart.html
https://www.youtube.com/watch?v=jWlM9z8iFZI
https://www.moresteam.com/toolbox/trend-chart.cfm
https://www.youtube.com/watch?v=YQd1QoMHYwU

Innovation Engine Toolkit Series

• PROBLEM-SOLVING
• TEAM LEADERSHIP
• CHANGE MANAGEMENT
• DATA ANALYSIS
• PROJECT MANAGEMENT
• COMMUNICATION