sbe10_02b [read-only] [compatibility mode]


Upload: vu-duc-hoang-vo

Post on 16-Apr-2017


TRANSCRIPT

Page 1: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 1 © 2005 Thomson/South-Western

Chapter 2
Descriptive Statistics: Tabular and Graphical Presentations
Part B

■ Exploratory Data Analysis

■ Crosstabulations and Scatter Diagrams

Slide 2

Exploratory Data Analysis

■ The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.

■ One such technique is the stem-and-leaf display.

Page 2: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 3

Stem-and-Leaf Display

■ A stem-and-leaf display shows both the rank order and shape of the distribution of the data.

■ It is similar to a histogram on its side, but it has the advantage of showing the actual data values.

■ The first digits of each data item are arranged to the left of a vertical line.

■ To the right of the vertical line we record the last digit for each item in rank order.

■ Each line in the display is referred to as a stem.

■ Each digit on a stem is a leaf.

Slide 4

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Page 3: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 5

Example: Hudson Auto Repair

■ Sample of Parts Cost for 50 Tune-ups

91 78 93 57 75 52 99 80 97 62

71 69 72 89 66 75 79 75 72 76

104 74 62 68 97 105 77 65 80 109

85 97 88 68 83 68 71 69 67 74

62 82 98 101 79 105 79 69 62 73

Slide 6

Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Callouts on the slide point out a stem and a leaf.
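The display above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `stem_and_leaf` is our own, and it assumes whole-number data with a leaf unit of 1.

```python
from collections import defaultdict

# Parts costs for the 50 tune-ups, copied from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Return {stem: leaves in rank order} for whole numbers (leaf unit 1)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # last digit is the leaf, the rest is the stem
        rows[stem].append(leaf)
    return dict(rows)


display = stem_and_leaf(costs)
for stem in sorted(display):
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in display[stem])}")
```

Sorting the values first is what puts the leaves on each stem in rank order.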

Page 4: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 7

Stretched Stem-and-Leaf Display

■ If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).

■ Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 to 4, and the second value corresponds to leaf values of 5 to 9.

Slide 8

Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
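The stretching rule (leaves 0 to 4 on the first copy of a stem, leaves 5 to 9 on the second) can be sketched in Python as follows. The function name `stretched_stem_and_leaf` is hypothetical, and this simple version skips a half-stem that has no leaves, which does not occur in the Hudson data.

```python
# Parts costs for the 50 tune-ups, copied from the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Return (stem, leaves) rows, split into leaves 0-4 and leaves 5-9."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1  # 0 = first copy of the stem, 1 = second copy
        rows.setdefault((stem, half), []).append(leaf)
    # Sorting by (stem, half) lists the 0-4 row before the 5-9 row for each stem.
    return [(stem, leaves) for (stem, _half), leaves in sorted(rows.items())]


rows = stretched_stem_and_leaf(costs)
for stem, leaves in rows:
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```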

Page 5: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 9

Stem-and-Leaf Display

■ Leaf Units

• A single digit is used to define each leaf.

• In the preceding example, the leaf unit was 1.

• Leaf units may be 100, 10, 1, 0.1, and so on.

• Where the leaf unit is not shown, it is assumed to equal 1.

Slide 10

Example: Leaf Unit = 0.1

If we have data with values such as

8.6 11.7 9.4 9.1 10.2 11.0 8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1

 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Page 6: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 11

Example: Leaf Unit = 10

If we have data with values such as

1806 1717 1974 1791 1682 1910 1838

a stem-and-leaf display of these data will be

Leaf Unit = 10

16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
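Both leaf-unit examples follow the same recipe: divide each value by the leaf unit, drop anything smaller than the leaf unit, then split off the last digit as the leaf. Here is a hedged Python sketch; the function name `stem_and_leaf_with_unit` is our own, and `Decimal` is used only to avoid floating-point surprises such as 8.6 / 0.1 not coming out as exactly 86.

```python
from decimal import Decimal


def stem_and_leaf_with_unit(values, leaf_unit):
    """Return {stem: leaves in rank order} for the given leaf unit."""
    unit = Decimal(str(leaf_unit))
    rows = {}
    for v in sorted(values):
        # int() truncates, so digits smaller than the leaf unit are dropped
        # (1682 with leaf unit 10 becomes 168: stem 16, leaf 8).
        scaled = int(Decimal(str(v)) / unit)
        stem, leaf = divmod(scaled, 10)
        rows.setdefault(stem, []).append(leaf)
    return rows


tenths = stem_and_leaf_with_unit([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], 0.1)
tens = stem_and_leaf_with_unit([1806, 1717, 1974, 1791, 1682, 1910, 1838], 10)

for label, display in (("Leaf Unit = 0.1", tenths), ("Leaf Unit = 10", tens)):
    print(label)
    for stem in sorted(display):
        print(f"{stem:>2} | {' '.join(map(str, display[stem]))}")
```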

Slide 12

Crosstabulations and Scatter Diagrams

■ Thus far we have focused on methods that are used to summarize the data for one variable at a time.

■ Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.

■ Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Page 7: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 13

Crosstabulation

■ A crosstabulation is a tabular summary of data for two variables.

■ Crosstabulation can be used when:

• one variable is qualitative and the other is quantitative,

• both variables are qualitative, or

• both variables are quantitative.

■ The left and top margin labels define the classes for the two variables.

Slide 14

Crosstabulation

■ Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

                              Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000             18         6        19        12        55
> $99,000             12        14        16         3        45
Total                 30        20        35        15       100

Price Range is the quantitative variable; Home Style is the qualitative variable.
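The Total row and Total column of the crosstabulation are just sums over the cell counts. A minimal Python sketch, using the counts from the slide (the variable names are our own):

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation, one list per price range.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

row_totals = {price: sum(cells) for price, cells in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

# Print the table with its margins.
print(f"{'Price Range':<14}" + "".join(f"{s:>10}" for s in styles) + f"{'Total':>10}")
for price, cells in counts.items():
    print(f"{price:<14}" + "".join(f"{c:>10}" for c in cells) + f"{row_totals[price]:>10}")
print(f"{'Total':<14}" + "".join(f"{t:>10}" for t in column_totals) + f"{grand_total:>10}")
```

The row totals give the frequency distribution for price, and the column totals give the frequency distribution for home style.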

Page 8: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 15

Crosstabulation

■ Insights Gained from Preceding Crosstabulation

• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.

• Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Slide 16

Crosstabulation

                              Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000             18         6        19        12        55
> $99,000             12        14        16         3        45
Total                 30        20        35        15       100

The Total column is the frequency distribution for the price variable.

The Total row is the frequency distribution for the home style variable.

Page 9: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 17

Crosstabulation: Row or Column Percentages

■ Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Slide 18

Crosstabulation: Row Percentages

                              Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000          32.73     10.91     34.55     21.82       100
> $99,000          26.67     31.11     35.56      6.67       100

(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100

Note: row totals are actually 100.01 due to rounding.
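The row percentages can be checked by dividing each cell by its row total. A short Python sketch using the Finger Lakes counts (variable names are our own; rounding to two decimals matches the slide):

```python
# Cell counts in the order Colonial, Log, Split, A-Frame.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row percentage = cell count / row total x 100.
row_percentages = {
    price: [round(100 * cell / sum(cells), 2) for cell in cells]
    for price, cells in counts.items()
}

for price, percentages in row_percentages.items():
    print(price, percentages, "sum of rounded values:", round(sum(percentages), 2))
```

For example, the Colonial entry in the > $99,000 row is (12/45) x 100, which rounds to 26.67.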

Page 10: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 19

Crosstabulation: Column Percentages

                              Home Style
Price Range     Colonial       Log     Split   A-Frame
≤ $99,000          60.00     30.00     54.29     80.00
> $99,000          40.00     70.00     45.71     20.00
Total                100       100       100       100

(Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100
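Column percentages work the same way, except each cell is divided by its column total. A short Python sketch using the Finger Lakes counts (variable names are our own):

```python
# Cell counts in the order Colonial, Log, Split, A-Frame.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

column_totals = [sum(column) for column in zip(*counts.values())]

# Column percentage = cell count / column total x 100.
column_percentages = {
    price: [round(100 * cell / total, 2) for cell, total in zip(cells, column_totals)]
    for price, cells in counts.items()
}

for price, percentages in column_percentages.items():
    print(price, percentages)
```

For example, the Colonial entry in the > $99,000 row is (12/30) x 100 = 40.00.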

Slide 20

Crosstabulation: Simpson's Paradox

■ Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation.

■ We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation.

■ Simpson's Paradox: In some cases the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data.
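A small numeric illustration may help. The counts below are hypothetical, invented purely for this sketch and not taken from the slides or the textbook: option A has the higher success rate within each group, yet option B has the higher rate once the two groups are aggregated.

```python
# HYPOTHETICAL counts, invented for illustration only: (successes, trials).
results = {
    "A": {"group 1": (8, 10), "group 2": (30, 90)},
    "B": {"group 1": (63, 90), "group 2": (2, 10)},
}


def rate(successes, trials):
    """Success rate for one cell of the unaggregated data."""
    return successes / trials


def aggregated_rate(by_group):
    """Success rate after adding the groups together."""
    successes = sum(s for s, _ in by_group.values())
    trials = sum(t for _, t in by_group.values())
    return successes / trials


for group in ("group 1", "group 2"):
    a = rate(*results["A"][group])
    b = rate(*results["B"][group])
    print(f"{group}: A = {a:.1%}, B = {b:.1%}")

print(f"aggregated: A = {aggregated_rate(results['A']):.1%}, "
      f"B = {aggregated_rate(results['B']):.1%}")
```

The reversal happens because the two options are spread very unevenly across the groups: most of A's trials fall in the group with low success rates, and most of B's fall in the group with high success rates.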

Page 11: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 21

Scatter Diagram and Trendline

■ A scatter diagram is a graphical presentation of the relationship between two quantitative variables.

■ One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.

■ The general pattern of the plotted points suggests the overall relationship between the variables.

■ A trendline is an approximation of the relationship.

Slide 22

Scatter Diagram

■ A Positive Relationship

[Figure: scatter diagram of y against x]

Page 12: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 23

Scatter Diagram

■ A Negative Relationship

[Figure: scatter diagram of y against x]

Slide 24

Scatter Diagram

■ No Apparent Relationship

[Figure: scatter diagram of y against x]

Page 13: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 25

Example: Panthers Football Team

■ Scatter Diagram

The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30

Slide 26

Scatter Diagram

[Figure: scatter diagram of Number of Points Scored (y, axis from 0 to 35) against Number of Interceptions (x, axis from 0 to 4)]

Page 14: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 27

Example: Panthers Football Team

■ Insights Gained from the Preceding Scatter Diagram

• The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.

• Higher points scored are associated with a higher number of interceptions.

• The relationship is not perfect; not all plotted points in the scatter diagram are on a straight line.
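One way to make the positive relationship concrete is to fit a trendline to the five Panthers data points. The slides do not say how the trendline is obtained; the sketch below assumes an ordinary least-squares line, and the function name `least_squares_line` is our own.

```python
# Panthers data from the slide: x = interceptions, y = points scored.
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 30]


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(interceptions, points)
print(f"trendline: y = {intercept:.2f} + {slope:.2f} x")

# Residuals show that the points do not all lie exactly on the line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(interceptions, points)]
print("residuals:", [round(r, 2) for r in residuals])
```

Under this assumption the slope is about 5.75 points per interception and the intercept about 9.1. The positive slope matches the first insight, and the nonzero residuals match the last.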

Slide 28

Tabular and Graphical Procedures

Data

■ Qualitative Data

Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• Percent Freq. Distribution
• Crosstabulation

Graphical Methods
• Bar Graph
• Pie Chart

■ Quantitative Data

Tabular Methods
• Frequency Distribution
• Rel. Freq. Dist.
• Cum. Freq. Dist.
• Cum. Rel. Freq. Distribution
• Stem-and-Leaf Display
• Crosstabulation

Graphical Methods
• Dot Plot
• Histogram
• Ogive
• Scatter Diagram

Page 15: SBE10_02b [Read-Only] [Compatibility Mode]

Slide 29

End of Chapter 2, Part B