data presentation agenda
Part 1: Data Presentation 1-1/35
Data Presentation Agenda
Data and Data Types
Representing Data: pie chart, bar chart
Summarizing Data: box plot, histogram
  Central tendency
  Spread
  Distribution (shape)
Data = A Set of Facts
A picture of some aspect of the world
Pizza Sales by Type
What do the data tell you?
How can you use the information?
What additional information would make these data more informative?
Data Types and Measurement
Quantitative
  Discrete = count: Number of car accidents by city by time
  Continuous = measurement: Housing prices
Qualitative
  Categorical: Shopping mall, car brand, trip mode
  Ordinal: Survey data on attitudes; “How do you feel about…?” Strongly disagree, Disagree, Neutral, Agree, Strongly agree. Moody’s bond ratings: Aaa, Aa, A, Baa, Ba, B, and so on.
Frameworks
  Cross section
  Time series
Problem with Ordered Survey Response Data
Stern Students’ Ranking of Subway Safety (1994)*
Safety                   Count   Percent   Cum Pct
1 Very Unsatisfactory       17     27.87     27.87
2 Unsatisfactory            15     24.59     52.46
3 OK                        17     27.87     80.33
4 Satisfactory              10     16.39     96.72
5 Very Satisfactory          2      3.28    100.00
Total                       61
Is there an objective meaning to “3” on some standard scale?
Does everyone’s “1” or “2” or “3” … mean the same thing?
* Jeff Simonoff: Data Presentation and Summary, pp. 3-4
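The Percent and Cum Pct columns follow directly from the counts. A minimal sketch that rebuilds them, and also picks out the median rating, which uses only the ordering of the responses and not the spacing between them. The function names are illustrative, not from the slides.

```python
# Counts from the subway safety table (ratings 1 through 5).
counts = {1: 17, 2: 15, 3: 17, 4: 10, 5: 2}

def frequency_table(counts):
    """Return (rating, count, percent, cumulative percent) rows."""
    total = sum(counts.values())
    rows = []
    running = 0
    for rating in sorted(counts):
        running += counts[rating]
        rows.append((rating, counts[rating],
                     round(100 * counts[rating] / total, 2),
                     round(100 * running / total, 2)))
    return rows

def median_rating(counts):
    """First rating whose cumulative count reaches half of the responses."""
    total = sum(counts.values())
    running = 0
    for rating in sorted(counts):
        running += counts[rating]
        if 2 * running >= total:
            return rating

table = frequency_table(counts)
for rating, count, pct, cum in table:
    print(f"{rating:>6} {count:>5} {pct:>7.2f} {cum:>7.2f}")
print("median rating:", median_rating(counts))
```

The median depends only on the order of the categories, so it stays meaningful even if respondents do not agree on what a "3" is worth; an average of the codes 1 to 5 would not.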
Representing Data
In raw form
Transformed to a visual form
Summarized graphically
Summarized statistically
Pie Chart
Pizza Pies Sold, by Type (Pie Chart of Percent vs Type)
Type                  Percent
Pepperoni             21.8%
Plain                 32.5%
Mushroom              16.2%
Sausage                5.8%
Pepper and Onion       7.3%
Mushroom and Onion     9.2%
Garlic                 2.3%
Meatball               5.0%
Data Representation
Figure (bar chart): Chart of Number vs Type. Vertical axis: Number, 0 to 4000. Horizontal axis: Type.
Figure (pie chart): Pie Chart of Percent vs Type, for the same eight pizza types.
Same data. Which is easier to understand?
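One way to try the comparison yourself is to redraw the pie chart percentages as bars. A minimal sketch using only the percentages printed on the pie chart; the function name, the text-only bars, and the sort order are choices made here, not something taken from the slides.

```python
# Percent of pizza pies sold, by type, as read from the pie chart labels.
shares = {
    "Pepperoni": 21.8, "Plain": 32.5, "Mushroom": 16.2, "Sausage": 5.8,
    "Pepper and Onion": 7.3, "Mushroom and Onion": 9.2,
    "Garlic": 2.3, "Meatball": 5.0,
}

def text_bar_chart(shares, width=40):
    """Return one line per category, largest share first."""
    largest = max(shares.values())
    lines = []
    for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        # Scale each bar so the largest category fills the full width.
        bar = "#" * round(width * pct / largest)
        lines.append(f"{name:<20}{bar} {pct:.1f}%")
    return lines

chart = text_bar_chart(shares)
print("\n".join(chart))
```

Sorting the bars makes the ranking of the types visible at a glance, which is harder to read off the slices of a pie.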
2013 data. Source: Bloomberg
Raw Data on Housing Prices and Incomes
A Box Plot Describes the Distribution of Values in a Set of Data
Box and Whisker Plot for House Price Listings
Figure: box plot of Average House Listing Price by State. Vertical axis: Listing, 100000 to 900000. The point for Hawaii is labeled.
Making a Box Plot for Per Capita Income
Maximum = 31136
Median = 22610
Minimum = 17043
1st Quartile = 21677
3rd Quartile = 24933
Interquartile Range = IQR = 24933 - 21677 = 3256
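The arithmetic on this slide can be checked in a few lines. A minimal sketch using the five numbers above; the two "tail" quantities are an addition made here to show how far the extremes sit from the box.

```python
# Five-number summary for per capita income, copied from the slide.
minimum, q1, median, q3, maximum = 17043, 21677, 22610, 24933, 31136

iqr = q3 - q1               # height of the box
upper_tail = maximum - q3   # distance from the top of the box to the maximum
lower_tail = q1 - minimum   # distance from the bottom of the box to the minimum

print("IQR =", iqr)
print("upper tail =", upper_tail, "lower tail =", lower_tail)
```

The upper tail is longer than the lower one, a first hint that the incomes are not spread symmetrically around the box.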
Box and Whisker Plot
Median
75th Percentile
25th Percentile
Interquartile range = IQR
Larger of (Minimum, Median - 1.5 IQR)
Smaller of (Maximum, Median + 1.5 IQR)
Outliers
HOG, pp. 39-43
What is an outlier?
Why do we believe a particular point is an outlier?
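The whisker rule can be applied to the per capita income summary from the earlier slide. A minimal sketch: the function and its parameter names are hypothetical, the first call follows the rule as it is written on this slide (measured from the median), and the second call is an assumption added for comparison, the common Tukey-style convention that measures the 1.5 IQR from the quartiles. The rule is taken literally here; plotting software usually ends the whisker at the most extreme data point inside the limit.

```python
def whisker_limits(minimum, maximum, low_ref, high_ref, iqr, k=1.5):
    """Whisker ends: k * IQR beyond the reference points, clipped to the data range."""
    lower = max(minimum, low_ref - k * iqr)
    upper = min(maximum, high_ref + k * iqr)
    return lower, upper

# Per capita income summary from the earlier slide.
minimum, q1, median, q3, maximum = 17043, 21677, 22610, 24933, 31136
iqr = q3 - q1

# Rule as written on the slide: measure from the median.
slide_rule = whisker_limits(minimum, maximum, median, median, iqr)
# Assumption: Tukey-style convention, measuring from the quartiles.
quartile_rule = whisker_limits(minimum, maximum, q1, q3, iqr)

for name, (lower, upper) in [("median-based", slide_rule),
                             ("quartile-based", quartile_rule)]:
    print(name, lower, upper,
          "min beyond whisker:", minimum < lower,
          "max beyond whisker:", maximum > upper)
```

Under either rule the maximum lies beyond the upper whisker, so it would be drawn as a separate point. Whether the minimum is flagged depends on the convention, which is one reason "outlier" needs a definition before it can be argued about.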
A Frequency Distribution
Histogram for House Price Listings
Figure: Histogram of Listing. Horizontal axis: Listing, 200000 to 900000. Vertical axis: Frequency, 0 to 14.
HOG, pp. 16-18
A histogram describes the sample data and suggests the nature of the underlying data generating process. Note the “skewness” of the distribution of listings.
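A histogram is a frequency distribution drawn as bars: choose bins, then count the observations in each. A minimal sketch of the counting step. The listing prices below are hypothetical values made up for illustration, not the data behind the slide, and the function assumes every value is at least `start`.

```python
from collections import Counter

def frequency_distribution(values, start, width):
    """Count the values in each bin [start + i*width, start + (i+1)*width)."""
    counts = Counter((v - start) // width for v in values)
    last_bin = max(counts)
    return [(start + i * width, start + (i + 1) * width, counts.get(i, 0))
            for i in range(last_bin + 1)]

# Hypothetical listing prices, NOT the data behind the slide.
listings = [210000, 230000, 245000, 250000, 260000, 275000, 290000,
            310000, 330000, 360000, 420000, 510000, 880000]

bins = frequency_distribution(listings, start=200000, width=100000)
for low, high, n in bins:
    print(f"{low:>7}-{high:<7} {'*' * n}")
```

Most of the counts pile up in the lowest bins and a single value sits far to the right, the same kind of lopsided shape the slide calls skewness.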
Distribution of House Price Listings
Figure: Histogram of Listing (horizontal axis: Listing, 200000 to 900000; vertical axis: Frequency, 0 to 14), shown with the box plot of Average House Listing Price by State (vertical axis: Listing, 100000 to 900000).
Asymmetry (skewness) in the histogram of listing prices…
… shows up in the box and whisker plot. Note the long whisker at the top of the figure.
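Skewness can also be checked numerically. A minimal sketch that compares the mean with the median and computes the moment coefficient of skewness (the average cubed standardized deviation). The listing prices are hypothetical values with a long right tail, not the data behind the slide, and the choice of this particular skewness measure is an assumption made here.

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Moment coefficient of skewness: average cubed standardized deviation."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical listing prices with a long right tail, NOT the slide's data.
listings = [210000, 230000, 245000, 250000, 260000, 275000, 290000,
            310000, 330000, 360000, 420000, 510000, 880000]

print("mean   =", round(mean(listings)))
print("median =", median(listings))
print("skew   =", round(skewness(listings), 2))
```

A mean above the median and a positive coefficient both point to a long upper tail, which is what the long top whisker shows graphically.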
A Caution About Graphical Data Summaries
Graphical tools can be very badly behaved when:
(1) The data have only a few observations.
(2) There are wild observations in the data set.
The box and whisker plot is distorted (and dominated) by one wildly errant observation.
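The effect of a single wild observation is easy to reproduce. A minimal sketch with hypothetical numbers (not the data on the slide): one value is replaced by a deliberate data-entry error, and the summaries are recomputed. Quartiles use the "inclusive" method of Python's `statistics.quantiles`; other quartile conventions give slightly different values.

```python
from statistics import mean, quantiles

def summary(values):
    """Mean, median, IQR, and range of a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"mean": mean(values), "median": q2, "iqr": q3 - q1,
            "range": max(values) - min(values)}

# Hypothetical incomes (thousands); the last value in `dirty` is a deliberate error.
clean = [17, 19, 21, 22, 22, 23, 24, 25, 27, 31]
dirty = clean[:-1] + [3100]

for name, data in [("clean", clean), ("dirty", dirty)]:
    print(name, summary(data))
```

The median and IQR do not move, but the range explodes. Since the plot's axis must stretch to cover the range, the box is squeezed into a thin line at the bottom and the errant point dominates the picture.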
Summary
What story does the data presentation tell?
  Data in raw form tell no story.
  Visual representation of data tells something about the data.
Data reduction and summary representation: What do we learn?
  Location
  Spread
  Shape of the distribution
What tool is most informative?
  Reduction to a small number of features
  Visual displays of data: pie chart, box and whisker plots, histograms, time series plots
“There are lies, damned lies and statistics.” (Benjamin Disraeli)
The Visual Data Do Tell the Story: Napoleon’s March to Moscow
Source: Bloomberg. August 2013
Probability of Survival to Age 50, Female at Birth
U.S. and 20 Other Wealthy Countries