econ 214 elements of statistics for economists
TRANSCRIPT
College of Education
School of Continuing and Distance Education 2014/2015 – 2016/2017
ECON 214
Elements of Statistics for
Economists
Session 2 – Presentation of Data: Graphical Methods
Lecturer: Dr. Bernardin Senadza, Dept. of Economics
Contact Information: [email protected]
Session Overview
• The aim of descriptive statistical methods is to present information in a clear, concise and accurate manner. This session illustrates methods for summarizing data informatively, using tables and graphs.
• At the end of the session, the student will
– Be able to organize data into a frequency distribution
– Be able to portray a frequency distribution as a histogram, frequency polygon, and cumulative frequency polygon
– Be able to present data using bar charts, line graphs and pie charts
Slide 2
Session Outline
The key topics to be covered in the session are as follows:
• Frequency distributions
• Graphic presentation of frequency distributions
• Other graphic forms of presenting data
Slide 3
Reading List
• Michael Barrow, “Statistics for Economics, Accounting and Business Studies”, 4th Edition, Pearson
• R.D. Mason, D.A. Lind, and W.G. Marchal, “Statistical Techniques in Business and Economics”, 10th Edition, McGraw-Hill
Slide 4
FREQUENCY DISTRIBUTIONS Topic One
Slide 5
FREQUENCY DISTRIBUTIONS
• Raw data in itself is meaningless unless it can be presented in an informative way.
• Descriptive statistics summarizes raw information in a comprehensible way.
• We may use tables, graphs and/or numeric values.
• A frequency distribution is a grouping of data into mutually exclusive categories, showing the number of observations in each category.
Slide 6
FREQUENCY DISTRIBUTIONS
• The purpose of constructing a frequency distribution
is to condense raw data into a table that is more
readily comprehended.
• Although we lose the identification of the specific
value of each measurement, the advantage of a
frequency distribution is that such a table makes it
easier to interpret the reported values.
Slide 7
FREQUENCY DISTRIBUTIONS
• Some definitions are worth knowing:
– Class limits: They are the smallest and largest observations (values) in each class of a frequency distribution. Each class has two limits; we have the lower class limit and the upper class limit.
– Class frequency: The number of observed values in each class.
– Class boundaries: They denote specific points along a measurement scale separating adjoining classes. When class limits are whole numbers one unit apart, the lower class boundary is obtained by subtracting 0.5 from the lower class limit, while the upper class boundary is obtained by adding 0.5 to the upper class limit.
Slide 8
FREQUENCY DISTRIBUTIONS
– Class interval or width (or size): The number of measurement units or range of values included in each class. It is obtained as upper boundary minus lower boundary. It is also obtained by subtracting the lower limit of a class from the lower limit of the next class.
– Class mark or midpoint: The value that divides a class into two equal parts. This is the simple average of the upper and lower class limits.
• Note that in some frequency distributions, class boundaries and class limits overlap.
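The definitions above can be checked on a single class. The following is a minimal sketch, assuming whole-number class limits (so that boundaries lie 0.5 beyond each limit); the function name `describe_class` and the class 13-17 are illustrative choices, not part of the lecture.

```python
def describe_class(lower_limit, upper_limit):
    """Return boundaries, width and midpoint for one class.

    Assumption: limits are whole numbers, so each boundary
    lies 0.5 beyond the corresponding limit.
    """
    lower_boundary = lower_limit - 0.5
    upper_boundary = upper_limit + 0.5
    return {
        "boundaries": (lower_boundary, upper_boundary),
        # Width is upper boundary minus lower boundary.
        "width": upper_boundary - lower_boundary,
        # Midpoint is the simple average of the two limits.
        "midpoint": (lower_limit + upper_limit) / 2,
    }


info = describe_class(13, 17)
print(info)
```

For the class 13-17 this gives boundaries 12.5 and 17.5, a width of 5 and a midpoint of 15.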
Slide 9
FREQUENCY DISTRIBUTIONS
• Illustration
– The Dean of the School of Continuing and Distance Education wishes to determine how many hours Distance Education students study. He selects a random sample of 30 students and records the number of hours each student studies per week as follows:
– 15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7, 17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9, 9.1, 26.1, 15.7, 14.0, 17.8, 36.3, 23.2, 12.9, 27.1, 16.6.
• Organize the data into a frequency distribution.
Slide 10
FREQUENCY DISTRIBUTIONS CONT’D
• Assume the number of classes is pre-determined to be 6.
• The class intervals used in the frequency distribution should be equal.
• Determine the class interval by using the formula: i = (highest value - lowest value)/number of classes
i = (36.3 - 9.1)/6 = 4.53 ≈ 5 (rounded up to a convenient whole number).
• Starting the lower limit of the first class at 8, we can have the following classes: 8-12; 13-17; 18-22; 23-27; 28-32; 33-37.
• Count the number of values in each class and indicate the number against each class.
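The steps above can be sketched in a few lines of Python. This is an illustration only: rounding the interval up with `math.ceil` and assigning each value to the class whose boundaries (limit ± 0.5) contain it are assumptions consistent with the slides, and the variable names are hypothetical.

```python
import math

# Weekly study hours for the 30 sampled students (from the illustration).
hours = [
    15.0, 23.7, 19.7, 15.4, 18.3, 23.0, 14.2, 20.8, 13.5, 20.7,
    17.4, 18.6, 12.9, 20.3, 13.7, 21.4, 18.3, 29.8, 17.1, 18.9,
    9.1, 26.1, 15.7, 14.0, 17.8, 36.3, 23.2, 12.9, 27.1, 16.6,
]
num_classes = 6
first_lower_limit = 8

# i = (highest value - lowest value) / number of classes, rounded up.
raw_width = (max(hours) - min(hours)) / num_classes
width = math.ceil(raw_width)

# Class limits: 8-12, 13-17, ..., 33-37.
classes = [
    (first_lower_limit + k * width, first_lower_limit + (k + 1) * width - 1)
    for k in range(num_classes)
]

# Count values between the class boundaries (limits -/+ 0.5).
frequencies = [
    sum(1 for x in hours if lo - 0.5 <= x < hi + 0.5)
    for lo, hi in classes
]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo}-{hi}: {f}")
```

Using the boundaries rather than the limits matters here because values such as 12.9 fall between the limits 12 and 13; with boundaries at 12.5 they are counted in the 13-17 class.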
Slide 11
FREQUENCY DISTRIBUTIONS CONT’D
Slide 12
The class interval is 5 (i.e. 13 minus 8).
Hours of studying    Frequency, f
8-12                 1
13-17                12
18-22                10
23-27                5
28-32                1
33-37                1
Total                30
FREQUENCY DISTRIBUTIONS CONT’D
Slide 13
• The relative frequency of a class is obtained by dividing the class frequency by the total frequency.
Hours of studying    Frequency, f    Relative frequency, rf
8-12                 1               1/30
13-17                12              12/30
18-22                10              10/30
23-27                5               5/30
28-32                1               1/30
33-37                1               1/30
Total                30              1
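As a quick check of the relative frequencies, the sketch below divides each class frequency by the total. Using `fractions.Fraction` to keep exact values is an illustrative choice, not something the lecture prescribes.

```python
from fractions import Fraction

# Class frequencies from the frequency distribution of study hours.
frequencies = {
    "8-12": 1,
    "13-17": 12,
    "18-22": 10,
    "23-27": 5,
    "28-32": 1,
    "33-37": 1,
}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {label: Fraction(f, total) for label, f in frequencies.items()}

for label, rf in relative.items():
    print(f"{label}: {frequencies[label]}/{total} = {float(rf):.3f}")
```

The relative frequencies always sum to 1, which is a useful check on the table.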
GRAPHIC PRESENTATION OF A FREQUENCY DISTRIBUTION
Topic Two
Slide 14
Graphic Presentation of a Frequency Distribution
• The three commonly used graphic forms are histograms, frequency polygons, and cumulative frequency curves (ogives).
Slide 15
Histogram
• Histogram: A graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other with no gaps between them.
Slide 16
[Figure: histogram of the study-hours frequency distribution. Horizontal axis: Hours spent studying (ticks from 10 to 35); vertical axis: Frequency (0 to 14).]
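To see why histogram bars touch, the sketch below builds each bar from its class boundaries and prints a rough text rendering. It assumes the study-hours classes and frequencies from the earlier table; the text output is only a stand-in for a drawn chart.

```python
# Class limits and frequencies from the study-hours frequency distribution.
classes = [(8, 12), (13, 17), (18, 22), (23, 27), (28, 32), (33, 37)]
frequencies = [1, 12, 10, 5, 1, 1]

# Each bar spans lower boundary to upper boundary; its height is the frequency.
bars = [(lo - 0.5, hi + 0.5, f) for (lo, hi), f in zip(classes, frequencies)]

# Adjacent bars share an edge, so there are no gaps between them.
no_gaps = all(bars[i][1] == bars[i + 1][0] for i in range(len(bars) - 1))

for left, right, height in bars:
    print(f"{left:>4}-{right:<4} | " + "#" * height)
print("Bars adjacent:", no_gaps)
```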
Frequency Polygon
• A frequency polygon consists of line segments connecting the points formed by the class midpoint and the class frequency.
Slide 17
[Figure: frequency polygon of the study-hours frequency distribution. Horizontal axis: Hours spent studying (ticks from 10 to 35); vertical axis: Frequency (0 to 14).]
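The points joined by a frequency polygon can be listed directly. This is a small sketch, assuming the study-hours classes and frequencies from the earlier table; plotting is left out, and only the (midpoint, frequency) pairs are computed.

```python
# Class limits and frequencies from the study-hours frequency distribution.
classes = [(8, 12), (13, 17), (18, 22), (23, 27), (28, 32), (33, 37)]
frequencies = [1, 12, 10, 5, 1, 1]

# Each point pairs a class midpoint with its class frequency.
points = [((lo + hi) / 2, f) for (lo, hi), f in zip(classes, frequencies)]

for midpoint, f in points:
    print(f"midpoint {midpoint}: frequency {f}")
```

Connecting these points in order with straight line segments gives the polygon; the midpoints 10, 15, ..., 35 match the tick marks on the horizontal axis.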
Cumulative Frequency Distribution (Ogive)
• A cumulative frequency curve (ogive) is used to determine how many or what proportion of the data values are below a certain value (or the upper boundary of each class).
• The cumulative frequency for a given class is obtained by adding the frequency of that class to the cumulative frequency of the preceding class.
• The cumulative frequency for the last class always equals the total frequency.
Slide 18
Hours of studying less than    Cumulative frequency
7.5                            0
12.5                           1
17.5                           13
22.5                           23
27.5                           28
32.5                           29
37.5                           30
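The running totals in the table can be reproduced with `itertools.accumulate`. This is a minimal sketch, assuming the class frequencies and upper class boundaries shown earlier; the starting row (7.5, 0) is added by hand.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies for the study-hours data.
upper_boundaries = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5]
frequencies = [1, 12, 10, 5, 1, 1]

# Each cumulative frequency adds the class frequency to the previous total.
cumulative = list(accumulate(frequencies))

# Start from the lower boundary of the first class, below which no values lie.
less_than_table = [(7.5, 0)] + list(zip(upper_boundaries, cumulative))

for boundary, cf in less_than_table:
    print(f"less than {boundary}: {cf}")
```

The last cumulative frequency equals the total frequency of 30, as stated above.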
Slide 19
[Figure: ogive (cumulative frequency curve) of the study-hours data. Horizontal axis: Hours Spent Studying (ticks from 10 to 35); vertical axis: Frequency (0 to 35).]
• The Ogive is obtained as line segments connecting the points formed by the class midpoint and the cumulative frequency.
– Note: alternatively the upper class boundary is used.
Slide 20
OTHER GRAPHIC FORMS OF PRESENTING DATA
Topic Three
The Bar Chart
• A bar chart depicts frequencies for different categories (of data) by a series of bars (separated by spaces/gaps).
– Consider data on the education level and employment status of the labour force of a country (measured in ’000s).
– Note that the data is already summarised (cross-tabulated).
– We can graphically represent the data by bar charts.
Slide 21
              Higher       A levels    Other            No               Total
              education                qualification    qualification
In work       8,224        5,654       11,167           2,583            27,628
Unemployed    217          231         693              303              1,444
Inactive      956          1,354       3,107            2,549            7,966
Total         9,397        7,239       14,967           5,435            37,038
Bar chart by education qualification of people who work
• The bar chart is for education levels of only people who work.
• The height of each bar is determined by the associated frequency.
• The first bar is 8224 units high, the second is 5654, and so on. The ordering of the bars could be reversed (‘no qualifications’ becoming the first category) without altering the message.
Slide 22
[Figure: bar chart of people in work by educational qualification. Categories: Higher education, Advanced level, Other qualifications, No qualifications; vertical axis: Number of people (000s), 0 to 12000.]
Multiple bar chart: Educational qualifications by employment category
• Note that the bars for unemployed and inactive are constructed in the same way as for those in work: the height of the bar is determined by the frequency or number of persons in each category.
Slide 23
[Figure: multiple bar chart of educational qualifications by employment category. Categories: Higher education, Advanced level, Other qualifications, No qualifications; vertical axis: Number of people (000s), 0 to 12000; legend: In work, Unemployed, Inactive.]
Stacked bar chart: Educational qualifications and employment status
• Note: The overall height of each bar is determined by the sum of the frequencies in that category, given in the final row of the table containing the data.
Slide 24
[Figure: stacked bar chart of educational qualifications and employment status. Categories: Higher education, Advanced level, Other qualifications, No qualifications; vertical axis: Number of people (000s), 0 to 16000; legend: Inactive, Unemployed, In work.]
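The overall height of each stacked bar can be verified by summing the three employment categories within each qualification column. The sketch below assumes the cross-tabulated figures given earlier (in ’000s); the variable names are illustrative.

```python
qualifications = [
    "Higher education", "A levels", "Other qualification", "No qualification",
]

# Rows of the cross-tabulation, in thousands of people.
counts = {
    "In work": [8224, 5654, 11167, 2583],
    "Unemployed": [217, 231, 693, 303],
    "Inactive": [956, 1354, 3107, 2549],
}

# Height of each stacked bar = sum of the segments in that column.
bar_heights = [sum(column) for column in zip(*counts.values())]

for name, height in zip(qualifications, bar_heights):
    print(f"{name}: {height}")
```

The results reproduce the Total row of the table: 9,397, 7,239, 14,967 and 5,435.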
Stacked bar chart: Educational qualifications and employment status (percentages)
• Percentages in each employment category, by educational qualification.
– That is, the number of persons in each employment status within each educational qualification category has been converted into a percentage of the total persons in that educational qualification category.
• Instead of bars, we could have used lines in each of the 4 charts to obtain line charts.
Slide 25
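[Figure: percentage stacked bar chart of educational qualifications and employment status. Categories: Higher education, Advanced level, Other qualifications, No qualifications; vertical axis: 0% to 100%; legend: Inactive, Unemployed, In work.]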
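The conversion to percentages behind this chart can be sketched as follows. It assumes the cross-tabulated figures given earlier (in ’000s) and divides each cell by its qualification column total; the names used are illustrative.

```python
qualifications = [
    "Higher education", "A levels", "Other qualification", "No qualification",
]

# Rows of the cross-tabulation, in thousands of people.
counts = {
    "In work": [8224, 5654, 11167, 2583],
    "Unemployed": [217, 231, 693, 303],
    "Inactive": [956, 1354, 3107, 2549],
}

# Total persons in each educational qualification category.
column_totals = [sum(column) for column in zip(*counts.values())]

# Each cell as a percentage of its qualification column total.
percentages = {
    status: [100 * value / total for value, total in zip(row, column_totals)]
    for status, row in counts.items()
}

for status, row in percentages.items():
    cells = ", ".join(f"{name}: {p:.1f}%" for name, p in zip(qualifications, row))
    print(f"{status} -> {cells}")
```

Within each qualification category the three percentages add up to 100, which is why every bar in the chart has the same height.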
The Pie Chart
• The pie chart is another useful way of presenting data graphically.
• It is particularly useful for showing how a variable is distributed between different categories.
• Just like the stacked (or component) bar chart, a pie chart portrays the contributions from different sources to some aggregate (relative frequency distribution).
• The chart is circular, and the circle is divided in proportion to the relative frequency of each group.
Slide 26
The pie chart: Educational qualifications of those in work
• Since a circle is 360 degrees, the angle represented by each category is given by: angle = (frequency in category/total frequency) x 360
• So for Higher Education, angle = (8224/27628) x 360 = 107.2 degrees.
• And so forth.
Slide 27
[Figure: pie chart of educational qualifications of those in work. Slices: Higher education 30%, Advanced level 20%, Other qualifications 41%, No qualifications 9%.]
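Because a full circle is 360 degrees, each slice's angle is its share of the total multiplied by 360. The sketch below applies this to the in-work figures from the table (in ’000s); the dictionary layout is an illustrative choice.

```python
# People in work by educational qualification, in thousands.
in_work = {
    "Higher education": 8224,
    "Advanced level": 5654,
    "Other qualifications": 11167,
    "No qualifications": 2583,
}

total = sum(in_work.values())

# angle = (frequency in category / total frequency) x 360
angles = {name: count / total * 360 for name, count in in_work.items()}

for name, angle in angles.items():
    share = in_work[name] / total * 100
    print(f"{name}: {angle:.1f} degrees ({share:.1f}% of the total)")
```

Multiplying the same share by 100 instead of 360 gives the percentage label for the slice rather than its angle.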