Discrete or Continuous Types of data Continuous Discrete Categorical Quantitative (numerical) Discrete

Upload: posy-morrison

Post on 17-Dec-2015


Page 1: Discrete or Continuous Types of data ContinuousDiscrete Categorical Quantitative (numerical) Discrete

Discrete or Continuous

Types of data

• Categorical: Discrete
• Quantitative (numerical): Continuous or Discrete

Page 2

Two Types of Variables

A Numerical Variable describes quantities of the objects of interest. Data values are numbers.
• Weight of an infant
• Number of sexual partners
• Time to run the mile

A Categorical Variable describes qualities of the objects of interest. Data values are usually words.
• Skin color
• Birth city
• Last name

Page 3

Example: Numerical or Categorical?

Age  Gender  Major       Units  Housing    GPA
18   Male    Psychology  16     Dorm       3.6
21   Male    Nursing     15     Parents    3.1
20   Female  Business    16     Apartment  2.8

• Numerical
– Age
– Units
– GPA

• Categorical
– Gender
– Major
– Housing
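The classification in this example can be sketched in a few lines of Python. This is only an illustration: the helper `variable_type` is a hypothetical name, and checking whether the values are numbers is a rough heuristic that would misclassify categories that have been coded with numbers.

```python
# Rows copied from the example table on this slide.
students = [
    {"Age": 18, "Gender": "Male", "Major": "Psychology", "Units": 16, "Housing": "Dorm", "GPA": 3.6},
    {"Age": 21, "Gender": "Male", "Major": "Nursing", "Units": 15, "Housing": "Parents", "GPA": 3.1},
    {"Age": 20, "Gender": "Female", "Major": "Business", "Units": 16, "Housing": "Apartment", "GPA": 2.8},
]


def variable_type(values):
    """Hypothetical helper: all-number columns are numerical, the rest categorical."""
    if all(isinstance(v, (int, float)) for v in values):
        return "numerical"
    return "categorical"


# Classify every column of the table.
types = {col: variable_type([row[col] for row in students]) for col in students[0]}
for col, kind in types.items():
    print(f"{col}: {kind}")
```

Age, Units and GPA come out as numerical; Gender, Major and Housing come out as categorical, matching the lists above.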

Page 4

Numerical or Categorical?

Why are you in college? Answer:
1. Personal Growth
2. Career Opportunities
3. Parental Pressure
4. Personal Networking

Results from 12 participants: 1, 4, 3, 2, 2, 1, 2, 3, 3, 1, 4, 2

• Coding Categorical Data with Numbers: Although the above data values are numbers, the variable is still categorical.

• Reason for Coding: Easier to input into a computer.
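A short Python sketch of why the coded answers stay categorical: the sensible summary is a count per category, while arithmetic on the codes gives a number with no meaning. The variable names are illustrative only.

```python
from collections import Counter

# Code book and the 12 coded responses from this slide.
labels = {
    1: "Personal Growth",
    2: "Career Opportunities",
    3: "Parental Pressure",
    4: "Personal Networking",
}
responses = [1, 4, 3, 2, 2, 1, 2, 3, 3, 1, 4, 2]

# Meaningful summary: how many participants chose each category.
counts = Counter(responses)
for code in sorted(labels):
    print(f"{labels[code]}: {counts[code]}")

# Computable but meaningless: the "average" of arbitrary category codes.
mean_code = sum(responses) / len(responses)
print(f"Mean of the codes: {mean_code:.2f} (not a meaningful answer)")
```

Renumbering the four answers would change the mean of the codes but not the counts, which is one way to see that the variable is categorical.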

Page 5

Scales of Measurement

Scale / Characteristics / Examples

Nominal
Characteristics:
• Label and categorize
• No quantitative distinctions
Examples:
• Gender
• Diagnosis
• Experimental or Control

Ordinal
Characteristics:
• Categorizes observations
• Categories organized by size or magnitude
Examples:
• Rank in class
• Clothing sizes (S, M, L, XL)
• Olympic medals

Interval
Characteristics:
• Ordered categories
• Interval between categories of equal size
• Arbitrary or absent zero point
Examples:
• Temperature

Ratio
Characteristics:
• Ordered categories
• Equal interval between categories
• Absolute zero point
Examples:
• Number of correct answers
• Time to complete task
• Gain in height since last year

Page 6

What kinds of data are typically collected?

• Nominal Data
– no ordering, e.g., it makes no sense to state that F > M
– arbitrary labels, e.g., m/f, 0/1, etc.

• Ordinal Data
– ordered, but differences between values are not important
– e.g., Likert scales: rank your degree of satisfaction on a scale of 1..5

• Interval Data
– ordered, constant scale, but no natural zero
– differences make sense, but ratios do not (e.g., 30° - 20° = 20° - 10°, but 20° is not twice as hot as 10°)

• Ratio Data
– ordered, constant scale, natural zero
– e.g., height, weight, age, length

Nominal and Ordinal data are Categorical; Interval and Ratio data are Continuous.
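The temperature example can be checked numerically. The sketch below assumes the degrees on the slide are Celsius (the slide does not say) and converts them to Fahrenheit: equal differences stay equal on both scales, but the ratio changes with the scale, so "twice as hot" is not meaningful for interval data.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Differences: equal on the Celsius scale and still equal on the Fahrenheit scale.
diff_c = (30 - 20, 20 - 10)
diff_f = (c_to_f(30) - c_to_f(20), c_to_f(20) - c_to_f(10))
print("Celsius differences:   ", diff_c)
print("Fahrenheit differences:", diff_f)

# Ratios: the same two temperatures give different ratios on the two scales.
ratio_c = 20 / 10
ratio_f = c_to_f(20) / c_to_f(10)
print(f"Ratio in Celsius: {ratio_c:.2f}, ratio in Fahrenheit: {ratio_f:.2f}")
```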

Page 7

Example 2.3 Frequency, Proportion and Percent

X   f   p = f/N       percent = p(100)
5   1   1/10 = .10    10%
4   2   2/10 = .20    20%
3   3   3/10 = .30    30%
2   3   3/10 = .30    30%
1   1   1/10 = .10    10%
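As a sketch, the same table can be produced in Python. The list `scores` is an assumption: it is simply one set of N = 10 raw scores that is consistent with the frequencies in the table.

```python
from collections import Counter

# Ten raw scores consistent with the frequencies in Example 2.3 (assumed data).
scores = [5, 4, 4, 3, 3, 3, 2, 2, 2, 1]
n = len(scores)

freq = Counter(scores)

# For each score X: frequency f, proportion p = f/N, percent = p(100).
table = {}
for x in sorted(freq, reverse=True):
    f = freq[x]
    table[x] = (f, f / n, 100 * f / n)

print("X  f  p     percent")
for x, (f, p, percent) in table.items():
    print(f"{x}  {f}  {p:.2f}  {percent:.0f}%")
```

The proportions add up to 1 and the percents to 100, which is a quick check on any frequency table.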

Page 8

Displaying distributions: Qualitative variables

• Pie Charts
• Bar Graphs

Page 9

PIE CHART FOR THE TASTE TEST

[Pie chart; slices: Coca-Cola, Pepsi, Dr Pepper, Seven up, Others]

Page 10

Graphs for Nominal or Ordinal Data

• For non-numerical scores (nominal and ordinal data), use a bar graph
– without a particular order (nominal)
– non-measurable width (ordinal)

Page 11

Bar graph

Page 12

BAR CHART FOR THE AIDS DATA

[Bar chart; categories: 1 ATLANTA; 2 AUSTIN; 3 DALLAS; 4 HOUSTON; 5 NY, NY.; 6 SAN. FRAN.; 7 WASH D.C.; 8 W. P. BEACH]

Page 13

Figure 2.7 Bar Graph of Relative Frequencies

Page 14

A Misleading Bar Graph

Problem

The bar graph that follows presents the total sales figures for three realtors. When the bars are replaced with pictures, often related to the topic of the graph, the graph is called a pictogram.

[Pictogram: Total Sales for Realtor #1, Realtor #2, and Realtor #3; values shown are $2.05 million, $1.41 million, and $0.9 million]

(a) How does the height of the home for Realtor 1 compare to that for Realtor 3?

(b) How does the area of the home for Realtor 1 compare to that for Realtor 3?
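A rough Python sketch of why such a pictogram misleads. Two assumptions are made here that the transcript does not confirm: Realtor 1 sold $2.05 million and Realtor 3 sold $0.9 million, and each home picture is scaled in both height and width in proportion to sales.

```python
# Assumed mapping of the values on the slide to the realtors (not confirmed by the source).
sales_realtor_1 = 2.05  # millions of dollars
sales_realtor_3 = 0.90  # millions of dollars

# (a) Heights are proportional to sales, so they compare like the sales figures.
height_ratio = sales_realtor_1 / sales_realtor_3

# (b) If the width is scaled by the same factor, the area grows with the square.
area_ratio = height_ratio ** 2

print(f"Height ratio: {height_ratio:.2f}")
print(f"Area ratio:   {area_ratio:.2f}")
```

Under these assumptions the home for Realtor 1 is a little more than twice as tall but covers more than five times the area, so the eye overstates the difference in sales.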

Page 15

Displaying Distributions: Quantitative Variables

• Histograms
• Polygons
• Frequency plots
• Stem and Leaf Plots
• Time plots
• Scatterplots

Page 16

Histogram

Histogram of Age

CLASS    TALLY     # OBSERVATIONS  PERCENTAGE
[30,35)  /         1               1/20 = 0.05   5%
[35,40)  //        2               2/20 = 0.10  10%
[40,45)  ////////  8               8/20 = 0.40  40%
[45,50)  ///////   7               7/20 = 0.35  35%
[50,55)  //        2               2/20 = 0.10  10%

Data: 31, 36, 36, 40, 41, 41, 41, 44, 44, 44, 44, 45, 45, 45, 46, 47, 48, 49, 51, 51
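The tally above can be reproduced with a short Python sketch. It assumes classes of width 5 that include the left endpoint and exclude the right one, as the [30,35) notation indicates.

```python
from collections import Counter

# The 20 ages listed on this slide.
ages = [31, 36, 36, 40, 41, 41, 41, 44, 44, 44, 44,
        45, 45, 45, 46, 47, 48, 49, 51, 51]
width = 5

# Map each age to the lower limit of its class, e.g. 44 -> 40 for [40,45).
counts = Counter((age // width) * width for age in ages)

print("CLASS    TALLY     COUNT  PERCENT")
for lower in sorted(counts):
    f = counts[lower]
    percent = 100 * f / len(ages)
    print(f"[{lower},{lower + width})  {'/' * f:<8}  {f:<5}  {percent:.0f}%")
```

Because the right endpoint is excluded, an age of 45 is counted in [45,50) and not in [40,45).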

Page 17

[Two histograms of Age with classes from 30 to 55: one with Count on the vertical axis (2 to 8) and one with Percent on the vertical axis (10% to 40%)]

Page 18

Figure 2.3 Frequency Distribution Block Histogram

Page 19

Histogram versus Bar Graph

[Four graphs, GRAPH I to GRAPH IV, each with count on the vertical axis: two with length (1 to 6) on the horizontal axis and two with color (green, blue, red, white, yellow) on the horizontal axis]

Page 20

Misleading Histograms

Page 21

Figure 2.4 Frequency Distribution Polygon

Page 22

Figure 2.5 Grouped Data Frequency Distribution Polygon

Page 23

Describe The Distribution

Page 24

What eyes see

Describe
1) with words and
2) with numbers

Page 25

Describing with WORDS

Page 26

Three Aspects of a Distribution

• Shape
– Symmetry
– How many bumps or modes?
– Other distinguishing features

• Center
– What is a typical value?
– The bulk of the data

• Spread
– Is the data all close together or spread out?

Copyright © 2013 Pearson Education, Inc.. All rights reserved.

Page 27

Distribution Shapes

Page 28

SHAPE ~ Symmetric Distributions

• A distribution is symmetric if the left hand side is roughly the mirror image of the right hand side.


Page 29

Symmetric

2. Is the histogram symmetric?
– If you can fold the histogram along a vertical line through the middle and have the edges match pretty closely, the histogram is symmetric.

Page 30

SHAPE ~ Normal Distributions

• A Normal distribution has the following properties
– Symmetric
– Unimodal
– Mound or Bell Shaped

Page 31

SHAPE ~ Skewness

• A distribution is Skewed Right if most of the data values are small and there is a “tail” of larger values to the right.

• A distribution is Skewed Left if most of the data values are large and there is a “tail” of smaller values to the left.

Page 32

Skewed

– The (usually) thinner ends of a distribution are called the tails. If one tail stretches out farther than the other, the histogram is said to be skewed to the side of the longer tail.

– In the figure below, the histogram on the left is said to be skewed left, while the histogram on the right is said to be skewed right.

Page 33

SHAPE ~ How Many Mounds

• A Unimodal distribution has one mound.

• A Bimodal distribution has two mounds.

• A Multimodal distribution has more than two mounds.

Page 34

Peaks: Modes

1. Does the histogram have a single, central peak or several separated peaks?
– Peaks in a histogram are called modes.
– A histogram with one main peak is called unimodal; histograms with two peaks are bimodal; histograms with three or more peaks are called multimodal.

Page 35

Center

• For now, we look at the most common value in each distribution. We will develop more precise ways to describe the center of a distribution in the next section.

• What is the center of this distribution?
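As an illustration of "the most common value", here is a minimal Python sketch. It reuses the 20 ages from the Histogram of Age slide purely as sample data; it is not the distribution pictured on this slide.

```python
from collections import Counter

# Sample data: the 20 ages from the Histogram of Age slide.
ages = [31, 36, 36, 40, 41, 41, 41, 44, 44, 44, 44,
        45, 45, 45, 46, 47, 48, 49, 51, 51]

# most_common(1) returns the value with the highest count and that count.
value, frequency = Counter(ages).most_common(1)[0]
print(f"Most common value: {value} (appears {frequency} times)")
```

For these ages the most common value is 44, which sits in the tallest class, [40,45).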

Page 36

Center

• What is a typical value?

• The center is not a typical value for bimodal or skewed distributions.


Page 38

SPREAD ~ Range

• The range of the data is the difference between the maximum and minimum values.

Page 39

Spread: Range

• Always report a measure of spread along with a measure of center when describing a distribution numerically.

• The range of the data is the difference between the maximum and minimum values:

Range = max – min

• A disadvantage of the range is that a single extreme value can make it very large and, thus, not representative of the data overall.

• For example, if my test scores were 10, 87, 94, 88, 85, 82, 85, 92, my range would be 94 – 10 = 84. This is a large spread, but most of my scores are in the 80s. We will soon discuss different measures of spread.
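The test-score example can be worked through in Python. The function name `data_range` is illustrative; the second calculation drops the single low score of 10 only to show how strongly one extreme value affects the range.

```python
def data_range(values):
    """Range = max - min."""
    return max(values) - min(values)


# Test scores from this slide.
scores = [10, 87, 94, 88, 85, 82, 85, 92]

full_range = data_range(scores)  # 94 - 10 = 84
without_low = data_range([s for s in scores if s != 10])  # 94 - 82 = 12

print(f"Range of all scores: {full_range}")
print(f"Range without the score of 10: {without_low}")
```

Removing one value shrinks the range from 84 to 12, which is why the range alone can misrepresent how spread out most of the data are.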

Page 40

Quiz Scores

• Please see replacement activity

• Which class (A or B) has more variability?

2014 Summer Training Institute College of the Canyons

Page 41

Hypothetical Quiz Scores

• Please see replacement activity

• Which class has the least? Which the most?


Page 42

Outliers

• An Outlier is a data value that is either much smaller or much larger than the rest of the data.

• Some reasons for outliers
– Error in data collection
– No error. For example, the owner’s salary could be an outlier if the rest of the employees are all low wage workers.

Page 43

Anything Unusual? (cont.)

• The following histogram has possible outliers to the left.


Page 44

Describing a Distribution with words

Using Stats language in Context.

• What is the shape?
– Is it Symmetric, Skewed, or Neither?
– Unimodal, Bimodal, or Multimodal?
– Normal?
– Are there outliers?

• Where is the center? Is the center a typical value?

• Is there low or high variability?

Page 45

Describe The Distributions

• It is always more interesting to compare groups. Below are daily wind speeds at a National Park.

Page 46

Describe The Distribution

• The dotplots below show drive times for 3 different routes.
• Describe these dotplots.
• What route would you take and why?

Page 47

Page 48

Shape, Center and Spread Activity
