MATH 215 C10 - Self-Test: Unit 1


Post on 12-Jun-2022


Page 1: MATH 215 C10 - Self-Test: Unit 1

Unit 1 introduces the field of statistics and the areas within it, presents many of the terms used throughout this course, and

examines common methods employed to organize, display, and summarize data.

Typically, the statistics practitioner, faced with a specific problem, research objective, or decision, begins their work by collecting

a body of numerical facts, called raw data, through surveys, through observation, or from internal or external information

sources. After gathering this data, the practitioner must organize it and present the results in such a way that coherent, relevant

information about the problem, objective, or decision emerges.

The set of methods used to organize, display, and describe data is called descriptive statistics, and is the subject of this unit.

We will now examine what a statistics practitioner does with the vast quantity of numbers that form the raw data: how they

organize it, present it in tables and graphs, and compute various summary measures of location, variability, and position.

As the seemingly unrelated raw data takes on a meaningful form, we can appreciate how these numbers say something about our

lives, our society, and our universe.

Unit 1 of MATH 215 consists of the following sections:

1-1

1-2

1-3

1-4

1-5

1-6

1-7

1-8

1-9

1-10

The unit also contains a self-test. When you have completed the material for this unit, including the self-test, complete

Assignment 1.

After completing the readings and exercises for this section, you should be able to define, and use in context, the following key

terms:

descriptive statistics and inferential statistics

element, variable, observation and data set

Read the following sections in Chapter 1 of the textbook:

MATH 215 C10 - Study Guide: Unit 1


Chapter 1 Introduction

Section 1.1

Section 1.2

Be prepared to read the material in Chapter 1 twice—the first time for a general overview of topics, and the second time to

concentrate on the terms and examples presented. Return to these sections when you need to review these topics.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in Chapter 1 of

the textbook.

Introduction to Statistics and Graphical Displays of Data (https://www.youtube.com/watch?v=IB3paSUbdUA) (Jeremy Haselhorst)

Introduction to Descriptive Statistics – an Overview (https://www.youtube.com/watch?v=QoQbR4lVLrs) (TeresaJohnson)

Sampling: Simple Random, Convenience, Systematic, Cluster, Stratified (https://www.youtube.com/watch?v=be9e-Q-jC-0) (Dr Nic’s Maths and Stats)

Note: You can also check Appendix A for more information on sampling techniques.

Types of Data: Nominal, Ordinal, Interval/Ratio – an Overview (https://www.youtube.com/watch?v=hZxnzfnt5v8&nohtml5=False) (Dr Nic’s Maths and Stats)

Appropriate Data Displays – an Overview (https://www.youtube.com/watch?v=5RKpsCqmh0I) (Rob Oliver)

Complete the following exercises from Chapter 1 of the textbook (page numbers are for the downloadable eText):

Exercise 1.3 on page 5

Exercise 1.5 on page 6

Show your work as you develop your answers.

Solutions to these exercises are provided in the Student Solutions Manual for Chapter 1 in the left-hand navigation column in

the interactive textbook (accessible from the Read, Study & Practice link on the course home page) and on page AN1 in the

Answers to Selected Odd-Numbered Exercises section in the downloadable eText.

It is very important that you make a concerted effort to answer each question independently before you refer to the solutions. If

your answers differ from those provided and you cannot understand why, contact your tutor for assistance.


After completing the readings and exercises for this section, you should be able to do the following:

1. define, and use in context, the following key terms:

quantitative, discrete, continuous, and qualitative (categorical) variables

cross-section and time-series data

2. compute the values for expressions that are presented in summation notation.
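As an illustration of Outcome 2, the short sketch below evaluates three expressions that often appear in summation notation: Σx, Σx², and (Σx)². The data values are hypothetical and are not taken from the textbook.

```python
# Hypothetical data values x_1, ..., x_4 (not taken from the textbook).
x = [3, 5, 2, 6]

sum_x = sum(x)                          # Σx    = 3 + 5 + 2 + 6 = 16
sum_x_squared = sum(v ** 2 for v in x)  # Σx²   = 9 + 25 + 4 + 36 = 74
square_of_sum = sum_x ** 2              # (Σx)² = 16² = 256

print(sum_x, sum_x_squared, square_of_sum)
```

Note that Σx² and (Σx)² are different quantities: the first squares each value before adding, and the second adds the values before squaring.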

Read the following sections in Chapter 1 of the textbook:

Section 1.3

Section 1.4

Complete the following exercises from Chapter 1 of the textbook (page numbers are for the downloadable eText):

Exercises 1.7 and 1.9 on page 8

Exercise 1.11 on page 10

Solutions are provided in the Student Solutions Manual for Chapter 1 in the interactive textbook. Please note that the solutions

to exercises 1.7 and 1.11 are not included in the Answers to Selected Odd-Numbered Exercises section of the downloadable eText.

After completing the readings and exercises for this section, you should be able to do the following:

1. contrast the following:

population vs. sample

census vs. sample survey

sampling with replacement vs. sampling without replacement

random vs. non-random samples

sampling vs. non-sampling errors

2. identify types of non-random samples

3. identify random sampling techniques

4. define the following terms:

treatment

randomization

designed experiment

5. contrast the following:

designed experiment vs. observational study


treatment vs. control group

Read the following sections in Chapter 1 of the textbook:

Section 1.5

Section 1.6

Section 1.7

1. Complete the following exercises from Chapter 1 of the textbook (page numbers are for the downloadable eText):

Exercises 1.13, 1.15, and 1.19 on page 17

Exercises 1.21, 1.23, 1.25, and 1.27 on page 18

Exercises 1.31, 1.33, and 1.35 on page 22

Exercises 1.37 and 1.39 on page 24

2. Complete the Self-Review Test for Chapter 1 (pages 28–29 of the downloadable eText).

Solutions are provided in the Student Solutions Manual for Chapter 1 (interactive textbook) and on pages AN1 to AN2 in the Answers to Selected Odd-Numbered Exercises section (downloadable eText).

At the end of Chapter 1 (pages 26–28 of the downloadable eText), there are both Supplementary Exercises and Advanced

Exercises, with solutions provided for the odd-numbered questions. You can work through these exercises for additional practice

with these concepts and techniques.

After completing the readings and exercises for this section, you should be able to do the following:

1. construct a frequency distribution that includes frequencies, relative frequencies, and percentage frequencies, given raw data for a qualitative (categorical) variable.

2. construct a bar graph and a pie chart.

3. interpret frequencies, relative frequencies, and percentage frequencies, given a frequency distribution or a graph relating to a frequency distribution.
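The outcomes above can be illustrated with a minimal sketch that builds a frequency distribution for a qualitative variable. The responses and the helper name `frequency_distribution` are hypothetical and chosen only for illustration; relative frequency is taken as frequency divided by the number of observations, and percentage as relative frequency times 100.

```python
from collections import Counter

# Hypothetical responses for a qualitative variable (not from the textbook).
responses = ["car", "bus", "car", "car", "bus",
             "car", "bike", "car", "bike", "car"]

def frequency_distribution(values):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}

table = frequency_distribution(responses)
for category, (f, rel, pct) in table.items():
    print(f"{category:<6}{f:>4}{rel:>8.2f}{pct:>8.1f}%")
```

The frequencies should sum to the number of observations, and the relative frequencies should sum to 1 (apart from rounding).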


Read the following sections in Chapter 2 of the textbook:

Chapter 2 Introduction

Section 2.1

Be prepared to read the material in Chapter 2 at least twice—the first time for a general overview of topics, and the second time

to concentrate on the terms and examples presented. Return to these sections when you need to review these topics.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in Section 2.1 of

the textbook.

Categorical Frequency Distributions (https://www.youtube.com/watch?v=KWd9s5wwJiw) (mattemath)

Art of Problem Solving: Bar Charts and Pie Charts (https://www.youtube.com/watch?v=foyPpC3XjhE&list=PLVDBoCeSVNpsKhZ8yPV5Qr7mXt178sMbU) (Art of Problem Solving)

Calculating values for a pie chart (https://www.youtube.com/watch?v=coTDCzpjiWU) (Mr Potts Math)

Using a pie graph to find the amount (https://www.youtube.com/watch?v=Ad9gRQYgi_Y) (adumas2009)

Complete Exercise 2.5 (page 43 of the downloadable eText).

Solutions are provided in the Student Solutions Manual for Chapter 2 (interactive textbook) and on page AN2 in the Answers to

Selected Odd-Numbered Exercises section (downloadable eText).

After completing the readings and exercises for this section, you should be able to do the following:

1. construct a frequency distribution table that uses either a “less than” or “not less than” method for writing the classes, given raw data for a continuous variable.

2. construct the following graphs: histogram, relative frequency histogram, frequency polygon, relative or percentage frequency polygon, ogive, and relative or percentage ogive.

3. construct a frequency distribution table using single-valued classes, given raw data.

4. construct a bar graph for the distribution described in Outcome 3, above.

5. interpret frequencies, relative and percentage frequencies, cumulative frequencies, cumulative relative frequencies, and cumulative percentage frequencies, given a frequency distribution or a related graph.

6. interpret symmetric, skewed, and uniform distributions for the frequency distribution or graph described in Outcome 5, above.

7. construct stem-and-leaf displays and dotplots, and identify possible outliers, given raw data.

1. Read the following sections in Chapter 2 of the textbook:

Section 2.2

Section 2.3

Section 2.4

2. Read Additional Topics 1A and 1B in this Study Guide, below.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in the assigned

textbook readings.

Dancing Statistics: explaining the statistical concept of ‘frequency distributions’ through dance (https://www.youtube.com/watch?v=dr1DynUzjq0) (BPSOfficial)

Statistics – Displaying Data (https://www.youtube.com/watch?v=DPBC9aOWoYI) (DrCraigMcBridePhD)

How to Construct a Grouped Frequency Distribution (https://www.youtube.com/watch?v=tcU_hApd-j0) (JoshuaFrench)

Constructing a Grouped Frequency Distribution Part 1 (https://www.youtube.com/watch?v=gVr2eYfc4vk) (Math and Stats)

Constructing a Grouped Frequency Distribution Part 2 (https://www.youtube.com/watch?v=dbDq8QdOAKI) (Math and Stats)

How to Create a Frequency Polygon (https://www.youtube.com/watch?v=5zbk94ySbIM) (mattemath)

Relative frequency histogram, polygon and ogive graphs (https://www.youtube.com/watch?v=n5B-jx7GhVg) (mrandersonmath)

How to Draw a Histogram by Hand (https://www.youtube.com/watch?v=EqlHVMTaPiA) (MathBootcamps)

Reading and Analyzing a Histogram (https://www.youtube.com/watch?v=DLbQcb4ckV0) (Dan Ozimek)

Analyzing Histograms Reminder (https://www.youtube.com/watch?v=6G0FV6PQHWY) (Michael Branson)

The Different Shapes of Frequency Distributions (https://www.youtube.com/watch?v=bZY3tZU-qEc) (mattemath)

How to Construct a Cumulative Frequency Graph or Ogive (https://www.youtube.com/watch?feature=player_embedded&v=L8Zhm4ivasQ) (mattemath)

Statistics – How to Make a Stem-and-Leaf Plot (https://www.youtube.com/watch?v=_7m0Q_m2ppg) (MySecretMathTutor)

Stem-and-Leaf Plots (https://www.youtube.com/watch?v=PSrCxsIgPFU) (Anywhere Math)

Stem-and-Leaf Plot – Mean, Median and Mode (https://www.youtube.com/watch?v=D8267-mpjmM) (Robert Boulet)

Practice Exercises: Graphs and Plots (https://www.youtube.com/watch?v=MZHZSwS9PSo) (lbowen11235)

Dot Plots and Frequency Tables (https://www.youtube.com/watch?v=AHFYMLtMnMI) (Nicole Pellegrino)

Dot Plots and Frequency Tables (https://www.youtube.com/watch?v=WrX-JrZ082g) (Greg Wood)


1. Complete the following exercises from Chapter 2 of the textbook (page numbers are for the downloadable eText):

Exercises 2.11 and 2.13 on page 58

Exercise 2.15 on page 59

Exercise 2.21 on page 60

Exercise 2.29 on page 64

Exercise 2.31 on page 66

Solutions are provided in the Student Solutions Manual for Chapter 2 (interactive textbook) and on pages AN2 and AN3 in the Answers to Selected Odd-Numbered Exercises section (downloadable eText).

2. Complete the exercise for Additional Topics 1A and 1B in this Study Guide, below.

3. Complete the Self-Review Test for Chapter 2 (pages 70–71 of the downloadable eText).

For extra practice with the material presented in this section, you can complete the following questions and exercises, for which

the solutions are provided in the textbook:

1. Any odd-numbered chapter-section practice questions that are not assigned above

2. The odd-numbered Supplementary Exercises and Advanced Exercises found at the end of Chapter 2 (pages 68–70 of the downloadable eText)

The following notes on class boundaries are taken from the previous edition of the textbook. You can expect to have questions

involving class boundaries in the assignments and exams for this course.

Definition

Class Boundary A class boundary is given by the midpoint of the upper limit of one class and the lower

limit of the next class.


Note that in Table 2.8, when we write classes using class boundaries, we write “to less than” to ensure that

each value belongs to one and only one class. As we can see, the upper boundary of the preceding class and

the lower boundary of the succeeding class are the same.

[Source: Prem S. Mann, Introductory Statistics, 8th ed. (Wiley, 2012) [VitalSource], 37–38. This material is

reproduced with the permission of John Wiley & Sons Canada, Ltd.]
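The class boundary definition above can be sketched in a few lines of code. The class limits below are hypothetical (Table 2.8 is not reproduced in this guide), and the helper name `inner_boundaries` is an illustrative choice.

```python
# Hypothetical class limits (lower, upper); not taken from Table 2.8.
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24)]

def inner_boundaries(limits):
    """Midpoint of each class's upper limit and the next class's lower limit."""
    return [(upper + next_lower) / 2
            for (_, upper), (next_lower, _) in zip(limits, limits[1:])]

print(inner_boundaries(class_limits))  # [9.5, 14.5, 19.5]
```

With these limits, the second class would be written as "9.5 to less than 14.5" when expressed using class boundaries, so its upper boundary coincides with the lower boundary of the third class.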

The following notes on ogives are taken from the previous edition of your textbook. You can expect to have questions involving

ogives in the assignments and exams for this course.

Definition

Ogive An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines

the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of

respective classes.

When plotted on a diagram, the cumulative frequencies give a curve that is called an ogive (pronounced

o-jive). Figure 2.12 gives an ogive for the cumulative frequency distribution of Table 2.14 which has been

constructed from the following frequency distribution table (include the table on page 54 in Example 2–7 in

the Mann v.8e version).

Note that the variable is the number of iPods sold per day by a company over a period of 30 days. The

frequencies represent the number of days on which the number of iPods indicated by each class were sold.

[As an example, for the second class in the Cumulative Frequency Distribution table below, 14 or fewer iPods

were sold in 9 days. According to the third class, 19 or fewer iPods were sold in 17 days.]

To draw the ogive in Figure 2.12, the variable, which is the total number of iPods sold by a

company in each of 30 days [emphasis added], is marked on the horizontal axis and the cumulative

frequencies on the vertical axis. Then the dots are marked above the upper boundaries of various classes at

the heights equal to the corresponding cumulative frequencies. The ogive is obtained by joining consecutive

points with straight lines. Note that the ogive starts at the lower boundary of the first class and ends at the

upper boundary of the last class.


One advantage of an ogive is that it can be used to approximate the cumulative frequency for any interval. For

example, we can use Figure 2.12 to estimate the number of days for which 17 or fewer iPods were sold. First,

draw a vertical line from 17 on the horizontal axis up to the ogive. Then draw a horizontal line from the point

where this line intersects the ogive to the vertical axis. This point gives the estimated cumulative frequency of

the class 5 to 17. In Figure 2.12, this cumulative frequency is (approximately) 13, as shown by the dashed line.

Therefore, 17 or fewer iPods were sold on (approximately) 13 days.

We can draw an ogive for cumulative relative frequency and cumulative percentage distributions the same

way as we did for the cumulative frequency distribution.

[Source: Prem S. Mann, Introductory Statistics, 8th ed. (Wiley, 2012) [VitalSource], 54–56. This material is

reproduced with the permission of John Wiley & Sons Canada, Ltd.]
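Reading a value off an ogive, as described above, amounts to linear interpolation between the plotted points. The sketch below is one way to express that idea. The cumulative frequencies 9, 17, and 30 come from the text; the class boundaries and the remaining cumulative frequencies are assumptions made for illustration, because the original tables are not reproduced here. The function name `estimate_cumulative` is also hypothetical.

```python
# Points on the ogive: class boundaries and cumulative frequencies.
# The values 9, 17, and 30 come from the text; the boundaries and the
# other cumulative frequencies are assumed for illustration.
boundaries = [4.5, 9.5, 14.5, 19.5, 24.5, 29.5]
cumulative = [0, 3, 9, 17, 25, 30]

def estimate_cumulative(x, boundaries, cumulative):
    """Read the ogive at x by linear interpolation between plotted points."""
    if x <= boundaries[0]:
        return cumulative[0]
    if x >= boundaries[-1]:
        return cumulative[-1]
    for i in range(1, len(boundaries)):
        if x <= boundaries[i]:
            left, right = boundaries[i - 1], boundaries[i]
            fraction = (x - left) / (right - left)
            return cumulative[i - 1] + fraction * (cumulative[i] - cumulative[i - 1])

print(estimate_cumulative(17, boundaries, cumulative))  # 13.0
```

Under these assumptions, the estimate for 17 or fewer iPods agrees with the value of approximately 13 days read from the figure.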

The following exercise is reproduced from Exercise 2.35 in the previous edition of the textbook. The solution is provided.

[The table below] gives the frequency distribution of the number of days to expiry date for all containers of

yogurt in stock at a local grocery store. Containers that had already expired but were still on the shelves were

given a value of 0 for number of days to expiry date.

[Source: Prem S. Mann, Introductory Statistics, 8th ed. (Wiley, 2012) [VitalSource], 57. This material is

reproduced with the permission of John Wiley & Sons Canada, Ltd.]

a. Prepare a cumulative frequency distribution table that also displays cumulative relative frequencies and cumulative percentage frequencies.

Solution:

b. Sketch an ogive for the cumulative percentage distribution in part a, using class boundaries on the horizontal X-axis.

Solution:

c. Using the ogive, estimate the percentage of containers that will expire in fewer than 20 days.

Solution: Approximately 85% of the containers will expire in fewer than 20 days, as indicated on the ogive in part b.

After completing the readings and exercises for this section, you should be able to do the following:

1. compute the mean, median, and mode, given ungrouped (raw) sample data or ungrouped population data.

2. compute the weighted mean for a data set.

3. identify the advantages and disadvantages of using the mean, weighted mean, median, and mode as a measure of central tendency for different types of data sets.

4. determine how the skewness of a data set affects the relationship between the mean, median, and mode.
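As a rough illustration of Outcomes 1 and 2, the sketch below computes the mean, median, mode, and a weighted mean with Python's standard `statistics` module. All data values are hypothetical, and `weighted_mean` is an illustrative helper rather than a textbook function.

```python
import statistics

# Hypothetical ungrouped data (not from the textbook).
data = [4, 7, 7, 9, 13]

mean = statistics.mean(data)      # (4 + 7 + 7 + 9 + 13) / 5 = 8
median = statistics.median(data)  # middle value of the ranked data = 7
mode = statistics.mode(data)      # most frequent value = 7

def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical prices per unit, weighted by the quantities purchased.
prices = [2.0, 3.0]
quantities = [10, 30]
print(mean, median, mode, weighted_mean(prices, quantities))
```

In this small data set the mean (8) is larger than the median (7), which is the pattern usually associated with a distribution that is skewed to the right.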

Read the following sections in Chapter 3 of the textbook:

Chapter 3 Introduction

Section 3.1

Note: Omit Section 3.1.4: Trimmed Mean.

Be prepared to read the material in Chapter 3 at least twice—the first time for a general overview of topics, and the second time

to concentrate on the terms and examples presented. Return to these sections when you need to review these topics.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in the assigned

textbook readings.


Numerical Summaries (https://www.youtube.com/watch?v=66YAaSzxQD4) (Jeremy Haselhorst)

Numerical Summaries for a Quantitative Variable (https://www.youtube.com/watch?v=CtxsUovzb70) (DWR447)

Descriptive Statistics, Part 1 (https://www.youtube.com/watch?v=8Iklj-lf1fY) (The Doctoral Journey)

Statistics – Mean, Median and Mode (https://www.youtube.com/watch?v=IV_m_uZOUgI&list=PLlSMUHu9g2KQ3SAofPJTr-AYYBQi5SWVT) (Math Meeting)

Measures of Central Tendency (https://www.youtube.com/watch?v=NM_iOLUwZFA&list=UUiHi6xXLzi9FMr9B0zgoHqA&index=14) (jbstatistics)

Complete the following exercises from Chapter 3 of the textbook (page numbers are for the downloadable eText):

Exercises 3.5, 3.7, 3.9, 3.11, and 3.13 (parts a and b only) on page 88

Exercises 3.19, 3.21, 3.23, and 3.25 on page 89

Solutions are provided in the Student Solutions Manual for Chapter 3 (interactive textbook) and on page AN4 in the Answers to

Selected Odd-Numbered Exercises section (downloadable eText).

After completing the readings and exercises for this section, you should be able to do the following:

1. compute the range, variance, standard deviation, and coefficient of variation, given ungrouped (raw) sample data or ungrouped population data.

2. identify the advantages and disadvantages of using the range, standard deviation, and coefficient of variation as a measure of dispersion for different types of data sets.

3. distinguish between a parameter and a statistic.
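The measures named in Outcome 1 can be sketched as follows, using the short-cut form of the variance formula, in which the sum of squared deviations is computed as Σx² − (Σx)²/n. The data are hypothetical. The sample variance divides by n − 1 and the population variance divides by n; the coefficient of variation is taken here as the standard deviation divided by the mean, expressed as a percentage.

```python
import math

# Hypothetical ungrouped sample data (not from the textbook).
data = [4, 7, 7, 9, 13]

n = len(data)
sum_x = sum(data)                    # Σx  = 40
sum_x2 = sum(v ** 2 for v in data)   # Σx² = 364

data_range = max(data) - min(data)   # 13 - 4 = 9
numerator = sum_x2 - sum_x ** 2 / n  # Σx² - (Σx)²/n = 364 - 320 = 44

sample_variance = numerator / (n - 1)   # 44 / 4 = 11 (a statistic)
population_variance = numerator / n     # 44 / 5 = 8.8 (a parameter, if the
                                        # data were the whole population)
sample_sd = math.sqrt(sample_variance)
cv = sample_sd / (sum_x / n) * 100      # coefficient of variation, in percent

print(data_range, sample_variance, round(sample_sd, 4), round(cv, 4))
```

The two variance lines differ only in the divisor, which mirrors the distinction between a statistic computed from a sample and a parameter computed from a population.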

Read Section 3.2 in Chapter 3 of the textbook.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in Section 3.2 of

the textbook.

Dancing Statistics: explaining the statistical concept of ‘variance’ through dance (https://www.youtube.com/watch?v=pGfwj4GrUlA) (BPSOfficial)

Measuring Spread: the Standard Deviation (https://www.youtube.com/watch?v=thpm7nOwm2Q) (Jeremy Haselhorst)

Statistics – Standard Deviation (https://www.youtube.com/watch?v=3v6mYNPyDoY) (Math Meeting)

Standard Deviation and Variance (https://www.youtube.com/watch?v=VTE25D77UI8&nohtml5=False) (statisticsfun)

Why are degrees of freedom (n-1) used in Standard Deviation (https://www.youtube.com/watch?v=92s7IVS6A34&nohtml5=False) (statisticsfun)

Measures of Variability (https://www.youtube.com/watch?v=Cx2tGUze60s&index=15&list=UUiHi6xXLzi9FMr9B0zgoHqA) (jbstatistics)

The Sample Variance: Why divide by n-1? (https://www.youtube.com/watch?v=9ONRMymR2Eg) (jbstatistics)

Complete the following exercises from Chapter 3 of the textbook (page numbers are for the downloadable eText):

Exercises 3.33, 3.35, and 3.39 on page 96

Exercise 3.43 on page 97

Solutions are provided in the Student Solutions Manual for Chapter 3 (interactive textbook) and on page AN4 in the Answers to

Selected Odd-Numbered Exercises section (downloadable eText).

After completing the readings and exercises for this section, you should be able to do the following: Compute the mean, variance,

and standard deviation, given grouped sample or grouped population data.
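For grouped data, each class is commonly represented by its midpoint m, so the mean is approximated by Σmf/n and the sample variance by (Σm²f − (Σmf)²/n)/(n − 1). The sketch below applies these formulas to a hypothetical frequency distribution; the classes and frequencies are invented for illustration.

```python
import math

# Hypothetical grouped sample data: (lower, upper, frequency),
# where each class reads "lower to less than upper".
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]

n = sum(f for _, _, f in classes)                            # Σf   = 20
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]         # 5, 15, 25
freqs = [f for _, _, f in classes]

sum_mf = sum(m * f for m, f in zip(midpoints, freqs))        # Σmf  = 320
sum_m2f = sum(m * m * f for m, f in zip(midpoints, freqs))   # Σm²f = 6100

mean = sum_mf / n                                            # 320 / 20 = 16
sample_variance = (sum_m2f - sum_mf ** 2 / n) / (n - 1)      # 980 / 19
sample_sd = math.sqrt(sample_variance)

print(mean, round(sample_variance, 4), round(sample_sd, 4))
```

Because the midpoints stand in for the actual observations, these results are approximations of the values that the raw data would give. For population data, the divisor n − 1 would be replaced by the population size.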

Read Section 3.3 in Chapter 3 of the textbook.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in Section 3.3 of

the textbook.

Statistics (Mean and Standard Deviation) for Grouped Data (https://www.youtube.com/watch?v=VoqgnssiugE) (lbowen11235)

Mean, Median and Mode from a Frequency Distribution (https://www.youtube.com/watch?v=gMaIREesP7Y) (ChattState Math)

Variance and Standard Deviation for Grouped Data (https://www.youtube.com/watch?v=_vHfTCStqLA&nohtml5=False) (searching4math)

Complete Exercises 3.47 and 3.49 from Chapter 3 of the textbook (page 102 of the downloadable eText).

Solutions are provided in the Student Solutions Manual for Chapter 3 (interactive textbook) and on page AN4 in the Answers to

Selected Odd-Numbered Exercises section (downloadable eText).

After completing the readings and exercises for this section, you should be able to do the following:

1. use Chebyshev’s theorem with any distribution to find the proportion or percentage of the total observations that falls within a given interval about the mean.

2. use the empirical rule with any bell-shaped distribution to find the proportion or percentage of the total observations that falls within a given interval about the mean.
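The two outcomes above can be contrasted in a short sketch. Chebyshev's theorem states that, for any distribution and any k > 1, at least (1 − 1/k²) × 100% of the observations lie within k standard deviations of the mean. The empirical rule applies only to bell-shaped distributions and gives approximately 68%, 95%, and 99.7% for k = 1, 2, and 3. The mean, standard deviation, and interval below are hypothetical, as are the names `chebyshev_min_percent` and `EMPIRICAL_RULE`.

```python
def chebyshev_min_percent(k):
    """Minimum percentage of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return (1 - 1 / k ** 2) * 100

# Approximate percentages for bell-shaped distributions only.
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

# Hypothetical distribution and an interval that is symmetric about the mean.
mean, sd = 50, 4
lower, upper = 40, 60

k = (upper - mean) / sd  # (60 - 50) / 4 = 2.5 standard deviations
print(f"k = {k}: at least {chebyshev_min_percent(k):.1f}% (Chebyshev)")
print(f"k = 3: about {EMPIRICAL_RULE[3]}% (empirical rule, bell-shaped only)")
```

The first step in either method is to express the interval's half-width as a number of standard deviations, k. Chebyshev's theorem then gives a lower bound, while the empirical rule gives an approximate percentage.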

Read Section 3.4 in Chapter 3 of the textbook.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in Section 3.4 of

the textbook.

The Normal Distribution and the 68-95-99.7 Rule (https://www.youtube.com/watch?v=cgxPcdPbujI&nohtml5=False) (patrickJMT)

Chebychev’s Theorem EXPLAINED (https://www.youtube.com/watch?v=OM0K22pmkuY&nohtml5=False) (Don Davis)

In this video, be sure to compare the excellent graphical representations of Chebyshev’s theorem and the empirical rule, and pay attention to the helpful table showing Chebyshev’s theorem computations.

Complete Exercises 3.61 and 3.63 from Chapter 3 of the textbook (page 107 of the downloadable eText).

Solutions are provided in the Student Solutions Manual for Chapter 3 (interactive textbook) and on page AN5 in the Answers to

Selected Odd-Numbered Exercises section (downloadable eText).

After completing the readings and exercises for this section, you should be able to do the following:

1. compute the three quartiles (Q1, Q2, Q3), the interquartile range, percentiles, and percentile ranks, given ungrouped (raw) sample data or ungrouped population data.

2. interpret the three quartiles (Q1, Q2, Q3), the interquartile range, percentiles, and percentile ranks in the context of a given problem.

3. construct a box-and-whisker plot, given ungrouped (raw) sample data or ungrouped population data.

4. determine the three quartiles, the lower and upper inner fences, the skewness, and the outliers (if any), given a box-and-whisker plot.
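The sketch below illustrates these outcomes under stated assumptions. It uses one common textbook convention: Q2 is the median of the ranked data, and Q1 and Q3 are the medians of the lower and upper halves. The percentile rank of a value is taken as the percentage of observations less than that value, and the inner fences are placed 1.5 × IQR below Q1 and above Q3. Other conventions and software defaults can give slightly different results, so check the definitions in your textbook. The data and function names are hypothetical.

```python
import statistics

def quartiles(data):
    """Q1, Q2, Q3 using the median-of-halves convention (an assumption)."""
    ranked = sorted(data)
    n = len(ranked)
    half = n // 2
    lower_half = ranked[:half]
    upper_half = ranked[half + (n % 2):]  # for odd n, the median is excluded
    return (statistics.median(lower_half),
            statistics.median(ranked),
            statistics.median(upper_half))

def percentile_rank(data, x):
    """Percentage of observations that are less than x."""
    return sum(v < x for v in data) / len(data) * 100

# Hypothetical ungrouped data (not from the textbook).
data = [12, 15, 18, 20, 22, 25, 28, 30, 35, 60]

q1, q2, q3 = quartiles(data)      # 18, 23.5, 30
iqr = q3 - q1                     # 12
lower_fence = q1 - 1.5 * iqr      # 0.0
upper_fence = q3 + 1.5 * iqr      # 48.0
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(q1, q2, q3, iqr, lower_fence, upper_fence, outliers)
print(percentile_rank(data, 25))  # 50.0
```

Here the value 60 lies above the upper inner fence, so it would be flagged as a possible outlier when sketching the box-and-whisker plot.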

Read the following sections in Chapter 3 of the textbook:

Section 3.5

Section 3.6

Note: Omit Appendix 3.1.

These videos provide alternative explanations and further exploration of the concepts and techniques presented in the assigned textbook readings.

Quartiles & Interquartile Range (https://www.youtube.com/watch?v=K3wsOqIqA6k&nohtml5=False) (Colette Tropp)

Intro to Box and Whisker Plots (https://www.youtube.com/watch?v=fJZv9YeQ-qQ) (Mashup Math)

How to Make Box & Whisker Plots (https://www.youtube.com/watch?v=mhaGAaL6Abw) (The Organic Chemistry Tutor)

How to Read a Box Plot (https://www.youtube.com/watch?v=AhFNDONky0w) (SmithMathAcademy)

Comparing Box Plots (https://www.youtube.com/watch?v=pX_gi2sgbxQ) (Mark Dolan)

Understanding & Comparing Boxplots (https://www.youtube.com/watch?v=Hm6Mra5XJSs) (Box & Whisker Plots) (MATHRoberg)

Box Plot & Skew (https://www.youtube.com/watch?v=L68iieC2Vgc) (Mona Schraer)

The Five Number Summary, Boxplots, and Outliers (https://www.youtube.com/watch?v=tpToLyZibKM) (Simple Learning Pro)

1. Complete the following exercises from Chapter 3 of the textbook (page numbers are for the downloadable eText):

Exercises 3.69 and 3.73 on page 112

Exercises 3.77 and 3.79 on page 115

Supplementary Exercise 3.81 on page 117

Supplementary Exercises 3.83, 3.85, 3.87, 3.89, and 3.91 on page 118

Solutions are provided in the Student Solutions Manual for Chapter 3 (interactive textbook) and on pages AN5 and AN6 in the Answers to Selected Odd-Numbered Exercises section (downloadable eText).

2. Complete the Self-Review Test for Chapter 3 (pages 122–124 of the downloadable eText). Omit questions 25 and 27.

3. Complete the Unit 1 Self-Test below.

4. Complete Assignment 1.

For extra practice with the material presented in this section, you can complete the following questions and exercises, for which

the solutions are provided in the textbook:

1. Any odd-numbered chapter-section practice questions and Supplementary Exercises that are not assigned above

2. The odd-numbered Advanced Exercises found at the end of Chapter 3 (pages 119–120 of the downloadable eText)


Once you have completed the Unit 1 Self-Test below, complete Assignment 1. You can access the assignment in the Assessment

section of the course home page. Once you have completed the assignment, submit it to your tutor for marking using the drop

box on the page for Assignment 2.

The self-test questions are shown here for your information. Download the Unit 1 Self-Test (https://fst-course.athabascau.ca/science/math/215/r10/self_test/self_test01.html) document and write out your answers. Show all your work and keep your calculations to four decimal places. You can access the solutions to this self-test on the course home page.

1. The following short survey was completed by nine randomly selected regular, paying customers who shop at Savemore, a large supermarket located on the outskirts of a large city.

The survey responses of the nine randomly selected, regular customers are summarized in the table below.

a. Referring to the Savemore survey, fill in the blanks for each question below.

i. The name “Jackson” would be referred to as an _____________ (observation or element).

ii. The first income value of “85” would be referred to as an _____________ (observation or element).

iii. How many variables does the Savemore short survey focus on? _____

iv. The responses to the “residence” question would be classed as _____________ (qualitative or quantitative).

v. The responses to the “income” question would be classed as _____________ (qualitative or quantitative).

vi. The responses to the “visits” question would be classed as _____________ (discrete or continuous).

vii. The responses to the “income” question would be classed as _____________ (discrete or continuous).

viii. Data collected from the Savemore survey would be called _____________ (cross-section data or time-series data).

ix. The measure of central tendency most appropriate for summarizing the “residence” question responses would be _____________ (mean or median or mode).

x. If Savemore plans to use the nine customer survey responses to make decisions regarding ALL Savemore customers, this would be an example of _____________ (descriptive statistics or inferential statistics).

xi. Would mean or median be the better measure of central tendency for the responses to the “visits” data? Explain.

xii. Suppose the population distribution of incomes related to survey question 2 (above) has a mean of$70,000 and a standard deviation of $20,000. If the shape of the population distribution is unknown,___________% of the population earns between $30,000 and $110,000.

xiii. Suppose the population distribution of incomes related to survey question 2 (above) has a mean of$70,000 and a standard deviation of $20,000. If the shape of the population distribution is bell-shaped___________% of the population earns between $30,000 and $110,000.

xiv. Provide two possible reasons why Savemore did not use a census survey to collect information from itscustomers:________________________________________

________________________________________

xv. If Savemore selected the sample of customers in a way that ensured each customer had an equal chance ofbeing selected, this would be called a _____________________ (random sample or simple randomsample).

xvi. Suppose that Savemore plans to estimate the true average annual income (before taxes) for ALL its regular paying customers, based on computing the average annual income of the sample of nine regular paying customers surveyed. The fact that the mean income calculation of the sample will differ from the population mean income, because the sample is a subset of the population, leads to a _____________________ (sampling error or non-sampling error).

xvii. If Savemore selected the sample of customers in a way that ensured that three customers were randomly selected from rural residences and six customers were selected from urban areas, this would be an example of a _____________________ (systematic random sample or stratified random sample).

b. Referring to the Savemore survey responses, determine the median customer monthly visits.

c. Referring to the Savemore survey responses, determine the mean customer monthly visits. By comparing the mean with the median customer monthly visits, determine if the distribution of the visits is positively or negatively skewed. Explain. In computing the mean, keep your work to four decimals.

d. Calculate the standard deviation for the customer monthly visits. Use the short-cut method. Keep your work to four decimals.

e. Referring to the Savemore survey responses, which variable, income or visits, exhibits greater relative variability? Show the appropriate calculations. Hint: The standard deviation of the nine annual incomes (000’s) equals 7.8899.

f. Compute the first quartile (Q1) and third quartile (Q3) for customer monthly visits. Interpret your answers.

g. Calculate the interquartile range for monthly customer visits. Interpret your answer.

h. Calculate the percentile rank of the customer who typically makes 10 visits per month. Interpret your answer.

i. Construct a stem-and-leaf display for the annual customer income data. Display the leaves in ascending order.

j. Sketch a box-and-whisker plot for the annual customer family income data. In your sketch, indicate all the quartiles and the minimum and maximum values. Using the appropriate computations, determine if there are mild outliers.

2. A government healthcare clinic that serves older adults is trying to decide whether to open up a clinic location in a new subdivision. The clinic surveyed a random sample of 20 homeowners and recorded their ages as follows:

a. Construct a frequency distribution for the ages above, using a lower limit for the first class of 40 and a class width of 10. In your distribution, include the class limits, the class boundaries, the class midpoints, the frequency, the relative frequency and the cumulative frequency.

b. Construct a percentage histogram for the frequency distribution above. Use a ruler and the grid provided below.


c. Is the distribution of the ages for the 20 homeowners described above skewed? If so, in which direction is the skew? Does the skewness of the age data support the decision to open a new clinic location in this subdivision? Explain.

d. Construct an ogive for the frequency distribution above. Use a ruler and the grid provided below. Hint: An ogive is a curve that describes the cumulative frequencies. It does this by joining with lines the dots marked above the upper boundaries of classes at heights that are equal to the cumulative frequencies of the respective classes.

e. What percentage of the twenty ages are below 60?

f. What percentage of the twenty ages are 70 or above?

3. The table below describes the commute times for all 30 employees working at a car dealership.

a. Would you classify the 30 commute times as sample or population data? Explain.

b. Compute the mean, standard deviation and variance for the commute times displayed in the distribution above. Use the short-cut method when computing the variance and the standard deviation.

c. Based on your observation of the skewness of the distribution of the commute times, is the median commute time smaller or larger than the mean commute time? Explain.

4. The Golf Depot sold all 100 KPOW-SUPRA golf sets at different prices during the 2018 golf season, as follows.


Compute the overall average price (per golf set) for the 100 KPOW-SUPRA golf sets sold during the 2018 golf season.

5. The mean annual income of all five of the doctors employed in a small medical center is $260,000. The annual incomes of three of these five doctors are $210,000, $250,000 and $275,000. Find the annual income of the fourth and fifth doctors, assuming these two doctors make the same annual income. Hint: In getting started with your solution, let the annual income of the fourth doctor be $X.

Mann, Prem S. Introductory Statistics, 8th ed. Wiley, 2012. [VitalSource].


MATH 215 C10 - Self-Test Answer Key: Unit 1

Show all your work and keep your calculations to four decimal places.

1. The following short survey was completed by nine randomly selected regular, paying customers who shop at Savemore, a large supermarket located on the outskirts of a large city.

The survey responses of the nine randomly selected, regular customers are summarized in the table below.

a. Referring to the Savemore survey, fill in the blanks for each question below.

i. The name “Jackson” would be referred to as an element.

ii. The first income value of “85” would be referred to as an observation.

iii. How many variables does the Savemore short survey focus on? Three.

iv. The responses to the “residence” question would be classed as qualitative.

v. The responses to the “income” question would be classed as quantitative.

vi. The responses to the “visits” question would be classed as discrete.

vii. The responses to the “income” question would be classed as continuous.

viii. Data collected from the Savemore survey would be called cross-section data.

ix. The measure of central tendency most appropriate for summarizing the “residence” question responses would be the mode [as it is qualitative].

x. If Savemore plans to use the nine customer survey responses to make decisions regarding ALL Savemore customers, this would be an example of inferential statistics.

xi. Would mean or median be the better measure of central tendency for the responses to the “visits” data? Explain.

Answer: Median. Since there are two non-typical, extremely large numbers of visits, 22 and 24, the data is positively skewed. When the data is skewed, the median is the better measure because it is not affected by the non-typical extreme values, whereas the mean is.

xii. Suppose the population distribution of incomes related to survey question 2 (above) has a mean of $70,000 and a standard deviation of $20,000. If the shape of the population distribution is unknown, at least 75% of the population earns between $30,000 and $110,000.

Further explanation: This answer is based on Chebyshev’s theorem and the fact that the interval $30,000 to $110,000 is within 2 standard deviations of the mean of $70,000. In this case, k = 2, so at least (1 − 1/k²) × 100% = (1 − 1/4) × 100% = 75% of the values fall in this interval.

xiii. Suppose the population distribution of incomes related to survey question 2 (above) has a mean of $70,000 and a standard deviation of $20,000. If the shape of the population distribution is bell-shaped, 95% of the population earns between $30,000 and $110,000.

Further explanation: This answer is based on the Empirical rule and the fact that the interval $30,000 to $110,000 is within 2 standard deviations of the mean of $70,000.
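The two answers above can be checked with a short sketch. The function name `chebyshev_min_fraction` and the `EMPIRICAL_RULE` lookup table are illustrative assumptions, not part of the course materials; the mean, standard deviation, and interval come from the question.

```python
def chebyshev_min_fraction(k: float) -> float:
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k**2


# Approximate percentages for a bell-shaped distribution (Empirical rule).
EMPIRICAL_RULE = {1: 68.0, 2: 95.0, 3: 99.7}

mean_income, sd_income = 70_000, 20_000
low, high = 30_000, 110_000

# Number of standard deviations from the mean to each end of the interval.
k = (high - mean_income) / sd_income
assert k == (mean_income - low) / sd_income  # interval is symmetric

chebyshev_pct = chebyshev_min_fraction(k) * 100   # shape unknown
empirical_pct = EMPIRICAL_RULE[int(k)]            # bell-shaped

print(f"k = {k}: at least {chebyshev_pct:.0f}% (Chebyshev), "
      f"about {empirical_pct:.0f}% (Empirical rule)")
```

Chebyshev's figure is a lower bound that holds for any shape, which is why it is smaller than the bell-shaped figure.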

xiv. Provide two possible reasons why Savemore did not use a census survey to collect information from its customers:

A census survey is very expensive and, at the same time, it may prove almost impossible to get a response from all the customers, unless the collection of data is done over a very long time, which is impractical.

xv. If Savemore selected the sample of customers in a way that ensured each customer had an equal chance of being selected, this would be called a simple random sample.

xvi. Suppose that Savemore plans to estimate the true average annual income (before taxes) for ALL its regular paying customers, based on computing the average annual income of the sample of nine regular paying customers surveyed. The fact that the mean income calculation of the sample will differ from the population mean income, because the sample is a subset of the population, leads to a sampling error.

xvii. If Savemore selected the sample of customers in a way that ensured that three customers were randomly selected from rural residences and six customers were selected from urban areas, this would be an example of a stratified random sample.

b. Referring to the Savemore survey responses, determine the median customer monthly visits.

Solution:

Step 1: Order the data:

4 4 5 7 8 9 10 22 24

Step 2: Median = middle value = 8 visits

c. Referring to the Savemore survey responses, determine the mean customer monthly visits. By comparing the mean with the median customer monthly visits, determine if the distribution of the visits is positively or negatively skewed. Explain. In computing the mean, keep your work to four decimals.

Solution:


Mean = Σx / n = (4 + 4 + 5 + 7 + 8 + 9 + 10 + 22 + 24) / 9 = 93 / 9 = 10.3333 visits

Since the mean (10.3333) is greater than the median (8), the distribution of the visits is positively skewed. This means that the majority of the data is less than the mean.
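As a quick check of parts b and c, the following sketch recomputes the median and mean from the ordered visits data shown in part b, using Python's standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, median

# Ordered monthly visits for the nine surveyed customers (from part b).
visits = [4, 4, 5, 7, 8, 9, 10, 22, 24]

visits_median = median(visits)          # middle value of nine ordered values
visits_mean = round(mean(visits), 4)    # 93 / 9, kept to four decimals

# A mean above the median suggests a longer right tail (positive skew).
if visits_mean > visits_median:
    skew = "positively skewed"
elif visits_mean < visits_median:
    skew = "negatively skewed"
else:
    skew = "roughly symmetric"

print(visits_median, visits_mean, skew)
```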

d. Calculate the standard deviation for the customer monthly visits. Use the short-cut method. Keep your work to four decimals.

Solution:

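Refer to the column sums in part c. above (Σx = 93 and Σx² = 1,411, with n = 9) to understand the following:

s = √[(Σx² − (Σx)²/n) / (n − 1)] = √[(1,411 − 93²/9) / 8] = √(450 / 8) = √56.25 = 7.5 visits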
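The short-cut method for a sample standard deviation can be sketched as follows, assuming the nine visits values listed in part b. It uses only the two column sums, Σx and Σx², and cross-checks the result against `statistics.stdev`; the variable names are illustrative.

```python
from math import isclose, sqrt
from statistics import stdev

visits = [4, 4, 5, 7, 8, 9, 10, 22, 24]

n = len(visits)
sum_x = sum(visits)                    # Σx
sum_x2 = sum(x * x for x in visits)    # Σx²

# Short-cut formula for a sample: s² = (Σx² − (Σx)²/n) / (n − 1)
sample_variance = (sum_x2 - sum_x**2 / n) / (n - 1)
sample_sd = sqrt(sample_variance)

# The definitional formula gives the same answer.
assert isclose(sample_sd, stdev(visits))

print(sum_x, sum_x2, sample_variance, sample_sd)
```

The divisor is n − 1 because the nine customers are a sample; for population data the divisor would be N.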

e. Referring to the Savemore survey responses, which variable, income or visits, exhibits greater relative variability? Show the appropriate calculations. Hint: The standard deviation of the nine annual incomes (000’s) equals 7.8899.

Solution:

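The coefficient of variation (CV) measures relative variability as follows:

CV = (s / x̄) × 100%

For visits: CV = (7.5 / 10.3333) × 100% = 72.5809%

To find CV for income, we first must find the mean for income as follows:

x̄ = Σx / n = 633 / 9 = 70.3333 (000’s)

For income: CV = (7.8899 / 70.3333) × 100% = 11.2179%

Conclusion: The visits variable exhibits a greater degree of relative variability (72.5809% versus 11.2179%).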
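The comparison can be reproduced with a small sketch. It assumes the visits and income values listed in the ordered data of parts b and i, and it rounds the mean and standard deviation to four decimals before dividing, which is how the answer key's figures appear to be obtained. The helper name `coefficient_of_variation` is an assumption for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(data):
    """Sample CV in percent, rounding intermediate results to four decimals."""
    x_bar = round(mean(data), 4)
    s = round(stdev(data), 4)
    return round(s / x_bar * 100, 4)


visits = [4, 4, 5, 7, 8, 9, 10, 22, 24]
incomes = [60, 64, 65, 66, 68, 72, 75, 78, 85]   # annual income in $000s

cv_visits = coefficient_of_variation(visits)
cv_incomes = coefficient_of_variation(incomes)

more_variable = "visits" if cv_visits > cv_incomes else "income"
print(cv_visits, cv_incomes, more_variable)
```

Because CV is unit-free, it allows visits (counts) and income (thousands of dollars) to be compared directly.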

f. Compute the first quartile (Q1) and third quartile (Q3) for customer monthly visits. Interpret your answers.


Solution:

Step 1: Order the data:

4 4 5 7 8 9 10 22 24

Step 2: Find Q2 = median = middle value = 8 visits

Step 3: Find Q1, which is the median of all the values less than Q2:

4 4 5 7

Q1 = (4 + 5) / 2 = 4.5, which means that approximately ¹⁄₄ or 25% of the customers make fewer than 4.5 visits to Savemore per month.

Step 4: Find Q3, which is the median of all the values greater than Q2:

9 10 22 24

Q3 = (10 + 22) / 2 = 16, which means that approximately ³⁄₄ or 75% of the customers make fewer than 16 visits to Savemore per month.
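The steps above can be sketched in code. This follows the "median of each half" approach used in the solution, leaving the middle value out of both halves when the number of observations is odd; other quartile conventions exist and can give slightly different values. The function name `quartiles_by_halves` is an illustrative assumption.

```python
from statistics import median


def quartiles_by_halves(data):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


visits = [4, 4, 5, 7, 8, 9, 10, 22, 24]
q1, q2, q3 = quartiles_by_halves(visits)
print(q1, q2, q3)
```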

g. Calculate the interquartile range (IQR) for monthly customer visits. Interpret your answer.

Solution:

IQR = Q3 − Q1 = 16 − 4.5 = 11.5 visits

The middle 50% of the customer visits lie within a range of 11.5 visits. It is a measure of variation (dispersion) for the middle 50% of the values in a distribution. The larger this range, the more variable the data.

h. Calculate the percentile rank of the customer who typically makes 10 visits per month. Interpret your answer.

Solution:

Step 1: Order the data:

4 4 5 7 8 9 10 22 24

Step 2: Find the percentile rank:

Since 6 of the 9 visit values are less than 10, the percentile rank is (6 / 9) × 100 = 66.67%, or approximately 67%. This means that about 67% of the customer visits are less than 10 visits per month.
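A minimal sketch of the percentile rank calculation, assuming the definition used in the solution (the share of values strictly less than the value of interest). The function name `percentile_rank` is illustrative.

```python
def percentile_rank(data, value):
    """Percentage of values in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return below / len(data) * 100


visits = [4, 4, 5, 7, 8, 9, 10, 22, 24]
rank_of_10 = percentile_rank(visits, 10)

print(f"{rank_of_10:.2f}% of visit values are below 10")
```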

i. Construct a stem-and-leaf display for the annual customer income data. Display the leaves in ascending order.

Solution:

Order the data:

60 64 65 66 68 72 75 78 85

Stem-and-Leaf Display for Income (000’s)

6 | 0 4 5 6 8
7 | 2 5 8
8 | 5
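A stem-and-leaf display for two-digit values can be generated with a short sketch like the one below, which uses the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is an illustrative assumption, and the data are the ordered incomes listed above.

```python
from collections import defaultdict


def stem_and_leaf(data):
    """Return display rows with tens digits as stems and units digits as leaves."""
    stems = defaultdict(list)
    for x in sorted(data):          # sorting keeps leaves in ascending order
        stems[x // 10].append(x % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


incomes = [60, 64, 65, 66, 68, 72, 75, 78, 85]   # annual income in $000s
rows = stem_and_leaf(incomes)
print("\n".join(rows))
```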

j. Sketch a box-and-whisker plot for the annual customer family income data. In your sketch, indicate all the quartiles and the minimum and maximum values. Using the appropriate computations, determine if there are mild outliers.

Solution:

Box-and-Whisker Plot for Income (000’s)


Based on the plot: the minimum is 60; the maximum is 85; Q1 is 64.5; Q2 is 68; Q3 is 76.5.

The data is positively skewed, as the median is to the left of the center of the box, and the right whisker is longer than the left whisker.

The IQR = Q3 − Q1 = 76.5 − 64.5 = 12, and 1.5 × IQR = 18. The lower inner fence = Q1 − 1.5 × IQR = 64.5 − 18 = 46.5. The upper inner fence = Q3 + 1.5 × IQR = 76.5 + 18 = 94.5.

Since none of the data values is below 46.5 or above 94.5, there are no outliers.
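The outlier check can be sketched as follows, assuming the ordered income data from part i and the quartile values 64.5 and 76.5 stated in the solution. The inner fences sit 1.5 × IQR beyond the first and third quartiles; the variable names are illustrative.

```python
incomes = [60, 64, 65, 66, 68, 72, 75, 78, 85]   # annual income in $000s

q1, q3 = 64.5, 76.5            # quartiles stated in the solution
iqr = q3 - q1                  # spread of the middle 50% of incomes

lower_inner_fence = q1 - 1.5 * iqr
upper_inner_fence = q3 + 1.5 * iqr

# Any value outside the inner fences would be flagged as an outlier.
outliers = [x for x in incomes
            if x < lower_inner_fence or x > upper_inner_fence]

print(iqr, lower_inner_fence, upper_inner_fence, outliers)
```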

2. A government healthcare clinic that serves older adults is trying to decide whether to open up a clinic location in a new subdivision. The clinic surveyed a random sample of 20 homeowners and recorded their ages as follows:

a. Construct a frequency distribution for the ages above, using a lower limit for the first class of 40 and a class width of 10. In your distribution, include the class limits, the class boundaries, the class midpoints, the frequency, the relative frequency and the cumulative frequency.

Solution:

b. Construct a percentage histogram for the frequency distribution above. Use a ruler and the grid provided below.

Solution:


c. Is the distribution of the ages for the 20 homeowners described above skewed? If so, in which direction is the skew? Does the skewness of the age data support the decision to open a new clinic location in this subdivision? Explain.

Solution:

The ages are negatively skewed. This means that the majority of the ages are above average, which supports the decision to locate in this subdivision.

d. Construct an ogive for the frequency distribution above. Use a ruler and the grid provided below. Hint: An ogive is a curve that describes the cumulative frequencies. It does this by joining with lines the dots marked above the upper boundaries of classes at heights that are equal to the cumulative frequencies of the respective classes.

e. What percentage of the twenty ages are below 60?

Solution:

Based on the Cumulative Frequency column in the frequency table above, or based on the ogive above, 5 of the 20 ages, or ⁵⁄₂₀ = 25%, are below 60.

f. What percentage of the twenty ages are 70 or above?

Solution:

Based on the Cumulative Frequency column in the frequency table above, or based on the ogive above, 9 of the ages are below 70, which means 11 of the 20 ages, or ¹¹⁄₂₀ = 55%, are 70 or above.

3. The table below describes the commute times for all 30 employees working at a car dealership.

a. Would you classify the 30 commute times as sample or population data? Explain.

Solution:

The 30 commute times are classed as population data, as this data includes ALL thirty employees of the car dealership.

b. Compute the mean, standard deviation and variance for the commute times displayed in the distribution above. Use the short-cut method when computing the variance and the standard deviation.

Solution:

c. Based on your observation of the skewness of the distribution of the commute times, is the median commute time smaller or larger than the mean commute time? Explain.

Solution:


© Athabasca University

The commute times are positively skewed, which means that the majority of the times are below average. This implies that the mean exceeds the median.

4. The Golf Depot sold all 100 KPOW-SUPRA golf sets at different prices during the 2018 golf season, as follows.

Compute the overall average price (per golf set) for the 100 KPOW-SUPRA golf sets sold during the 2018 golf season.

Solution:

Find the weighted mean, where x = price and w = number of golf sets sold at that price:

Weighted mean = Σxw / Σw

5. The mean annual income of all five of the doctors employed in a small medical center is $260,000. The annual incomes of three of these five doctors are $210,000, $250,000 and $275,000. Find the annual income of the fourth and fifth doctors, assuming these two doctors make the same annual income. Hint: In getting started with your solution, let the annual income of the fourth doctor be $X.

Solution:
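One way to work this problem, sketched under the assumption stated in the question that the fourth and fifth doctors earn the same amount X: the total of all five incomes must equal 5 × $260,000, so 2X is whatever remains after subtracting the three known incomes. The variable names are illustrative.

```python
mean_income = 260_000
n_doctors = 5
known_incomes = [210_000, 250_000, 275_000]

# The mean times the count gives the total of all five incomes.
total_income = mean_income * n_doctors

# The fourth and fifth doctors share the remainder equally: 2X = remainder.
remainder = total_income - sum(known_incomes)
x = remainder / 2

# Check: the five incomes reproduce the stated mean.
assert (sum(known_incomes) + 2 * x) / n_doctors == mean_income

print(f"Each of the two remaining doctors earns ${x:,.0f}")
```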
