
Page 1: Final project Math 1040 - Maria Davilamariadavila.weebly.com/.../math1040final_project_recovered.pdf · Final project Math 1040 [FINAL PROJECT MATH 1040] April 30, 2014 2 ... Conclusion

2014

Maria Davila

Salt Lake Community College

4/30/2014

Final project Math 1040


[FINAL PROJECT MATH 1040] April 30, 2014


Term Project

Math 1040

As a Criminal Justice student, I am required to take Math 1040, and it is an important part of my career preparation. For this term project, I chose the Body Measurement data set from the four options given by my instructor. This data set contains measurements of different parts of the human body, such as chest, elbow, wrist and ankle diameter, among several others. It also includes categorical data, such as gender, that will help me interpret the data accurately.

The objective of this assignment is to pull together many of the concepts learned this semester, such as collecting samples, organizing data, and interpreting data using graphs and conclusions drawn from calculated results. Using the information provided by the "Body Measurements" data set, I will select ONE CATEGORICAL variable to build a Pie Chart and a Pareto Chart, and I will use two different sampling methods learned in this course. For one quantitative variable, I will compute the mean, the standard deviation and the five-number summary, and use them to build a frequency histogram and a box plot. At the end of this project, I will construct confidence intervals for the data and test a hypothesis about a population parameter at a chosen level of significance. I will also include the concept of correlation between two variables to illustrate any association between them.


PART I. ANALYSIS OF CATEGORICAL VARIABLE -GENDER-

STEP I. GRAPHING THE DATA.

PIE CHART

Using the categorical variable "Gender" from the data set, I built a Pie Chart that shows the percentage of males and females participating in this study. I used the entire population for this variable. As a result, I found that 48.72% of the population is male and 51.28% is female. Blue represents females and red represents males.
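
The percentages in the pie chart can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the counts reported in the comparison section of this project (260 females, 247 males), and the variable names are my own.

```python
# Counts of each gender in the data set (260 females and 247 males, as
# reported in the comparison section); the names are illustrative.
counts = {"female": 260, "male": 247}
total = sum(counts.values())

# Percentage of the population in each category, rounded to two decimals.
percentages = {gender: round(100 * count / total, 2)
               for gender, count in counts.items()}

for gender, pct in percentages.items():
    print(f"{gender}: {pct}%")
```

The two percentages add up to 100% because every participant falls into exactly one of the two categories.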

BODY MEASUREMENT.

Gender (Categorical Variable for entire population)


PARETO CHART

This graph shows the same categorical variable as the Pie Chart above. A Pareto Chart is a good option for illustrating and comparing the data because it shows the frequency of each category of the variable Gender (females are coded as 0 and males as 1). We also observe that there is not much difference between the genders in their participation in the study.

BODY MEASUREMENT.

Gender (Categorical Variable for entire population)


STEP II. USING SAMPLING METHODS

SIMPLE RANDOM SAMPLE.

Pie Chart.

This is a Simple Random Sample of 35 values taken from the entire population of the Gender variable. I carried out this process with my TI-83 Plus calculator (press MATH, select PRB, choose 5:randInt(, enter the values needed for the calculation, then press ENTER).
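
For readers without a TI-83 Plus, the same idea can be sketched in Python. This is an illustration under stated assumptions, not the procedure used in the project: the function name and the seed are hypothetical, and `random.sample` draws without replacement, whereas randInt( can return the same row number more than once.

```python
import random

POPULATION_SIZE = 507  # number of participants in the data set
SAMPLE_SIZE = 35       # sample size used in this project


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted 1-based row numbers chosen at random without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


# The seed is arbitrary; it only makes the illustration repeatable.
rows = simple_random_sample(POPULATION_SIZE, SAMPLE_SIZE, seed=1040)
print(rows)
```

Each row number identifies one participant whose Gender value goes into the sample.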

Pareto Chart.

This graph was made using the same sampling process and calculation as the Pie Chart.


Systematic Sampling Method.

Pie Chart.

In this case, I selected a starting point in the categorical data set (Gender) and then selected one value out of every fourteen elements. In total, I selected 35 values manually.
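
The manual selection can also be written as a short routine. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the starting row of 14 is borrowed from the Height section of this project because no starting point is stated for Gender.

```python
def systematic_sample(population_size, start, step, sample_size):
    """Return 1-based row numbers: start, start + step, start + 2*step, ..."""
    rows = list(range(start, population_size + 1, step))
    return rows[:sample_size]


# Assumed values: 507 rows, start at row 14, take every 14th row, keep 35 rows.
rows = systematic_sample(507, start=14, step=14, sample_size=35)
print(rows[0], rows[-1], len(rows))
```

With a step of 14, the 35th selected row is still inside the 507 rows of the data set, so the sample never runs past the end of the list.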

Systematic Sampling Method

Pareto Chart

This is the same information contained in the Pie Chart, but represented in a Pareto chart.


ANALYSIS OF QUANTITATIVE VARIABLE -HEIGHT-

The summary statistics table gives the mean and standard deviation for the entire population of the quantitative variable Height. It also includes the five-number summary: the minimum, the first quartile (Q1), the median, the third quartile (Q3) and the maximum. The quartiles are measures of location that divide the data into four groups with about 25% of the values in each group. The graph used to illustrate the five-number summary is the Box Plot, shown in the next picture:

Summary statistics:

Column   n     Mean        Std. dev.   Median   Min     Max     Q1      Q3
Height   507   171.14379   9.4072052   170.3    147.2   198.1   163.8   177.8
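
The same kind of summary can be sketched with Python's standard `statistics` module. The heights below are hypothetical and are NOT taken from the Body Measurements data set, and the function name is my own. Quartile conventions also differ between tools, so Q1 and Q3 from this sketch may not match another program exactly.

```python
import statistics


def summary_statistics(values):
    """Return n, mean, sample standard deviation and the five-number summary."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # divides by n - 1
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical heights in cm, for illustration only.
heights = [160.0, 165.1, 170.3, 172.7, 177.8, 180.3, 185.4]
summary = summary_statistics(heights)
for name, value in summary.items():
    print(name, value)
```

`statistics.stdev` uses the sample formula (dividing by n - 1); `statistics.pstdev` would be the choice if the population formula is intended.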

The Box Plot represents the values in the summary statistics table as a graph.


Frequency Histogram

This histogram shows relative frequencies (percentages) on its vertical scale instead of actual frequencies. The graph is roughly bell shaped: the frequencies increase to a maximum and then decrease, which suggests that the data are approximately normally distributed. The graph also needs to be approximately symmetric to meet the requirements for a normal distribution.

Sampling methods

Having calculated the summary measurements for the entire population of the variable Height, I applied the same two sampling methods described earlier for the Gender variable. I then calculated the mean, the standard deviation and the five-number summary for each sample, and built the corresponding graphs, as I did for the entire population.

Simple Random Sample.

I used StatCrunch to obtain the simple random sample from the quantitative variable Height. It was very easy to use: I clicked on the Data tab, selected Sample and entered the values needed to obtain the sample. The next table shows the mean, the standard deviation and the five-number summary for this simple random sample.

Summary statistics:

Column           n    Mean        Std. dev.   Median   Min     Max     Q1      Q3
Sample(Height)   35   171.62571   9.4314662   170.2    151.1   187.2   166.4   179.8


Graphs for simple random sample of quantitative variable (Height)

This is the Box Plot corresponding to the summary statistics for the simple random sample of the variable Height.

This is the Frequency Histogram for the same simple random sample of the quantitative variable Height. The shape of this histogram is slightly skewed to the left.


Systematic Sample.

I selected a starting point in the quantitative variable Height, in this case element number 14, and then chose every 14th element after it. I then calculated the summary statistics and built a box plot and a frequency histogram to graph the results.

Summary statistics:

Column               n    Mean        Std. dev.   Median   Min     Max   Q1      Q3
syst.height sample   35   172.33143   9.4456268   173.2    155.8   188   164.4   180.3


This graph shows the results for the systematic sample taken from the Height variable. It is slightly skewed to the left, like the frequency histogram of the simple random sample.


Comparison analysis of the Histograms.

Gender Variable (categorical data)

This variable contains only two values: female and male. After the calculations and the construction of the graphs, I concluded that:

1. The Pareto chart showed that there are 260 females and 247 males.

2. The simple random sample was generated with technology (TI-83 Plus calculator), and its values were shown in Pareto and Pie charts, which give results very similar to those obtained from the entire population of the variable Gender.

Height Variable (quantitative data)

A histogram is a bar graph that represents quantitative data: its vertical scale shows frequencies and its horizontal scale shows classes of quantitative data values. After taking samples from the entire population of the Height variable in two different ways, I found that:

1. The simple random sample histogram is slightly skewed to the left: the frequencies build up gradually from the minimum value (which is not zero) toward the higher values, but they do not fall off symmetrically afterward as they would in a normal distribution.

2. The systematic sample histogram is also very slightly skewed to the left. Taking into consideration that the values were taken at regular intervals of 14 elements, this was an expected result.

In summary, the quantitative variable (Height) gives us statistical information such as the mean and standard deviation because its values are numbers that represent measurements, in this case the heights of the participants. The categorical data, in contrast, use numbers only to identify two categories (Gender: female and male). Those values do not represent any measurement; they are used as labels. Because numbers play a different role in each variable, I obtained different results and graphs, which were interpreted in different ways.


PART II. CONFIDENCE LEVEL.

Data Analysis Project Worksheet

Section 1

Population

Categorical Variable : GENDER

All Values of the Categorical Variable: 1: MALE AND 0: FEMALE

Choose one of the above values to use in Part 4 and Part 5 of the project.

FEMALE

p = 260/507 ≈ 0.5128

Sample 1        Sample 2
n = 35          n = 35
x = 7           x = 14
p̂ = 0.20        p̂ = 0.40


Section 2

Population

Quantitative Variable HEIGHT

μ = 171.14379

σ = 9.40720

Sample 1            Sample 2
n = 35              n = 35
x̄ = 171.62571       x̄ = 172.33143
s = 9.43147         s = 9.44562


Using the data contained in the worksheet, I will create confidence intervals for the population proportion from each of the samples taken from the categorical variable Gender.

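Confidence interval for Sample One

$p = \hat{p} \pm E$

$0.068 < p < 0.332$

Margin of error calculation

$E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 0.132$

Confidence interval for Sample Two

$p = \hat{p} \pm E$

$0.238 < p < 0.562$

Margin of error calculation

$E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 0.162$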
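
The margin of error for a proportion can be reproduced with a short calculation. The sketch below is illustrative: the function name is hypothetical, and the critical value z = 1.96 (a 95% confidence level) is an assumption, chosen because it gives margins close to the E values reported for both samples.

```python
import math


def proportion_interval(x, n, z=1.96):
    """Return (p_hat, margin, lower, upper) for a population proportion.

    z = 1.96 is an assumed critical value (95% confidence).
    """
    p_hat = x / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin, p_hat - margin, p_hat + margin


# x and n for the two Gender samples come from the worksheet in this project.
p1, e1, low1, high1 = proportion_interval(7, 35)
p2, e2, low2, high2 = proportion_interval(14, 35)

print(f"Sample 1: p_hat={p1:.2f}, E={e1:.4f}, {low1:.4f} < p < {high1:.4f}")
print(f"Sample 2: p_hat={p2:.2f}, E={e2:.4f}, {low2:.4f} < p < {high2:.4f}")
```

Each interval is centered on the sample proportion, so its endpoints are simply the sample proportion minus and plus the margin of error.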


Step II. Confidence interval for population mean of quantitative data.

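Sample One

$\bar{x} \pm E$

$171.62571 \pm 3.2389$

$168.3871 < \mu < 174.8649$

Margin of error calculation

$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} = 3.2389$

Sample Two

$\bar{x} \pm E$

$172.33143 \pm 3.2458$

$169.08563 < \mu < 175.57723$

Margin of error calculation

$E = t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} = 3.2458$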
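
A similar sketch works for the mean. It assumes the t critical value 2.032 that is quoted in the hypothesis test section of this project; whether exactly that value was used for these intervals is an assumption, so the results are only expected to land close to the reported margins. The function name is hypothetical.

```python
import math


def mean_interval(x_bar, s, n, t_crit):
    """Return (margin, lower, upper) for a population mean using a t critical value."""
    margin = t_crit * s / math.sqrt(n)
    return margin, x_bar - margin, x_bar + margin


T_CRIT = 2.032  # assumed critical value, taken from the hypothesis test section

# Sample means and standard deviations come from the worksheet in this project.
e1, low1, high1 = mean_interval(171.62571, 9.43147, 35, T_CRIT)
e2, low2, high2 = mean_interval(172.33143, 9.44562, 35, T_CRIT)

print(f"Sample 1: E={e1:.4f}, {low1:.4f} < mu < {high1:.4f}")
print(f"Sample 2: E={e2:.4f}, {low2:.4f} < mu < {high2:.4f}")
```

Small differences in the last decimals can come from rounding the critical value or the standard deviation at different steps.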


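Step III. Confidence interval for population standard deviation of quantitative data.

Sample One

$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$

$8.0232 < \sigma < 13.4202$

Sample Two

$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$

$8.0368 < \sigma < 13.4757$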
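
The interval for the standard deviation depends on two chi-square critical values that this project does not list. The sketch below therefore rests on an assumption: it uses the df = 30 row of a standard 95% chi-square table (16.791 and 46.979), since many printed tables do not list 34 degrees of freedom, and that choice gives bounds close to those reported for Sample One. The function name is hypothetical.

```python
import math


def std_dev_interval(s, n, chi2_left, chi2_right):
    """Return (lower, upper) bounds for a population standard deviation."""
    df = n - 1
    lower = math.sqrt(df * s ** 2 / chi2_right)
    upper = math.sqrt(df * s ** 2 / chi2_left)
    return lower, upper


# Assumed table values (df = 30 row, 95% confidence); not stated in the project.
CHI2_LEFT = 16.791
CHI2_RIGHT = 46.979

low, high = std_dev_interval(9.43147, 35, CHI2_LEFT, CHI2_RIGHT)
print(f"{low:.3f} < sigma < {high:.3f}")
```

The larger critical value goes in the denominator of the lower bound, which is why the right-tail value produces the smaller endpoint.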


Reflection about confidence levels.

A confidence interval may be used to estimate a population parameter, such as the mean or the standard deviation, or to check a claim about it. The calculations above show confidence intervals for two variables: one categorical and one quantitative.

Regarding the categorical data, its numbers (0 and 1) are labels rather than measurements, so calculations that treat those codes as measurements, such as a mean or a standard deviation, are meaningless. What can be estimated for a categorical variable is the proportion of the population in a category.

The difference with the quantitative variable is that its numbers represent measurements, so any calculation based on those numbers has a meaning that I can use to draw a conclusion about the population. The intervals from both quantitative samples contain the corresponding population parameters (μ = 171.14379 and σ = 9.40720).

Step IV. Level of significance.

Since the categorical variable does not represent measurements, I will carry out the hypothesis test only with the quantitative variable.

Sample One

Test statistic:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

$t = 0.3023$

cv = 2.032

Fail to reject $H_0$.

Conclusion. The test statistic does not fall in either of the two tails, so we fail to reject the claim that the mean of the population is 171.14379.


Sample Two

Test statistic:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

$t = 0.7439$

cv = 2.032

Fail to reject $H_0$.

Conclusion. The test statistic does not fall in either of the two tails, so we fail to reject the claim that the mean of the population is 172.33143 − 1.18764 = 171.14379.
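
Both test statistics can be recomputed from the worksheet values. The sketch below assumes that the null hypothesis is that the population mean equals 171.14379 (the population mean from the worksheet), because that value reproduces the reported t statistics; the function and variable names are hypothetical.

```python
import math


def t_statistic(x_bar, mu0, s, n):
    """Return the one-sample t statistic for H0: mu = mu0."""
    return (x_bar - mu0) / (s / math.sqrt(n))


MU0 = 171.14379  # assumed hypothesized mean (population mean from the worksheet)
T_CRIT = 2.032   # two-tailed critical value quoted in this section
N = 35

samples = {
    "Sample One": (171.62571, 9.43147),
    "Sample Two": (172.33143, 9.44562),
}

results = {}
for name, (x_bar, s) in samples.items():
    t = t_statistic(x_bar, MU0, s, N)
    decision = "reject H0" if abs(t) > T_CRIT else "fail to reject H0"
    results[name] = (t, decision)
    print(f"{name}: t = {t:.4f}, {decision}")
```

In a two-tailed test the null hypothesis is rejected only when the absolute value of t exceeds the critical value, which is the comparison made in the loop.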

Correlation and regression calculation.

Simple linear regression results:

Dependent Variable: Gender (1 - M, 0 - F)

Independent Variable: Height

Gender (1 - M, 0 - F) = -5.7448963 + 0.036414268 Height

Sample size: 507

R (correlation coefficient) = 0.68466211

cv = 0.196

Since the absolute value of R is greater than the critical value (cv), I conclude that there is a significant linear correlation between the gender and the height of a person.
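
The decision rule for the correlation and the fitted line can be illustrated with the numbers reported above. This is a sketch with hypothetical names; the rule it applies is the usual one, where a linear correlation is treated as significant when the absolute value of R exceeds the critical value.

```python
# Values reported in the regression output of this project.
INTERCEPT = -5.7448963
SLOPE = 0.036414268
R = 0.68466211
CRITICAL_R = 0.196


def predicted_gender_code(height_cm):
    """Return the fitted value of the 0/1 gender code for a given height in cm."""
    return INTERCEPT + SLOPE * height_cm


# A linear correlation is treated as significant when |R| exceeds the critical value.
significant = abs(R) > CRITICAL_R
r_squared = R ** 2

print("significant:", significant)
print("fitted code at the mean height:", round(predicted_gender_code(171.14379), 4))
```

At the mean height of 171.14379 the fitted value is close to the overall proportion of males, as expected for a least-squares line, which passes through the point of the two means.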


Final conclusion.

Numbers and calculations are an essential part of most of today's careers. Statistics is a very important tool, described as a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data, and it is used by many agencies, companies, manufacturers and sciences. Many fields, such as health, consumer research and even politics, use statistics to understand and interpret data; based on those results, it is also possible to predict future values or outcomes in a research study.

Through this project I learned methods to improve my math skills. I understood that math is part of every aspect of real life. We need calculations for everything, such as the cost of living and crime, political and student rates, among thousands more. Numbers are a wonderful tool for developing many other skills, such as statistical and analytical skills.
