Statistical Methods and Computing Final Project, by Corey Gingerich


Upload: andrea-skinner

Post on 28-Dec-2015


Page 1: Statistical Methods and Computing Final Project By: Corey Gingerich

Statistical Methods and Computing

Final Project

By: Corey Gingerich

Page 2

The Basis of the Project

In this project, I carried out an experiment that involves studying a particular genetic disorder caused when certain genes are deleted from the human genome during early cell division. This disease already has a long list of phenotypes, but until now obesity was never attributed to it. In this experiment, I will provide strong evidence that obesity is a direct symptom of the disorder. Using a two-sided t test comparing the weights of 76 knockout (-/-) mice and 76 wildtype (+/+) mice at 20 weeks of age, I will show that there is a significant difference in weight between having these genes in full effect and not having the genes at all.

Page 3

What the Heck Does It All Mean?

Phenotype – A physical characteristic or symptom of a disorder (primarily genetic).

Knockout (-/-) – The chromosomal area that contains the genes has been deleted.

Wildtype (+/+) – All of the genes that are normally present are there and active.

Mice? – The sample was taken from the world’s largest colony of mice with this disease, located in the University of Iowa med labs.

Page 4

A Brief History of the Disorder

First things first: because the results of this experiment haven’t been published yet, I cannot release the name of the disorder (sorry!).

This disease is present from birth and allows the patient to live a full life, but there are problems…

Phenotypes already known: blindness, sterility in males, hypogenitalism, polydactyly (extra fingers and toes), and low IQ.

Why Obesity?: Obesity has been noted in some human cases, but because of variable activity levels it could never be confirmed as a direct symptom. Mice allow scientists to study the weights of those affected in order to determine trends and epidemiology.

Page 5

And I Present, THE SUBJECT

Page 6

Overall Weight Trends in Previous Studies

[Chart: knockout weight and wildtype weight; vertical axis from 0 to 60]

Page 7

Comparing Means from the Two Different Populations at the Twentieth Week of Life

Knockout Mice (-/-)
N = 76
Mean = 38.9178
Std Dev = 7.8575
Std Error = 0.90131
95% confidence interval for the mean: (37.1225, 40.7133)

Wildtype Mice (+/+)
N = 76
Mean = 35.6197
Std Dev = 8.5863
Std Error = 0.98491
95% confidence interval for the mean: (33.6577, 37.5818)
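The standard errors and confidence intervals on this slide can be cross-checked from N, the mean, and the standard deviation alone. The sketch below is Python rather than the SAS used in the project; the function name `mean_ci` is hypothetical, and the critical value 1.9921 is an assumed t-table entry (97.5th percentile, 75 degrees of freedom), not a number taken from the slides.

```python
import math

# Assumption: t-table value for the 97.5th percentile with n - 1 = 75 df.
T_CRIT_75 = 1.9921

def mean_ci(n, mean, sd, t_crit):
    """Return (standard error, (lower, upper)) for a one-sample t interval."""
    se = sd / math.sqrt(n)       # standard error of the mean
    half_width = t_crit * se     # margin of error
    return se, (mean - half_width, mean + half_width)

# Summary statistics as reported on the slide (weights in grams).
se_k, ci_k = mean_ci(76, 38.9178, 7.8575, T_CRIT_75)   # knockout (-/-)
se_w, ci_w = mean_ci(76, 35.6197, 8.5863, T_CRIT_75)   # wildtype (+/+)

print(f"Knockout: SE = {se_k:.5f}, 95% CI = ({ci_k[0]:.4f}, {ci_k[1]:.4f})")
print(f"Wildtype: SE = {se_w:.5f}, 95% CI = ({ci_w[0]:.4f}, {ci_w[1]:.4f})")
```

Small differences in the last decimal place are expected, since the slide values were computed from the raw weights rather than from rounded summaries.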

Page 8

The Facts:

2 Populations of Interest: All mice with each of the two genotypes

The Observed Units: The mice

Variable of Interest: Weight of the mice

What Is the Data Type: Quantitative, continuous

The Population Parameters: The mean weights of the two genotypes of mice

Page 9

Checking the Two Sided t-test Assumptions

Since the two samples have equal sizes (n1 = n2 = 76), the rule of thumb states that the test will be quite accurate; with equal sample sizes it remains accurate even when n1 = n2 is as small as 5.

Also, since each sample size is greater than 40, the t procedure can be accurate even if the distribution of the sample is skewed.

T-tests can be affected by unusually small or large values (outliers). This data set has no significant outliers that will affect the outcome of the t-test.
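The slides do not say how the outlier check was done. One common screen is the 1.5 × IQR rule, sketched below in Python as an illustration only. The function name `iqr_outliers` is hypothetical, and the example weights are made up for the demonstration; they are not data from the study.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr      # outlier fences
    return [v for v in values if v < low or v > high]

# Made-up weights in grams, for illustration only; NOT data from the study.
example_weights = [30, 32, 33, 35, 36, 38, 40, 41, 43, 80]
print(iqr_outliers(example_weights))
```

With the real weights, the same function would be applied to each genotype separately, and any flagged mice would be inspected before running the t-test.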

Page 10

The Makings of a Two Sided t-test

Step One: Established the Hypotheses

H0: μk is equal to μw

HA: μk is not equal to μw

(μk and μw are the mean weights of the knockout and wildtype populations.)

Step Two: Chose a Confidence Level

A 95% confidence level is suitable

Step Three: Used SAS to analyze my data with the PROC TTEST procedure, with weight as the variable in question
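For reference, the hypotheses and the test statistic behind Step Three can be written out as follows. This assumes the pooled (equal-variance) form of the two-sample t-test; PROC TTEST also reports an unequal-variance version, and the slides do not say which one was read off. The symbols for the sample means, sample standard deviations, and pooled standard deviation are notation introduced here, with subscripts k for knockout and w for wildtype.

```latex
\begin{aligned}
H_0 &: \mu_k = \mu_w \qquad\qquad H_A : \mu_k \neq \mu_w \\[4pt]
s_p^2 &= \frac{(n_k - 1)\,s_k^2 + (n_w - 1)\,s_w^2}{n_k + n_w - 2} \\[4pt]
t &= \frac{\bar{x}_k - \bar{x}_w}{s_p \sqrt{\dfrac{1}{n_k} + \dfrac{1}{n_w}}},
\qquad \text{df} = n_k + n_w - 2
\end{aligned}
```

With 76 mice in each group, df = 150. When the two sample sizes are equal, the pooled and unequal-variance versions give the same t statistic and differ only in their degrees of freedom.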

Page 11

The Makings of a Two Sided t-test continued… The Results

t statistic: 2.47

p value: 0.0146

95% Conf. Int. of μk – μw = (0.6601, 5.936)

The p value of the test shows that the null hypothesis can comfortably be rejected at the 5% significance level (95% confidence). This means that the difference in mean weight between the two groups of mice is statistically significant. In addition, the average difference in weight between mice of the two genotypes is between 0.6601 and 5.936 grams, with the knockout mice being the heavier group.
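The reported results can be cross-checked from the Page 7 summary statistics. The sketch below is Python, not the SAS run used in the project, and it assumes the pooled-variance t-test. The two-sided p value is obtained by integrating the t density numerically with Simpson's rule, and the critical value 1.9759 is an assumed t-table entry (97.5th percentile, 150 degrees of freedom). All function and variable names are hypothetical. A pooled interval is symmetric around the observed difference in means (38.9178 − 35.6197 = 3.2981 g), which makes it a useful check on the reported endpoints.

```python
import math

# Summary statistics from the Page 7 slide (weights in grams).
N_K, MEAN_K, SD_K = 76, 38.9178, 7.8575   # knockout (-/-)
N_W, MEAN_W, SD_W = 76, 35.6197, 8.5863   # wildtype (+/+)

# Assumption: t-table value for the 97.5th percentile with 150 df.
T_CRIT_150 = 1.9759

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3          # P(0 < T < |t|)
    return 1.0 - 2.0 * central       # probability in both tails

df = N_K + N_W - 2
pooled_var = ((N_K - 1) * SD_K**2 + (N_W - 1) * SD_W**2) / df
se_diff = math.sqrt(pooled_var * (1 / N_K + 1 / N_W))
diff = MEAN_K - MEAN_W
t_stat = diff / se_diff
p_value = two_sided_p(t_stat, df)
ci_diff = (diff - T_CRIT_150 * se_diff, diff + T_CRIT_150 * se_diff)

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
print(f"95% CI for the difference: ({ci_diff[0]:.4f}, {ci_diff[1]:.4f})")
```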

Page 12

Fin