data analysis on tooth growth


Upload: karen-yang

Post on 24-May-2015


Category:

Data & Analytics



DESCRIPTION

Data, inference

TRANSCRIPT

Page 1: Data Analysis on Tooth Growth

Inferential Data Analysis: Tooth Growth in Guinea Pigs

Karen J Yang

September 13, 2014

INTRODUCTION

The goal is to analyze the ToothGrowth data. The response is the length of odontoblasts (the cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of Vitamin C (0.5, 1, or 2 mg/day) by one of two delivery methods, called supplements: orange juice (OJ) or ascorbic acid (VC). That gives 10 guinea pigs for each combination of dose and supplement.

EXPLORATORY DATA ANALYSIS

Load the ToothGrowth data and perform some basic exploratory data analyses. For the outcome variable, len, the summary statistics show a mean of 18.81 and a standard deviation of 7.65 for n = 60.

library(datasets); data(ToothGrowth); attach(ToothGrowth)
head(ToothGrowth)

##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
## 4  5.8   VC  0.5
## 5  6.4   VC  0.5
## 6 10.0   VC  0.5

summary(ToothGrowth)

##       len        supp        dose
##  Min.   : 4.2   OJ:30   Min.   :0.50
##  1st Qu.:13.1   VC:30   1st Qu.:0.50
##  Median :19.2           Median :1.00
##  Mean   :18.8           Mean   :1.17
##  3rd Qu.:25.3           3rd Qu.:2.00
##  Max.   :33.9           Max.   :2.00

dose <- as.factor(dose)
# install.packages("psych")
library(psych)
describe(len)

##   vars  n  mean   sd median trimmed  mad min  max range  skew kurtosis
## 1    1 60 18.81 7.65  19.25   18.95 9.04 4.2 33.9  29.7 -0.14    -1.04
##     se
## 1 0.99
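As a rough sketch, the headline numbers for len can also be reproduced in base R without attach() or the psych package. The name len_stats is introduced here for illustration only and is not part of the original analysis.

```r
library(datasets)

# Compute n, mean, standard deviation, and standard error of len
# directly from the data frame (no attach() needed).
len_stats <- with(ToothGrowth, c(
  n    = length(len),
  mean = mean(len),
  sd   = sd(len),
  se   = sd(len) / sqrt(length(len))
))

round(len_stats, 2)
```

Using with() keeps the column lookup local to the data frame, so a later reassignment such as dose <- as.factor(dose) cannot silently shadow a column.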


FREQUENCY TABLE AND BOXPLOTS FOR FACTOR VARIABLES

For the two explanatory variables, both of which are treated as factor variables, I create a frequency table to display the counts. Here, we see the 60 observations evenly distributed across the 2 supplement types (supp), namely OJ or VC, and the three dosages (dose): 0.5, 1, and 2 milligrams.

In the first boxplot, we see that tooth growth length tends to be greater when the delivery type is OJ rather than VC. In the second boxplot, we see that tooth growth length increases with increasing dosage (0.5 mg, 1 mg, and 2 mg).

table(supp,dose)

##     dose
## supp 0.5  1  2
##   OJ  10 10 10
##   VC  10 10 10

# install.packages("ggplot2")
library(ggplot2)

##
## Attaching package: 'ggplot2'
##
## The following object is masked from 'package:psych':
##
##     %+%

ggplot(aes(x = supp, y = len), data = ToothGrowth) + geom_boxplot(aes(fill = supp))

[Figure: boxplots of len (y-axis) by supp (x-axis: OJ, VC), filled by supp]


ggplot(aes(x = factor(dose), y = len), data = ToothGrowth) + geom_boxplot(aes(fill = factor(dose)))

[Figure: boxplots of len (y-axis) by factor(dose) (x-axis: 0.5, 1, 2), filled by factor(dose)]

The summary statistics show the mean per delivery method (OJ or VC): OJ has the greater mean, 20.66, compared with 16.96 for VC.

round(with(ToothGrowth, sapply(split(len,supp), mean)),3)

##    OJ    VC
## 20.66 16.96

I look at the means of tooth growth per dosage category. The group mean increases as the dosage increases.

aggregate(len,list(dose),mean)

##   Group.1     x
## 1     0.5 10.61
## 2       1 19.73
## 3       2 26.10

Taking a look at the dosage categories in conjunction with the delivery methods, I see that the means are higher for OJ when the dosage is 0.5 and 1. Overall, the means are highest for dosage 2, where OJ and VC have roughly the same mean.

aggregate(len,list(supp,dose),mean)


##   Group.1 Group.2     x
## 1      OJ     0.5 13.23
## 2      VC     0.5  7.98
## 3      OJ       1 22.70
## 4      VC       1 16.77
## 5      OJ       2 26.06
## 6      VC       2 26.14

As for the standard deviations, the two largest belong to the 0.5 dosage with OJ and the 2 dosage with VC. There does not seem to be a distinct pattern across supplement types (OJ and VC) or across dosages (0.5, 1, and 2).

aggregate(len,list(supp,dose),sd)

##   Group.1 Group.2     x
## 1      OJ     0.5 4.460
## 2      VC     0.5 2.747
## 3      OJ       1 3.911
## 4      VC       1 2.515
## 5      OJ       2 2.655
## 6      VC       2 4.798
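The group means and standard deviations above come from two separate aggregate() calls. One possible way to get both in a single table with named columns, assuming the formula interface of aggregate(), is sketched below. The name cell_stats is illustrative only.

```r
library(datasets)

# One row per supp x dose cell, with mean and sd computed together.
cell_stats <- aggregate(
  len ~ supp + dose,
  data = ToothGrowth,
  FUN  = function(x) c(mean = mean(x), sd = sd(x))
)

# aggregate() stores the two statistics as a matrix column;
# flatten it into ordinary columns len.mean and len.sd.
cell_stats <- do.call(data.frame, cell_stats)

cell_stats
```

Because the formula names the columns, the result is labelled supp and dose rather than Group.1 and Group.2, which makes the table easier to read next to the t-test output.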

T-TESTS FOR MEAN DIFFERENCE BY SUPPLEMENT TYPE

I start by testing whether mean tooth length differs between the two supplement types, OJ and VC. The t-value is 1.915, the p-value is 0.061, and the 95% confidence interval (-0.171, 7.571) contains zero, the value that denotes no effect. At the 5% level I therefore fail to reject the null hypothesis that there is no difference in mean tooth length between OJ and VC.

t.test(len ~ supp, data = ToothGrowth)

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = 1.915, df = 55.31, p-value = 0.06063
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.171  7.571
## sample estimates:
## mean in group OJ mean in group VC
##            20.66            16.96
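To make the reported t, df, p-value, and interval less of a black box, here is a minimal sketch that recomputes them by hand from the standard Welch formulas (difference in means over its standard error, with Welch-Satterthwaite degrees of freedom). All variable names are introduced for illustration.

```r
library(datasets)

oj <- ToothGrowth$len[ToothGrowth$supp == "OJ"]
vc <- ToothGrowth$len[ToothGrowth$supp == "VC"]

# Per-group variance of the sample mean
v_oj <- var(oj) / length(oj)
v_vc <- var(vc) / length(vc)

# Standard error of the difference and the Welch t-statistic
se_diff <- sqrt(v_oj + v_vc)
t_stat  <- (mean(oj) - mean(vc)) / se_diff

# Welch-Satterthwaite degrees of freedom
df_welch <- (v_oj + v_vc)^2 /
  (v_oj^2 / (length(oj) - 1) + v_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval for mean(OJ) - mean(VC)
p_value <- 2 * pt(-abs(t_stat), df_welch)
ci      <- (mean(oj) - mean(vc)) + c(-1, 1) * qt(0.975, df_welch) * se_diff

round(c(t = t_stat, df = df_welch, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

The values should agree with the t.test() output above up to rounding. The interval only just includes zero, which is why the p-value sits slightly above 0.05.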

T-TESTS FOR MEAN DIFFERENCE BY DOSAGE LEVEL

To examine the length of tooth growth by dosage, I subset the data set into 3 smaller data sets according to the 3 possible pairs of dosages.

Tooth.dose0.5_1.0 <- subset(ToothGrowth, dose %in% c(0.5, 1.0))
Tooth.dose0.5_2.0 <- subset(ToothGrowth, dose %in% c(0.5, 2.0))
Tooth.dose1.0_2.0 <- subset(ToothGrowth, dose %in% c(1.0, 2.0))


I then use t-tests to see whether mean tooth length differs between the 2 dosage groups in each pair. The large absolute values of the t-statistics and the very small p-values indicate that all three differences are statistically significant, so we can reject the null hypothesis that the difference in means is 0 in each test. In addition, none of the confidence intervals contains zero.

t.test(len ~ dose, data = Tooth.dose0.5_1.0)

##
##  Welch Two Sample t-test
##
## data:  len by dose
## t = -6.477, df = 37.99, p-value = 1.268e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -11.984  -6.276
## sample estimates:
## mean in group 0.5   mean in group 1
##             10.61             19.73

t.test(len ~ dose, data = Tooth.dose0.5_2.0)

##
##  Welch Two Sample t-test
##
## data:  len by dose
## t = -11.8, df = 36.88, p-value = 4.398e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -18.16 -12.83
## sample estimates:
## mean in group 0.5   mean in group 2
##             10.61             26.10

t.test(len ~ dose, data = Tooth.dose1.0_2.0)

##
##  Welch Two Sample t-test
##
## data:  len by dose
## t = -4.901, df = 37.1, p-value = 1.906e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -8.996 -3.734
## sample estimates:
## mean in group 1 mean in group 2
##           19.73           26.10
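Since three pairwise tests are run on the same data, one might also want to check that the conclusions survive a multiple-comparison correction. The sketch below is an addition, not part of the original analysis: it loops over the dose pairs and applies a Bonferroni adjustment with p.adjust(). The names dose_pairs, p_raw, and p_bonf are illustrative.

```r
library(datasets)

# All pairs of distinct dose levels: (0.5, 1), (0.5, 2), (1, 2)
dose_pairs <- combn(sort(unique(ToothGrowth$dose)), 2, simplify = FALSE)

# Welch t-test p-value for each pair
p_raw <- sapply(dose_pairs, function(pair) {
  pair_data <- subset(ToothGrowth, dose %in% pair)
  t.test(len ~ dose, data = pair_data)$p.value
})
names(p_raw) <- sapply(dose_pairs, paste, collapse = " vs ")

# Bonferroni adjustment for the three comparisons
p_bonf <- p.adjust(p_raw, method = "bonferroni")

signif(cbind(raw = p_raw, bonferroni = p_bonf), 3)
```

With p-values this small, multiplying by three does not change any decision at the 5% level, so the dosage conclusions are unaffected by the correction.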

T-TESTS FOR MEAN DIFFERENCE OF SUPPLEMENT BY DOSAGE LEVEL

Now, to test by supplement (OJ or VC) for each dosage type, I subset the original data set by the 3 dosage levels.


Tooth.dose0.5 <- subset(ToothGrowth, dose == 0.5)
Tooth.dose1.0 <- subset(ToothGrowth, dose == 1.0)
Tooth.dose2.0 <- subset(ToothGrowth, dose == 2.0)

Taking a look at the t-statistics, p-values, and confidence intervals, we see that the first two tests, which involve dosages 0.5 and 1, show statistically significant mean differences between OJ and VC, so we can reject the null hypothesis in both cases. The third test, which involves the 2 dosage level, tells a very different story: the p-value for the mean difference between OJ and VC is very high at 0.96, so we cannot reject the null hypothesis that the difference in means is 0. The confidence interval also contains zero, which is consistent with no mean difference.

t.test(len ~ supp, data = Tooth.dose0.5)

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = 3.17, df = 14.97, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719 8.781
## sample estimates:
## mean in group OJ mean in group VC
##            13.23             7.98

t.test(len ~ supp, data = Tooth.dose1.0)

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = 4.033, df = 15.36, p-value = 0.001038
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  2.802 9.058
## sample estimates:
## mean in group OJ mean in group VC
##            22.70            16.77

t.test(len ~ supp, data = Tooth.dose2.0)

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = -0.0461, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.798  3.638
## sample estimates:
## mean in group OJ mean in group VC
##            26.06            26.14
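The three supplement-by-dose tests can also be collected into one compact table, which makes the contrast between the lower doses and the 2 mg dose easier to see. This is a sketch under the same Welch defaults as above. The names by_dose and supp_tests are illustrative.

```r
library(datasets)

# One data frame per dose level
by_dose <- split(ToothGrowth, ToothGrowth$dose)

# Run the OJ vs VC Welch t-test within each dose and keep the key numbers
supp_tests <- t(sapply(by_dose, function(d) {
  tt <- t.test(len ~ supp, data = d)
  c(t     = unname(tt$statistic),
    p     = tt$p.value,
    lower = tt$conf.int[1],
    upper = tt$conf.int[2])
}))

round(supp_tests, 3)
```

Each row is one dose level. Rows whose interval (lower, upper) excludes zero correspond to the significant OJ versus VC differences.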


ASSUMPTIONS

Assumption 1. We assume that the experiment was done with random assignment of guinea pigs to dosage category and delivery type to control for unaccounted outside factors that could affect the outcome.

Assumption 2. We assume that members of the sample population, namely the 60 guinea pigs, are representative of the population of guinea pigs.

Assumption 3. For the t-tests, we assume that the groups being compared are independent and that tooth length is roughly normally distributed within each group. Equal variances across the 2 groups are not required, because the tests reported above are Welch t-tests, which t.test() runs by default.
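Every test in this report is labelled "Welch Two Sample t-test" in its output, which is what t.test() produces with its default var.equal = FALSE. As a minimal sketch (the object names welch and pooled are illustrative only), the Welch result can be set beside the pooled-variance version to see how much an equal-variance assumption would matter for the supplement comparison:

```r
library(datasets)

# Default: Welch test, variances NOT assumed equal
welch <- t.test(len ~ supp, data = ToothGrowth)

# Classic Student test: pooled variance, assumes equal variances
pooled <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)

# Compare degrees of freedom and p-values side by side
rbind(
  welch  = c(df = unname(welch$parameter),  p = welch$p.value),
  pooled = c(df = unname(pooled$parameter), p = pooled$p.value)
)
```

The pooled test uses n1 + n2 - 2 = 58 degrees of freedom, while the Welch test uses a smaller, data-dependent value.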

CONCLUSIONS

Conclusion 1. By itself, supplement type (OJ or VC) does not show a statistically significant effect on tooth growth length at the 5% level (p = 0.06).

Conclusion 2. By itself, dosage has a statistically significant effect on tooth growth length: every pairwise comparison between dosage levels was significant, with longer teeth at higher doses.

Conclusion 3. For dosage level 2 mg, there is no detectable difference in tooth growth length between OJ and VC.

Conclusion 4. For dosage levels 0.5 mg and 1 mg, OJ has a greater effect than VC on tooth growth length.
