
Design of Experiments and Survival Analysis

Anders Stockmarr
Technical University of Denmark, DTU Statistics and Data Analysis
Section for Statistics and Data Analysis

AQUAEXCELL2020 Training Course - Planning and Conducting Experimental Infection Trials in Fish
DTU AQUA, 12 November 2019


Two topics for today

• Design of Experiments and Survival Analysis

• Survival Analysis session: R commands uploaded in the script ‘Commands.R’

• Data sets uploaded; they should be placed in a folder labeled ‘Data’ in your R working directory if you want to follow the calculations and figure generation simultaneously

• Pdf file ‘Introduction to R’ uploaded


Target Audience

• You have:
  – taken a first course in statistics;
  – heard of the normal distribution;
  – some knowledge of the mean and variance;
  – done some regression analysis (or heard of it);
  – some knowledge of ANOVA (or heard of it);
  – used Windows or Mac based computers;
  – done, or will be conducting, experiments.

• These assumptions form the basis of the communication in this lecture.


Design of Experiments: Introduction


Main Reference

• Douglas C. Montgomery: Design and Analysis of Experiments. Wiley, 2017.

A standard textbook pitched at an appropriate academic level.


Overview

• Introduction

• Basic Statistical Concepts

• The Blocking Principle

• The $2^k$ Factorial Design


Introduction

Design of Experiments: Introduction

• Why is this trip necessary? Goals of the lecture

• Some basic principles and terminology

• The strategy of experimentation

• Guidelines for planning, conducting and analyzing experiments


Introduction to DOX

• An experiment is a test or a series of tests
• Experiments are used widely in the engineering world
  – Process characterization & optimization
  – Evaluation of material properties
  – Product design & development
  – Component & system tolerance determination
• “All experiments are designed experiments, some are poorly designed, some are well-designed”


Experiments

• Reduce time to design/develop new products & processes
• Improve performance of existing processes
• Improve reliability and performance of products
• Achieve product & process robustness
• Evaluation of materials, design alternatives, setting component & system tolerances, etc.


The Basic Principles of DOX

• Randomization
  – Running the trials in an experiment in random order
  – Notion of balancing out effects of “lurking” variables
• Replication
  – Sample size (improving precision of effect estimation, estimation of error or background noise)
  – Replication versus repeat measurements?
• Blocking
  – Dealing with nuisance factors


Strategy of Experimentation

• “Best-guess” experiments
  – Used a lot
  – More successful than you might suspect, but there are disadvantages…
• One-factor-at-a-time (OFAT) experiments
  – Sometimes associated with the “scientific” or “engineering” method
  – Devastated by interaction, also very inefficient
• Statistically designed experiments
  – Based on Fisher’s factorial concept


Factorial Designs

• In a factorial experiment, all possible combinations of factor levels are tested
• The golf experiment:
  – Type of driver
  – Type of ball
  – Walking vs. riding
  – Type of beverage
  – Time of round
  – Weather
  – Type of golf spike
  – Etc, etc, etc…


Factorial Design


Factorial Designs with Several Factors


Factorial Designs with Several Factors: A Fractional Factorial


Planning, Conducting & Analyzing an Experiment

1. Recognition of & statement of problem
2. Choice of factors, levels, and ranges
3. Selection of the response variable(s)
4. Choice of design
5. Conducting the experiment
6. Statistical analysis
7. Drawing conclusions, recommendations


Planning, Conducting & Analyzing an Experiment

• Get statistical thinking involved early
• Your non-statistical knowledge is crucial to success
• Pre-experimental planning (steps 1-3) is vital
• Think and experiment sequentially (use the KISS principle)
• Reference: Coleman & Montgomery (Technometrics 1993).


Design of Experiments: Basic Statistical Concepts


Design of Experiments: Basic Statistical Concepts

• Simple comparative experiments
  – The hypothesis testing framework
  – The two-sample t-test
  – Checking assumptions, validity

• Comparing more than two factor levels…the analysis of variance
  – ANOVA decomposition of total variability
  – Statistical testing & analysis
  – Checking assumptions, model validity
  – Post-ANOVA testing of means

• Sample size determination


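Portland Cement Formulation

Observations (ten values; their mean, 17.042, matches the Formulation 2 mean reported in the summary statistics):

16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27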


Graphical View of the Data: Dot Diagram


Box Plots


The Hypothesis Testing Framework

• Statistical hypothesis testing is a useful framework for many experimental situations

• Origins of the methodology date from the early 1900s

• We will use a procedure known as the two-sample t-test


The Hypothesis Testing Framework

• Sampling from a normal distribution
• Statistical hypotheses:

$$H_0: \mu_1 = \mu_2$$
$$H_1: \mu_1 \neq \mu_2$$


Estimation of Parameters

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{estimates the population mean } \mu$$

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \quad \text{estimates the variance } \sigma^2$$


Summary Statistics

                    Formulation 1          Formulation 2
                    “New recipe”           “Original recipe”
Mean                $\bar{y}_1 = 16.76$    $\bar{y}_2 = 17.04$
Variance            $S_1^2 = 0.100$        $S_2^2 = 0.061$
Standard deviation  $S_1 = 0.316$          $S_2 = 0.248$
Sample size         $n_1 = 10$             $n_2 = 10$
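As a quick check, the Formulation 2 column can be recomputed in R. This is a minimal sketch that assumes the ten observations listed on the Portland Cement slide belong to Formulation 2 (their mean matches); the vector name `original` is illustrative and not taken from the course script.

```r
# Ten observations from the Portland Cement slide
# (assumed here to be Formulation 2, the "Original recipe")
original <- c(16.62, 16.75, 17.37, 17.12, 16.98,
              16.87, 17.34, 17.02, 17.08, 17.27)

y_bar <- mean(original)    # sample mean, estimates mu
s2    <- var(original)     # sample variance (divisor n - 1), estimates sigma^2
s     <- sd(original)      # sample standard deviation
n     <- length(original)  # sample size

print(round(c(mean = y_bar, variance = s2, sd = s, n = n), 3))
```

`var()` and `sd()` use the divisor n - 1, which is the estimator of the variance shown on the estimation slide.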


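How the Two-Sample t-Test Works:

Use the sample means to draw inferences about the population means. The difference in sample means is

$$\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$$

Standard deviation of the difference in sample means: each sample mean has variance

$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$$

This suggests a statistic:

$$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$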


How the Two-Sample t-Test Works:

Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$.

The previous ratio becomes

$$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Pool the individual sample variances:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$


How the Two-Sample t-Test Works:

The test statistic is

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

• Values of $t_0$ that are near zero are consistent with the null hypothesis

• Values of $t_0$ that are very different from zero are consistent with the alternative hypothesis

• $t_0$ is a “distance” measure: how far apart the averages are, expressed in standard deviation units

• Notice the interpretation of $t_0$ as a signal-to-noise ratio


The Two-Sample (Pooled) t-Test

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$$

$$S_p = 0.284$$

$$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -2.20$$

The two sample means are a little over two standard deviations apart. Is this a "large" difference?


The Two-Sample (Pooled) t-Test

• So far, we haven’t really done any “statistics”
• We need an objective basis for deciding how large the test statistic $t_0$ really is
• In 1908, W. S. Gosset derived the reference distribution for $t_0$ … called the t distribution
• Available in software packages such as R

$t_0 = -2.20$


The Two-Sample (Pooled) t-Test

• A value of $t_0$ between –2.101 and 2.101 is consistent with equality of means

• It is possible for the means to be equal and $t_0$ to exceed either 2.101 or –2.101, but it would be a “rare event” … leads to the conclusion that the means are different

• Could also use the p-value approach

$t_0 = -2.20$


The Two-Sample (Pooled) t-Test

• The test level α is the chosen risk of wrongly rejecting the null hypothesis of equal means. The usual level of α is 0.05.

• The p-value is the probability of getting an event at least as extreme as the one observed under the hypothesis of equal means (it measures the rareness of the event).

• The null hypothesis is rejected if the p-value is lower than the test level. In our problem, the p-value is p = 0.042

$t_0 = -2.20$
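The hand calculation and the decision rule above can be reproduced from the summary statistics alone. The following is a sketch using only the rounded values quoted on the slides, so the results differ slightly from a test run on the raw data; the variable names are illustrative.

```r
# Summary statistics from the slides
y1 <- 16.76; s1_sq <- 0.100; n1 <- 10   # Formulation 1
y2 <- 17.04; s2_sq <- 0.061; n2 <- 10   # Formulation 2
alpha <- 0.05

df <- n1 + n2 - 2
sp <- sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df)  # pooled standard deviation
t0 <- (y1 - y2) / (sp * sqrt(1 / n1 + 1 / n2))          # pooled t statistic

t_crit  <- qt(1 - alpha / 2, df)   # two-sided critical value of the t distribution
p_value <- 2 * pt(-abs(t0), df)    # two-sided p-value

print(c(sp = sp, t0 = t0, t_crit = t_crit, p_value = p_value))
print(abs(t0) > t_crit)            # TRUE means: reject equality of means at level alpha
```

Comparing `abs(t0)` with `t_crit` and comparing `p_value` with `alpha` are two ways of expressing the same decision.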


R Two-Sample t-Test Results

R command:

    t.test(modified,unmodified,var.equal=TRUE)

Output:

        Two Sample t-test

    data:  modified and unmodified
    t = -2.1869, df = 18, p-value = 0.0422
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -0.54507339 -0.01092661
    sample estimates:
    mean of x mean of y
       16.764    17.042

The p-value is found on the line beginning “t =”. The statistic differs slightly from the hand-calculated –2.20 because the hand calculation used rounded summary statistics.


Checking Assumptions – The Normal Quantile-Quantile Plot


Importance of the t-Test

• Provides an objective framework for simple comparative experiments

• Could be used to test all relevant hypotheses in a two-level factorial design, because all of these hypotheses involve the mean response at one “side” of the cube versus the mean response at the opposite “side” of the cube


Confidence Intervals

• Hypothesis testing gives an objective statement concerning the difference in means, but it doesn’t specify “how different” they are

• General form of a confidence interval:

$$L \le \theta \le U \quad \text{where} \quad P(L \le \theta \le U) = 1 - \alpha$$

• The 100(1 − α)% confidence interval on the difference in two means:

$$\bar{y}_1 - \bar{y}_2 - t_{\alpha/2,\,n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{y}_1 - \bar{y}_2 + t_{\alpha/2,\,n_1+n_2-2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
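For the cement example, the interval can be evaluated directly from the quantities already computed. This sketch uses the rounded values from the slides ($S_p = 0.284$ and the two means), so the limits agree with the `t.test` output only to about two decimals.

```r
y1 <- 16.76; y2 <- 17.04   # sample means
sp <- 0.284                # pooled standard deviation
n1 <- 10; n2 <- 10
alpha <- 0.05

# Half-width: t quantile times the standard error of the difference in means
half_width <- qt(1 - alpha / 2, n1 + n2 - 2) * sp * sqrt(1 / n1 + 1 / n2)

ci <- c(lower = (y1 - y2) - half_width,
        upper = (y1 - y2) + half_width)
print(round(ci, 3))
```

The interval excludes zero, which is the confidence-interval counterpart of rejecting equal means at the 5% level.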


What If There Are More Than Two Factor Levels?

• The t-test does not directly apply

• There are lots of practical situations where there are either more than two levels of interest, or there are several factors of simultaneous interest

• The analysis of variance (ANOVA) is the appropriate analysis “engine” for these types of experiments

• The ANOVA was developed by Fisher in the early 1920s, and initially applied to agricultural experiments


An Example

• An engineer is interested in investigating the relationship between the RF power setting and the etch rate for this tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate.

• The response variable is etch rate.
• She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160W, 180W, 200W, and 220W. She decided to test five wafers at each level of RF power.

• The experimenter chooses 4 levels of RF power: 160W, 180W, 200W, and 220W

• The experiment is replicated 5 times – runs made in random order


An Example

• Does changing the power change the mean etch rate?

• Is there an optimum level for power?


The Analysis of Variance

• In general, there will be $a$ levels of the factor, or $a$ treatments, and $n$ replicates of the experiment, run in random order…a completely randomized design (CRD)

• $N = an$ total runs
• Objective is to test hypotheses about the equality of the $a$ treatment means


The Analysis of Variance

• The name “analysis of variance” stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment

• The basic single-factor ANOVA model is

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$$

where $\mu$ is an overall mean, $\tau_i$ is the $i$th treatment effect, and $\varepsilon_{ij}$ is the experimental error, $NID(0, \sigma^2)$.


Models for the Data

There are several ways to write a model for the data:

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$

is called the effects model. Let $\mu_i = \mu + \tau_i$, then

$$y_{ij} = \mu_i + \varepsilon_{ij}$$

is called the means model. Regression models can also be employed.


The Analysis of Variance

• Total variability is measured by the total sum of squares:

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$$

• The basic ANOVA partitioning is:

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} \left[(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})\right]^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$$

$$SS_T = SS_{Treatments} + SS_E$$


The Analysis of Variance

$$SS_T = SS_{Treatments} + SS_E$$

• A large value of $SS_{Treatments}$ reflects large differences in treatment means

• A small value of $SS_{Treatments}$ likely indicates no differences in treatment means

• Formal statistical hypotheses are:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$$
$$H_1: \text{At least one mean is different}$$


The Analysis of Variance

• While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared.

• A mean square is a sum of squares divided by its degrees of freedom:

$$df_{Total} = df_{Treatments} + df_{Error}$$
$$an - 1 = (a - 1) + a(n - 1)$$
$$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \qquad MS_E = \frac{SS_E}{a(n - 1)}$$

• If the treatment means are equal, the treatment and error mean squares will be (theoretically) equal.

• If treatment means differ, the treatment mean square will be larger than the error mean square.

The Analysis of Variance is Summarized in a Table

• The reference distribution for $F_0$ is the $F_{a-1,\,a(n-1)}$ distribution
• Reject the null hypothesis (equal treatment means) if

$$F_0 > F_{\alpha,\,a-1,\,a(n-1)}$$


ANOVA Table

• Never done by hand, always with a computer. In R, the lm() function applies (lm for ‘Linear Model’)


ANOVA Table: R code

Executed R code:

    my.analysis<-lm(x~as.factor(Power),data=etching)
    drop1(my.analysis,test="F")

Output:

    Single term deletions

    Model:
    x ~ as.factor(Power)
                     Df Sum of Sq   RSS    AIC F value    Pr(>F)
    <none>                         5339 119.74
    as.factor(Power)  3     66871 72210 165.83  66.797 2.883e-09 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


The Reference Distribution:

In R, you can find $F_{0.05,3,16}$ as qf(1-0.05,3,16): 3.24
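To connect the printed `drop1` output with the mean-square formulas, the F statistic can be rebuilt from the two sums of squares. This sketch uses the rounded values shown in the output (not the raw etching data, which is not reproduced in these slides), so the result agrees with the printed F value only to about two decimals.

```r
ss_treat <- 66871; df_treat <- 3    # "Sum of Sq" and Df for as.factor(Power): a - 1 = 3
ss_error <- 5339;  df_error <- 16   # RSS of the full model: a(n - 1) = 4 * 4 = 16

ms_treat <- ss_treat / df_treat     # treatment mean square
ms_error <- ss_error / df_error     # error mean square
f0 <- ms_treat / ms_error           # F statistic

f_crit  <- qf(1 - 0.05, df_treat, df_error)                 # upper 5% point of F(3, 16)
p_value <- pf(f0, df_treat, df_error, lower.tail = FALSE)   # upper-tail p-value

print(c(F0 = f0, F_crit = f_crit, p_value = p_value))
```

Since `f0` is far above `f_crit`, the hypothesis of equal mean etch rates is rejected.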


Model Adequacy Checking

• Checking assumptions is important
• Normality
• Constant variance
• Independence
• Have we fit the right model?
• We will not discuss what to do if some of these assumptions are violated, because of time constraints.


Model Adequacy Checking in the ANOVA

• Examination of residuals:

$$e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$$

• Residual plots are very useful

• Quantile-quantile plot of residuals


Other Important Residual Plots


Post-ANOVA Comparison of Means

• The analysis of variance tests the hypothesis of equal treatment means
• Assume that residual analysis is satisfactory
• If the hypothesis of equal means is rejected, we don’t know which specific means are different
• Determining which specific means differ following an ANOVA is a multiple comparisons problem
• There are lots of ways to do this…
• We will use pairwise t-tests on means…sometimes called Fisher’s Least Significant Difference (or Fisher’s LSD) Method


Fisher’s LSD

R code:

    install.packages("agricolae")
    library(agricolae)
    LSD.test(my.analysis,"Power", p.adj="bonferroni",console=TRUE)

Output:

    Study: my.analysis ~ "Power"

    LSD t Test for x
    P value adjustment method: bonferroni

    Mean Square Error:  333.7

    Power,  means and individual ( 95 %) CI

            x      std r      LCL      UCL Min Max
    160 551.2 20.01749 5 533.8815 568.5185 530 575
    180 587.4 16.74216 5 570.0815 604.7185 565 610
    200 625.4 20.52559 5 608.0815 642.7185 600 651
    220 707.0 15.24795 5 689.6815 724.3185 685 725

    Alpha: 0.05 ; DF Error: 16
    Critical Value of t: 3.008334

    Minimum Significant Difference: 34.75635

    Treatments with the same letter are not significantly different.

            x groups
    220 707.0      a
    200 625.4      b
    180 587.4      c
    160 551.2      d

All different letters! That is, none of the groups can be collapsed at a 5% Bonferroni-corrected test level.
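The critical value and the minimum significant difference in this output can be reproduced by hand. The sketch below assumes the Bonferroni adjustment divides the two-sided level by the number of pairwise comparisons, which reproduces the printed numbers; it uses only values from the output above, and the variable names are illustrative.

```r
alpha <- 0.05
a <- 4                       # number of treatments (power levels)
r <- 5                       # replicates per treatment
mse <- 333.7; df_error <- 16 # from the LSD.test output
m <- choose(a, 2)            # number of pairwise comparisons (6)

# Bonferroni-adjusted two-sided critical value and minimum significant difference
t_crit <- qt(1 - alpha / (2 * m), df_error)
msd    <- t_crit * sqrt(2 * mse / r)

means <- c("160" = 551.2, "180" = 587.4, "200" = 625.4, "220" = 707.0)
diffs <- abs(outer(means, means, "-"))   # all pairwise absolute differences

print(c(t_crit = t_crit, msd = msd))
print(diffs > msd)   # TRUE off the diagonal: that pair of means differs significantly
```

Every off-diagonal entry exceeds the minimum significant difference, which is why each power level receives its own letter.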


Graphical Comparison of Means


The Regression Model


Why Does the ANOVA Work?

We are sampling from normal populations, so

$$\frac{SS_{Treatments}}{\sigma^2} \sim \chi^2_{a-1} \ \text{ if } H_0 \text{ is true, and } \ \frac{SS_E}{\sigma^2} \sim \chi^2_{a(n-1)}$$

Cochran's theorem gives the independence of these two chi-square random variables.

So

$$F_0 = \frac{SS_{Treatments}/(a-1)}{SS_E/[a(n-1)]} \sim \frac{\chi^2_{a-1}/(a-1)}{\chi^2_{a(n-1)}/[a(n-1)]} \sim F_{a-1,\,a(n-1)}$$

Finally,

$$E(MS_{Treatments}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1} \quad \text{and} \quad E(MS_E) = \sigma^2$$

Therefore an upper-tail $F$ test is appropriate.


Sample Size Determination

• FAQ in designed experiments
• Answer depends on lots of things, including what type of experiment is being contemplated, how it will be conducted, resources, and desired sensitivity – how sure do you want to be?

• Sensitivity refers to the difference in means that the experimenter wishes to detect.

• Generally, increasing the number of replications increases the sensitivity, that is, it makes it easier to detect small differences in means


Sample Size Determination

• Can choose the sample size to detect a specific difference in means and achieve desired values of type I and type II errors

• Type I error – reject H0 when it is true ($\alpha$)
• Type II error – fail to reject H0 when it is false ($\beta$)
• Power = $1 - \beta$
• Operating characteristic curves plot $\beta$ against a parameter $\Phi$, where

$$\Phi^2 = \frac{n\sum_{i=1}^{a}\tau_i^2}{a\sigma^2}$$


Sample Size Determination

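• Rule of thumb for the t-test: You obtain a power of 80% when

$$n \approx \frac{8\sigma^2}{\Delta^2}$$

where $\sigma^2$ is the residual variance, and $\Delta$ is the difference that you want to be able to detect.

This form corresponds to a one-sample (or paired) t-test at the 5% level. For a two-sample comparison with equal group sizes, the number needed in each group is about twice as large, $n \approx 16\sigma^2/\Delta^2$.

Example: suppose that our measurements are around 20, with a variance of 10, and we want to detect a 10% change (i.e. $\Delta = \pm 2$). Then

$$n \approx 8 \times 10/2^2 = 20$$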


Sample Size Determination

• The general case of the t-test: For an arbitrary power $1 - \beta$ and an arbitrary test level $\alpha$:

$$n \approx \frac{\sigma^2 (z_{1-\beta} + z_{1-\alpha/2})^2}{\Delta^2}$$

where $z_q$ is the $q$-quantile of the standard normal distribution. One can find it in R as qnorm(q).

Example: Suppose that $\alpha = 0.05$, and the desired power is $1 - \beta = 0.8$. Since qnorm(1-0.05/2) is 1.96 and qnorm(0.8) returns the value 0.84, it holds that

$$n \approx \frac{\sigma^2 (2.8)^2}{\Delta^2} = \frac{7.84\sigma^2}{\Delta^2}$$

The rule of thumb reappears.
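The approximation is easy to wrap in a small function and compare with an exact t-based calculation. The helper `n_approx` below is a hypothetical name introduced for illustration, not part of the course script. The comparison uses `power.t.test` with `type = "one.sample"`, which is the setting that this normal approximation matches; for a two-sample test the number per group is about twice as large.

```r
# Normal-approximation sample size (illustrative helper, not from the course script)
n_approx <- function(sigma2, delta, alpha = 0.05, power = 0.80) {
  sigma2 * (qnorm(power) + qnorm(1 - alpha / 2))^2 / delta^2
}

# Example from the slides: variance 10, detectable difference 2
n_formula <- n_approx(sigma2 = 10, delta = 2)

# Exact t-based calculation for a one-sample test, for comparison
n_exact <- power.t.test(delta = 2, sd = sqrt(10), power = 0.80,
                        type = "one.sample")$n

print(c(formula = n_formula, power.t.test = n_exact))
```

The exact value is somewhat larger than the approximation because it accounts for estimating the variance; in practice one rounds up to the next whole number.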


Sample Size Determination

Power and sample size can be explored in R with the function power.t.test. For more general designs, use the pwr package:

Function          Power calculations for
pwr.2p.test       Two proportions (equal n)
pwr.2p2n.test     Two proportions (unequal n)
pwr.anova.test    Balanced one-way ANOVA
pwr.chisq.test    Chi-square test
pwr.f2.test       General linear model
pwr.p.test        Proportion (one-sample)
pwr.r.test        Correlation
pwr.t.test        T-tests (one sample, two sample, paired)
pwr.t2n.test      T-test (two samples with unequal n)


Sample Size Determination – Example:

Let us investigate the Portland Cement formulation example. Here we found a difference between the groups of -0.28, and a pooled sd of 0.284:

$$\bar{y}_1 - \bar{y}_2 = -0.28$$

$$S_p = 0.284$$

If these values were indeed the true difference between the groups and the true sd, how many runs should we have in the experiment to be 80% sure of detecting a statistically significant difference?


Sample Size Determination – Example:

R code:

    power.t.test(delta=-0.28,sd=0.284,power=0.8)

Output:

         Two-sample t test power calculation

                  n = 17.16492
              delta = 0.28
                 sd = 0.284
          sig.level = 0.05
              power = 0.8
        alternative = two.sided

    NOTE: n is number in *each* group

Thus, we need 18 runs in each group to be 80% sure of detecting the difference. Perhaps we were lucky with only 10 in each group.


Sample Size Determination – Example:

R code:

    power.t.test(n=10, delta=-0.28,sd=0.284)

Output:

         Two-sample t test power calculation

                  n = 10
              delta = 0.28
                 sd = 0.284
          sig.level = 0.05
              power = 0.5502385
        alternative = two.sided

    NOTE: n is number in *each* group

Thus, the actual power of the experiment is about 55%, close to 50-50, and we may have been lucky to detect the difference.
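The two calculations above are two points on the same power curve. As a sketch, the curve can be traced by calling `power.t.test` over a grid of group sizes; the grid 5 to 30 is an arbitrary choice for illustration, and the sign of `delta` is dropped because it does not matter for a two-sided test.

```r
n_grid <- 5:30   # candidate numbers of runs per group (arbitrary illustrative range)

# Power of the two-sided, two-sample t-test for each group size
power_by_n <- sapply(n_grid, function(n) {
  power.t.test(n = n, delta = 0.28, sd = 0.284, sig.level = 0.05)$power
})
names(power_by_n) <- n_grid

# Smallest group size reaching 80% power
smallest_n <- n_grid[which(power_by_n >= 0.80)[1]]

print(round(power_by_n[c("10", "17", "18")], 3))
print(smallest_n)

plot(n_grid, power_by_n, type = "b",
     xlab = "Runs per group", ylab = "Power")
abline(h = 0.80, lty = 2)   # target power
```

Reading the curve at 10 runs per group gives the power of the experiment as conducted, and the first point above the dashed line gives the required group size.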


Design of Experiments: The Blocking Principle


Design of Experiments: The Blocking Principle

• Blocking and nuisance factors

• The randomized complete block design - the RCBD

• Extension of the ANOVA to the RCBD

• Other blocking scenarios…Latin Square designs


The Blocking Principle

• Blocking is a technique for dealing with nuisance factors

• A nuisance factor is a factor that probably has some effect on the response, but it’s of no interest to the experimenter…however, the variability it transmits to the response needs to be minimized

• Typical nuisance factors include batches of raw material, operators, pieces of test equipment, time (shifts, days, etc.), different experimental units

• Many experiments involve blocking (or should)

• Failure to block is a common flaw in designing an experiment (consequences?)


The Blocking Principle

• If the nuisance variable is known and controllable (i.e. we can choose the values), we use blocking

• If the nuisance factor is known and uncontrollable, sometimes we can use regression analysis to remove the effect of the nuisance factor from the analysis

• If the nuisance factor is unknown and uncontrollable (a “lurking” variable), we hope that randomization balances out its impact across the experiment

• Sometimes several sources of variability are combined in a block, so the block becomes an aggregate variable


The Hardness Testing Example

• We wish to determine whether 4 different tips produce different (mean) hardness readings on a Rockwell hardness tester

• Assignment of the tips to an experimental unit; that is, a test coupon

• Structure of a completely randomized experiment

• The test coupons are a source of nuisance variability

• Alternatively, the experimenter may want to test the tips across coupons of various hardness levels

• The need for blocking


The Hardness Testing Example: Randomized Complete Block Design (RCBD)

• To conduct this experiment as a RCBD, assign all 4 tips to each coupon

• Each coupon is called a “block”; that is, it’s a more homogeneous experimental unit on which to test the tips

• Variability between blocks can be large, variability within a block should be relatively small

• In general, a block is a specific level of the nuisance factor
• A complete replicate of the basic experiment is conducted in each block
• A block represents a restriction on randomization
• All runs within a block are randomized


The Hardness Testing Example

• Suppose that we use b = 4 blocks:

• Notice the two-way structure of the experiment
• Once again, we are interested in testing the equality of treatment means, but now we have to remove the variability associated with the nuisance factor (the blocks)


Using ANOVA to model the RCBD

• Suppose that there are $a$ treatments (factor levels) and $b$ blocks
• A statistical model (effects model) for the RCBD is

$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, b$$

• The relevant (fixed effects) hypothesis is

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a$$


Using ANOVA to model the RCBD

ANOVA partitioning of total variability:

$$\sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{b} \left[(\bar{y}_{i.} - \bar{y}_{..}) + (\bar{y}_{.j} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})\right]^2$$

$$= b\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + a\sum_{j=1}^{b} (\bar{y}_{.j} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b} (y_{ij} - \bar{y}_{i.} - \bar{y}_{.j} + \bar{y}_{..})^2$$

$$SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$$


Using ANOVA to model the RCBD

The degrees of freedom for the sums of squares in

$$SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$$

are as follows:

$$ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)$$

Therefore, ratios of sums of squares to their degrees of freedom result in mean squares, and the ratio of the mean square for treatments to the error mean square is an F statistic that can be used to test the hypothesis of equal treatment means.
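The partition and its degrees of freedom can be verified numerically. The sketch below uses the vascular graft data that appears later in these slides (4 pressures as treatments, 6 resin batches as blocks) and computes each sum of squares directly from its definition; the variable names are illustrative.

```r
# Vascular graft data from the slides: 4 pressures (treatments) x 6 batches (blocks)
x <- c(90.3, 89.2, 98.2, 93.9, 87.4, 97.9,
       92.5, 89.5, 90.6, 94.7, 87.0, 95.8,
       85.5, 90.8, 89.6, 86.2, 88.0, 93.4,
       82.5, 89.5, 85.6, 87.4, 78.9, 90.7)
PSI   <- factor(rep(c(8500, 8700, 8900, 9100), each = 6))
batch <- factor(rep(1:6, times = 4))
a <- nlevels(PSI); b <- nlevels(batch)

grand    <- mean(x)          # overall mean
trt_mean <- ave(x, PSI)      # treatment mean, repeated for each observation
blk_mean <- ave(x, batch)    # block mean, repeated for each observation

ss_total <- sum((x - grand)^2)
ss_treat <- sum((trt_mean - grand)^2)   # equals b * sum over the a treatment means
ss_block <- sum((blk_mean - grand)^2)   # equals a * sum over the b block means
ss_error <- sum((x - trt_mean - blk_mean + grand)^2)

df <- c(total = a * b - 1, treatments = a - 1,
        blocks = b - 1, error = (a - 1) * (b - 1))

print(round(c(SS_T = ss_total, SS_Treatments = ss_treat,
              SS_Blocks = ss_block, SS_E = ss_error), 2))
print(df)
print(all.equal(ss_total, ss_treat + ss_block + ss_error))
```

Dividing `ss_treat` and `ss_error` by their degrees of freedom and taking the ratio gives the F statistic for pressure reported by `anova()` for this example.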


ANOVA Display for the RCBD

In R: lm does the job again.


Vascular Graft Example

• To conduct this experiment as a RCBD, assign all 4 pressures to each of the 6 batches of resin

• Each batch of resin is called a “block”; that is, it’s a more homogeneous experimental unit on which to test the extrusion pressures


Vascular Graft Example

• R code:

    graftdata<-data.frame(
      x=c(90.3,89.2,98.2,93.9,87.4,97.9,
          92.5,89.5,90.6,94.7,87.0,95.8,
          85.5,90.8,89.6,86.2,88.0,93.4,
          82.5,89.5,85.6,87.4,78.9,90.7),
      PSI=as.factor(c(rep(c(8500,8700,8900,9100),each=6))),
      batch=as.factor(rep(1:6,4)))

    my.analysis<-lm(x~PSI+batch,data=graftdata)
    anova(my.analysis)

• Output:

    Analysis of Variance Table

    Response: x
              Df Sum Sq Mean Sq F value   Pr(>F)
    PSI        3 178.17  59.390  8.1071 0.001916 **
    batch      5 192.25  38.450  5.2487 0.005532 **
    Residuals 15 109.89   7.326
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Statistically significant batch effect – correction is needed


Residual Analysis for the Vascular Graft Example



• Basic residual plots indicate that normality, constant variance assumptions are satisfied

• No obvious problems with randomization
• No patterns in the residuals vs. block
• Can also plot residuals versus the (numerical) pressure (residuals by factor)
• These plots provide more information about the constant variance assumption, possible outliers
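The plots referred to in the list above can be produced with base R graphics. This is a sketch, assuming the same data frame and model as in the vascular graft R code; the object name `fit` and the 2 x 2 panel layout are choices made here for illustration.

```r
# Vascular graft data and model, as in the slides
graftdata <- data.frame(
  x = c(90.3, 89.2, 98.2, 93.9, 87.4, 97.9,
        92.5, 89.5, 90.6, 94.7, 87.0, 95.8,
        85.5, 90.8, 89.6, 86.2, 88.0, 93.4,
        82.5, 89.5, 85.6, 87.4, 78.9, 90.7),
  PSI   = as.factor(rep(c(8500, 8700, 8900, 9100), each = 6)),
  batch = as.factor(rep(1:6, 4)))
fit <- lm(x ~ PSI + batch, data = graftdata)
res <- residuals(fit)

op <- par(mfrow = c(2, 2))   # four panels on one page
# Normality: points should follow the reference line
qqnorm(res); qqline(res)
# Constant variance: no funnel shape expected
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# Residuals by treatment (numerical pressure)
plot(as.numeric(as.character(graftdata$PSI)), res,
     xlab = "Pressure (PSI)", ylab = "Residuals")
abline(h = 0, lty = 2)
# Residuals by block
plot(as.integer(graftdata$batch), res, xlab = "Batch", ylab = "Residuals")
abline(h = 0, lty = 2)
par(op)
```

Judging these plots remains a visual exercise; the code only arranges the residuals in the views named in the list.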


The Vascular Graft Example – Which Pressure is Different?

Fisher’s LSD. R code:

    LSD.test(my.analysis,"PSI",p.adj="bonferroni",console=T)

Output:

    Study: my.analysis ~ "PSI"

    LSD t Test for x
    P value adjustment method: bonferroni

    Mean Square Error:  7.32575

    PSI,  means and individual ( 95 %) CI

                x      std r      LCL      UCL  Min  Max
    8500 92.81667 4.577081 6 90.46148 95.17185 87.4 98.2
    8700 91.68333 3.304189 6 89.32815 94.03852 87.0 95.8
    8900 88.91667 2.966760 6 86.56148 91.27185 85.5 93.4
    9100 85.76667 4.445072 6 83.41148 88.12185 78.9 90.7

    Alpha: 0.05 ; DF Error: 15
    Critical Value of t: 3.036283

    Minimum Significant Difference: 4.744688

    Treatments with the same letter are not significantly different.

                x groups
    8500 92.81667      a
    8700 91.68333      a
    8900 88.91667     ab
    9100 85.76667      b

The two lower pressures, 8500 and 8700, form one group (with the highest mean responses); the highest pressure, 9100, forms another (with the lowest mean response). 8900 cannot be distinguished from either.


The Latin Square Design

• These designs are used to simultaneously control (or eliminate) two sources of nuisance variability

• A significant assumption is that the three factors (treatments, nuisance factors) do not interact

• If this assumption is violated, the Latin square design will not produce valid results

• The strength of Latin squares is the low number of runs. If resources are not an issue, an RCBD is a possibility.


The Rocket Propellant Problem – A Latin Square Design

• This is a 5 × 5 Latin Square design.
• Corresponding RCBD: a 5 × 5 design for each rocket propellant formula (A-E).
• Statistical analysis: lm


Statistical Analysis of the Latin Square Design

• The statistical (effects) model is

$$y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k + \varepsilon_{ijk}, \quad i = 1, 2, \ldots, p, \quad j = 1, 2, \ldots, p, \quad k = 1, 2, \ldots, p$$

• The statistical analysis (ANOVA) is much like the analysis for the RCBD.


Statistical Analysis of the Latin Square Design

Organizing data for analysis:

    rocket.data<-data.frame(
      x=c(24,20,19,24,24,
          17,24,30,27,36,
          18,38,26,27,21,
          26,31,26,23,22,
          22,30,20,29,31),
      operator=as.factor(rep(1:5,5)),
      batch=as.factor(rep(1:5,each=5)),
      formula=as.factor(c("A","B","C","D","E",
                          "B","C","D","E","A",
                          "C","D","E","A","B",
                          "D","E","A","B","C",
                          "E","A","B","C","D")))


Statistical Analysis of the Latin Square Design

Analysis: R commands:

    my.analysis<-lm(x~formula+operator+batch,data=rocket.data)
    anova(my.analysis)

Outcome:

    Analysis of Variance Table

    Response: x
              Df Sum Sq Mean Sq F value   Pr(>F)
    formula    4    330  82.500  7.7344 0.002537 **
    operator   4    150  37.500  3.5156 0.040373 *
    batch      4     68  17.000  1.5937 0.239059
    Residuals 12    128  10.667
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Batch is not statistically significant, and we can proceed to analyse formula and operator through a RCBD design with repetitions.
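The step described in the last remark, refitting without batch, could look as follows. This is a sketch of one way to do it, not necessarily the course's own code; the object name `reduced` is illustrative. Because the Latin square is balanced, the sums of squares for formula and operator are unchanged, and the batch sum of squares (68 on 4 df) is absorbed into the residual.

```r
# Rocket propellant data, as organized in the slides
rocket.data <- data.frame(
  x = c(24, 20, 19, 24, 24,
        17, 24, 30, 27, 36,
        18, 38, 26, 27, 21,
        26, 31, 26, 23, 22,
        22, 30, 20, 29, 31),
  operator = as.factor(rep(1:5, 5)),
  batch    = as.factor(rep(1:5, each = 5)),
  formula  = as.factor(c("A", "B", "C", "D", "E",
                         "B", "C", "D", "E", "A",
                         "C", "D", "E", "A", "B",
                         "D", "E", "A", "B", "C",
                         "E", "A", "B", "C", "D")))

# Reduced model: batch dropped, operator kept as the blocking factor
reduced <- lm(x ~ formula + operator, data = rocket.data)
tab <- anova(reduced)
print(tab)
```

The residual line of this table now has 12 + 4 = 16 degrees of freedom, and the F tests for formula and operator are recomputed against the enlarged residual mean square.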


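Other Latin Squares: Examples

Sizes 4 to 6:

4x4     5x5      6x6
ABDC    ADBEC    ADCEBF
BCAD    DACBE    BAECFD
CDBA    CBEDA    CEDFAB
DACB    BEACD    DCFBEA
        ECDAB    FBADCE
                 EFBADC

(The first row of the 6x6 square appeared as ABCEBF, which repeats B; ADCEBF is the only row that completes the square with the five rows below it.)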
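Whether an arrangement really is a Latin square (every letter exactly once in each row and each column) can be checked mechanically. The helpers `to_square` and `is_latin` below are hypothetical names written for this illustration. The first row of the 6x6 square is entered as ADCEBF, on the assumption that the printed ABCEBF is a misprint; the printed version is checked as well.

```r
# Turn a vector of row strings into a character matrix (illustrative helper)
to_square <- function(rows) do.call(rbind, strsplit(rows, ""))

# TRUE if every symbol occurs exactly once in each row and each column
is_latin <- function(m) {
  p <- nrow(m)
  symbols <- sort(unique(as.vector(m)))
  ok_line <- function(v) identical(sort(v), symbols)
  ncol(m) == p && length(symbols) == p &&
    all(apply(m, 1, ok_line)) && all(apply(m, 2, ok_line))
}

sq4 <- to_square(c("ABDC", "BCAD", "CDBA", "DACB"))
sq5 <- to_square(c("ADBEC", "DACBE", "CBEDA", "BEACD", "ECDAB"))
sq6 <- to_square(c("ADCEBF", "BAECFD", "CEDFAB",
                   "DCFBEA", "FBADCE", "EFBADC"))

# The 6x6 square with its first row exactly as printed on the slide
sq6_as_printed <- to_square(c("ABCEBF", "BAECFD", "CEDFAB",
                              "DCFBEA", "FBADCE", "EFBADC"))

print(c(sq4 = is_latin(sq4), sq5 = is_latin(sq5), sq6 = is_latin(sq6),
        sq6_as_printed = is_latin(sq6_as_printed)))
```

The same check can be applied to the `formula` layout of the rocket propellant experiment before running it.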


Design of Experiments

The $2^k$ Factorial Design


The $2^k$ Factorial Design

• Special case of the general factorial design; $k$ factors, all at two levels

• The two levels are usually called low and high (they could be either quantitative or qualitative)

• Very widely used in industrial experimentation
• Form a basic “building block” for other very useful experimental designs
• Special (short-cut) methods for analysis


The Simplest Case: The $2^2$

“-” and “+” denote the low and high levels of a factor, respectively

• Low and high are arbitrary terms

• Geometrically, the four runs form the corners of a square

• Factors can be quantitative or qualitative, although their treatment in the final model will be different


Chemical Process Example

A = reactant concentration, B = catalyst amount, y = recovery


Analysis Procedure for a Factorial Design

• Formulate model
• Statistical testing (ANOVA)
• Refine the model
• Analyze residuals (graphical)
• Estimate factor effects
• Interpret results


Model formulation

Organizing data for analysis:

    chem.proces.data<-data.frame(
      y=c(28,25,27,36,32,32,18,19,23,31,30,29),
      A=rep(c(-1,1,-1,1),each=3),
      B=rep(c(-1,-1,1,1),each=3))


Statistical Testing – ANOVA

R code:

    my.analysis<-lm(y~A+B+A:B,data=chem.proces.data)
    anova(my.analysis)

Outcome:

    Analysis of Variance Table

    Response: y
              Df  Sum Sq Mean Sq F value    Pr(>F)
    A          1 208.333 208.333 53.1915 8.444e-05 ***
    B          1  75.000  75.000 19.1489  0.002362 **
    A:B        1   8.333   8.333  2.1277  0.182776
    Residuals  8  31.333   3.917
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

A:B is not significant, and we proceed with a reduced model without A:B.


Statistical Testing – ANOVA

R code:

    my.analysis<-lm(y~A+B,data=chem.proces.data)
    anova(my.analysis)

Outcome:

    Analysis of Variance Table

    Response: y
              Df  Sum Sq Mean Sq F value    Pr(>F)
    A          1 208.333 208.333  47.269 7.265e-05 ***
    B          1  75.000  75.000  17.017  0.002578 **
    Residuals  9  39.667   4.407
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

A and B are both significant, and we proceed to estimate effects.


Statistical Testing – ANOVA

R code:

    my.analysis<-lm(y~A+B,data=chem.proces.data)
    summary(my.analysis)$coefficients

Outcome:

                 Estimate Std. Error   t value     Pr(>|t|)
    (Intercept) 27.500000  0.6060396 45.376576 6.132482e-12
    A            4.166667  0.6060396  6.875239 7.265111e-05
    B           -2.500000  0.6060396 -4.125143 2.578088e-03

A high concentration of reactant (A) seems to increase the recovery rate, while a high amount of catalyst (B) seems to decrease it.
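With the ±1 coding used here, a regression coefficient is the change in mean response per unit of the coded variable, while the usual factor effect is the change from the low (−1) to the high (+1) level, i.e. two coded units. The sketch below illustrates that the effect is therefore twice the coefficient, and equals the difference between the average responses at the two levels; the object names `fit`, `effects_from_coef` and `effects_from_means` are illustrative.

```r
# Chemical process data, as organized in the slides
chem.proces.data <- data.frame(
  y = c(28, 25, 27, 36, 32, 32, 18, 19, 23, 31, 30, 29),
  A = rep(c(-1, 1, -1, 1), each = 3),
  B = rep(c(-1, -1, 1, 1), each = 3))

fit <- lm(y ~ A + B, data = chem.proces.data)

# Effect = change in mean response from low (-1) to high (+1) = 2 * coefficient
effects_from_coef <- 2 * coef(fit)[c("A", "B")]

# The same effects as differences between average responses at high and low level
effects_from_means <- with(chem.proces.data, c(
  A = mean(y[A == 1]) - mean(y[A == -1]),
  B = mean(y[B == 1]) - mean(y[B == -1])))

print(rbind(from_coefficients = effects_from_coef,
            from_means = effects_from_means))
```

The two rows agree because the design is balanced and the columns for A and B are orthogonal.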


Residuals and Diagnostic Checking


The $2^3$ Factorial Design


Table of – and + Signs for the $2^3$ Factorial Design


Properties of the Table

• Except for column I, every column has an equal number of + and – signs
• The sum of the product of signs in any two columns is zero
• Multiplying any column by I leaves that column unchanged (identity element)
• The product of any two columns yields a column in the table:

$$A \times B = AB$$
$$AB \times BC = AB^2C = AC$$

• Orthogonal design
• Orthogonality is an important property shared by all factorial designs – we shall not pursue this further
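These properties can be checked by building the sign table in R. The sketch below codes – as −1 and + as +1 and constructs the interaction columns as element-wise products; the object names `design` and `signs` are illustrative.

```r
# All 8 runs of the 2^3 design, with -1 = low and +1 = high
design <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1))

# Sign table: identity column, main effects and all interaction columns
signs <- with(design, cbind(I = 1, A, B, AB = A * B, C,
                            AC = A * C, BC = B * C, ABC = A * B * C))
print(signs)

# Equal number of + and - signs in every column except I: column sums are zero
col_sums <- colSums(signs[, -1])

# Orthogonality: sum of products of any two different columns is zero
gram <- crossprod(signs)   # 8 on the diagonal, 0 elsewhere

# Product of two columns is again a column of the table: AB x BC = AC
ab_times_bc <- signs[, "AB"] * signs[, "BC"]

print(col_sums)
print(all(ab_times_bc == signs[, "AC"]))
```

The squared term drops out in $AB \times BC = AB^2C = AC$ because every entry of B is ±1, so $B^2$ is the identity column I.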


The General $2^k$ Factorial Design

• There will be $k$ main effects, and:

$\binom{k}{2}$ two-factor interactions

$\binom{k}{3}$ three-factor interactions

$\vdots$

1 $k$-factor interaction
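The counts follow from binomial coefficients and can be tabulated with `choose()`. In this short sketch, `k <- 3` is just an example value and the helper name `count_effects` is hypothetical; together with the overall mean, the effects account for all $2^k$ runs.

```r
# Number of effects of each order in a 2^k design (illustrative helper)
count_effects <- function(k) {
  counts <- choose(k, 1:k)   # main effects, 2-factor, ..., k-factor interactions
  names(counts) <- paste0(1:k, "-factor")
  counts
}

k <- 3
n_effects <- count_effects(k)
print(n_effects)

# Main effects plus all interactions: one fewer than the number of runs
print(sum(n_effects) == 2^k - 1)
```

For `k <- 3` this gives the seven non-identity columns of the sign table for the $2^3$ design.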


Concluding Remarks


Conducting an Experiment: The Process

• Plan your experiment!
• Successful experiments depend on how well they are planned.

What are you investigating?
What is the objective of your experiment?
What are you hoping to learn more about?
What are the critical factors?
Which of the factors can be controlled?
What resources will be used?


This presentation is an introduction

• Design of experiments goes much deeper;

• This presentation only refers to the simple situations.

• I refer you to the literature, e.g. the Montgomery reference on slide 5.


Survival Analysis



Main Reference


Overview

• Introduction;

• Terminology and Notation;

• Data Structures and Kaplan-Meier Curves;

• The Cox Proportional Hazards Model;

• Survival Analysis with Time Dependent Covariates.


Introduction



Introduction

• Survival analysis is about analyzing time until an event occurs.

  Start follow-up → (TIME) → Event

• ‘Time’ can be many things:
  – days, months, years, seconds, age, time since beginning of follow-up of an individual, etc.
• ‘Event’ can be many things, but is generally referred to as the Failure:
  – death, disease incidence, relapse from remission, recovery (e.g. return to work), etc. These are not necessarily negatively loaded concepts.


Example: Secchi Depth

• Water turbidity in water bodies may be measured by lowering a Secchi disc until you can’t see the disc.

The distance to the water surface when the disc can no longer be seen is the Secchi depth.


Example: Secchi Depth

• Survival analysis framework:
  – TIME is the distance to the water surface
  – EVENT is when the Secchi disc can’t be seen
  – SURVIVAL TIME is the Secchi depth.

The Secchi depth can be interpreted as a measure of eutrophication.

How should the event that the Secchi disc hits the sea bed be interpreted?


Example: Secchi Depth
• When the Secchi disc hits the sea bottom and can still be seen, the information is the following:

• The Secchi depth is more than the current depth;
• The disc can’t be lowered further to investigate the true Secchi depth; in other words, the variable cannot spend any more TIME (= distance to surface).

• We say that the variable is CENSORED.
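To make the encoding concrete, here is a minimal sketch in R of how such measurements could be turned into (time, status) pairs. All numbers and column names (`water_depth`, `disc_lost_at`) are made up for illustration and do not come from the lecture data.

```r
# Hypothetical Secchi measurements in metres; every value is invented.
secchi <- data.frame(
  site         = c("S1", "S2", "S3", "S4"),
  water_depth  = c(6.0, 3.5, 8.0, 2.0),  # depth of the sea bed at the site
  disc_lost_at = c(4.2, NA, 5.1, NA)     # depth where the disc disappeared;
                                         # NA = still visible at the sea bed
)

# Observed "survival time": the Secchi depth if the disc disappeared,
# otherwise the water depth at which observation had to stop.
secchi$time <- ifelse(is.na(secchi$disc_lost_at),
                      secchi$water_depth, secchi$disc_lost_at)

# Status: 1 = event observed (disc disappeared), 0 = censored (hit the sea bed).
secchi$status <- as.integer(!is.na(secchi$disc_lost_at))

secchi[, c("site", "time", "status")]
```

For the censored sites, `time` is only a lower bound for the true Secchi depth.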

115


Censoring
• A subject is censored at its censoring time if from that time point on we can no longer observe the survival of the subject; e.g. the depth at which the Secchi disc hits the sea bed.

• Some subjects are censored, while others are not:

• Reasons for (right-)censoring:
  - Loss to follow-up (i.e. the subject may have moved away, does not show up at the clinic, or refuses to continue);
  - Loss to competing risks;
  - Survival past the end of the study.

116


Survival Analysis designs: Cohort study (prospective/retrospective)

[Diagram: a disease-free cohort from the target population is divided into Exposed and Unexposed; each group is followed over TIME and ends as either Disease or Disease-free.]

Slide design: Kristin Sainani


Survival Analysis designs: Randomized Clinical Trial

[Diagram: a disease-free, at-risk cohort from the target population is randomly assigned to Intervention or Control; each arm is followed over TIME and ends as either Disease or Disease-free.]

[Diagram: a patient population from the target population is randomly assigned to Treatment or Control; each arm is followed over TIME and ends as either Cured or Not cured.]

[Diagram: a patient population from the target population is randomly assigned to Treatment or Control; each arm is followed over TIME and ends as either Dead or Alive.]




Why Survival Analysis?
• Why not compare mean time-to-event between groups, using a t-test or linear regression?
  – ignores censoring

• Why not compare the proportion of events between groups, using risk/odds ratios or logistic regression?
  – ignores time
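The two objections can be illustrated with a small simulation. This is a sketch under assumed settings (exponential event times with mean 10 and a study that stops at time 8); none of the numbers come from the lecture data.

```r
set.seed(1)
n <- 1000
true_time    <- rexp(n, rate = 0.1)  # true event times, mean 10 (assumed)
end_of_study <- 8                    # follow-up stops here (assumed)

# What we actually observe: time under observation and event indicator.
time   <- pmin(true_time, end_of_study)
status <- as.integer(true_time <= end_of_study)

mean(true_time)          # the quantity we would like to estimate
mean(time)               # treats censoring times as event times: too small
mean(time[status == 1])  # discards censored subjects: also too small
mean(status)             # proportion of events: carries no information on timing
```

Both naive means fall below the mean of the true times, because every censored subject would have contributed a time larger than `end_of_study`.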

121


Setting the Scene: Terminology – Observations T and d
• What we observe:
• T: Survival time. T is a random variable
• d: Failure status:

  $d = \begin{cases} 1 & \text{if failure} \\ 0 & \text{if censored} \end{cases}$

Observations $(T, d)$:
  $(T_A, d_A) = (5, 1)$;   $(T_B, d_B) = (12, 0)$
  $(T_C, d_C) = (3.5, 0)$; $(T_D, d_D) = (8, 0)$
  $(T_E, d_E) = (6, 0)$;   $(T_F, d_F) = (3.5, 1)$

Note that C and D also have delayed entry, so there is a third variable in play.

122


Survival Analysis – Terminology and Notation

123


Terminology – The Survival Function S
• The stochastic variable T has a distribution.
• This is given by the survival function S:

  $S(t) = P(T > t)$

124


The Survival Function S: Example
• T = onset of Alzheimer’s disease, grouped by the number of E4 alleles in the APOE gene

• The area between the curves, weighted with general survival, is the average number of years you lose/gain by having a specific genotype relative to another

125


Terminology – The Hazard function $h$
The hazard function:

  $h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$

$h(t)$ gives the instantaneous potential per unit time for the event to occur, given that the individual has survived up to time $t$.
Relationship with the survival function $S$:

  $h(t) = -\frac{S'(t)}{S(t)}$;   $S(t) = \exp\left(-\int_0^t h(s)\,ds\right)$
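The relationship between the hazard and the survival function can be verified numerically. The sketch below assumes a Weibull distribution with arbitrarily chosen shape and scale; it computes the hazard as density divided by survival function (the density equals $-S'(t)$) and then recovers $S(t)$ by integrating the hazard.

```r
shape <- 1.5   # assumed Weibull shape
scale <- 10    # assumed Weibull scale

S <- function(t) pweibull(t, shape, scale, lower.tail = FALSE)  # S(t) = P(T > t)
f <- function(t) dweibull(t, shape, scale)                      # f(t) = -S'(t)
h <- function(t) f(t) / S(t)                                    # hazard

t0 <- 7
cum_hazard <- integrate(h, lower = 0, upper = t0)$value

# The two numbers should agree up to numerical integration error.
c(S_direct = S(t0), S_from_hazard = exp(-cum_hazard))
```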

126


Terminology – The Hazard function $h$
The hazard function is a rate, not a probability:
Suppose that you drive at 60 km/h. This gives you a potential for driving: if you continue for 1 hour, you cover 60 km. However, you may slow down, speed up or stop during the next hour. The 60 km/h gives the instantaneous potential for driving, but says nothing about the distance covered.

Similarly with the hazard rate $h$: it gives the instantaneous potential for failure, but says nothing about survival over intervals.

127


The Hazard function $h$ - Example

Constant hazard: $S(t) = e^{-\lambda t}$. Subjects healthy in the study period

128



Increasing Weibull hazard: With no treatment, the risk of dying increases.

129



Decreasing Weibull hazard: The risk of dying after surgery is highest immediately after.

130



Lognormal hazard: The risk of dying from TB increases early in the disease progression and decreases later.
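The four hazard shapes shown on these slides can be reproduced with base R distribution functions. This is an illustrative sketch: the parameter values are assumptions chosen only to produce the constant, increasing, decreasing and rise-then-fall shapes, not the values behind the original figures.

```r
# Hazard = density / survival function.
hazard <- function(dens, surv) dens / surv
t <- seq(0.5, 20, by = 0.5)

h_const <- hazard(dexp(t, rate = 0.1),
                  pexp(t, rate = 0.1, lower.tail = FALSE))
h_inc   <- hazard(dweibull(t, shape = 2, scale = 10),
                  pweibull(t, shape = 2, scale = 10, lower.tail = FALSE))
h_dec   <- hazard(dweibull(t, shape = 0.5, scale = 10),
                  pweibull(t, shape = 0.5, scale = 10, lower.tail = FALSE))
h_lnorm <- hazard(dlnorm(t, meanlog = 1.5, sdlog = 0.8),
                  plnorm(t, meanlog = 1.5, sdlog = 0.8, lower.tail = FALSE))

matplot(t, cbind(h_const, h_inc, h_dec, h_lnorm), type = "l",
        lty = 1, col = 1:4, xlab = "t", ylab = "h(t)")
legend("topright", col = 1:4, lty = 1,
       legend = c("Constant (exponential)", "Increasing Weibull",
                  "Decreasing Weibull", "Lognormal"))
```

A Weibull shape parameter above 1 gives an increasing hazard and below 1 a decreasing one; shape equal to 1 is the exponential (constant hazard) case.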

131


The Hazard function $h$
Main reasons for studying the hazard function:

• It is a measure of instantaneous potential, whereas a survival curve is a cumulative measure over time;

• It may be used to identify a specific model form, such as an exponential, a Weibull, or a lognormal curve that fits one’s data;

• It is the vehicle by which mathematical modeling of survival data is carried out; that is, the survival model is usually written in terms of the hazard function.

132


Censoring Revisited
Three assumptions on censoring to make the analysis work:

• Independent (vs. non-independent) censoring

• Random (vs. non-random) censoring (more restrictive than independent censoring)

• Non-informative (vs. informative) censoring

For mathematical formulations, see e.g. Kalbfleisch and Prentice (1980)

133


Random Censoring
• The subjects who are censored at time t should be representative of all the subjects who remain at risk at time t with respect to their survival experience.

• Thus: the failure rate of those censored at time t is assumed equal to the failure rate of those remaining at risk at time t.

• If there is only one group, random and independent censoring are the same.

• Random censoring implies independent censoring.

134


Independent Censoring
• Within any subgroup of interest, the subjects who are censored at time t should be representative of all the subjects in that subgroup who remain at risk at time t with respect to their survival experience.

• In other words, censoring is independent provided that it is random within any subgroup of interest.

• Problem: Bias.

135


Non-Informative Censoring
• Non-informative censoring occurs if the failure time distribution of T provides no information about the distribution of censoring times C, and vice versa

• Often justifiable under random and independent censoring

136


Informative Censoring: Example
• In a study comparing disease-free survival after two treatments for cancer, the control arm may be ineffective, leading to more recurrences and patients becoming too sick to follow up.

• On the other hand, patients on the intervention arm may be completely cured by an effective treatment and may no longer feel the need to follow up. If these participants are routinely censored, the true treatment effect will not be picked up and the results of the study will be biased.

• Disease-free survival rates would be based on the patients who continued to be followed up in the study, and would be overestimated for the control arm and underestimated for the treatment arm.

Ranganathan and Pramesh (2012)

137


Dealing with Issues of Non-Compliance

• Well-structured designs! Rule out the problem by carefully designing your survey.

• Imputation of values (R package: InformativeCensoring);

• Sensitivity analyses.

• See e.g. Leung, Elashoff and Afifi (1997); Campigotto and Weller (2014); Jackson et al. (2014); Hsu and Taylor (2009).

138


Data Structures and Kaplan-Meier Curves

139


Goals of Survival Analysis
Goal 1: To estimate and interpret survivor and/or hazard functions from survival data.
- Constant, Weibull and lognormal hazard examples

Goal 2: To compare survivor and/or hazard functions.
- Alzheimer’s disease example

Goal 3: To assess the relationship of explanatory variables to survival time.
- Mathematical modelling; to be addressed

140


Data Structures for Survival Analysis

141


Data Structures in Practice: The Takafumi data
• R commands:

    # stringsAsFactors = TRUE keeps Group and Infection_model as factors
    # (this was the default before R 4.0)
    TAKAFUMI <- read.csv2("Data/TAKAFUMI_nga.csv", stringsAsFactors = TRUE)
    head(TAKAFUMI)

• Output:

      Tank Time Status          Group Infection_model
    1   41   19      1 SE-SVA-1033-9C            Bath
    2   41   29      0 SE-SVA-1033-9C            Bath
    3   41   29      0 SE-SVA-1033-9C            Bath
    4   41   29      0 SE-SVA-1033-9C            Bath
    5   41   29      0 SE-SVA-1033-9C            Bath
    6   41   29      0 SE-SVA-1033-9C            Bath

The column Time is T; the column Status is d.

142


Data Structures in Practice: The Takafumi data
Restructuring to add a subject ID and get the important columns first:
• R commands:

    TAKAFUMI <- data.frame(ID = seq_len(nrow(TAKAFUMI)),
                           TAKAFUMI[, c(2, 3, 1, 4, 5)])
    head(TAKAFUMI)

• Output:

      ID Time Status Tank          Group Infection_model
    1  1   19      1   41 SE-SVA-1033-9C            Bath
    2  2   29      0   41 SE-SVA-1033-9C            Bath
    3  3   29      0   41 SE-SVA-1033-9C            Bath
    4  4   29      0   41 SE-SVA-1033-9C            Bath
    5  5   29      0   41 SE-SVA-1033-9C            Bath
    6  6   29      0   41 SE-SVA-1033-9C            Bath

143


Alternative Data Structure: The Counting Process Approach

• Several lines per subject;

• TWO time points: Start and Stop

• We shall return to this structure when considering time-dependent covariates.

144


Goal 1: Estimating the Survival Function S

• We observe $(T_1, d_1), (T_2, d_2), \dots, (T_n, d_n)$ (i.e. $n$ subjects).
• Define the process of events $N$ as

  $N(t) = \sum_{i=1}^{n} 1_{\{T_i \le t,\ d_i = 1\}}$

  The jumps of $N(t)$ indicate the number of events at time $t$.
• Define the population at risk $Y$ as

  $Y(t) = \sum_{i=1}^{n} 1_{\{t \le T_i\}}$
145


Goal 1: Estimating the Survival Function S

• S is the survival function: $S(t) = P(T > t)$

• Define the Kaplan-Meier estimator $\hat{S}$ as

  $\hat{S}(t) = \prod_{s \le t} \left(1 - \frac{\Delta N(s)}{Y(s)}\right)$

If events occur at $t_1, \dots, t_k$ up to time $t$, the Kaplan-Meier estimator takes the form

  $\hat{S}(t) = \prod_{i=1}^{k} \left(1 - \frac{\Delta N(t_i)}{Y(t_i)}\right)$

146


Goal 1: The Kaplan-Meier estimator

• Alternative formulation of the Kaplan-Meier estimator:

  $\hat{S}(t) = \prod_{i=1}^{k} \frac{Y(t_i) - \Delta N(t_i)}{Y(t_i)}$

Thus, the Kaplan-Meier estimator is the successive product of the ratios between those that survive and those that are at risk.

Variance (Greenwood's formula):

  $\widehat{\mathrm{Var}}\left(\hat{S}(t)\right) = \hat{S}(t)^2 \sum_{t_i \le t} \frac{\Delta N(t_i)}{Y(t_i)\left(Y(t_i) - \Delta N(t_i)\right)}$

Greenwood (1926)
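The product formula and the Greenwood variance can be computed by hand in a few lines of base R. The sketch below uses the six observations (A to F) from the terminology slide and, as a simplifying assumption, ignores the delayed entry of C and D; the variable names are illustrative.

```r
# Observations A-F from the terminology slide (delayed entry ignored here).
time   <- c(5, 12, 3.5, 8, 6, 3.5)
status <- c(1, 0, 0, 0, 0, 1)

# Distinct event times t_1 < ... < t_k.
event_times <- sort(unique(time[status == 1]))

# Jumps of N and the number at risk Y at each event time.
dN <- sapply(event_times, function(s) sum(time == s & status == 1))
Y  <- sapply(event_times, function(s) sum(time >= s))

# Kaplan-Meier estimate and Greenwood variance at each event time.
S_hat <- cumprod((Y - dN) / Y)
var_S <- S_hat^2 * cumsum(dN / (Y * (Y - dN)))

data.frame(time = event_times, at_risk = Y, events = dN,
           S_hat = S_hat, se = sqrt(var_S))
```

With the survival package loaded, `summary(survfit(Surv(time, status) ~ 1))` should give the same survival estimates and standard errors.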

147


Goal 1: The Kaplan-Meier Estimator

• R code:

    library(survival)   # provides Surv() and survfit()
    head(TAKAFUMI, n = 3)
    plot(survfit(Surv(Time, Status) ~ Group, data = TAKAFUMI), col = 1:8)
    legend("bottomleft", legend = levels(as.factor(TAKAFUMI$Group)),
           col = 1:8, lty = 1)

• Output:

      ID Time Status Tank          Group Infection_model
    1  1   19      1   41 SE-SVA-1033-9C            Bath
    2  2   29      0   41 SE-SVA-1033-9C            Bath
    3  3   29      0   41 SE-SVA-1033-9C            Bath

148


• One group at a time, with confidence intervals:
• R code:

    my.levels <- levels(as.factor(TAKAFUMI$Group))
    par(mfrow = c(3, 3))
    for (i in seq_along(my.levels)) {
      plot(survfit(Surv(Time, Status) ~ 1,
                   data = TAKAFUMI[TAKAFUMI$Group == my.levels[i], ]),
           col = i, main = my.levels[i], lwd = 1.5)
    }
    par(mfrow = c(1, 1))


149


• Comparing group 1 and 8:
• R code:

    TAKAFUMI.temp <- TAKAFUMI[TAKAFUMI$Group %in% my.levels[c(1, 8)], ]
    # re-create the factor to drop the unused group levels
    TAKAFUMI.temp$Group <- as.factor(as.character(TAKAFUMI.temp$Group))
    plot(survfit(Surv(Time, Status) ~ Group, data = TAKAFUMI.temp),
         conf.int = TRUE, col = c(1, 2),
         main = paste("Comparing", my.levels[1], "and", my.levels[8]))
    legend("bottomleft",
           legend = levels(as.factor(TAKAFUMI.temp$Group)),
           col = 1:2, lty = 1)


150


• Comparing group 1 and 8, only infection method "Bath":

• R code:

    TAKAFUMI.temp <- TAKAFUMI[TAKAFUMI$Group %in% my.levels[c(1, 8)] &
                                TAKAFUMI$Infection_model == "Bath", ]
    # re-create the factor to drop the unused group levels
    TAKAFUMI.temp$Group <- as.factor(as.character(TAKAFUMI.temp$Group))
    plot(survfit(Surv(Time, Status) ~ Group, data = TAKAFUMI.temp),
         conf.int = TRUE, col = c(1, 2),
         main = paste("Comparing", my.levels[1], "and", my.levels[8]),
         sub = "Infection Type: Bath")
    legend("bottomleft",
           legend = levels(as.factor(TAKAFUMI.temp$Group)),
           col = 1:2, lty = 1)


151


Goal 2: The Log Rank Test

• The test statistic for comparing two groups is calculated as follows:

  $Z^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2}$

  where $O_1$ and $O_2$ are the total numbers of observed events in groups 1 and 2, respectively, and $E_1$ and $E_2$ are the total numbers of expected events. Under the assumption of identical hazards, $Z^2$ is $\chi^2$-distributed with 1 degree of freedom.

• The total expected number of events for a group is the sum of the expected numbers of events at each event time.

• The expected number of events at an event time can be calculated as the risk of an event at that time (estimated from both groups combined), multiplied by the number at risk in the group.
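The calculation of observed and expected events can be traced step by step in base R. The sketch below uses a small made-up two-group data set (ten subjects; all values are invented) and implements the $(O - E)^2/E$ form of the statistic described on this slide.

```r
# Made-up data: two groups of five subjects each.
time   <- c(3, 5, 7, 9, 12,  4, 6, 6, 10, 15)
status <- c(1, 1, 0, 1, 0,   1, 0, 1, 1,  0)
group  <- rep(c(1, 2), each = 5)

event_times <- sort(unique(time[status == 1]))
E1 <- 0
E2 <- 0
for (s in event_times) {
  at_risk1 <- sum(time >= s & group == 1)
  at_risk2 <- sum(time >= s & group == 2)
  events   <- sum(time == s & status == 1)
  risk     <- events / (at_risk1 + at_risk2)  # pooled risk of an event at time s
  E1 <- E1 + risk * at_risk1                  # expected events, group 1
  E2 <- E2 + risk * at_risk2                  # expected events, group 2
}
O1 <- sum(status[group == 1])
O2 <- sum(status[group == 2])

Z2 <- (O1 - E1)^2 / E1 + (O2 - E2)^2 / E2
p_value <- pchisq(Z2, df = 1, lower.tail = FALSE)
c(O1 = O1, E1 = E1, O2 = O2, E2 = E2, Z2 = Z2, p = p_value)
```

Note that `survdiff` reports both the `(O-E)^2/E` terms and a chi-square statistic based on `(O-E)^2/V`, so its headline statistic will generally differ somewhat from `Z2` here.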

152


• Let's make the observations from the comparisons formal:
• In R, the log rank test is performed by the survdiff function:

• Group comparisons ignoring infection method, R code:

    TAKAFUMI.temp <- TAKAFUMI[TAKAFUMI$Group %in% my.levels[c(1, 8)], ]
    TAKAFUMI.temp$Group <- as.factor(as.character(TAKAFUMI.temp$Group))
    survdiff(Surv(Time, Status) ~ Group, data = TAKAFUMI.temp)

Output:

    Call:
    survdiff(formula = Surv(Time, Status) ~ Group, data = TAKAFUMI.temp)

                                  N Observed Expected (O-E)^2/E (O-E)^2/V
    Group=negative control bath  72        4     28.9     21.42      31.9
    Group=SE-SVA-14 wild-type   198       89     64.1      9.64      31.9

     Chisq= 31.9  on 1 degrees of freedom, p= 2e-08


153


• Group comparisons with infection method "Bath", R code:

    TAKAFUMI.temp <- TAKAFUMI[TAKAFUMI$Group %in% my.levels[c(1, 8)] &
                                TAKAFUMI$Infection_model == "Bath", ]
    TAKAFUMI.temp$Group <- as.factor(as.character(TAKAFUMI.temp$Group))
    survdiff(Surv(Time, Status) ~ Group, data = TAKAFUMI.temp)

Output:

    Call:
    survdiff(formula = Surv(Time, Status) ~ Group, data = TAKAFUMI.temp)

                                  N Observed Expected (O-E)^2/E (O-E)^2/V
    Group=negative control bath  72        4     4.93     0.174     0.297
    Group=SE-SVA-14 wild-type   102        8     7.07     0.122     0.297

     Chisq= 0.3  on 1 degrees of freedom, p= 0.6


154


Goal 2: The Log Rank Test
• Alternative formula: divide by the variance instead of the expected number of events (the (O-E)^2/V column in the survdiff output); the results are approximately similar

• Alternatives:

• The Wilcoxon test (rank test);

• Maximum likelihood methods.

• Reference: Fleming and Harrington (1982).

155


The Cox Proportional Hazards Model

156


Goal 3: The Cox Proportional Hazards Model
• Goal 3: To assess the relationship of explanatory variables to survival time

• We need a framework where we can take covariates into account:

    > summary(TAKAFUMI)
           ID              Time          Status            Tank                          Group     Infection_model
     Min.   :   1.0   Min.   : 1.0   Min.   :0.0000   Min.   :25.00   SE-SVA-1033-9C       :223   Bath:722
     1st Qu.: 362.2   1st Qu.:29.0   1st Qu.:0.0000   1st Qu.:39.00   SE-SVA-1033-3F       :221   IP  :724
     Median : 723.5   Median :29.0   Median :0.0000   Median :53.00   SE-SVA-14-3D         :221
     Mean   : 723.5   Mean   :25.7   Mean   :0.2075   Mean   :51.06   SE-SVA-1033 wild-type:218
     3rd Qu.:1084.8   3rd Qu.:29.0   3rd Qu.:0.0000   3rd Qu.:65.00   SE-SVA-14-5G         :217
     Max.   :1446.0   Max.   :29.0   Max.   :1.0000   Max.   :76.00   SE-SVA-14 wild-type  :198

The columns Tank, Group and Infection_model are the covariates.

157


Goal 3: The Cox Proportional Hazards Model
• Semi-parametric model;
• Abstain from parametrizing the hazard function completely, in order to be able to perform comparisons.

For subject $i$ with $k$ covariates $X_{i1}, \dots, X_{ik}$:

  $h_i(t) = h_0(t) \exp\left(\sum_{j=1}^{k} \theta_j X_{ij}\right)$

where $h_0(t)$ is a baseline hazard that in general is not estimated.

158


Goal 3: The Cox Proportional Hazards Model
• In the TAKAFUMI case (at first we ignore the tank):

For subject $i$:

  $h_i(t) = h_0(t) \exp\left(\alpha_{\mathrm{Group}(i)} + \beta_{\mathrm{Infection.model}(i)}\right)$

Individuals within the same Group and infection model have the same hazard. The reference group "negative control bath" has hazard $h_0(t)$.

159


Goal 3: The Cox Proportional Hazards Model
• Hazard ratio between two individuals, say 1 and 2, with the same Group (i.e. Group(1) = Group(2)), but different infection models (IP and Bath, respectively):

  $\frac{h_1(t)}{h_2(t)} = \frac{h_0(t) \exp\left(\alpha_{\mathrm{Group}(1)} + \beta_{IP}\right)}{h_0(t) \exp\left(\alpha_{\mathrm{Group}(2)}\right)} = \exp\left(\beta_{IP}\right)$

• Thus, the exponentiated coefficient gives the hazard ratio when changing infection model, irrespective of Group status.

• The hazards $h_1(t)$ and $h_2(t)$ are proportional.
• Hence the name…

160


Goal 3: The Cox Proportional Hazards Model
In R, the coxph function estimates the proportional hazards model. We use the Surv function to specify which variables hold the time-to-event and the censoring status.

• R code:

    my.analysis <- coxph(Surv(Time, Status) ~ Group + Infection_model,
                         data = TAKAFUMI)

161


Goal 3: The Cox Proportional Hazards Model
• Is the Cox proportional hazards model a good model for these data?

• Model control of the proportional hazards assumption:
• cox.zph in R.

    > cox.zph(my.analysis)
                                   rho chisq       p
    GroupNegative control IP    0.1604 7.681 0.00558
    GroupSE-SVA-1033-3F         0.1122 3.882 0.04880
    GroupSE-SVA-1033-9C         0.1238 4.646 0.03112
    GroupSE-SVA-1033 wild-type  0.1118 3.722 0.05369
    GroupSE-SVA-14-3D           0.0978 2.869 0.09033
    GroupSE-SVA-14-5G           0.0940 2.674 0.10202
    GroupSE-SVA-14 wild-type    0.1221 4.437 0.03517
    Infection_modelIP          -0.0488 0.734 0.39167
    GLOBAL                          NA 9.744 0.28349

162

• GLOBAL test: overall, no problem!
• Some individual p-values are less than 0.05! But Bonferroni corrected, the smallest value is only borderline significant.
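The Bonferroni remark can be reproduced from the p-values printed above. This sketch assumes that the correction is applied across the eight individual terms (excluding the GLOBAL row); the slide does not state the exact number of tests used.

```r
# Individual p-values copied from the cox.zph output above (GLOBAL excluded).
p <- c("GroupNegative control IP"   = 0.00558,
       "GroupSE-SVA-1033-3F"        = 0.04880,
       "GroupSE-SVA-1033-9C"        = 0.03112,
       "GroupSE-SVA-1033 wild-type" = 0.05369,
       "GroupSE-SVA-14-3D"          = 0.09033,
       "GroupSE-SVA-14-5G"          = 0.10202,
       "GroupSE-SVA-14 wild-type"   = 0.03517,
       "Infection_modelIP"          = 0.39167)

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p, method = "bonferroni")
round(p_bonf, 3)

# Smallest corrected p-value: 8 * 0.00558, just below 0.05.
min(p_bonf)
```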


Goal 3: The Cox Proportional Hazards Model

Investigating significant effects of Group:

• R code:

    my.analysis2 <- coxph(Surv(Time, Status) ~ Infection_model,
                          data = TAKAFUMI)
    anova(my.analysis, my.analysis2)

Output:

    Analysis of Deviance Table
     Cox model: response is Surv(Time, Status)
     Model 1: ~ Group + Infection_model
     Model 2: ~ Infection_model
       loglik  Chisq Df P(>|Chi|)
    1 -1924.2
    2 -2042.2 236.05  7 < 2.2e-16 ***
    ---

Group, corrected for Infection model, is strongly significant.

163


Goal 3: The Cox Proportional Hazards Model

Investigating significant effects of Infection Model:

• R code:

    my.analysis2 <- coxph(Surv(Time, Status) ~ Group,
                          data = TAKAFUMI)
    anova(my.analysis, my.analysis2)

Output:

    Analysis of Deviance Table
     Cox model: response is Surv(Time, Status)
     Model 1: ~ Group + Infection_model
     Model 2: ~ Group
       loglik  Chisq Df P(>|Chi|)
    1 -1924.2
    2 -2063.9 279.49  1 < 2.2e-16 ***

Infection Model, corrected for Group, is strongly significant.

164


Goal 3: The Cox Proportional Hazards Model

• Parameter values:

    > summary(my.analysis)$coef
                                      coef   exp(coef)  se(coef)           z     Pr(>|z|)
    GroupNegative control IP   -3.12569262  0.04390651 0.8836013 -3.53744669 4.040157e-04
    GroupSE-SVA-1033-3F         0.03483477  1.03544861 0.5398640  0.06452509 9.485521e-01
    GroupSE-SVA-1033-9C        -1.04266264  0.35251481 0.5629389 -1.85217723 6.400038e-02
    GroupSE-SVA-1033 wild-type  0.38183948  1.46497690 0.5360187  0.71236217 4.762405e-01
    GroupSE-SVA-14-3D          -0.46135776  0.63042710 0.5481104 -0.84172411 3.999424e-01
    GroupSE-SVA-14-5G          -1.67392506  0.18750963 0.5937000 -2.81947963 4.810158e-03
    Infection_modelIP           2.33718070 10.35200996 0.1753049 13.33208884 1.506205e-40

Reference group: "negative control bath"
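To connect the table to the hazard ratio interpretation, the following sketch recomputes the hazard ratio, the z statistic and a Wald-type 95% confidence interval for Infection_modelIP from the printed coefficient and standard error. The confidence interval is an addition for illustration; it is not shown on the slide.

```r
# Coefficient and standard error for Infection_modelIP, copied from the table.
coef_ip <- 2.33718070
se_ip   <- 0.1753049

hr <- exp(coef_ip)    # hazard ratio IP vs. Bath, matches the exp(coef) column
z  <- coef_ip / se_ip # Wald statistic, matches the z column

# Wald-type 95% confidence interval for the hazard ratio.
ci <- exp(coef_ip + c(-1, 1) * qnorm(0.975) * se_ip)

c(hazard_ratio = hr, z = z, lower = ci[1], upper = ci[2])
```

Under the fitted model, the hazard with infection model IP is estimated to be roughly ten times the hazard with Bath, for any fixed Group.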

165


Goal 3: The Cox Proportional Hazards Model
• Visualizing: estimated survival curves from the fitted model

    > new.data <- data.frame(
        Group = c(levels(TAKAFUMI$Group)[1:2],
                  rep(levels(TAKAFUMI$Group)[-(1:2)], 2)),
        Infection_model = c(levels(TAKAFUMI$Infection_model),
                            rep(levels(TAKAFUMI$Infection_model),
                                each = length(levels(TAKAFUMI$Group)) - 2)))
    > new.data
                       Group Infection_model
    1  negative control bath            Bath
    2    Negative control IP              IP
    3         SE-SVA-1033-3F            Bath
    4         SE-SVA-1033-9C            Bath
    5  SE-SVA-1033 wild-type            Bath
    6           SE-SVA-14-3D            Bath
    7           SE-SVA-14-5G            Bath
    8    SE-SVA-14 wild-type            Bath
    9         SE-SVA-1033-3F              IP
    10        SE-SVA-1033-9C              IP
    11 SE-SVA-1033 wild-type              IP
    12          SE-SVA-14-3D              IP
    13          SE-SVA-14-5G              IP
    14   SE-SVA-14 wild-type              IP

166


Goal 3: The Cox Proportional Hazards Model
• Visualizing: estimated survival curves from the fitted model:

• Drawing:

    plot(survfit(my.analysis, newdata = new.data), col = 1:14,
         lty = as.numeric(as.factor(new.data$Infection_model)))
    legend("bottomleft",  # position assumed; the first line of this call is missing from the slide
           legend = paste(new.data$Group, new.data$Infection_model),
           col = 1:14, text.col = 1:14, bty = "n", cex = 1.2,
           lty = as.numeric(as.factor(new.data$Infection_model)))

167



168



• What if we add Tank to the model?

    > my.analysis <- coxph(
        Surv(Time, Status) ~ Group + Infection_model + as.factor(Tank),
        data = TAKAFUMI)
    Warning message:
    In fitter(X, Y, strats, offset, init, control, weights = weights, :
      Loglik converged before variable 9,15,22,38,39 ; coefficient may be infinite.

• Not a great model; and we have no direct interest in the effect of Tank.

• While we may expect Tank to influence results, the effect is of no value prospectively:

• In the next experiment, the tanks will be under different circumstances.

169



• Checking the proportional hazards assumption:

    cox.zph(my.analysis)

• Some combinations of Group and Infection_model use only 1 tank, so many parameters cannot be estimated.

• But the proportional hazards assumption is no longer questionable: the smallest p-value in cox.zph is 0.025 before Bonferroni correction, and the global p-value is 0.19.

170



• Let's treat the Tank effect as random:

    library(coxme)
    my.analysis <- coxme(Surv(Time, Status) ~
                           Group + Infection_model + (1 | as.factor(Tank)),
                         data = TAKAFUMI)

171



• Statistical inference on Group and Infection_model:

    # test of Group: compare with the model without Group
    my.analysis2 <- coxme(Surv(Time, Status) ~
                            Infection_model + (1 | as.factor(Tank)),
                          data = TAKAFUMI)
    anova(my.analysis, my.analysis2)

    # test of Infection_model: compare with the model without Infection_model
    my.analysis2 <- coxme(Surv(Time, Status) ~
                            Group + (1 | as.factor(Tank)),
                          data = TAKAFUMI)
    anova(my.analysis, my.analysis2)

• Both analyses give strongly significant results.

172



Parameter estimates in the random effects model:

    summary(my.analysis)$fixed

    Model: Surv(Time, Status) ~ Group + Infection_model + (1 | as.factor(Tank))
    Fixed coefficients
                                     coef  exp(coef)  se(coef)     z       p
    GroupNegative control IP   -3.0897083 0.04551523 0.9080259 -3.40 0.00067
    GroupSE-SVA-1033-3F         0.1328129 1.14203629 0.5651421  0.24 0.81000
    GroupSE-SVA-1033-9C        -0.9921949 0.37076201 0.5907804 -1.68 0.09300
    GroupSE-SVA-1033 wild-type  0.3731670 1.45232688 0.5629024  0.66 0.51000
    GroupSE-SVA-14-3D          -0.4303877 0.65025694 0.5759943 -0.75 0.45000
    GroupSE-SVA-14-5G          -1.6311436 0.19570563 0.6209667 -2.63 0.00860
    Infection_modelIP           2.2969157 9.94346652 0.1910049 12.03 0.00000

173



Parameter estimates comparison:

    # random effects model
                                    coef  exp(coef)  se(coef)     z       p
    GroupNegative control IP  -3.0897083 0.04551523 0.9080259 -3.40 0.00067
    GroupSE-SVA-1033-3F        0.1328129 1.14203629 0.5651421  0.24 0.81000
    GroupSE-SVA-1033-9C       -0.9921949 0.37076201 0.5907804 -1.68 0.09300
    GroupSE-SVA-14-3D         -0.4303877 0.65025694 0.5759943 -0.75 0.45000
    GroupSE-SVA-14-5G         -1.6311436 0.19570563 0.6209667 -2.63 0.00860
    Infection_modelIP          2.2969157 9.94346652 0.1910049 12.03 0.00000

    # fixed effects model:
                                     coef   exp(coef)  se(coef)     z     Pr(>|z|)
    GroupNegative control IP  -3.12569262  0.04390651 0.8836013 -3.54 4.040157e-04
    GroupSE-SVA-1033-3F        0.03483477  1.03544861 0.5398640  0.06 9.485521e-01
    GroupSE-SVA-1033-9C       -1.04266264  0.35251481 0.5629389 -1.85 6.400038e-02
    GroupSE-SVA-14-3D         -0.46135776  0.63042710 0.5481104 -0.84 3.999424e-01
    GroupSE-SVA-14-5G         -1.67392506  0.18750963 0.5937000 -2.82 4.810158e-03
    Infection_modelIP          2.33718070 10.35200996 0.1753049 13.33 1.506205e-40

174


Sample Size Determination for the Cox Proportional Hazards Model
• The required sample size is determined through a required number of events, $N_{EV}$.
• For two equally sized groups, let $\Delta$ be the hazard ratio between them. To detect a hazard ratio of $\Delta$ with a power of $1 - \beta$, using a test level $\alpha$, the necessary number of events is

  $N_{EV} = \left(\frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)(\Delta + 1)}{\Delta - 1}\right)^2$

• Power:

  $z_{EV} = \sqrt{N_{EV}}\,\frac{\Delta - 1}{\Delta + 1} - z_{1-\alpha/2}$

  $\mathrm{Power} = \Phi(z_{EV})$,

  where $\Phi$ is the distribution function for the standard normal:

  $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-u^2/2}\,du$

• In R: Φ = pnorm

175


Sample Size Determination for the Cox Proportional Hazards Model

• Because of the semiparametric nature of the Cox proportional hazards model, no general method exists to derive the number of subjects $N$ from $N_{EV}$.

• Assume that the probability of an event is $p_1$ in group 1 and $p_2$ in group 2. Then

  $N = \frac{N_{EV}}{(p_1 + p_2)/2}$

R package: powerSurvEpi (2018).

176
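The sample size calculation can be sketched in base R as follows. The functions below are illustrative helpers (not part of powerSurvEpi), written with the standard form of the events formula that uses the sum of the two normal quantiles; the hazard ratio and event probabilities in the example are assumed values.

```r
# Required number of events to detect hazard ratio `hr` (two equal groups).
events_needed <- function(hr, alpha = 0.05, power = 0.80) {
  ((qnorm(1 - alpha / 2) + qnorm(power)) * (hr + 1) / (hr - 1))^2
}

# Power obtained with `n_ev` events.
power_from_events <- function(n_ev, hr, alpha = 0.05) {
  pnorm(sqrt(n_ev) * abs(hr - 1) / (hr + 1) - qnorm(1 - alpha / 2))
}

# Number of subjects, given the event probabilities in the two groups.
subjects_needed <- function(n_ev, p1, p2) n_ev / ((p1 + p2) / 2)

n_ev <- events_needed(hr = 2)        # assumed hazard ratio of 2
ceiling(n_ev)                        # events, rounded up
power_from_events(n_ev, hr = 2)      # recovers the requested power of 0.80
ceiling(subjects_needed(n_ev, p1 = 0.5, p2 = 0.3))  # assumed event probabilities
```

Rounding up the number of events and subjects errs on the side of slightly more power than requested.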


Survival Analysis with Time-Dependent Covariates

177


Survival Analysis with Time-Dependent Covariates
• The addicts dataset: survival times in days of heroin addicts from entry to a clinic until departure.
• Data provided by John Caplehorn, The University of Sydney, Dept of Public Health.

Column 1 = ID of subject
       2 = Clinic (1 or 2)
       3 = status (0=censored, 1=endpoint)
       4 = survival time (days)
       5 = prison record?
       6 = methadone dose (mg/day)

178


Survival Analysis with Time-Dependent Covariates
• addicts dataset in R:

    addicts <- read.table("Data/addicts.txt", header = TRUE)
    head(addicts)

      ID Clinic Status Survival prison methodone
    1  1      1      1      428      0        50
    2  2      1      1      275      1        55
    3  3      1      1      262      0        55
    4  4      1      1      183      0        30
    5  5      1      1      259      1        65
    6  6      1      1      714      0        55

• 238 data lines of drug addicts treated with methadone

179


Survival Analysis with Time-Dependent Covariates
• Suppose that the variable ‘methadone dose’ violates the proportional hazards assumption, and we are interested in defining a time-varying covariate as the product of DOSE and the natural log of time (Survival).

• We need to re-organize the data to facilitate this.

• For this, we have the survSplit function in R.

180


Survival Analysis with Time-Dependent Covariates

    addicts.cp <- survSplit(addicts,
                            cut = addicts$Survival[addicts$Status == 1],
                            end = "Survival",
                            event = "Status",
                            start = "start",
                            id = "ID2")

Breaks up the addicts dataset into lines corresponding to the intervals between consecutive time points where an event happens (mimicking continuity).
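To see what the splitting does on a scale that can be checked by hand, here is a sketch on a made-up data set with three subjects. It uses the formula interface of `survSplit` from the survival package rather than the call shown on the slide, and relies on that interface naming the interval start `tstart` by default; the data values and the column name `dose` are invented for illustration.

```r
library(survival)

# Made-up data: three subjects, events at times 5 and 12.
toy <- data.frame(id     = 1:3,
                  time   = c(5, 8, 12),
                  status = c(1, 0, 1),
                  dose   = c(40, 60, 50))

# Cut every follow-up period at the observed event times.
cuts   <- sort(unique(toy$time[toy$status == 1]))
toy.cp <- survSplit(Surv(time, status) ~ ., data = toy, cut = cuts)

# Time-varying covariate: dose times log of the interval end point.
toy.cp$logtdose <- toy.cp$dose * log(toy.cp$time)
toy.cp
```

Each subject now contributes one line per interval (tstart, time], and the event indicator is 1 only on the line where that subject's own event occurs.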

181


Survival Analysis with Time-Dependent Covariates

    > head(addicts.cp)
      ID Clinic prison methodone ID2 start Survival Status
    1  1      1      0        50   1     0        7      0
    2  1      1      0        50   1     7       13      0
    3  1      1      0        50   1    13       17      0
    4  1      1      0        50   1    17       19      0
    5  1      1      0        50   1    19       26      0
    6  1      1      0        50   1    26       29      0

ID 1 is broken into 97 lines:

    > addicts.cp[96:98, ]
       ID Clinic prison methodone ID2 start Survival Status
    96  1      1      0        50   1   394      399      0
    97  1      1      0        50   1   399      428      1
    98  2      1      1        55   2     0        7      0

182


Survival Analysis with Time-Dependent Covariates
• Adding dose*log(time):

    addicts.cp$logtdose <- addicts.cp$methodone * log(addicts.cp$Survival)
    # removing intervals of length 0:
    addicts.cp <- addicts.cp[addicts.cp$start < addicts.cp$Survival, ]

• ID 114 has an event at day 35:

    addicts.cp[addicts.cp$ID == 114, c("ID", "start", "Survival", "Status",
                                       "methodone", "logtdose")]
           ID start Survival Status methodone  logtdose
    10515 114     0        7      0        40  77.83641
    10516 114     7       13      0        40 102.59797
    10517 114    13       17      0        40 113.32853
    10518 114    17       19      0        40 117.77756
    10519 114    19       26      0        40 130.32386
    10520 114    26       29      0        40 134.69183
    10521 114    29       30      0        40 136.04790
    10522 114    30       33      0        40 139.86030
    10523 114    33       35      1        40 142.21392

183


Survival Analysis with Time-Dependent Covariates
Analysis results:

    > my.analysis <- coxph(Surv(start, Survival, Status) ~
                             prison + methodone + Clinic + logtdose + cluster(ID),
                           data = addicts.cp)
    > summary(my.analysis)$coef
                      coef exp(coef)    se(coef)   robust se         z     Pr(>|z|)
    prison     0.340633209 1.4058375 0.167474080 0.159717275  2.132726 3.294720e-02
    methodone -0.082624866 0.9206965 0.035984407 0.029601316 -2.791257 5.250384e-03
    Clinic    -1.019875123 0.3606400 0.215415952 0.236365216 -4.314827 1.597276e-05
    logtdose   0.008615205 1.0086524 0.006454814 0.005248135  1.641575 1.006782e-01

• The methadone dose is significant, and the event risk increases if you have been to prison; there is also a difference between the clinics. But logtdose is not significant with methodone in the model.

184


Survival Analysis with Time-Dependent Covariates
• Relevant cut points for epidemiological studies:

• Time points where exposure changes

• This way, subjects may serve as their own controls

185


Concluding Remarks
• Survival analysis is a wide field of study; half a day only lets a glimmer of light out from the shining world of survival analysis.

• Go explore the areas relevant to you, on the basis of this brief introduction.

• Main references:
  – Kleinbaum & Klein: Survival Analysis. Springer 2012.
  – Andersen, Borgan, Gill and Keiding: Statistical Models Based on Counting Processes. Springer 1997.
  – Martinussen & Scheike: Dynamic Regression Models for Survival Data. Springer 2006.

186


Thank you for your attention

187
