case-crossover study

Analysis of Time-series Data: Case-crossover Study

Jinseob Kim

July 17, 2015

Jinseob Kim Analysis of Time-series Data July 17, 2015 1 / 30


Contents

1 Concepts
  Individual data
  Design

2 Conditional logistic regression
  Review: basic linear regression
  Logistic regression
  Conditional logistic regression

3 Practice
  Issues
  In R


Objective

1 Individual risk VS population risk

2 Concept of the case-crossover design

3 Caveats

4 Application: season package in R


Concepts Individual data

Two approaches to see the relationship between weather and health outcomes

Population based study

Y: # events (daily death counts or # hospital admissions)

X: temperature

Estimates population risk (% change in daily death counts corresponding to the change in temperature)

Individual based study

Y : 1 if an event occurs, 0 otherwise

X : temperature

Estimates individual risk (% change in the individual probability of an event, or the odds ratio, corresponding to the change in temperature)
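The population-based approach can be sketched with a Poisson regression of daily counts on temperature. This is a minimal illustration on simulated data, not the author's analysis: the variable names (`temp`, `deaths`) and the true coefficient 0.02 are assumptions made up for the example.

```r
set.seed(1)
n_days <- 365
temp   <- rnorm(n_days, mean = 20, sd = 5)               # simulated daily temperature
deaths <- rpois(n_days, lambda = exp(3 + 0.02 * temp))   # simulated daily death counts

# Population-based model: daily counts regressed on temperature
fit_pop <- glm(deaths ~ temp, family = poisson)

# Percent change in expected daily deaths per 1-unit rise in temperature
pct_change <- 100 * (exp(coef(fit_pop)["temp"]) - 1)
print(pct_change)
```

The individual-based approach replaces the count outcome with a 0/1 event indicator per person and reports an odds ratio instead.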


Data structure change

Aggregated format (Year, week, case):

(2006, 1, 20): one record holding 20 cases

Individual format (Year, week, event):

(2006, 1, 1), (2006, 1, 1), ..., (2006, 1, 1): 20 case records

(2005, 53, 0), ..., (2005, 53, 0), (2006, 2, 0), ..., (2006, 2, 0): control records
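The restructuring can be sketched in base R as follows. The choice of exactly one control record per case from the week before and one from the week after is an assumption for illustration (the slide only indicates that control records come from adjacent weeks), and the `id` column is a hypothetical helper.

```r
# Aggregated record from the slide: 20 cases in week 1 of 2006
agg <- data.frame(year = 2006, week = 1, case = 20)

# One row per case, with event = 1
cases <- agg[rep(seq_len(nrow(agg)), agg$case), c("year", "week")]
rownames(cases) <- NULL
cases$id    <- seq_len(nrow(cases))   # hypothetical person identifier
cases$event <- 1

# Assumed control periods: the week before and the week after, event = 0
before <- transform(cases, year = 2005, week = 53, event = 0)
after  <- transform(cases, week = 2, event = 0)

individual <- rbind(cases, before, after)
individual <- individual[order(individual$id, -individual$event), ]
head(individual)
```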


Concepts Design

Case + Crossover

Case: uses patients (cases) only.

Crossover: other time points of the same patient serve as the controls.


If average (air pollution) on control days < average (air pollution) on case days,

we conclude that the event is associated with higher values of air pollution.


Various choices of control days

Corrects bias due to time trends
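One way to choose control days is a time-stratified scheme. The sketch below is an assumption-laden illustration on simulated exposure data: the helper `referent_days()` is hypothetical, and it uses the calendar month and the day of the week as the stratum, which is only one of several possible schemes.

```r
set.seed(2)
dates     <- seq(as.Date("1987-01-01"), as.Date("1987-12-31"), by = "day")
pollution <- rnorm(length(dates), mean = 50, sd = 10)   # simulated exposure

# Hypothetical helper: control days share the case day's calendar month
# and day of the week, excluding the case day itself
referent_days <- function(case_date, all_dates) {
  same_month <- format(all_dates, "%Y-%m") == format(case_date, "%Y-%m")
  same_dow   <- format(all_dates, "%u") == format(case_date, "%u")
  all_dates[same_month & same_dow & all_dates != case_date]
}

case_date <- as.Date("1987-07-15")
controls  <- referent_days(case_date, dates)

# Compare exposure on the case day with the average over its control days
case_exposure    <- pollution[dates == case_date]
control_exposure <- mean(pollution[dates %in% controls])
print(controls)
```

Because control days fall both before and after the case day within a short window, a slow time trend in the exposure affects case and control days about equally.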


Conditional logistic regression Review Basic linear regression

Reminder

β estimation in linear regression

1 Ordinary Least Squares (OLS): semi-parametric

2 Maximum Likelihood Estimator (MLE): parametric


Least Squares

Minimizes the sum of squared residuals: no normality assumption on y is needed.

Figure: OLS Fitting
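As a small illustration of least squares, the sketch below solves the normal equations directly and checks the result against `lm()`. The data are simulated, and the errors are deliberately drawn from a t distribution to show that computing the estimate itself does not rely on normality.

```r
set.seed(3)
x <- runif(50, 0, 10)
y <- 1 + 2 * x + rt(50, df = 5)   # non-normal errors on purpose

# Solve the normal equations (X'X) beta = X'y
X <- cbind(1, x)
beta_ols <- solve(t(X) %*% X, t(X) %*% y)

# lm() minimizes the same sum of squared residuals
fit <- lm(y ~ x)
print(cbind(normal_equations = as.numeric(beta_ols), lm = coef(fit)))
```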


Likelihood??

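Likelihood VS probability

Discrete: likelihood = probability. Example: the probability of rolling a 1 with a die is 1/6.

Continuous: likelihood != probability. Example: when drawing one number between 0 and 1, the probability of getting exactly 0.7 is 0.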

Figure: Likelihood
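The distinction can be checked numerically in R. This is only a sketch of the two examples above, using a fair die and a Uniform(0, 1) draw.

```r
# Discrete case: the likelihood of a value equals its probability
p_one <- dbinom(1, size = 1, prob = 1 / 6)   # P(rolling a 1) for a fair die

# Continuous case: the density at a point is not a probability
dens_at_07 <- dunif(0.7, min = 0, max = 1)   # density at 0.7
prob_at_07 <- punif(0.7) - punif(0.7)        # probability of exactly 0.7

print(c(p_one = p_one, dens_at_07 = dens_at_07, prob_at_07 = prob_at_07))
```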


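Maximum likelihood estimator (MLE)

Maximum likelihood estimator: assume ε1, · · · , εn are mutually independent.

1 Compute the likelihood function of each observation.

2 Multiply all the likelihoods to get the likelihood of the whole data (because they are independent).

3 Find the β that maximizes the likelihood.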
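The three steps can be sketched for a linear regression with normal errors. This is a minimal illustration on simulated data; the starting values and the log-scale parameterization of the standard deviation are choices made for this example.

```r
set.seed(4)
x <- runif(100, 0, 10)
y <- 1 + 2 * x + rnorm(100)

# Steps 1 and 2: per-observation log-likelihoods, summed
# (a sum of logs is the log of the product)
negloglik <- function(par) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])                 # log scale keeps sigma positive
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Step 3: maximize the likelihood (minimize its negative log)
start <- c(mean(y), 0, log(sd(y)))
mle   <- optim(start, negloglik, method = "BFGS")

print(rbind(mle = mle$par[1:2], ols = coef(lm(y ~ x))))
```

With normal errors the maximum likelihood estimate of β coincides with the least squares estimate, so the two rows should agree up to optimizer tolerance.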


MLE: maximum likelihood estimator

Maximizes the likelihood of the observed data: requires a distributional assumption on y or ε.


Logistic function: MLE

Figure: Fitting Logistic Function


LRT? Wald? score?

Likelihood Ratio Test VS Wald test VS score test

1 Methods for judging statistical significance.

2 Comparing likelihoods VS comparing β values VS comparing slopes
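The three tests can be run side by side on a logistic regression. This is a sketch on simulated data, using `anova()` with `test = "LRT"` and `test = "Rao"` for the likelihood ratio and score tests and the coefficient table of `summary()` for the Wald test.

```r
set.seed(5)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(-0.5 + 0.8 * x))

fit0 <- glm(y ~ 1, family = binomial)   # null model
fit1 <- glm(y ~ x, family = binomial)   # model with x

lrt   <- anova(fit0, fit1, test = "LRT")     # compares the two likelihoods
score <- anova(fit0, fit1, test = "Rao")     # uses the slope of the log-likelihood at the null
wald  <- summary(fit1)$coefficients["x", ]   # compares the estimate with its standard error

p_lrt   <- lrt[2, "Pr(>Chi)"]
p_score <- score[2, "Pr(>Chi)"]
p_wald  <- unname(wald["Pr(>|z|)"])
print(c(LRT = p_lrt, score = p_score, Wald = p_wald))
```

The three p-values generally differ slightly in finite samples and agree more closely as the sample size grows.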


Comparison

Figure: Comparison


Conditional logistic regression Logistic regression

Model

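$$\log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 x_{i1}$$

$$p_i = P(Y_i = 1) = \frac{\exp(\beta_0 + \beta_1 x_{i1})}{1 + \exp(\beta_0 + \beta_1 x_{i1})}$$

$$P(Y_i = 0) = \frac{1}{1 + \exp(\beta_0 + \beta_1 x_{i1})}$$

$$P(Y_i = y_i) = \left(\frac{\exp(\beta_0 + \beta_1 x_{i1})}{1 + \exp(\beta_0 + \beta_1 x_{i1})}\right)^{y_i} \left(\frac{1}{1 + \exp(\beta_0 + \beta_1 x_{i1})}\right)^{1 - y_i}$$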


Likelihood

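$$\text{Likelihood} = \prod_{i=1}^{n} P(Y_i = y_i) = \prod_{i=1}^{n} \left(\frac{\exp(\beta_0 + \beta_1 x_{i1})}{1 + \exp(\beta_0 + \beta_1 x_{i1})}\right)^{y_i} \left(\frac{1}{1 + \exp(\beta_0 + \beta_1 x_{i1})}\right)^{1 - y_i}$$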

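Each individual contributes a likelihood (the probability of observing that individual's data).

Multiplying them all gives the Likelihood.

We find the β that maximizes it.

Cases and controls alike each contribute their own likelihood term separately.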
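The logistic likelihood can be maximized directly and compared with `glm()`. This is a sketch on simulated data; the log-likelihood used in the code, the sum over i of y_i(β0 + β1 x_i) - log(1 + exp(β0 + β1 x_i)), is the logarithm of the product of the individual probabilities P(Y_i = y_i).

```r
set.seed(6)
x <- rnorm(300)
y <- rbinom(300, 1, plogis(-1 + 0.5 * x))

# Negative log-likelihood: one term per individual, case or control alike
negloglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - log1p(exp(eta)))
}

opt <- optim(c(0, 0), negloglik, method = "BFGS")
fit <- glm(y ~ x, family = binomial)

print(rbind(optim = opt$par, glm = coef(fit)))
```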


Conditional logistic regression Conditional logistic regression

Conditional likelihood

Matched case-control set

A case and its controls (1:1 or 1:N) form one matched set!

Each set yields its own likelihood.

For each set, compute the likelihood of observing our data.

Multiplying over all sets gives the total Likelihood.
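A set-by-set likelihood can be sketched in base R. The example assumes the standard logistic form of the conditional likelihood, in which each set contributes exp(b * x_case) divided by the sum of exp(b * x_j) over all members of the set. The data are simulated, and the 1:3 matching and the true value 0.7 are arbitrary choices for illustration.

```r
set.seed(7)
n_sets <- 200; n_ctrl <- 3; beta_true <- 0.7
set_id <- rep(seq_len(n_sets), each = n_ctrl + 1)
x <- rnorm(length(set_id))

# Within each set, exactly one member is the case, chosen with
# probability proportional to exp(beta_true * x)
case <- unlist(lapply(split(x, set_id), function(xs) {
  idx <- sample(length(xs), 1, prob = exp(beta_true * xs))
  as.integer(seq_along(xs) == idx)
}), use.names = FALSE)

# Log of the product over sets = sum of per-set log-likelihoods
cond_loglik <- function(b) {
  num <- b * x[case == 1]
  den <- log(tapply(exp(b * x), set_id, sum))
  sum(num - den)
}

beta_hat <- optimize(cond_loglik, interval = c(-5, 5), maximum = TRUE)$maximum
print(beta_hat)
```

No intercept appears because anything shared by all members of a set cancels between the numerator and the denominator.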


Definition

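i-th stratum (1 ≤ i ≤ N): suppose it contains 1 case (call this person A) and $n_i$ controls.

Conditional likelihood of the i-th stratum:

$$L_i = P(\text{A is the case and the others are controls} \mid \text{1 case and } n_i \text{ controls})$$

Total likelihood:

$$\prod_{i=1}^{N} L_i$$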
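Under the logistic model this conditional probability has a well-known closed form. The notation below is an assumption added for illustration and does not appear in the slides: $x_{i0}$ is the exposure of the case in stratum $i$, $x_{i1}, \dots, x_{in_i}$ are the exposures of its controls, and $\alpha_i$ is a stratum-specific intercept.

```latex
% Each member j of stratum i has odds exp(alpha_i + beta' x_ij) of being a case.
% Conditioning on exactly one case in the stratum, alpha_i cancels:
L_i
  = \frac{\exp(\alpha_i + \beta^\top x_{i0})}
         {\sum_{j=0}^{n_i} \exp(\alpha_i + \beta^\top x_{ij})}
  = \frac{\exp(\beta^\top x_{i0})}
         {\sum_{j=0}^{n_i} \exp(\beta^\top x_{ij})}
```

Because $\alpha_i$ drops out, anything constant within a stratum (for example, fixed characteristics of the person in a case-crossover design) does not need to be modelled.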


Practice Issues

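Are the controls reliable?

7 days, 14 days, etc. before and after: are these really valid controls?

The lag from exposure to disease should be short.

The exposure should not accumulate.

Suited to acute diseases and transient effects of exposure (e.g., heat waves and death).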


Practice In R

season package

> library(season)
> data(CVDdaily)  # cardiovascular disease data
> CVDdaily = subset(CVDdaily, date <= as.Date('1987-12-31'))  # subset for example
> head(CVDdaily)
         date cvd      dow  tmpd   o3mean   o3tmean Mon Tue Wed Thu Fri Sat
3  1987-01-01  55 Thursday 54.50 -16.0073 -15.89619   0   0   0   1   0   0
5  1987-01-02  73   Friday 58.50 -11.6595 -11.19102   0   0   0   0   1   0
9  1987-01-03  64 Saturday 55.25 -10.3241 -10.51787   0   0   0   0   0   1
12 1987-01-04  57   Sunday 54.75 -18.6471 -18.27014   0   0   0   0   0   0
15 1987-01-05  56   Monday 54.50 -17.5291 -17.13201   1   0   0   0   0   0
18 1987-01-06  65  Tuesday 49.75 -22.7846 -22.74711   0   1   0   0   0   0
   month winter spring summer autumn
3      1      1      0      0      0
5      1      1      0      0      0
9      1      1      0      0      0
12     1      1      0      0      0
15     1      1      0      0      0
18     1      1      0      0      0


casecross()

> # Effect of ozone on CVD death
> model1 = casecross(cvd ~ o3mean + tmpd + Mon + Tue + Wed + Thu + Fri + Sat, data = CVDdaily)
> # match on day of the week
> model2 = casecross(cvd ~ o3mean + tmpd, matchdow = TRUE, data = CVDdaily)
> # match on temperature to within a degree
> model3 = casecross(cvd ~ o3mean + Mon + Tue + Wed + Thu + Fri + Sat, data = CVDdaily, matchconf = 'tmpd', confrange = 1)


casecross(formula = cvd ~ o3mean + tmpd + Mon + Tue + Wed + Thu +
    Fri + Sat, data = CVDdaily, exclusion = 2, stratalength = 28,
    matchdow = FALSE, usefinalwindow = FALSE, matchconf = "",
    confrange = 0, stratamonth = FALSE)

Time-stratified case-crossover with a stratum length of 28 days
Total number of cases 17502
Number of case days with available control days 364
Average number of control days per case day 23.2

Parameter Estimates:
               coef exp(coef)    se(coef)           z   Pr(>|z|)
o3mean -0.002882613 0.9971215 0.001128975 -2.55330077 0.01067073
tmpd    0.001461400 1.0014625 0.001981047  0.73769030 0.46070267
Mon     0.042733425 1.0436596 0.028942815  1.47647783 0.13981566
Tue     0.057910712 1.0596204 0.028772745  2.01269332 0.04414690
Wed    -0.010008025 0.9900419 0.029171937 -0.34307029 0.73154558
Thu    -0.016790296 0.9833499 0.029455877 -0.57001513 0.56866744
Fri     0.027247952 1.0276226 0.029173235  0.93400517 0.35030123
Sat     0.001855841 1.0018576 0.028900116  0.06421568 0.94879849
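The columns of this table are related by simple arithmetic, which the sketch below reproduces for the o3mean row. The rescaling to a 10-unit increase and the 95% Wald confidence interval are additions for illustration and are not part of the printed output.

```r
# Values copied from the o3mean row of the output above
coef_o3 <- -0.002882613
se_o3   <-  0.001128975

or_per_unit <- exp(coef_o3)       # the exp(coef) column: odds ratio per 1-unit increase
z_o3        <- coef_o3 / se_o3    # the z column
p_o3        <- 2 * pnorm(-abs(z_o3))   # the Pr(>|z|) column (two-sided Wald test)

# Illustrative rescaling: odds ratio and 95% Wald interval per 10-unit increase
or_per_10 <- exp(10 * coef_o3)
ci_per_10 <- exp(10 * (coef_o3 + c(-1, 1) * qnorm(0.975) * se_o3))

print(c(or_per_unit = or_per_unit, z = z_o3, p = p_o3))
print(c(or_per_10 = or_per_10, lower = ci_per_10[1], upper = ci_per_10[2]))
```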


casecross(formula = cvd ~ o3mean + tmpd, data = CVDdaily, matchdow = TRUE,
    exclusion = 2, stratalength = 28, usefinalwindow = FALSE,
    matchconf = "", confrange = 0, stratamonth = FALSE)

Time-stratified case-crossover with a stratum length of 28 days
Matched on day of the week
Total number of cases 17502
Number of case days with available control days 364
Average number of control days per case day 3

Parameter Estimates:
                coef exp(coef)    se(coef)          z    Pr(>|z|)
o3mean -0.0030752572 0.9969295 0.001188540 -2.5874238 0.009669658
tmpd   -0.0004095116 0.9995906 0.002131744 -0.1921017 0.847662557


casecross(formula = cvd ~ o3mean + Mon + Tue + Wed + Thu + Fri +
    Sat, data = CVDdaily, matchconf = "tmpd", confrange = 1,
    exclusion = 2, stratalength = 28, matchdow = FALSE, usefinalwindow = FALSE,
    stratamonth = FALSE)

Time-stratified case-crossover with a stratum length of 28 days
Matched on tmpd plus/minus 1
Total number of cases 15180
Number of case days with available control days 318
Average number of control days per case day 4.9

Parameter Estimates:
               coef exp(coef)   se(coef)          z     Pr(>|z|)
o3mean -0.003238583 0.9967667 0.00131839 -2.4564691 1.403099e-02
Mon     0.182058170 1.1996840 0.03577818  5.0885255 3.608582e-07
Tue     0.144181049 1.1550932 0.03563272  4.0463108 5.203115e-05
Wed     0.099443480 1.1045560 0.03554924  2.7973451 5.152447e-03
Thu     0.088518237 1.0925542 0.03459482  2.5587140 1.050601e-02
Fri     0.108107305 1.1141673 0.03437323  3.1451022 1.660288e-03
Sat     0.023660066 1.0239422 0.03525152  0.6711786 5.021068e-01


END

Email : [email protected]
