Issues in case-control studies

Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine

Kwang Hyuck Lee

Presenter’s Name

Date

Issues in case-control studies

Eliseo Guallar, MD

Juhee Cho, M.A.

Case-control study – historical synonyms

• Retrospective study
• Trohoc study
• Case comparison study
• Case compeer study
• Case history study
• Case referent study

Case Control Study

                 Disease: Yes (Case)   Disease: No (Control)
Exposed: Yes     A1                    B1
Exposed: No      A0                    B0

$OR = \frac{A_1 B_0}{A_0 B_1}$ (cross-product ratio)

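Factors associated with early detection of biliary stricture in patients with elevated liver enzymes after living donor liver transplantation

오초롱, 이광혁, 이종균, 이규택, 권준혁*, 조재원*, 조주희**
Sungkyunkwan University School of Medicine; Samsung Medical Center, Division of Gastroenterology, Transplant Surgery*, Cancer Education Center**

Study objective

• Biliary complications occur after living donor liver transplantation (LDLT). The success rate of endoscopic treatment, the best available treatment, is around 50%.
• The success rate is higher when biliary complications are detected early and endoscopic drainage is performed.
• We sought to identify factors that predict biliary complications among patients who show abnormal liver function after LDLT.

Subjects and methods

Period and subjects
• Patients who underwent LDLT between January 2006 and December 2008
• Patients whose liver function worsened again after recovering postoperatively
• Only patients with duct-to-duct anastomosis were included (hepaticojejunostomy patients were excluded)

Items reviewed
• Underlying disease, symptoms
• Liver function tests
• Operative records
• Imaging studies

Analysis groups

Patients whose liver enzymes rose again after LDLT were divided into groups
(elevation criteria: AST>80, ALT>80, ALP>250 or bilirubin>2.2).

• Group A: patients who needed ERCP vs. patients who did not need ERCP
• Group B: patients with anastomotic biliary stricture vs. patients with rejection
• Group C: among patients without stricture findings on CT, patients who needed ERCP vs. patients who did not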

LDLT patients during 3 years: n=213

Patients with LFT elevation: n=120

• Need ERCP: n=74 (stricture 58, leakage 13, stone 3)
• Not need ERCP: n=46 (rejection 23, infection 7, HCC 5, viral reactivation 3, vessel stenosis 3, etc. 5)

Analysis group A: need ERCP vs. not need ERCP
Analysis group B: stricture vs. rejection
Analysis group C: CT(-) need ERCP: 32; CT(-) not need ERCP: 40

Case-Control Study or not?


Brock MV, et al. N Engl J Med 2008;358:900-913


Conducting case-control studies

• Case and control selection
• Exposure measurement
• Odds ratio

Research

New question ?? → Method

• Clinical study
• Translational study
• Laboratory study

Clinical study
• Observational studies: case-control study vs. cohort study
• Randomized controlled trial

Why case-control studies?

• New question of interest
• Cohort study with the appropriate outcome or exposure ascertainment does NOT exist
• Need to initiate a new study
• Do you have the time and/or resources to establish and follow a new cohort?

Case control study ??

High cholesterol → Myocardial infarction

• Cases: MI (+); controls: MI (-)
• Measurement: cholesterol level
• Result: negative or positive

Impetus for case-control studies : EFFICIENCY

May not have the sufficient duration of time to see the development of diseases with long latency periods.

May not have the sufficiently large cohort to observe outcomes of low incidence.

NOTE: Rare outcomes are not necessary for a case-control study, but are often the driver.

Efficiency of case-control study

Do maternal exposures to estrogens around the time of conception cause an increase in congenital heart defects?

Assume RR = 2, 2-sided α = 0.05, 90% power

• Cohort study: if I0 = 8/1000 and I1 = 16/1000, would need 3889 exposed and 3889 unexposed mothers
• Case-control study: if ~30% of women are exposed to estrogens around the time of conception, would need 188 cases and 188 controls

Schlesselman, p. 17
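The sample sizes above can be roughly reproduced with a standard two-proportion sample-size formula. This is a sketch under assumptions: the slide does not show the formula it used, and the function names `n_per_group` and `exposure_in_cases` are illustrative. With exact normal quantiles the case-control figure comes out at 188, and the cohort figure lands within a few subjects of 3889 (the small gap is presumably rounding of the quantiles).

```python
import math
from statistics import NormalDist


def n_per_group(p0, p1, alpha=0.05, power=0.90):
    """Subjects per group needed to compare two proportions p0 and p1
    (two-sided alpha, equal group sizes, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return numerator / (p1 - p0) ** 2


def exposure_in_cases(p0, odds_ratio):
    """Exposure prevalence among cases implied by the exposure prevalence
    among controls (p0) and the odds ratio."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))


# Cohort design: compare incidence in unexposed (8/1000) vs exposed (16/1000).
cohort_n = math.ceil(n_per_group(0.008, 0.016))

# Case-control design: compare exposure prevalence in controls (30%) vs cases,
# treating the odds ratio as 2 (the outcome is rare, so OR is close to RR).
case_control_n = math.ceil(n_per_group(0.30, exposure_in_cases(0.30, 2.0)))

print("cohort, per group:", cohort_n)
print("case-control, per group:", case_control_n)
```

The contrast is the point of the slide: the cohort design compares two very small proportions, so the denominator (p1 - p0)² is tiny, while the case-control design compares exposure prevalences of roughly 30% and 46%.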


Strengths of case-control study

• Efficient – typically:
  • Shorter period of time
  • Not as many individuals needed
  • Cases are selected, thus particularly good for rare diseases
• Informative – may assess multiple exposures and thus hypothesized causal mechanisms

Learning objectives

• Exposure
• Selection of cases and controls
• Bias: selection, recall, interviewer, information
• Odds ratios
• Matching
• Nested studies
• Conducting a case-control study

DCR Chapter 8

Exposure ascertainment – examples

Active methods
• Questionnaire (self- or interviewer-administered)
• Biomarkers

Passive methods
• Medical records
• Insurance records
• Employment records
• School records

Exposure ascertainment issues

• Establish biologically relevant period
  • Measurement occurs once at current time
  • Repeated exposure
  • Previous exposure
• Measure of exposure occurs after outcome has developed
  • Possibility of information bias
  • Possibility of reverse causation (outcome influences the measure of exposure)

Is it possible in case-control study? – relevant period

Yesterday's smoking and radiation → Cancer risk

Information bias: recall bias

Mothers of babies born with congenital malformations are more likely to recall (accurately or “over-recall”) events during pregnancy such as illnesses, diet, etc.

Possibility of reverse causation

High cholesterol → Myocardial infarction

• Cases: MI (+); controls: MI (-)
• Measurement: cholesterol level
• Result: ?
• MI → cholesterol level decreases
• Cholesterol is measured after MI

Case selection – basic tenets

• Eligibility criteria: characteristics of the target and source population
• Diagnostic criteria: definition of a case; misclassification
• Feasibility

Source populations – samples

• Health providers: clinics, hospitals, insurers
• Occupations: work place, unions
• Surveillance/screening programs
• Laboratories, pathology records
• Birth records
• Existing cohorts
• Special interest groups: disease foundations or organizations

Incident versus prevalent cases

Incident cases: All new cases of disease (that become diagnosed) in a certain period

Prevalent cases: All current cases regardless of when the case was diagnosed

Incident Vs Prevalence

Do the cases represent all incident cases in the target population?

Exposure–disease association Vs Exposure–survival association


Prevalent cases

• Disease with A only (causal factor): 1-month survival
• A + B (protective factor): 1-year survival
• A + C (protective factor): 10-year survival

• Patient A: A1, 1 month
• Patient B: A1 + B, 1 year
• Patient C: A1 + C, 10 years

Prevalent cases: A1, B, and C all appear as causes → intervention on B or C → ↓↓ survival

Disease severity

Which stage is chosen for a case?

• Early stage only: progression does not always occur
• Late stage only: influence of severity
• Increase sample size for stratification

Early stage only

Cases were selected from prevalent cases of thyroid cancer

• Case: small thyroid cancer
• Control: normal population
• The differences between them were determined

What is the clinical meaning of this study if there is no difference in survival between them?

Late stage only – difficult diagnosis

Pancreatic cancer vs. weight

• Cases: late-stage pancreatic cancer
• Low weight is due to cancer progression
• Conclusion: low weight → pancreatic cancer
• Increase sample size for stratification

Selection bias

Selection of cases independent of exposure status

Related to severity

Related to hospitalization or visiting


Example selection bias (1)

Hypothesis: common cold → asthma

Setting: patients in hospital

Truth
• Common cold is an aggravating factor, not a causal factor
• No difference in the incidence of asthma according to common cold
• Common cold (+) → aggravation → hospital visit
• Common cold (-) → no symptoms → no visit

Example selection bias (2)

           Total     Common cold in society   Patients in hospital   Common cold in hospital
Asthma     1000      10                       50                     10
General    200000    2000                     1000                   20 (10 + alpha)

                 Cause positive   Cause negative
Case (asthma)    10               40
Control          1                49

Odds ratio = (10 × 49) / (40 × 1) = (1 × 49) / (4 × 1) = 12.25
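The arithmetic of this example can be checked with a few lines of code. This is a minimal sketch using only the counts on the slide; the helper name `odds_ratio` and its argument order are illustrative choices.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# In society: 10 of 1000 asthma patients and 2000 of 200000 people overall
# have a common cold (1% in both groups), so there is no association.
or_society = odds_ratio(10, 1000 - 10, 2000, 200000 - 2000)

# In hospital: 10 of 50 asthma patients but only 20 of 1000 patients overall
# have a common cold, because a cold aggravates asthma and triggers the visit.
or_hospital = odds_ratio(10, 50 - 10, 20, 1000 - 20)

# The slide's study sample: 50 cases (10 with a cold), 50 controls (1 with a cold).
or_study_sample = odds_ratio(10, 40, 1, 49)

print(or_society, or_hospital, or_study_sample)
```

The association in the hospital-based sample comes entirely from how patients reach the hospital, not from any effect of the common cold on asthma incidence.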


Case and Control selection

Same distribution of risk factors ??

Guallar E, et al. N Engl J Med 2002;347:1747-54

Selection of controls – basic tenets

• Same target population as cases
• Confirmation of lack of outcome/disease
• Selection needs to be independent of exposure

Controls in case-control studies

• Should have the same proportion of exposed to non-exposed persons as the underlying cohort (source population)
• Should answer yes to: if they had developed the disease of interest during the study period, would they have been included as a case?

Selecting controls – Same as case source

Characteristics
1. Convenient
2. Most likely same target population
3. Rule out outcome – avoids misclassification
4. Similar factors leading to inclusion into source population
5. Sometimes impractical

Examples
• Breast cancer screening program
  • Confirmed breast cancer – cases
  • No breast cancer – controls
• Same hospital as case series
  • Similar referral pattern – examine by illness types
• Pediatric clinics
• Geographic population
• Other special populations (e.g., occupational setting)

Source for controls

• Geographic population
  • Roster needed
  • Probability sampling
• Neighborhood controls
  • Random sample of the neighborhood
• Friends and family members
• Hospital-based control

Selection of controls: Friends or family members

Friends or family members
• Ask each case for a list of possible friends who meet eligibility criteria
• Randomly select among the list
• Type of matching – will be addressed later

Concerns:
• May inadvertently select on exposure status, that is, friends because of engaging in similar activities or having similar characteristics/culture/tastes
• “Over-matching”

Am J Epidemiol 2004;159:915-21

Selection of controls: Hospital or clinic-based

Strengths
• Ease and accessibility
• Avoid recall bias

Concerns
• Selection bias: exposure related to the hospitalization
• A mixture of the best defensible controls
• Referral pattern: same or not?

Diet pattern: Colon cancer

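• The study was conducted at a GI cancer referral center
• Case: colon cancer (+) patients in the GI clinic
• Control: colon cancer (-) patients in the pulmonology clinic
  • GI clinic: the waiting room displays dietary information related to GI cancers
  • Pulmonology clinic
• Differences between the two groups may reflect differences between the clinics rather than differences in disease
• Control: gastric cancer (+) patients in the GI clinic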

Guallar E, et al. N Engl J Med 2002;347:1747-54

Weakness of Case-Control Studies

• Time period from which the cases arose
  • Survival factor, reverse causation
  • Biologically relevant period
• Only one outcome measured
• Susceptibility to bias
  • Separate sampling of the cases and controls
  • Retrospective measurement of the predictor variables


Selection of cases

Case selection in hospitals
• Alcohol → hip fractures: all visit hospitals
• IUD → abortion
  • 1st abortion: some visit but others do not
  • Women with an IUD in the general population visit clinics more frequently

[Figure: 2×2 tables of exposure (exposed / non-exposed) by disease (disease / no disease); target population cells A, B, C, D and study sample cells a, b, c, d]

1st abortion: 3% rate and no relation to IUD

IUD: frequent visits

Population            IUD status   N or share visiting   No abortion / abortion
General population    IUD (+)      1000                  970 / 30
General population    IUD (-)      9000                  8730 / 270
Hospital population   IUD (+)      90%                   873 / 27
Hospital population   IUD (-)      45%                   4050 / 120

General population:

IUD      Case   Control
Yes      10     10
No       90     90
Total    100    100

Hospital population:

IUD      Case   Control
Yes      18
No       82
Total    100

• Control: general population → difference due to frequent visits
• Control: hospital population → theoretically the same, unless this control group has higher abortion rates due to other problems
• Control mixture: both
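A short calculation makes the three control choices concrete. It is a sketch built only from the rates on the slide (3% abortion rate in both groups, 90% vs. 45% clinic attendance); the variable names are illustrative. Exact products are used, so the IUD (-) hospital counts come out as 3928.5 / 121.5 rather than the rounded 4050 / 120 shown on the slide.

```python
ABORTION_RATE = 0.03                       # same for IUD users and non-users
POPULATION = {"iud": 1000, "no_iud": 9000}
VISIT_SHARE = {"iud": 0.90, "no_iud": 0.45}


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Cases (abortion) and non-cases in the general population.
cases_general = {k: n * ABORTION_RATE for k, n in POPULATION.items()}
controls_general = {k: n * (1 - ABORTION_RATE) for k, n in POPULATION.items()}

# Only those who visit the clinic are seen in the hospital population.
cases_hospital = {k: cases_general[k] * VISIT_SHARE[k] for k in POPULATION}
controls_hospital = {k: controls_general[k] * VISIT_SHARE[k] for k in POPULATION}

# Share of IUD users among hospital cases (about 18%, as on the slide).
iud_share_hospital_cases = cases_hospital["iud"] / sum(cases_hospital.values())

or_truth = odds_ratio(cases_general["iud"], cases_general["no_iud"],
                      controls_general["iud"], controls_general["no_iud"])
or_general_controls = odds_ratio(cases_hospital["iud"], cases_hospital["no_iud"],
                                 controls_general["iud"], controls_general["no_iud"])
or_hospital_controls = odds_ratio(cases_hospital["iud"], cases_hospital["no_iud"],
                                  controls_hospital["iud"], controls_hospital["no_iud"])

print(round(iud_share_hospital_cases, 3))
print(round(or_truth, 2), round(or_general_controls, 2), round(or_hospital_controls, 2))
```

Hospital cases compared with general-population controls give an odds ratio of about 2 even though the IUD has no effect, while hospital controls, selected through the same visiting pattern, bring it back to 1.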


Actual situation

Limited cases

Selection bias from control selection


Nomura A, et al. N Engl J Med 1991;325:1132-6

Selection bias in nested case-control study

Controls were excluded if they had had gastrectomy or history of peptic ulcer disease

Controls with a cardiovascular disease or cancer at baseline or during follow-up were excluded

[Figure: 2×2 tables of exposure (exposed / non-exposed) by disease (disease / no disease); target population cells A, B, C, D and study sample cells a, b, c, d]

MacMahon B, et al. N Engl J Med 1981;304:630-3

Selection bias in case-control study

Controls were largely patients with diseases of the gastrointestinal tract

Control patients may have reduced their coffee intake as a consequence of GI symptoms

[Figure: 2×2 tables of exposure (exposed / non-exposed) by disease (disease / no disease); target population cells A, B, C, D and study sample cells a, b, c, d]

Antunes CMF, et al. N Engl J Med 1979;300:9-13


Non-GY control: 6.0
GY control: 2.1

Criticisms of prior case-control studies

Diagnostic surveillance bias
• Women on estrogens are evaluated more intensively – they are more likely to be diagnosed and to be diagnosed at earlier stages
• Women with asymptomatic cancer who receive estrogens are more likely to bleed and to be diagnosed

To avoid selection bias in case-control studies

Selection of cases
• Types of cases selected (non-fatal, symptomatic, advanced)
• Response rates among cases
• Relation of selection to exposure – are exposed cases more (or less) likely to be included in the study?

Selection of controls
• Type of controls (general population, hospital, friends and relatives)
• For hospital controls, diseases selected as control conditions
• Response rate among controls
• Relation of selection to exposure – are exposed controls more (or less) likely to be included in the study?

Similar response rates in cases and controls do NOT rule out selection bias

Recall issues

All information in case-control studies is historic, so if relying on reporting by participants, accuracy depends on recall

Concerns:
• Do cases recall prior events differently from controls?
• Mindset of someone with disease: Is there something that I did that may have caused the disease?

Recall bias (information bias)

Recall bias – example

Mothers of babies born with congenital malformations are more likely to recall (accurately or “over-recall”) events during pregnancy such as illnesses, diet, etc.

Folic acid and neural tube defects

Figure 1: Features of neural tube development and neural tube defects. Botto et al. Neural tube defects. NEJM 1999. (28th day after fertilization)

Background and Aim
• A reduced recurrence risk of neural tube defects (NTDs) among women receiving multivitamin supplements containing folic acid.
• Most NTDs are de novo; less than 10% of NTDs are recurrent.
• First occurrence of NTDs only and periconceptional folate supplements

Study population

• Case: NTDs
• Control: other major malformations (because of recall bias)
• Subjects with oral clefts were excluded because vitamin supplementation has been hypothesized to reduce the risk: selection bias

Pregnant women: Target → Source → Study

Overall data

Folate (+) OR = 0.6 (0.4 – 0.8)

Recall Bias: Previous knowledge

Recall bias quantification (1000 cases, 1000 controls; control recall rate 75%)

Scenario       Exposed cases   Exposed controls   Case/control ratio   Case recall rate   OR in this study
Real           500             800                0.625
All            400             600                0.667                80%                0.6
Prev known     450             600                0.750                90%                0.8
Prev unknown   375             600                0.625                75%                0.4
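The recall-rate arithmetic can be sketched as follows. Two assumptions are made here that the slide does not state: recall errors only remove true exposures (no one falsely reports folate use), and each group has 1000 subjects. The function names are illustrative. The code reports both the simple ratio of exposed cases to exposed controls and the cross-product odds ratio for each scenario.

```python
N_CASES = 1000
N_CONTROLS = 1000
TRUE_EXPOSED_CASES = 500
TRUE_EXPOSED_CONTROLS = 800
CONTROL_RECALL = 0.75


def odds_ratio(exposed_cases, exposed_controls):
    """Cross-product odds ratio given the number reporting exposure in each group."""
    unexposed_cases = N_CASES - exposed_cases
    unexposed_controls = N_CONTROLS - exposed_controls
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def scenario(case_recall, control_recall=CONTROL_RECALL):
    """Return (reported exposed cases, reported exposed controls, ratio, odds ratio)."""
    cases = TRUE_EXPOSED_CASES * case_recall
    controls = TRUE_EXPOSED_CONTROLS * control_recall
    return cases, controls, cases / controls, odds_ratio(cases, controls)


true_or = odds_ratio(TRUE_EXPOSED_CASES, TRUE_EXPOSED_CONTROLS)
results = {
    "all": scenario(0.80),
    "prev known": scenario(0.90),
    "prev unknown": scenario(0.75),
}

print("true OR:", round(true_or, 3))
for label, (cases, controls, ratio, or_) in results.items():
    print(label, round(cases), round(controls), round(ratio, 3), round(or_, 3))
```

Under these assumptions, the better cases recall relative to controls, the closer the observed protective association moves toward 1, which is the direction seen on the slide between mothers with and without previous knowledge.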


Recall bias – assessment / avoidance

• Check with recorded information, if possible
• Use objective markers or surrogates for exposure – careful of markers that are affected by disease
• Ask participant to identify which factor(s) are important for disease
• Build in false risk factor to test for over-reporting
• Use controls with another disease


Selection bias

• If oral clefts were included in the control group, the number of controls with exposure (lack of vitamin supplement or folate intake) would increase.
• As B increases, the probability of rejecting the null hypothesis decreases.

                Case   Control
Exposure (+)    A      B
Exposure (-)    C      D

Exposure: lack of folate intake

Cleft = ↓ intake of vitamin

Methods

Periconceptional folic acid exposure was determined by interview with study nurses:
• Demographic
• Health behavior factors
• Reproductive history
• Family history of birth defects
• Occupation
• Illnesses (chronic and during pregnancy)
• Use of alcohol, cigarettes and medications
• Vitamin use during the 6 months before the last menstrual period (LMP) through the end of pregnancy
• Semi-quantitative food frequency questionnaire
• Knowledge of vitamins and birth defects

Confounding

• Exposure: ↓ folate intake
• Outcome: ↑ NTDs
• Confounder: alcohol

Interviewer bias

Differential interviewing of cases and controls, i.e., may probe or interpret responses differently

Interviewer Bias

(Information Bias)


Interviewer bias – avoidance / assessment

• Self-administered instruments (prone to more non-response)
• Standardized instruments
• Computerized instruments (CADI, ACASI)
• Avoid open-ended questions but rather use questions with each possible response elicited
• Training
• Masking interviewers to research question
• Masking interviewers to case/control status
• Same interviewers for cases and controls

Odds ratio

                 Disease: Yes   Disease: No
Exposed: Yes     A1             B1
Exposed: No      A0             B0

$OR = \frac{A_1 B_0}{A_0 B_1}$ (cross-product ratio)

Example: CHD and Diabetes

                 CHD: Yes   CHD: No
Diabetes: Yes    183        65
Diabetes: No     575        735

$OR_{CHD} = \frac{183 / 65}{575 / 735} = 3.60$

No units!
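The odds ratio for this table can be computed directly, and an approximate 95% confidence interval added with Woolf's log-based method. The interval is not on the slide; it is included here only as an illustration, and the function name `odds_ratio_with_ci` is hypothetical. The cross-product of the four counts evaluates to about 3.60.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Diabetes (exposure) and CHD (disease) counts from the slide.
or_chd, ci_low, ci_high = odds_ratio_with_ci(183, 65, 575, 735)
print(f"OR = {or_chd:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Because the interval is built on the log scale, it is symmetric around the estimate only in multiplicative terms, which fits the slide's point that odds ratios live on a multiplicative scale.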


Some properties of odds ratios

• Null value: OR = 1
• OR ≥ 0 (cannot be negative)
• Multiplicative scale (be careful with plots)
• Use logistic regression to estimate multivariate adjusted odds ratios in case-control studies

Odds ratios and the “rare disease assumption”

• With incidence density sampling (represents underlying cohort at time of case) and sampling of cases and controls independent of exposure: OR ≈ IR
• With outcomes of very low incidence in the underlying cohort and sampling of cases and controls independent of exposure: OR ≈ RR
• Higher incidence increases the bias away from the null
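A small calculation shows how the odds ratio drifts away from the risk ratio as the outcome becomes more common. This is an illustrative sketch: the risk ratio of 2 and the baseline risks are arbitrary values chosen for the example, and the function name is hypothetical.

```python
def odds_ratio_from_risks(risk_unexposed, risk_ratio):
    """Odds ratio implied by a baseline risk and a risk ratio."""
    risk_exposed = risk_unexposed * risk_ratio
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


RISK_RATIO = 2.0
baseline_risks = [0.001, 0.01, 0.1, 0.3]
odds_ratios = [odds_ratio_from_risks(r, RISK_RATIO) for r in baseline_risks]

for risk, or_ in zip(baseline_risks, odds_ratios):
    print(f"baseline risk {risk:.3f}: RR = {RISK_RATIO:.1f}, OR = {or_:.3f}")
```

At a baseline risk of 0.1% the odds ratio is essentially equal to the risk ratio, while at 30% it is 3.5 for the same risk ratio of 2, further from the null.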


Matching
• Individual matching
• Frequency matching
• Stratified matching

Nested study
• Case-control study
• Case-cohort study

Siegel DS, et al. Blood 1999;93:51-4

Matching in cohort study – example


Matching in case-control studies – individual matching

• Pairing or grouping controls to case by known risk factors in the design phase, i.e., when selecting controls
• In protocol, define matching characteristics and their “boundaries”
  • Dichotomous or categorical: self-explanatory (e.g., sex, race, blood type, disease stage)
  • Continuous: can be exact, or typically a window (e.g., age ± 5 years, CD4 cell count ± 50 cells)
• For each recruited case, search in control source population for the person(s) who meet the matching criteria
• Select 1 or more of them at random

Odds ratio – matched pairs

Case   Control   # pairs
A1     B1        n11
A1     B0        n10
A0     B1        n01
A0     B0        n00

N = total # pairs

N pairs = N cases and N controls → 2N people
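For 1:1 matched pairs, the standard estimator of the odds ratio uses only the discordant pairs: OR = n10 / n01. The sketch below applies it with made-up pair counts (the slide gives no numbers) and adds an approximate confidence interval on the log scale; the function name is illustrative.

```python
import math


def matched_pairs_odds_ratio(n10, n01, z=1.96):
    """Matched-pairs odds ratio from discordant pairs.

    n10: pairs with exposed case and unexposed control
    n01: pairs with unexposed case and exposed control
    Concordant pairs (n11, n00) do not contribute to the estimate.
    """
    estimate = n10 / n01
    se_log = math.sqrt(1 / n10 + 1 / n01)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only: 200 pairs in total.
n11, n10, n01, n00 = 20, 40, 20, 120
or_matched, ci_low, ci_high = matched_pairs_odds_ratio(n10, n01)
print(f"{n11 + n10 + n01 + n00} pairs, matched OR = {or_matched:.2f} "
      f"(95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Only 60 of the 200 hypothetical pairs carry information about the association, which is why matching on a factor tied to exposure can cost statistical efficiency.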


Frequency matching

• Select cases
• Examine distribution of potential confounder (matching variable)
• Select controls so that they have the same distribution of the potential confounder
• Conduct stratified analyses or regression to control for the induced selection bias

Stratified sampling – alternative to matching

• Decide up front what distribution of cases and controls according to confounder is desired
• Select cases and controls so that expectations are met
• Selection of controls does not depend on cases being selected first
• Note that distribution of confounder is not the distribution one may see among all cases in the population

Stratified sampling – example

• Want 50% females in 100 cases and controls
  • 50 female cases and 50 male cases
  • 50 female controls and 50 male controls
• In the study period, 175 incident male cases and 75 incident female cases occur
• As they occur, enroll cases until 50 are recruited in each stratum
• Throughout study period, enroll 50 male and 50 female controls

Matching – limitations

• Cannot examine the independent effect of the matched variable on the outcome
  • Cases and controls are balanced for the matched factor
• May be costly to perform
• May inadvertently match
  • On the exposure itself or its surrogate
  • On a factor in the causal pathway
  • On a factor that is affected by the outcome
• Matching on an exposure-related factor that is not a disease determinant may reduce the statistical efficiency (matched cases and controls with the same exposure are not used in matched analysis)
• Logistical complexity of matching

Matching – strengths

• Costs of finding a matched control may be
  • < costs of performing tests to assess confounding
  • < costs of recruiting additional controls to yield enough persons across the entire range of the confounding variable
• Particularly useful when the distribution of confounders is very different in cases and controls
• Increases amount of information per subject
• Matching yields the same ratio of cases and controls according to the distribution of the matched variable

Nested studies

• In an existing cohort study
  • New questions arise
  • Need efficient method to use existing information
• Do not want to conduct methods on entire cohort, due to limited resources
• Nest a study without sacrificing validity and too much precision
• Some nesting options:
  • Case-cohort (sub-cohort)
  • Case-control

Nested Case-Control and Case-Cohort Studies

Case-comparison studies
• Use all cases or a representative subset as of date of analysis
• Comparison group: cohort members for all nested designs

Study design    Comparison
Case-control    Event-free member at time of case’s event (incidence density sampling)
Case-cohort     Members of subcohort, selected at random from cohort at time of enrollment, at risk at time of case’s event = in the subcohort risk set

Full Cohort

Successive event times (time axis marked at 10, 20, 30, 35; persons S1 to S8):

Event time   Events (A)    At risk (N)
1st          1: S1         8: S1, S2, S3, S4, S5, S6, S7, S8
2nd          1: S6         6: S3, S4, S5, S6, S7, S8
3rd          2: S3, S8     4: S3, S4, S7, S8

Case-cohort study


Nested case-control study

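Successive event times (time axis marked at 10, 20, 30, 35; persons S1 to S8):

Event time   Events (A)    At risk (N)                           Potential controls
1st          1: S1         8: S1, S2, S3, S4, S5, S6, S7, S8     S2, S3, S4, S5, S6, S7, S8
2nd          1: S6         6: S3, S4, S5, S6, S7, S8             S3, S4, S5, S7, S8
3rd          2: S3, S8     4: S3, S4, S7, S8                     S4, S7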
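The risk sets and potential controls on this slide can be reproduced with a short script. The follow-up times below are assumptions: the slide only marks the time axis at 10, 20, 30 and 35, so events are placed at 10, 20 and 30 and censoring times for S2, S4, S5 and S7 are hypothetical values chosen to match the at-risk counts. One control per case is then drawn at random from each risk set, which is incidence density sampling.

```python
import random

# (exit time, had event) per person; censoring times are hypothetical.
FOLLOW_UP = {
    "S1": (10, True), "S2": (15, False), "S3": (30, True), "S4": (35, False),
    "S5": (25, False), "S6": (20, True), "S7": (35, False), "S8": (30, True),
}


def risk_set(t):
    """Everyone still under follow-up at time t (including the cases at t)."""
    return sorted(p for p, (exit_time, _) in FOLLOW_UP.items() if exit_time >= t)


def cases_at(t):
    """People whose event occurs exactly at time t."""
    return sorted(p for p, (exit_time, event) in FOLLOW_UP.items()
                  if event and exit_time == t)


def potential_controls(t):
    """Risk-set members who are event-free at time t."""
    return [p for p in risk_set(t) if p not in cases_at(t)]


event_times = sorted({exit_time for exit_time, event in FOLLOW_UP.values() if event})
rng = random.Random(0)  # fixed seed so the illustration is repeatable

for t in event_times:
    controls = potential_controls(t)
    sampled = rng.sample(controls, k=len(cases_at(t)))  # 1 control per case
    print(t, "cases:", cases_at(t), "potential controls:", controls,
          "sampled:", sorted(sampled))
```

Note that S3, S6 and S8 are eligible as controls at the first event time even though they become cases later, exactly as listed on the slide.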


A cohort study

3 events or cases occur among 8 people, of whom 5 are ever exposed

Exposed are solid lines, unexposed are dashed

Dots are events

[Figure: follow-up lines; x-axis: Time, y-axis: Persons]

A nested case-control study

Compare 3 cases to 3 non-cases (at event time) among cohort members

Incidence Density Sampling

[Figure: follow-up lines; x-axis: Time, y-axis: Persons]

A case-control study

Compare 3 cases to 3 non-cases (at event time) among cohort members

but “what is the cohort?”

They arise from some underlying cohort!!

Incidence Density Sampling

[Figure: follow-up lines; x-axis: Time, y-axis: Persons]

Designing a case-control study: Overview I

• What is the research question?
• In what target population?
• What source(s) will be used?
• How long will recruitment take?
• What is the definition of the cases?
• What confirmation is needed? Is screening/additional testing necessary?
• Will prevalent cases be used? Does exposure influence the disease prognosis?
• What is the underlying cohort?
• How many cases are seen per year in the source?

Designing a case-control study: Overview II

• What are the eligibility criteria for controls?
• What source(s) will be used to identify controls?
• Do they represent the same underlying cohort as the cases?
• What confirmation is needed? Is screening/additional testing necessary?
• Sampling methods? Will the controls be selected throughout the study period? Can they be selected as cases if they later develop disease?
• Do additional sources need to be used?
• For both cases and controls, does exposure status affect inclusion in source populations or participation?

Designing a case-control study: Overview III

• Are there known confounders? Should matching be used?
• What methods will be used to recruit cases and controls?
• What methods will be used to obtain information about exposures and potential confounders? Active / passive?
• Are the methods of data collection objective and independent of case/control status?
• What methods are in place to avert and monitor differential recall by case/control status if interviewing is involved?
• If the study involves personnel-administered data collection, are the personnel masked to case-control status?

Thank you for your attention.
