Issues in case-control studies
Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine
Kwang Hyuck [email protected]
Presenter’s Name
Date
Eliseo Guallar, MD, [email protected]
Juhee Cho, M.A., [email protected]
Case-control study – historical synonyms
• Retrospective study
• Trohoc study
• Case comparison study
• Case compeer study
• Case history study
• Case referent study
Case-control study

                 Disease: yes (case)   Disease: no (control)
Exposed: yes             A1                     B1
Exposed: no              A0                     B0

\[ \mathrm{OR} = \frac{A_1 B_0}{A_0 B_1} \quad \text{(cross-product ratio)} \]
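Factors associated with early detection of biliary stricture in patients with elevated liver enzymes after living donor liver transplantation
오초롱, 이광혁, 이종균, 이규택, 권준혁*, 조재원*, 조주희**
Sungkyunkwan University School of Medicine; Samsung Medical Center, Division of Gastroenterology, Department of Transplant Surgery*, Cancer Education Center**
Study objective
• Biliary complications occur after living donor liver transplantation (LDLT); endoscopic treatment, the best available treatment, has a success rate of around 50%.
• The success rate is higher when biliary complications are detected early and endoscopic drainage is performed.
• We sought to identify factors that predict biliary complications among patients with abnormal liver function after LDLT.
Subjects and methods
• Period and subjects
  • Patients who underwent LDLT from January 2006 through December 2008
  • Patients whose liver function recovered after surgery and then worsened again
  • Only patients with duct-to-duct anastomosis were included (hepaticojejunostomy patients were excluded)
• Items reviewed
  • Underlying disease, symptoms
  • Liver function tests
  • Operative records
  • Imaging studies
Analysis groups
Patients whose liver enzymes rose again after LDLT were divided into groups
(elevation criteria: AST > 80, ALT > 80, ALP > 250, or bilirubin > 2.2)
• Group A: patients who needed ERCP vs. patients who did not need ERCP
• Group B: patients with anastomotic biliary stricture vs. patients with rejection
• Group C: among patients without stricture findings on CT, patients who needed ERCP vs. patients who did not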
[Flowchart: LDLT patients during 3 years (n=213) → patients with LFT elevation (n=120) → need ERCP (stricture, leakage, stone) vs. not need ERCP (rejection, infection, HCC, viral reactivation, vessel stenosis, etc.). Counts shown in the figure: n=46 (23, 7, 5, 3, 3, 5) and n=74 (58, 13, 3). Analysis groups A and B are marked on the chart.]
Analysis group C: CT(-) need ERCP: 32; CT(-) not need ERCP: 40
Case-Control Study or not?
Brock MV, et al. N Engl J Med 2008;358:900-913
Conducting case-control studies
• Case and control selection
• Exposure measurement
• Odds ratio
Research
• New question?
• Method
  • Clinical study
  • Translational study
  • Laboratory study
Clinical study
• Observational studies
  • Case-control study vs. cohort study
• Randomized controlled trial
Why case-control studies?
• New question of interest
• Cohort study with the appropriate outcome or exposure ascertainment does NOT exist
• Need to initiate a new study
• Do you have the time and/or resources to establish and follow a new cohort?
Case-control study?
High cholesterol → myocardial infarction
• MI (+): case
• MI (-): control
• Cholesterol level
• Result
  • Negative
  • Positive
Impetus for case-control studies: EFFICIENCY
• There may not be enough time to observe the development of diseases with long latency periods.
• The cohort may not be large enough to observe outcomes of low incidence.
NOTE: Rare outcomes are not necessary for a case-control study, but they are often the motivation.
Efficiency of case-control study
Do maternal exposures to estrogens around the time of conception cause an increase in congenital heart defects?
Assume RR = 2, 2-sided α = 0.05, 90% power
• Cohort study: if I0 = 8/1000 and I1 = 16/1000, would need 3889 exposed and 3889 unexposed mothers
• Case-control study: if ~30% of women are exposed to estrogens around the time of conception, would need 188 cases and 188 controls
Schlesselman, p. 17
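The two sample sizes above can be approximately reproduced with a standard normal-approximation formula for comparing two proportions. The sketch below assumes that formula (without continuity correction) and equal group sizes; the function name `n_per_group` is illustrative, and the result may differ from the slide's figures by a few subjects depending on rounding of the z-values.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p0: float, p1: float, alpha: float = 0.05, power: float = 0.90) -> float:
    """Per-group sample size for comparing two proportions (equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return numerator / (p1 - p0) ** 2


# Cohort design: compare incidence in unexposed (I0) and exposed (I1) mothers.
n_cohort = n_per_group(p0=8 / 1000, p1=16 / 1000)

# Case-control design: compare exposure prevalence in controls and cases.
p_controls = 0.30
odds_ratio = 2.0  # assumed to stand in for RR = 2 because the outcome is rare
p_cases = p_controls * odds_ratio / (1 + p_controls * (odds_ratio - 1))
n_case_control = n_per_group(p0=p_controls, p1=p_cases)

print(f"cohort: about {ceil(n_cohort)} per group")
print(f"case-control: about {ceil(n_case_control)} per group")
```

The contrast comes from what is being compared: a difference between two small incidences (0.8% vs. 1.6%) in the cohort design, and a difference between two common exposure prevalences (30% vs. about 46%) in the case-control design.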
Strengths of case-control study
• Efficient – typically:
  • Shorter period of time
  • Not as many individuals needed
  • Cases are selected, thus particularly good for rare diseases
• Informative – may assess multiple exposures and thus hypothesized causal mechanisms
Learning objectives
• Exposure
• Selection of cases and controls
• Bias
  • Selection, Recall, Interviewer, Information
• Odds ratios
• Matching
• Nested studies
• Conducting a case-control study
DCR Chapter 8
Exposure ascertainment – examples
• Active methods
  • Questionnaire (self- or interviewer-administered)
  • Biomarkers
• Passive methods
  • Medical records
  • Insurance records
  • Employment records
  • School records
Exposure ascertainment issues
• Establish biologically relevant period
• Measurement occurs once at current time
  • Repeated exposure
  • Previous exposure
• Measure of exposure occurs after outcome has developed
  • Possibility of information bias
  • Possibility of reverse causation (outcome influences the measure of exposure)
Is it possible in a case-control study? – relevant period
Yesterday's smoking and radiation → cancer risk
Information bias: recall bias
Mothers of babies born with congenital malformations more likely to recall (accurately or “over-recall”) events during pregnancy such as illnesses, diet, etc.
Possibility of reverse causation
High cholesterol → myocardial infarction
• MI (+): case
• MI (-): control
• Cholesterol level: result?
• MI → cholesterol level decreases
• Cholesterol is measured after MI
Case selection – basic tenets
• Eligibility criteria
  • Characteristics of the target and source population
• Diagnostic criteria
  • Definition of a case: misclassification
• Feasibility
Source populations – samples
• Health providers: clinics, hospitals, insurers
• Occupations: workplace, unions
• Surveillance/screening programs
• Laboratories, pathology records
• Birth records
• Existing cohorts
• Special interest groups: disease foundations or organizations
Incident versus prevalent cases
• Incident cases: all new cases of disease (that become diagnosed) in a certain period
• Prevalent cases: all current cases, regardless of when the case was diagnosed
Incident vs. prevalent cases
• Do the cases represent all incident cases in the target population?
• Exposure–disease association vs. exposure–survival association
Prevalent cases
• Disease with A only (A: causal factor): 1-month survival
• A+B (B: protective factor): 1-year survival
• A+C (C: protective factor): 10-year survival
Patient A: A1, 1 month
Patient B: A1+B, 1 year
Patient C: A1+C, 10 years
Among prevalent cases, A1, B, and C all appear as causes; an intervention against B or C would decrease survival (↓↓).
Disease severity
• Which stage is chosen for a case?
  • Early stage only: progression does not always occur
  • Late stage only: influence of severity
• Increase sample size for stratification
Early stage only
• Cases were selected from prevalent cases of thyroid cancer
• Case: small thyroid cancer
• Control: normal population
• The differences between the two groups were determined
• What is the clinical meaning of this study if there is no difference in survival between them?
Late stage only – difficult diagnosis
Pancreatic cancer vs. weight
• Cases: late-stage pancreatic cancer
• Low weight is due to cancer progression
• Conclusion: low weight → pancreatic cancer
• Increase sample size for stratification
Selection bias
Selection of cases independent of exposure status
Related to severity
Related to hospitalization or visiting
Example selection bias (1)
• Hypothesis: common cold → asthma
• Setting: patients in hospital
• Truth
  • Common cold: aggravating factor, not causal factor
  • No difference in the incidence of asthma according to common cold
  • Common cold (+) → aggravation → hospital visit
  • Common cold (-) → no symptoms → no visit
Example selection bias (2)

           Total   Common cold in society   Patients in hospital   Common cold in hospital
Asthma      1000             10                      50                      10
General   200000           2000                    1000                      20 (10 + alpha)

                 Cause positive   Cause negative
Case (asthma)          10               40
Control                 1               49

Odds ratio = (10 × 49) / (40 × 1) = 12.25
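To make the contrast explicit, the numbers in this example can be run through the cross-product formula twice: once for society as a whole and once for the hospital sample. This is a minimal sketch; the function name `odds_ratio` is illustrative, and treating the "General" row as the comparison population is an assumption about how the slide's table is meant to be read.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Society: 1000 people with asthma (10 with a cold) vs. 200000 others (2000 with a cold).
or_society = odds_ratio(10, 1000 - 10, 2000, 200000 - 2000)

# Hospital: 50 asthma cases (10 with a cold) vs. 50 controls (1 with a cold).
or_hospital = odds_ratio(10, 40, 1, 49)

print(f"OR in society:  {or_society:.2f}")
print(f"OR in hospital: {or_hospital:.2f}")
```

The cold prevalence is 1% in both groups in society, so there is no association there; the association appears only because a cold brings people with asthma to the hospital.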
Case and control selection
Same distribution of risk factors?
Guallar E, et al. N Engl J Med 2002;347:1747-54
Selection of controls – basic tenets
• Same target population as cases
• Confirmation of lack of outcome/disease
• Selection needs to be independent of exposure
Controls in case-control studies
Should have the same proportion of exposed to non-exposed persons as the underlying cohort (source population)
Should answer yes to: If developed disease of interest during study period, would they have been included as a case?
Selecting controls – same source as cases
Characteristics
1. Convenient
2. Most likely same target population
3. Rule out outcome – avoids misclassification
4. Similar factors leading to inclusion into source population
5. Sometimes impractical
Examples
• Breast cancer screening program
  • Confirmed breast cancer – cases
  • No breast cancer – controls
• Same hospital as case series
  • Similar referral pattern – examine by illness types
• Pediatric clinics
• Geographic population
• Other special populations (e.g., occupational setting)
Source for controls
• Geographic population
  • Roster needed
  • Probability sampling
• Neighborhood controls
  • Random sample of the neighborhood
• Friends and family members
• Hospital-based control
Selection of controls: friends or family members
• Friends or family members
  • Ask each case for a list of possible friends who meet eligibility criteria
  • Randomly select among list
  • Type of matching – will be addressed later
• Concerns:
  • May inadvertently select on exposure status, that is, friends because of engaging in similar activities or having similar characteristics/culture/tastes
  • “Over-matching”
Am J Epidemiol 2004;159:915-21
Selection of controls: hospital or clinic-based
• Strengths
  • Ease and accessibility
  • Avoid recall bias
• Concerns
  • Selection bias: exposure related to the hospitalization
  • A mixture of the best defensible control
  • Referral pattern: same or not
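Diet pattern: colon cancer
Study conducted at a GI cancer referral center
• Case: colon cancer (+) in the GI clinic
• Control: colon cancer (-) in the respiratory clinic
  • GI clinic: diet information related to GI cancer in the waiting room
  • Respiratory clinic
• The difference between the two groups may reflect a difference between clinics rather than a difference in disease.
• Control: gastric cancer (+) in the GI clinic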
Guallar E, et al. N Engl J Med 2002;347:1747-54
Weaknesses of case-control studies
• Time period from which the cases arose
  • Survival factor, reverse causation
  • Biologically relevant period
• Only one outcome measured
• Susceptibility to bias
  • Separate sampling of the cases and controls
  • Retrospective measurement of the predictor variables
Selection of cases
Case selection in hospitals
• Alcohol → hip fractures: all visit hospitals
• IUD → abortion
  • 1st abortion: some visit but others do not
  • Women with IUD in the general population visit clinics more frequently

[Diagram: 2×2 tables of exposed/non-exposed by disease/no disease. Target population cells A, B, C, D; study sample cells a, b, c, d.]
1st abortion: 3% rate and no relation to IUD
IUD: frequent visits

General population
• IUD (+) 1000: 970 / 30
• IUD (-) 9000: 8730 / 270
Hospital population
• IUD (+) 90%: 873 / 27
• IUD (-) 45%: 4050 / 120

IUD     case   control
Yes      10      10
No       90      90
Total   100     100

IUD     case   control
Yes      18
No       82
Total   100

• Control from general population: difference due to frequent visits
• Control from hospital population: theoretically the same, unless this control group has higher abortion rates due to other problems
• Control mixture: both
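The effect of the control source can be checked directly from the counts on this slide. The sketch below assumes that each pair such as "970 / 30" means "no abortion / abortion" (consistent with the stated 3% rate) and uses the slide's hospital counts as given; the function and variable names are illustrative.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# General population: abortions 30 (IUD+) and 270 (IUD-); no abortion 970 and 8730.
or_population = odds_ratio(30, 270, 970, 8730)

# Hospital cases (27 IUD+, 120 IUD-) compared with general-population controls.
or_hospital_cases_vs_general = odds_ratio(27, 120, 970, 8730)

# Hospital cases compared with hospital controls (873 IUD+, 4050 IUD-).
or_hospital_cases_vs_hospital = odds_ratio(27, 120, 873, 4050)

print(f"whole population:               {or_population:.2f}")
print(f"hospital cases, general controls:  {or_hospital_cases_vs_general:.2f}")
print(f"hospital cases, hospital controls: {or_hospital_cases_vs_hospital:.2f}")
```

With general-population controls the odds ratio roughly doubles even though IUD use and abortion are unrelated in the population; with hospital controls, who were subject to the same visit pattern, it returns close to 1.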
Actual situation
Limited cases
Selection bias from control selection
Nomura A, et al. N Engl J Med 1991;325:1132-6
Selection bias in nested case-control study
• Controls were excluded if they had had gastrectomy or a history of peptic ulcer disease
• Controls with a cardiovascular disease or cancer at baseline or during follow-up were excluded

[Diagram: 2×2 tables of exposed/non-exposed by disease/no disease. Target population cells A, B, C, D; study sample cells a, b, c, d.]
MacMahon B, et al. N Engl J Med 1981;304:630-3
Selection bias in case-control study
• Controls were largely patients with diseases of the gastrointestinal tract
• Control patients may have reduced their coffee intake as a consequence of GI symptoms

[Diagram: 2×2 tables of exposed/non-exposed by disease/no disease. Target population cells A, B, C, D; study sample cells a, b, c, d.]
Antunes CMF, et al. N Engl J Med 1979;300:9-13
Non-GY controls: 6.0; GY controls: 2.1
Criticisms of prior case-control studies
• Diagnostic surveillance bias
  • Women on estrogens are evaluated more intensively – they are more likely to be diagnosed and to be diagnosed at earlier stages
  • Women with asymptomatic cancer who receive estrogens are more likely to bleed and to be diagnosed
Antunes CMF, et al. N Engl J Med 1979;300:9-13
To avoid selection bias in case-control studies
• Selection of cases
  • Types of cases selected (non-fatal, symptomatic, advanced)
  • Response rates among cases
  • Relation of selection to exposure – Are exposed cases more (or less) likely to be included in the study?
• Selection of controls
  • Type of controls (general population, hospital, friends and relatives)
  • For hospital controls, diseases selected as control conditions
  • Response rate among controls
  • Relation of selection to exposure – Are exposed controls more (or less) likely to be included in the study?
• Similar response rates in cases and controls do NOT rule out selection bias
Recall issues
• All information in case-control studies is historic, so if relying on reporting by participants, accuracy depends on recall
• Concerns:
  • Do cases recall prior events differently from controls?
  • Mindset of someone with disease: Is there something that I did that may have caused the disease?
Recall bias (information bias)
Recall bias – example
Mothers of babies born with congenital malformations more likely to recall (accurately or “over-recall”) events during pregnancy such as illnesses, diet, etc.
Folic acid and neural tube defects
Figure 1: Features of neural tube development and neural tube defects (28th day after fertilization). Botto et al. Neural tube defects. NEJM 1999.
Background and aim
• A reduced recurrence risk of neural tube defects (NTDs) among women receiving multivitamin supplements containing folic acid.
• Most NTDs are de novo; less than 10% of NTDs are recurrent.
• First occurrence only of NTDs and periconceptional folate supplements
Study population
• Case: NTDs
• Control: other major malformations (because of recall bias)
  • Subjects with oral clefts were excluded because vitamin supplementation has been hypothesized to reduce the risk: selection bias
[Diagram: target population (pregnant women) → source population → study population]
Overall data
Folate (+) OR = 0.6 (0.4 – 0.8)
Recall Bias: Previous knowledge
Recall bias quantification (1000 cases, 1000 controls)

                Case   Control    OR     Recall rate     In this study
Real             500     800     0.625   Control – 75%
All              400     600     0.667   Case – 80%          0.6
Prev known       450     600     0.750   Case – 90%          0.8
Prev unknown     375     600     0.625   Case – 75%          0.4
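The arithmetic behind this table can be sketched as follows. The values in the slide's OR column equal the ratio of reported exposed cases to reported exposed controls (for example 400 / 600 = 0.667); the sketch reproduces that ratio and, as an addition not on the slide, also computes the cross-product odds ratio from the same counts. Function names are illustrative.

```python
N_CASES = 1000
N_CONTROLS = 1000
TRUE_EXPOSED_CASES = 500
TRUE_EXPOSED_CONTROLS = 800
CONTROL_RECALL = 0.75  # same for every scenario on the slide


def reported_counts(case_recall, control_recall=CONTROL_RECALL):
    """Exposed counts that would be reported under the given recall rates."""
    return TRUE_EXPOSED_CASES * case_recall, TRUE_EXPOSED_CONTROLS * control_recall


def count_ratio(case_recall, control_recall=CONTROL_RECALL):
    """Reported exposed cases divided by reported exposed controls."""
    cases, controls = reported_counts(case_recall, control_recall)
    return cases / controls


def cross_product_or(case_recall, control_recall=CONTROL_RECALL):
    """Odds ratio computed from the reported exposed and unexposed counts."""
    cases, controls = reported_counts(case_recall, control_recall)
    return (cases / (N_CASES - cases)) / (controls / (N_CONTROLS - controls))


for label, case_recall in [("all", 0.80), ("prev known", 0.90), ("prev unknown", 0.75)]:
    print(f"{label:13s} ratio={count_ratio(case_recall):.3f} "
          f"OR={cross_product_or(case_recall):.3f}")
```

When cases and controls recall equally well (75% each), the ratio equals the true value of 0.625; better recall among cases pulls the estimate toward the null.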
Recall bias – assessment / avoidance
• Check with recorded information, if possible
• Use objective markers or surrogates for exposure – careful of markers that are affected by disease
• Ask participant to identify which factor(s) are important for disease
• Build in false risk factor to test for over-reporting
• Use controls with another disease
Selection bias
• If oral clefts were included in the control group, the number of controls with the exposure (lack of vitamin supplement or folate intake) would increase.
• As B increases, the probability of rejecting the null hypothesis decreases.

                Case   Control
Exposure (+)     A        B
Exposure (-)     C        D

Exposure: lack of folate intake
Cleft = ↓ intake of vitamin
Methods
Periconceptional folic acid exposure was determined by interview with study nurses
• Demographic
• Health behavior factors
• Reproductive history
• Family history of birth defects
• Occupation
• Illnesses (chronic and during pregnancy)
• Use of alcohol, cigarettes and medications
• Vitamin use during the 6 months before the last LMP through the end of pregnancy
• Semi-quantitative food frequency questionnaire
• Knowledge of vitamins and birth defects
Confounding
• Exposure: ↓ folate intake
• Outcome: ↑ NTDs
• Confounding: alcohol
Interviewer bias
Differential interviewing of cases and controls, i.e., may probe or interpret responses differently
Interviewer Bias
(Information Bias)
Interviewer bias – avoidance / assessment
• Self-administered instruments (prone to more non-response)
• Standardized instruments
  • Computerized instruments (CADI, ACASI)
  • Avoid open-ended questions but rather use questions with each possible response elicited
• Training
• Masking interviewers to research question
• Masking interviewers to case/control status
• Same interviewers for cases and controls
Odds ratio

                 Disease: yes   Disease: no
Exposed: yes          A1             B1
Exposed: no           A0             B0

\[ \mathrm{OR} = \frac{A_1 B_0}{A_0 B_1} \quad \text{(cross-product ratio)} \]
Example: CHD and diabetes

                  CHD: yes   CHD: no
Diabetes: yes       183         65
Diabetes: no        575        735

\[ \mathrm{OR}_{CHD} = \frac{183 / 65}{575 / 735} \approx 3.60 \]

No units!
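The CHD and diabetes example can be checked with a few lines of code. The sketch below computes the cross-product odds ratio from the table and, as an addition not shown on the slide, a 95% confidence interval using the standard log-scale (Woolf) approximation; the function name is illustrative.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(a1, b1, a0, b0, z=1.96):
    """Odds ratio A1*B0 / (A0*B1) with a Woolf (log-scale) confidence interval."""
    estimate = (a1 * b0) / (a0 * b1)
    se_log_or = sqrt(1 / a1 + 1 / b1 + 1 / a0 + 1 / b0)
    lower = exp(log(estimate) - z * se_log_or)
    upper = exp(log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Diabetes yes: 183 with CHD, 65 without; diabetes no: 575 with CHD, 735 without.
or_chd, lower, upper = odds_ratio_with_ci(a1=183, b1=65, a0=575, b0=735)
print(f"OR = {or_chd:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the interval is built on the log scale, it is asymmetric around the estimate, which fits the multiplicative nature of the odds ratio.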
Some properties of odds ratios
• Null value: OR = 1
• OR ≥ 0 (cannot be negative)
• Multiplicative scale (be careful with plots)
• Use logistic regression to estimate multivariate adjusted odds ratios in case-control studies
Odds ratios and the “rare disease assumption”
• With incidence density sampling (represents underlying cohort at time of case) and sampling of cases and controls independent of exposure: OR ≈ IR
• With outcomes of very low incidence in the underlying cohort and sampling of cases and controls independent of exposure: OR ≈ RR
  • Higher incidence increases the bias away from the null
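How far the odds ratio drifts from the risk ratio as the outcome becomes more common can be illustrated numerically. This is a minimal sketch that assumes a fixed RR of 2 and controls drawn from those who remain disease-free at the end of follow-up; the baseline risks are illustrative values, and the function name is hypothetical.

```python
def or_from_risks(risk_unexposed, rr):
    """Odds ratio implied by a baseline risk and a risk ratio."""
    risk_exposed = risk_unexposed * rr
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed


RR = 2.0
for baseline in (0.001, 0.008, 0.05, 0.20):
    print(f"risk in unexposed = {baseline:.3f}: RR = {RR:.1f}, OR = {or_from_risks(baseline, RR):.3f}")
```

At a baseline risk of 8 per 1000 the odds ratio is within 1% of the risk ratio; at 20% it overstates the risk ratio by a third.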
Matching
• Individual matching
• Frequency matching
• Stratified matching
Nested study
• Case-control study
• Case-cohort study
Siegel DS, et al. Blood 1999;93:51-4
Matching in cohort study – example
Matching in case-control studies – individual matching
• Pairing or grouping controls to case by known risk factors in the design phase, i.e., when selecting controls
• In protocol, define matching characteristics and their “boundaries”
  • Dichotomous or categorical: self-explanatory (e.g., sex, race, blood type, disease stage)
  • Continuous: can be exact, or typically a window (e.g., age ± 5 years, CD4 cell count ± 50 cells)
• For each recruited case, search in control source population for the person(s) who meet the matching criteria
• Select 1 or more of them at random
Odds ratio – matched pairs

Case   Control   # pairs
A1       B1       n11
A1       B0       n10
A0       B1       n01
A0       B0       n00

N = total # pairs
N pairs = N cases and N controls = 2N people
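For a pair-matched table like this one, the usual estimator uses only the discordant pairs. The sketch below assumes the standard conditional estimate OR = n10 / n01 and the uncorrected McNemar statistic, neither of which appears in the extracted slide text; the pair counts are hypothetical and the function names are illustrative.

```python
def matched_pairs_odds_ratio(n11, n10, n01, n00):
    """Conditional odds ratio: pairs with only the case exposed / pairs with only the control exposed."""
    if n01 == 0:
        raise ValueError("no pairs with only the control exposed; OR is undefined")
    return n10 / n01


def mcnemar_chi2(n10, n01):
    """McNemar chi-square statistic (1 df, no continuity correction)."""
    return (n10 - n01) ** 2 / (n10 + n01)


# Hypothetical counts for 100 matched pairs.
pairs = {"n11": 20, "n10": 30, "n01": 10, "n00": 40}
matched_or = matched_pairs_odds_ratio(**pairs)
chi2 = mcnemar_chi2(pairs["n10"], pairs["n01"])
print(f"matched OR = {matched_or:.1f}, McNemar chi-square = {chi2:.1f}")
```

Concordant pairs (n11 and n00) drop out of the estimate, which is why matching on a factor closely tied to exposure can cost statistical efficiency.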
Antunes CMF, et al. N Engl J Med 1979;300:9-13
Frequency matching
• Select cases
• Examine distribution of potential confounder (matching variable)
• Select controls so that they have same distribution of the potential confounder
• Conduct stratified analyses or regression to control for the induced selection bias
Stratified sampling – alternative to matching
• Decide up front what distribution of cases and controls according to confounder is desired
• Select cases and controls so that expectations are met
• Selection of controls does not depend on cases being selected first
• Note that distribution of confounder is not the distribution one may see among all cases in the population
Stratified sampling – example
• Want 50% females in 100 cases and controls
  • 50 female cases and 50 male cases
  • 50 female controls and 50 male controls
• In the study period, 175 incident male cases and 75 incident female cases occur
• As they occur, enroll cases until 50 are recruited in each stratum
• Throughout study period, enroll 50 male and 50 female controls
Matching – limitations
• Cannot examine the independent effect of matched variable on outcome
  • Cases and controls are balanced for the matched factor
• May be costly to perform
• May inadvertently match
  • On the exposure itself or its surrogate
  • On a factor in the causal pathway
  • On a factor that is affected by the outcome
• Matching on an exposure-related factor but not a disease determinant may reduce the statistical efficiency (matched cases and controls with same exposure are not used in matched analysis)
• Logistical complexity of matching
Matching – strengths
• Costs of finding a matched control may be
  • < costs of performing tests to assess confounding
  • < costs of recruiting additional controls to yield enough persons across entire range of confounding variable
• Particularly useful when distribution of confounders is very different in cases and controls
• Increases amount of information/subject
  • Matching yields same ratio of cases and controls according to distribution of matched variable
Nested studies
• In an existing cohort study
  • New questions arise
  • Need efficient method to use existing information
  • Do not want to conduct methods on entire cohort, due to limited resources
• Nest a study without sacrificing validity and too much precision
• Some nesting options:
  • Case-cohort
    • Sub-cohort
  • Case-control
Nested case-control and case-cohort studies
• Case-comparison studies
• Use all cases or representative subset as of date of analysis
• Comparison group: cohort member for all nested designs

Study design    Comparison
Case-control    Event-free member at time of case’s event (incidence density sampling)
Case-cohort     Members of subcohort, selected at random from cohort at time of enrollment, at risk at time of case’s event = in the subcohort risk set
Full cohort

Events: A     1                          1                     2
              S1                         S6                    S3,S8
At risk: N    8                          6                     4
              S1,S2,S3,S4,S5,S6,S7,S8    S3,S4,S5,S6,S7,S8     S3,S4,S7,S8

[Figure: follow-up lines for subjects S1 to S8 on a time axis with ticks at 10, 20, 30, 35.]
Case-cohort study
Nested case-control study

Events: A             1                          1                     2
                      S1                         S6                    S3,S8
At risk: N            8                          6                     4
                      S1,S2,S3,S4,S5,S6,S7,S8    S3,S4,S5,S6,S7,S8     S3,S4,S7,S8
Potential controls:   S2,S3,S4,S5,S6,S7,S8       S3,S4,S5,S7,S8        S4,S7

[Figure: follow-up lines for subjects S1 to S8 on a time axis with ticks at 10, 20, 30, 35.]
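The risk sets and potential controls listed above can be reproduced with a short sketch of incidence density sampling. The event times (10, 20, 30) and the censoring times for S2, S4, S5 and S7 are assumptions chosen to be consistent with the slide's risk sets; all names are illustrative.

```python
import random

# subject -> (exit time, had event). Event times are assumed to follow the axis
# ticks; censoring times for S2, S4, S5 and S7 are assumed for illustration.
COHORT = {
    "S1": (10, True), "S2": (15, False), "S3": (30, True), "S4": (35, False),
    "S5": (25, False), "S6": (20, True), "S7": (35, False), "S8": (30, True),
}


def risk_set(t):
    """Subjects still under follow-up at time t."""
    return {s for s, (exit_time, _) in COHORT.items() if exit_time >= t}


def cases_at(t):
    """Subjects whose event occurs at time t."""
    return {s for s, (exit_time, event) in COHORT.items() if event and exit_time == t}


def potential_controls(t):
    """Members of the risk set at time t who are event-free at that time."""
    return risk_set(t) - cases_at(t)


def sample_controls(t, per_case, rng):
    """Randomly draw per_case controls for each case from the risk set at time t."""
    pool = sorted(potential_controls(t))
    k = min(len(pool), per_case * len(cases_at(t)))
    return rng.sample(pool, k)


rng = random.Random(0)
for t in (10, 20, 30):
    print(t, sorted(cases_at(t)), sorted(potential_controls(t)), sample_controls(t, 1, rng))
```

Note that S6 is eligible as a control at time 10 and becomes a case at time 20, and S3 and S8 are eligible controls before their own events: under incidence density sampling a future case can serve as a control for an earlier case.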
A cohort study
• 3 events or cases occur among 8 people, of whom 5 are ever exposed
• Exposed are solid lines, unexposed are dashed
• Dots are events
[Figure: persons (vertical axis) by time (horizontal axis).]
A nested case-control study
• Compare 3 cases to 3 non-cases (at event time) among cohort members
• Incidence density sampling
[Figure: persons (vertical axis) by time (horizontal axis).]
A case-control study
• Compare 3 cases to 3 non-cases (at event time) among cohort members, but “what is the cohort?”
• They arise from some underlying cohort!
• Incidence density sampling
[Figure: persons (vertical axis) by time (horizontal axis).]
Designing a case-control study: Overview I
• What is the research question?
• In what target population?
• What source(s) will be used?
• How long will recruitment take?
• What is the definition of the cases?
• What confirmation is needed? Is screening/additional testing necessary?
• Will prevalent cases be used? Does exposure influence the disease prognosis?
• What is the underlying cohort?
• How many cases are seen per year in the source?
Designing a case-control study: Overview II
• What are the eligibility criteria for controls?
• What source(s) will be used to identify controls? Do they represent the same underlying cohort as the cases?
• What confirmation is needed? Is screening/additional testing necessary?
• Sampling methods? Will the controls be selected throughout the study period? Can they be selected as cases if they later develop disease?
• Do additional sources need to be used?
• For both cases and controls, does exposure status affect inclusion in source populations or participation?
Designing a case-control study: Overview III
• Are there known confounders? Should matching be used?
• What methods will be used to recruit cases and controls?
• What methods will be used to obtain information about exposures and potential confounders? Active / passive?
• Are the methods of data collection objective and independent of case/control status?
• What methods are in place to avert and monitor differential recall by case/control status if interviewing is involved?
• If the study involves personnel-administered data collection, are the personnel masked to case-control status?
Thank you for your attention.