12 March 2007
Andy Bogart
A cooperative effort:
University of North Dakota National Resource Center on Native American Aging
University of Washington Center for Clinical and Epidemiological Research
A cooperative effort:
Training and practical experience in
Research Design
Statistics and Data Analysis
Manuscript Preparation
Research Design
[Diagram: Social Engagement, Pap Testing, Age, Disability, Education, ?]
Research Design
Scientific Considerations
Specific Aims Development
Statistics and Data Analysis
Hypothesis Testing
Data cleaning, coding, and analysis using SPSS 14
Regression Analysis
[Figure: Income as a Function of Height. x-axis: Height in Centimeters; y-axis: Yearly Income in Dollars]
Basic summary statistics
[Figure: Lung Function and Smoking Among Children. y-axis: Forced Expiratory Volume (FEV); groups: Males, Females; Non Smokers, Smokers]
Manuscript Preparation
Creating Tables and Figures
Table 1: Subject Characteristics

                                          Pap test in past 3 years
Subject Characteristics                   Yes (n = 1917)   No (n = 596)   p-value
Age, mean (sd)                            65 (5)           64 (5)         0.003
ADL Count, %                                                              0.008
  None                                    71               69
  1-3 ADLs                                23               20
  4 or more                               7                11
BMI Category, %                                                           0.732
  Underweight                             2                2
  Healthy                                 20               19
  Overweight                              28               28
  Obese I                                 26               26
  Obese II                                13               15
  Obese III                               12               10
Social engagement (per week), %                                           <0.001
  No meetings                             60               70
  One meeting                             25               22
  Two or more                             15               9
Married or living as married              41               35             0.007
Has at least one personal physician       81               71             <0.001
High School graduate                      80               74             0.002
Insured                                                                   0.382
  No Insurance                            7                6
  IHS Only                                18               16
  At least one other type of insurance    75               78
Rurality                                                                  0.602
  Urban                                   23               24
  Large rural                             18               16
  Small rural                             20               21
  Isolated rural                          39               40
[Figure: Sleep Quality. x-axis: Weeks Since Randomization (ticks at 0, 4, 8, 20); y-axis: Sleep Quality by 10cm VAS Score (0 to 10; scale anchors: None, Best Ever); one line per Treatment Group: Direct from Master, Distance from Master, Direct from Actor, Distance from Actor]
Direct from Master (n) 24 21 22 19
Distance from Master (n) 25 22 18 20
Direct from Actor (n) 25 21 21 21
Distance from Actor (n) 25 19 20 19
Methods and Results Summaries
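$$E\left[\widehat{AUC}_{WMR}(t_j)\right] = E\left[\frac{K_j}{N_j}\right] = E_{N_j}\left[E\left[\frac{K_j}{N_j}\,\middle|\,N_j = n\right]\right] = E_{N_j}\left[\frac{1}{n}\,E\left[K_j \mid N_j = n\right]\right]$$

The random variable $K_j$ is a sum of $n$ Bernoulli experiments, each sharing a common
probability of success denoted as $p_j = P(M_1 > M_2 \mid T_1 = t_j,\, T_2 > t_j)$. The expectation of $K_j$ is
therefore the sum of the expectations of all $n$ of these Bernoulli random variables. We substitute
the expectation $np_j$ into the expression, and obtain the final result:

$$E_{N_j}\left[\frac{1}{n}\,E\left[K_j \mid N_j = n\right]\right] = E_{N_j}\left[\frac{1}{n}\,np_j\right] = E_{N_j}\left[p_j\right] = p_j$$

Thus $E\left[\widehat{AUC}_{WMR}(t_j)\right] = p_j = P(M_1 > M_2 \mid T_1 = t_j,\, T_2 > t_j)$, as desired.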
Each individual who is observed to fail at time tj contributes an estimate of AUC(tj) consisting
of the proportion of those still living whose marker values fall below his or her own. If more than
one subject fails at time tj, then we repeat the above procedure for each subject who fails, using
their marker values in turn to play the role of m*(tj), and recording separately each resulting
estimate of AUC(tj). Multiple estimates of AUC(tj) are accommodated either by averaging or by
implementing a smoothing algorithm, as described below. The estimates of AUC(tj) are
descriptions of the set of subjects who are still at risk: the numerator Kj of the AUC(tj) estimator
sums only over the subset of individuals who are observed to survive beyond time tj. Similarly, the
denominator Nj counts only those individuals who survive beyond tj. Those
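The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the function name `auc_by_failure_time` and its arguments are hypothetical, an event indicator of 1 is assumed to mark an observed failure, "fall below" is read as a strict inequality, and multiple estimates at the same failure time are combined by simple averaging rather than by smoothing.

```python
from collections import defaultdict


def auc_by_failure_time(times, events, markers):
    """Return {t_j: averaged estimate of AUC(t_j)} (hypothetical helper).

    times   -- observed follow-up time for each subject
    events  -- 1 if the subject was observed to fail at that time, else 0
    markers -- marker value for each subject
    """
    per_time = defaultdict(list)
    for t_i, d_i, m_i in zip(times, events, markers):
        if not d_i:
            continue  # censored subjects contribute no estimate of their own
        # N_j: subjects observed to survive beyond the failure time t_j
        survivors = [m for t, m in zip(times, markers) if t > t_i]
        n_j = len(survivors)
        if n_j == 0:
            continue  # nobody left to compare against
        # K_j: survivors whose marker falls below the failing subject's marker
        k_j = sum(1 for m in survivors if m < m_i)
        per_time[t_i].append(k_j / n_j)
    # Several failures at the same t_j: average their separate estimates
    return {t: sum(v) / len(v) for t, v in sorted(per_time.items())}


# Made-up toy data: two subjects fail at time 2, one subject is censored at 3
estimates = auc_by_failure_time(
    times=[1, 2, 2, 3, 4],
    events=[1, 1, 1, 0, 1],
    markers=[5, 1.5, 4, 1, 2],
)
print(estimates)
```

In the toy data the subject failing last has no survivors to be compared with, so no estimate is recorded at that time.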
deleterious (increased age, hepatomegaly, and high bilirubin) to prognosis. Differences in the
standard errors derived from the p-values reported by Roll et al. and those calculated while fitting
the Yale-like model are attributable in part to having fit them on a larger Mayo data set than in the
original Yale study.
Table 10: Comparison of Yale Coefficient Estimates

                        Roll et al. (1983), n=280        Yale-like Model, n=312
                        coef.    se*      p-value        coef.     se       p-value
Age                     0.037    0.0120   0.002          0.0339    0.0082   3.6 × 10^-5
Hepatomegaly            0.74     0.3368   0.028          0.4618    0.2148   0.032
Bilirubin > 5 mg/dl     0.82     0.2492   0.001          0.9675    0.2289   2.5 × 10^-5
Bilirubin < 1.5 mg/dl   -0.73    0.3138   0.020          -1.2954   0.2326   2.6 × 10^-8
Portal fibrosis         -1.34    0.4336   0.002          -0.6590   0.2728   0.016

* Standard error estimates derived from p-values reported in Roll et al. (1983)
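
One way a standard error can be backed out of a reported p-value is sketched below. The excerpt does not state the method, so this assumes a two-sided Wald test with a standard normal reference distribution; the function name `se_from_p_value` is hypothetical. Under that assumption the calculation agrees with the se* column of Table 10 to about four decimal places.

```python
from statistics import NormalDist


def se_from_p_value(coef, p_value):
    """Back out a standard error from a coefficient and its p-value.

    Assumes (not stated in the source) a two-sided Wald test, so that
    |coef| / se equals the normal quantile at 1 - p/2.
    """
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return abs(coef) / z


# Coefficients and p-values from the Roll et al. (1983) columns of Table 10
reported = {
    "Age": (0.037, 0.002),
    "Hepatomegaly": (0.74, 0.028),
    "Bilirubin > 5 mg/dl": (0.82, 0.001),
    "Bilirubin < 1.5 mg/dl": (-0.73, 0.020),
    "Portal fibrosis": (-1.34, 0.002),
}
for name, (coef, p) in reported.items():
    print(f"{name}: se = {se_from_p_value(coef, p):.4f}")
```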
As in the previous example, we present in Table 11 the concordance estimates for each
marker along with bootstrapped confidence intervals, and inference on the null hypothesis of no
difference between the concordance probabilities given by each estimator. This time each
method but one provides evidence in support of the Mayo researchers’ claim that their model
performed on par with the Yale model. Harrell’s estimator is the only dissenter in this regard,
providing much lower bootstrap p-values of both types for the difference than any other estimator.
As seen in bivariate-normal data simulations in section 2, Harrell’s estimator is much higher than
the point
the relative variances: for Harrell, the relative variance increases to 23% above that of the CoxTVC.
WMRnone and WMRloess estimators show variances 17% and 20% higher than CoxTVC respectively,
and the CoxPH estimator’s variance is 6% higher than the CoxTVC. Results are similar when
correlation is instead set to -0.70, with relative variances in most cases being slightly less different
from 1 than the results stated above. The non-parametric estimators suffer a larger variance
estimate than do the two model-based estimators, as might be expected given that they
make fewer distributional assumptions about the data.
[Figure: x-axis: Log of Analysis Time (-2 to 2); y-axis: Area Under ROC(t) (0.0 to 1.0); Methods: Weighted Mean Rank, Cox PH Model, Cox TVC Model; additional curve labeled $2 f(t) S(t)$]
Figure 1: AUC(t) Estimates for Bivariate Normal Data Simulation
Further examination of Figure 1 illustrates interesting features visible in the estimates from Table
1. In the region where log-analysis time is between edit
Manuscript editing
[Figure: Adjusted Pap Test Receipt Odds Ratio Estimates. x-axis: odds ratio (0.0 to 3.0); rows: Number of ADL disabilities (4 or more, 1 to 3, 0) and Social engagements per week (2 or more, 1, 0)]
Future Directions
Ongoing manuscript assistance
2 analysis projects per year
New participants from UND
Thank you for listening