Page 1: Data Analysis Workshop

Data Analysis Workshop

Chuck Spiekerman (cspieker@u)

Karl Kaiyala (kkaiyala@u)

Course Outline

• February 20

– How to describe your study

– Choosing an analysis method

• March 13

– Student presentations of study designs and data-analysis plans

• March 20

– Student presentations of data analyses

Describing your study

• Next session (3/13) we are asking you to present a description of your planned study

• The next few slides give an outline of suggested components of this description

• Attention to all these components should help you (and/or a consultant) decide on appropriate methods of statistical analysis

Study Design Description

• Specific Aims (what?)

• Background (why?)

• Previous work (who?) *

• Study methods (how?): several components

*optional for student presentations

Specific Aims

• Describe the scientific question(s)

• Be specific and precise

• Stick to the study at hand

Background and Motivation

• Relevance of this research

– Existing knowledge

– Identify gap this research will fill

• Relate to specific aims

• If part of a larger study, where does this study fit?

Study Methods Components

• Primary outcomes

• Study population

• Methods and procedures *

• Data analysis plan

*optional for student presentations

Primary Outcomes

• Precise definition of key measurement (individual data item) of interest

• Justify why this outcome and not something else.

– Relate to specific aim

• Details of collection can be left to methods and procedures section

Study population

• How were the subjects selected?

– Exclusion and inclusion criteria

– Group classification?

– Matching?

– Randomization?

Data analysis plan

• Outline data analysis for each specific aim

• Make clear which procedures are being used toward which aim

• Usually some simple tables and plots should be sufficient

• Keep it simple

Forming an analysis plan

Two important questions

1. What do you want to do/show?

2. What kind of data …

i. … will answer your question best?

ii. … can you get?

iii. … do you have?

Types of data

• Continuous

– Differences between values have meaning, and are interpretable independent of the values themselves

– E.g. difference between 8 and 9 basically the same as difference between 1 and 2.

• Ordinal

– Values have an order, but differences are not easily interpretable (e.g. good, fair, poor)

Types of data (cont.)

• Categorical

– Values are descriptive but do not have any obvious ordering. E.g. tx A, tx B, tx C.

• Binary, Dichotomous

– Fancy names for categorical variables with only two possible values.

Types of data (sampling)

• one-sample

– Refers to situation when values of interest all come from one group and will be compared to a known quantity (e.g. “change greater than zero”)

• two-sample

– When data are divided/sampled in two groups and observed values compared between groups.

What do you want to do?

• Show evidence of differences

• Estimate population parameters

• Demonstrate equivalence

• Show evidence of association

• Create/validate a predictive model

• Assess agreement or reliability

• Other?

Showing evidence of differences

• Standard hypothesis testing procedures, usually comparing means or proportions

• Which test will depend on type of data. Usual suspects (YMMV)

– T-test or ANOVA for Continuous data

– Chi-square test for Categorical data

– Rank-based tests (e.g. Wilcoxon) for Ordinal data

• Use Rosner flowchart for guidance

• Supplement p-value with estimate of difference (with confidence interval)
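The last bullet above (supplement the p-value with an estimate and confidence interval) can be sketched in a few lines. This is a minimal illustration, not the course's prescribed method: the function name and the pocket-depth numbers are hypothetical, and the interval uses a normal quantile, which is only a large-sample approximation (a small study would normally use a t quantile).

```python
import math
from statistics import NormalDist, mean, variance


def mean_difference_ci(group_a, group_b, level=0.95):
    """Difference in means (a - b) with a large-sample confidence interval."""
    diff = mean(group_a) - mean(group_b)
    # Standard error without assuming equal variances (Welch form).
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    # Normal quantile: an approximation; small samples call for a t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Hypothetical measurements (mm) under two treatments, for illustration only.
treatment_a = [3.1, 2.8, 3.5, 3.0, 2.9, 3.3, 3.2, 2.7]
treatment_b = [2.6, 2.4, 2.9, 2.5, 2.8, 2.3, 2.7, 2.6]

diff, low, high = mean_difference_ci(treatment_a, treatment_b)
print(f"difference = {diff:.2f} mm, 95% CI ({low:.2f}, {high:.2f})")
```

Reporting the interval alongside the test shows how large the difference plausibly is, not just whether it is distinguishable from zero.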

Estimate Population Parameters

• P-values and hypothesis tests aren’t always necessary

• Sometimes you don’t really want to compare things but only estimate values

• Estimate parameters of interest and supplement with confidence intervals (IMPORTANT!).
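As one example of estimating a parameter with a confidence interval, the sketch below computes a proportion with a Wilson score interval. The choice of the Wilson interval, the function name, and the counts are assumptions made for illustration; the slides do not specify an interval method.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, level=0.95):
    """Estimated proportion with a Wilson score confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return p_hat, centre - half, centre + half


# Hypothetical: 18 of 60 sampled patients show the condition of interest.
p_hat, low, high = wilson_interval(18, 60)
print(f"prevalence = {p_hat:.2f}, 95% CI ({low:.2f}, {high:.2f})")
```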

Demonstrate equivalence

• In some instances the goal is to show equivalence of, say, two treatments.

• Failing to show a difference using a standard hypothesis test is usually not sufficient evidence of equivalence

• Two strategies

– Estimate difference and show ‘worst cases’ with confidence interval

– Compute a standard hypothesis test with very good power (> 95%)
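The first strategy (show the 'worst cases' with a confidence interval) might look like the sketch below. The equivalence margin must be chosen in advance on scientific grounds; the margin values, the data, and the function name here are hypothetical, and the interval again relies on a large-sample normal approximation.

```python
import math
from statistics import NormalDist, mean, variance


def equivalence_check(group_a, group_b, margin, level=0.95):
    """Return the CI for the mean difference and whether it lies inside +/- margin."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    low, high = diff - z * se, diff + z * se
    # Equivalence is supported only if even the worst case stays inside the margin.
    return low, high, (-margin < low and high < margin)


# Hypothetical readings from two measurement methods.
method_a = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7, 10.1, 9.9]
method_b = [10.0, 9.9, 10.2, 10.1, 9.8, 10.1, 10.0, 9.8, 10.2, 9.9]

low, high, equivalent = equivalence_check(method_a, method_b, margin=0.5)
print(f"95% CI for difference: ({low:.2f}, {high:.2f}); within margin: {equivalent}")
```

Note that the conclusion depends entirely on the margin: the same data can support equivalence for a wide margin and fail to support it for a narrow one.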

Demonstrate associations

Independent variable | Dichotomous outcome | Continuous outcome

categorical | Chi-square; Logistic regression | T-test/ANOVA; Linear regression

continuous | Logistic regression; T-test/ANOVA (backwards) | Correlation; Linear regression; Scatterplots
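For the categorical-by-dichotomous cell, a chi-square test on a 2x2 table can be computed by hand. The sketch below is illustrative only: the function name and counts are hypothetical, no continuity correction is applied, and the p-value formula is specific to 1 degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are treatment groups, columns are success / failure.
counts = [[20, 10], [10, 20]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```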

Prediction

• Dichotomous outcome

– Logistic regression*

– Sensitivities, specificities†

– ROC curves† (continuous predictor)

• Continuous outcome

– Linear regression*

– “Leave one out” statistics or cross validation†

* Predictive model building

† assessing predictive model
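The assessment tools for a dichotomous outcome can be sketched without any modelling library. Below, sensitivity and specificity are computed at one cut-off, and the area under the ROC curve is computed as the proportion of case/non-case pairs in which the case has the higher score. The scores, labels, cut-off, and function names are hypothetical.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify score >= threshold as positive; labels are 1 (event) or 0 (no event)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
print(f"AUC = {roc_auc(scores, labels):.4f}")
```

Assessing a model on the same data used to build it tends to be optimistic, which is why the slide pairs model building with cross validation.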

Reliability/Agreement

• Kappa statistic is commonly used for categorical data and two raters.

• Intra-class correlation coefficient for multiple raters

• If you have a ‘gold standard’ it makes the most sense to tabulate percent correct or average distance from correct.
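For two raters and categorical ratings, the kappa statistic compares observed agreement with the agreement expected by chance from each rater's marginal frequencies. A minimal sketch of the unweighted (Cohen's) version follows; the ratings and the function name are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_1, rater_2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_1)
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    # Chance agreement: product of the two raters' marginal proportions, summed.
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten teeth by two examiners.
examiner_1 = ["caries", "caries", "sound", "sound", "caries",
              "sound", "sound", "caries", "sound", "sound"]
examiner_2 = ["caries", "sound", "sound", "sound", "caries",
              "sound", "caries", "caries", "sound", "sound"]

print(f"kappa = {cohen_kappa(examiner_1, examiner_2):.3f}")
```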

more Reliability/Agreement

• If trying to demonstrate agreement between two continuous measures, the correlation coefficient is tangential at best

• Better to tabulate statistics related to mean pairwise differences between judges

• See

– Bland JM, Altman DG. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307-310.

– Available at http://www-users.york.ac.uk/~mb55/meas//ba.htm
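Statistics based on pairwise differences can be sketched as follows: the mean difference (bias) and approximate 95% limits of agreement in the spirit of the Bland and Altman approach. The function name and readings are hypothetical, and the 1.96 multiplier assumes the differences are roughly normally distributed.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Mean pairwise difference and approximate 95% limits of agreement."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical paired readings of the same subjects by two methods.
readings_a = [10.2, 9.8, 11.1, 10.5, 9.9]
readings_b = [10.0, 10.0, 10.8, 10.6, 9.7]

bias, lower, upper = limits_of_agreement(readings_a, readings_b)
print(f"bias = {bias:.2f}, limits of agreement ({lower:.2f}, {upper:.2f})")
```

Unlike a correlation coefficient, these numbers are in the units of the measurement, so one can judge directly whether the disagreement is small enough to matter.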

Other?

• Time-to-event data

– Kaplan-Meier survival estimate

– Cox regression

• Other other?
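The Kaplan-Meier survival estimate named above multiplies, at each observed event time, the fraction of at-risk subjects who survive that time. A minimal sketch follows; the follow-up times, censoring indicators, and function name are hypothetical, and no confidence bands are computed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is True if the event was observed at times[i],
    False if the subject was censored at that time.
    """
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up (years) for six restorations; False means censored.
follow_up = [2, 3, 3, 5, 7, 8]
failed = [True, True, False, True, False, True]

for t, s in kaplan_meier(follow_up, failed):
    print(f"t = {t}: estimated survival = {s:.3f}")
```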

Correlated Data Issues

• Data consist of “clusters” of correlated observations. This is common in dental studies (many teeth from same mouth)

• Common Solutions?

– Collapse data to independent units (patient-level averages)

– Adjust for correlation using generalized estimating equations (GEE) or mixed model regression approaches
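The first solution (collapsing to patient-level averages) is simple to sketch. The record layout, identifiers, and function name below are hypothetical; the point is only that the analysis then proceeds on one independent value per patient.

```python
from collections import defaultdict
from statistics import mean


def patient_level_means(records):
    """Collapse (patient_id, value) tooth-level records to one mean per patient."""
    by_patient = defaultdict(list)
    for patient_id, value in records:
        by_patient[patient_id].append(value)
    return {pid: mean(values) for pid, values in by_patient.items()}


# Hypothetical tooth-level measurements: several teeth per mouth.
tooth_records = [
    ("p1", 2.0), ("p1", 4.0),
    ("p2", 3.0), ("p2", 3.0), ("p2", 6.0),
    ("p3", 5.0),
]

print(patient_level_means(tooth_records))
```

Collapsing discards within-patient information, which is the trade-off against the GEE or mixed model approaches mentioned in the second solution.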

Homework for Feb. 29

• Following the guidelines presented in class today, present a concise description of your study and planned data analysis to the class.

• Plan to keep your talk under ____ minutes

• Limited office hours will be available with Dr. Kaiyala and me to help. Call or email us for appointments.