Division of Geriatrics Finding and Using Publicly Available Datasets for Secondary Data Analysis Research KL2 Seminar February 2011

Post on 19-Dec-2015


Division of Geriatrics

Finding and Using Publicly Available Datasets for Secondary Data Analysis Research

KL2 Seminar, February 2011

Disclosures and acknowledgements

Disclosures: None

Acknowledgements: Alex Smith, Michael McWilliams, Ann Nattinger, SGIM Research Committee

Two shout-outs

• Comparative Effectiveness Research through CTSI

Smith AK et al, JGIM 2011

Learning objectives

• Appreciate key conceptual and practical issues involved in secondary data analysis

• Identify and use online tools for locating and learning about publicly available datasets relevant to your research

• Focus on what is useful to you

(My) Definition of Secondary Data

Data that have been collected, but not for you

Types of Secondary Data

• Survey (NHIS, NHANES, HRS, BRFSS)

• Administrative (Medicare claims)

• Discharge (HCUP SID and NIS)

• Medical chart / EMR

• Disease registries (SEER)

• Aggregate (ARF, US Census)

• Research databases (SOF)

• Combinations and linkages

Key Conceptual Issues

• Someone else’s secondary data is your primary data

• Treat data and research plan with same rigor as would for a primary data collection study

• Research questions should be conceptually driven, interesting a priori

– Some exceptions – Warren Browner rule

• Know data as well as if you had collected it yourself

– Who is in the cohort?

– Strengths and limitations of data collection procedures, instruments

Selecting a Database

• Compatibility with research question(s)

• Availability and expense

• Sample: representativeness, power

• Measures of interest present and valid

• Messiness and missingness

• Local expertise

• Linkages
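One way to put numbers on the "messiness and missingness" criterion is to tabulate, per variable, the share of records with no usable value before committing to a dataset. The sketch below is illustrative only: the field names, the toy records, and the set of missing-value codes are all assumptions, and a real dataset's codebook defines its own codes.

```python
from collections import Counter

# Values treated as missing. These codes are assumptions for illustration;
# each dataset documents its own missing-value codes in its codebook.
MISSING_CODES = {None, "", "NA"}

def missingness_report(records, variables):
    """Return {variable: fraction of records where the value is missing}."""
    if not records:
        return {v: 0.0 for v in variables}
    missing = Counter()
    for row in records:
        for v in variables:
            # A variable absent from the row counts as missing (get -> None).
            if row.get(v) in MISSING_CODES:
                missing[v] += 1
    return {v: missing[v] / len(records) for v in variables}

# Hypothetical toy extract; field names are made up.
toy = [
    {"age": 71, "race_eth": "A", "afib": 1},
    {"age": None, "race_eth": "B", "afib": 0},
    {"age": 80, "race_eth": "", "afib": None},
    {"age": 65, "race_eth": "A", "afib": 0},
]
report = missingness_report(toy, ["age", "race_eth", "afib"])
for name, frac in report.items():
    print(f"{name}: {frac:.0%} missing")
```

A table like this, run on the key exposure and outcome measures, is a quick feasibility check; it says nothing about why values are missing, which still requires reading the documentation.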

Resources Needed

• Your effort

• Computer resources and security

• Programmer and/or statistician effort

• PhD statistical support – complex sampling or analyses

• Coordinator if merging datasets

• Realistic timeline / Gantt chart

Cases

• Amita is a junior faculty member interested in doing a secondary data analysis project on the association between race/ethnicity and the prevalence and outcomes of atrial fibrillation. No prior experience and limited direct mentorship.

• Eric is a junior faculty member with past experience. Wants to find a new dataset around which to write a grant on the association between SES and ADL function in elders.

Amita – Getting Started

• Amita

– Get acquainted with basics

– Find dataset and assess merit and feasibility

– Find a mentor / get expert help

– www.sgim.org/go/datasets

Get Acquainted with Basics

Find a Dataset, Assess Merit & Feasibility

CARDIA

CARDIA

Get Expert Help

Getting Expert Help

• Request a consultation

– 1 on 1 consultation

– Clear, defined questions about dataset

• “strengths and weaknesses about using XYZ to study patterns of medication use for heart failure”

Eric – Getting Down to Business

• Identify datasets relevant to his research interests

• Identify health statistics, validated instruments, funding sources

• www.sgim.org/go/datasets

Finding Additional Resources

• National Information Center on Health Services Research and Health Care Technology (NICHSR)

• Inter-University Consortium for Political and Social Research (ICPSR)

• Partners in Information Access for the Public Health Workforce

• Roadmap K-12 Data Resource Center (UCSF)

• List of datasets from the American Sociological Association

• Canadian Research Data Centers – Data Sets and Research Tools (Canada)

• Directory of Health and Human Services Data Resources

• Publicly Available Databases from National Institute on Aging (NIA)

• Publicly Available Databases from National Heart, Lung, & Blood Institute (NHLBI)

• National Center for Health Statistics (NCHS) Data Warehouse

• Medicare Research Data Assistance Center (RESDAC); and Centers for Medicare and Medicaid Services (CMS) Research, Statistics, Data & Systems

• Veterans Affairs (VA) data

CELDAC

• Comparative Effectiveness Large Dataset Analysis Core

– UCSF CTSI

• Access to local and national datasets and expertise

http://ctsi.ucsf.edu/research/celdac

National Information Center on Health Services Research and Health Care Technology (NICHSR)

• Databases, data repositories, health statistics

• Fellowship and funding opportunities

• Glossaries, research and clinical guidelines

• Evidence-based practice and health technology assessment

• Specialized PubMed searches on healthcare quality and costs

http://www.nlm.nih.gov/hsrinfo/index.html

ISPOR

• International Society for Pharmacoeconomics and Outcomes Research

http://www.ispor.org/DigestOfIntDB/CountryList.aspx

Inter-University Consortium for Political and Social Research (ICPSR)

• World’s largest archive of social science data

• Searchable

• Many sub-archives relevant to HSR

– Health and Medical Care Archive

– National Archive of Computerized Data on Aging

http://www.icpsr.umich.edu/icpsrweb/ICPSR/access/index.jsp

Questions?

• Specific high-value datasets

• Causal inference / comparative effectiveness

• Which comes first – RQ or dataset?

• Evaluating and managing validity of measures

• Analyzing complex survey data

EXTRA SLIDES

• Additional brief information about specific high-value datasets

– VA administrative data

– NHANES

– NAMCS

– NIS

Administrative Data (VA)

• VA has multiple high-value administrative databases

– Outpatient visit information

• Visit date, type of clinic, provider, ICD9 diagnoses

– Inpatient information

• Admitting dx(s), discharge dx(s), CPT codes, bed section, meds administered

– Lab data

• >40 labs

– Pharmacy data

• All inpatient and outpatient fills

– Academic affiliation

– etc

Administrative Data (VA)

• Huge bureaucracy and paperwork

Administrative Data (VA)

• Messy data

• Huge size

– 2 TB server

• Data analyst

Survey Data (NHANES)

• National Health and Nutrition Examination Survey (NHANES)

– Nationally representative sample of >10K patients every 2 years

– Extensive interview data on clinical history (including diseases, behaviors, psychosocial parameters, etc.)

– Physical exam information (e.g. VS)

– Labs, biomarkers

Survey Data (NHANES)

• Free and easy to download

• (Relatively) easy to use

– Although requires careful reading of documentation

• Serial cross-sectional

• Disease data self-report

• Very limited information about providers and systems of care
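Part of what the documentation covers for a survey like this is the sampling weights: respondents are not sampled with equal probability, so an unweighted proportion can differ from the population estimate. The sketch below shows only a weighted point estimate on made-up records. The field names `diabetes` and `wt` are hypothetical, and design-based standard errors (strata, sampling units) are deliberately out of scope and need survey software plus statistical support.

```python
def weighted_prevalence(records, outcome, weight):
    """Weighted proportion of records with outcome == 1.

    Records missing the outcome or the weight are skipped.
    This yields a point estimate only, not a design-based standard error.
    """
    num = 0.0
    den = 0.0
    for row in records:
        y, w = row.get(outcome), row.get(weight)
        if y is None or w is None:
            continue
        num += w * y
        den += w
    if den == 0:
        raise ValueError("no usable records")
    return num / den

# Hypothetical toy records: each weight is the number of people in the
# population that the respondent stands for.
toy = [
    {"diabetes": 1, "wt": 1000.0},
    {"diabetes": 0, "wt": 3000.0},
    {"diabetes": 1, "wt": 500.0},
    {"diabetes": 0, "wt": 500.0},
]
unweighted = sum(r["diabetes"] for r in toy) / len(toy)
weighted = weighted_prevalence(toy, "diabetes", "wt")
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

In this toy example half of the respondents report the condition, but they represent a smaller share of the population once the weights are applied, which is why the two numbers differ.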

Survey Data (NAMCS)

• National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS)

• Nationally representative sample of ~70K outpatient and ED visits per year

• Physician-completed form about office visit

Survey Data (NAMCS)

• Data more from physician perspective (diagnoses, treatments Rx’ed, etc) and some info on providers (e.g., clinic organization, use of EMRs, etc)

• Serial cross-sectional

– Visit-focused

– Not comprehensive, ? value for chronic diseases

Discharge Data (NIS)

• National Inpatient Sample (NIS)

– Database of inpatient hospital stays collected from ~20% of US community hospitals by AHRQ

– Diagnoses and procedures, severity adjustment elements, payment source, hospital organizational characteristics

– Hospital and county identifiers that allow linkage to the American Hospital Association Annual Survey and Area Resource File
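Linkage by a shared identifier can be sketched as a simple lookup join: attach hospital-level characteristics to each discharge record and keep track of records that fail to match. Everything named here is hypothetical (the `hosp_id` key, the `beds` and `teaching` fields, the toy rows); the real identifier names and file layouts come from each dataset's documentation.

```python
def link_hospital_traits(discharges, hospitals, key="hosp_id"):
    """Attach hospital-level fields to discharge records by a shared key.

    Returns (linked, unmatched). Discharge fields win on name collisions.
    """
    by_id = {h[key]: h for h in hospitals}
    linked, unmatched = [], []
    for d in discharges:
        h = by_id.get(d[key])
        if h is None:
            # Keep unmatched records visible rather than dropping them silently.
            unmatched.append(d)
        else:
            merged = dict(h)
            merged.update(d)
            linked.append(merged)
    return linked, unmatched

# Hypothetical toy files; field names are made up.
discharges = [
    {"hosp_id": "H1", "dx": "AF", "los_days": 3},
    {"hosp_id": "H2", "dx": "HF", "los_days": 6},
    {"hosp_id": "H9", "dx": "AF", "los_days": 2},
]
hospitals = [
    {"hosp_id": "H1", "beds": 300, "teaching": True},
    {"hosp_id": "H2", "beds": 80, "teaching": False},
]
linked, unmatched = link_hospital_traits(discharges, hospitals)
print(len(linked), "linked;", len(unmatched), "unmatched")
```

Reporting the unmatched count is a deliberate choice: a high non-match rate is itself a data-quality finding and can bias the linked sample.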

Discharge Data (NIS)

• Relatively easy to access (DUA, $200/yr)

• Relatively easy to use

– Though need close attention to documentation

• Limited data elements

• Huge data files