TRANSCRIPT
11/6/2014
1
Using free large datasets for research to change how medicine is practiced
Leonardo Tamariz, MD, MPH
University of Miami
BSFH Research Summit
November 7, 2014
Conflict of interest
• No conflicts of interest.
– IRB Chair at the Miami VA.
– Federal employee.
– Sources of funding: NIH, RWJ, UM medical education, Humana consulting fees.
Educational objectives
• Identify the sources of large datasets that can potentially be used in clinical research.
• Compare and contrast the most commonly used large datasets in clinical research.
• Identify the statistical complexities of using large datasets in clinical research.
• Discuss the future of large datasets in the current regulatory and research environments.
Impact of the Affordable Care Act on big data
• Electronic health records
• Accountable care organizations
• Health information exchanges
• Comparative effectiveness
• PCORI
Where does data come from?
BIG DATA
• Administrative encounters
• Hospital EHR, tests
• Gym and wearable health monitors
• Social media and mobile apps
• Health care provider encounter
• Insurance company
• Home monitoring
• Surveys, health fairs and health risk assessments
The four Vs of big data
SAS. 2014
Complexities of big data
BIG BIG BIG DATA
Hospital 1 Hospital 2 Hospital 3
Urgent care center 1
Urgent care center 2
Outpatient center
How can we prepare for this complex process?
• Understanding the structure of your own data.
• Getting out of your comfort zone and working with other departments to pilot-test the use of parts of BIG DATA.
• Becoming familiar with existing large datasets that can help us answer clinical questions, and practicing finding answers in big data.
Can big data transform population, patient centered and personalized care?
• Big data: Massive quantities of health care data accumulating from patients or populations.
• Population based medicine: Assessment of the health status and health needs of a target population in a way that is consistent with the community’s cultural, policy, and health resource values.
• Patient centered health care: Transparency, individualization, recognition, respect, dignity, and choice in the patient’s health care experience.
• Personalized medicine: tailoring of medical care to the individual characteristics, needs, and preferences of a patient during all stages of care, including prevention, diagnosis, treatment, and follow-up.
Big data influence
Big data
• Most commonly prescribed anticoagulants in AF were NOACs.
• Serious bleeding events.
Population
Hospital study
• Asked questions about anticoagulant preference.
Patient centered / personalized medicine
• More informed decisions
• Discuss rationale for options
Anticoagulant preference
[Bar chart, axis 0 to 40, comparing non-warfarin users and warfarin users on: medication with antidote; medication with best QoL; physician decides; medication longer in market; more information before decision; medication with lower stroke risk.]
Palacio and Tamariz. Patient preference and adherence. In press
The large dataset
Structured data
• Medical file (outpatient): ICD-9 codes, CPT codes
• Medical file (inpatient): ICD-9 codes, CPT codes
• Pharmacy file: medication, refills
• Member file: cost, race
• Lab file: results
Unstructured data
• Text files: reports (radiology reports, surgical reports, anesthesia reports)
Sequence of steps in a research project
• Conceptualization
• Funding
• Planning/Design
• Execution
• Interpretation
• Reporting
Abstracts, Presentation, Publication
Data Collection & Processing
Data Analysis
Research cycle: The way we should not do research
Research question
Funding
Execute
Collect data
Publish
Research cycle: the way we should do research
Funding
Collect data
Execute
Publish
Research question
Health care encounter
Research Opportunities for Health Care Systems
• Large complex datasets.
• Enough power to find differences.
• Can focus on health disparities.
• Can focus on quality of care.
Ultimate goal: Improve care and science
Research Opportunities for health care systems
Identify sources of data
Create usable data
Analyze
Outcomes
Quality of care
Health disparities
Advantages of large datasets
• Large sample sizes.
• Fast.
• Provide population estimates.
• Test trends over time.
Disadvantages of large datasets
• Not designed for research.
• Non-randomized.
• Require special statistical and data management skills.
FDA Amendments Act
• Mandates that the FDA establish the capacity to use electronic health data to assess the safety of marketed drugs
– Data covering at least 100 million people required by mid-2012
• FDA is addressing drugs, biologics, and devices
Mini-Sentinel Data Partners
The Mini-Sentinel Distributed Database
• Populations with well-defined person-time for which medically-attended events are known
• 150 million individuals
– 360 million person-years of observation time (2000-2013)
– 4.1 billion pharmacy dispenses
– 4 billion clinical encounters
Mini-Sentinel Distributed Analysis
1- User creates and submits query (a computer program)
2- Data partners retrieve query
3- Data partners review and run query against their local data
4- Data partners review results
5- Data partners return results via secure network
6- Results are aggregated
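The six steps of the distributed analysis can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not Mini-Sentinel's actual software: the partner names, the record layout and the `count_patients_with_code` query are hypothetical. What it tries to show is that patient-level rows stay with each data partner and only aggregate counts travel back.

```python
from typing import Callable, Dict, List

# Hypothetical patient-level records held locally by each data partner.
# Partner names, field names and values are invented for illustration.
PARTNER_DATA: Dict[str, List[dict]] = {
    "partner_a": [
        {"patient_id": 1, "icd9": "427.31"},
        {"patient_id": 2, "icd9": "401.9"},
    ],
    "partner_b": [
        {"patient_id": 7, "icd9": "427.31"},
        {"patient_id": 8, "icd9": "427.31"},
        {"patient_id": 9, "icd9": "250.00"},
    ],
}

Query = Callable[[List[dict]], Dict[str, int]]


def count_patients_with_code(code: str) -> Query:
    """Step 1: the user writes a query (a small program)."""
    def run(records: List[dict]) -> Dict[str, int]:
        patients = {r["patient_id"] for r in records}
        cases = {r["patient_id"] for r in records if r["icd9"] == code}
        # Only aggregate counts leave the partner, never the rows themselves.
        return {"cases": len(cases), "patients": len(patients)}
    return run


def run_distributed(query: Query) -> Dict[str, int]:
    """Steps 2 to 6: partners run the query locally, counts are aggregated."""
    total = {"cases": 0, "patients": 0}
    for records in PARTNER_DATA.values():
        local_result = query(records)       # steps 2-4: retrieve, run, review
        for key, value in local_result.items():
            total[key] += value             # steps 5-6: return and aggregate
    return total


result = run_distributed(count_patients_with_code("427.31"))
print(result)
```

A real network would add review and approval at each partner plus a secure transport layer; both are omitted here.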
Validity of the data
Disease | ICD-9 code | PPV | Sensitivity
Atrial fibrillation | 427.31 | 89% | 79%
Cardiac arrhythmias | 427.x | 85%
Cardiac arrhythmias | 427.x and 798 | 92%
Tamariz et al. PDS. 2012; Jensen et al. PDS. 2012
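As a reminder of what PPV and sensitivity measure, here is a minimal sketch of how both are computed when a diagnosis code is checked against a reference standard such as chart review. The counts are hypothetical and are not taken from the cited validation studies.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: of all patients with the code,
    the share who truly have the disease."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity: of all patients who truly have the disease,
    the share who carry the code."""
    return true_pos / (true_pos + false_neg)


# Hypothetical chart-review results for one diagnosis code.
coded_and_confirmed = 90      # true positives
coded_not_confirmed = 10      # false positives
confirmed_but_not_coded = 30  # false negatives

code_ppv = ppv(coded_and_confirmed, coded_not_confirmed)
code_sensitivity = sensitivity(coded_and_confirmed, confirmed_but_not_coded)
print(f"PPV {code_ppv:.0%}, sensitivity {code_sensitivity:.0%}")
```

A high PPV with lower sensitivity means that patients found by the code are mostly real cases, but that some real cases are missed.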
Research scenarios
Prevalence of atrial fibrillation in minorities
Use of catheter ablation in AF in minorities
Impact of warfarin use in minorities on bleeding and QoL after ablation
Steps in the process
• Determine interest area
• Search for existing databases
• Learn the database
– Data documentation manuals, CDs, web
• Derive research question(s)
• Conduct analyses
– Statistical consultation, programming
Sources of large databases
Public/Federal (free)
• Surveillance, Epidemiology and End Results program (SEER).
• National Health and Nutrition Examination Survey (NHANES).
• Medicare.
• Healthcare Cost and Utilization Project (HCUP).
• Florida datasets.
• Veterans Affairs data.
• Clinical trials.
• Your own data.
Not public (not free)
• Insurance companies
Data comparisons: Characteristics
Data source | Sponsor | Disease | Goal | Population based | Representative | Communication | Costs
SEER | NCI | Cancer | Research: Incidence and cancer survival | Yes | Yes | Medicare, HCUP, MEPS, SS, census | Free
NHANES | CDC | Not specific | Evaluate health and nutrition | Yes | Yes | | $
Medicare | CMS | All | Administrative | No | Yes | Census, SS | $
HCUP/MEPS | AHRQ | | Quality and cost | Yes | | | $
Florida | State of Florida | All | Administrative | No | No | HCUP | Free
VA | DVA | All | Administrative | No | Yes | Medicare | Free
Insurance company | Private | All | Administrative | No | Yes/No | SS, Census, Medicare | $$$
Data comparisons: Variables
Data source | Sample size | Race/ethnicity | Diseases | Procedures | Medications | Follow-up | Lab result | QoL
SEER | 6 million | | Limited to cancer | | | | |
NHANES | Varies | | Chronic diseases | | | | |
Medicare | Varies | | All | | | | |
HCUP | 7 million | | All | | | | |
Florida | 3 million | | All | | | | |
VA | Varies | | All | | | | |
Insurance company | Varies | | All | | | | |
Can you answer the research questions?
Data source | Prevalence of AF in minorities | Use of catheter ablation in minorities | Warfarin after catheter ablation and QoL
SEER | No | No | No
NHANES | No | No | No
Medicare | Yes | Yes | No
HCUP/MEPS | Yes | Yes | No
Florida | Yes | Yes | No
VA | Yes | Yes | No
Insurance company | Yes | Yes | No
Research scenarios
Prevalence of atrial fibrillation in minorities
Use of catheter ablation in AF in minorities
Impact of warfarin use in minorities on bleeding and QoL after catheter ablation
HCUP (13,967,949): Whites 3.4%, Blacks 1.83%, Hispanics 1.43%
Florida (1,020,049): Whites 0.89%, Blacks 0.53%, Hispanics 0.74%
Dewland et al. Circulation. 2013; Tamariz et al. Clinical Cardiology. 2014
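The first scenario, prevalence of atrial fibrillation by racial or ethnic group, comes down to counting patients with the diagnosis code within each group. The sketch below assumes a hypothetical record layout (`race`, `icd9_codes`) and invented patients. It uses ICD-9 code 427.31 for atrial fibrillation, the code listed on the "Validity of the data" slide, and does not reproduce the published analyses.

```python
from collections import Counter
from typing import Dict, Iterable

AF_CODE = "427.31"  # ICD-9 code for atrial fibrillation

# Hypothetical discharge records, one per patient; values are invented.
patients = [
    {"race": "White", "icd9_codes": {"427.31", "401.9"}},
    {"race": "White", "icd9_codes": {"250.00"}},
    {"race": "Black", "icd9_codes": {"427.31"}},
    {"race": "Black", "icd9_codes": {"401.9"}},
    {"race": "Black", "icd9_codes": {"272.4"}},
    {"race": "Hispanic", "icd9_codes": {"401.9"}},
]


def prevalence_by_group(rows: Iterable[dict], code: str) -> Dict[str, float]:
    """Share of patients in each group who carry the given diagnosis code."""
    totals: Counter = Counter()
    cases: Counter = Counter()
    for row in rows:
        totals[row["race"]] += 1
        if code in row["icd9_codes"]:
            cases[row["race"]] += 1
    return {group: cases[group] / totals[group] for group in totals}


prevalence = prevalence_by_group(patients, AF_CODE)
for group, value in prevalence.items():
    print(f"{group}: {value:.1%}")
```

With administrative data the denominator is whoever appears in the dataset, so the estimate describes that population (for example, hospitalized patients) rather than the community at large.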
Research Opportunities for Health Care Systems
• Large complex datasets.
• Enough power to find differences.
• Can focus on health disparities.
• Can focus on quality of care.
Ultimate goal: Improve care and science
The ideal dataset
Claims data
• Medical file (outpatient): ICD-9 codes, CPT codes
• Medical file (inpatient): ICD-9 codes, CPT codes
• Pharmacy file: medication, refills
• Member file: cost, race
Clinical data
• Detailed events
• Severity of disease
• Results of diagnostic tests
Survey, POCR data
• Quality of life
• Satisfaction
Repository
• Lab file: results
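Building a dataset like this mostly means linking separate files on a shared patient key. The sketch below is a minimal illustration under assumptions: the `member_id` key, the field names and the sample rows are all hypothetical, and real linkage also has to deal with mismatched or missing identifiers.

```python
from typing import Dict, List

# Hypothetical extracts of the separate files; all values are invented.
member_file = [
    {"member_id": "M1", "race": "Hispanic"},
    {"member_id": "M2", "race": "White"},
]
medical_file = [
    {"member_id": "M1", "setting": "inpatient", "icd9": "427.31"},
    {"member_id": "M1", "setting": "outpatient", "icd9": "401.9"},
    {"member_id": "M2", "setting": "outpatient", "icd9": "250.00"},
]
pharmacy_file = [{"member_id": "M1", "medication": "warfarin", "refills": 3}]
lab_file = [{"member_id": "M1", "test": "INR", "result": 2.4}]


def link_files(members: List[dict], **files: List[dict]) -> Dict[str, dict]:
    """Group the rows of every file under the member they belong to."""
    linked = {
        m["member_id"]: {"member": m, **{name: [] for name in files}}
        for m in members
    }
    for name, rows in files.items():
        for row in rows:
            record = linked.get(row["member_id"])
            if record is not None:  # skip rows without a matching member
                record[name].append(row)
    return linked


linked = link_files(
    member_file, medical=medical_file, pharmacy=pharmacy_file, lab=lab_file
)
print(linked["M1"])
```

Once each patient's claims, pharmacy and lab rows sit together, clinical and survey data can be attached the same way if they carry the same key.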
Statistical considerations: Confounding
Statistical considerations: How to deal with confounding
Tamariz et al. AJC. 2011
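The slide does not spell out which adjustment method was used, so the following is only an illustration of one common approach, stratification with a Mantel-Haenszel pooled risk ratio. It is not the method of the cited paper. All counts are hypothetical and were chosen so that the exposure looks harmful in the crude comparison only because exposed patients are concentrated in the higher-risk stratum.

```python
from typing import List, Tuple

# Each stratum: (exposed_events, exposed_total, unexposed_events, unexposed_total)
Stratum = Tuple[int, int, int, int]

# Hypothetical counts, stratified by the confounder (for example, age group).
strata: List[Stratum] = [
    (40, 80, 10, 20),  # higher-risk stratum: risk 0.5 in both groups
    (2, 20, 8, 80),    # lower-risk stratum: risk 0.1 in both groups
]


def crude_risk_ratio(data: List[Stratum]) -> float:
    """Risk ratio ignoring the confounder (all strata pooled naively)."""
    exposed_events = sum(s[0] for s in data)
    exposed_total = sum(s[1] for s in data)
    unexposed_events = sum(s[2] for s in data)
    unexposed_total = sum(s[3] for s in data)
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


def mantel_haenszel_risk_ratio(data: List[Stratum]) -> float:
    """Mantel-Haenszel risk ratio, pooling stratum-specific comparisons."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in data)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in data)
    return numerator / denominator


crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"crude RR {crude:.2f}, adjusted RR {adjusted:.2f}")
```

The gap between the crude and adjusted estimates is the signature of confounding. In large non-randomized datasets the same idea is usually extended to many covariates at once with regression or propensity-based methods.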
Data models: Common data model
Data source 1
Data source 2
Data source 3
Common data model (CDM)
Observational Medical Outcomes Partnership
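The idea of a common data model can be sketched as a field mapping: each source keeps its own layout, and a per-source map renames its fields into one shared structure so that a single analysis program runs on all of them. The source layouts and the common field names below (`person_id`, `condition_code`, `event_date`) are hypothetical and are not the actual OMOP schema.

```python
from typing import Dict, List

# Hypothetical local layouts: same information, different field names.
source_1 = [{"pid": "A1", "dx": "427.31", "visit_date": "2013-02-01"}]
source_2 = [{"member": "B9", "icd9_code": "427.31", "service_dt": "2013-05-17"}]

# Per-source mapping from local field name to common field name (assumed names).
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "source_1": {
        "pid": "person_id",
        "dx": "condition_code",
        "visit_date": "event_date",
    },
    "source_2": {
        "member": "person_id",
        "icd9_code": "condition_code",
        "service_dt": "event_date",
    },
}


def to_common_model(source_name: str, rows: List[dict]) -> List[dict]:
    """Rename each row's fields into the shared layout and tag its origin."""
    mapping = FIELD_MAPS[source_name]
    converted = []
    for row in rows:
        record = {common: row[local] for local, common in mapping.items()}
        record["source"] = source_name
        converted.append(record)
    return converted


cdm_rows = to_common_model("source_1", source_1) + to_common_model("source_2", source_2)

# One analysis now works across sources because the layout is shared.
af_patients = [r["person_id"] for r in cdm_rows if r["condition_code"] == "427.31"]
print(af_patients)
```

Real common data models also standardize vocabularies and units, not just field names; that part is left out of this sketch.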
Data models: Distributed data model
Transform data
Analysis
Mini-Sentinel
Regulatory considerations of big data
Patient and their data
Minimise risk
Privacy
Maximise public benefit
Maintain public trust
Consent
Deidentification
Just an idea!
Others
Data model
Closing remarks
• Big data is here to stay.
• It would be a terrible disadvantage not to use it for research or quality of care.
• Preparing your own big data for research is complex and requires a process.
• There are several free datasets that can be used to answer research questions.