Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD
DESCRIPTION
Tuesday, October 23, 2012
Panel Discussion: Big Data
Moderator: Roozbeh Jafari, PhD – Electrical Engineering, UT Dallas
Panelists:
• Holly Jimison, PhD – Medical Informatics & Clinical Epidemiology, OHSU
• James McClain, PhD – Physical Activity Epidemiologist, Risk Factor Monitoring & Methods Branch, National Cancer Institute (NCI)
• Lucila Ohno-Machado, MD, PhD – Associate Dean for Informatics & Technology, School of Medicine; Founding Chief, Division of Biomedical Informatics; Professor of Medicine, UC San Diego
TRANSCRIPT
![Page 1: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/1.jpg)
The analyses upon which this publication is based were performed under Contract Number HHSM-500-2009-00046C sponsored by the Center for Medicare and Medicaid Services, Department of Health and Human Services.
Panel Discussion: Big Data at the iDASH Center
Lucila Ohno-Machado, MD, PhD
Division of Biomedical Informatics
University of California San Diego
Editor-in-Chief, Journal of the American Medical Informatics Association
Wireless Health 2012
![Page 2: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/2.jpg)
21st Century Healthcare
What is the influence of genetics, environment?
Which therapies work best for individual patients?
![Page 3: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/3.jpg)
Patient-Centered Outcomes Research
• Genome
– Sequencing data
• Phenotype
– Personal monitoring (blood pressure, glucose)
– Personal health records
– Behavior monitoring (adherence to medication, exercise)
• Environment
– Air sensors, food quality
– Location
Source: DOE
![Page 4: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/4.jpg)
Where does knowledge come from?
• Small controlled studies with strict eligibility criteria
• Does this apply to my patient?
Hopefully, but we need a lot of data to answer this question:
• We need to build infrastructure to access large data repositories
– Lower the barriers to share data
• We need to share tools to analyze the data
– Algorithms and computational facilities
![Page 5: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/5.jpg)
Big Data, Small Data, and Other Data
• Data integration across biological scales
• Data analysis from multiple sources
• Data ‘anonymization’ and privacy preservation
[Figure: biological scales and their data sources. Labels: Genotype (Genome); Transcription; RNA (Transcriptome); Translation; Protein (Proteome); Biomarkers (Lab); Phenotype (Clinical Data); Population (Registries).]
![Page 6: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/6.jpg)
Clinical Translational Science
• Integration of Clinical Data Warehouses from 5 University of California Medical Centers and affiliated institutions (>10 million patients)
– Aggregate and individual-level patient data will be accessible according to data use agreements and IRB approval
• Objectives
– Monitor patient safety
– Improve outcomes
– Promote research
Funded by the UC Office of the President to the NIH-funded CTSAs
![Page 7: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/7.jpg)
Data for Personalized Medicine
Handling Protected Health Information - Secure Electronic Environment
• Electronic Health Records
• Genetic Data
Prevention, Diagnosis and Therapy
– Genetic predisposition
– Biomarkers
– Pharmacogenomics
– Health records
– Sensors
![Page 8: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/8.jpg)
Sharing Data
• Data use agreements across institutions
– Limited and complicated
– Specific to a particular study
– Resources for sharing are limited
– Security/privacy constraints are hard for small institutions to follow
• Sharing data today
– Little incentive
– Only one model: users download data
– Yes/No decision on sharing
![Page 9: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/9.jpg)
iDASH
![Page 10: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/10.jpg)
Mission
“A national center for biomedical computing that develops new algorithms, open-source tools, computational infrastructure, and services that will enable biomedical and behavioral researchers nationwide to integrate Data for Analysis, ‘anonymization,’ and Sharing”
![Page 11: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/11.jpg)
Models for Data Sharing
• Cloud Storage: data exported for computation elsewhere
– Users download data from the cloud
• Cloud Compute and Virtualization: computation goes to the data
– Users analyze data in the cloud
– Users download virtual machines
Funded by NIH U54HL108460
![Page 12: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/12.jpg)
Models for Sharing Data Access
Supported by NIH Grant U54 HL108460 to the University of California, San Diego
[Diagram: contributors to iDASH. Labels: Data Owner contributing data 1 (Contributor DUA, QA); Tool Creator contributing tool 1, tool 2, tool 3 (Contributor QA); System Creator contributing VM 1, VM 2 (Contributor QA); access control.]
DUA: Data Use Agreement
QA: Quality Assurance
![Page 13: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/13.jpg)
Models for Sharing Data Access
MODEL 1. User downloads iDASH data
[Diagram: the contributor layout (Data Owner, Tool Creator, System Creator, data 1, data 2, tools 1 to 3, VM 1, VM 2, access control) with User A added. Labels for User A: User DUA, data 2, tool A.]
![Page 14: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/14.jpg)
Models for Sharing Data Access
MODEL 1. User downloads iDASH data
MODEL 2. User computes with iDASH hosted data in iDASH environment
[Diagram: the contributor layout with User A (User DUA, data 2, tool A) and User B (User DUA) added.]
![Page 15: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/15.jpg)
Models for Sharing Data Access
MODEL 1. User downloads iDASH data
MODEL 2. User computes with iDASH hosted data in iDASH environment
MODEL 3. User performs iDASH computation in his own environment
[Diagram: the contributor layout with User A (User DUA, data 2, tool A), User B (User DUA), and User C (VM 2, data C) added.]
![Page 16: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/16.jpg)
Quality Improvement, Health Services Research
Count queries and statistics across data warehouses
• How many patients over 65 are on Warfarin or Dabigatran?
• What are the major and minor bleeding rates for patients on these drugs?
[Diagram: a user requests data for Quality Improvement or Research ("Are the data accessible?"). Labels: Trusted Broker(s); Security Entity; Identity & Trust Management; Policy enforcement; Diverse Healthcare Entities in 3 different states (federal, state, private).]
AHRQ R01HS19913 / EDM forum
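The count-query idea on this slide can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the slides: the `Patient` and `Site` classes, the site names, and the toy records are assumptions for illustration only. The one point it tries to show is that each warehouse answers with a number, and only those numbers are combined.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Patient:
    # Hypothetical minimal record; a real warehouse schema would differ.
    age: int
    drugs: frozenset


class Site:
    """One data warehouse. Patient rows stay inside this object."""

    def __init__(self, name: str, patients: List[Patient]):
        self.name = name
        self._patients = patients

    def count(self, predicate: Callable[[Patient], bool]) -> int:
        # Only an aggregate count leaves the site.
        return sum(1 for p in self._patients if predicate(p))


def federated_count(sites: List[Site],
                    predicate: Callable[[Patient], bool]) -> Tuple[int, Dict[str, int]]:
    """Ask every site for its local count and add the counts up."""
    per_site = {s.name: s.count(predicate) for s in sites}
    return sum(per_site.values()), per_site


def over_65_on_anticoagulant(p: Patient) -> bool:
    return p.age > 65 and bool(p.drugs & {"warfarin", "dabigatran"})


# Toy data, invented for the example.
sites = [
    Site("site_a", [Patient(70, frozenset({"warfarin"})),
                    Patient(50, frozenset({"warfarin"}))]),
    Site("site_b", [Patient(80, frozenset({"dabigatran"})),
                    Patient(66, frozenset())]),
    Site("site_c", [Patient(90, frozenset({"warfarin", "aspirin"}))]),
]

total, per_site = federated_count(sites, over_65_on_anticoagulant)
print(total, per_site)
```

A production system would also need the identity, trust, and policy checks named on the slide, and possibly suppression of very small counts; none of that is modeled here.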
![Page 17: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/17.jpg)
Adjusting for Confounders
Distributed regression models
Wu Y et al. Grid Binary LOgistic REgression (GLORE): Building Shared Models Without Sharing Data. JAMIA 2012
[Diagram: a user requests data for Quality Improvement or Research ("Are the data accessible?"). Labels: Trusted Broker(s); Security Entity; Identity & Trust Management; Policy enforcement; Diverse Healthcare Entities in 3 different states (federal, state, private).]
AHRQ R01HS19913 / EDM forum
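To make "building shared models without sharing data" concrete, here is a minimal sketch in the spirit of distributed logistic regression. It is not the GLORE implementation from the cited paper. It assumes a plain Newton-Raphson fit in which each site computes a gradient and an information matrix on its own rows and a coordinator adds them up. The function names and the toy data are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_summaries(X, y, beta):
    """Runs inside one site: returns only a gradient and an information matrix."""
    d = len(beta)
    grad = [0.0] * d
    info = [[0.0] * d for _ in range(d)]
    for xi, yi in zip(X, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, xi)))
        w = p * (1.0 - p)
        for j in range(d):
            grad[j] += xi[j] * (yi - p)
            for k in range(d):
                info[j][k] += w * xi[j] * xi[k]
    return grad, info


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit(sites, d, iters=25, tol=1e-10):
    """Coordinator: sums per-site summaries and takes Newton-Raphson steps."""
    beta = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        info = [[0.0] * d for _ in range(d)]
        for X, y in sites:
            g, h = local_summaries(X, y, beta)
            grad = [a + b for a, b in zip(grad, g)]
            info = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(info, h)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Toy data: each row is [intercept, x]; values are invented.
site_a = ([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]], [0, 0, 1, 0])
site_b = ([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]], [0, 1, 1, 1])

distributed = fit([site_a, site_b], d=2)
pooled = fit([(site_a[0] + site_b[0], site_a[1] + site_b[1])], d=2)
print(distributed, pooled)
```

Because the gradient and information matrix are sums over rows, adding per-site summaries gives the same update as pooling all rows in one place, so the two fits agree up to floating-point rounding. Whether intermediate summaries themselves leak information is a separate question that this sketch does not address.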
![Page 18: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/18.jpg)
Shared Services and Infrastructure
[Figure: layered service model. SaaS (Business/Service): healthcare professionals, end-user services. PaaS (Body/Platform): researchers, developers, collaborators. IaaS (Frame/Infrastructure): operators, developers, collaborators.]
• Software as a Service
• Platform
• Infrastructure
• Security & Policies
• Scalability & Reliability
• Flexibility & Extensibility
![Page 19: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/19.jpg)
Underlying Infrastructure
[Figure: iDASH service layers. SaaS: biomedical researchers, end-user services. PaaS: researchers, developers, collaborators. IaaS: iDASH operators, developers, collaborators.]
• Resource virtualization
• Security
• Scalability
• Flexibility
Figure courtesy of Dallas Thornton
![Page 20: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/20.jpg)
Cyberinfrastructure Security
• HIPAA (Health Insurance Portability and Accountability Act) compliant computing environment
• Segmentation (zones) of projects & functionality
• Physical and environmental protection of compute hardware
• Access control with two-factor authentication
• Secure (encrypted tunnel) system access and upload capability
• Centralized logging, intrusion detection
• Proxies and filters
• Hardened (secured) system configurations
![Page 21: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/21.jpg)
Shared Infrastructure
Research data from several institutions: clinical & genomic data hosting in a HIPAA-compliant facility
• 315 TB cloud and project storage for 100s of virtual servers
• 54 TB high-speed database and system storage; high-performance parallel databases
• 10 Gb redundant network environment; firewall and IDS to address HIPAA requirements
• Multiple-site encrypted storage of critical data
![Page 22: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/22.jpg)
Repository for Healthcare & Biomedical Data
![Page 23: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/23.jpg)
http://idash.ucsd.edu
![Page 24: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/24.jpg)
Patient-Centered Data Sharing
[Diagram: User U requests Data D on individual I for Quality Improvement or Research. Components: Trusted Broker(s) (identity management, trust management); Security Entity; Healthcare Entity with an Information Exchange Registry ("Are the data available?" Yes/No); Informed Consent Management System holding the preferences of Patient I at home ("Do I wish to disclose data D to U?" Yes/No); Privacy Registry for inspection ("I can check who or which entity looked (wanted to look) at the data for what reasons").]
AHRQ R01HS19913 / EDM forum NIH U54HL10846
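The request flow on this slide can be sketched as a small decision procedure. This is only an illustration under stated assumptions: the class names (`TrustedBroker`, `PatientConsent`, `AccessLogEntry`), the in-memory dictionaries, and the identifiers are all hypothetical, and the slides do not describe how the real components are implemented. The sketch checks availability, then the patient's consent preference, and records every attempt so the patient can inspect it later.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AccessLogEntry:
    user: str
    data_item: str
    purpose: str
    granted: bool


@dataclass
class PatientConsent:
    # (user, data item) pairs the patient has agreed to disclose.
    allowed: Set[Tuple[str, str]] = field(default_factory=set)


@dataclass
class TrustedBroker:
    # patient id -> data items a healthcare entity holds (stand-in for the
    # information exchange registry)
    available: Dict[str, Set[str]]
    # patient id -> preferences (stand-in for the consent management system)
    consents: Dict[str, PatientConsent]
    # patient id -> access attempts (stand-in for the privacy registry)
    privacy_registry: Dict[str, List[AccessLogEntry]] = field(default_factory=dict)

    def request(self, user: str, patient: str, data_item: str, purpose: str) -> bool:
        has_data = data_item in self.available.get(patient, set())
        consent = self.consents.get(patient, PatientConsent())
        granted = has_data and (user, data_item) in consent.allowed
        # Log granted and denied attempts alike, with the stated purpose.
        self.privacy_registry.setdefault(patient, []).append(
            AccessLogEntry(user, data_item, purpose, granted))
        return granted

    def inspect(self, patient: str) -> List[AccessLogEntry]:
        """What the patient sees: who asked for which data, and why."""
        return list(self.privacy_registry.get(patient, []))


broker = TrustedBroker(
    available={"patient_i": {"labs"}},
    consents={"patient_i": PatientConsent({("user_u", "labs")})},
)
ok = broker.request("user_u", "patient_i", "labs", "research")
denied = broker.request("user_v", "patient_i", "labs", "quality improvement")
log = broker.inspect("patient_i")
print(ok, denied, len(log))
```

Identity and trust management, which the slide assigns to the broker, are left out; the sketch simply takes the user name at face value.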
![Page 25: Panel Discussion: Big Data; Lucila Ohno-Machado, MD, PhD](https://reader033.vdocuments.site/reader033/viewer/2022061206/5482aaecb4af9f7d0d8b481e/html5/thumbnails/25.jpg)
Acknowledgements
• Slides contributed by Brian Chapman, Claudiu Farcas, Dallas Thornton, Danielle Mowery, Hyeon-eui Kim, Jihoon Kim, Kamalika Chaudhuri, Natasha Balac, Ron Joyce, Shuang Wang, Staal Vinterbo, Vineet Bafna, Wendy Chapman, Winston Armstrong, Xiaoqian Jiang
• Division of Biomedical Informatics
• Funding by NIH, AHRQ, PCORI, UCOP, UCSD