Somalee Datta, PhD
Director, Research IT,
SoM Information Resources & Technology
Intersection of Healthcare, Machine Learning and Big Data
(or the frontier of precision health)
New era for precision health
● Electronic Medical Records are ubiquitous in healthcare
● Cost of genome sequencing has reduced a million-fold
● Wearable sensors and IoMT are collecting billions of measurements a day
● Cloud has democratized access to petascale computing
● Data science is a cool career
Data, Computing, Scientists
The SUMEX-AIM project (Stanford University Medical EXperimental computer for Artificial Intelligence in Medicine) was a national computer resource funded by NIH between 1973-92.
INTERNIST (Pittsburgh, 1974): Simulate the reasoning processes used by human physicians.
Healthcare at XLDB
Healthcare sessions 1 & 2
● Electronic Medical Records
● Genomics
● Population Health
● Imaging
● Internet of (Medical) Things
● Policies
● Multi-modal analytics
Panel
Data / Tools / publications / sharing
Collaboration session
● Data biosphere
● Data sharing made easy
● “Model Commons”
Lightning talks
Quilt Data, Catalog Tech (DNA as storage medium)
Biomedical data analysis brings novel benefits to human health
Imaging and AI
Dec 2016: Original Investigation
Apr 2016 - Mar 2017 statistics:
● 4M outpatient visits
● 450K+ surgeries
● 2/3 of the outpatient visits and 3/4 of the surgeries were provided to the poor nearly free of cost
Dec 2017: Potential to make planet scale impact
400 million diabetic patients in the world; 70 million in India; 30% estimated to develop DR (diabetic retinopathy)
Apr 2018: FDA approves AI based diagnostic
Apr 11, 2018: FDA permits marketing of AI-based device to detect diabetic retinopathy
Stanford radiology data
Stanford data (0.5 PB):
● 5-6M studies
● 1B+ images
Stanford research: Algorithms diagnose pneumonia better than radiologists
Stanford researchers develop a convolutional neural network algorithm that can diagnose up to 14 types of medical conditions from a chest X-ray image.
Road to Next Gen Sequencing based diagnostics
Cost of Genome Sequencing Reduced a Million-Fold
2016: NASA Astronaut Kate Rubins sequenced DNA in space, using the MinION sequencing device (Oxford Nanopore).
The first human genome sequence (HGP, 1990-2003) took 13 years and $2.7B; now it takes a day and $1000.
https://bit.ly/1U15yDW
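The "million-fold" claim can be checked with quick arithmetic on the two figures quoted above. This is a minimal sketch; the variable names are illustrative, and it assumes the $2.7B HGP cost and the $1000 present-day cost are directly comparable.

```python
# Figures quoted on the slide; variable names are illustrative.
hgp_cost_usd = 2.7e9        # first human genome (HGP, 1990-2003)
current_cost_usd = 1_000    # approximate cost of sequencing a genome now
hgp_days = 13 * 365         # 13 years, ignoring leap days
current_days = 1

cost_reduction = hgp_cost_usd / current_cost_usd
time_reduction = hgp_days / current_days

print(f"Cost reduction: {cost_reduction:,.0f}-fold")
print(f"Time reduction: {time_reduction:,.0f}-fold")
```

On these numbers the cost has fallen by a factor of about 2.7 million and the turnaround time by a factor of several thousand.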
Authoritative Human Genomes (“Gold standard”)
● NIST RM 8398: female with European ancestry
● NIST RM 8391: male of Eastern European Ashkenazic Jewish ancestry
● NIST RM 8392: son, father and mother from a family of Eastern European Ashkenazic Jewish ancestry (the son’s genome is the same one released as NIST RM 8391)
● NIST RM 8393: male of East Asian (Chinese) ancestry
“Gold standard” has brought algorithmic innovations
https://blog.dnanexus.com/2017-12-05-evaluating-deepvariant-googles-machine-learning-variant-caller/
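A "gold standard" genome makes it possible to score a variant caller by comparing its calls with the reference truth set. The sketch below shows the basic idea using precision, recall and F1 over exact-match variants. The function name and the toy variants are hypothetical. Real benchmarking tools are more involved (for example, they reconcile different representations of the same variant), so treat this as an illustration only.

```python
def evaluate_calls(truth, calls):
    """Score called variants against a truth set using exact matching."""
    truth, calls = set(truth), set(calls)
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy variants as (chromosome, position, ref, alt); entirely made up.
truth = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr2", 900, "T", "C"),
}
calls = {
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 500, "G", "A"),
    ("chr3", 42, "A", "T"),
}

result = evaluate_calls(truth, calls)
print(result)
```

With a shared truth set, different callers can be ranked on the same metrics, which is what allows algorithmic improvements to be measured.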
Clinically relevant is a moving target!
● 15+ years of research
● $100+ million in investment
● ~60% of genome is functional, <1% understood
Mar 2018
In 2018, there will be ~1.7M new cancer cases diagnosed and ~600K cancer deaths in the US.
● Men (19% Prostate), Women (30% Breast)
● All (13-14% Lung)
Stanford Clinical Genomics Service
Pilot: 2014-2016
Preproduction: 2017
Launch: Mar 2018
Estimated impact: 5000 patients / year
Leads to diagnosis in 25-30% of the undiagnosed cases
Ethics and Privacy
Ethics: one lapse is one too many
“But even one lapse is one too many.” - Donna Shalala, Secretary of the U.S. Department of Health and Human Services, NEJM, 2000
In Sep 1999, teenager Jesse Gelsinger died in a gene therapy trial at UPenn
Why is privacy a hard problem?
P. Golle: Proceedings of the 5th ACM Workshop on Privacy in the Electronic Society. 2006: 77-80.
Probability of re-ID with gender known:
Date of birth    5-digit zip code    County
YYYY             0.2%                0.0%
MM/YYYY          4.2%                0.2%
MM/DD/YYYY       63.3%               14.8%
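The re-identification probabilities above come from asking what fraction of people are unique on a combination of quasi-identifiers (gender, location, date of birth). The sketch below applies that idea to a handful of made-up records. The function names, record layout and data are hypothetical, and this is not the method or the data used in the cited study.

```python
from collections import Counter


def truncate_dob(dob, granularity):
    """Reduce a 'YYYY-MM-DD' date of birth to the part assumed to be known."""
    year, month, day = dob.split("-")
    if granularity == "YYYY":
        return year
    if granularity == "MM/YYYY":
        return f"{month}/{year}"
    if granularity == "MM/DD/YYYY":
        return f"{month}/{day}/{year}"
    raise ValueError(f"unknown granularity: {granularity}")


def fraction_unique(records, granularity):
    """Fraction of records that are unique on (gender, zip, truncated DOB)."""
    keys = [(r["gender"], r["zip"], truncate_dob(r["dob"], granularity))
            for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)


# Toy records; entirely made up for illustration.
records = [
    {"gender": "F", "zip": "94305", "dob": "1980-03-14"},
    {"gender": "F", "zip": "94305", "dob": "1980-03-27"},
    {"gender": "F", "zip": "94305", "dob": "1980-07-02"},
    {"gender": "M", "zip": "94305", "dob": "1975-11-30"},
    {"gender": "M", "zip": "94305", "dob": "1975-11-30"},
    {"gender": "M", "zip": "94040", "dob": "1975-01-05"},
]

for g in ("YYYY", "MM/YYYY", "MM/DD/YYYY"):
    print(f"{g:<12} {fraction_unique(records, g):.1%} unique")
```

Even in this toy data, the fraction of uniquely identifiable records rises as the date of birth becomes more precise, which is the pattern the table shows at population scale.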
Biomedical data analysis brings novel benefits to human health and continues to offer many challenges