hochheiser nlm-meeting-201406041612


Post on 29-Nov-2014

DESCRIPTION

Slides from the 2014 NLM Biomedical Training Program meeting: The Human side of the Fundamental theorem

TRANSCRIPT

Page 1: Hochheiser nlm-meeting-201406041612

MWRI WIP February 2014 Harry Hochheiser, [email protected]

User tools for Biomedical Informatics: the Human Side of the Fundamental Theorem

Harry Hochheiser
University of Pittsburgh School of Medicine
Department of Biomedical Informatics
[email protected]

Page 2: Hochheiser nlm-meeting-201406041612

NLM Training Conference June 2014 Harry Hochheiser, [email protected]

• Human + Computer > Human iff

• Value(Computer) > Cost(Computer)

• all too often, this does not hold

Hochheiser's perspective on biomedical informatics

• Informatics tools must

• Support researchers’ tasks and goals.

• Take care of the “stupid” work

Page 3: Hochheiser nlm-meeting-201406041612

GRADS: Genomic Research In Alpha-1 Antitrypsin Deficiency Syndrome and Sarcoidosis

• Alpha-1 antitrypsin deficiency

• “genetic predisposition to early onset pulmonary emphysema and airway obstructions” (GRADS MOP)

• Mutation in SERPINA1 gene - codes for alpha 1-antitrypsin

• Genotypes: PiMM (normal), PiMS (80% serum level), PiSS/PiMZ (60%), PiSZ (40%), PiZZ (20%)

• Sarcoidosis

• “systemic disease characterized by the formation of granulomatous lesions, especially in the lungs, liver, skin, and lymph nodes, with a heterogeneous set of clinical manifestations and a variable course” (GRADS MOP)

• No specific genetic cause

• Infection may play a role.

Page 4: Hochheiser nlm-meeting-201406041612

GRADS Goals

Use ‘omics data to characterize phenotypes

gene expression

miRNA expression

microbiome

~600 patients (400 sarc., 200 A1AT, distributed across phenotypic/genotypic groups), 7 centers

detailed clinical data

lung CT

‘omics, etc.

Page 5: Hochheiser nlm-meeting-201406041612

GRADS Data Sharing Goals

• Integrative exploration of clinical and ‘omic data

• Identify cohorts suitable for analysis

• Are there enough participants to ask my questions?

• Which genes/miRNAs/microbes might be “interesting”?

• How do clinical data relate to ‘omic data?

• Web-based interactive filters and exploration

• Coordinated histogram widgets as both input and output

• Initially, GRADS clinical centers

• eventually, broader community
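
The idea of histograms that act as both input and output can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the GRADS implementation: the participant records, field names, and bin widths are invented, and the choice to compute each histogram from the selections made on the other histograms (so a histogram still shows the context around its own selected range) is one common convention rather than something the slides specify.

```python
from collections import Counter

# Hypothetical participant records; field names and values are invented for illustration.
PARTICIPANTS = [
    {"id": "p1", "age": 34, "fev1_pct": 92},
    {"id": "p2", "age": 47, "fev1_pct": 71},
    {"id": "p3", "age": 52, "fev1_pct": 58},
    {"id": "p4", "age": 61, "fev1_pct": 64},
    {"id": "p5", "age": 45, "fev1_pct": 88},
    {"id": "p6", "age": 58, "fev1_pct": 49},
]
BIN_WIDTHS = {"age": 10, "fev1_pct": 20}  # assumed bin widths


def bin_of(value, width):
    """Return the lower edge of the fixed-width bin containing value."""
    return (value // width) * width


def passes(record, selections, skip=None):
    """True if the record lies in every selected half-open range [low, high),
    ignoring the selection on the field named by `skip`."""
    for field, (low, high) in selections.items():
        if field == skip:
            continue
        if not (low <= record[field] < high):
            return False
    return True


def coordinated_histograms(records, widths, selections):
    """Recompute each histogram from records allowed by the other fields' selections."""
    histograms = {}
    for field, width in widths.items():
        counts = Counter(
            bin_of(r[field], width)
            for r in records
            if passes(r, selections, skip=field)
        )
        histograms[field] = dict(sorted(counts.items()))
    return histograms


def cohort(records, selections):
    """IDs of participants that satisfy every current selection."""
    return [r["id"] for r in records if passes(r, selections)]


# Input: the user brushes ages 40-59 on the age histogram.
selections = {"age": (40, 60)}
# Output: every histogram and the cohort list update in response.
histograms = coordinated_histograms(PARTICIPANTS, BIN_WIDTHS, selections)
selected_ids = cohort(PARTICIPANTS, selections)
print(histograms)
print(selected_ids)
```

The size of `selected_ids` gives a direct answer to "are there enough participants to ask my question?" as the ranges are adjusted.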

Page 6: Hochheiser nlm-meeting-201406041612

Demo

Page 7: Hochheiser nlm-meeting-201406041612

Demo

Page 8: Hochheiser nlm-meeting-201406041612

Demo

Page 9: Hochheiser nlm-meeting-201406041612

Demo

Page 10: Hochheiser nlm-meeting-201406041612

Research Challenges

• Algorithmic enhancements

• Data retrieval and management

• Calculation of “interesting” genes

• GPU-based calculation

• Additional user facilities?

• statistical comparison of subgroups?
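
The slides do not define what makes a gene "interesting" or how subgroups would be compared. One possible reading, sketched here purely as an assumption, is to rank genes by how strongly their expression differs between a selected cohort and everyone else, using Welch's t-statistic. The gene names, participant IDs, and expression values are made up.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical expression values: gene -> {participant id -> expression level}.
EXPRESSION = {
    "GENE_A": {"p1": 5.0, "p2": 5.2, "p3": 4.8, "p4": 8.0, "p5": 8.4, "p6": 7.6},
    "GENE_B": {"p1": 6.0, "p2": 6.5, "p3": 5.5, "p4": 6.1, "p5": 5.9, "p6": 6.3},
}


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances.

    Each sample needs at least two values, and the two samples must not both
    have zero variance (otherwise the denominator is zero).
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(expression, cohort_ids):
    """Rank genes by |t| between the selected cohort and all other participants."""
    scores = {}
    for gene, by_participant in expression.items():
        inside = [v for pid, v in by_participant.items() if pid in cohort_ids]
        outside = [v for pid, v in by_participant.items() if pid not in cohort_ids]
        scores[gene] = welch_t(inside, outside)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = rank_genes(EXPRESSION, cohort_ids={"p4", "p5", "p6"})
for gene, t in ranked:
    print(f"{gene}: t = {t:.2f}")
```

With thousands of genes, a real analysis would also need a multiple-testing correction, and this per-gene loop is the kind of step where the GPU-based calculation mentioned above might help.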

Page 11: Hochheiser nlm-meeting-201406041612

Interactive Search and Review of Clinical Records with Multi-Layered Semantic Annotations

• Challenge: retrospective chart review for clinical research

• Quality assessment

• measuring guideline adherence for colonoscopy

• Cohort identification

• patients who may have had adverse reactions

• Use Natural Language Processing to extract relevant variables

• But… researchers need to review findings and correct mistakes.

• Ultimate goal: bridge gap between NLP and clinical research

Page 12: Hochheiser nlm-meeting-201406041612

NLP Chart Review Visualization

Page 13: Hochheiser nlm-meeting-201406041612

Word Tree Visualization (Wattenberg and Viégas, 2008); implementation from https://github.com/silverasm/wordtree

Patterns in the text can help facilitate review of NLP results.
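
To make the word tree idea concrete, here is a minimal sketch of the underlying data structure: every occurrence of a root word is found, and the words that follow it are merged into a tree whose nodes carry counts. This is not the code from the linked repository, which handles layout and interaction; the sentences are invented and the tokenizer is deliberately crude.

```python
import re

# Invented report sentences, used only to exercise the code.
SENTENCES = [
    "The colonoscopy was complete to the cecum.",
    "The colonoscopy was incomplete due to poor prep.",
    "Colonoscopy was complete with good prep.",
]


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def build_word_tree(sentences, root):
    """Merge the word sequences that follow each occurrence of `root` into a tree."""
    tree = {"count": 0, "children": {}}
    for sentence in sentences:
        words = tokenize(sentence)
        for start, word in enumerate(words):
            if word != root:
                continue
            tree["count"] += 1
            node = tree
            for following in words[start + 1:]:
                child = node["children"].setdefault(
                    following, {"count": 0, "children": {}}
                )
                child["count"] += 1
                node = child
    return tree


def render(node, label, depth=0):
    """Return indented text lines, most frequent branches first."""
    lines = [f"{'  ' * depth}{label} ({node['count']})"]
    ordered = sorted(node["children"].items(), key=lambda kv: (-kv[1]["count"], kv[0]))
    for word, child in ordered:
        lines.extend(render(child, word, depth + 1))
    return lines


tree = build_word_tree(SENTENCES, "colonoscopy")
print("\n".join(render(tree, "colonoscopy")))
```

Branches with high counts expose recurring phrasings, which is what lets a reviewer check many NLP-extracted values at once instead of reading each report separately.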

Page 14: Hochheiser nlm-meeting-201406041612

Word Tree, Part 2

Page 15: Hochheiser nlm-meeting-201406041612

Word Tree, #3

Page 16: Hochheiser nlm-meeting-201406041612

Next Steps

Interfaces for handling suggested revisions to NLP models:

Selecting spans

Changing variable assignments

Submitting changes

Reviewing modified variable assignments

Assessments

Usability studies

Empirical studies

How much training is needed to “seed” expert review?

Page 17: Hochheiser nlm-meeting-201406041612

Monarch Initiative: Using cross-species phenotypes to explore disease (some slides courtesy of M. Haendel)

Problem: Clinical and model phenotypes are described differently

Page 18: Hochheiser nlm-meeting-201406041612

OWLSim: Phenotype similarity across patients or organisms
https://code.google.com/p/owltools/wiki/OwlSim

Statistical details available on demand
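
The slide does not describe how the similarity score is computed, so the following is a toy illustration of the general approach rather than OWLSim's algorithm. It assumes a tiny invented is-a hierarchy and compares two phenotype profiles by the Jaccard overlap of their terms plus all ancestors, so that two different heart defects still count as partially similar. Ontology-based tools commonly refine this by weighting shared terms by how specific they are.

```python
# Toy is-a hierarchy (child -> parents). Term names are illustrative labels,
# not identifiers from a real ontology.
PARENTS = {
    "phenotype": [],
    "heart defect": ["phenotype"],
    "kidney defect": ["phenotype"],
    "ventricular septal defect": ["heart defect"],
    "atrial septal defect": ["heart defect"],
    "kidney cysts": ["kidney defect"],
    "duplex kidney": ["kidney defect"],
}


def closure(terms):
    """Return the given terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def jaccard_similarity(profile_a, profile_b):
    """Shared terms divided by all terms, computed over the ancestor closures."""
    a, b = closure(profile_a), closure(profile_b)
    return len(a & b) / len(a | b)


patient = ["ventricular septal defect", "kidney cysts"]
mouse_model = ["atrial septal defect", "duplex kidney"]

# No term is shared directly, but the profiles overlap through their ancestors.
score = jaccard_similarity(patient, mouse_model)
print(f"similarity = {score:.3f}")
```

Matching through shared ancestors is what allows clinical and model-organism phenotypes to be compared even when they are described with different terms.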

Page 19: Hochheiser nlm-meeting-201406041612

Scaling up: multiple candidates

b2b1035Clo (aka Blue Meanie)

Duplex kidney
Cleft palate
Prenatal growth retardation
Tricuspid valve atresia
Persistent truncus arteriosus
Double outlet right ventricle
Anophthalmia
Microphthalmia
Kidney cysts
Pulmonary valve atresia
Polycystic kidney
Ventricular septal defect
Common atrium
Atrioventricular septal defect
Complete atrioventricular septal defect
…

b2b012Clo (aka Heart Under Glass)

Cleft palate
Abnormal sternum morphology
Double outlet right ventricle
Polydactyly
Pulmonary hypoplasia
Kidney cysts
Duplex kidney
Right aortic arch
Common atrium
Complete atrioventricular septal defect
Pulmonary artery atresia

Fgfr2

Fuz

b2b1273Clo (aka octomouse)

Page 20: Hochheiser nlm-meeting-201406041612

The Monarch Infrastructure

Page 21: Hochheiser nlm-meeting-201406041612

Visualization Challenges:

How to explain the inferences driven by ontological calculations?

How to integrate multiple data types to aid interpretation?

Pathways

Gene expression

protein-protein interaction

…

How to compare across phenotype profiles?

Page 22: Hochheiser nlm-meeting-201406041612

Undiagnosed Disease Program: Comparing Phenotype Profiles

Page 23: Hochheiser nlm-meeting-201406041612

Phenotype Matrix

Page 24: Hochheiser nlm-meeting-201406041612

Phenotype Profile - Model Views

Page 25: Hochheiser nlm-meeting-201406041612

Other challenges

Process support - search and interpretation as an ongoing activity

Reducing bias - how do we avoid cherry-picking and encourage thorough investigation?

Navigating semantic chains

phenotypes -> networks -> genes -> model

Page 26: Hochheiser nlm-meeting-201406041612

Closing thoughts…

• The hard problems are not technical

• Collaboration required.

Page 27: Hochheiser nlm-meeting-201406041612

Acknowledgments

GRADS:
U. Pittsburgh: Steve Wisniewski, Mike Becich, Scott O’Neal, Bill Shirey, Becky Boes, Sahawut Wesaratchakit
Yale: Naftali Kaminski
Support: NHLBI U01HL112707

Monarch:
U. Pittsburgh: Chuck Borromeo, Becky Boes, Jeremy Espino
OHSU: Melissa Haendel, Nicole Vasilevsky, Matt Brush
NIH-UDP: Murat Sincan, David Adams, Neal Boerkoel, Amanda Links, Bill Gahl
LBNL: Nicole Washington, Suzanna Lewis, Chris Mungall + colleagues at Sanger, Charité, Toronto, and JAX
UCSD: Anita Bandrowski, Amarnath Gupta, Jeff Grethe, Maryann Martone, Trish Whetzel
Support: NIH Office of Director: 1R24OD011883, NIH-UDP: HHSN2682013

Interactive Search and Review of Clinical Records with Multi-Layered Semantic Annotations:
U. Pittsburgh: Janyce Wiebe, Rebecca Hwa, Alex Conrad, Phuong Pham, Lanfei Shi, Gaurav Trivedi
U. Utah: Wendy Chapman, Danielle Mowery
Support: NLM 7R01LM010964

Other Support:
Addressing Gaps in Clinically Useful Evidence on Drug-Drug Interactions (R. Boyce, NLM: 1R01LM011838)
Cancer Deep Phenotype Extraction from Electronic Medical Records (R. Crowley & G. Savova, NCI: 1U24CA184407)
Quantifying Electronic Medical Record Usability to Improve Clinical Workflow (Z. Agha, AHRQ: 5R01HS021290)