machine learning for precision public health: visualizing data for … · 2021. 7. 18. · example:...

Machine Learning for Precision Public Health: Visualizing Data for Analysis and Communication @amcrisan http://cs.ubc.ca/~acrisan [email protected] Anamaria Crisan Vanier Canada Scholar & UBC Public Scholar PhD Candidate, Computer Science University of British Columbia

Post on 07-Aug-2021


Page 1: Machine Learning for Precision Public Health: Visualizing Data for … · 2021. 7. 18. · Example: Communicating Survival Benefit of Cancer Therapy. ... •Example: representing

Machine Learning for Precision Public Health: Visualizing Data for Analysis and Communication

@amcrisan http://cs.ubc.ca/~acrisan [email protected]

Anamaria Crisan
Vanier Canada Scholar & UBC Public Scholar
PhD Candidate, Computer Science
University of British Columbia

Page 2

Master of Science (Bioinformatics)

PhD (Computer Science)

GenomeDX Biosciences

British Columbia Centre for Disease Control

2008 2010 2013 2015

PhD Candidate, Computer Science, University of British Columbia

Page 3

What we’ll talk about

Page 4

Why should we visualize data?

How should we visualize data?

What datavis tools are available?

Page 5

Why should we visualize data?

Page 6

Translating Numbers to Words

http://bit.ly/1FxtT2z

It is not always easy to reason consistently with numbers

Page 7

Probability (60%) < Frequency (6 in 10) < Visualization

Least Understandable < Most Understandable

Whiting (2015) “How well do health professionals interpret diagnostic information? A systematic review”

Data Visualization is a Powerful Medium
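The three formats on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the original talk: the function names, the denominator of 10, and the `#`/`.` icon characters are all assumptions chosen for the example.

```python
def natural_frequency(probability: float, denominator: int = 10) -> tuple:
    """Convert a probability into an 'N in D' natural frequency (assumed helper)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(probability * denominator), denominator


def icon_array(count: int, total: int, filled: str = "#", empty: str = ".") -> str:
    """Draw a one-row text icon array: filled icons first, then empty ones."""
    return filled * count + empty * (total - count)


def describe(probability: float, denominator: int = 10) -> dict:
    """Return the same quantity as a percentage, a frequency, and a tiny visualization."""
    count, total = natural_frequency(probability, denominator)
    return {
        "probability": f"{probability:.0%}",
        "frequency": f"{count} in {total}",
        "visualization": icon_array(count, total),
    }


if __name__ == "__main__":
    for label, value in describe(0.6).items():
        print(f"{label:>13}: {value}")
```

A real icon array would normally be drawn graphically; the text row here only stands in for the idea of showing 6 filled icons out of 10.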

Page 8

Role of data visualization in the current paradigm of scientific research

= Communication

Page 9

Flowchart: Do you have a research problem? Yes. / No. But eventually you’ll have a problem, right? Duh. Do all the Science! Inform the public!

https://www.ratbotcomics.com/comics/pgrc_2014/1/1.html

Page 10

Flowchart: Do you have a research problem? Yes. / No. But eventually you’ll have a problem, right? Duh. Do all the Science! Inform the public! Maybe data visualization? Infographics are pretty.

Page 11

Flowchart: Do you have a research problem? Yes. / No. But eventually you’ll have a problem, right? Duh. Do all the Science! Inform the public! Maybe data visualization? Infographics are pretty. Did it work?

Page 12

Flowchart: Do you have a research problem? Yes. / No. But eventually you’ll have a problem, right? Duh. Do all the Science! Inform the public! Maybe data visualization? Did it work? No :( Different infographics?

Page 13

Flowchart: Do you have a research problem? Yes. / No. But eventually you’ll have a problem, right? Duh. Do all the Science! Inform the public! Maybe data visualization? Did it work? No :( Different infographics? Yes! (maybe?) Declare Victory.

Page 14

Limitation #1 : Missed Opportunity in Exploration

Flowchart: Do all the Science! Data Visualization! Inform the public!

Missed Opportunity for Exploration
§ Exploration is looking at your data, trying different analysis methods, assessing whether there are outliers or missing data, etc.

Page 15

Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats

Same stats, different graphs

Limitation #1 : Missed Opportunity in Exploration

Page 16

Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats

Same stats, different graphs (Datasaurus)
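A minimal sketch of the "same stats, different graphs" idea, using made-up data rather than the Datasaurus datasets themselves: two series are rescaled to share the same mean and standard deviation, yet one is a straight line and the other a parabola. Unlike the Autodesk datasets, this toy example only matches the mean and standard deviation of y, and every name in it is an assumption for illustration.

```python
import statistics


def rescale(values, target_mean=50.0, target_sd=10.0):
    """Shift and scale values so they have the requested mean and (population) SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot rescale a constant series")
    return [target_mean + target_sd * (v - mean) / sd for v in values]


def summary(values):
    """Summary statistics a table would report: (mean, standard deviation)."""
    return round(statistics.fmean(values), 6), round(statistics.pstdev(values), 6)


def is_increasing(values):
    """A crude shape check that a plot would reveal at a glance."""
    return all(a < b for a, b in zip(values, values[1:]))


x = list(range(-10, 11))
line = rescale([float(v) for v in x])          # straight line
parabola = rescale([float(v * v) for v in x])  # U-shaped curve

if __name__ == "__main__":
    print("line     summary:", summary(line), "increasing:", is_increasing(line))
    print("parabola summary:", summary(parabola), "increasing:", is_increasing(parabola))
```

The summaries are identical, so only looking at the data (or plotting it) exposes the difference in shape.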

Limitation #1 : Missed Opportunity in Exploration

Page 17

Opening up the machine learning black box

Limitation #1 : Missed Opportunity in Exploration

Page 18

Limitation #1 : Missed Opportunity in Exploration

Chihuahua or muffin? Mop or sheep dog?

Page 19

Limitation #1 : Missed Opportunity in Exploration

Goodfellow (2014). “Explaining and Harnessing Adversarial Examples”

Page 20

Olah (2018). “Building blocks of interpretability” (https://distill.pub/2018/building-blocks/) Made with : JavaScript

Example : Trying to understand the black box

Page 21

Health data are complex to analyze and visualize

Page 22

Limitation #2 : Identifying the Appropriate Vis

Selecting the appropriate data visualization is challenging

§ True for exploration & communication applications

Page 23

Visualization Design ALSO matters

Page 24

Baseline Visualization

Alternative 1 Alternative 2

Zikmund-Fisher (2013). A demonstration of “less can be more” in risk graphics.

Example: Communicating Survival Benefit of Cancer Therapy

Page 25

Example: Visualizing Arteries of the Heart for Surgery Planning

Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis” Made with : Processing

Page 26

EXISTING STANDARD Accuracy: 39%

REVISED VISUALIZATION Accuracy: 91%

Borkin (2011). “Evaluation of Artery Visualizations for Heart Disease Diagnosis” Made with : Processing

Example: Visualizing Arteries of the Heart for Surgery Planning

Page 27

There are two aspects of visualizations to think about:

How do you make a visualization? (What datavis tools are available?)

Is it the appropriate visualization? (How should we visualize data?)

Page 28

How should we visualize data?

Page 29

Human Perception & Cognition

Computer Graphics

Data Analysis

Cross Cutting Disciplines in Information Visualization

Visualization Design & Analysis

Page 30

R. Kosara (EagerEyes) – https://eagereyes.org/basics/encoding-vs-decoding

Encoding and Decoding Information

Page 31

Putting it all Together for Visualization Design & Analysis

§ Non-trivial to condense knowledge across all these areas
§ Still an ongoing area of research
§ I will try to convey a simpler intuition about design & analysis

Page 32

Guiding Principles for Visualizing your Data

Image Source: Valentin Antonucci via Pexels

Page 33

Breaking Down a Visualization in Three Questions

Why? (Motivation)
Why do you need to visualize data?
How will you, or others, use the visualization?

Page 34

Breaking Down a Visualization in Three Questions

Why? (Motivation)
Why do you need to visualize data?
How will you, or others, use the visualization?

What? (Data & Tasks)
What kind of data is being visualized?
What tasks are performed with the data?

Page 35

Breaking Down a Visualization in Three Questions

Why? (Motivation)
Why do you need to visualize data?
How will you, or others, use the visualization?

What? (Data & Tasks)
What kind of data is being visualized?
What tasks are performed with the data?

How? (Visual & Interactive Design)
How do you make the visualization?
Is it the right visualization?
People tend to jump to this level and ignore why and what

Page 36

Design & Evaluation with Three Questions

Why?

What?

How?

Design / Evaluation

Does the visualization address the intended need?

Are you using the right data, or deriving the right data?

Are the visual & interactive choices appropriate for the data and tasks?

Does the visualization support the tasks using that data?

If interactive / computer based, is the visualization easy to use and reliable (i.e., it doesn’t crash all the time)?

Page 37

Ideas from the research literature : the nested model

Why?

What?

How?

Design

Evaluation

T. Munzner (2014) – Visualization Design and Analysis

Page 38

Steps to Systematic Thinking in Data Visualization

Image Source: Valentin Antonucci via Pexels

Page 39

Domain Problem*

Data + Task

Visual + Interaction Design Choices

Algorithm

Infovis (Information Visualization) research advocates an iterative process

T. Munzner (2014) – Visualization Design and Analysis

Design

Evaluation

Thinking Systematically about Data Visualization

*Domain Problem = Motivation

Page 40

An iterative approach to development allows us to get feedback before committing to ineffective design choices

An Iterative Process

Page 41

1. Identify a relevant problem that affects you or a group of stakeholders

Domain Problem

Data + Task

Visual + Interaction Design Choices

Algorithm

T. Munzner (2014) – Visualization Design and Analysis

Thinking Systematically about Data Visualization

Page 42

Public Health Stakeholders

Nurses, Clinicians, Medical Health Officers, Researchers, Community Leaders, Policy Makers, Patients

§ Multidisciplinary decision making teams
§ More data & diverse data types = more informed decision making
§ BUT – different stakeholder abilities to interpret data & different needs

Page 43

2. Ask what data stakeholders use (is it available)?

3. Ask what stakeholders do with the data [tasks]

Domain Problem

Data + Task

Visual + Interaction Design Choices

Algorithm

T. Munzner (2014) – Visualization Design and Analysis

Thinking Systematically about Data Visualization

Page 44

Data - Many Different Types of Data!

T. Munzner (2014) – Visualization Design and Analysis

Page 45

Data - Don’t Just Visualize the Raw Data!

Original (Raw) Data

Derived Data

Example (T. Munzner (2014) – Visualization Design and Analysis)

Example when this advice is ignored (XKCD)

Page 46

Tasks - How People Use the Data

Geographic Overview of Prostate Cancer
§ Useful for epidemiologists and policy makers
§ Supports surveillance tasks
Source : Atlanta CDC

Individual Prostate Cancer Risk
§ Good for patients and doctors
§ Supports treatment decision making tasks
Source : http://riskcalc.org/PCPTRC/ (UT San Antonio)

Page 47

Tasks - How People Use the Data
• Tasks can also change how the same data should be visualized
• Example: representing US Electoral College results

Panels: Standard Map, Cartogram

Page 48

Tasks - How People Use the Data
• Tasks can also change how the same data should be visualized
• Example: representing US Electoral College results

Panels: Standard Map, Sankey Diagram

Page 49

Tasks - How People Use the Data
• Tasks can also change how the same data should be visualized
• Example: representing US Electoral College results

Page 50

Examples from my own research

How can we identify tasks and data?

Page 51

My research : making a clinical report for tuberculosis
• Mixed methods approach to gathering data and tasks

Stages: Discovery (Information Gathering), Design (Design & Evaluation), Implement (Finalize Design)
Methods: Expert Consults; Task & Data Questionnaire; Design Sprint; Design Choice Questionnaire; TB Workflow Map
Data Gathered: Qualitative, Quantitative
Study Design: Exploratory Sequential Model, Embedded Model

MYCOBACTERIUM TUBERCULOSIS GENOME SEQUENCING REPORT
NOT FOR DIAGNOSTIC USE

Patient Name: JOHN DOE
Barcode:
Birth Date: 2000-01-01
Patient ID: 12345678910
Location: SOMEPLACE
Sample Type: SPUTUM
Sample Source: PULMONARY
Sample Date: 2016-12-25
Sample ID: A12345678
Sequenced From: MGIT CULTURED ISOLATE
Reporting Lab: LAB NAME
Report Date/Time: 2017-01-01, 15:36
Requested By: REQUESTER NAME
Requester Contact: [email protected]

Summary
The specimen was positive for Mycobacterium tuberculosis. It is resistant to isoniazid and rifampin. It belongs to a cluster, suggesting recent transmission.

Organism
The specimen was positive for Mycobacterium tuberculosis, lineage 2.2.1 (East-Asian Beijing).

Drug Susceptibility
Resistance is reported when a high-confidence resistance-conferring mutation is detected. “No mutation detected” does not exclude the possibility of resistance.

[ ] No drug resistance predicted
[ ] Mono-resistance predicted
[x] Multi-drug resistance predicted
[ ] Extensive drug resistance predicted

Drug class | Interpretation | Drug | Resistance Gene (Amino Acid Mutation)
First Line | Susceptible | Ethambutol | No mutation detected
First Line | Susceptible | Pyrazinamide | No mutation detected
First Line | Resistant | Isoniazid | katG (S315T)
First Line | Resistant | Rifampin | rpoB (S531L)
Second Line | Susceptible | Streptomycin | No mutation detected
Second Line | Susceptible | Ciprofloxacin | No mutation detected
Second Line | Susceptible | Ofloxacin | No mutation detected
Second Line | Susceptible | Moxifloxacin | No mutation detected
Second Line | Susceptible | Amikacin | No mutation detected
Second Line | Susceptible | Kanamycin | No mutation detected
Second Line | Susceptible | Capreomycin | No mutation detected

Page 1 of 2 | Patient ID: 12345678910 | Date: 2017-01-01 | Location: Someplace

Page 52

My research : making a clinical report for tuberculosis

Task groups: DIAGNOSIS TASKS, TREATMENT TASKS, SURVEILLANCE TASKS

Tasks (score columns, in order): Diagnose Latent TB; Diagnose Active TB; Reactive vs New Infection; Characterize Transmission Risk; Choose Meds; Choose Tx Duration; Assess Response to Tx; Guide Contact Tracing; Report to Public Health; Define a Cluster; Connect Case to Existing Cluster; Guide Public Health Response

Data | WGS equivalent | Consensus score per task | TOTAL SCORE
Patient Identifier | Same | 3 3 3 3 3 3 3 2 1 1 1 1 | 26
Sample Collection Date | Same | 3 3 2 3 3 3 3 1 1 1 1 1 | 24
Patient Prior TB Results | Same | 3 2 3 3 3 3 3 1 1 1 0 1 | 23
Speciation | Speciation | 1 3 2 3 3 3 3 2 1 1 1 1 | 23
Sample Type (sputum, fine needle aspirate etc.) | Same | 2 3 2 3 3 3 3 1 1 1 0 1 | 22
Culture results | NA | 1 3 2 3 3 3 3 2 1 1 0 1 | 22
Sample Collection Site (lymph node, lung etc.) | Same | 2 3 2 3 3 3 3 1 1 0 0 1 | 21
Acid Fast Bacilli Smear | Speciation | 2 3 2 3 2 3 3 1 1 1 0 1 | 21
Resistotype | Predicted DST | 0 2 3 1 3 3 2 2 1 1 1 1 | 19
Phenotypic DST | Predicted DST | 0 2 3 2 3 3 2 1 1 1 0 1 | 18
Chest x-ray | NA | 3 3 2 3 0 2 3 1 0 0 0 0 | 17
Report Release Date | Same | 2 2 1 2 2 2 2 1 0 1 0 1 | 15
Requester IDs | Same | 2 2 2 2 2 2 2 1 0 0 0 0 | 15
Interpretation or comments from reviewer | Same | 2 2 1 2 2 2 3 1 0 0 0 0 | 15
Predicted DST | Predicted DST | 0 2 2 1 3 3 2 1 0 1 0 0 | 15
MIRU-VNTR | SNPs | 0 2 3 1 1 1 1 1 1 1 1 1 | 13
Cluster Assignment | Same | 0 2 2 1 1 1 0 1 1 1 1 1 | 11
SNP/variant distance | SNPs | 0 1 2 1 1 1 0 1 1 1 1 1 | 10
Phylogenetic Tree | Same | 0 2 1 1 1 1 0 1 0 1 1 1 | 9
Reviewer ID | Same | 1 1 1 1 1 1 1 1 0 0 0 0 | 8
TST results | Speciation* | 3 1 1 1 0 0 0 1 0 0 0 0 | 7
IGRA results | Speciation* | 3 1 1 1 0 0 0 1 0 0 0 0 | 7
Lab QC | WGS Specific | 0 1 2 1 1 1 0 1 0 0 0 0 | 7
Spoligotype | SNPs | 0 1 1 1 0 0 0 0 0 0 0 0 | 3
RFLP | SNPs | 0 1 1 1 0 0 0 0 0 0 0 0 | 3

Degree of Consensus (consensus among participants, % who agree): High (3): >75%; Some (2): 50% - 75%; Low (1): 25% - 50%; Very low (0): <25%

Page 53

My research : making a clinical report for tuberculosis

Page 54

4. Explore if other visualizations have addressed this problem and set of tasks & data

5. Implement your own solution (remember this includes interaction!)

T. Munzner (2014) – Visualization Design and Analysis

Domain Problem

Data + Task

Visual + Interaction Design Choices

Algorithm

Thinking Systematically about Data Visualization

Page 55

Marks & Channels : Basic Building Blocks

Mark: Basic Graphical Element (basic building block)

Channel: Controls the appearance of marks

T. Munzner (2014) – Visualization Design and Analysis

Page 56

Marks Vary in their Effectiveness

Example

Bar Chart: Position (Common Scale)

Pie Chart: Angle & Area

J. Heer (2010) – Crowdsourcing Graphical Perception: Using Mechanical Turk …

Page 57

Perception and Cognition Matter Too!

Colour Blind Simulator: http://www.color-blindness.com/coblis-color-blindness-simulator/

Original Visualization

Visualization as seen by a color blind person (color blindness (deuteranopia) affects men more often)

Page 58

Perception and Cognition Here too!

Colour scales also impact interpretation!
Perceptual research from Liu et al. (2018)

Liu et al. (2018) - Somewhere Over the Rainbow: An Empirical Assessment of Quantitative Colormaps

Page 59

library(ggplot2)

ggplot(data = mpg, aes(x = displ, y = cty, colour = class)) + geom_point()

Channel: Position Channel: Colour

Mark: Point

Marks & Channels : ggplot2 example

Note: Generally in ggplot2 aesthetics refer to channels and geoms refer to marks, but there are complex geoms that aren’t simple marks but chart types (e.g. geom_density) and there are aesthetics that have little to do with the visual channels directly (e.g. group)

https://rpubs.com/hadley/ggplot-intro
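The mark and channel idea can also be sketched outside any plotting library. The Python below is an assumption-laden illustration, not how ggplot2 or Tableau work internally: `PointMark`, `encode_points`, the field names, the data rows, and the palette are all hypothetical. It shows one mark type (a point) whose position and colour channels are driven by data fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointMark:
    """One point mark; x/y are position channels, colour is a colour channel."""
    x: float
    y: float
    colour: str


def encode_points(rows, x_field, y_field, colour_field, palette):
    """Map each data row to a point mark by binding fields to channels."""
    # Assign one palette colour per category, in sorted order for repeatability.
    categories = sorted({row[colour_field] for row in rows})
    colour_of = {cat: palette[i % len(palette)] for i, cat in enumerate(categories)}
    return [
        PointMark(float(row[x_field]), float(row[y_field]), colour_of[row[colour_field]])
        for row in rows
    ]


# Hypothetical rows, loosely shaped like a table of cars.
rows = [
    {"engine_size": 1.8, "city_mpg": 18, "vehicle_class": "compact"},
    {"engine_size": 5.3, "city_mpg": 14, "vehicle_class": "suv"},
    {"engine_size": 2.0, "city_mpg": 21, "vehicle_class": "compact"},
]

marks = encode_points(rows, "engine_size", "city_mpg", "vehicle_class",
                      palette=["tab:blue", "tab:orange"])

if __name__ == "__main__":
    for mark in marks:
        print(mark)
```

Swapping which field feeds which channel changes the chart without touching the data, which is the separation the note above describes.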

Page 60

Marks & Channels : Tableau example

Marks

Channels

Page 61

Linking Data to Marks and Channels to Make Visualizations

Data Marks & Channels Visualization

Page 62

Linking Data to Marks and Channels to Make Visualizations

Chart Chooser: https://bit.ly/2P9zLEW

Data to viz: https://www.data-to-viz.com/

Page 63

Examples from my own research

How do people visualize data?

Page 64

My research: surveying visualizations in genomic epidemiology

http://gevit.net

Crisan et al. (2018) “A systematic method for surveying data visualizations and a resulting genomic epidemiology visualization typology: GEViT”, Oxford Bioinformatics

Page 65

Examples from my own research

How can we help people visualize data?

Page 66

My research: simplifying the creation of data visualizations

# specify individual charts
phyloTree_chart <- specify_base(chart_type = "phylogenetic tree", data = "tree_dat")
epicurve <- specify_base(chart_type = "histogram", data = "tab_dat", x = "month")
map_chart <- specify_base("geographic map", data = "tab_dat", lat = "latitude", long = "longitude")

# specify a combination
color_combo <- specify_combination(combo_type = "color_linked", base_charts = c("phyloTree_chart", "map_chart", "epicurve"), link_by = "country")

# plot the result
plot(color_combo)

Page 67

My research: automatic data visualization

# Analyze different data types automatically
harmon_obj <- data_harmonization(tab_dat, tree_dat, genomic_dat, all_spatial)

# Create specifications that compile to minCombinr
component_specs <- get_spec_list(harmon_obj)

# plot the result one view at a time
plot_view(component_specs, view_num = 1)

Preliminary Result

Page 68

4. Explore if other visualizations have addressed this problem and set of tasks

5. Implement your own solution (part or all of that solution could be a new algorithm)

Domain Problem

Data + Task

Visual + Interaction Design Choices

Algorithm

Thinking Systematically about Data Visualization

Page 69

6. Test multiple alternatives (including new ones you develop) with stakeholders

7. Gather qualitative & quantitative evaluation data

Domain Problem*

Data + Task

Visual + Interaction Design Choices

Algorithm

Thinking Systematically about Data Visualization

Page 70

1. Identify a relevant problem that affects you or a group of stakeholders

2. Ask what data stakeholders use (is it available)?

3. Ask what stakeholders do with the data [tasks]

4. Explore if other visualizations have addressed this problem and set of tasks & data

5. Implement your own solution (vis and/or algorithm)

6. Test multiple alternatives (including new ones you develop) with stakeholders

7. Gather qualitative & quantitative evaluation data

Design

Evaluation

Thinking Systematically about Data Visualization

Page 71

What datavis tools are available?

Page 72

Data Visualization Tools to Get You Started

Tools & Libraries for data visualization
Lisa Charlotte Rost has an excellent blog post about this: http://bit.ly/2gRGx1J
I am presenting her figures here

Tools & Libraries for data visualization
Lisa Charlotte Rost has an excellent blog post about this: http://bit.ly/2gRGx1J

Analysis vs Presentation

Tools & Libraries for data visualization
Lisa Charlotte Rost has an excellent blog post about this: http://bit.ly/2gRGx1J

Extent of Flexibility
How easy or hard it is to make data visualizations (including custom/novel visualizations)

Tools & Libraries for data visualization
Lisa Charlotte Rost has an excellent blog post about this: http://bit.ly/2gRGx1J

Static vs Interactive

Tools & Libraries for data visualization
Lisa Charlotte Rost has an excellent blog post about this: http://bit.ly/2gRGx1J

“There are no perfect tools, just good tools for people with certain goals”
See a detailed table here: http://bit.ly/2DeWPwV

Tools & Libraries for data visualization
Another take with commonly used tools: https://bit.ly/2SgrOzS

Don’t forget that pen and paper is an option too!

Dear Data Project (Lupi & Posavec)

Wrapping up

DATA VISUALIZATION IS NOT JUST AN ART PROJECT

Key take-aways from this talk

§ Visualizations of data are useful
  § Helpful in instances of low numeracy
  § Can be used in communication and exploration

§ But... visualization design also matters
  § There are many different alternatives, so it is important to test

§ It’s possible to think systematically about visualizations
  § Many disciplines cross-cut information visualization research
  § At the minimum, think “Why”, “What”, “How”

§ Encode data well so that others can decode it later

§ Data visualization is a research process with open and interesting problems

Additional Resources

§ Books to consider:
  § Interpretable Machine Learning: https://christophm.github.io/interpretable-ml-book/
  § Making Data Visual: A Practical Guide to Using Visualization for Insight by Danyel Fisher and Miriah Meyer
  § Visualization Analysis and Design by Tamara Munzner (more technical)

§ Online resources:
  § Distill Publication: https://distill.pub/
  § UBC Infovis Resource Page: http://www.cs.ubc.ca/group/infovis/resources.shtml
  § UW Interactive Data Lab: https://medium.com/@uwdata
  § Data Stories podcast: http://datastori.es/

§ Inspiration:
  § Information is Beautiful: https://informationisbeautiful.net/
  § Visualization WTF (examples of what not to do): http://viz.wtf/

Additional Resources

§ I’ll be presenting more on my own research on June 18th!

Machine Learning for Precision Public Health: Visualizing Data for Analysis and Communication

@amcrisan http://cs.ubc.ca/~acrisan [email protected]

Anamaria Crisan
Vanier Canada Scholar & UBC Public Scholar
PhD Candidate, Computer Science
University of British Columbia