Exploring How to Use Hadoop in your Healthcare Big Data Strategy

© 2016 Health Catalyst. Proprietary and Confidential.

Sean Stohl, Senior Vice President, Product Development, Health Catalyst

Upload: health-catalyst

Post on 09-Jan-2017

Category:

Healthcare


TRANSCRIPT

Page 1: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 2: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 3: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Poll Question #1

What brought you to this webinar? (115 respondents)

1. Everyone is talking about Big Data/Hadoop – What is it? – 31%
2. Searching for use cases – What is the value proposition? – 42%
3. Need help implementing it – 7%
4. Want to hear others’ experiences – 16%
5. I am bored so why not try this webinar – 4%

Page 4: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Learning Objectives

Be able to explain

• What is Big Data and Hadoop

• Why do we need Big Data and Hadoop in Healthcare

• What are the challenges to adoption

• How do I get started

• See it in action

Page 5: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Scaling Up Limits

Page 6: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

What does it take to reach the Big Data threshold?

3 V’s of Big Data

Page 7: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

We Are Not “Big Data” in Healthcare Yet

Page 8: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Volume, Velocity, and Variety aren’t the only reasons to move

Dear Data…

Page 9: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

History of Hadoop

• Created by Doug Cutting and Mike Cafarella. It grew out of their work on the Nutch project around 2005; Cutting joined Yahoo in 2006, where much of the early development took place.

• Hadoop is named after Cutting’s son’s toy elephant.

• “The name my kid gave a stuffed yellow elephant. Short, relatively easy to spell and pronounce, meaningless, and not used elsewhere: those are my naming criteria. Kids are good at generating such. Googol is a kid’s term.” - Doug Cutting

• Open-source software framework that supports processing and storing large data sets distributed across clusters of commodity hardware.

• HDFS (Hadoop Distributed File System): a file system that distributes data across a cluster to take advantage of the parallel processing of MapReduce.

• MapReduce: parcels out work to the nodes within the cluster (the map step), then organizes and reduces the results from each node into a cohesive answer to a query (the reduce step).
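
The map and reduce steps described on this slide can be illustrated with a small, single-process sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the record layout, the `diagnosis_codes` field, and the function names are all hypothetical, and each inner list only stands in for the share of data one cluster node would hold.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (diagnosis_code, 1) pair for every code on one record."""
    for code in record["diagnosis_codes"]:  # hypothetical field name
        yield code, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(partitions):
    """Run map over every partition, then shuffle and reduce the results."""
    pairs = []
    for partition in partitions:  # each partition stands in for one node's data
        for record in partition:
            pairs.extend(map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Made-up sample data: two partitions of encounter records.
partitions = [
    [{"diagnosis_codes": ["E11.9", "I10"]}, {"diagnosis_codes": ["I10"]}],
    [{"diagnosis_codes": ["E11.9"]}, {"diagnosis_codes": ["J45.909", "I10"]}],
]
counts = run_job(partitions)
```

In a real cluster the map calls would run in parallel on the nodes that hold each block of data, and the grouping step would move data across the network; here everything runs in one loop purely to show the shape of the computation.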

Page 10: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Poll Question #2

How would you categorize your organization’s involvement with Hadoop? (126 respondents)

1) Piloting Hadoop in the Cloud or Plan to – 9%
2) Piloting Hadoop on Premise or Plan to – 18%
3) Heavily using Hadoop in the Cloud – 1%
4) Heavily using Hadoop on Premise – 5%
5) Unsure or not applicable – 68%

Page 11: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Why Big Data and Hadoop in Healthcare

• Data Growth

• Different Types of Workload
  • Semi-Structured
  • Archiving
  • Streaming
  • Machine Learning

Page 12: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Just Beginning: Digitization of Health

“EMR data represents ~8% of the data we need for population health and precision medicine.” — Alberta Secondary Use Data Project

The Growing Ecosystem of Human Health Data

• Healthcare Encounter Data
• 7x24 Biometric Data
• Consumer Data
• Genomic & Familial Data
• Social Data
• Outcomes Data

Page 13: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Types of Data

• Structured
  • Data that can be stored relationally in an RDBMS

• Semi-Structured
  • Data that has some organizational properties but isn’t in a relational database format
  • CSV, XML, X12 (835/837), HL7, JSON
  • Doctor Notes – Template-Generated Sections

• Unstructured
  • E-mails, text messages, Word documents, videos, and pictures
  • Doctor Notes – Free-Form Sections
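
To make the semi-structured category concrete, here is a minimal Python sketch that pulls relational-style rows out of two of the formats listed above. The sample HL7 v2 message, the JSON document shape, and the function names are assumptions made up for illustration; a production pipeline would normally use a dedicated HL7 parser rather than splitting strings by hand.

```python
import json


def parse_hl7_pid(message):
    """Extract a few fields from the PID segment of an HL7 v2 message.

    Assumes the default delimiters: segments end with a carriage return,
    fields are separated by "|" and components by "^".
    """
    for segment in message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID":
            # Pad the name components so a missing given name does not fail.
            family, given = (fields[5].split("^") + [""])[:2]
            return {
                "patient_id": fields[3].split("^")[0],
                "family_name": family,
                "given_name": given,
                "birth_date": fields[7],
            }
    return None


def flatten_json_observation(text):
    """Flatten a nested JSON observation (hypothetical shape) into one row."""
    doc = json.loads(text)
    return {
        "patient_id": doc["patient"]["id"],
        "code": doc["code"],
        "value": doc["value"]["amount"],
        "unit": doc["value"]["unit"],
    }


# Made-up sample inputs.
hl7_message = (
    "MSH|^~\\&|LAB|HOSP|EHR|HOSP|201601011200||ADT^A01|MSG0001|P|2.3\r"
    "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F"
)
json_text = (
    '{"patient": {"id": "12345"}, "code": "glucose", '
    '"value": {"amount": 5.4, "unit": "mmol/L"}}'
)

patient_row = parse_hl7_pid(hl7_message)
observation_row = flatten_json_observation(json_text)
```

Both inputs carry enough structure (delimiters, keys) to be mapped to columns, but neither arrives as a table, which is what separates them from data already held in an RDBMS.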

Page 14: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Archiving

Page 15: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Streaming

Page 16: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 17: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 18: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Implementation

Page 19: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Challenges to Adoption and How to Overcome Them

Page 20: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Poll Question #3

Which challenge has been or would be the greatest barrier for your organization to adopt Hadoop? (137 respondents)

1. People with the right skill sets – 33%
2. Funding hardware costs – 8%
3. Defining the business value – 37%
4. Security concerns – 6%
5. Unsure or not applicable – 16%

Page 21: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 22: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Challenges to adoption

• Organizational
• Buying
• Administering
• Using

Page 23: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Organizational

Stuck in the Mud

Page 24: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Buying

Page 25: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Page 26: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Cloud

Page 27: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Administering

• Fewer experienced people
• Lack of best practices
• Myriad of tools
• Open Source yes – but lots of assembly required
• Security?

Page 28: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Administering

Page 29: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Packaged Solutions

Page 30: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Administering

Page 31: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Invest in your people

Page 32: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Using

• Which SQL on Hadoop?
  • Hive
  • Impala
  • Spark SQL
  • Apache Drill

Page 33: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Tools continue to Evolve

http://www.infoworld.com/article/3131058/analytics/big-data-face-off-spark-vs-impala-vs-hive-vs-presto.html

Page 34: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Don’t Rip and Replace

Page 35: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Meeting in the middle

RDBMS Vendors

• Oracle
• SQL Server
• Teradata
• …

Hadoop Solutions

• Hortonworks
• Cloudera
• MapR
• Cloud
• …

Convergence

Page 36: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Additive Approach

Page 37: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Data Operating System

Page 38: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Demos

Page 39: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Lessons Learned

1. Let use cases drive the need to implement Hadoop. (Be pragmatic.)
2. Think additive.
3. Invest in people now.
4. In general, the Cloud will give you the most flexibility in deploying Hadoop.

Page 40: Exploring How to Use Hadoop in your Healthcare Big Data Strategy

Thank You
