![Page 1: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/1.jpg)
April 10-12, Chicago, IL
Ensuring Compliance of Patient Data with Big Data and BI
Ayad Shammout & Denny Lee
![Page 2: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/2.jpg)
![Page 3: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/3.jpg)
Agenda
A Quick Big Data Primer
Healthcare and Big Data
Compliance and Auditing
  SQL Compliance Project
Compliance and Auditing with Big Data and BI
  Big Data: Unstructured Volumes of Data
  Analytics: PowerPivot, Power View
![Page 4: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/4.jpg)
What is Big Data?
Volume: Exceeds physical limits of vertical scalability
Velocity: Decision window small compared to data change rate
Variety: Many different formats make integration expensive
Variability: Many options or variable interpretations confound analysis
![Page 5: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/5.jpg)
10x increase every five years
85% from new data types
Data explosion
Easy Accessibility of External Data
Cheap, Distributed Storage & Processing
Volume, Velocity, Variety
Hadoop
Cloud
By 2015, organizations that build a modern information management system will outperform their peers financially by 20 percent.
– Gartner, Mark Beyer
“Information Management in the 21st Century”
![Page 6: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/6.jpg)
Large Data Volumes
Non-traditional Data Types
New Technologies
New Data Sources
New Economics
![Page 7: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/7.jpg)
Big Data Business Value
140,000-190,000 more deep analytical talent positions
1.5 million more data-savvy managers in the US alone
$300 billion: Potential annual value to US healthcare
15 out of 17 sectors in the US have more data stored per company than the US Library of Congress
€250 billion: Potential annual value to Europe’s public sector
50-60% increase in the number of Hadoop developers within organizations already using Hadoop within a year
![Page 8: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/8.jpg)
Data becomes the new currency
![Page 9: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/9.jpg)
Hadoop: The most visible face of Big Data
MapReduce Layer: Job tracker, Task trackers
HDFS Layer: Name node, Data nodes
![Page 10: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/10.jpg)
HDInsight: Visit HadoopOnAzure.com
![Page 11: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/11.jpg)
Healthcare and Big Data
![Page 12: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/12.jpg)
Healthcare and IT
Often the laggard in technology
Yet application of IT to healthcare can radically change what we can do
  Genomic sequencing
  Proteomic sequencing
  Incidence prediction
![Page 13: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/13.jpg)
Healthcare Big Data Example Scenarios
Clinical Trial Deviations
  Originally Viagra was developed to lower blood pressure and treat angina
  Now it's used to help treat newborn pulmonary hypertension and altitude sickness
Incidence Prediction
  Missed 4 or more visits: twice as likely to have an asthmatic incident
  A particular cardiac monitor sine wave points to a high likelihood of heart attack
Campaigns
  Social media and advertising campaigns to understand user behavior and sentiment
Patient Satisfaction
  Social media and advertising campaigns to understand user behavior and sentiment
![Page 14: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/14.jpg)
BIDMC Auditing Scenario
Auditing is a critical component of HIPAA in ensuring patient privacy
  1 billion+ rows of audit data
  146 mission-critical clinical applications
  Comprehensive audits yield 300-500k transactions/day
  HIPAA requires audit system with 20 years of data
Auditing Project
  Available to community as part of Compliance SDK
  Updating for SQL Server 2012, HDInsight, Power View, and MobileBI*
Creating an enterprise tool for consolidated storage, reporting and alerting of all application audit data - that's cool!
John Halamka’s Cool Technology of the Week (Wellsphere Top Health Blogger, Health Impact Award)
![Page 15: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/15.jpg)
BIDMC Compliance Project
Audit Logs
ETL Logs to HDFS
SSIS
HDInsight Windows
HDInsight Azure
SQL Server 2008/2012
SSAS (tabular)
Use Excel 2013 PowerPivot and Power View
![Page 16: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/16.jpg)
Auditing Sensitive Information
Querying Audit Information
  Use PowerPivot / Power View / Analysis Services to query the data.
Security Information
Policy Information
Process Audit Information
  Use SSIS to process SQL2008 All-Actions Audit Information and other CG application audit log data; can potentially use the Management Performance DW framework.
Caregroup Environment
File Server
SQL Audit
Connect/Logic
SSIS
CG Application Data
Intersystems Cache
SQL2005
Oracle
SQL2008 All-Actions Audit Data
SQL 2008 / 2012 R2
SSRS 2008 /Power View
Policy Analysis
Policy Reports
Policy Best Practices
Security Analysis
Security Reports
Compliance Reports
Feedback Action Loop: Update systems to keep them compliant and secure
![Page 17: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/17.jpg)
Audit Logs
Storage Infrastructure
Transfer files to ASV via AzCopy, CloudExplorer, etc.
![Page 18: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/18.jpg)
Storage Infrastructure
Hadoop on Azure
  Compute Nodes (Medium VMs)
Azure Storage Vault (ASV)
  Azure Blob Storage
Azure Flat Network Storage
![Page 19: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/19.jpg)
Storage Infrastructure
Hadoop on Azure
  Compute Nodes (Medium VMs)
Azure Storage Vault (ASV)
  Azure Blob Storage
Azure Flat Network Storage
Stream data to compute
Push data back to storage
map, sort, shuffle, reduce
http://dennyglee.com/2013/03/18/why-use-blob-storage-with-hdinsight-on-azure/
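The map, sort, shuffle, and reduce phases named on this slide can be illustrated with a small in-memory sketch. It does not use Hadoop or HDInsight; the audit records, user names, and function names below are hypothetical and chosen only to show how the phases hand data to each other.

```python
from collections import defaultdict

# Hypothetical (user, action) audit records; not taken from the BIDMC data.
RECORDS = [
    ("alice", "SELECT"),
    ("bob", "UPDATE"),
    ("alice", "SELECT"),
    ("carol", "DELETE"),
    ("bob", "SELECT"),
]


def map_phase(records):
    """Map: emit one (key, 1) pair per audit record."""
    return [(user, 1) for user, _action in records]


def sort_and_shuffle(pairs):
    """Sort pairs by key, then group the values so each key reaches one reducer."""
    grouped = defaultdict(list)
    for key, value in sorted(pairs):
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_events_per_user(records):
    """Run the three phases in sequence on an in-memory list."""
    return reduce_phase(sort_and_shuffle(map_phase(records)))


if __name__ == "__main__":
    print(count_events_per_user(RECORDS))
```

In a real cluster the map and reduce steps run on separate compute nodes and the grouped pairs travel over the network, which is why the slide distinguishes streaming data to compute from pushing results back to storage.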
![Page 20: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/20.jpg)
SSIS to HDInsight
![Page 21: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/21.jpg)
SSIS Processing
![Page 22: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/22.jpg)
SSAS Tabular of HoA Audit Data
![Page 23: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/23.jpg)
Hadoop / Auditing: File sizes
Currently testing gz vs. raw
  E.g. 12MB raw text file vs. 633KB gz file (~20x compression)
  20x smaller size, ~same query time
  Approx. same map / reduce task utilization
File size is 250MB-1GB
  SSIS package takes care of the size
Future testing: Avro, protobuf
| Query | Duration (s) |
| --- | --- |
| select count(*) from sql_audit_asv_raw | 56.066 |
| select count(*) from sql_audit_asv_gz | 58.994 |
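The gz vs. raw size comparison on this slide can be tried locally with Python's standard `gzip` module. This is a minimal sketch: the sample text is synthetic and highly repetitive (an assumption for illustration), so its ratio will differ from the roughly 20x reported for the real audit files.

```python
import gzip


def compression_ratio(raw: bytes) -> float:
    """Return the raw size divided by the gzip-compressed size."""
    compressed = gzip.compress(raw)
    return len(raw) / len(compressed)


# Synthetic audit-style log lines; the field layout is hypothetical.
SAMPLE = b"2013-03-18 12:00:00\tuser1\tselect col1, col2 from tableA\n" * 10000

# Ratio implied by the slide's own figures: 12 MB raw vs. 633 KB gz.
SLIDE_RATIO = 12 * 1024 / 633

if __name__ == "__main__":
    print(f"sample raw bytes: {len(SAMPLE)}")
    print(f"sample ratio: {compression_ratio(SAMPLE):.1f}x")
    print(f"slide ratio: {SLIDE_RATIO:.1f}x")
```

The slide's figures work out to about 19.4x, consistent with the "~20x" label.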
![Page 24: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/24.jpg)
Hadoop / Auditing: Formats
For ease of processing, replace carriage returns within embedded SQL statements, e.g.
  select col1, col2 from tableA (split across lines by carriage returns)
to
  select col1, col2 from tableA (on a single line)
This allows you to create a Hive table using CR as the row delimiter (i.e. the data does not need things like SQL quoted identifiers)
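The carriage-return replacement described on this slide could be done in many places (the deck uses SSIS); the following is only a sketch of the idea in Python. The function names and the tab-separated record layout are assumptions, not the project's actual format.

```python
import re


def flatten_statement(sql: str) -> str:
    """Collapse CR/LF runs (and the whitespace around them) into one space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", sql).strip()


def to_log_line(fields) -> str:
    """Join flattened fields with tabs so each audit record fits on one line."""
    return "\t".join(flatten_statement(field) for field in fields)


if __name__ == "__main__":
    multi_line = "select col1, col2\r\nfrom tableA"
    print(flatten_statement(multi_line))
    print(to_log_line(["2013-03-18", "user1", multi_line]))
```

With every record on a single line, a line-delimited Hive table can treat each line as one row. A statement that itself contains tab characters would still break the hypothetical tab-separated layout, so a real pipeline would need to handle the field delimiter as well.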
![Page 25: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/25.jpg)
![Page 26: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/26.jpg)
BI Connectivity
Sqoop, Hive ODBC, Templeton, CSV, etc.
![Page 27: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/27.jpg)
Big Data … Excel-lerated!
2 servers, 3 months: 110 GB of binary files
SSIS
SSIS extraction: 1.2GB of text
120MB gz
Hadoop to PowerPivot: 6MB
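The sizes on this slide imply a reduction factor at each stage, which a few lines of arithmetic make explicit. The stage labels are an interpretation of the diagram, and decimal units (1 GB = 1000 MB) are assumed.

```python
# Sizes in MB as read from the slide; labels are an interpretation of the diagram.
STAGES = [
    ("binary audit files", 110_000),
    ("SSIS-extracted text", 1_200),
    ("gz-compressed text", 120),
    ("PowerPivot workbook", 6),
]


def reduction_factors(stages):
    """Return (from_stage, to_stage, factor) for each consecutive pair of stages."""
    return [
        (src_name, dst_name, src_size / dst_size)
        for (src_name, src_size), (dst_name, dst_size) in zip(stages, stages[1:])
    ]


def overall_factor(stages):
    """Return the size of the first stage divided by the size of the last."""
    return stages[0][1] / stages[-1][1]


if __name__ == "__main__":
    for src, dst, factor in reduction_factors(STAGES):
        print(f"{src} -> {dst}: {factor:.1f}x smaller")
    print(f"overall: {overall_factor(STAGES):.0f}x smaller")
```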
![Page 28: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/28.jpg)
PowerPivot workbook of HoA Audit data
![Page 29: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/29.jpg)
Power View of HoA Audit Data
![Page 30: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/30.jpg)
![Page 31: Ensuring compliance of patient data with big data and bi [bdii 301-m] - (4078)](https://reader034.vdocuments.site/reader034/viewer/2022051612/54c637a94a7959c9388b4641/html5/thumbnails/31.jpg)
Thank you!