HIMSS National Data Warehousing Webinar
Avoiding My Mistakes: 25 Years in Data Warehousing and Business Intelligence
Dale Sanders
Chief Information Officer, Cayman Islands National Health System
HIMSS Webinar, Version 3
You are here….
Why Am I Here Today?
Help you avoid my costly mistakes…
Healthcare is finally becoming more analytically driven
Balancing quality and cost is now mandatory, not optional
Sinking big capital into EMR adoption
But…the majority of the ROI of an EMR resides in the analytic value of the data collected
EMR vendors are not addressing enterprise-wide analytics
Data goes in, but data won’t come out
Vendors and consultants are promising analytic solutions that have failed and will keep failing…leaving organizations stranded in 2-3 years without the data they need to succeed in the Healthcare 2.0 world
Coming to Terms
The topic is data warehousing, which implies technology
But the message is about the blend of data technology, organizational structure, and culture
Analytic Success =
(Analytic Technology) x (Data Literate Culture) x (Organizational Structure)
If any of these trend towards zero, Success trends to zero
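A minimal sketch of why the formula is a product rather than a sum. The 0-to-1 scoring scale and the function name are assumptions made for illustration; the slide does not define how each factor would be scored.

```python
def analytic_success(technology: float, culture: float, structure: float) -> float:
    """Multiply three factor scores, each assumed to lie in [0, 1]."""
    factors = (("technology", technology), ("culture", culture), ("structure", structure))
    for name, score in factors:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name} score must be between 0 and 1, got {score}")
    return technology * culture * structure


# Strong technology and structure cannot compensate for a weak culture.
strong_everywhere = analytic_success(0.9, 0.9, 0.9)
weak_culture = analytic_success(0.9, 0.1, 0.9)

# An additive (averaging) view would hide the weak factor.
additive_view = (0.9 + 0.1 + 0.9) / 3
```

With these made-up scores, the weak-culture product is far below the simple average of the same three numbers, which is the point of the multiplication.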
The Cake Metaphor
Thanks to Dan Lidgard, Catalyst
My Background
Education: Chemistry, Biology, Information Systems Engineering
US Air Force, National Security Agency, Joint Chiefs of Staff: 1983 – 1995
Intel, IBM, Motorola, National Institutes of Health: 1995 – 1997
Intermountain Healthcare, Northwestern University Medical School: 1997 – 2009
Cayman Islands
Data Warehousing/BI
US Air Force
Force Status Monitoring System (FSMS)
Peacekeeper Information Retrieval System (PIRS)
Integrated Minutemen Data Base (IMDB)
National Security Agency
National Nuclear Threat Database (2NTD)
Joint Chiefs of Staff
Strategic Execution Decision Aid
Intel
Integrated Logistics System (ILS)
Enterprise Data Warehouse
Lots of mistakes
Fewer Mistakes
Healthcare Specific Details
Intermountain Healthcare: Enterprise Data Warehouse
Gratis and paid consulting: 11 healthcare organizations in the US and Canada
Healthcare Data Warehousing Association: www.hdwa.org
Northwestern University Medical: Enterprise Data Warehouse
Fewer mistakes
No Mistakes…
Northwestern Context
Three separate organizations, no overarching governance structure
Feinberg School of Medicine: researchers, research data sets, genomic data
Northwestern Medical Faculty Foundation: physician faculty group (~750), Epic clinicals, IDX revenue cycle
Northwestern Memorial Hospital: Cerner clinicals, legacy HBOC revenue cycle and case mix
8 years of EHR data @ EDW project start
Northwestern EDW Funding
Actual spend: 47% less, bottom line
How?
Great people: efficient and productive
Microsoft BI technology stack
No mistakes in strategy, no rework
Sanders’ Hierarchy of Analytic Needs
Thank you Dr. Maslow
Make sure you take care of the lower levels, first
Turn these “analytics” into a utility: fast, repeatable, cheap to produce
There is no positive differentiating value in these lower levels…everyone must meet them…they are utilitarian
Move as quickly as possible to the higher levels, technically and culturally
Non-Teaching Facilities
Level 5: ACO Optimization (Healthcare 2.0; Intermountain, Kaiser, Geisinger)
Level 4: Payer Financial Incentives (P4P, MU, PQRS)
Level 3: Professional Societies (STS, NRMI, Trauma)
Level 2: Accreditation (JCAHO, NCQA, HEDIS)
Level 1: Compliance & Regulatory (CMS, EMTALA, HIPAA, SOX, GLBA)
Publicly available cohort and metrics definitions
Academic Medical Centers
Level 6: ACO Optimization (Healthcare 2.0; Intermountain, Kaiser, Geisinger)
Level 5: Translational Research (Federal and private research agendas)
Level 4: Payer Financial Incentives (P4P, MU, PQRS)
Level 3: Professional Societies (STS, NRMI, Trauma)
Level 2: Accreditation (JCAHO, NCQA, HEDIS)
Level 1: Compliance & Regulatory (CMS, EMTALA, HIPAA, SOX, GLBA)
Publicly available cohort and metrics definitions
Next Slide…The Data Warehouse Blueprint
Is old and musty… been with me for 20 years
Unfortunately, it still applies
Unfortunate because the symptoms addressed by data warehousing still exist…not much has changed to solve the root cause
Monolithic transaction systems don’t meet the need for enterprise analytics
Virtual “federated” data warehouses are not yet viable and there are no indications that will change soon
EDW Blueprint
[Figure: EDW blueprint diagram. Recoverable labels follow.]
Source Data Content (internal and external): Academic, Supplies, State, Clinical, Financial, HR, Others
Staging and Preprocessing: Academic, Supplies, State, Clinical, Financial, HR, Others
Customized Data Marts: Financial Events, Clinical Events, Operational Events, Disease Registries, Research Registries, Compliance & Payer Measures
Data Analysis: OLAP Tools, Microsoft Access/ODBC, Web applications, Excel, SAS, SPSS, et al.
Surrounding layers: Metadata Repository (The “Yellow Pages”); Master Reference & Vocabulary Data; Access Control, Security, and Auditing; Governance Framework; Servers, Storage, Database, and Tools Infrastructure
Data Modeling Options
Enterprise data model
Dimensional data model
Bus model
I2B2
Federated
So…?
There is no single data model that will meet all analytic needs in a healthcare data warehouse
Don’t believe anyone, vendor or otherwise, who tries to sell or tell you something different
If you peeled open the most successful data warehouses and looked inside, you would not see a single data modeling strategy…you would see several data modeling approaches, tailored to specific needs, layered on top of a data bus architecture
For Example…
[Figure: source systems connected to the data bus: Cancer Registry, Mammography, Radiology, Pathology, Laboratory, Continuing Care and Follow-Up, Quality of Life Survey, Electronic Health Plans Claims, Ambulatory Casemix, Acute Care Casemix]
An Integrated Analytic Data Model For Cancer Care
Oncology Data Integration Strategy
Top down metrics & research requirements
Disparate Sources “connected” semantically to the data bus
Master Data in the Bus
Patient ID*
Provider ID*
Encounter ID
Facility ID
Payer/Carrier ID
Department ID
Region ID
Postal Code
CPT Code
DRG Code
ICD Diagnosis Code
Charge Code
Patient Type
Gender
Date/Time
* - The most important, initially, to standardize
Standard format and content is critical
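A minimal sketch of what standardizing bus columns could look like, assuming two source marts that use different local patient identifiers, gender codes, and date formats. The crosswalk, the code mappings, the column names, and the function names are all hypothetical and are not taken from the presentation.

```python
from datetime import date, datetime

# Hypothetical crosswalk: (source system, local patient id) -> enterprise Patient ID
PATIENT_CROSSWALK = {
    ("billing", "000123"): "P-1",
    ("lab", "LAB-77"): "P-1",
    ("lab", "LAB-78"): "P-2",
}

# Hypothetical local gender codes mapped to one standard vocabulary
GENDER_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "2": "F"}

# Date formats assumed to occur in the source systems
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def standard_date(text: str) -> date:
    """Parse a source date string into one standard date type."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def conform(source: str, row: dict) -> dict:
    """Return a copy of row with bus columns added; source columns are kept as-is."""
    conformed = dict(row)
    conformed["patient_id"] = PATIENT_CROSSWALK.get((source, row["local_patient_id"]))
    conformed["gender"] = GENDER_CODES.get(str(row["sex"]).strip().lower(), "U")
    conformed["event_date"] = standard_date(row["date"])
    return conformed


billing_row = {"local_patient_id": "000123", "sex": "Male", "date": "03/15/2010", "charge_code": "CHG-001"}
lab_row = {"local_patient_id": "LAB-77", "sex": "1", "date": "2010-03-16", "test": "HbA1c"}

bus_rows = [conform("billing", billing_row), conform("lab", lab_row)]

# Both rows now join on the same enterprise Patient ID.
same_patient = bus_rows[0]["patient_id"] == bus_rows[1]["patient_id"]
```

The sketch adds the standardized columns next to the local values instead of overwriting them, so the source data stays available in its original form.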
Don’t Start From Scratch: Opportunities For Reuse
Source System → EDW Operational Data Store (Source Mart) → Master Data Bus → Subject Area Data Mart → Reporting Logic
(The diagram shows three source systems, each feeding its own source mart.)
Don’t Over-Think This Stuff…
“The art of being wise is knowing what to overlook.” William James, Principles of Psychology, 1890
My Biggest Mistakes
1. Believing a star schema data model would meet all the needs of healthcare analytics
2. Conducting enterprise, top down data modeling before coalescing any source data
3. Normalizing all vocabulary to a “standard” before making it available in the data warehouse
4. Allowing multiple “enterprise data warehouses” for different categories of users (e.g., research, strategy, quality, finance)
5. Underestimating the number of source systems and the demand for adaptability of the data model
Design & Plan for Content Growth
My Biggest Mistakes (cont)
6. Overinvesting money and expectations in tools like BusinessObjects and Cognos that are tightly coupled to the data model
People love Excel…you can go a long way towards success by simply delivering rows and columns of data to Excel users
7. Underestimating the importance of a good metadata repository
8. Embedding analytic logic in the ETL and data model vs. the reporting layer
9. Data literacy: Failing to create a culture of metrics-driven, continuous process improvement that could utilize the data
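To illustrate the mistake of embedding analytic logic in the ETL and data model, here is a minimal sketch of the alternative: the load keeps granular rows free of business rules, and the metric definition lives in the reporting layer as a parameter. The "long stay" metric, its thresholds, and all names are hypothetical examples, not definitions from the presentation.

```python
from datetime import date

# Granular rows as the ETL would load them: no derived flags, no business rules.
encounters = [
    {"encounter_id": 1, "admit": date(2010, 1, 1), "discharge": date(2010, 1, 3)},
    {"encounter_id": 2, "admit": date(2010, 1, 5), "discharge": date(2010, 1, 12)},
    {"encounter_id": 3, "admit": date(2010, 2, 1), "discharge": date(2010, 2, 6)},
]


def long_stay_rate(rows, threshold_days):
    """Reporting-layer metric: share of encounters whose length of stay exceeds the threshold."""
    if not rows:
        return 0.0
    long_stays = sum(1 for r in rows if (r["discharge"] - r["admit"]).days > threshold_days)
    return long_stays / len(rows)


# Changing the definition is a parameter change, not a reload of the warehouse.
rate_at_4_days = long_stay_rate(encounters, threshold_days=4)
rate_at_6_days = long_stay_rate(encounters, threshold_days=6)
```

If a "long stay" flag had been computed during the load with one fixed threshold, the second question would require reworking the ETL and reloading the data.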
My Biggest Mistakes (cont)
10. Hiring the wrong kind of personalities on the data warehouse team
11. Data security overkill – too many roles ruined the cake
12. Data stewardship overkill – Librarians, not guardians!
13. “Cleaning” source system data after loading into the EDW
14. Bonus Mistake: Believing I needed Oracle MPP architecture for scalability
The Good and Bad of Kimball
Good News: The book gave the IT specialty a vocabulary
Data warehouse
Data mart
Extract, Transformation, Loading (ETL)
Data staging areas
Bad News: His concepts became dogma
Star schemas, dimensions, facts, and grains
“This, and only this, is a data warehouse.”
Dimensional data modeling paints healthcare data analysis into a corner… and it’s happening quite often in healthcare right now
Other Popular, Misinformed Authors
Consultants with no hands-on, year-after-year, operational data warehousing experience
Claudia Imhoff
Bill Inmon
The problem is, our mid-level healthcare data warehouse managers are reading these authors and leaving common sense at the door
Who do you want to please? Ralph Kimball or your data analysts?
Give the analysts what they want. Not what you want them to want.
Governance, Business, Financial Model
Start lean and simple… grow as needed
Minimal governance, maybe none, initially
Seed the data warehouse with just enough money to prove its value, then give it a bit more, and iterate
Expect a 6 month “Time To Value” (TTV)
Report administratively to the CIO, but operationally to the major customers in Research, Quality, and Finance
The CIO can ensure access to the source system data and a “play friendly” attitude towards the data warehouse
Common problem: Source systems do not perceive the EDW as a priority
Fatal flaw: All planning & governance, no data warehouse
Organizational Alignment
Director of Analytics & Data Warehousing
Administratively reports to the CIO
Access to the source system data is the lifeblood of the data warehouse
60/40 split of centralized/business unit data analysts
Matrixed, customer-centric reporting to the business and clinical customers
Clinical Research
Clinical Quality
Finance & Compliance
Strategic Planning
Hiring for Success
The best data warehouse professionals are difficult to find
They do not have your typical IT personality
They have personal & communication skills like Business Analysts
Must have great social skills
They have technical skills like Programmers
Must understand SQL and data modeling
They have business acumen like your customers
Must understand the business context of data
“Did you build the XYZ data warehouse?” “No. I built the teams that built the data warehouse.”
Technology Infrastructure
Three mid-range, scalable servers, initially: staging, production, and data analysis tools
Microsoft SQL Server and BI Stack
Best available in the market
Lowest TCO, by far
Performance and scalability are not a problem, contrary to popular rumor
Nicely integrated and bundled tools specifically for DW and BI
Index everything!
Sometimes, the indexes are 10 times larger than the data
Storage: Low transaction rates, very wide data sets
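A small illustration of the "Index everything!" advice. It uses Python's built-in sqlite3 module as a stand-in, not the Microsoft SQL Server stack named on the slide, and the table and index names are hypothetical. It only shows that the query planner switches from a full scan to an index lookup once the filtered column is indexed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounter ("
    "encounter_id INTEGER PRIMARY KEY, patient_id TEXT, facility_id TEXT, charge REAL)"
)
conn.executemany(
    "INSERT INTO encounter (patient_id, facility_id, charge) VALUES (?, ?, ?)",
    [(f"P-{i % 100}", f"F-{i % 5}", float(i)) for i in range(1000)],
)

query = "SELECT * FROM encounter WHERE patient_id = ?"


def plan(sql, params):
    """Return SQLite's query-plan detail strings for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Without an index on patient_id the planner has to scan the whole table.
plan_before = plan(query, ("P-7",))

conn.execute("CREATE INDEX idx_encounter_patient ON encounter (patient_id)")

# With the index in place the plan names the index instead of a table scan.
plan_after = plan(query, ("P-7",))
```

Analytic workloads read far more than they write, which is why trading storage for indexes, as the slide suggests, tends to be acceptable in a warehouse.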
New, Cool Analysis Tools
Tableau
QlikView
Tibco Silver Spotfire
Cloud-based analysis: BIRST, MyDials, PivotLink
Message for CIOs and IT Managers: Stop trying to impose a “standard” tool for BI. Let the analysts choose the right suite of BI tools for them.
Source Data Content
Grab as much granular data as possible, whether you see a need for it now or not
Someone will eventually ask for it… and storage is cheap
Go after the largest repositories of organizational data first
Volume of data => Value of data
Quality of data = Data Completeness x Validity
Billing and case mix
Lab, radiology
EMR
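The relationship Quality of data = Data Completeness x Validity can be sketched for a single column as follows. The slide does not define either term precisely, so this assumes completeness is the share of values that are present and validity is the share of present values that pass a check. The function names and the sample column are hypothetical.

```python
def _present(values):
    """Values that are neither None nor blank."""
    return [v for v in values if v is not None and str(v).strip() != ""]


def completeness(values):
    """Share of values that are present."""
    return len(_present(values)) / len(values) if values else 0.0


def validity(values, is_valid):
    """Share of present values that pass the is_valid check (assumed definition)."""
    present = _present(values)
    if not present:
        return 0.0
    return sum(1 for v in present if is_valid(v)) / len(present)


def data_quality(values, is_valid):
    """Quality of data = Data Completeness x Validity."""
    return completeness(values) * validity(values, is_valid)


# Eight rows: two missing, one present but invalid.
gender_column = ["M", "F", None, "F", "X", "", "M", "F"]
score = data_quality(gender_column, lambda v: v in {"M", "F"})
```

Here 6 of 8 values are present and 5 of those 6 are valid, so the score is 0.75 x 5/6 = 0.625.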
So… What’s Next in Healthcare BI and Data Warehousing?
Linking Evidence, EMR & Analytics
British Medical Journal: 1,500 knowledge engineers; 7,000 journals; 20,000,000 papers
Evidence based best practices
Evidence based metrics of best practices
Cerner PowerChart Action Sets
Cerner Chronic Condition Management
Evidence
Analytics
The Four Questions
Who are we monitoring? Identify patients with condition of interest (Chronic Condition Management)
What are we measuring? Measure and display condition outcomes and care (Chronic Condition Management)
What are our goals? Local targets for patient outcomes and care
How will we achieve them? Evidence Based Medicine orders and reference information built into EMR and workflow (Action Sets)
Ongoing Local Improvement
Project Overview
Evidence-based medicine solution:
• Locally developed
• Cerner + BMJ enabled
• Local content design using BMJ evidence-based starting point
• Clinical engagement and governance
Localized Action Sets built into Cerner workflow (local knowledge and evidence at the point of care):
• Orders (meds, tests)
• Recommendations
Patient care:
• Better informed
• More consistent
• Better co-ordinated
[Figure labels: Physician, Patient, Cerner CCM reports, Better information]
Assertion
Treating (and preventing) disease is our primary mission in healthcare
You can’t treat (or prevent) what you can’t define
Standardized, data-driven definitions of diseases are largely lacking in healthcare
Three healthcare providers, three significantly different definitions of “diabetes mellitus”
Patient registries are a means to standardize our definitions
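A minimal sketch of what a data-driven registry definition could look like, and of how two different definitions of the same disease produce different cohorts from the same data. The definitions are illustrative only and are not vetted clinical criteria. The ICD-9-CM prefix 250 (diabetes mellitus) is used as an example, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryDefinition:
    name: str
    diagnosis_prefixes: tuple       # diagnosis-code prefixes that qualify a patient
    min_qualifying_encounters: int  # number of distinct coded encounters required


def cohort(definition, diagnoses):
    """Return the set of patient ids that meet the definition.

    diagnoses is an iterable of (patient_id, encounter_id, diagnosis_code).
    """
    encounters_by_patient = {}
    for patient_id, encounter_id, code in diagnoses:
        if code.startswith(definition.diagnosis_prefixes):
            encounters_by_patient.setdefault(patient_id, set()).add(encounter_id)
    return {
        patient_id
        for patient_id, encounter_ids in encounters_by_patient.items()
        if len(encounter_ids) >= definition.min_qualifying_encounters
    }


diagnoses = [
    ("P-1", "E1", "250.00"),
    ("P-1", "E2", "250.02"),
    ("P-2", "E3", "250.00"),
    ("P-3", "E4", "401.9"),
]

loose = RegistryDefinition("diabetes (any coded encounter)", ("250",), 1)
strict = RegistryDefinition("diabetes (two coded encounters)", ("250",), 2)

loose_cohort = cohort(loose, diagnoses)
strict_cohort = cohort(strict, diagnoses)
```

Because the definition is plain data, it can be published, compared across organizations, and versioned, which is the kind of standardization the slide argues registries make possible.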
Large n Disease Registries
Asthma
Breast cancer
Cataracts
Chronic lymphocytic leukemia
Chronic obstructive pulmonary disease
Colorectal cancer
Community acquired bacterial pneumonia
Coronary artery bypass graft
Coronary artery disease
Coumadin management
Diabetes
End stage renal
Gastroesophageal reflux disease
Glaucoma
Heart failure
Stroke (Hemorrhagic and/or Ischemic)
High risk pregnancy
HIV
Hypertension
Lower back pain
Macular degeneration
Major depression
Migraines
MRSA/VRE
Multiple myeloma
Myelodysplastic syndrome & acute leukemia
Myocardial infarction
Obesity
Osteoporosis
Ovarian cancer
Preoperative antibiotic prophylaxis
Prostate cancer
Rheumatoid Arthritis
Sickle Cell
Systemic Lupus
Upper respiratory infection (3-18 years)
Urinary incontinence (women over 65)
Venous thromboembolism prophylaxis
Every EMR should include these data definitions and registries as a standard feature
Lifestyle & Data Warehousing
60% of claims and 72% of healthcare costs are lifestyle related (Association for Health Services Research)
Diet, drugs, drinking, smoking
We don’t need sophisticated data warehouses to tell us what we can clearly see
Analytics can help us identify better treatments for these lifestyle issues
But, will analytics be able to affect lifestyle and behavior? Probably not much…
The Breakthrough Opportunity for Data Warehousing
Finding cures for, and ways to prevent, small n diseases that cause disproportionately high human suffering and are NOT lifestyle related
Amyotrophic Lateral Sclerosis
Alzheimer's
Hemophilia
Hodgkin's Disease
Multiple Sclerosis
Rett Syndrome
Scleroderma
Our national healthcare policy should address the creation of national registries for small n diseases
Don’t Get Fooled By NLP
Natural Language Processing: analysis of large collections of text for patterns and knowledge
National Security Agency: billions of $$, very challenging to find value in NLP
The difference between inference and measurement
Inference => Text data mining
Measurement => Discrete data mining
Walk before you run
Healthcare needs measurement right now
NLP is cool and will be important in the future: Watson is coming
Include text data in your data warehouse content, as a hedge for the future
IBM Watson & Data Warehousing
The most exciting computer science event in my 30-year career
But…don’t assume that the knowledge inference to win at Jeopardy is transferable to the problems we face in healthcare
Watson analyzed and clustered reams of human knowledge, captured in text
Our EMR text is not knowledge– it’s observations and treatment
Most of our text knowledge is captured in a few journals that don’t necessarily apply outside the clinical trial
In Closing…
Be wary of what vendors and consultants will try to sell you
There are no off-the-shelf products that can meet all of your data warehousing/BI needs
There are very few data warehousing consultants with enough experience to be entrusted with your future… you don’t want to rebuild your data warehouse in 2-3 years because it’s not working
Don’t forget to build a data literate culture that can take advantage of the technology
Join and participate in the Healthcare Data Warehousing Association www.hdwa.org
Thank You!