EPIC EMR to OMOP CDM to Research Data Mart: An … · 2017-04-06





Elicit data requirements -> Identify primary data source -> Identify needed subset of source data components -> Identify cohort -> Build and populate data mart -> Test and validate data with the user -> Develop user manuals

Background

In this research data delivery project, we explored a less-traveled path of building a clinical “data mart” for a registry study on kidney transplant patients based on our institutional OMOP database.

Project Goals

First, we supported the study by providing access to data
o Provide ongoing access to the up-to-date clinical data on kidney transplant patients that the study team can use to answer the research questions

Second, we learned what it takes and how it could scale
o Learn about building a study data product, based on specific solution choices.
o Assess feasibility of generalizing this approach for other studies that rely on EMR data; identify generalizable components

References

1. Observational Health Data Sciences and Informatics (OHDSI) Website: https://www.ohdsi.org/

2. Huser V, DeFalco FJ, Schuemie M, Ryan PB, Shang N, Velez M, Park RW, Boyce RD, Duke J, Khare R, Utidjian L, Bailey C. Multisite Evaluation of a Data Quality Tool for Patient-Level Clinical Data Sets. EGEMS (Wash DC). 2016 Nov 30;4(1):1239. doi: 10.13063/2327-9214.1239. eCollection 2016.

3. User acceptance testing framework: https://usersnap.com/blog/types-user-acceptance-tests-frameworks/

Acknowledgements

• This project is supported by the UCSF Clinical and Translational Science Institute (CTSI), part of the Clinical and Translational Science Award program funded by the National Center for Advancing Translational Sciences (Grant Number UL1 TR000004) at the National Institutes of Health (NIH).

• We thank the UCSF pSCANNER team, PI Dr. Mary Whooley, MD, project manager Nirupama Krishnamurthi, MPH, and the UCSF IT EIA team that implemented our institution’s instance of the OMOP database, for all their support and inspiration to use OMOP CDM for research.

EPIC EMR to OMOP CDM to Research Data Mart: An Unmaintained Road or a Highway?

Oksana Gologorskaya, MS¹, Meyeon Park, MD², Debbie Huang, MS³, Robert Hink, PhD, MBA⁴, Vijaykumar Rayanker, MS⁵, Nelson Lee, MA, MBA⁶, Hasan Bijli, BS, MBA⁷, Govardhan Giri, MBA⁸, Amit Shetty, BS⁹, Leslie Yuan, MPH¹⁰, Mark Pletcher, MD, MPH¹¹

¹⁻¹¹ University of California, San Francisco, CA

START HERE:
• Researcher needs access to extensive up-to-date clinical information on kidney transplant patients to support a long-term registry study

Solution choices, methods and the questions we had

o Delivery format: data mart built from the institutional OMOP data warehouse
   o When is it appropriate to use OMOP DB as the primary source of EMR data for research?
   o Data mart implementation process: what is generalizable? What resources and time does it take?
   o What else should the research team get besides access to the data mart? E.g. documentation (user manual, including data limitations), other resources?

o Primary data source: subset of institutional EMR (Epic) data available in OMOP DB
   o What about adding other data sources, e.g. pathology data or kidney transplant data?

o Deliverables: data mart access, documentation (user manual for data access), including data limitations

o QA and data validation: user-centered approach with user acceptance testing and data validation procedures
   o What are the researcher’s expectations about the quality of data?
   o General best practices and understanding of working with EMR data

o Important questions that came up in the process:
   o How can we help the researcher use the imperfect data that’s available?
   o When is it right to build a data mart? What kind of projects and what kind of study teams can fully benefit from it?

DATA REQUIREMENTS
• Most of the required data are available in the EMR DB, Epic Clarity.
• Need lab results, medications, health conditions, vitals, other observations (imaging etc.) pre- and post-transplant
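
The pre- and post-transplant requirement can be illustrated with a minimal sketch. It assumes a hypothetical `transplant` table holding one transplant date per patient and a simplified `measurement` table; the `measurement_name` column and the sample values are illustrative only, not taken from the project. Python's built-in sqlite3 stands in for SQL Server so the example runs anywhere.

```python
import sqlite3

# Hypothetical, simplified tables; real OMOP tables carry many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transplant (person_id INTEGER PRIMARY KEY, transplant_date TEXT);
CREATE TABLE measurement (
    person_id INTEGER,
    measurement_name TEXT,      -- illustrative column, not an OMOP field
    measurement_date TEXT,
    value_as_number REAL
);
INSERT INTO transplant VALUES (1, '2015-06-01'), (2, '2016-01-15');
INSERT INTO measurement VALUES
    (1, 'creatinine', '2015-05-20', 4.1),
    (1, 'creatinine', '2015-07-01', 1.3),
    (2, 'creatinine', '2016-01-15', 3.8),
    (2, 'creatinine', '2016-03-01', 1.1);
""")


def labs_with_period(conn):
    """Label each lab result as 'pre' or 'post' relative to the transplant date.

    ISO-8601 date strings compare correctly as text, so no date parsing is needed.
    Results dated on the transplant day are labelled 'pre'; that cutoff is an
    arbitrary choice for this sketch and should be agreed with the study team.
    """
    sql = """
        SELECT m.person_id, m.measurement_name, m.measurement_date,
               m.value_as_number,
               CASE WHEN m.measurement_date <= t.transplant_date
                    THEN 'pre' ELSE 'post' END AS period
        FROM measurement AS m
        JOIN transplant AS t ON t.person_id = m.person_id
        ORDER BY m.person_id, m.measurement_date
    """
    return conn.execute(sql).fetchall()


rows = labs_with_period(conn)
for row in rows:
    print(row)
```

The same join pattern would apply to medications, conditions, and vitals, each with its own date column.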

WAIT
Custom queries that pull data scattered all over the EMR DB, and that must be repeated for every data refresh, would not scale.

HOW could we meet these needs by spending LESS EFFORT, and getting MORE VALUE in the future?

The 5 Things We Learned

1. Data is never perfect, but you can still trust it if you understand it!
   To use the data in the best way, and to trust our data, we need to understand its limitations. Present and analyze the data along with the limitations, based on the level of evidence the data provides.

2. The study team’s involvement in the quality control and validation of the data was extremely effective.
   We adopted a User Acceptance Testing method as part of our data delivery process. We developed a user acceptance testing procedure for the research data mart that may now be used as a model for all research data delivery projects at UCSF.

3. Setting expectations with the researcher is important.
   Set expectations with the researcher about the quality of the data, the complexity of the data, and the necessity of their involvement in the process of data delivery.

4. Advantages of using an OMOP-based vs. Epic CLARITY data source
   ✓ OMOP is a research-oriented data model. It is an alternative to CLARITY reports, with potentially faster access, and is easy enough for a skilled analyst to use independently.
   ✓ OMOP CDM is open source. No need to go to CLARITY training to learn the data model.
   ✓ Common data models (CDMs) shared across many organizations allow the same analytical code to be executed on multiple distributed data sets. In some cases, adherence to a CDM is a prerequisite for participating on a grant (or research network). [2]

5. OMOP data quality issues we found sparked an internal OMOP QA initiative.
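
As an illustration of the kind of automated checks that could accompany user acceptance testing, here is a minimal sketch. The table `mart_lab`, its columns, and the three checks are assumptions made for this example; they are not the procedure developed in the project, and they complement rather than replace review by the study team.

```python
import sqlite3
from datetime import date

# Hypothetical data mart table with a few deliberately problematic rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mart_lab (person_id INTEGER, lab_date TEXT, value_as_number REAL);
INSERT INTO mart_lab VALUES
    (1, '2015-05-20', 4.1),
    (1, '2015-07-01', NULL),   -- result value missing
    (2, '2999-01-01', 1.1);    -- date in the future
""")


def run_checks(conn, today):
    """Return {check_name: number_of_offending_rows} for a few simple checks."""
    checks = {
        "missing_value": (
            "SELECT COUNT(*) FROM mart_lab WHERE value_as_number IS NULL", ()),
        "future_date": (
            "SELECT COUNT(*) FROM mart_lab WHERE lab_date > ?",
            (today.isoformat(),)),
        "missing_person": (
            "SELECT COUNT(*) FROM mart_lab WHERE person_id IS NULL", ()),
    }
    return {
        name: conn.execute(sql, params).fetchone()[0]
        for name, (sql, params) in checks.items()
    }


report = run_checks(conn, date(2017, 4, 6))
for name, bad_rows in report.items():
    print(f"{name}: {bad_rows} offending row(s)")
```

A report like this gives the researcher concrete counts to review, which supports documenting data limitations alongside the data.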

Implementing a research data mart: what can be streamlined?

• Study-specific, manual work
• Reusable method
• Reusable code, tools and deliverables, and much faster execution in repeat projects

We believe that building OMOP-based data marts is a very efficient way to deliver data for research because, for the next similar project, we can replicate this solution, plug in a new cohort, and be done!
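
The "plug in a new cohort" idea can be sketched as a build function whose only study-specific input is a list of patient IDs. This is a hedged illustration, not the project's DataStage implementation: the function `build_data_mart`, the `mart_` prefix, and the two-column source tables are hypothetical, and sqlite3 stands in for SQL Server.

```python
import sqlite3

# Domain tables to copy. The names follow OMOP conventions, but the columns
# used below are simplified for this sketch.
DOMAIN_TABLES = ("condition_occurrence", "drug_exposure")


def build_data_mart(conn, cohort_person_ids, domain_tables=DOMAIN_TABLES):
    """Rebuild mart_<table> for each domain table, restricted to the cohort."""
    conn.execute("DROP TABLE IF EXISTS mart_cohort")
    conn.execute("CREATE TABLE mart_cohort (person_id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO mart_cohort (person_id) VALUES (?)",
        [(pid,) for pid in sorted(set(cohort_person_ids))],
    )
    counts = {}
    for table in domain_tables:
        # Table names come from a fixed tuple in code, never from user input.
        conn.execute(f"DROP TABLE IF EXISTS mart_{table}")
        conn.execute(
            f"CREATE TABLE mart_{table} AS "
            f"SELECT s.* FROM {table} AS s "
            f"JOIN mart_cohort AS c ON c.person_id = s.person_id"
        )
        counts[table] = conn.execute(
            f"SELECT COUNT(*) FROM mart_{table}").fetchone()[0]
    conn.commit()
    return counts


source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE condition_occurrence (person_id INTEGER, condition_name TEXT);
CREATE TABLE drug_exposure (person_id INTEGER, drug_name TEXT);
INSERT INTO condition_occurrence VALUES
    (1, 'kidney transplant'), (2, 'kidney transplant'), (3, 'hypertension');
INSERT INTO drug_exposure VALUES
    (1, 'tacrolimus'), (1, 'prednisone'), (3, 'lisinopril');
""")

first_cohort_counts = build_data_mart(source, [1, 2])  # first study
second_cohort_counts = build_data_mart(source, [3])    # same code, new cohort
print(first_cohort_counts, second_cohort_counts)
```

Only the cohort list changes between the two calls; the extraction logic is reused unchanged, which is the reuse the workflow above aims for.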

Implementation Highlights:
• ETL/Data integration tool: IBM InfoSphere DataStage
• Data flow: UCSF EPIC CLARITY EMR -> UCSF OMOP DB -> Research Datamart
• UCSF OMOP version: v4, being upgraded to v5
• Source DB platform: SQL Server
• Target DB: SQL Server
• Refresh frequency: Weekly
• Datamart access for study data analysts to query directly in the DB or from SAS.
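
One way a scheduled refresh could be made safely repeatable is a full reload inside a single transaction, sketched below. This is an assumption about a possible design, not a description of the DataStage jobs used here; the table names (`omop_drug_exposure`, `mart_drug_exposure`, `mart_cohort`) are hypothetical and sqlite3 stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE omop_drug_exposure (person_id INTEGER, drug_name TEXT, start_date TEXT);
CREATE TABLE mart_cohort (person_id INTEGER PRIMARY KEY);
CREATE TABLE mart_drug_exposure (person_id INTEGER, drug_name TEXT, start_date TEXT);
INSERT INTO mart_cohort VALUES (1);
INSERT INTO omop_drug_exposure VALUES
    (1, 'tacrolimus', '2015-06-02'),
    (9, 'aspirin', '2015-01-01');   -- patient 9 is not in the cohort
""")


def refresh_mart(conn):
    """Reload the mart table from the source; return the row count afterwards."""
    with conn:  # one transaction: commit on success, roll back on any error
        conn.execute("DELETE FROM mart_drug_exposure")
        conn.execute("""
            INSERT INTO mart_drug_exposure (person_id, drug_name, start_date)
            SELECT d.person_id, d.drug_name, d.start_date
            FROM omop_drug_exposure AS d
            JOIN mart_cohort AS c ON c.person_id = d.person_id
        """)
    return conn.execute("SELECT COUNT(*) FROM mart_drug_exposure").fetchone()[0]


first = refresh_mart(conn)
second = refresh_mart(conn)  # re-running does not duplicate rows
conn.execute("INSERT INTO omop_drug_exposure VALUES (1, 'prednisone', '2015-06-03')")
third = refresh_mart(conn)   # new source rows appear after the next refresh
print(first, second, third)
```

Because the delete and the reload share one transaction, a failed refresh leaves the previous week's data in place for analysts.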

Contact
Oksana Gologorskaya
Sr. Product Manager, Research Technology
http://profiles.ucsf.edu/oksana.gologorskaya
Clinical & Translational Science Institute (CTSI)
University of California, San Francisco (UCSF)
550 16th St, 6th Floor, San Francisco, CA 94143-0558