Translational Research IT (TraIT)
“TraIT and OpenClinica: partners in translational research”
Marinel Cavelaars, Cuneyt Parlayan, Jacob Rousseau, Sander de Ridder, Jan Willem Boiten and Jeroen Beliën
Boston; June 21st 2013
Overview
• Introduction and background
– CTMM
– Translational Research
• TraIT
– Three real-life examples: OpenClinica, BMIA, tranSMART
• OpenClinica.com – TraIT partnership
• CTMM-TRACER and OpenClinica by Sander de Ridder
– Scripts, Long Lists, Tools developed
– Things we learned/found useful
Who am I?
• My name: Jeroen Beliën, PhD, MSc
• Associate Professor, medical informatics, dept. of Pathology, VU University medical center, Amsterdam
– Digital Pathology, Image processing, IT in translational research
– String of Pearls
– IT-lead 2 CTMM projects: DeCoDe and TRACER
– CTO CTMM-TraIT
– BioMedBridges
• Member of taskforce Stichting Palga
– Palga: Dutch National Electronic Pathology Archive
• Faculty member of NBIC
CTMM, TIPharma and BMM offer an integrated approach for innovations in
the Dutch health care sector
CTMM: diagnosis
• Early detection of disease by in-vitro and in-vivo diagnostics
• Stratification of patients for personalized treatment
• Assessing efficiency and efficacy of medicines by imaging
• Image guided delivery of medication
• Focus on cancer, cardiovascular, neurodegenerative and infectious/autoimmune diseases
TIPharma: drugs
• Translational research on novel pharmaceutical therapies
• Target finding, animal models and lead selection
• Drug formulation, delivery and targeting
• Special Theme focusing on the efficiency of the process of drug development
BMM: devices
• Smart drug delivery systems
• Innovations in contemporary organ replacement therapies
• Passive and active scaffolds, including cell signalling functions
Image guided drug delivery
Biomarkers
Drug delivery
Imaging for regenerative medicine
Public-private partnerships: Financial model
Subsidy: 50% of research cost
CTMM projects: € 300 mln
• Government (subsidy): € 150 mln (50%)
• Academia: € 75 mln
• Industry: € 37,5 mln cash + € 37,5 mln in kind
CTMM projects
Breast
Prostate
Colon
Lung
Leukemia
Heart Failure
Stroke
Diabetes
Kidney Failure
Arrhythmia
Peripheral Vascular Disease
Thrombosis
Alzheimer
Rheumatoid Arthritis
Sepsis
Translational research process
Guiding principle: connecting phenotype to biology
[Figure labels: Patient enters medical center; Clinical Procedures; Imaging; Samples; Experiments; Electronic Health Record; Image database; Biobank database; Clinical database; External data; Data Integration; Downstream analysis; Experimental data; Scientific Output; Intellectual Property; Improved Healthcare]
TraIT consortium - Started Oct. 2011
Status 2013: 26 partners
Growing TraIT project team
The TraIT approach
• IT infrastructure = main goal
– No research on the side
• Workflow-oriented approach
– Create data pipelines to link data production and data analysis
• User driven priority setting
– Regular reprioritization possible (agile)
• Avoid reinventing wheels
– Adopt/adapt existing technology and expertise
• Connect with other initiatives
– Organizations (NBIC, EBI, PSI, IMI, etc.)
• Think big; start small; act now
– Short term focus on immediate needs of CTMM projects
Division in work packages
Four data generating work packages
Data integration & analysis across the four platforms
Shared service center for hardware, training & support
TraIT has been subdivided into four work packages (WPs) supporting data generating domains, and two work packages dealing with the overarching TraIT requirements: data integration and professional support, respectively.
WP 5 Core Infrastructure
WP 6 Deployment
Imaging Data
High-level TraIT data flows
[Figure: data flow from Hospital (IT) to Translational Research (IT). Hospital systems: HIS, PACS, LIS. Research data / LIMS. Data domains: clinical data, imaging data and annotations, experimental data, biobanking. Downstream: integrated data, translational analytics workbench, public data. Tools shown: OpenClinica, NBIA, various solutions, CBM-NL, e.g. tranSMART/i2b2, e.g. Galaxy, cohort explorer, e.g. R, …]
TraIT Pseudonymization
[Figure: Hospital (IT) to Translational Research (IT) data flow. Hospital systems: HIS, PACS, LIS. Research data / LIMS. Data domains: clinical data, imaging data, experimental data, biobanking. Downstream: integrated data, translational analytics workbench, public data. Tools shown: NBIA + AIM, e.g. CBM catalog, e.g. PhenotypeDB, Annai Systems, e.g. Galaxy, Chipster, e.g. caTissue, e.g. GEO, EMBL-EBI, tranSMART/cohort explorer, Galaxy, R, …]
TraIT - study driven approach
Task 1: study selection (Study 1, Study 2, Study …)
Task 2: use cases & prototypes (UC 1, UC 2, UC …)
Task 3, 4, 5: development of
• data integration platform
• analytics workbench
• shared components
[Figure labels: Data Integration (ETL, integrated translational data warehouse); Analytics; Translational Analytics Workbench; timeline 2013 to 2014]
Three real-life examples
• Example 1: CTMM INCOAG
• Example 2: CTMM AIRFORCE
• Example 3: CTMM PCMM
[Figure labels: Hospital (IT): PACS; Translational Research (IT): clinical, imaging, integrated data; tools: OpenClinica, NBIA, e.g. tranSMART]
Real-life example 1 - CTMM Incoag
• Discover new risk factors for thrombotic diseases
• Approach: Combine existing clinical studies into one OpenClinica data set for higher statistical power
OpenClinica:
• Clinical data capture
• Web-based
• Open-source
• Full audit-trail
• 10,000+ installations
• TraIT tool of choice
Incoag - Technical integration
Out of the box, OpenClinica can be applied in most projects; it is currently used in the CTMM projects AirForce, Cohfar, DeCoDe, Parisk, PCMM and Tracer
Specific Incoag question: how to combine 5+ independent existing studies from mixed sources into one OpenClinica installation?
[Figure: Study 1, Study 2, Study 3; ?; Sustainable storage in TraIT environment]
Solution: TraIT-team created a batch upload toolbox for OpenClinica
Will be submitted to the OpenClinica open-source community
Incoag - Semantic integration
Second question from Incoag project: how to identify common fields and data items?
[Figure: Study 1 to Study 5. How to determine the overlap?]
• 100-150 fields in each study
• More than 100^5 combinations to consider!
Studies speak different “languages”: a biomedical “Esperanto” is needed
[Figure: Study 1 to Study 5. Common ground?]
Incoag - Semantic integration
Project 1: Provide tools to standardize studies at data registration (as far as possible):
TraIT building blocks to rapidly build CRFs for new studies based on a common dictionary
[Figure: Study 1 to Study 5, Study n]
Project 2: First test with tools for automatic “after-the-fact” harmonization of historical data:
Automatic mapping against multiple dictionaries (SNOMED-CT, LOINC, NCI Thesaurus & Gene Ontology)
[Figure: Harmonized Incoag dataset]
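The slides do not say how the automatic mapping was implemented. As a rough illustration of the idea only, the Python sketch below matches study field names against a dictionary of terms using plain string similarity from the standard library. The function names, the cutoff value and the "EX:" concept identifiers are hypothetical placeholders, not real SNOMED-CT, LOINC, NCI Thesaurus or Gene Ontology codes.

```python
import difflib


def normalize(label):
    """Lower-case a label and turn underscores and repeated spaces into single spaces."""
    return " ".join(label.lower().replace("_", " ").split())


def map_fields(field_names, dictionary, cutoff=0.8):
    """Map each study field name to the closest dictionary term.

    dictionary maps a term label to a concept identifier.
    Returns field name -> (term, concept id), or None when no term
    is similar enough. String similarity is a stand-in for whatever
    matching the real tools use.
    """
    normalized_terms = {normalize(term): term for term in dictionary}
    mapping = {}
    for field in field_names:
        matches = difflib.get_close_matches(
            normalize(field), list(normalized_terms), n=1, cutoff=cutoff
        )
        if matches:
            term = normalized_terms[matches[0]]
            mapping[field] = (term, dictionary[term])
        else:
            mapping[field] = None
    return mapping


# Hypothetical dictionary and field names, for illustration only.
example_dictionary = {
    "Body weight": "EX:0001",
    "Systolic blood pressure": "EX:0002",
}
example_mapping = map_fields(
    ["body_weight", "Syst_blood_pressure", "favourite colour"],
    example_dictionary,
)
```

Fields that map to the same concept in several studies are candidates for the common ground between those studies; unmapped fields (None) would still need manual review.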
Real-life example 2 – CTMM AirForce
• Personalized chemo-radiation of lung and head & neck cancer
• Lung cancer patients with PET-CT (and clinical data & tissue)
– VUMC, MUMC+, NKI, UMCG + 35 patients from Policlinico Gemelli in Rome (via MUMC+)
• Transfer of images from Rome using TraIT’s BioMedical Image Archive (www.bmia.nl)
WP2 High level design – Upload (Implemented)
Image pseudonymization pipeline (based on CTP from the RSNA)
Image storage & simple web-shop like image viewing (based on NBIA)
AirForce - de-identification of images
• Install TraIT de-identification client in Rome
– Adopt: Clinical Trial Processor (RSNA, open source, Java)
• Configure DICOM de-identification
– Remove identifying DICOM tags
– Replace Codice Sanitario (PatientID) with AirForce ID
– Keep important tags (e.g. some tags are crucial for downstream analysis of PET)
• Result: A pipeline to TraIT’s BMIA from the local Rome Image Archive
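The configuration steps above can be illustrated with a minimal Python sketch. It is not the Clinical Trial Processor configuration used in AirForce: a DICOM header is modelled as a plain dictionary, and the list of identifying tags, the `deidentify` function and the example identifiers are assumptions chosen for illustration.

```python
# Illustrative only: a DICOM header modelled as a plain dict of
# attribute keyword -> value. The tag list is an assumption, not the
# AirForce CTP configuration.
IDENTIFYING_TAGS = {
    "PatientName",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
}


def deidentify(header, study_ids):
    """Return a de-identified copy of a header.

    study_ids maps the local patient identifier to the study ID.
    A KeyError is raised when the patient has no study ID, so that no
    image is passed on with its local identifier.
    """
    study_id = study_ids[header["PatientID"]]
    # Remove identifying tags, keep everything else.
    cleaned = {
        tag: value
        for tag, value in header.items()
        if tag not in IDENTIFYING_TAGS
    }
    # Replace the local patient identifier with the study ID.
    cleaned["PatientID"] = study_id
    return cleaned


# Hypothetical example data.
example_header = {
    "PatientID": "LOCAL-0001",
    "PatientName": "DOE^JANE",
    "PatientBirthDate": "19500101",
    "Modality": "PT",
    "PatientWeight": "70",  # kept: used in PET quantification
}
example_ids = {"LOCAL-0001": "AIRFORCE-035"}
deidentified = deidentify(example_header, example_ids)
```

A tag-based step like this cannot detect names burnt into the pixel data, which is why a separate QC step remains necessary.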
AirForce - QC of de-identification
• Perform QC step by collection administrator before images are visible in BMIA to prevent privacy breach (esp. burnt-in names).
AirForce - Resulting image archive in BMIA
• Collection AirForce on www.bmia.nl with 35 patients from Rome
• Web shop model where you can fill a basket with patients for download
Real-life example 3 – CTMM PCMM
• Develop and validate biomarkers for diagnosis of prostate cancer
• Requires correlation of phenotype data to biomarker data
• Potential solution: tranSMART; to be validated with real-life data from CTMM projects like PCMM
Can we address the generic translational question with the tranSMART solution?
Role of tranSMART in TraIT
PCMM – tranSMART as a candidate solution
tranSMART:
• Developed in J&J
• Made open-source
• “Data workbench” for translational researchers
• Searching across studies
• Data exploration
PCMM - Import of prostate data
Prostate data: Gleason score, PSA values, etc.
Usually gene expression data will be loaded as well; not yet done for PCMM
Reference to public data sources available
PCMM - QC of the data set
Drag-and-drop data parameters to create simple distribution plots and statistical values
PCMM: tranSMART for correlation analysis
Easy to create correlation plots between existing and potential predictors for prostate cancer
Second tranSMART developer/user meeting, June 17th-19th 2013, Amsterdam
Organizations shown: CTMM-TraIT, Sanofi, Recombinant / Deloitte, University of Michigan, Thomson Reuters, Pfizer, eTRIKS / Imperial College, CDISC, University of Luxembourg, Philips, Johnson & Johnson
OpenClinica.com – TraIT partnership
Statement of Work
• TraIT: automate data capture in OC as much as possible
– E.g. automate upload of Excel data and hospital lab data
– Approach: OC’s Web Services
• Requires Improvements on OIDs and Bug Fixes
• Support configurable role based authentication and authorization within OC
– E.g. Central review of images for all subjects in the different sites. Each image is reviewed by three reviewers who are not allowed to see each other’s reports in the CRFs
• Parameterized links in CRFs
– E.g. Links to images or to other subjects, with a dynamic URL based on data in CRF
Other wishes
• Study migration
– E.g. Users want to switch to different OC server
– Currently only "ClinicalData" ODM is imported
– Studies can be exported in full detail but cannot be imported as such
• Support reference to ontologies in the CRF
– Standardization of data
• Easy view for data entry
– E.g. tree structure that indicates where you are while entering data, for easy navigation to the other CRFs of a subject
• The load on TraIT OpenClinica increased significantly in 2012
• Considerable time and energy was spent on delivery management (availability, capacity and security) and on improvement of the TraIT OpenClinica user support
Uptake of OpenClinica
[Figure: number of studies (y-axis, 0 to 50) against timeline (x-axis, per quarter from Q3 2008 to mid June 2013). Data labels: 0, 3, 3, 15, 26, 47. Annotations: Start DeCoDe OpenClinica; Start TraIT OpenClinica; Pre TraIT effect: all multicenter VUmc studies; also multicenter studies UMCU, UMCN, EMC, Meander MC]
47 studies, 77 sites, 256 users
Who am I?
• My name: Sander de Ridder
– Computer Science (MSc) & Bioinformatics (MSc)
• Inflammatory Disease Profiling, Dept. of Pathology, VU University medical center, Amsterdam
– Bioinformatics for Inflammatory Disease Profiling Group
– IT implementation CTMM TRACER
CTMM-TRACER
Background information on TRACER
• CTMM TRACER: Rheumatoid Arthritis
– Prospective data
– Retrospective data (To Do)
• Go Live:
– Wednesday the 5th of June
– Started at 9:00, finished at 12:00
– Approximately 1 hour/study
Prospective Studies   VERA   ERA    ESRA
Sites                 4      7      7
Events                7      6      6
CRFs                  ~35    ~30    ~30
Rules                 ~250   ~450   ~650
Age Calculation
After entering the DOB and the date of signing…
The age is calculated
Age calculation script: http://en.wikibooks.org/wiki/OpenClinica_User_Manual/AgeField
Created by Sander de Ridder and improved by Gerben Rienk
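The script referenced above runs inside the CRF; it is not reproduced here. The calculation itself can be illustrated with a short Python sketch, assuming the age is wanted in completed years at the date of signing. The function name `age_in_years` is chosen for this example.

```python
from datetime import date


def age_in_years(date_of_birth, date_of_signing):
    """Return the age in completed years on the date of signing."""
    if date_of_signing < date_of_birth:
        raise ValueError("date of signing precedes date of birth")
    years = date_of_signing.year - date_of_birth.year
    # Subtract one year when the birthday has not yet occurred
    # in the year of signing.
    birthday_pending = (date_of_signing.month, date_of_signing.day) < (
        date_of_birth.month,
        date_of_birth.day,
    )
    return years - 1 if birthday_pending else years


# Hypothetical example: born 15 June 1950, signed 5 June 2013.
example_age = age_in_years(date(1950, 6, 15), date(2013, 6, 5))
```

Comparing (month, day) pairs rather than dividing a day count by 365 avoids off-by-one errors around birthdays and leap years.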
Long List Implementation
• Problem:
– Maximum of 4000 characters for single-select response options text
– Some lists need more characters: e.g. medication list > 9000 characters
• Solution:
– Created external list
– Add field to CRF which opens new page with list
– Allows user to select option; selected value is copied back to CRF
ITEM_NAME          RESPONSE_TYPE   RESPONSE_OPTIONS_TEXT          RESPONSE_VALUES_OR_CALCULATIONS
Smoking_Category   single-select   Never smoked, Current smoker   1,2
Example: Medication
User selects “Other” and then clicks on the field of question 3)
A new tab/window opens with an HTML page with a single-select
The user can select the desired medication from the list
The selected medication is copied to the CRF
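The slides do not show how the external list page is built. As one possible sketch, the Python function below generates such a page from a list of options. Everything in it is an assumption made for illustration: the function name, the element ids, the use of `window.opener` to copy the value back, and the requirement that the page is served from the same origin as the CRF.

```python
import json
from html import escape


def build_long_list_page(options, target_field_id):
    """Build an HTML page with a single-select for a long option list.

    target_field_id is the (hypothetical) id of the CRF input in the
    opening window that should receive the selected value.
    """
    option_tags = "\n".join(
        '    <option value="{0}">{0}</option>'.format(escape(text))
        for text in options
    )
    return (
        "<!DOCTYPE html>\n<html>\n<body>\n"
        '  <select id="longList" size="20">\n'
        f"{option_tags}\n"
        "  </select>\n"
        '  <button onclick="copyBack()">Select</button>\n'
        "  <script>\n"
        "    function copyBack() {\n"
        "      var value = document.getElementById('longList').value;\n"
        "      // Copy the selection back to the field in the opening window.\n"
        f"      window.opener.document.getElementById({json.dumps(target_field_id)}).value = value;\n"
        "      window.close();\n"
        "    }\n"
        "  </script>\n"
        "</body>\n</html>\n"
    )


# Hypothetical medication list and field id.
example_page = build_long_list_page(
    ["Methotrexate", "Prednisolone", "A & B combination"],
    "medication_other",
)
```

Because the options live in a separate page, the list is not subject to the 4000-character limit on response options text in the CRF definition.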
Some tools we created: CRF validator
• Compares items between CRFs based on uids and ensures they match
– CRF1
• ID: Patient_Weight; DATA_TYPE: INT
– CRF2
• ID: Patient_Weight; DATA_TYPE: REAL
Mismatch for Patient_Weight!
• Checks NULL-flavour coding integrity
– Coding: -1=No Information, -2=Not Applicable, -3=Unknown, …
– CRF1
• RESPONSE_OPTIONS_TEXT: No Information
• RESPONSE_VALUES_OR_CALCULATIONS: -2
Incorrect NULL-flavour coding!
Prevents errors and inconsistencies
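The two checks described above can be sketched in Python as follows. This is an illustration, not the validator built for TRACER: the function names and input structures are assumptions, only the three NULL-flavour codes listed on the slide are included, and option labels are assumed not to contain commas.

```python
# NULL-flavour codes as listed on the slide (the slide's list continues).
NULL_FLAVOURS = {
    "No Information": "-1",
    "Not Applicable": "-2",
    "Unknown": "-3",
}


def find_type_mismatches(crfs):
    """Find items whose data type differs between CRFs.

    crfs maps a CRF name to a dict of item ID -> data type.
    Returns item ID -> {CRF name: data type} for mismatching items.
    """
    seen = {}
    for crf_name, items in crfs.items():
        for item_id, data_type in items.items():
            seen.setdefault(item_id, {})[crf_name] = data_type
    return {
        item_id: per_crf
        for item_id, per_crf in seen.items()
        if len(set(per_crf.values())) > 1
    }


def find_null_flavour_errors(options_text, values):
    """Check the NULL-flavour codes of one response set.

    Both arguments are comma-separated strings, as in the CRF columns
    RESPONSE_OPTIONS_TEXT and RESPONSE_VALUES_OR_CALCULATIONS.
    Returns a list of (label, found code, expected code).
    """
    labels = [text.strip() for text in options_text.split(",")]
    codes = [value.strip() for value in values.split(",")]
    if len(labels) != len(codes):
        raise ValueError("number of options and values differ")
    return [
        (label, code, NULL_FLAVOURS[label])
        for label, code in zip(labels, codes)
        if label in NULL_FLAVOURS and code != NULL_FLAVOURS[label]
    ]


# The two examples from the slide.
type_mismatches = find_type_mismatches({
    "CRF1": {"Patient_Weight": "INT"},
    "CRF2": {"Patient_Weight": "REAL"},
})
null_flavour_errors = find_null_flavour_errors("No Information", "-2")
```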
Some tools we created: ID-Translator
• Move rules file to new OC server: replace all item IDs
• Automatic translation of item identifiers in rules
Prevented replace errors and saved many hours of work
• Requires:
– ViewCRFVersion file
• Contains item ID information for CRF on new server
– Rule file with properly specified header
• Contains item ID information for CRF on old server
Parse ViewCRFVersion: mapping ITEM_NAME to new OC_ID
MedicatieBijgewerkt = I_TRACE_MEDICATIEBIJGEWERKT_4714
Parse header of rule file: mapping ITEM_NAME to old OC_ID
MedicatieBijgewerkt = I_TRACE_PATIENTSTUDIE_MOMENT_AFROND
Translate rule file: old OC_ID to new OC_ID via ITEM_NAME
I_TRACE_PATIENTSTUDIE_MOMENT_AFROND = I_TRACE_MEDICATIEBIJGEWERKT_4714
[Figure: ViewCRFVersion (new server) and rules for old server, linked through ITEM_NAME and OC_ID, produce translated rules for new server]
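The translation step can be sketched in Python as below. The sketch assumes the two ITEM_NAME to OC_ID mappings have already been parsed into dictionaries (the file formats of ViewCRFVersion and the rule file header are not shown on the slides), and the function name `translate_rule_ids` is chosen for this example.

```python
import re


def translate_rule_ids(rule_text, old_ids, new_ids):
    """Replace old OC_IDs in rule text by the new OC_IDs.

    old_ids and new_ids both map ITEM_NAME -> OC_ID, for the old and
    the new server respectively. The link between old and new OC_ID
    is made via the shared ITEM_NAME.
    """
    old_to_new = {}
    for item_name, old_id in old_ids.items():
        if item_name not in new_ids:
            raise KeyError(f"no new OC_ID for item {item_name!r}")
        old_to_new[old_id] = new_ids[item_name]
    if not old_to_new:
        return rule_text
    # Match whole identifiers only, longest first, in a single pass so
    # that an already translated ID is never translated again.
    alternatives = "|".join(
        re.escape(old_id)
        for old_id in sorted(old_to_new, key=len, reverse=True)
    )
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])(" + alternatives + r")(?![A-Za-z0-9_])"
    )
    return pattern.sub(lambda match: old_to_new[match.group(1)], rule_text)


# Identifiers from the slide; the rule expression itself is made up.
example_old = {"MedicatieBijgewerkt": "I_TRACE_PATIENTSTUDIE_MOMENT_AFROND"}
example_new = {"MedicatieBijgewerkt": "I_TRACE_MEDICATIEBIJGEWERKT_4714"}
translated = translate_rule_ids(
    "I_TRACE_PATIENTSTUDIE_MOMENT_AFROND eq 1", example_old, example_new
)
```

Doing all replacements in one regular-expression pass is a design choice that avoids the chained-replacement errors a manual search and replace can introduce.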
Things we learned/found useful
• ITEM_NAME max 64 characters
– SPSS compatibility
• Truly unique identifiers (description label)
– Easy to link to study definition (CTMMC)
– Useful for consistency checking
• Negative NULL-flavour coding
– Prevent conflict with retrospective data
– Easy to keep NULL-flavour coding consistent
• Specify identifiers in header of rule file
– Automatic translation
• JavaScript code
– $.noConflict();
• Prevents our code from interfering with OC’s code
– Reference to jQuery
• <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
• Prevents dependency on OC’s jQuery version
• Create a checklist and follow it during go-live
Goal: make researchers want to use OpenClinica and tranSMART
And many more…
Acknowledgements