What is HIC? What can we do for you?
Dr Claire Jones
Senior Software Engineer
HIC Services
http://medicine.dundee.ac.uk/hic
• Ran 130 active projects with a turnover of 1.3M for 2013/14.
Development Team · Clinical Information Bureau (CIB) Team · Data Linkage Service (DLS)
Data Linkage Service (DLS)
• Advise research staff on data linkage requirements
• Feasibility studies using aggregates to quantify population sizes
• Cohort identification and management
• Recruitment to clinical trials
• Creation of matched Case/Control groups
• New dataset acquisition
• Dataset curation (loading and cleaning of local and national dataset extracts)
• Link, anonymise and extract data
• Provision of Safe Haven
Jim Galloway
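To illustrate the "matched Case/Control groups" item above, here is a minimal sketch of greedy 1:1 matching on sex and age. The field names (`id`, `sex`, `age`), the 5-year tolerance and the greedy strategy are assumptions made for illustration; the slides do not describe how the DLS actually builds its matched groups.

```python
from typing import Any, Dict, List, Optional

Person = Dict[str, Any]


def match_controls(cases: List[Person], pool: List[Person],
                   age_tolerance: int = 5) -> Dict[str, Optional[str]]:
    """Greedy 1:1 matching.

    Each case receives the first unused control with the same sex and an
    age within `age_tolerance` years. Cases with no suitable control map
    to None.
    """
    used = set()
    matches: Dict[str, Optional[str]] = {}
    for case in cases:
        matches[case["id"]] = None
        for control in pool:
            if control["id"] in used:
                continue
            same_sex = control["sex"] == case["sex"]
            close_age = abs(control["age"] - case["age"]) <= age_tolerance
            if same_sex and close_age:
                matches[case["id"]] = control["id"]
                used.add(control["id"])
                break
    return matches


# Toy records, invented for this example.
cases = [
    {"id": "C1", "sex": "F", "age": 62},
    {"id": "C2", "sex": "M", "age": 71},
]
pool = [
    {"id": "P1", "sex": "M", "age": 70},
    {"id": "P2", "sex": "F", "age": 80},
    {"id": "P3", "sex": "F", "age": 60},
]
pairs = match_controls(cases, pool)
```

Greedy matching is simple but order-dependent; a real study would normally choose the matching variables and method in its protocol.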
https://medicine.dundee.ac.uk/dataset-inventory
Demography
GRO ECHO
There was a 22% overall reduction in all-cause mortality with β blocker use
Prescribing TARDIS
Biochemistry
Microbiology Haematology
The effect of β blockers in the management of chronic obstructive pulmonary disease (COPD)
Setting: Tayside, Scotland (2001–2010). Population: 5977 patients aged >50 years with a diagnosis of COPD.
Example of linkage using just HIC hosted data
BMJ. 2011; 342: d2549. doi:10.1136/bmj.d2549. P.M. Short, S.I.W. Lipworth, D.H.J. Elder, S. Schembri, B.J. Lipworth.
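A study like the one above depends on joining prescribing and mortality records for the same patients. The sketch below shows one way such a link-and-anonymise step could look, assuming a keyed hash (HMAC-SHA256) replaces the patient identifier so that the extract carries no raw ID. The key, the field names and the toy records are all hypothetical; the slides do not say how HIC anonymises data.

```python
import hashlib
import hmac

# Assumption: one secret key per project, held by the data service only.
PROJECT_KEY = b"hypothetical-project-key"


def pseudonymise(patient_id: str, key: bytes = PROJECT_KEY) -> str:
    """Return a stable pseudonym for a patient identifier."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link(prescribing, deaths):
    """Join prescribing rows to death records on the pseudonymised ID."""
    died = {pseudonymise(d["patient_id"]): d["year"] for d in deaths}
    extract = []
    for row in prescribing:
        pid = pseudonymise(row["patient_id"])
        extract.append({
            "pseudo_id": pid,
            "drug": row["drug"],
            "death_year": died.get(pid),  # None when no death record links
        })
    return extract


# Toy records, invented for this example.
prescribing = [
    {"patient_id": "patient-001", "drug": "beta blocker"},
    {"patient_id": "patient-002", "drug": "other"},
]
deaths = [{"patient_id": "patient-002", "year": 2009}]

extract = link(prescribing, deaths)
```

Because the same key always yields the same pseudonym, records from different datasets still link within a project, while extracts from projects with different keys cannot be joined to each other.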
Hospital admissions
GRO
More than 400 lives are being lost each year because breast cancer patients fail to take the full course of the drug Tamoxifen due to "intolerable" side-effects
Prescribing
Cohort study examining Tamoxifen adherence and its relationship to mortality in women with breast cancer
Example of linking HIC hosted data to clinical data set
Br J Cancer. 2008 December 2; 99(11): 1763–1768. doi:10.1038/sj.bjc.6604758. McCowan, J Shearer, P T Donnan, J A Dewar, M Crilly, A M Thompson and T P Fahey
Cohort of cancer patients collected from a Ninewells clinic
Fire Service
What is the relationship between vulnerable adults in the different services?
Police NHS Social Services
HIC linking new data sets from different data custodians
What does HIC Safe Haven Offer Researchers?
• Safe and Secure
  • Complies with NHS Governance, HIC SOPs
  • Managed access control
  • Routine backup of project data to secure storage
• Managed Service
  • Helpdesk
  • Hardware status monitored 24x7
  • Built-in hardware resilience
• Collaborative working environment - shared data areas
• Accessible from anywhere within the EU
• Access to Extensive Software Library:
  • R, STATA, SPSS, MATLAB (if required), Mathematica (if required)
  • MSOffice 2012
• Archive of project data and results post project completion
Keith Milburn
Application Development Projects (ADP)
• Bespoke Software
• Web-based Data Collection
• Disease Registries
• Recruitment Tracking
• Randomisation for Clinical Trials
• Study Questionnaire Forms - either web-based or paper-based (or both)
• Texting Interventions
• NHS Network Hosting
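As an illustration of the "Randomisation for Clinical Trials" item above, here is a minimal sketch of permuted-block randomisation, a common way to keep trial arms balanced. The arm labels, block size and seed are assumptions; the slides do not say which randomisation method the ADP team's tools use.

```python
import random
from typing import List, Optional, Sequence


def block_randomise(n_participants: int,
                    arms: Sequence[str] = ("A", "B"),
                    block_size: int = 4,
                    seed: Optional[int] = None) -> List[str]:
    """Allocate participants to arms in shuffled, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # private generator so runs are reproducible
    allocations: List[str] = []
    while len(allocations) < n_participants:
        # Each block holds every arm the same number of times.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_participants]


# Hypothetical schedule for 8 participants.
schedule = block_randomise(8, seed=2014)
```

Within every complete block each arm appears equally often, so group sizes never drift far apart during recruitment.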
Keith Milburn
Clinical Information Bureau (CIB)
• Project-specific data entry
• Adding missing data
• Patient paper record data entry
• Study questionnaire results
• Clerical support to studies
• Mailing Services
• Paper scanning and shredding
Patient Recruitment to Study Completion
Data Linkage - Follow-up of study participants by linking to routinely collected EHR
Data entry of project data from paper forms or adding missing data to study datasets
Aggregates of study feasibility and potential study recruitment
Recruitment mailings and questionnaires
Track the recruitment process via online tracking system
Bespoke web-based data collection tools
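The feasibility aggregates mentioned above can be pictured as simple counts of patients who meet a study's criteria. The sketch below counts eligible patients by 10-year age band and hides small cells. The field names, the age bands and the small-number threshold of 5 are assumptions for illustration, not HIC policy.

```python
from collections import Counter


def age_band(age: int, width: int = 10) -> str:
    """Return a label such as '50-59' for the band containing `age`."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def feasibility_counts(patients, min_age, diagnosis, min_cell=5):
    """Count eligible patients per age band, suppressing small cells."""
    counts = Counter(
        age_band(p["age"])
        for p in patients
        if p["age"] >= min_age and diagnosis in p["diagnoses"]
    )
    return {
        band: (n if n >= min_cell else f"<{min_cell}")
        for band, n in sorted(counts.items())
    }


# Toy population, invented for this example.
patients = (
    [{"age": 50 + i, "diagnoses": {"COPD"}} for i in range(12)]
    + [{"age": 45, "diagnoses": {"COPD"}},
       {"age": 70, "diagnoses": {"asthma"}}]
)
summary = feasibility_counts(patients, min_age=50, diagnosis="COPD")
```

Only aggregate counts leave the function, which is what lets a researcher size a potential cohort without seeing any patient-level data.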
Duncan Heather
Governance Support
What approvals do you need?
https://medicine.dundee.ac.uk/governance-approval-support
Project Support
• Service Level Agreements (SLA)
• Data Sharing Agreements
• Approvals forms
• PRFs from HIC Quotes
• Tracking of project finances
Duncan Heather
Research Data Management Platform (RDMP)
[Diagram labels: Research, Service, Data]
Researchers spend months cleaning and transforming datasets before they can do research
Researchers do research
• Clean
• Process
• Transform
• Standardise
• Add metadata
Main goals of RDMP
• Automate the loading, storage, linkage and provision
• To clean, transform and add metadata and domain knowledge to each dataset to make it “research ready”.
• Link to images and genomic data.
• Reproducibility and historical information, user interfaces, additional security and an audit trail.
• To provide the High Performance Computing (HPC)
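To make the "clean, transform, standardise, add metadata" goals above concrete, here is a minimal sketch of one such step applied to a single prescribing-style row. The accepted date formats, the column descriptions and the function names are assumptions for illustration; this is not the RDMP's actual code or schema.

```python
from datetime import datetime
from typing import Any, Dict, Optional

# Assumption: source extracts use one of these two date layouts.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")

# Hypothetical column descriptions that would travel with the dataset.
COLUMN_METADATA = {
    "prescribed_date": "Date the item was prescribed, ISO 8601 (YYYY-MM-DD)",
    "approved_name": "Approved drug name, upper case",
}


def standardise_date(value: Optional[str]) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if it cannot be parsed."""
    if not value:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def make_research_ready(row: Dict[str, Any]) -> Dict[str, Any]:
    """Trim names and values, turn empty strings into None, standardise fields."""
    cleaned: Dict[str, Any] = {}
    for name, value in row.items():
        key = name.strip().lower()
        text = value.strip() if isinstance(value, str) else value
        cleaned[key] = text if text != "" else None
    cleaned["prescribed_date"] = standardise_date(cleaned.get("prescribed_date"))
    if cleaned.get("approved_name"):
        cleaned["approved_name"] = cleaned["approved_name"].upper()
    return cleaned


# Toy row, invented for this example.
raw = {" Prescribed_Date ": " 03/02/2010 ", "Approved_Name": " atenolol ", "quantity": ""}
ready = make_research_ready(raw)
```

Doing this once at load time, rather than in every project, is the kind of repeated effort the goals above aim to remove from researchers.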
Profiling Dashboard: Community-dispensed Prescription Data (Prescribing)
*Consistency representation is based on 91.7% validated data
Accuracy - Defined Key & Mandatory Fields
                                     |----- Inaccurate -----|
Col. Name                  Accurate  Missing  Wrong    Null     Unavailable
CHI                        99.96%    0.00%    0.00%    0.00%    0.04%
corrected_prescribed_date  99.96%    0.00%    0.00%    0.00%    0.04%
res_seqno                  99.96%    0.00%    0.00%    0.00%    0.04%
dataMonth                  99.96%    0.00%    0.00%    0.00%    0.04%
hb_code                    99.96%    0.00%    0.00%    0.00%    0.04%
SCAN_REF_NO                73.27%    0.00%    26.69%   0.00%    0.04%
prescribed_date            99.96%    0.00%    0.00%    0.00%    0.04%
quantity                   83.26%    0.00%    15.89%   0.81%    0.04%
directions                 33.20%    0.00%    0.27%    66.49%   0.04%
no_of_packs                13.08%    0.00%    0.00%    86.88%   0.04%
line_no                    73.23%    0.00%    0.00%    26.73%   0.04%
Approved_Name              98.83%    0.00%    0.00%    1.12%    0.04%
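The accuracy figures above are per-column percentages that add up to 100% across the categories. A minimal sketch of how such a profile could be computed for one column follows. It covers only three categories, and their definitions here (null = empty value, wrong = fails a validation rule, accurate = passes it) are assumptions, as is the validation rule itself; the slides do not define how the dashboard classifies values.

```python
from typing import Callable, Dict, Optional, Sequence


def profile_column(values: Sequence[Optional[str]],
                   is_valid: Callable[[str], bool]) -> Dict[str, float]:
    """Return the percentage of accurate, wrong and null values."""
    if not values:
        raise ValueError("cannot profile an empty column")
    counts = {"accurate": 0, "wrong": 0, "null": 0}
    for value in values:
        if value is None or value == "":
            counts["null"] += 1
        elif is_valid(value):
            counts["accurate"] += 1
        else:
            counts["wrong"] += 1
    total = len(values)
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


# Toy column, invented for this example: quantities should be positive integers.
quantities = ["28", "56", "", "abc", "14", None, "7", "30"]
report = profile_column(quantities, lambda v: v.isdigit() and int(v) > 0)
```

For the eight toy values this gives 62.5% accurate, 12.5% wrong and 25.0% null, which sum to 100% in the same way as each row of the table.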
[Chart: Data Availability. Legend: Clean Data, Unclean Data, Unavailable Data. Labels shown: 99.96%, 0.04%]
[Chart: Consistency Representation*. Legend: Consistency, Inconsistencies. Categories: chi, healthboardcode, date. Axis: 0% to 100%. Labels shown: 0.0%, 1.1%]
[Chart: Attribute Completeness. Legend: Complete, Incomplete. Columns: CHI, corrected_prescribed_date, res_seqno, dataMonth, hb_code, SCAN_REF_NO, prescribed_date, quantity, directions, no_of_packs, line_no, Approved_Name. Axis: 0% to 100%. Labels shown: 13.34%, 0.00%, 0.00%, 0.81%, 66.49%, 86.88%, 0.04%, 26.73%, 1.12%]
[Status legend: GREEN, AMBER, RED]
[Chart: Dataset Time Series. Series: Episodes, Unique Patient. Y-axis: 0 to 1,400,000. X-axis: month-year ticks from 1-1989 to 3-2013]
Research and Analytics
Governance
Infrastructure and Data
Services MRI MEI DHSRU
CLS
SNM
Thanks for Listening! Any Questions?