

Post on 23-Aug-2020


Liesbet M. Peeters (1), Ilse Lamers (1), Dirk Valkenborg (2), Peter Feys (1), Veerle Somers (1), Annemie Spooren (1,6), Veronica Popescu (3), Niel Hens (2), Paul M. Matthews (4), Christoph Thalheim (5), Bart van Wijmeersch (1,3) & Niels Hellings (1)

(1) Biomedical Research Institute, Hasselt University, Belgium
(2) Center of Statistics, Hasselt University, Belgium
(3) Revalidation and MS center, Overpelt, Belgium
(4) Data Science Institute, Imperial College London, UK OPTIMISE, United Kingdom
(5) European Multiple Sclerosis Platform, Schaarbeek, Belgium
(6) PXL University of Applied Sciences and Arts, Belgium

Let’s dream… and team up

Dr. Ir. Liesbet M. Peeters

We are open for collaboration. Please contact us for more info: [email protected]

Funded by

MS DATACONNECT – CONNECT DATA, CONNECT PEOPLE –

www.msdataconnect.com

FAIR data for next generation management of Multiple Sclerosis

A FOUR C-PLAN FOR SUCCCCESS

We aim to transform today's mostly population-based management of MS into individualized, personalized, precision-level management.

We believe the key to achieving this next level is “FAIR” data.

An intuitive representation of this 4C plan is presented. Data is collected all over the world by different stakeholders (step 1: COLLECT). This results in an extensive amount of data and datasets. Each dataset is represented as a puzzle of a face. Many insights could be discovered if these datasets were pooled and connected (step 2: CONNECT). This requires methods to identify the minimal requirements for common datasets (step 3: COMPLETE). Once sufficient overlap between the databases involved is secured, and powerful analytical methods are developed to cope with the imperfections of datasets characterized by different layers of missing data, these datasets can be optimally mined to create new insights for MS management (step 4: CONSTRUCT).
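The notion of "sufficient overlap" between pooled datasets can be illustrated with a small sketch. The Python below is an illustration only, not the project's method: the registry names, the variable names and the use of None to mark a missing value are all assumptions made for the example. It lists the variables shared by every dataset and, for a shared variable, the fraction of pooled records in which it is missing.

```python
from typing import Dict, List, Optional

# A record maps a variable name to a value; None marks a missing value (assumption).
Record = Dict[str, Optional[float]]


def common_variables(datasets: Dict[str, List[Record]]) -> List[str]:
    """Return the variables that occur in every dataset, sorted by name."""
    per_dataset = []
    for records in datasets.values():
        names = set()
        for record in records:
            names.update(record.keys())
        per_dataset.append(names)
    if not per_dataset:
        return []
    return sorted(set.intersection(*per_dataset))


def missing_fraction(datasets: Dict[str, List[Record]], variable: str) -> float:
    """Fraction of pooled records in which `variable` is absent or None."""
    pooled = [record for records in datasets.values() for record in records]
    if not pooled:
        return 0.0
    missing = sum(1 for record in pooled if record.get(variable) is None)
    return missing / len(pooled)


# Hypothetical example data: two small registries with partly different variables.
EXAMPLE_DATASETS: Dict[str, List[Record]] = {
    "registry_a": [
        {"age": 34, "disability_score": 2.0, "lesion_count": 5},
        {"age": 51, "disability_score": None, "lesion_count": 12},
    ],
    "registry_b": [
        {"age": 45, "disability_score": 3.5, "fatigue_score": 40},
        {"age": None, "disability_score": 1.0, "fatigue_score": 22},
    ],
}

if __name__ == "__main__":
    for name in common_variables(EXAMPLE_DATASETS):
        print(name, missing_fraction(EXAMPLE_DATASETS, name))
```

In this toy example only two variables are shared by both registries, which is the kind of gap a minimal common dataset (step 3) is meant to close.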

Proof-of-concept is provided by building a multidisciplinary repository for MS in Belgium

An online catalogue listing the desired content of the repository is defined. This list contains, among others, the following variables:

1° patient-specific data
2° disease-specific data
3° treatment strategies
4° paramedical data
5° clinical data
6° patient-reported outcomes data
7° biological sample-specific data
8° patient and sample phenotyping
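As a rough sketch of how such a catalogue could be used, the Python below encodes the eight categories listed above and reports which of them a candidate dataset does not yet cover. The enumeration members, the function name and the example variables are hypothetical and chosen for illustration; the poster does not describe the catalogue's actual structure.

```python
from enum import Enum
from typing import Dict, List


class Category(Enum):
    """The eight catalogue categories named on the poster."""
    PATIENT = "patient-specific data"
    DISEASE = "disease-specific data"
    TREATMENT = "treatment strategies"
    PARAMEDICAL = "paramedical data"
    CLINICAL = "clinical data"
    PRO = "patient-reported outcomes data"
    SAMPLE = "biological sample-specific data"
    PHENOTYPING = "patient and sample phenotyping"


def missing_categories(variables: Dict[str, Category]) -> List[Category]:
    """Return the catalogue categories not covered by any variable, in catalogue order."""
    covered = set(variables.values())
    return [category for category in Category if category not in covered]


# Hypothetical dataset description: variable name -> catalogue category.
EXAMPLE_VARIABLES: Dict[str, Category] = {
    "year_of_birth": Category.PATIENT,
    "year_of_diagnosis": Category.DISEASE,
    "current_therapy": Category.TREATMENT,
    "walking_test_seconds": Category.CLINICAL,
}

if __name__ == "__main__":
    for category in missing_categories(EXAMPLE_VARIABLES):
        print("not covered:", category.value)
```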

In collaboration with

COLLECT

Data is collected 1° in the hospitals or laboratories using the primary systems, 2° by patients, and 3° by independent practitioners using the electronic medical dossier (EMD). The data is registered and captured by optimizing and expanding three existing open-source software tools (HD4DP, HD4PROM, HD4PrC) (0 & 1 in fig).

CONNECT

Data is securely transferred, encrypted and pseudo-anonymized by a Trusted Third Party. The use of the National Registry Number as identification number in all data collection steps makes it possible to connect all data belonging to the same patient (2 in fig).
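The poster does not say how the Trusted Third Party derives pseudonyms. One common approach, used here purely as an assumed illustration, is a keyed hash (HMAC-SHA256) of the identifier: the same National Registry Number always yields the same pseudonym, so records from different sources can be linked without exposing the number itself. The field name "nrn", the source names, the dummy numbers and the key below are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Any, Dict, List


def pseudonymize(national_registry_number: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from an identifier with a keyed hash.

    Formatting characters are stripped first so that differently formatted
    copies of the same number map to the same pseudonym.
    """
    digits = "".join(ch for ch in national_registry_number if ch.isdigit())
    return hmac.new(secret_key, digits.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(
    sources: Dict[str, List[Dict[str, Any]]], secret_key: bytes
) -> Dict[str, Dict[str, List[Dict[str, Any]]]]:
    """Group records from all sources by pseudonym and drop the raw identifier."""
    linked: Dict[str, Dict[str, List[Dict[str, Any]]]] = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            pseudonym = pseudonymize(record["nrn"], secret_key)
            payload = {key: value for key, value in record.items() if key != "nrn"}
            linked[pseudonym].setdefault(source_name, []).append(payload)
    return dict(linked)


# Hypothetical example: dummy identifiers, not real registry numbers.
EXAMPLE_KEY = b"demo-key-held-only-by-the-trusted-third-party"
EXAMPLE_SOURCES = {
    "hospital": [
        {"nrn": "00.00.00-000.01", "year_of_diagnosis": 2015},
        {"nrn": "00.00.00-000.02", "year_of_diagnosis": 2018},
    ],
    "patient_reported": [
        {"nrn": "00000000001", "fatigue_score": 3},
    ],
}

if __name__ == "__main__":
    for pseudonym, by_source in link_records(EXAMPLE_SOURCES, EXAMPLE_KEY).items():
        print(pseudonym[:12], sorted(by_source))
```

In this sketch the key stays with the Trusted Third Party, so researchers who receive only pseudonyms cannot recompute them from a known registry number.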

COMPLETE

Researchers use the HD4RES software to monitor, follow up and manage the data collections (3 & 4 in fig).

CONSTRUCT

Data is stored in a secure data warehouse. Researchers are granted access, and the data can be analysed within the data warehouse using standard statistical tools. Data is used for research, for example to build personalized decision support systems and to evaluate and construct composite outcome score measures. The repository is used to provide Health Statistics for all stakeholders involved (5, 6 & 7 in fig).

Step 1: COLLECT
Develop data collection procedures and tools to create data that is FAIR (Findable, Accessible, Interoperable and Re-usable)

Step 2: CONNECT
Develop IT solutions to allow (temporary) pooling and linking of FAIR datasets

Step 3: COMPLETE
Develop statistical methods to define minimal requirements for datasets

Step 4: CONSTRUCT
Develop new analytical methods for optimal mining of connected and pooled FAIR datasets

F.A.I.R. data
Findable, Accessible, Interoperable and Re-usable