RD-Connect WP5 Update · 2017-10-03
RD-CONNECT WP5 UPDATE
RD-Connect Annual Meeting, Berlin, May 1st 2017
Platform moved to RD-Connect cluster
RD-Connect cluster
• 19 servers
• Each server has:
  • 256 GB RAM
  • 20 TB SATA disk + 900 GB SSD disk
  • 32 CPU cores
• Software suite
  • Apache Mesos + DC/OS for cluster management
  • Apache Marathon for Docker orchestration
  • Foreman + Puppet
  • Jenkins
• Aim: moving towards 100% CI/CD (continuous integration/deployment)
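As a quick check on what the per-server figures above add up to, the aggregate capacity can be worked out as follows. This is only an arithmetic sketch: the `ServerSpec` and `cluster_totals` names are illustrative, and it assumes all 19 servers are identical, as the slide suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Per-server resources as listed on the slide."""
    ram_gb: int
    sata_tb: int
    ssd_gb: int
    cpu_cores: int


def cluster_totals(n_servers: int, spec: ServerSpec) -> dict:
    """Multiply each per-server figure by the number of servers."""
    return {
        "ram_gb": n_servers * spec.ram_gb,
        "sata_tb": n_servers * spec.sata_tb,
        "ssd_gb": n_servers * spec.ssd_gb,
        "cpu_cores": n_servers * spec.cpu_cores,
    }


SPEC = ServerSpec(ram_gb=256, sata_tb=20, ssd_gb=900, cpu_cores=32)
TOTALS = cluster_totals(19, SPEC)
print(TOTALS)
# {'ram_gb': 4864, 'sata_tb': 380, 'ssd_gb': 17100, 'cpu_cores': 608}
```

That is roughly 4.9 TB of RAM, 380 TB of SATA storage, 17.1 TB of SSD and 608 CPU cores in total, before any overhead for the operating system or cluster services.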
Platform moved to RD-Connect cluster
Monitoring
- Software stack: Beats + Elasticsearch + Kibana
- Status of monitoring:
  - Proxies (production and integration) monitored
  - Genomics application monitoring under development
- Future developments:
  - Complete monitoring of all the applications
  - Integration of application monitoring with resource/metric monitoring for performance optimization and minimization of resource allocation
  - Analysis of logs for anomaly detection (Kafka + Spark)
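To illustrate the kind of log analysis listed under future developments, here is a minimal sketch of anomaly detection on per-minute log counts. It is not the planned Kafka + Spark implementation: it uses plain Python and a simple rolling z-score as a stand-in, and the function name, window size, threshold and sample data are all assumptions made for the example.

```python
from statistics import mean, stdev


def find_anomalies(counts, window=5, threshold=3.0):
    """Return indices whose value deviates from the preceding `window`
    values by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: flag any value that differs from it.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical error counts per minute, e.g. aggregated from proxy logs.
errors_per_minute = [4, 5, 3, 4, 5, 4, 40, 5, 4]
print(find_anomalies(errors_per_minute))  # [6]
```

In a streaming setup the same comparison would run over windowed aggregates rather than an in-memory list; the sketch only shows the scoring step.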
Whole RD-Connect Platform Architecture Overview 2017
[Architecture diagram. Labels recovered from the figure:]
- Web browser
- CAS; LDAP; Integrated security
- Application server (Liferay - Java): Postgresql, REST API, Security
- Application server (Play2 - Scala): Psql, El, REST API, Security
- Application server (Xwiki): Mysql, Solr, REST API (*), Security
- Application server (Spring - Java): Mysql, El, REST API, Security
- Application server (Play2 - Scala): LDAP, REST API, Security
- Genomics; VCFs
- Phenotips
- Biobanks and Registries (ID-Card); Samples (Molgenis)
- IDs; ID relationships (RDF, Postgres, D2R)
- DiseaseCard (REST API)
- Alfa (REST API)
- LUMC Tools (REST API) ***
- Filtration tool (web); Client (Angular)
- UMD (web service **)
Legend: El = Elasticsearch; Psql = Postgres; Solr = Apache Solr
RD-Connect Genomics Platform
https://platform.rd-connect.eu
CAS login
Samples and users
(columns: 2016 Annual Meeting | 2017 Annual Meeting)
Users: 24798
Users connected T1 (Jan-Mar): 34 | 41
Genomic samples: 567 | 2123
Data flow to RD-Connect
[Data-flow diagram. Labels recovered from the figure:]
- Sequencing lab
- Raw data (FASTQ/BAM)
- EGA
- Standard analysis pipeline
- RD-Connect platform
- Analysis tools
- PhenoTips (HPO terms)
- Researcher/Clinician
N = 2123 and counting …
Benchmarking of VC Pipelines
Laurie et al. Human Mutation, 2016
NA12878 50x WGS FASTQs (Illumina Platinum), analysed with several pipelines. Concordance with the gold-standard VC set from GIAB/NIST (Zook et al., 2014) for the reliably callable region of the genome (70%)
Benchmarking of VC Pipelines
Laurie et al. Human Mutation, 2016
[Figure. Values as extracted: 99% 65% 62%; 76% 31% 31%. Region labels: Reliably Callable; Not Reliably Callable]
Genomics platform architecture
[Architecture diagram. Labels recovered from the figure:]
- gVCFs
- Variant Calling & Annotation pipeline
- Hadoop File-system (HDFS)
- Table format (Parquet)
- External Hive table
- Indexed Data (Elasticsearch)
- Real-time Queries
- REST Web Server (Scala)
- Metadata, user info & permissions (Postgres)
- Web Services
- Authorised Access
- Browser Client (Angular)
D. Piscia, J. Protasio, S. Laurie, S. Beltran, JM Fernández, A. Cañada, V. de la Torre et al.
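To make the real-time query path against the indexed data more concrete, the sketch below builds an Elasticsearch query body that filters variants by gene, allele frequency and the samples a user is authorised to see. The field names (`gene`, `allele_freq`, `sample_id`) and the helper `build_variant_query` are hypothetical; the actual index mapping and server code are not shown in the slides. Only the standard Elasticsearch `bool`/`filter`, `terms` and `range` clauses are used.

```python
import json


def build_variant_query(genes, max_allele_freq, sample_ids):
    """Build an Elasticsearch query body (as a dict) for variant filtering.

    All field names are assumptions made for illustration.
    """
    filters = [
        # Keep variants in any of the requested genes.
        {"terms": {"gene": sorted(genes)}},
        # Keep variants at or below the population frequency cut-off.
        {"range": {"allele_freq": {"lte": max_allele_freq}}},
    ]
    if sample_ids:
        # Restrict to samples the user is authorised to access.
        filters.append({"terms": {"sample_id": sorted(sample_ids)}})
    return {"query": {"bool": {"filter": filters}}}


QUERY = build_variant_query({"TTN", "DMD"}, 0.01, ["S1"])
print(json.dumps(QUERY, indent=2))
```

Using `filter` clauses rather than scored `must` clauses is a natural fit here, since variant filtering is a yes/no match and filter results can be cached by Elasticsearch.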
Improved GUI
Improved GUI
RD-Connect Genomics Platform
D. Piscia, J. Protasio, S. Laurie, A. Papakonstantinou, S. Beltran
Preset filters and share queries
Added ClinVar (and looking at HGMD)
ClinVar can be used for filtering, and ClinVar categories are shown
Started conversations to explore integration of HGMD
Get lists of genes associated with OMIM and HPO
Search for OMIM and HPO terms through OMIM and PhenoTips APIs
Predefined lists of genes
OMIM and HPO related genes accessed through OMIM and PhenoTips APIs
Added more lists of genes
New links (incl. HSF, HGMD and gnomAD)
Development of common API to integrate tools through Links
Search across samples (per gene/s) with all filters
Search across samples (per gene/s) with all filters
Exomiser in production
Run Exomiser on filtered results (coming soon)
HPO terms and inheritance model extracted from PhenoTips through API
BBMRI-LPC Whole Exome Sequencing Call for RD (2016)
Goal: to promote the utilization of cutting-edge next-generation sequencing technology for the identification of novel causative variants and genes and to molecularly diagnose rare disease patients. BBMRI-LPC also wants to promote biobanking for rare diseases, the use of rare disease biobanks and responsible data sharing.
To sequence and analyse: 900 exomes in 17 coordinated projects.
Sequencing and analysis carried out at the CNAG-CRG and the Wellcome Trust Sanger Institute (WTSI).
Researchers are able to analyse the data in RD-Connect’s platform.
3/17 projects released through RD-Connect (follow-up session by Manuel and Marina on submission and from Hanns on results)
2016 Main Achievements
Deployment of platform in RD-Connect’s cluster
Improved the CAS; connection with ID-Cards, Sample Catalogue and PhenoTips underway
Genomics platform with 2123 experiments
Filtering by genes linked to OMIM and HPO; PhenoTips API improved
Common API – to add informative links to external
Integration of ClinVar and links to external tools (HSF, gnomAD, HGMD etc.)
Exomiser in production
Additional features for genomics platform
Processing of the BBMRI-LPC projects
Contributors
WP1: Coordination
WP2: Patient registries
WP3: Biobanks
WP4: Bioinformatics
WP5: Unified platform
WP6: Ethical/legal/social
WP7: Impact/Innovation
Hanns Lochmüller (Newcastle and TREAT-NMD)
Domenica Taruscio (ISS and EPIRARE)
Lucia Monaco (Fondaz. Telethon & EuroBioBank)
Ivo Gut (CNAG Barcelona)
Christophe Béroud (INSERM Marseille)
Mats Hansson (Uppsala)
Kate Bushby (Newcastle and EUCERD/EJARD)
CNAG: I. Gut, S. Beltran, D. Piscia, S. Laurie, J. Protasio, A. Papakonstantinou, I. Martinez, R. Tonda, J.R. Trotta
CNIO: A. Valencia, S. Capella, V. de la Torre, J.M. Fernández, A. Cañada
AMU (Marseille): C. Béroud, D. Salgado, J.P. Desvignes
Interactive BioSoftware: A. Blavier, S. Lair
LUMC: P.B. t’Hoen, M. Roos, M. Thompson, R. Raliyaperumal, B. Mons
U. of Toronto: M. Brudno, M. Girdea, S. Dumitriu, O. Buske
EGA: T. Keane, D. Spalding, J. Paschall, J. Almeida-King, J. Rambla
Newcastle U.: H. Lochmüller, R. Thompson, A. Topf, I. Zaharieva
U. Aveiro: J.L. Oliveira, P. Lopes, P. Sernaleda
U. of Patras: G. Patrinos
Murdoch U.: M. Bellgard