Canada's IRIDA platform for genomic epidemiology
Gary Van Domselaar
Chief, Bioinformatics
National Microbiology Lab
Public Health Agency of Canada
• Integrated Rapid Infectious Disease Analysis (IRIDA): an informatics platform to support real-time infectious disease outbreak investigations
• Open source, standards-compliant resource for public health agencies and researchers that complements other initiatives
Platform Overview
[Figure: IRIDA architecture. Labels: IRIDA Servlet Container (REST API, Web Interface, Application Logic); Central File Storage; Compute Cluster (Galaxy); command line (>_); Galaxy]
Getting data into IRIDA
• Manual web interface upload
• Automated instrument upload (Illumina MiSeq)
Data Management
• User Access Control
• System User Management
• Project User Management
Getting data out of IRIDA
• Sharing project data
• Downloading
• Export to external Galaxy instance
• Exporting to the command-line
NCBI SRA Upload
Accession
Data Sharing
Project X
Project Y
Project Z
Data Analysis
[Figure: IRIDA and Galaxy. Labels: IRIDA; Galaxy API; Galaxy workers; Assembly Tools, Variant Calling Tools, …; example pipeline: Assembly (Kmer=99, Min=500)]
Sample Selection
Pipeline Selection
Analysis Execution
The IRIDA SNVPhyl Pipeline
Stages shown in the workflow figure:
• User selects isolates and selects a Reference Genome
• Isolate Sequencing Reads
• Variant Calling (per isolate)
• Variant Consolidation
• HGT & Recombination Filtering
• Repeat region filtering
• Meta-alignment generation
• SNV Matrix
• Whole Genome Phylogeny
• Phylogeny Viewer
• …
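The "SNV Matrix" stage can be illustrated with a minimal sketch. This is not SNVPhyl's implementation: it assumes the meta-alignment has already been reduced to equal-length aligned sequences, and the function name `snv_distance_matrix`, the isolate names, and the rule of skipping gap (`-`) and `N` positions are all hypothetical choices made for illustration.

```python
from itertools import combinations


def snv_distance_matrix(alignment):
    """Count pairwise differing positions between aligned sequences.

    `alignment` maps an isolate name to its aligned sequence. All sequences
    must have the same length. Positions where either sequence carries a
    gap ('-') or an 'N' are skipped (an assumption of this sketch).
    """
    names = sorted(alignment)
    if len({len(alignment[name]) for name in names}) > 1:
        raise ValueError("all sequences must have the same length")

    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        distance = sum(
            1
            for x, y in zip(alignment[a], alignment[b])
            if x != y and x not in "-N" and y not in "-N"
        )
        # The matrix is symmetric, so fill both cells.
        matrix[a][b] = matrix[b][a] = distance
    return names, matrix


# Toy alignment, invented for illustration only.
example = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGA",
    "isolate_C": "ACCTAC-A",
}
names, matrix = snv_distance_matrix(example)
for name in names:
    print(name, [matrix[name][other] for other in names])
```

A matrix like this is the usual input to a distance-based summary of an outbreak cluster; the phylogeny itself would be inferred from the alignment by a dedicated tool.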
Analysis Results
Automated Assemblies
Analysis Provenance
Auditing
• Every creation, modification, or deletion of data is audited.
• Data can be restored after accidental deletion or modification.
• Data can be traced back to justify decisions.
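The auditing behaviour described above can be sketched as an append-only revision log. This is a toy model, not IRIDA's auditing code: the class `AuditedStore`, its methods, and the full-snapshot-per-revision strategy are assumptions chosen to keep the example short.

```python
import copy
from datetime import datetime, timezone


class AuditedStore:
    """Key-value store that records every change in an append-only log."""

    def __init__(self):
        self._data = {}
        self._log = []  # one entry per revision, never rewritten

    def _record(self, action, key, user):
        self._log.append({
            "revision": len(self._log) + 1,
            "action": action,
            "key": key,
            "user": user,
            "timestamp": datetime.now(timezone.utc),
            # Snapshot of the state *after* the change (simple, not space-efficient).
            "snapshot": copy.deepcopy(self._data),
        })

    def put(self, key, value, user):
        action = "modify" if key in self._data else "create"
        self._data[key] = value
        self._record(action, key, user)

    def delete(self, key, user):
        del self._data[key]
        self._record("delete", key, user)

    def restore(self, revision, user):
        """Roll the store back to its state after `revision`; this is audited too."""
        self._data = copy.deepcopy(self._log[revision - 1]["snapshot"])
        self._record("restore", None, user)

    def get(self, key):
        return self._data.get(key)

    def history(self, key):
        """Trace back who did what to `key`, in order."""
        return [(e["revision"], e["action"], e["user"])
                for e in self._log if e["key"] == key]


store = AuditedStore()
store.put("sample-1", {"organism": "Salmonella enterica"}, user="alice")
store.put("sample-1", {"organism": "Salmonella enterica", "source": "food"}, user="bob")
store.delete("sample-1", user="bob")   # accidental deletion
store.restore(2, user="alice")         # back to the state after revision 2
print(store.get("sample-1"))
print(store.history("sample-1"))
```

Because the restore is itself logged as a new revision, the trail stays complete: nothing in the history is overwritten when data is recovered.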
Sequencing Quality Control
Analytical Tool
Quality Control Module
Quality Metrics
Quality Control
Analysis QA/QC Model
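As an illustration of a quality control module that produces quality metrics, here is a minimal sketch that summarizes FASTQ reads. It is not IRIDA's QC code. It assumes four-line FASTQ records with Phred+33 quality encoding; the function names and the pass threshold of 30 are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 by default)."""
    return [ord(ch) - offset for ch in quality_line]


def read_quality_metrics(fastq_text):
    """Return read count, base count, and mean base quality for FASTQ text.

    Assumes each record spans exactly four lines (no wrapped sequences).
    """
    lines = fastq_text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ records must have four lines each")

    reads = bases = quality_sum = 0
    for i in range(0, len(lines), 4):
        sequence, quality = lines[i + 1], lines[i + 3]
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        reads += 1
        bases += len(sequence)
        quality_sum += sum(phred_scores(quality))

    mean_quality = quality_sum / bases if bases else 0.0
    return {"reads": reads, "bases": bases, "mean_quality": mean_quality}


def passes_qc(metrics, min_mean_quality=30.0):
    """Hypothetical pass/fail rule: mean base quality must reach the threshold."""
    return metrics["mean_quality"] >= min_mean_quality


# Two toy reads, invented for illustration only.
EXAMPLE_FASTQ = """\
@read1
ACGT
+
IIII
@read2
ACG
+
5II
"""
metrics = read_quality_metrics(EXAMPLE_FASTQ)
print(metrics, passes_qc(metrics))
```

Keeping the metrics separate from the pass/fail rule lets different analyses apply their own thresholds to the same measurements.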
Types of (Meta)Data Standardized Within IRIDA
• Lab Analytics: Genomics, PFGE, Serotyping, Phage typing, MLST, AMR
• Sample Metadata: Isolation Source (Food, Host Body Product, Environmental), BioSample
• Epidemiology Investigation: Exposures
• Clinical Data: Patient demographics, Medical History, Comorbidities, Symptoms, Health Status
• Reporting: Case/Investigation Status
“Not just what data IS collected, but what SHOULD be collected”
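The idea of standardizing what should be collected can be sketched as a completeness check on a record. The category keys below mirror the slide's metadata types, but the specific field names, the `EXPECTED_FIELDS` table, and the example record are hypothetical and do not reflect IRIDA's actual schema.

```python
# Hypothetical expected fields per metadata category (not IRIDA's schema).
EXPECTED_FIELDS = {
    "lab_analytics": ["mlst", "serotype"],
    "sample_metadata": ["isolation_source"],
    "epidemiology": ["exposures"],
    "clinical": ["symptoms"],
    "reporting": ["case_status"],
}


def missing_fields(record):
    """List the 'category.field' entries that should be present but are not."""
    missing = []
    for category, fields in EXPECTED_FIELDS.items():
        present = record.get(category, {})
        for field in fields:
            if present.get(field) in (None, ""):
                missing.append(f"{category}.{field}")
    return missing


# Illustrative record, invented for this example.
record = {
    "lab_analytics": {"mlst": "ST19", "serotype": "Typhimurium"},
    "sample_metadata": {"isolation_source": "Food"},
    "epidemiology": {},
    "reporting": {"case_status": "open"},
}
print(missing_fields(record))
```

A report of missing fields is one simple way to prompt submitters for data that was never captured, rather than only validating what was.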
Tools: GenGIS
Coming Soon: CARD – Comprehensive Antibiotic Resistance Database
Coming Soon: The Salmonella In Silico Typing Resource (SISTR)
SISTR: https://lfz.corefacility.ca/sistr-app/
• In silico analysis of WGS data: assembly statistics, serovar prediction, in silico typing (MLST, cgMLST), AMR prediction
• Comparative genomic analyses: cgMLST, accessory gene content, core SNPs
• Epidemiologic analysis: geospatial distribution, temporal distribution, source association
Coming Soon: IslandViewer and IslandCompare
• Outbreak investigation
• Routine surveillance
PulseNet Canada Deployment
http://irida.ca
Contact
• Project Information: http://www.irida.ca
• Project source:
– https://github.com/phac-nml/irida
– https://github.com/phac-nml/irida-miseq-uploader
– https://github.com/phac-nml/irida-galaxy-importer
• Documentation: https://irida.corefacility.ca/documentation/
• E-mail: [email protected]
• IRC: #irida on irc.freenode.net