National Center for Emerging and Zoonotic Infectious Diseases
PulseNet Updates: Transitioning to WGS for Reference Testing and Surveillance
Kelley Hise, MPH
Enteric Diseases Laboratory Branch
Division of Foodborne, Waterborne and Environmental Diseases
PulseNet/OutbreakNet West Coast Regional Meeting
February 5, 2019
Overview
Transition Timeline
Updates on Conversions/Certifications
Sequencing Prioritization
Turn Around Times
Data Analysis Workflow with the National Databases
Combined Organism Databases
Transition Timeline
Updates on Conversions and Certifications
Dates for PulseNet’s Transition to WGS as the Gold Standard for Foodborne Surveillance
January 15, 2018: Listeria
October 1, 2018: Campylobacter
March 15, 2019: Salmonella, STEC, Shigella
PulseNet, WGS and Enhanced Epidemiological Capacity
[Map of U.S. states and PulseNet laboratories; the status of each laboratory is not recoverable from the transcript. Modified: February 1, 2019]
Map legend: Converted to BioNumerics 7.6; Converted to BioNumerics 7.6 and WGS analysis certified for Listeria, Salmonella, Escherichia and Campylobacter; Not WGS wet lab certified; OutbreakNet Enhanced or FoodCORE; Area Laboratories; PulseNet Central
Laboratories labeled in addition to the states: DC, PR, NYC, NYAG, NJEP, CA2, CAOC, CASC, LAC, HU, NVLV, IL2, FNE, FDA, USDA/FSIS (EL), USDA (MWL), USDA/FSIS (WL)
Converted to BioNumerics 7.6: 31 states, 35 labs
WGS Analysis Certified: 5 states, 5 labs
Conversion Tips
You MUST clean your local databases
– DATES: must be in correct format
– KEYS: no spaces at the end
– BUNDLES: permanent bundles should be deleted or moved to a different location
– PLUGINS: all plugins must be deactivated. The MLVA plugin is known to cause an issue when converting.
– LIBRARIES: all libraries should be deleted
– COMPARISONS: saved comparisons will be lost in the conversion
For more detailed info: SharePoint > PulseNet Documents > Database > Cleaning Guidelines
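The DATES and KEYS checks above can be run as a quick scan before conversion. The following is a minimal sketch, assuming entries have been exported from the local database as Python dictionaries. The field names (`Key`, `ReceivedDate`) and the `YYYY-MM-DD` date format are hypothetical placeholders: the slide does not state the required format, so substitute the one given in the cleaning guidelines.

```python
from datetime import datetime

# Assumption: the slide does not give the required date format;
# "%Y-%m-%d" is a placeholder to replace with the documented one.
DATE_FORMAT = "%Y-%m-%d"


def find_cleaning_issues(entries, date_fields=("ReceivedDate",), date_format=DATE_FORMAT):
    """Return (key, problem) pairs for entries that need cleaning."""
    issues = []
    for entry in entries:
        key = entry.get("Key", "")
        # KEYS: no spaces at the end
        if key != key.rstrip():
            issues.append((key, "key has trailing whitespace"))
        # DATES: must parse in the expected format (empty values are skipped)
        for field in date_fields:
            value = entry.get(field, "")
            if not value:
                continue
            try:
                datetime.strptime(value, date_format)
            except ValueError:
                issues.append((key, f"{field} is not in the expected format: {value!r}"))
    return issues


# Hypothetical exported entries, for illustration only
entries = [
    {"Key": "LAB0001", "ReceivedDate": "2018-11-02"},
    {"Key": "LAB0002 ", "ReceivedDate": "2018-11-02"},
    {"Key": "LAB0003", "ReceivedDate": "11/02/18"},
]
issues = find_cleaning_issues(entries)
for key, problem in issues:
    print(repr(key), problem)
```

The bundle, plugin, library and comparison items are actions taken inside BioNumerics itself and are not covered by this sketch.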
Conversion Tips
Local IT support should be readily available the week of your conversion
Review the Prep instructions
Review training documents on SharePoint so you are ready to upload PFGE patterns as soon as you have converted (WGS tools to be added in March)
– SharePoint > PulseNet Documents > WGS > PHL Upgrade to BioNumerics v7.6 > BN7 PFGE Data Management
Call/email PulseNet with questions related to conversion
Once converted, email [email protected] to let them know and request analysis certification information
Analysis Certifications: Request after Conversion
PNQ08 has been updated
WGS Analysis Certification is available for Escherichia, Salmonella, Listeria, and Campylobacter using BioNumerics v7.6
Receive:
(1) Certification set assignment A, B or C (e.g. Listeria Certification Set_A)
(2) Instructions for accessing fastq files, associated metadata, a bundle file and an analysis submission template via the PulseNetQA FTP site
Analysis Certifications: Tips
Do not read too much into the quality metrics threshold tables
– No need to list every possible reason a sequence should be repeated
• Pay attention to the metrics in red on the table in PNQ08-6
• Metrics in black type are important and can provide information but aren’t required for data to be uploaded to PulseNet
– Use metrics listed and not those for wet lab certifications (e.g. cannot find median insert size in BioNumerics)
– Look at NCBI submission presentation and SOP to understand NCBI metadata requirements
Sequencing Prioritization and Turn Around Times
Sequencing Prioritization (as of March 15, 2019)
Listeria and STEC: sequence all isolates
Salmonella: sequence all isolates if possible
– prioritize isolates with cluster codes (while PFGE remains)
– random sequencing, e.g. every other or every third
– as requested by CDC and/or epi
Campylobacter and Shigella: sequence all isolates, but prioritize other organisms first*
*unless specifically funded by other projects, like FoodNet
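One way to apply the prioritization above to a sequencing queue is sketched below. The numeric tiers are an illustrative encoding, not part of the guidance. The slide does not rank Listeria/STEC against cluster-coded Salmonella, nor say where separately funded Campylobacter or Shigella isolates fall, so both placements are assumptions. The record fields (`organism`, `cluster_code`, `requested`, `project_funded`) are hypothetical.

```python
def sequencing_priority(isolate):
    """Lower number = sequence sooner. Tier values are illustrative assumptions."""
    organism = isolate["organism"]
    if organism in ("Listeria", "STEC"):
        return 0  # sequence all isolates
    if organism == "Salmonella":
        # cluster-coded isolates and CDC/epi requests come before routine ones
        if isolate.get("cluster_code") or isolate.get("requested"):
            return 0
        return 1
    if organism in ("Campylobacter", "Shigella"):
        # after other organisms, unless funded by another project (e.g. FoodNet);
        # treating funded isolates as tier 1 is an assumption
        return 1 if isolate.get("project_funded") else 2
    raise ValueError(f"unexpected organism: {organism}")


def every_nth(isolates, n):
    """Systematic subsample, e.g. n=2 for every other, n=3 for every third."""
    return isolates[::n]


# Hypothetical queue, in order of receipt
queue = [
    {"key": "A", "organism": "Campylobacter"},
    {"key": "B", "organism": "Salmonella"},
    {"key": "C", "organism": "Listeria"},
    {"key": "D", "organism": "Salmonella", "cluster_code": "hypothetical-code"},
]
# sorted() is stable, so isolates in the same tier keep their order of receipt
ordered = [isolate["key"] for isolate in sorted(queue, key=sequencing_priority)]
print(ordered)
```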
Turn Around Times (TATs)
Starting January 1, 2019 TAT will be calculated from the date the isolate was received (or recovered) in the PHL to the date of upload to the national database
Day 1 is date of receipt of a culture; for CIDTs, day 1 is not until an isolate is recovered
Should be 7 working days or less for WGS
Keep track of local TATs
– Track steps along the way to determine areas for improvement
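For local tracking outside BioNumerics, the TAT rule above can be approximated with a short working-day count. This is a sketch under two assumptions: "Day 1 is date of receipt" is read as inclusive counting (an isolate uploaded on the day it is received has a TAT of 1), and working days are Monday through Friday minus an optional holiday list that the slide does not define.

```python
from datetime import date, timedelta


def working_day_tat(day_one, uploaded, holidays=()):
    """Count working days from day 1 through the upload date, inclusive.

    day_one:  date the culture was received or, for CIDT specimens,
              the date the isolate was recovered.
    holidays: optional dates to skip (an assumption; the slide only
              says "working days").
    """
    if uploaded < day_one:
        raise ValueError("upload date precedes day 1")
    skipped = set(holidays)
    count = 0
    current = day_one
    while current <= uploaded:
        if current.weekday() < 5 and current not in skipped:  # Monday to Friday
            count += 1
        current += timedelta(days=1)
    return count


# Received Monday, January 7, 2019; uploaded Tuesday, January 15, 2019
tat = working_day_tat(date(2019, 1, 7), date(2019, 1, 15))
print(tat, "working days; within 7-day target:", tat <= 7)
```

Recording a date for each intermediate step (for example extraction, library prep, run completion) and applying the same function between consecutive steps would show where the time is spent.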
Turn Around Times: Calculate in BioNumerics
1. Select entries to calculate TAT
2. Click on clock at top of screen
3. Define parameters
NOTE
• Upload_Date is not available
• PulseNet_UploadDate is populated in BN7.6 upon upload
• Suggest moving contents of Upload_Date to PulseNet_UploadDate for entries prior to conversion
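The suggested backfill amounts to copying one field into another wherever the new field is empty. The sketch below shows that logic on plain dictionaries. It does not use the BioNumerics scripting interface, and it assumes the entries are available as exported records with the two field names from the note.

```python
def backfill_upload_date(entries):
    """Copy Upload_Date into PulseNet_UploadDate where the latter is empty.

    Values already present in PulseNet_UploadDate (populated by BN7.6 on
    upload) are never overwritten. Returns the number of entries changed.
    """
    changed = 0
    for entry in entries:
        old_value = entry.get("Upload_Date")
        if old_value and not entry.get("PulseNet_UploadDate"):
            entry["PulseNet_UploadDate"] = old_value
            changed += 1
    return changed


# Hypothetical records: one from before conversion, one uploaded after
records = [
    {"Key": "LAB0001", "Upload_Date": "2018-06-01", "PulseNet_UploadDate": ""},
    {"Key": "LAB0002", "Upload_Date": "", "PulseNet_UploadDate": "2019-01-20"},
]
changed = backfill_upload_date(records)
print(changed, "entry updated")
```

The old field is left in place here rather than cleared, so the copy can be checked before anything is removed.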
Data Analysis Workflow with the National Databases
1. Sequence isolates using PulseNet Key number
2. Save generated sequence files locally, on BaseSpace, or external hard drive
File naming format: Key-LabID-M###-YYMMDD
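A small helper can keep names consistent with the Key-LabID-M###-YYMMDD pattern. This is a sketch, assuming "M###" means the letter M followed by a three-digit number and "YYMMDD" is a two-digit-year date. The slide does not say what the number or the date refer to, and the sample key and lab ID below are made up.

```python
import re
from datetime import date

# Assumption: "M###" = letter M plus exactly three digits; "YYMMDD" = six digits.
NAME_PATTERN = re.compile(
    r"^(?P<key>.+)-(?P<lab_id>[^-]+)-M(?P<number>\d{3})-(?P<yymmdd>\d{6})$"
)


def build_name(key, lab_id, number, when):
    """Assemble a name in the Key-LabID-M###-YYMMDD format."""
    if not 0 <= number <= 999:
        raise ValueError("the M### part must fit in three digits")
    return f"{key}-{lab_id}-M{number:03d}-{when:%y%m%d}"


def parse_name(name):
    """Split a name back into its four parts, or raise ValueError."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow Key-LabID-M###-YYMMDD: {name!r}")
    return match.groupdict()


name = build_name("KEY0001", "XX", 42, date(2019, 2, 5))
print(name)
print(parse_name(name))
```

Parsing anchors on the last three hyphen-separated parts, so a key that itself contains a hyphen is still recovered whole.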
Data Analysis Workflow with National Database
[Workflow diagram. Components: PHL; Raw Sequence Data; Private Raw Sequence Storage; Public Raw Sequence Data Storage; Reference ID Database; Organism-specific Database; PulseNet National Databases]
3a. Link sequence data to Reference ID database by PulseNet Key name
3b. Submit data to calculation engine (CE*) for de novo assembly, species identification (Genus, Species – by ANI)
3c. Verify quality
3d. Export entries
4a. Import entries from Reference ID
4b. Add demographic information for entries
4c. Submit to the CE* for allele calls and genotyping results (serotype, AST, virulence)
4d. Verify quality and upload to national database (WGS id automatically assigns)
4e. Upload raw data sequence reads to NCBI
4f. Perform surveillance in BioNumerics
*CE: Calculation Engine
Updated 10/23/2018
Calculation Engine (CE)
Calculation engine built to be highly customizable with easy integration of both custom-made and open source code
CE Store: server on the CE that states upload their data to
– Offers temporary storage of sequences
QA/QC, trimming, mapping, SNP detection, allele detection
[Diagram: Calculation Engine at CDC]
RefID Database to Organism-Specific Database
Select sequences to submit to the CDC’s calculation engine (CE) and retrieve the de novo assemblies and basic QC metrics
Once assemblies are received, resubmit to the CE and get back the taxonomic identification
De novo assemblies, QC metrics and taxa ID can then be exported and imported into the organism-specific database based on the genus/species identified. Either create a new entry or link to a previously imported entry (e.g. PFGE already done).
[Diagram: PHL (Reference ID Database, Organism-specific Database) and CDC (Calculation Engine, Allele Databases)]
3b and c. Submit raw reads, and retrieve assembly with basic QC metrics
3d. Export de novo assemblies, QC metrics, taxa ID to correct organism-specific database
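The "create new entry or link to previously imported entry" step is essentially an upsert keyed on the PulseNet Key, routed by the genus the CE returned. The sketch below illustrates that logic with in-memory dictionaries standing in for the databases. It is not how BioNumerics implements the import, and every field and key name is hypothetical.

```python
def import_from_reference_id(exported, organism_dbs):
    """Route exported Reference ID records into organism-specific databases.

    exported:     list of dicts with "key", "genus", "assembly" and "qc".
    organism_dbs: dict mapping genus -> {key: entry dict}.
    Returns (key, action) pairs; action is "linked", "created" or
    "no database for genus".
    """
    actions = []
    for record in exported:
        database = organism_dbs.get(record["genus"])
        if database is None:
            actions.append((record["key"], "no database for genus"))
            continue
        wgs_data = {"assembly": record["assembly"], "qc": record["qc"]}
        if record["key"] in database:
            # Entry already exists (e.g. PFGE already done): attach the WGS data
            database[record["key"]].update(wgs_data)
            actions.append((record["key"], "linked"))
        else:
            database[record["key"]] = wgs_data
            actions.append((record["key"], "created"))
    return actions


# Hypothetical data: one isolate already has a PFGE entry
organism_dbs = {"Salmonella": {"KEY0001": {"pfge_pattern": "pattern-1"}}, "Listeria": {}}
exported = [
    {"key": "KEY0001", "genus": "Salmonella", "assembly": "KEY0001.fasta", "qc": {"contigs": 50}},
    {"key": "KEY0002", "genus": "Listeria", "assembly": "KEY0002.fasta", "qc": {"contigs": 20}},
]
actions = import_from_reference_id(exported, organism_dbs)
print(actions)
```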
Submission to the Calculation Engine
Select sequences for analysis
Choose from a list of algorithms to select for analysis (figure)
Submit assemblies and raw reads to the CDC calculation engine
[Diagram: PHL (Organism-specific Database) and CDC (Calculation Engine, Allele Databases)]
4c. Submit sequence data for allele calls and genotyping results
Analyzed Results in BioNumerics
User retrieves jobs from calculation engine
Allele calls/additional quality metrics are imported into user database, and include predicted:
– Serotype
– Resistance
– Virulence
[Diagram: PHL (Organism-specific Database) and CDC (Calculation Engine, Allele Databases)]
Retrieve allele calls after submission: Predicted Serotype, Resistance, Virulence
Upload to the PulseNet National Database
Authenticate to PulseNet firewall
Select entries and analyzed data to upload to PulseNet national databases
User can search national database for close matches to uploaded sequence data and download things like outbreak codes and allele codes
[Diagram: PHL (Organism-specific Database) and CDC (PulseNet National Databases)]
4e. Upload allele calls and metadata
Download allele code, outbreak code, etc.
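The slide does not say how "close matches" are computed in BioNumerics. A generic way to compare allele-call profiles is to count the loci at which two isolates carry different alleles, ignoring loci without a call in either profile. The sketch below shows that idea only; the locus names, isolate keys and threshold are made up, and this should not be read as the PulseNet matching method.

```python
def allele_differences(profile_a, profile_b):
    """Count loci called in both profiles where the allele numbers differ."""
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def close_matches(query, database, max_differences):
    """Return (key, differences) pairs within the threshold, closest first."""
    hits = []
    for key, profile in database.items():
        differences = allele_differences(query, profile)
        if differences <= max_differences:
            hits.append((key, differences))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical three-locus profiles; None marks a missing allele call
query = {"locus1": 1, "locus2": 4, "locus3": 7}
database = {
    "ISO_A": {"locus1": 1, "locus2": 4, "locus3": 7},
    "ISO_B": {"locus1": 1, "locus2": 4, "locus3": 9},
    "ISO_C": {"locus1": 2, "locus2": 5, "locus3": None},
}
matches = close_matches(query, database, max_differences=1)
print(matches)
```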
Submission to NCBI
Can create and save templates for upload of biosample, sequence metadata, and fastq files to NCBI
Can import NCBI-assigned ids back into user database (e.g. NCBI accession and SRR numbers)
[Diagram: PHL (Organism-specific Database) to Public Raw Sequence Data Storage]
Upload raw sequence data with minimal metadata
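Importing NCBI-assigned ids back into a local database is a lookup by isolate key plus a format check. The sketch below assumes that "NCBI accession" means a BioSample accession (NCBI-issued ones start with `SAMN`) and that run ids start with `SRR`. The local field names and the sample values are hypothetical, and the BioNumerics import itself is not shown.

```python
import re

# BioSample accessions issued by NCBI start with "SAMN"; SRA runs with "SRR".
BIOSAMPLE_PATTERN = re.compile(r"^SAMN\d+$")
SRA_RUN_PATTERN = re.compile(r"^SRR\d+$")


def attach_ncbi_ids(entries, assigned):
    """Store NCBI-assigned ids on local entries.

    entries:  {key: entry dict}
    assigned: iterable of (key, biosample_accession, sra_run) tuples.
    Returns the keys that were skipped (unknown key or malformed id).
    """
    skipped = []
    for key, biosample, sra_run in assigned:
        valid = (
            key in entries
            and BIOSAMPLE_PATTERN.match(biosample)
            and SRA_RUN_PATTERN.match(sra_run)
        )
        if not valid:
            skipped.append(key)
            continue
        # Field names below are placeholders, not BioNumerics field names
        entries[key]["ncbi_accession"] = biosample
        entries[key]["srr_id"] = sra_run
    return skipped


# Made-up ids, for illustration only
entries = {"KEY0001": {}, "KEY0002": {}}
assigned = [
    ("KEY0001", "SAMN00000001", "SRR0000001"),
    ("KEY0002", "SAMN00000002", "not-a-run-id"),
]
skipped = attach_ncbi_ids(entries, assigned)
print(entries["KEY0001"], "skipped:", skipped)
```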
For more information, contact CDC
1-800-CDC-INFO (232-4636)
TTY: 1-888-232-6348
www.cdc.gov
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Questions?
Telephone: 404-639-4558
E-mail: [email protected]
Web: www.cdc.gov/pulsenet
#PulseNet