the human variome database in australia in 2014 - graham taylor


Upload: human-variome-project

Post on 10-May-2015


Category:

Science



DESCRIPTION

There are a number of genetics and genomics initiatives underway in Australia, including the Australian node of the Human Variome Project (HVPA), as well as many active research collaborations covering familial cancer, endocrine disease, and developmental delay. Most of these projects work with disease-specific databases on a research basis, with the risk that such archives may be ephemeral. HVPA is the only database that is directly integrated with accredited clinical reporting of variants. As such it is designed to capture variants that have passed scrutiny as diagnostically robust, and have therefore already been curated by qualified staff. Registered users access the HVPA database via a secure Internet portal. I will describe three recent developments of the HVPA database and portal: the upgraded search interface, linkage to other datasets via BioGrid using hash-based de-identified case matching, and the introduction of a genome-wide database using LOVD3. Finally I will discuss the future direction of the HVPA and the questions of utility, quality control and sustainability of genetic variation databases.

Search interface

The search interface has to provide useful tools for clinicians and lab scientists so that the HVPA project offers them direct benefits and incentivises them to participate. Following a request for feedback from users, a series of improvements was implemented, initially on a demonstration server and then on the live server following review by the Steering Committee. The highest priorities were more information about the number of times particular variants were recorded, the ability to search by range, and the ability to filter by pathogenicity. There was also interest in enabling direct uploading of VCF files and the automated calculation of pathogenicity scores. Many of these features are now implemented and examples will be presented.

Linkage to other datasets

We have implemented the hash key algorithm and work is in progress with BioGrid to link variation data to clinical data sets.

Genome-wide database

We have established an HVPA LOVD3 database and are working with the Human Genetics Society of Australasia on a pilot study to sequence the exomes of two trios and review the data using this database.

TRANSCRIPT

Page 1: The Human Variome Database in Australia in 2014 - Graham Taylor
Page 2: The Human Variome Database in Australia in 2014 - Graham Taylor

Acknowledgments

Genomic Medicine & Translational Pathology, University of Melbourne: Arthur Lian Chi Hsu, Renate Marquis-Nicholson, Sebastian Lunke, Clare Love, Kym Pham, Olga Kondrashova, Matt Wakefield, Tiffany Cowie, Barney Rudzki and Paul Waring

Human Variome Project: Tim Smith, Alan Lo, Melvyn Leong, David Perkins, Heather Howard, Rania Horaitis, Dick Cotton

BioGrid: Maureen Turner, Leon Heffer

Royal College of Pathologists of Australasia: Vanessa Tyrrell

Peter MacCallum Cancer Centre: Ken Doig, Andrew Fellowes

Victorian Clinical Genetics Service: John-Paul Plazzer, Desiree Du Sart

Page 3: The Human Variome Database in Australia in 2014 - Graham Taylor

Human Variome Project (Australasia)

• The bigger picture

• Infrastructure and search interface

• Linkage to other datasets

• Panel, exome and genome testing

• Database accreditation

• Next steps

Page 4: The Human Variome Database in Australia in 2014 - Graham Taylor

The big picture

• Rediscovery at the genomics community level that data sharing is win-win

• The Genomic Alliance, HGVS, HUGO

– Data standards

– Nomenclature

– Infrastructure

Page 5: The Human Variome Database in Australia in 2014 - Graham Taylor

Nature (Perspective) 508 469-475 2014 Guidelines for investigating causality of sequence variants in human disease

D. G. MacArthur, T. A. Manolio, D. P. Dimmock, H. L. Rehm, J. Shendure, G. R. Abecasis, D. R. Adams, R. B. Altman, S. E. Antonarakis, E. A. Ashley, J. C. Barrett, L. G. Biesecker, D. F. Conrad, G. M. Cooper, N. J. Cox, M. J. Daly, M. B. Gerstein, D. B. Goldstein, J. N. Hirschhorn, S. M. Leal, L. A. Pennacchio, J. A. Stamatoyannopoulos, S. R. Sunyaev, D. Valle, B. F. Voight, W. Winckler & C. Gunter.

Priorities for research and infrastructure development

1. Improved public databases of human genetic variants incorporating explicit, up-to-date supporting evidence for variant implication in disease and audit trails recording changes in interpretation.

2. Improved incentives, and ethical and logistical solutions, for sharing of genetic and phenotypic data from both research and clinical diagnostic laboratories.

3. Public databases of variant and allele frequency data from large sets of population reference samples from a wide range of ancestries.

4. Large-scale genotyping of reported human disease-causing variants in large, well-phenotyped population cohorts, reducing biases in the assessment of the associated penetrance and phenotypic heterogeneity.

5. Development and benchmarking of standardized, quantitative statistical approaches for objectively assigning probability of causation to new candidate disease genes and variants.

Déjà vu all over again?

Page 6: The Human Variome Database in Australia in 2014 - Graham Taylor

Nature Genetics 46, 107–115 (2014) Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database

Bryony A Thompson, Amanda B Spurdle, John-Paul Plazzer, Marc S Greenblatt, Kiwamu Akagi, Fahd Al-Mulla, Bharati Bapat, Inge Bernstein, Gabriel Capellá, Johan T den Dunnen, Desiree du Sart, Aurelie Fabre, Michael P Farrell, Susan M Farrington, Ian M Frayling, Thierry Frebourg, David E Goldgar, Christopher D Heinen, Elke Holinski-Feder, Maija Kohonen-Corish, Kristina Lagerstedt Robinson, Suet Yi Leung, Alexandra Martins, Pal Moller, Monika Morak, Minna Nystrom, Paivi Peltomaki, Marta Pineda, Ming Qi, Rajkumar Ramesar, Lene Juel Rasmussen, Brigitte Royer-Pokora, Rodney J Scott, Rolf Sijmons, Sean V Tavtigian, Carli M Tops, Thomas Weber, Juul Wijnen, Michael O Woods, Finlay Macrae & Maurizio Genuardi, on behalf of InSiGHT.


1. Leiden Open Variation Database (LOVD)

2. Microattribution using Open Researcher & Contributor Identification (ORCID)

3. Variant Interpretation Committee (VIC) applies a 5-tiered classification scheme developed by the International Agency for Research on Cancer (IARC)

4. Endorsed by the Human Variome Project (HVP)

Page 7: The Human Variome Database in Australia in 2014 - Graham Taylor

Not everything in the Nature portfolio is gold

It is good to supplement your pocket money

Page 8: The Human Variome Database in Australia in 2014 - Graham Taylor

Early nomenclature papers

• Beaudet

• Tsui

• Antonarakis

Page 9: The Human Variome Database in Australia in 2014 - Graham Taylor

Translation into diagnostic practice

• 15 years ago Cotton predicted that the majority of human genetic variants will be detected in a diagnostic context

• As NGS moves into a service setting this transition will become even clearer

• Genetic variants will become part of a patient’s medical record

Page 10: The Human Variome Database in Australia in 2014 - Graham Taylor

HVPA database

• Primarily for and of diagnostics

• Diagnostic services are busy

• And cash and time limited

• We have to make it easy for them

• And secure

• And useful

• Maybe even essential

Page 11: The Human Variome Database in Australia in 2014 - Graham Taylor

HVPA Objective

A national data sharing facility for improving clinical genetic testing services and supporting medical research

Constitutional, not somatic, mutations

NeCTAR project grant UoM FE31082

“Clinical and Molecular Data Linkage Tools”, completion date 30th June 2014

Page 12: The Human Variome Database in Australia in 2014 - Graham Taylor

Infrastructure and search interface

• Data repository (“the database”)

• Data handling tools that support data upload from laboratories

• Portal through which the database can be browsed

• Website for news and notifications

Page 13: The Human Variome Database in Australia in 2014 - Graham Taylor

Human Variome Project Australian Node

What We’ve Done

• NeAT Funding (2010-2011)

– Pilot Phase: 4 labs, 3 diseases (Breast Cancer, Colon Cancer, Huntington’s)

– Portal Launched April 2011

– Molecular Data Only

– Collaboration with Mawson

• NeCTAR Funding (2012-2014)

– 12 more labs + all genes they test for

– Configuration Tool

– Clinical Data/Phenotype Linkage

– Transfer data internationally

What We Built

• Collection Tool

• Portal

• Data Model

• Ethics Processes

• Access & Usage Policy

• Data Sharing Agreements

Page 14: The Human Variome Database in Australia in 2014 - Graham Taylor

How it works

• Software to interface with existing LIMS (or lack thereof)

• Collection occurs after report has been issued

• Data types:

– All classified variants reported by a lab

– Benign variants

– NGS/Incidental findings

– Not collecting negative results

• Secure data link between lab and Node

• (Semi)-automatic transfer of data

• Portal to allow interrogation of all Australian data

– http://www.hvpaustralia.org.au

• Linkage key generator

• Submission to BioGrid Platform

Page 15: The Human Variome Database in Australia in 2014 - Graham Taylor

Open-Source Solutions

• HVP Portal (v1.0, r512) - A web application which features the basic interface for browsing and querying an HVP node.

– Open source

– MIT License

– Python/Django

• HVP Exporter (v1.0, r512) - Basic HVP exporting tool for laboratories. Features a simple GUI and error checking interface, a plug-in architecture for customisation between sites, and common libraries for working with MS Access and MS Excel data sources.

– Open source

– MIT License

– .NET C#, Python/IronPython

• HVP Importer (v1.0, r512) - A series of tools and web services that receive, decrypt and process information from submitting laboratories using the standard transaction XML format.

– Open source

– MIT License

– Python

Page 16: The Human Variome Database in Australia in 2014 - Graham Taylor

Access to HVPA

• Controlled Access

– Diagnostic Lab Staff

– Registered Medical Practitioners

– Board Certified Genetic Counsellors

• Online application

Page 17: The Human Variome Database in Australia in 2014 - Graham Taylor

HVPA Status at November 2013

Strengths

1. Database available on demand for diagnostic labs

2. Tools for data sharing

3. Community engagement with RCPA (QUPP), SA/Mawson, BioGrid, VCGS

4. National reach with international connections via HVPI, WHO & UNESCO

Weaknesses

1. Performance of the existing HVPA database is limited

2. Laboratory buy-in to the database across Australia is limited

3. The database itself has been hard to access because of low server bandwidth

4. The project has not anticipated the likely impact of next generation sequencing and risks missing inclusion in genomic-scale initiatives now underway.

Page 18: The Human Variome Database in Australia in 2014 - Graham Taylor

HVPA 24th March 2014

• 5 laboratories submitting

• 295 Unique Variants

• 27410 Instances

• 25 Registered users

Page 19: The Human Variome Database in Australia in 2014 - Graham Taylor

Developments proposed in November

ID | Area | Idea | Priority
1 | B. Presentation | Statistics of number of variants for that gene as table or bar graph (# unique, # instances, top 5 qty submitted) | 1
15 | D. Feedback | Raise a concern about an instance's interpretation | 1
2 | A. Search | Search by range | 2
3 | A. Search | Search by genomic position | 2
4 | A. Search | Filter by pathogenicity | 2
5 | B. Presentation | Sort by... (pathogenicity, other fields) | 2
6 | C. Relevant Info | Display links to related database for gene by referencing genenames.org | 2
7 | A. Search | Wildcard search of variants | 2
9 | A. Search | Search by disease which shows multiple genes and variant results | 2
10 | E. NGS | VCF data imports into HVP Australia | 2
13 | B. Presentation | VarVis - visualisation of gene and variants reported | 2
11 | B. Presentation | VCF data export from HVP Australia of a set of results | 3
12 | B. Presentation | At instance level - see other variants from this test/patient | 3
14 | C. Relevant Info | Capture & display SIFT score | 3
16 | D. Feedback | Notify labs that the general consensus of pathogenicity of something they submitted has changed/updated, i.e. they submitted benign and it's now likely pathogenic, or submitted unknown and now it's something else | 3
17 | B. Presentation | Integration with EBI/NCBI tools for queries and displays | 3
19 | B. Presentation | Display last date uploaded for this variant (or last 10 dates) | 3

Page 20: The Human Variome Database in Australia in 2014 - Graham Taylor

Accessing the test database

http://115.146.85.61/

Username: lab_tester

Password: hvpaustralia2013

Page 21: The Human Variome Database in Australia in 2014 - Graham Taylor

Search Interface

• The search interface has to provide useful tools for clinicians and lab scientists so that the HVPA project offers them direct benefits and incentivises them to participate. Following a request for feedback from users, a series of improvements was implemented, initially on a demonstration server and then on the live server following review by the Steering Committee. The highest priorities were more information about the number of times particular variants were recorded, the ability to search by range, and the ability to filter by pathogenicity. There was also interest in enabling direct uploading of VCF files and the automated calculation of pathogenicity scores. Many of these features are now implemented and examples will be presented.
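The three highest-priority features (counts per variant, search by range, filter by pathogenicity) can be sketched in a few lines of Python. This is only an illustration: the `Instance` record layout, the function names and the example rows are hypothetical and are not taken from the HVPA portal code. Pathogenicity uses the 1 to 5 scale from the variant field definitions (1 = pathogenic, 5 = certainly benign).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Instance:
    """One reported instance of a variant (hypothetical record layout)."""
    gene: str
    chrom: str
    position: int       # genomic coordinate
    variant: str        # variant name as submitted
    pathogenicity: int  # 1 = pathogenic ... 5 = certainly benign


def search_by_range(instances: Iterable[Instance], chrom: str, start: int,
                    end: int, max_class: Optional[int] = None) -> List[Instance]:
    """Instances on `chrom` with start <= position <= end.

    If `max_class` is given, keep only pathogenicity classes 1..max_class.
    """
    hits = [i for i in instances
            if i.chrom == chrom and start <= i.position <= end]
    if max_class is not None:
        hits = [i for i in hits if i.pathogenicity <= max_class]
    return hits


def times_recorded(instances: Iterable[Instance]) -> Dict[Tuple[str, str], int]:
    """Number of recorded instances per (gene, variant)."""
    counts: Dict[Tuple[str, str], int] = {}
    for i in instances:
        key = (i.gene, i.variant)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up example rows, not real HVPA data.
EXAMPLE = [
    Instance("GENE1", "1", 1000, "c.10A>G", 1),
    Instance("GENE1", "1", 1000, "c.10A>G", 1),
    Instance("GENE1", "1", 1500, "c.510C>T", 5),
    Instance("GENE2", "2", 1200, "c.7G>A", 2),
]
```

Calling `search_by_range(EXAMPLE, "1", 900, 1600, max_class=2)` keeps only the two class 1 instances in that window, and `times_recorded(EXAMPLE)` gives the per-variant instance counts.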

Page 22: The Human Variome Database in Australia in 2014 - Graham Taylor

Purpose of the HVPA Database

• Working database

– Record and share diagnostic-quality genetic variation data

– Integrate with clinical phenotype data

– Integrate with international efforts

– Heads up for NGS gene panel data sets

• Test database

– Showcase enhancements

– Real world testing and feedback

– Uses data edited from actual database

– Not accurate or reliable: some parameters edited for test purposes

Page 23: The Human Variome Database in Australia in 2014 - Graham Taylor

Major improvements to search facility

Page 24: The Human Variome Database in Australia in 2014 - Graham Taylor

Searching by expression match BRCA BR

Page 25: The Human Variome Database in Australia in 2014 - Graham Taylor

Instances of a variant

Page 26: The Human Variome Database in Australia in 2014 - Graham Taylor

Pathogenic Variants

Page 27: The Human Variome Database in Australia in 2014 - Graham Taylor

Direct Import from Results Lists

• Can recover historical data sets

• Reformat on the fly

• Useful as low-overhead catch-up to enable labs to transition to using uploading tools as their IT permits

– PathWest (John Bielby)

– Institute of Health and Biomedical Innovation, Queensland (Lyn Griffiths)

– kConFab (Heather Thorne)

– Peter MacCallum Cancer Centre (Ken Doig)

Page 28: The Human Variome Database in Australia in 2014 - Graham Taylor

Variant Fields (Mandatory)

Field | Description | Type
GeneName | Official HGNC symbol | VARCHAR(20)
RefSeqName | Name of reference sequence (NCBI's RefSeq project) | VARCHAR(20)
RefSeqVer | Version of reference sequence (RefSeq) | VARCHAR(20)
cDNA | HGVS variant name (c.) | VARCHAR(255)
mRNA | HGVS variant name (m.) | VARCHAR(255)
Genomic | HGVS variant name (g.) | VARCHAR(255)
Protein | HGVS variant name (g.) | VARCHAR(255)
Location | Exon or intron number | VARCHAR(255)

GeneName, RefSeqName and RefSeqVer are mandatory; of the remaining fields, at least one is required.

Field | Description
Pathogenicity | Level of pathogenicity (1=Pathogenic, 2=Possibly Pathogenic, 3=Unknown, 4=Possible benign, 5=Certainly Benign)
PatientID | Internal ID for the patient used within the lab
TestID | Internal ID for the test used within the lab
InstanceDate | Date instance was tested
GenomicRefSeq | Genomic reference sequence
GenomicRefSeqVer | Genomic reference sequence version

Types listed on the slide for this group: VARCHAR(20), DateTime, VARCHAR(255), VARCHAR(255). All six fields are mandatory.
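The mandatory-field rules on this slide can be expressed as a small validation routine. The sketch below is an assumption-laden illustration rather than the HVP Importer's actual logic: the field names come from the slide, but the `VariantRecord` class, the `validate` function and the reading that the "at least one required" group covers cDNA, mRNA, Genomic, Protein and Location are my own. Column length limits (VARCHAR sizes) are not checked.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

PATHOGENICITY = {1: "Pathogenic", 2: "Possibly Pathogenic", 3: "Unknown",
                 4: "Possible benign", 5: "Certainly Benign"}
MANDATORY = ("GeneName", "RefSeqName", "RefSeqVer", "PatientID", "TestID",
             "InstanceDate", "GenomicRefSeq", "GenomicRefSeqVer")
# Assumed membership of the "at least one required" group.
AT_LEAST_ONE = ("cDNA", "mRNA", "Genomic", "Protein", "Location")


@dataclass
class VariantRecord:
    GeneName: str
    RefSeqName: str
    RefSeqVer: str
    Pathogenicity: int
    PatientID: str
    TestID: str
    InstanceDate: Optional[datetime]
    GenomicRefSeq: str
    GenomicRefSeqVer: str
    cDNA: Optional[str] = None
    mRNA: Optional[str] = None
    Genomic: Optional[str] = None
    Protein: Optional[str] = None
    Location: Optional[str] = None


def validate(record: VariantRecord) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    errors = []
    for name in MANDATORY:
        if getattr(record, name) in (None, ""):
            errors.append(f"{name} is mandatory")
    if record.Pathogenicity not in PATHOGENICITY:
        errors.append("Pathogenicity must be an integer from 1 to 5")
    if not any(getattr(record, name) for name in AT_LEAST_ONE):
        errors.append("at least one of " + ", ".join(AT_LEAST_ONE) + " is required")
    return errors
```

A record with all mandatory fields set and, say, only `cDNA` filled in returns an empty list; dropping `cDNA` as well yields the "at least one of ..." message.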

Page 29: The Human Variome Database in Australia in 2014 - Graham Taylor

Variant Fields (Optional)

Field | Description | Type
PatientAge | Age of patient when test was taken | INT32
TestMethod | The name of the test method used | VARCHAR(20)
SampleTissue | Type of sample taken | VARCHAR(20)
SampleSource | The source of the sample, e.g. DNA, g.DNA, RNA... | VARCHAR(20)
Justification | Justification by medical scientist | VARCHAR(65535)
PubMed | PubMed Identifier/Data Object Identifier | VARCHAR(255)
RecordedInDatabase | Whether it is recorded in disease specific or gene specific | Boolean
SampleStored | Whether lab still has sample left | Boolean
VariantSegregatesWithDisease | Whether pedigree was considered during diagnosis of pathogenicity | Boolean
HistologyStored | Whether histograms are stored | Boolean
PedigreeAvailable | Whether organisation has pedigree data | Boolean
SIFTScore | Calculated SIFT Score | INT32

All fields in this group are optional.

Page 30: The Human Variome Database in Australia in 2014 - Graham Taylor

Linkage to other datasets

• HVPA have implemented the hash key algorithm and work is in progress with BioGrid to link variation data to clinical data sets.

• More details from Maureen Turner, BioGrid CEO who is speaking at this meeting
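The slides do not describe the hash key algorithm itself, so the following is only a generic sketch of hash-based de-identified case matching. It assumes a keyed hash (HMAC-SHA256) over normalised identifying fields; the choice of fields, the normalisation rules, the shared secret and the function names are all hypothetical and should not be read as the HVPA or BioGrid implementation.

```python
import hashlib
import hmac
import unicodedata


def normalise(value: str) -> str:
    """Upper-case, strip accents and drop everything except letters and digits."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return "".join(ch for ch in text.upper() if ch.isalnum())


def linkage_key(secret: bytes, surname: str, given_name: str,
                date_of_birth: str, sex: str) -> str:
    """Keyed hash of normalised identifiers.

    Two sites that share `secret` derive the same key for the same person
    without exchanging the identifying fields themselves.
    """
    message = "|".join(normalise(part) for part in
                       (surname, given_name, date_of_birth, sex))
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()
```

With this design the laboratory and the clinical data holder each compute the key locally, and only the hex digest travels alongside the variant data. Minor formatting differences such as `O'Brien` versus `o brien` are absorbed by the normalisation step.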

Page 31: The Human Variome Database in Australia in 2014 - Graham Taylor

Cost and performance will force diagnostic labs to adopt NGS as front-line approach

[Figures: cost per base; Illumina share price; hype cycle]

Page 32: The Human Variome Database in Australia in 2014 - Graham Taylor

HVPA LOVD3 database pilot

• Established an HVPA LOVD3 database and working with the Human Genetics Society of Australasia on a pilot study to sequence the exomes of two trios and review the data using this database.

• Includes exome-scale data

• Open access to Coriell cases with no “consent” issues

• Explore staging of variant “credibility classification” and access

Page 33: The Human Variome Database in Australia in 2014 - Graham Taylor

Relationship to Gene Panel Databases? e.g. http://genomics.bio21.unimelb.edu.au/lovd/

Page 34: The Human Variome Database in Australia in 2014 - Graham Taylor

Melbourne Genomics Health Alliance


Page 35: The Human Variome Database in Australia in 2014 - Graham Taylor

• Clinically led, rather than technology driven

• Fostering ‘end use’ of genomic data

• Common clinical repository

• Prospective : first tier test

• Evaluation to inform implementation

• Engineering collaboration

• Fostering system change

• A/Prof Clara Gaff: Program Leader

PARADIGM FOR IMPLEMENTING GENOMIC MEDICINE


Melbourne Genomics Health Alliance

Page 36: The Human Variome Database in Australia in 2014 - Graham Taylor

Connected nationally

and internationally


Page 37: The Human Variome Database in Australia in 2014 - Graham Taylor

How many variants per exome?

SNP count | Study
20,000 | Choi et al. PNAS 2009
142,000 | Mullikin NIH, unpublished 2010
50,000 | Clark et al. Nature Biotechnology 2011
125,000 | Smith et al. Genome Biology 2011
100,000 | Johnston & Biesecker Human Molecular Genetics 2013
200,000 to 400,000 | Yang et al. N Engl J Med 2013

• 20-fold range

• Exome designs vary

• Likely to be higher variant count in African populations as the reference sequence is non-African

Page 38: The Human Variome Database in Australia in 2014 - Graham Taylor

Low concordance of multiple variant-calling pipelines

O’Rawe et al. Genome Medicine 2013

• 15 exomes

• 4 families

• HiSeq 2000

• Agilent SureSelect v.2

• ~120X mean coverage

• SOAP, BWA-GATK, BWA-SNVer, GNUMAP, and BWA-SAMTools

• SNV concordance between five Illumina pipelines across all 15 exomes was 57.4%

• 0.5-5.1% variants were called as unique to each pipeline

• Indel concordance was only 26.8% between three indel calling pipelines

• 11% of CG variants that fall within targeted regions in exome sequencing were not called by any of the Illumina-based exome analysis pipelines

• 97.1%, 60.2% and 99.1% of the GATK-only, SOAP-only and shared SNVs can be validated

• 54.0%, 44.6% and 78.1% of the GATK-only, SOAP-only and shared indels can be validated

• Additional accuracy gained in variant discovery by having access to genetic data from a multi-generational family
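To make the concordance figures concrete, here is a minimal sketch of how concordance between several call sets might be computed. It assumes concordance means "called by every pipeline" divided by "called by any pipeline"; the slide does not state the exact definition used in the paper, and the function names and variant tuple layout are illustrative.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def concordance(calls: Dict[str, Set[Variant]]) -> float:
    """Fraction of all called variants that every pipeline agrees on."""
    sets = list(calls.values())
    if not sets:
        return 0.0
    union = set().union(*sets)
    shared = set.intersection(*sets)
    return len(shared) / len(union) if union else 0.0


def unique_fraction(calls: Dict[str, Set[Variant]]) -> Dict[str, float]:
    """Per pipeline, the share of all called variants seen by that pipeline only."""
    union = set().union(*calls.values())
    result = {}
    for name, own in calls.items():
        others = set().union(*(v for k, v in calls.items() if k != name))
        result[name] = len(own - others) / len(union) if union else 0.0
    return result
```

Variants are compared as exact `(chromosome, position, ref, alt)` tuples, so differing indel representations between callers would count as disagreements unless they are normalised first.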

Page 39: The Human Variome Database in Australia in 2014 - Graham Taylor

Low concordance of multiple variant-calling pipelines O’Rawe et al. Genome Medicine 2013, 5:28

SNV concordance: 57.4% Indel concordance 26.8%

Page 40: The Human Variome Database in Australia in 2014 - Graham Taylor

Venn diagrams of selected CNV detection methods in real data processing

Duan J, Zhang J-G, Deng H-W, Wang Y-P (2013) Comparative Studies of Copy Number Variation Detection Methods for Next-Generation Sequencing Technologies. PLoS ONE 8(3): e59128. doi:10.1371/journal.pone.0059128 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0059128

Page 41: The Human Variome Database in Australia in 2014 - Graham Taylor

Sequence errors

Page 42: The Human Variome Database in Australia in 2014 - Graham Taylor

Post processing errors

Page 43: The Human Variome Database in Australia in 2014 - Graham Taylor

Remove errors before processing

K-mer selection

Merging forward and reverse reads

[Figure: chart with axis ticks from 0 to 1600]

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTTTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGTATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGTA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATTAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

TAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATCTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGAAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAAAGTAGGAAATGGAAGTCTATGTGATCAAGAAATTGATAGCATTTGCA

CAGAAAGAGTAGAAAATGGAAGTCTATGTGATCAAGAAATCGATAGCATTTGCA

Discard rare reads

Use a HiFi polymerase
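The "discard rare reads" idea can be sketched as a simple count filter over identical read sequences. This assumes that reads covering the same amplicon are compared as whole strings and that anything below a chosen fraction of the total is treated as error; the `min_fraction` threshold and the function name are hypothetical, not values from the talk.

```python
from collections import Counter
from typing import Dict, Iterable


def discard_rare_reads(reads: Iterable[str], min_fraction: float = 0.1) -> Dict[str, int]:
    """Count identical read sequences and keep those seen often enough.

    A sequence is kept when it makes up at least `min_fraction` of all reads;
    singletons that differ at one base are the likely sequencing errors.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: n for seq, n in counts.items() if n / total >= min_fraction}
```

In a stack like the one shown on this slide, the two frequent sequences that differ at a single position would both survive, while reads carrying a one-off mismatch would be dropped before variant calling.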

Page 44: The Human Variome Database in Australia in 2014 - Graham Taylor

Four capture panels at SOD1

Page 45: The Human Variome Database in Australia in 2014 - Graham Taylor

• Known SNV concordance 100%, all assays

• Known indel <6bp concordance 100%, all assays

• Not able to detect c9orf72 hexanucleotide expansion or PRNP octapeptide region repeat with standard pipeline

• Diagnostic yield within appropriate clinical context (based on very limited sample size)

- NimbleGen SeqCap EZ Neuro: 33% (2/6)

- Nextera Neuro: 23% (6/26)

Results – detection of variants

Page 46: The Human Variome Database in Australia in 2014 - Graham Taylor

Filtering Variants

All variants

Sample | None | Qual | Not in Blood
Blood | 9828 | 8551 | NA
Frozen | 9920 | 8736 | 126
FFPE | 9709 | 8163 | 199

Variants in Gene List

Sample | None | Qual | Not in Blood
Blood | 27 | 18 | NA
Frozen | 27 | 23 | 2 (EGFR)
FFPE | 25 | 19 | 3 (EGFR, ROS)

Page 47: The Human Variome Database in Australia in 2014 - Graham Taylor

EGFR p.L858R

Page 48: The Human Variome Database in Australia in 2014 - Graham Taylor

EGFR p.T790M

Page 49: The Human Variome Database in Australia in 2014 - Graham Taylor

Confirmation by PCR

[Figure: two charts, EGFR normalised and KRAS normalised]

EGFR normalised (axis ticks 0.0 to 250.0), assays on EGFR NM_005228.3:
T790 WT
c.2350T>C, p.S784P
c.2351C>T, p.S784F
c.2354C>T, p.T785I
c.2356G>A, p.V786M
c.2368A>G, p.T790A
c.2369C>T, p.T790M
828 & 861, wt
c.2572C>A, p.L858M
c.2573_2574delinsGT, …
c.2573T>A, p.L858Q
c.2573T>G, p.L858R
c.2579A>T, p.K860I
c.2582T>A, p.L861Q
c.2582T>G, p.L861R

KRAS normalised (axis ticks 0.0 to 2.0), assays on KRAS NM_033360.2:
c.34G>A, p.G12S
c.34G>C, p.G12R
c.34G>T, p.G12C
c.35G>A, p.G12D
c.35G>C, p.G12A
c.35G>T, p.G12V
c.37G>A, p.G13S
c.37G>C, p.G13R
c.37G>T, p.G13C
c.38G>A, p.G13D
c.38G>C, p.G13A
c.38G>T, p.G13V

Page 50: The Human Variome Database in Australia in 2014 - Graham Taylor

Auto Upload Database of Results in LOVD

Local LOVD instances sharable via HVPA

Page 51: The Human Variome Database in Australia in 2014 - Graham Taylor

• Coriell pedigree comparison

• Subset of 19 genes – targeted by all four assays

• Variant allele frequency cut-off of 35% (interested in germline variants)

Results – detection of variants

Variants detected per assay, shown as Mother / Father / Child

Family Y077

Assay | Total number of variants detected | Non-synonymous variants detected | # variants with GAF <5% | # variants with African AF 5%
NimbleGen SeqCap EZ Neuro | 194 / 241 / 196 | 16 / 22 / 20 | 4 / 5 / 7 | 2 / 3 / 4
Nextera Neuro | 250 / 296 / 283 | 17 / 23 / 22 | 4 / 6 / 7 | 2 / 3 / 4
TruSight One | 121 / 137 / 119 | 16 / 23 / 20 | 3 / 6 / 6 | 1 / 3 / 3
Nextera Exome | 101 / 118 / 114 | 16 / 22 / 22 | 4 / 5 / 7 | 2 / 2 / 4

Family Y117

Assay | Total number of variants detected | Non-synonymous variants detected | # variants with GAF <5% | # variants with African AF 5%
NimbleGen SeqCap EZ Neuro | 279 / 245 / 263 | 20 / 20 / 20 | 4 / 5 / 6 | 3 / 2 / 4
Nextera Neuro | 382 / 371 / 342 | 20 / 21 / 21 | 5 / 5 / 6 | 3 / 2 / 4
TruSight One | 148 / 154 / 148 | 18 / 18 / 17 | 4 / 4 / 5 | 3 / 2 / 3
Nextera Exome | 121 / 67 / 66 | 19 / 15 / 16 | 5 / 3 / 4 | 3 / 1 / 3
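The 35% variant allele frequency cut-off used to focus on germline variants can be illustrated with a short filter. The `Call` record, its field names and the example numbers are hypothetical; only the 0.35 threshold comes from the slide.

```python
from typing import Iterable, List, NamedTuple


class Call(NamedTuple):
    """A variant call with read counts (hypothetical layout)."""
    gene: str
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the variant allele


def vaf(call: Call) -> float:
    """Variant allele frequency: variant-supporting reads over total depth."""
    return call.alt_reads / call.depth if call.depth else 0.0


def germline_candidates(calls: Iterable[Call], min_vaf: float = 0.35) -> List[Call]:
    """Keep calls whose allele frequency is at or above the cut-off."""
    return [c for c in calls if vaf(c) >= min_vaf]
```

A heterozygous germline variant is expected near 50% and a homozygous one near 100%, so a call at 6% would be dropped while calls at 48% or 99% are kept.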

Page 52: The Human Variome Database in Australia in 2014 - Graham Taylor

Example case showing concordance

Key (assays): exome, nimbleneuro, nextneuro, trusight1

Gene | Variant | Chr | Coordinate | Zygosity | Rows listed (one per assay call)
APOE | T>T/C | 19 | 45411941 | het | 4
APOE | C>C/T | 19 | 45412040 | het | 4
ATP7B | G>G/A | 13 | 52511606 | het | 4
ATP7B | A>A/G | 13 | 52515354 | het | 4
ATP7B | C>C/T | 13 | 52523808 | het | 4
ATP7B | T>T/C | 13 | 52524488 | het | 4
LRRK2 | G>A/A | 12 | 40619082 | hom | 4
LRRK2 | C>C/G | 12 | 40657700 | het | 3
LRRK2 | T>T/A | 12 | 40713901 | het | 3
LRRK2 | T>T/C | 12 | 40758652 | het | 4
NPC1 | G>G/A | 18 | 21119777 | het | 4
NPC1 | T>T/C | 18 | 21120444 | het | 4
NPC1 | TA>TA/T | 18 | 21123536 | het | 3
NPC1 | TAA>TAA/T | 18 | 21123536 | het | 1
NPC1 | C>G/G | 18 | 21124945 | hom | 4
PARK2 | G>G/C | 6 | 162622239 | het | 4
PINK1 | A>A/G | 1 | 20964328 | het | 4
PINK1 | G>G/A | 1 | 20972048 | het | 3
PINK1 | G>G/A | 1 | 20975727 | het | 3
PINK1 | A>A/C | 1 | 20977000 | het | 4
PSEN2 | G>G/A | 1 | 227071449 | het | 4
VCP | C>T/T | 9 | 35062972 | hom | 4
VCP | A>A/G | 9 | 35068364 | het | 4

Page 53: The Human Variome Database in Australia in 2014 - Graham Taylor

Describing Coverage

Metric | MiSeq (12plex) | HiSeq (48plex)
% target region with non-zero depth | 99.54% | 99.90%
% target regions >= 5x | 98.25% | 99.71%
% target regions >= 15x | 94.80% | 99.34%
% target regions >= 30x | 89.10% | 98.85%
% target regions >= 50x | 81.56% | 98.17%
Average depth of coverage | 180.76 | 920.84
5th centile | 19.08 | 126.75
20th centile | 67.42 | 408.83
50th centile | 160.42 | 871.17
95th centile | 414.00 | 1879.92

Mapping quality >= 15; base quality score >= 15
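The coverage metrics in this table can be computed from a list of depths. The sketch below assumes one depth value per target position, already restricted to reads passing the mapping and base quality filters; the slide does not say how the figures were produced, so the function names and the nearest-rank centile method are my own choices.

```python
import math
from statistics import mean
from typing import Dict, Sequence


def fraction_at_least(depths: Sequence[int], threshold: int) -> float:
    """Fraction of positions with depth >= threshold."""
    return sum(d >= threshold for d in depths) / len(depths)


def centile(depths: Sequence[int], pct: float) -> int:
    """Nearest-rank percentile of the depth distribution."""
    ordered = sorted(depths)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def coverage_summary(depths: Sequence[int]) -> Dict[str, float]:
    """Summary in the same shape as the table: breadth, mean depth, centiles."""
    summary = {"nonzero": fraction_at_least(depths, 1),
               "mean": mean(depths)}
    for threshold in (5, 15, 30, 50):
        summary[f">={threshold}x"] = fraction_at_least(depths, threshold)
    for pct in (5, 20, 50, 95):
        summary[f"p{pct}"] = centile(depths, pct)
    return summary
```

Reporting the low centiles alongside the mean matters because a high average depth can hide a tail of poorly covered targets, which is where variants are most likely to be missed.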

Page 54: The Human Variome Database in Australia in 2014 - Graham Taylor

Coverage reproducibility

Coverage Coefficient of variation

Page 55: The Human Variome Database in Australia in 2014 - Graham Taylor

Higher coverage greater reproducibility

Coverage Coefficient of variation

Page 56: The Human Variome Database in Australia in 2014 - Graham Taylor

Can capture coverage report dosage to diagnostic standards?

[Figure: two panels plotting samples against targets: autosomal targets and chrX targets]

Inter-sample variation is low, but low coverage prevents dosage estimation

Chr X is a good first pass test for dosage
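The coefficient of variation and the chromosome X dosage check can be sketched as follows. This assumes per-target mean depths for each sample, normalised by that sample's overall mean so that samples sequenced to different depths are comparable; the normalisation choice and all names are assumptions, not the method used in the talk.

```python
from statistics import mean, pstdev
from typing import Dict, Iterable, Set


def normalise(sample: Dict[str, float]) -> Dict[str, float]:
    """Divide each target's depth by the sample's mean depth."""
    m = mean(sample.values())
    return {target: depth / m for target, depth in sample.items()}


def coefficient_of_variation(values: Iterable[float]) -> float:
    """Standard deviation divided by the mean."""
    values = list(values)
    m = mean(values)
    return pstdev(values) / m if m else float("nan")


def target_cv(samples: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Per-target CV of normalised coverage across samples."""
    normalised = {name: normalise(depths) for name, depths in samples.items()}
    targets = next(iter(normalised.values())).keys()
    return {t: coefficient_of_variation(normalised[s][t] for s in normalised)
            for t in targets}


def x_to_autosome_ratio(sample: Dict[str, float], x_targets: Set[str]) -> float:
    """Mean depth on chrX targets relative to autosomal targets."""
    x = [d for t, d in sample.items() if t in x_targets]
    auto = [d for t, d in sample.items() if t not in x_targets]
    return mean(x) / mean(auto)
```

If the assay reports dosage faithfully, the X-to-autosome ratio in XY samples should sit at about half the value seen in XX samples, which is why chromosome X works as a first-pass test.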

Page 57: The Human Variome Database in Australia in 2014 - Graham Taylor

XX vs. XY

8 Female cases and 16 Male cases showing reproducibility of coverage of X loci within each group. Loci with higher SDs were associated with reduced coverage.

[Figure: two plots of coverage across X loci (x-axis 0 to 80) with series Average XX and Average XY; y-axes 0 to 2 and -0.5 to 3; labels 870 and 160]

Page 58: The Human Variome Database in Australia in 2014 - Graham Taylor

Report

Page 59: The Human Variome Database in Australia in 2014 - Graham Taylor

Sharing Experience with TruSight One

• In partnership with Illumina, RCPA and the HGSA, Kim Flintoff (Wellington Regional Genetics Laboratory) is leading an evaluation of exon sequencing using Illumina’s TruSight One panel. Two Coriell family trios will be sequenced by New Zealand Genomics Limited and the data will be shared on an HVPA database

• The VCF file will be available on the HVPA LOVD database and performance stats will also be made available.

Page 60: The Human Variome Database in Australia in 2014 - Graham Taylor

Next Steps

• Robust standards for genomic medicine

• Databases and data content

– Access to identified and de-identified data (consent and confidentiality)

– Database accreditation process in prep with RCPA

– Defining the performance of various aligners, variant callers and annotation programs

– Clinical grade Variant Call Format (VCF)

– Metafile covering data trail: what was tested, what was not tested

Page 61: The Human Variome Database in Australia in 2014 - Graham Taylor

Standards for Accreditation of DNA Sequence Variation Databases

Quality Use of Pathology Program (QUPP): a national project for the Development of Standards for Accreditation of DNA Sequence Variation Databases has been jointly initiated by the Royal College of Pathologists of Australasia (RCPA) and the Human Variome Project (HVP).

Background

• There is a rapidly increasing volume, spectrum, and complexity of genetic tests emerging within diagnostic pathology laboratories. In particular, high-throughput sequencing methods such as targeted panel, exome (WES), and whole genome sequencing (WGS) are producing an increasing quantity of genetic data requiring analysis and interpretation, forming a substantial proportion of the workload.

• Currently, there is a plethora of online mutation databases to refer to; however, there is a distinct lack of databases that meet the stringent accuracy and reproducibility that the clinical diagnostic environment demands. Additionally, the current databases are “fractured”, with varied access to and sharing of the data within, and variable quality due to errors and inaccurate data posting, all of which is a clear risk to the quality of patient care. With more widespread, secure sharing of variants and associated phenotypes, the value of cumulative variant information will accelerate the delivery of accurate, actionable, and efficient clinical reports.

• There are currently no standards or equivalent mechanisms for accreditation of databases to ensure the accuracy and quality of uploaded data into any central repository to meet the needs of the clinical diagnostics environment.

Page 62: The Human Variome Database in Australia in 2014 - Graham Taylor

Data quality classes

Differentiate between three classes of data:

• Clinically Reported: the class of data that the HVP Australian Node was originally designed to collect and share, namely data that has been generated in a NATA accredited Australian diagnostic laboratory and is able to be included in a clinical report.

• Unreported Clinical quality: data that has been generated in a NATA accredited diagnostic laboratory, but is not capable of being included in a clinical report. This class would comprise primarily next-generation sequencing (NGS) type data.

• Unaccredited: data that has been generated by an Australian laboratory that has not been NATA accredited.

A new filtering option would be made available to allow users to view only data of a certain class.
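The three data classes and the proposed filtering option map naturally onto a small classification function. The sketch below is illustrative only: the class labels come from the slide, while the `Submission` record, its two boolean flags and the function names are hypothetical.

```python
from enum import Enum
from typing import Iterable, List, NamedTuple


class DataClass(Enum):
    CLINICALLY_REPORTED = "Clinically Reported"
    UNREPORTED_CLINICAL = "Unreported Clinical quality"
    UNACCREDITED = "Unaccredited"


class Submission(NamedTuple):
    """A submitted variant with the two facts the classes depend on."""
    variant: str
    nata_accredited: bool     # generated in a NATA accredited laboratory
    in_clinical_report: bool  # able to be included in a clinical report


def classify(sub: Submission) -> DataClass:
    """Assign one of the three data quality classes."""
    if not sub.nata_accredited:
        return DataClass.UNACCREDITED
    if sub.in_clinical_report:
        return DataClass.CLINICALLY_REPORTED
    return DataClass.UNREPORTED_CLINICAL


def filter_by_class(subs: Iterable[Submission], wanted: DataClass) -> List[Submission]:
    """The filtering option: keep only submissions of one class."""
    return [s for s in subs if classify(s) is wanted]
```

Accreditation is checked first, so data from an unaccredited laboratory is labelled Unaccredited regardless of whether it appeared in a report.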

Page 63: The Human Variome Database in Australia in 2014 - Graham Taylor

Beyond the NeCTAR funding

• Academic or charitable funding required

• Integrate NGS data resource into the HVPA portfolio

• Move database development into a medical academic centre of excellence

• Seek active partnerships with current and future collaborators with investment and risk sharing