www.iccg.org A Unified Clinical Genomics Database NHGRI - U41 Genomic Resource Grant

Upload: frederick-dignam

Post on 14-Dec-2015

Variant Analysis for General Genome Report

3-5 million variants

~20,000 coding/splice variants

• Published as disease-causing: 20-40 “pathogenic” variants
• Rare (<1%) CDS/splice variants, LOF in disease-associated genes: 30-50 variants
• Pharmacogenetics genes: 5-10 variants

Review steps:
• Review evidence for gene-disease association and LOF role
• Review evidence for variant pathogenicity

Classification of Reported Pathogenic Variants Found in Human Genomes

Benign – 18%

Likely Benign – 26%

Uncertain significance – 52%

Pathogenic – 2%

Likely Pathogenic – 1%
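As a quick arithmetic check on the breakdown above, the sketch below groups the five quoted percentages. The dictionary keys are labels chosen here for illustration; the values are the ones on the slide, which sum to 99% (presumably because of rounding).

```python
# Percentages quoted on the slide for variants previously reported as pathogenic.
reclassified = {
    "Benign": 18,
    "Likely benign": 26,
    "Uncertain significance": 52,
    "Likely pathogenic": 1,
    "Pathogenic": 2,
}

benign_side = reclassified["Benign"] + reclassified["Likely benign"]
pathogenic_side = reclassified["Pathogenic"] + reclassified["Likely pathogenic"]
total = sum(reclassified.values())

print(f"Benign or likely benign: {benign_side}%")
print(f"Pathogenic or likely pathogenic: {pathogenic_side}%")
print(f"Sum of quoted percentages: {total}%")
```

On these figures, only about 3% of the reported pathogenic variants retain a pathogenic or likely pathogenic classification, while 44% fall on the benign side.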

U41 Genomic Resource Grant: A Unified Clinical Genomics Database

To raise the quality of patient care by:

• Standardizing the annotation and interpretation of genomic variants

• Sharing variant and case level data through a centralized database for clinical and research use

• Implementing an evidence-based expert consensus process for curating genes and variant interpretations

Supporting data collection, submission and curation

• Work with NCBI to design ClinVar to meet the needs of the community

• Develop data dictionary, ontologies, and work with standards bodies

• Define data submission and access policies for variant and case-level data including genotypes and phenotypes

• Work with labs to solicit and support data submission

• Evidence-based curation of structural variants (Riggs et al. 2012)

• Evidence-based curation of sequence variants (ACMG Committee work in progress)

• Develop a gene-centric resource to define the medical exome and provide tools to support use in genomic medicine

• Work with vendors to improve reagents for genomic analysis (CMA, WES, WGS)

www.ncbi.nlm.nih.gov/clinvar

NIH NCBI ClinVar

ClinVar Submitters                                          Variants   Genes
OMIM                                                           23524    3077
Harvard Medical School and Partners Healthcare                  6996     155
InVitae Inc.                                                    5526       4
International Standards For Cytogenomic Arrays                  4194      46
GeneReviews                                                     2913     287
ARUP Laboratories                                               1415       6
LabCorp                                                         1391     140
Sharing Clinical Reports Project                                 902       2
Finland Institute for Molecular Medicine                         840      39
Tuberous Sclerosis Database                                      431       1
ClinSeq Project                                                  425      35
Leiden Muscular Dystrophy Database                               220      10
GeneDx                                                           205       3
Emory Genetics Laboratory                                         48      13
American College of Medical Genetics and Genomics                 23       1
Osteogenesis Imperfecta Database; University of Leicester         15       3
Ambry Genetics                                                    10       1
Other laboratories (19)                                           52      25
Total                                                          49130    3848
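The submitter counts above can be checked against the Total row with a few lines of Python. This is only a consistency sketch: the rows are transcribed from the slide, and the variable names are chosen here for illustration.

```python
# (submitter, variants, genes) rows as listed on the slide.
rows = [
    ("OMIM", 23524, 3077),
    ("Harvard Medical School and Partners Healthcare", 6996, 155),
    ("InVitae Inc.", 5526, 4),
    ("International Standards For Cytogenomic Arrays", 4194, 46),
    ("GeneReviews", 2913, 287),
    ("ARUP Laboratories", 1415, 6),
    ("LabCorp", 1391, 140),
    ("Sharing Clinical Reports Project", 902, 2),
    ("Finland Institute for Molecular Medicine", 840, 39),
    ("Tuberous Sclerosis Database", 431, 1),
    ("ClinSeq Project", 425, 35),
    ("Leiden Muscular Dystrophy Database", 220, 10),
    ("GeneDx", 205, 3),
    ("Emory Genetics Laboratory", 48, 13),
    ("American College of Medical Genetics and Genomics", 23, 1),
    ("Osteogenesis Imperfecta Database; University of Leicester", 15, 3),
    ("Ambry Genetics", 10, 1),
    ("Other laboratories (19)", 52, 25),
]
reported_total = (49130, 3848)  # the slide's Total row

variant_sum = sum(variants for _, variants, _ in rows)
gene_sum = sum(genes for _, _, genes in rows)
omim_share = rows[0][1] / variant_sum

print(variant_sum, gene_sum)
print(f"OMIM share of submitted variants: {omim_share:.1%}")
```

Both column sums match the Total row. Because the Genes total equals the plain column sum, genes covered by more than one submitter appear to be counted once per submitter rather than deduplicated.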

Sequencing Laboratories Which Have Agreed to Share Data

Alfred I Dupont Hospital for Children
All Children's Hospital St. Petersburg
Ambry Laboratories
ARUP
Athena Diagnostics
Baylor Medical Genetic Laboratories
Boston Children's Hospital
Boston University
Children's Hospital of Philadelphia
Children's Mercy Hospital, Kansas City
Cincinnati Children's Hospital
City of Hope Molecular Diagnostic Lab
CureCMD
Denver Genetic Laboratories
Detroit Medical Center
Emory University
Fullerton Genetics Laboratory
GeneDx
Cleveland Clinic
Greenwood Genetics
Harvard-Partners Lab for Molec. Medicine
Henry Ford Hospital
Huntington Medical Research Institutes
Illumina Clinical Services Lab
Indiana University/Purdue University
InSiGHT
LabCorp / Integrated Genetics / Correlagen
Masonic Medical Research Laboratory
Mayo Clinic
Mt. Sinai School of Medicine
Nationwide Children's Hospital
Nemours Biomolecular Core, Jefferson Medical
Oregon Health Sciences University
Providence Sacred Heart Medical Center
Quest Diagnostics
SickKids Molecular Genetic Laboratory
Transgenomics
University of Chicago
University of Michigan
University of Nebraska Medical Center
University of Oklahoma
University of Penn
University of Sydney
University of Washington
Women and Children's Hospital
Wayne State University School of Medicine
Yale University

Documenting arguments will improve the evidence-based assessment of variants

53 discrepancies:
• 60% differed based upon likelihood (Benign vs LB, P vs LP)
• 34% differed VUS vs Likely Pathogenic/Likely Benign
• 6% differed VUS vs Pathogenic

20% discrepant

U41/ClinVar pilot project

Scope                         Number of alleles
Total submitted to ClinVar    997
Multiple assertions           269

Comparison of three laboratories' classifications for variants in 12 RASopathy genes: BRAF, CBL, HRAS, KRAS, MAP2K1, MAP2K2, NRAS, PTPN11, RAF1, SHOC2, SOS1, SPRED1

84% of differences were Lab A reporting a more aggressive assertion (Pathogenic/Benign) than Lab B/C (LP, LB, VUS)

16% of differences were Labs B/C reporting a more aggressive assertion than Lab A
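The pilot figures can be tied together with a short calculation. The sketch assumes the 53 discrepancies are counted among the 269 alleles with multiple assertions, which the slides do not state outright but which is consistent with the quoted 20%. The per-category counts are derived here from the quoted percentages and are approximations, not numbers taken from the slides.

```python
multiple_assertions = 269  # alleles with more than one assertion (from the slide)
discrepancies = 53         # discrepant alleles (from the slide)

# Assumption: the 53 discrepancies are a subset of the 269 alleles.
rate = discrepancies / multiple_assertions
print(f"Discrepancy rate: {rate:.1%}")

# Quoted breakdown of the discrepancies, converted to approximate counts.
breakdown_pct = {
    "likelihood only (Benign vs LB, P vs LP)": 60,
    "VUS vs Likely Pathogenic/Likely Benign": 34,
    "VUS vs Pathogenic": 6,
}
approx_counts = {
    label: round(discrepancies * pct / 100)
    for label, pct in breakdown_pct.items()
}
for label, count in approx_counts.items():
    print(f"{label}: about {count}")
```

Under that assumption the rate rounds to the 20% shown on the slide, and the derived counts add back up to 53.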

Lab Classification Differences

ACMG Lab QA Committee on the Interpretation of Sequence Variants

ACMG: Sue Richards (chair), Heidi Rehm (co-chair), Sherri Bale, David Bick, Soma Das, Wayne Grody, Madhuri Hegde, Elaine Spector

AMP: Julie Gastier-Foster, Elaine Lyon

CAP: Nazneen Aziz, Karl Voelkerding

Evidence supporting pathogenicity (check all that apply):

I. Stand-alone
□ Truncating variant (e.g. nonsense, frameshift, canonical +/-1,2 splice sites, initiation codon) in a gene where loss of function is a known mechanism of disease [1]
□ Same amino acid change as a previously established pathogenic variant regardless of nucleotide change [2]

II. Strong
□ De novo (paternity confirmed) [3]
□ Well-established in vitro or in vivo functional studies supportive of a deleterious effect on the gene or gene product [4]
□ Case-control studies show a p value <0.01 for enrichment in cases [6]

III. Supporting
□ Located in a mutational hot spot and/or experimentally well-characterized functional domain [7]
□ Variant occurs in a gene with high clinical specificity and sensitivity for a particular phenotype and the proband has multiple, specific features of the disease [8]
□ Multiple lines of computational evidence support a deleterious effect on the gene or gene product (conservation, evolutionary, splicing impact, etc) [9]
□ Type of variant fits known pathogenic variant spectrum for the disease [10]
□ Variant frequency in control data: absent from controls in Exome Sequencing Project & 1000 Genomes, OR case-control studies show p value between 0.01-0.05 for enrichment in cases (only applies if well-phenotyped populations are available) and frequency is below highest general population minor allele frequency (MAF) expected for disease [6]
   General guidance: Autosomal dominant MAF <0.4%
   General guidance: X-linked MAF <0.4% males
   General guidance: Autosomal recessive MAF <1%
□ For recessive disorders, detected in trans with a pathogenic variant [11]
□ Assumed de novo, but without confirmation of paternity [3]
□ In-frame deletions/insertions in a non-repeat region or stop-loss variants [12]
□ Co-segregation with disease [5]
□ Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been seen before [2]

5 Categories: Pathogenic, Likely Pathogenic, Uncertain significance, Likely benign, Benign

Pathogenic = 1 stand-alone OR 2 strong OR 1 strong + ≥3 supporting

Likely Pathogenic = 1 strong + 2 supporting OR ≥4 supporting

Benign = 1 stand-alone OR 2 strong OR 1 strong + ≥3 supporting

Likely benign = 1 strong + 2 supporting OR ≥4 supporting
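The combining rules above can be sketched as a small Python function. This is an illustration of the rules as written on the slide, not the committee's implementation. The names `Evidence`, `tier`, and `classify` are hypothetical, counts such as "1 stand-alone" are read as "at least 1", and the handling of conflicting evidence is an assumption because the slide does not address it.

```python
from typing import NamedTuple


class Evidence(NamedTuple):
    """Counts of checked criteria for one direction (pathogenic or benign)."""
    stand_alone: int = 0
    strong: int = 0
    supporting: int = 0


def tier(e: Evidence) -> str:
    """Apply the slide's combining rule to one direction of evidence."""
    # 1 stand-alone OR 2 strong OR 1 strong + >=3 supporting
    if e.stand_alone >= 1 or e.strong >= 2 or (e.strong >= 1 and e.supporting >= 3):
        return "definite"
    # 1 strong + 2 supporting OR >=4 supporting
    if (e.strong >= 1 and e.supporting >= 2) or e.supporting >= 4:
        return "likely"
    return "unmet"


def classify(pathogenic: Evidence, benign: Evidence) -> str:
    p, b = tier(pathogenic), tier(benign)
    # Assumption: the slide does not say how to resolve conflicting evidence,
    # so this sketch falls back to Uncertain significance.
    if p != "unmet" and b != "unmet":
        return "Uncertain significance"
    if p == "definite":
        return "Pathogenic"
    if p == "likely":
        return "Likely pathogenic"
    if b == "definite":
        return "Benign"
    if b == "likely":
        return "Likely benign"
    # Default when no rule is met.
    return "Uncertain significance"


print(classify(Evidence(strong=1, supporting=2), Evidence()))
print(classify(Evidence(supporting=3), Evidence()))
```

Keeping the pathogenic and benign tallies separate mirrors the two checklists, and the same `tier` function serves both because the slide uses identical thresholds in each direction.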

Evidence supporting benign classification (check all that apply):

I. Stand-alone
□ For autosomal recessive: ≥1% MAF [6]
□ For autosomal dominant: ≥0.4% or lower depending on disease frequency and penetrance [6]
□ For X-linked: ≥0.4% or lower in males depending on disease frequency and penetrance [6]
□ Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder with full penetrance at an early age [6]

II. Strong
□ Well-established in vitro or in vivo functional studies show no deleterious effect on protein function or splicing [4]
□ Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder [11]
□ Variant present in multiple mammalian species despite adjacent conservation [9]

III. Supporting
□ Located in a highly variable region without a known function [7]
□ Multiple lines of computational evidence suggest no impact on gene or gene product (conservation, evolutionary, splicing impact, etc) [9]
□ Type of variant does not fit known pathogenic variant spectrum [10]
□ Case-control studies show comparable frequencies (e.g. p > 0.05) [6]
□ Variant in a dominant gene that does not segregate in a family [5] or is found in a case with an alternate cause of disease [13]
□ Observed in cis with a pathogenic variant [11]

*Variants should be classified as Uncertain Significance if other criteria are unmet
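The frequency criteria in the two checklists can be illustrated with a minimal sketch. The cutoffs are the general-guidance values quoted on the slides; the function name, dictionary keys, and returned labels are hypothetical, and the optional `threshold` argument reflects the slides' note that a lower cutoff may apply depending on disease frequency and penetrance.

```python
# General-guidance MAF cutoffs quoted on the slides, expressed as fractions.
DEFAULT_MAF_THRESHOLDS = {
    "autosomal_dominant": 0.004,   # 0.4%
    "x_linked": 0.004,             # 0.4%, measured in males
    "autosomal_recessive": 0.01,   # 1%
}


def frequency_evidence(maf, inheritance, threshold=None):
    """Compare a control-population MAF with the general-guidance cutoff.

    `threshold` overrides the default cutoff for diseases where a lower
    value is appropriate.
    """
    cutoff = DEFAULT_MAF_THRESHOLDS[inheritance] if threshold is None else threshold
    if maf >= cutoff:
        return "stand-alone benign"
    if maf == 0:
        return "supporting pathogenic (absent from controls)"
    # Below the cutoff but present in controls: frequency alone is not decisive.
    return "below cutoff"


print(frequency_evidence(0.02, "autosomal_recessive"))
print(frequency_evidence(0.001, "autosomal_dominant"))
print(frequency_evidence(0.0, "x_linked"))
```

A variant at 0.5% MAF illustrates why the inheritance pattern matters: it meets the stand-alone benign cutoff for a dominant disorder but not for a recessive one.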

Curation levels: Uncurated, Single-Source Curation, Multi-Source Curation, Expert Curation, Guideline

Review stages: Large variant datasets, Intra-laboratory, Inter-laboratory, Evidence-based review, Practice guidelines

Resources: dbSNP/dbVar, ClinVar

QC and Expert Consensus Curation - ClinVar

Analysis of LOF Variants - single genome

82 LOF variants below 5% MAF from one case
• Common: 33
• Reported: 8
• Novel/Rare: 41

Rare LOFs:
• Excluded: 46
  – False positives: 13
  – Not Mendelian: 14
  – Weak gene to disease association: 10
  – LOF not a disease mechanism: 2
• Pathogenic: 2 (both AR; 1 novel, 1 known)
• VUS: 1 (novel)

Update database

Gene-centric resource

1. Define genes with medical relevance
2. Technical challenges
   • High GC
   • Pseudogenes/homologies
   • Repeat expansions
   • Common sites of structural variation
3. Variant types (denote common vs rare types)
   • Sequence variants (substitutions, small indels)
   • Loss-of-function vs. Gain-of-function
   • CNV – haploinsufficient vs. triplosensitive
   • Other structural changes (translocations, inversions, etc)
   • Imprinted loci
   • Repeat expansions
4. Medically relevant transcripts
5. Gene regions of pathogenic relevance
6. Patterns of inheritance (dominant, recessive, X-linked, mitochondrial, de novo, etc)
7. Phenotypes and evidence base for phenotype associations
8. Available approaches to define variant pathogenicity (assays, tools, etc)
9. Clinical utility measures
10. Clinical decision support opportunities

Initiated through collaboration amongst CHOP, Emory, and Harvard/Partners and Structural Variant workgroup
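One way to picture a gene-centric entry is as a record with one field per item in the list above. The sketch below is purely illustrative: the class name, field names, and the placeholder gene are hypothetical and do not describe the actual ClinGen data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneRecord:
    """Hypothetical record mirroring the ten items of the gene-centric resource."""
    symbol: str
    medically_relevant: bool = False                                    # item 1
    technical_challenges: List[str] = field(default_factory=list)      # item 2
    variant_types: List[str] = field(default_factory=list)             # item 3
    relevant_transcripts: List[str] = field(default_factory=list)      # item 4
    pathogenic_regions: List[str] = field(default_factory=list)        # item 5
    inheritance_patterns: List[str] = field(default_factory=list)      # item 6
    phenotypes: List[str] = field(default_factory=list)                # item 7
    pathogenicity_approaches: List[str] = field(default_factory=list)  # item 8
    clinical_utility: List[str] = field(default_factory=list)          # item 9
    decision_support: List[str] = field(default_factory=list)          # item 10

    def missing_sections(self) -> List[str]:
        """Names of list-valued sections that have not been filled in yet."""
        return [name for name, value in vars(self).items()
                if isinstance(value, list) and not value]


record = GeneRecord(
    symbol="GENE1",  # placeholder symbol, not a real gene
    medically_relevant=True,
    technical_challenges=["High GC"],
    inheritance_patterns=["dominant"],
)
print(record.missing_sections())
```

A helper such as `missing_sections` shows how a curation tool might flag which parts of a gene entry still need review.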

U41 - Working with Existing Efforts

• NCBI (ClinVar, dbSNP, dbVar, dbGaP, GTR) and EBI

• NHGRI (CRVR, eMERGE, CSER, ROR), IRDiRC

• Regulatory and Standards: ACMG, CAP, CDC, FDA, ASHG, AMP, CMGS, Global Alliance

• Locus Specific Databases (LSDBs – LOVD and non-LOVD)

• InSiGHT, PharmGKB, MSeqDB, CFTR2, ENIGMA, etc

• Human Variome Project and HGVS

• PhenoDB (Ada Hamosh) and Human Phenotype Ontology (Peter Robinson)

• OMIM (Ada Hamosh) and GeneReviews (Bonnie Pagon)

• Patient Advocacy Groups (Genetic Alliance, Patient CrossRoads, UNIQUE, Disease Specific Groups)

• Industry partners (reagents, instruments, software, etc)

ClinGen: The Clinical Genome Resource Program

Collaboration between:

• NHGRI U41 Grant
  – PIs: Ledbetter (Geisinger), Martin (Geisinger), Nussbaum (UCSF), Mitchell (Utah), Rehm (Partners/Harvard)

• NHGRI U01 “Clinically Relevant Variant Resource” Grants
  – Grant 1 PIs: Bustamante (Stanford), Plon (Baylor)
  – Grant 2 PIs: Berg (UNC), Ledbetter (Geisinger), Watson (ACMG)

• NCBI
  – ClinVar

ClinGen Delegation of Responsibilities

• Data Collection: Structural Variation, Sequence Variation, Other Genomic Data, Phenotype
• Curation: Variant Curation – Clinical Significance, Gene-Variant Pairs – Actionability, Clinical Domain Curation, Machine Learning Curation
• IT/Biofx: Data Extraction, Data Analysis, Data Dissemination, Laboratory Bioinformatics/IT, EHR Integration
• Community: Education, ELSI/Actionability, Community, Patient Registry

Grants: U41; U01 (UNC, Geisinger, ACMG); U01 (Stanford, Baylor)

ClinGen System Interactions

Diagram labels:
• CoreDB
• Disease Area Curation Tool
• OMIM
• Patient Registries
• EHR Interface
• Expert Curation of Genes and Variants by Clinical Domain and Disease Area Workgroups
• dbGaP
• LSDBs
• Labs (Genotypes & Phenotypes)
• Gene Resource (Medical Exome, Actionability)
• CNV Curation Tool (JIRA)
• Application Interface
• External Informatics Activities Enabled
• Expert Curated Variants
• Case-level Data
• Variant-level Data
• ClinVar
• Disease WGs
• Clinical Domain WGs
• Data
• Crowd-sourced Curation
• Controlled Access
• Public Access
• Private
• PharmGKB
• Machine Learning Algorithms
• Population Datasets
• Medical Lit
• Portal for the Public

International Collaboration for Clinical Genomics

– Over 190 institutional members

– Over 2800 individual members

Annual Conference June 10-12, 2014, Bethesda, MD

– Attendees include laboratory directors, physicians, genetic counselors, researchers, parents, government employees, regulatory agency representatives, and vendor partners

Bioinformatics and IT Workgroup

Karen Eilbeck (co-chair) and Sandy Aronson (co-chair)

ARUP: Brendon O’Fallon; Cartagenia: Steven Van Vooren; Emory: Stuart Tinker; GeneDx: Rhonda Brandon, Lisa Vincent; Mayo: Eric Klee; NCBI: Deanna Church, Jennifer Lee, Donna Maglott; George Riley; Partners Healthcare: Eugene Clark, Larry Babb, Matt Varugheese; University of Chicago: Teja Nelakuditi; Utah: Karen Eilbeck, Shawn Rynearson

Sequence Variant Workgroup

Madhuri Hegde (co-chair, Emory)
Sherri Bale (co-chair, GeneDx)
Carlos Bustamante (Stanford)
Soma Das (U Chicago)
Matt Ferber (Mayo)
Birgit Funke (Harvard/MGH)
Marc Greenblatt (UVM)
Elaine Lyon (ARUP)
Donna Maglott (NCBI)
Sharon Plon (Baylor)
Heidi Rehm (Harvard/Partners)
Avni Santani (CHOP)
Patrick Willems (Gendia)

Structural Variant Workgroup

Erik Thorland (co-chair, Mayo)
Swaroop Aradhya (co-chair, InVitae)
Deanna Church (NCBI)
Hutton Kearney (Fullerton)
Charles Lee (Jackson Labs)
Christa Martin (Emory)
Sarah South (ARUP)
Chad Shaw (Baylor)
Karin Wain (Utah)

Phenotyping Workgroup

David Miller (chair, Harvard)
Ada Hamosh (Hopkins)
Karen Eilbeck (Utah)
Monica Giovanni (Geisinger)
Robert Green (Harvard/BWH)
Mike Murray (Geisinger)
Robert Nussbaum (UCSF)
Erin Riggs (Emory)
Peter Robinson (Berlin)
Steven Van Vooren (Cartagenia)
Patrick Willems (Gendia)

Engagement, Education and Access Workgroup

Andy Faucett (chair, Geisinger)
Erin Riggs (Emory)
Danielle Metterville (Partners)
Genetic Counselors from participating laboratories

Consultants

Les Biesecker, Johan den Dunnen, Robert Green, Ada Hamosh, Laird Jackson, Stephen Kingsmore, Jim Ostell, Sue Richards, Peter Robinson, Lisa Salberg, Joan Scott, Sharon Terry

U41 Principal Investigators and Workgroups

NIH U41 PIs: David Ledbetter (Geisinger), Christa Martin (Geisinger), Joyce Mitchell (Utah), Robert Nussbaum (UCSF), Heidi Rehm (Harvard)