Sequence Feature Variant Type (SFVT) Method: HLA Associations with Systemic Sclerosis
Genetic Determinants of Influenza Virus Host Range Restriction
Y. Megan Kong, Nishanth Marthandan, Paula Guidry, Jyothi Noronha, R. Burke Squires, Elizabeth McClellan, Mengya Liu, Yu Qian,
David Dougall, Jie Huang, Diane Xiang, Brett Pickett, Victoria Hunt, Young Kim, Jeff Wiser, Thomas Smith, Jonathan Dietrich, Edward Klem, Lindsay Cowell, Nancy Monson, David Karp, Richard H. Scheuermann
Laboratory of Molecular Pathology Retreat - 10 MAR 2011
Abstracts & Posters – Immunology
• HLA Research Data, Reference Data, Visualization Tools and Analysis Tools in ImmPort
– Paula A. Guidry, Nishanth Marthandan, Thomas Smith, Patrick Dunn, Steven J. Mack, Glenys Thomson, Jeffrey Wiser, David R. Karp, Richard H. Scheuermann
• Creating a Cell Detail Page for Hematopoietic Cells in ImmPort
– David S. Dougall, Shai Shen-Orr, John Campbell, Yue Liu, Patrick Dunn, Y. Megan Kong, Mark M. Davis, Richard H. Scheuermann
• Minimum Information about a Genotyping Experiment
– Jie Huang, Nishanth Marthandan, Alexander Pertsemlidis, LiangHao Ding, Julia Kozlitina, Joseph Maher, Nancy Olsen, Jonathan Rios, Michael Story, Chao Xing, Richard H. Scheuermann
• Translational Research in ImmPort
– Y. Megan Kong, Carl Dalke, Diane Xiang, Max Y. Qian, David Dougall, David Karp, Richard H. Scheuermann
• Potential of a Unique Antibody Gene Signature to Predict Conversion to Clinically Definite Multiple Sclerosis
– A.J. Ligocki, L. Lovato, D. Xiang, P. Guidry, R.H. Scheuermann, S.N. Willis, S. Almendinger, M.K. Racke, E.M. Frohman, D.A. Hafler, K.C. O'Connor, N.L. Monson
• Analysis of DRB1 Sequence Feature Variant Type Associations with Systemic Sclerosis Autoantibodies Types and Racial Groups
– Nishanth Marthandan, Paula Guidry, Glenys Thomson, Frank Arnett, David R. Karp, Richard H. Scheuermann
• An automated analysis and visualization pipeline for identification and comparison of cell populations in high-dimensional flow cytometry data
– Yu Qian, David Dougall, Megan Kong, Paula Guidry, and Richard H. Scheuermann
Abstracts & Posters – Infectious Diseases
• Influenza Research Database (IRD): A Web-based Resource for Influenza Virus Data & Analysis
– Victoria Hunt, R. Burke Squires, Jyothi Noronha, Ed Klem, Jon Dietrich, Chris Larsen, Richard H. Scheuermann
• Tool for Identifying Sequence Variations that Correlate with Virus Phenotypic Characteristics
– Brett Pickett, Prabakaran Ponraj, Victoria Hunt, Mengya Liu, Liwei Zhou, Sanjeev Kumar, Jonathan Dietrich, Sam Zaremba, Chris Larsen, Edward B. Klem, Richard H. Scheuermann
• Conserved Epitope Regions (CER): Elucidation of Evolutionarily Stable, Immunologically Reactive Regions of Human H1N1 Influenza Viruses
– R. Burke Squires, Brett Pickett, Jyothi Noronha, Victoria Hunt, Richard H. Scheuermann
• Influenza NS1-dependent Host Range Restriction Demonstrated By Sequence Feature Variant Type Analysis
– Jyothi M. Noronha, R. Burke Squires, Mengya Liu, Victoria Hunt, Brett Pickett and Richard H. Scheuermann
MHC-mediated antigen presentation
HLA allele counts
HLA-A HLA-B HLA-C
1519 (1119) 2069 (1601) 1016 (750)
HLA-DRB HLA-DQA1 HLA-DQB1 HLA-DPA1 HLA-DPB1
966 (738) 35 (26) 144 (103) 28 (16) 145 (127)
MICA MICB TAP
73 (60) 31 (20) 11 (9)
Figures in parentheses indicate the number of unique proteins encoded by the various alleles at each locus. 1634 new alleles were described in 2010 alone.
IMGT HLA – March 2011
HLA and autoimmune disease
Disease                     HLA Allele  Relative Risk
Ankylosing spondylitis      B27         87.4
Postgonococcal arthritis    B27         14.0
Acute anterior uveitis      B27         14.6
Rheumatoid arthritis        DR4         5.8
Chronic active hepatitis    DR3         13.9
Sjogren syndrome            DR3         9.7
Insulin-dependent diabetes  DR3/DR4     14.3
21-Hydroxylase deficiency   BW47        15.0
Robbins Pathologic Basis of Disease 6th Edition (1999)
HLA and infectious disease
• Correlation between HLA genotype and HIV viral burden and progression to AIDS
• M Dean, M Carrington and SJ O'Brien Annual Review of Genomics and Human Genetics Vol. 3: 263-292 (2002)
HLA and adverse drug reaction
HLA allele  Drug sensitivity          Association     Prevalence
B*1502      carbamazepine (epilepsy)  p = 3 x 10^-27  high in Chinese; absent in Caucasians
B*5701      abacavir (HIV)            p = 5 x 10^-20  high in Caucasians; absent in Africans, Hispanics
B*5801      allopurinol (gout)        p = 5 x 10^-24  high in Chinese
P. Parham
HLA Allele Nomenclature
HLA - A * 24 02 01 01
• HLA-A: locus
• *: asterisk (separator)
• 24: allele family (serological where possible)
• 02: amino acid difference
• 01: non-coding (silent) polymorphism
• 01: intron, 3' or 5' polymorphism
• Optional suffix: N = null, L = low, S = secreted, A = aberrant, Q = questionable
HLA - A * 24 02 01 02 L
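The field layout above can be sketched as a small parser. This is a minimal illustration for the undelimited two-digit-field form shown on the slide; the names `HlaAllele`, `parse_allele` and `four_digit` are hypothetical and are not part of ImmPort or IMGT/HLA.

```python
import re
from typing import NamedTuple, Optional

# Expression-status suffixes listed on the slide.
SUFFIXES = {"N": "null", "L": "low", "S": "secreted",
            "A": "aberrant", "Q": "questionable"}


class HlaAllele(NamedTuple):
    locus: str                  # e.g. "A" or "DRB1"
    family: str                 # digits 1-2: allele family
    amino_acid: Optional[str]   # digits 3-4: amino acid difference
    silent: Optional[str]       # digits 5-6: silent polymorphism
    intron_utr: Optional[str]   # digits 7-8: intron, 3' or 5' polymorphism
    suffix: Optional[str]       # one of SUFFIXES, if present


_PATTERN = re.compile(
    r"^(?:HLA-)?([A-Z0-9]+)\*(\d{2})(\d{2})?(\d{2})?(\d{2})?([NLSAQ])?$"
)


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name into its fields, ignoring embedded spaces."""
    compact = re.sub(r"\s+", "", name)
    match = _PATTERN.match(compact)
    if match is None:
        raise ValueError(f"unrecognized allele name: {name!r}")
    return HlaAllele(*match.groups())


def four_digit(allele: HlaAllele) -> str:
    """Reduce an allele to 4-digit (protein-level) resolution."""
    if allele.amino_acid is None:
        raise ValueError("allele is typed at 2-digit resolution only")
    return f"{allele.locus}*{allele.family}{allele.amino_acid}"


example = parse_allele("HLA - A * 24 02 01 02 L")
print(example, SUFFIXES[example.suffix], four_digit(example))
```

Reducing to four digits matters for the analysis that follows, because the SFVT re-annotation works from 4-digit typing data.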
DRB1 phylogeny
Figure: DRB1 phylogenetic tree; labeled clades DRB1*07, DRB1*09, DRB1*10, DRB1*04, DRB1*16, DRB1*15 and DRB1*13
DRB1 alignment
Figure: pairwise alignments 07/15, 07/09 and 09/15
HLA–mediated disease predisposition
• Hypothesis:
– While the allelic/haplotypic structures reflect the evolutionary history of the locus, it is the focused regions in the HLA genes/proteins that affect gene expression, protein structure and/or protein function that are responsible for enhanced disease risk
Summary of SFVT approach
• Define individual sequence features (SF) in HLA proteins (genes)
• Determine the extent of polymorphism for each sequence feature by defining the observed variant types (VT)
• Re-annotate HLA typing information with complete list of VT for each SF
• Examine the association between every sequence feature variant type and disease or other phenotype
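The first three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the allele names, residues and sequence features are toy data invented for the example, the function names are hypothetical, and variant types are simply numbered in order of first appearance, which may differ from the numbering used in ImmPort.

```python
# Toy data (hypothetical): residue at selected protein positions per allele.
ALLELE_RESIDUES = {
    "toy*0101": {67: "L", 70: "Q", 71: "R", 86: "G"},
    "toy*0102": {67: "L", 70: "Q", 71: "R", 86: "V"},
    "toy*1104": {67: "F", 70: "D", 71: "R", 86: "V"},
    "toy*1501": {67: "I", 70: "Q", 71: "A", 86: "V"},
}

# Hypothetical sequence features: name -> protein positions they cover.
SEQUENCE_FEATURES = {
    "SF1": (67,),
    "SF2": (67, 70, 71),
    "SF3": (86,),
}


def variant_definition(residues, positions):
    """Spell out the residues at the SF positions, e.g. '67L_70Q_71R'."""
    return "_".join(f"{pos}{residues[pos]}" for pos in positions)


def define_variant_types(allele_residues, features):
    """For each SF, give every distinct residue pattern a VT label."""
    vt_index = {}
    for sf, positions in features.items():
        seen = {}
        for allele in sorted(allele_residues):
            definition = variant_definition(allele_residues[allele], positions)
            seen.setdefault(definition, f"{sf}_VT{len(seen) + 1}")
        vt_index[sf] = seen
    return vt_index


def reannotate(allele, allele_residues, features, vt_index):
    """Replace one typed allele with its list of SF variant types."""
    residues = allele_residues[allele]
    return [vt_index[sf][variant_definition(residues, positions)]
            for sf, positions in features.items()]


VT_INDEX = define_variant_types(ALLELE_RESIDUES, SEQUENCE_FEATURES)
print(reannotate("toy*1104", ALLELE_RESIDUES, SEQUENCE_FEATURES, VT_INDEX))
```

Once every allele is expressed this way, two alleles that differ overall can still share a variant type for a given feature, which is what lets the association test in the last step pool them.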
Representative Sequence Features
A*0201 - ‘peptide binding’ SF
A*0201 - ‘peptide binding pocket B’ SF
A*0201 - ‘CD8 binding’ & ‘TCR binding’ SF
Figure labels: CD8 binding, TCR binding
Summary of SFs defined
1775 total
Variant Types for Hsa_HLA-DRB1_beta-strand 2_peptide antigen binding
Representative Sequence Features Variant Types
HLA SFVT Association with Systemic Sclerosis
• Summary of data set
– Systemic sclerosis (SSc, scleroderma) is a chronic condition characterized by altered immune reactivity, thickened skin, endothelial dysfunction, interstitial fibrosis, gangrene, pulmonary hypertension, gastrointestinal tract dysmotility, and renal arteriolar dysfunction.
– A large cohort of ~1300 SSc patients and ~1000 healthy controls has been assembled by Drs. Frank C. Arnett, John Reveille and colleagues at the University of Texas Health Science Center at Houston.
– Information on autoantibody reactivity for over 15 nuclear antigens is available.
– 4-digit typing has been done for DRB1, DQA1, and DQB1 in all individuals.
• Initial re-annotation of 4-digit DRB1 typing data
– DRB1*1104 => SF1_VT43; SF2_VT4; SF3_VT12 ………
• Statistical analysis
– Split data set into two pseudo-replicates
– 2 x n contingency table for every SF (286), where n = number of VT
– Chi-squared or Fisher’s Exact Test analysis
– Select SF with adjusted p-value <0.01 (83/286)
– 2 x 2 contingency table (type vs non-type) for every VT (418 total)
– Merge results of pseudo-replicates
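The 2 x 2 (type vs non-type) step can be sketched with the standard library alone. This is a minimal illustration, not the pipeline used in the study: the counts are toy numbers, the function names are hypothetical, and the Bonferroni adjustment is an assumption, since the slides do not say which correction produced the adjusted p-values.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2 x 2 table [[a, b], [c, d]].

    a: case alleles carrying the variant type, b: case alleles without it,
    c: control alleles carrying it, d: control alleles without it.
    """
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the case alleles.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def bonferroni(p_value, n_tests):
    """Assumed multiple-testing correction: multiply by the number of tests."""
    return min(1.0, p_value * n_tests)


# Toy counts for one variant type (not taken from the SSc cohort).
table = (3, 1, 1, 3)
p = fisher_exact_two_sided(*table)
print(odds_ratio(*table), p, bonferroni(p, 418))
```

Fisher's exact test is the safer choice when a variant type is rare and some cells of the table are small; for well-populated tables the chi-squared test named on the slide gives similar answers.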
DRB1*0101 Visualization
Composite SF- Risk and Protective Variants
DRB1*0101 Visualization
Figure residue labels: 67F, 70D, 71R, 86V, 26F, 37Y, 30Y, 28D; 67I, 70D, 71R, 86G, 26F, 37F, 30L, 28E
Figure legend: protective, risk
Publication
Limitations to initial study
• Did not take into account difference in allele frequency distributions in different racial populations
• Treated SSc as a single disease
– limited cutaneous involvement associated with pulmonary hypertension; 60-70% are anti-centromere positive
– diffuse cutaneous involvement associated with more interstitial lung disease and kidney involvement; 30% are anti-topo positive
– the two antibodies tend to be mutually exclusive
Auto-antibody SFVT associations
• Separated SSc participants based on presence of anti-topoisomerase or anti-centromere auto-antibody (cases only)
– 231 anti-topoisomerase
– 318 anti-centromere
– 3 both
– 752 neither
• SSc with anti-topo vs SSc without anti-topo
• SSc with anti-cent vs SSc without anti-cent
Overlap of top 100 SFVTs
Anti-centromere SFVTs only: 72; shared: 28; Anti-topoisomerase SFVTs only: 75
28 common SFVTs
Risky: Anti-centromere only: 18; shared: 0; Anti-topoisomerase only: 10
Protective: Anti-centromere only: 10; shared: 0; Anti-topoisomerase only: 18
Risky vs Risky
Anti-centromere risky SFVTs only: 39; shared: 0; Anti-topoisomerase risky SFVTs only: 40
Risky vs Protective
Anti-centromere risky SFVTs only: 21; shared: 18; Anti-topoisomerase protective SFVTs only: 22
Protective vs Risky
Anti-centromere protective SFVTs only: 2; shared: 10; Anti-topoisomerase risky SFVTs only: 30
Protective vs Protective
Anti-centromere protective SFVTs only: 12; shared: 0; Anti-topoisomerase protective SFVTs only: 40
Table 7. Some of the SFVTs significantly associated with the presence of anti-centromere autoantibody
Sequence Feature Variant Type (SFVT)  Variant Type Definition              Odds ratio  No. of case alleles  No. of control alleles  Corrected p-value
DRB1*0101                             -                                    2.96        100                  116                     4.55e-13
DRB1*0401                             -                                    2.09        62                   96                      4.91e-04
DRB1*0801                             -                                    2.64        30                   36                      3.29e-04
Hsa_HLA-DRB1_SF163_VT1                67L_70Q_71R                          2.52        194                  296                     1.28e-17
Hsa_HLA-DRB1_SF137_VT1                28E_30C_47Y_61W_67L_71R              2.96        120                  145                     6.76e-16
Hsa_HLA-DRB1_SF142_VT1                9W_56P_57D_60Y_61W_67L               2.87        124                  155                     1.10e-15
Hsa_HLA-DRB1_SF130_VT1                60Y_67L_70Q_71R_77T_78Y_81H_82N_85V  2.44        174                  267                     4.38e-15
Hsa_HLA-DRB1_SF98_VT1                 67L                                  2.06        343                  727                     1.16e-14
Table 8. Some of the SFVTs significantly associated with presence of anti-topoisomerase autoantibody
Sequence Feature Variant Type (SFVT)  Variant Type Definition              Odds ratio  No. of case alleles  No. of control alleles  Corrected p-value
DRB1*1501                             -                                    1.78        65                   180                     8.17e-03
DRB1*1104                             -                                    3.70        74                   105                     1.67e-06
Hsa_HLA-DRB1_SF163_VT8                67F_70D_71R                          2.22        149                  375                     1.72e-11
Hsa_HLA-DRB1_SF137_VT25               28D_30Y_47F_61W_67F_71R              2.71        105                  208                     2.56e-13
Hsa_HLA-DRB1_SF142_VT11               9E_56P_57D_60Y_61W_67F               2.72        137                  285                     1.85e-16
Hsa_HLA-DRB1_SF130_VT15               60Y_67F_70D_71R_77T_78Y_81H_82N_85V  2.26        149                  370                     9.93e-12
Hsa_HLA-DRB1_SF98_VT3                 67F                                  2.11        156                  413                     3.72e-11
Anti-centr: 9W_28E_30C_47Y_67L
Anti-topo: 9E_28D_30Y_47F_67F
Hsa_HLA-DRB1_SF137_VT25 (all SSc): odds ratio 1.85, corrected p-value 1.38e-07
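The variant type definitions in these tables encode position and residue pairs, so the anti-centromere and anti-topoisomerase patterns can be compared position by position. A minimal sketch, with hypothetical helper names:

```python
import re


def parse_definition(definition):
    """Turn '67L_70Q_71R' into {67: 'L', 70: 'Q', 71: 'R'}."""
    residues = {}
    for token in definition.split("_"):
        match = re.fullmatch(r"(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        residues[int(match.group(1))] = match.group(2)
    return residues


def differing_positions(definition_a, definition_b):
    """Map each shared position to its residue pair where the two differ."""
    a = parse_definition(definition_a)
    b = parse_definition(definition_b)
    return {pos: (a[pos], b[pos])
            for pos in sorted(a.keys() & b.keys())
            if a[pos] != b[pos]}


# The two composite patterns quoted on the slide.
anti_centromere = "9W_28E_30C_47Y_67L"
anti_topoisomerase = "9E_28D_30Y_47F_67F"
print(differing_positions(anti_centromere, anti_topoisomerase))
```

Both patterns cover the same five positions and differ at every one of them, which is the observation behind the later point that the risk-associated variants come from the same sequence features.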
ImmPort HLA SFVT Workflow
Table of subject vs. HLA 4-digit typing data
Table of subject vs. SFVT feature vector
Table of p-values, adj. p-values, odds ratio, confidence intervals
Summary
• SFVT Approach
– Proposed a novel approach for HLA disease associations based on sequence feature variant type analysis (SFVT)
– Defined structural and functional protein sequence features (SF) for all classical human MHC class I and II proteins
– Determined variant types (VT) for all SF in known alleles
– Available in ImmPort www.immport.org, IMGT-HLA and dbMHC
• Systemic Sclerosis Analysis
– Based on the SFVT approach, identified a region of the HLA-DRB1 protein centered around peptide-binding pocket 7 that appears to be associated with disease risk
– Sequences found in HLA-DRB1*1104 at positions 28, 30, 37, 67 and 86, especially with aromatic amino acids, were associated with increased disease risk
– Sequences found in this region of HLA-DRB1*0302 appear to be protective
– Different alleles are associated with altered risk in different racial/ethnic populations, but they share common SFVTs
– SFVTs associated with risk of developing SSc are different in patients with anti-topo versus anti-cent antibodies, supporting the idea that these are distinct diseases
– However, the risk-associated SFVTs are from the same SFs, suggesting a common mechanism of disease pathogenesis
Public Health Impact of Influenza
• Seasonal flu epidemics occur yearly during the fall/winter months and result in 3-5 million cases of severe illness worldwide.
• More than 200,000 people are hospitalized each year with seasonal flu-related complications in the U.S.
• Approximately 36,000 deaths occur due to seasonal flu each year in the U.S.
• Populations at highest risk are children under age 2, adults age 65 and older, and groups with other comorbidities.
Source: World Health Organization - http://www.who.int/mediacentre/factsheets/fs211/en/index.html
Flu pandemics of the 20th and 21st centuries
• 1918 flu pandemic (Spanish flu)
– H1N1 subtype
– The most severe pandemic
– Estimated to claim 2.5% - 5% of world’s population (20 – 100 million deaths)
• Asian flu (1957 – 1958)
– H2N2 subtype
– 1 – 1.5 million deaths
• Hong Kong flu (1968 – 1969)
– H3N2 subtype
– 750,000 - 1 million deaths
• 2009 pandemic
– H1N1
– >16,000 deaths as of March 2010
Influenza Virus
• Orthomyxoviridae family
• Negative-strand RNA
• Segmented
• Enveloped
• 8 RNA segments encode 11 proteins
• Classified based on serology of HA and NA
SFVT approach
Influenza A_NS1_nuclear-export-signal_137(10)
VT-1  I F D R L E T L I L
VT-2  I F N R L E T L I L
VT-3  I F D R L E T I V L
VT-4  L F D Q L E T L V S
VT-5  I F D R L E N L T L
VT-6  I F N R L E A L I L
VT-7  I Y D R L E T L I L
VT-8  I F D R L E T L V L
VT-9  I F D R L E N I V L
VT-10 I F E R L E T L I L
VT-11 L F D Q M E T L V S
• Identify regions of protein/gene with known structural or functional properties – Sequence Features (SF)
– an alpha-helical region, the binding site for another protein, an enzyme active site, an immune epitope
• Determine the extent of sequence variation for each SF by defining each unique sequence as a Variant Type (VT)
• High-level, comprehensive grouping of all virus strains by VT membership for each SF independently
• Genotype-phenotype association statistical analysis (virulence, pathogenesis, host range, immune evasion, drug resistance)
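For a contiguous feature such as the nuclear export signal, the steps above amount to slicing the same window out of every strain's protein sequence, labeling each distinct slice as a variant type, and tallying a phenotype (here, host) per variant type. A minimal sketch under stated assumptions: the strain names, hosts and sequences are toy data, the function names are hypothetical, and numbering variant types by descending frequency is an assumption rather than the rule used in IRD.

```python
from collections import Counter, defaultdict

# Toy data (hypothetical): strain -> (host, aligned protein sequence).
STRAINS = {
    "strain_a": ("Avian", "MKIFDRLQ"),
    "strain_b": ("Avian", "MKIFDRLQ"),
    "strain_c": ("Human", "MKIFNRLQ"),
    "strain_d": ("Equine", "MKLFDQLQ"),
    "strain_e": ("Human", "MRIFNRLQ"),
}


def extract_feature(sequence, first, length):
    """Return the SF window; 'first' is a 1-based residue position."""
    return sequence[first - 1:first - 1 + length]


def define_variant_types(strains, first, length):
    """Label each distinct SF sequence, most frequent first."""
    counts = Counter(extract_feature(seq, first, length)
                     for _, seq in strains.values())
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {seq: f"VT-{i}" for i, seq in enumerate(ordered, 1)}


def host_distribution(strains, first, length):
    """Count hosts per variant type for one sequence feature."""
    vt_of = define_variant_types(strains, first, length)
    table = defaultdict(Counter)
    for host, seq in strains.values():
        table[vt_of[extract_feature(seq, first, length)]][host] += 1
    return {vt: dict(hosts) for vt, hosts in table.items()}


# Hypothetical 4-residue feature starting at position 3.
print(define_variant_types(STRAINS, 3, 4))
print(host_distribution(STRAINS, 3, 4))
```

A host-by-VT table of this kind is the input to the genotype-phenotype association step; a skewed row only suggests a host range effect, since uneven sampling can produce the same pattern.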
Influenza A_NS1_alpha-helix_171(17)
Influenza A Sequence Features as of January 2011
Protein  Subtype  Functional  Structural  Immune Epitopes  Total Count
PB2      -        7           10          564              585
PB1-F2   -        2           2           -                6
PB1      -        6           5           733              744
PA       -        1           29          534              565
NS2      -        2           3           78               83
NS1      -        21          15          458              494
NP       -        10          25          472              512
NA       N1       10          26          113              153
NA       N2       9           59          106              180
M2       -        4           10          96               116
M1       -        12          14          286              312
HA       H1       4           37          335              376
HA       H2       4           10          20               34
HA       H3       2           59          390              481
HA       H5       3           14          40               65
HA       H7       -           1           2                3
Total             97          319         4227             4709
NS1 Sequence Features
VT for SF8 (nuclear export signal)
VT-1 strains
DO VARIATIONS IN NS1 SEQUENCE FEATURES INFLUENCE INFLUENZA VIRUS HOST RANGE?
VT for SF8 (nuclear export signal)
Causes of apparent NS1 VT-associated host range restriction
• Virus spread = capability + opportunity
– Phenotypic property of the virus – limited capacity
– Restricted founder effect – limited opportunity
• Restricted spatial-temporal distribution
• Sampling bias – assumption of random sampling
– Oversampling – avian H5N1 in Asia; 2009 H1N1
– Undersampling – large and domestic cats
• Linkage to causative variant
VT-10 strains
VT for SF8 (nuclear export signal)
VT lineages
VT-10 lineage
VT-4 lineage
VT-4 strains
VT-4 lineage = B allele/group
VT-15 & VT-8 lineages
VT-5 strains
Summary
• Compiling list of all known influenza protein sequence features (SFs) in IRD
• Observed dramatic skewing in NS1 SFVT host distributions
• In some cases, attributable to sampling biases
– VT-1 and Avian H5N1 due to Asian sampling in mid-2000's
– VT-2 and human due to 2009 pandemic H1N1
– VT-11 and Other (Environment) in Delaware Bay
• Performing multivariate statistical analysis to control for confounding variables
• In other cases, attributable to founder effects
– VT-13 and -14 and Viet Nam 2003
• However, in other cases these explanations do not appear to be consistent with the data, suggesting that these may indeed be NS1-mediated host range restrictions
– Equine VT-10 lineage
– Avian VT-4 lineage (B allele/group)
– Human VT-8 lineage
– Human VT-15 lineage
• Nuclear export vs linkage disequilibrium?
HLA SFVT Acknowledgements
BISC ImmPort Team
• David Karp (UTSW)
• Nishanth Marthandan (UTSW)
• Paula Guidry (UTSW)
• Frank C. Arnett (UTH)
• John Reveille (UTH)
• Chul Ahn (UTSW)
• Glenys Thomson (Berkeley)
• Tom Smith (NG)
• Jeff Wiser (NG)
DAIT HLA Working Group
• David DeLuca (Hannover)
• Raymond Dunivin (NCBI)
• Michael Feolo (NCBI)
• Wolfgang Helmberg (Graz)
• Steven G. E. Marsh (ANRI)
• David Parrish (ITN)
• Bjoern Peters (LIAI)
• Effie Petersdorf (FHCRC)
• Matthew J. Waller (ANRI)
Sequence Ontology WG
• Michael Ashburner (Cambridge)
• Lindsay Cowell (UTSW)
• Alexander D. Diehl (Buffalo)
• Karen Eilbeck (Utah)
• Suzanna Lewis (LBNL)
• Chris Mungall (LBNL)
• Darren A. Natale (Georgetown)
• Barry Smith (Buffalo)
With support from NIAID N01AI40076
Influenza SFVT Acknowledgments
• U.T. Southwestern
– Richard Scheuermann
– Burke Squires
– Jyothi Noronha
– Mengya Liu
– Victoria Hunt
– Shubhada Godbole
– Brett Pickett
– Ayman Al-Rawashdeh
• MSSM
– Adolfo Garcia-Sastre
– Eric Bortz
– Gina Conenello
– Peter Palese
• Vecna
– Chris Larsen
– Al Ramsey
• LANL
– Catherine Macken
– Mira Dimitrijevic
• U.C. Davis
– Nicole Baumgarth
• Northrop Grumman
– Ed Klem
– Mike Atassi
– Kevin Biersack
– Jon Dietrich
– Wenjie Hua
– Wei Jen
– Sanjeev Kumar
– Xiaomei Li
– Zaigang Liu
– Jason Lucas
– Michelle Lu
– Bruce Quesenberry
– Barbara Rotchford
– Hongbo Su
– Bryan Walters
– Jianjun Wang
– Sam Zaremba
– Liwei Zhou
• IRD SWG
– Gillian Air, OMRF
– Carol Cardona, Univ. Minnesota
– Adolfo Garcia-Sastre, Mt Sinai
– Elodie Ghedin, Univ. Pittsburgh
– Martha Nelson, Fogarty
– Daniel Perez, Univ. Maryland
– Gavin Smith, Duke Singapore
– David Spiro, JCVI
– Dave Stallknecht, Univ. Georgia
– David Topham, Rochester
– Richard Webby, St Jude
• SFVT experts
– Gillian Air, OMRF
– Toru Takimoto, Rochester
– Summer Galloway, Emory
– Robert Lamb, Northwestern
– Benjamin Hale, Mt. Sinai
• USDA
– David Suarez
• Sage Analytica
– Robert Taylor
– Lone Simonsen
• CEIRS Centers