Wednesday from QTL to candidate genes
Xidan Li
Xiaodong Liu
DJ de Koning
Overview of today
• Schedule for teaching day
• Morning lectures
• 9:00 – 10:00 Lecture: Chasing the genetic basis of a QTL in chicken – DJ de Koning
• 10:00 – 10:15 Coffee break
• 10:15 – 10:45 Lecture: Bioinformatics pipeline for targeted sequencing of QTL region – Xiaodong Liu
• 10:45 – 11:00 Leg stretcher
• 11:00 – 11:30 Lecture: Identification and evaluation of causative genetic variants – Xidan Li
• 11:30 – 12:00 Discussion about morning topics
• 12:00 – 13:00 Lunch
• Afternoon exercises
• 13:00 – 14:00 NGS data alignment
• 14:00 – 15:00 SNP calling
• 15:00 – 16:00 Identify and evaluate causative genetic variants
• 16:00 – 17:00 Evaluate results and questions
Chasing the genetic basis of a QTL in chicken
DJ de Koning
Contributors
• Swedish University of Agricultural Sciences, Uppsala University
• Xidan Li
• Xiaodong Liu
• Roslin Institute, University of Edinburgh
• Javad Nadaf
• Ian Dunn
• Chris Haley
• Ark-Genomics: Alison Downing, Mark Fell, Frances Turner
• INRA, Unité de Recherches Avicoles
• Cécile Berri
• Elizabeth Le Bihan-Duval
From sequence to consequence
Phenotype
The observed trait is the sum of the effects of many genes and environmental factors
Complex Traits
Detection of QTL using exotic crosses
[Cross diagram: parental genotypes QQ and qq; offspring genotypes QQ, Qq and qq]
Use of extreme crosses to unravel complex traits
Detection of "QTL"
• Use heritable variation in the genome as DNA Markers
• Follow inheritance of DNA markers through population
• Compare inheritance pattern with character of interest
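The three steps above amount to asking, marker by marker, whether trait values differ between marker genotype classes. A minimal sketch, assuming genotypes coded QQ, Qq and qq and a one-way ANOVA F statistic as the test statistic (the slides do not say which statistic was used, and the data below are invented for illustration):

```python
from collections import defaultdict

def anova_f(genotypes, phenotypes):
    """One-way ANOVA F statistic for trait values grouped by marker genotype."""
    groups = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)
    n = len(phenotypes)
    k = len(groups)
    grand_mean = sum(phenotypes) / n
    # Variation explained by the marker genotype classes
    ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                     for v in groups.values())
    # Variation left within each genotype class
    ss_within = sum((y - sum(v) / len(v)) ** 2
                    for v in groups.values() for y in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical marker genotypes and trait values for six animals
marker = ["QQ", "QQ", "Qq", "Qq", "qq", "qq"]
trait = [6.6, 6.4, 6.3, 6.1, 5.9, 5.7]
f_stat = anova_f(marker, trait)
print(round(f_stat, 2))
```

Repeating this along the chromosome gives a test-statistic profile; a peak suggests a QTL near that marker.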
Quantitative trait locus (QTL)
• Region of the genome with a "significant" effect on our trait of interest.
• Large region with very many genes.
Intermezzo: tool for QTL analysis: www.gridqtl.org.uk
Nowadays: Association studies
• Take a large (thousands), representative sample of the population
• Characterise for a very large number of DNA variants
• Estimate a putative effect of every DNA variant on the trait of interest
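Estimating a putative effect for every variant can be sketched as one regression per variant. The example below assumes an additive model (trait regressed on allele count 0, 1 or 2) fitted by ordinary least squares; the variant names and data are hypothetical, and real association software also handles population structure and relatedness, which this sketch ignores.

```python
def additive_effect(allele_counts, phenotypes):
    """Least-squares slope of phenotype on allele count (0, 1 or 2).

    Assumes the variant is polymorphic in the sample; a monomorphic
    variant would give a zero denominator.
    """
    n = len(phenotypes)
    mean_x = sum(allele_counts) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(allele_counts, phenotypes))
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    return sxy / sxx

# Hypothetical allele counts: one list per variant, one entry per animal
variants = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
}
trait = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
effects = {name: additive_effect(counts, trait)
           for name, counts in variants.items()}
print(effects)
```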
Challenge remains: What is the gene?
• Very large area
• Many candidate genes
• Very noisy signal
• Signal may not mark the gene
Livestock genomics: Output
• QTL: Animal QTLdb http://www.animalgenome.org/QTLdb/
• Chicken: 3162 QTL from 158 papers
• Pigs: 6818 QTL from 290 papers
• Cattle: 5920 QTL from 330 papers
• From QTL to QTN
• Pigs: IGF2
• Cattle: DGAT1, ABCG2
…
1000s of QTL, very few QTN
Next step up: Gene expression studies
• Measure the expression of thousands of genes simultaneously
• Snapshot of what is happening in a given tissue at a given time.
• QTL study AND gene expression study in Population.
• What are the gene expression effects of this QTL
[Figure: two QTL profile plots; y-axis: test statistic (0 to 50), x-axis: map position (0 to 150 cM)]
eQTL: Genome region that affects gene expression
Targeted eQTL Mapping
• Focus expression analysis on most informative individuals
• eQTL underlying functional QTL
• Increased power for target regions
Application to a chicken QTL
• Very important meat quality trait
• Related to activity on the slaughter line
• Here measured 15 minutes post mortem
pH in chicken meat
F. Ricard, 1975; Nadaf et al. 2007
Chicken High growth Line, Low growth Line
QTL affecting pH
QTL Interval ~ 50 cM?
Experimental design
• What are the local and global effects of this QTL on gene expression?
• Identify 12 birds with QQ genotypes on the basis of flanking markers and 12 with qq genotype
• Perform microarray analysis using mRNA from breast muscle (P. major)
• Agilent 44k Array: 2-colour, dye-balanced
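Picking birds "on the basis of flanking markers" can be sketched as follows: keep only birds whose two markers either side of the QTL are both homozygous for the same line, so that the QTL genotype between them is very likely the same. The marker coding, bird names and function name are hypothetical; the slides do not describe the actual selection procedure in detail.

```python
def select_by_flanking_markers(birds, n_per_class=12):
    """Pick birds whose two flanking markers are both homozygous for the same line."""
    selected = {"QQ": [], "qq": []}
    for bird_id, (left, right) in birds.items():
        same_homozygote = left == right and left in selected
        if same_homozygote and len(selected[left]) < n_per_class:
            selected[left].append(bird_id)
    return selected

# Hypothetical F2 genotypes at the two markers flanking the QTL
f2 = {
    "bird1": ("QQ", "QQ"),
    "bird2": ("QQ", "Qq"),  # recombinant between the markers: QTL genotype uncertain
    "bird3": ("qq", "qq"),
    "bird4": ("Qq", "Qq"),  # heterozygous: not informative for the QQ vs qq contrast
    "bird5": ("qq", "qq"),
}
chosen = select_by_flanking_markers(f2, n_per_class=12)
print(chosen)
```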
[Diagram: 700 F2 → 24 F2 (12 QQ, 12 qq) → RNA → 12 microarray chips (Agilent 44k); labels: genetic information, genomic information]
Targeted genetical genomic approach
Enriched signals at the QTL position
Closer look at the QTL area
QTL appears to act on a region < 1Mb
Top 10
ProbeName     GeneName       t       P.Value   adj.P.Val  Alternative gene name
A_87_P016951  CR385747       -16.51  2.87E-10  7.19E-06   ZFY
A_87_P014348  RCJMB04_23c19  -16.28  3.42E-10  7.19E-06   ACOT9
A_87_P030344  BU299642       -12.64  8.27E-09  0.0001     PRDX4
A_87_P034725  BU106729       11.03   4.38E-08  0.0005
A_87_P014256  CR390282       10.28   1.02E-07  0.0009     KLHL15
A_87_P011383  CR523763       9.25    3.57E-07  0.0025     KLHL15
A_87_P032384  BU230994       8.44    1.03E-06  0.0062     PRDX4
A_87_P006189  TC202659       7.22    5.78E-06  0.0304     MSL3L1
A_87_P025536  BU476093       7.04    7.56E-06  0.0353     APOO
A_87_P034683  BU108463       6.89    9.55E-06  0.0401     PRDX4
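The column names t, P.Value and adj.P.Val match those produced by limma's topTable in R, whose default multiple-testing adjustment is Benjamini-Hochberg (BH). Whether limma and BH were actually used here is an assumption. Under that assumption, the adjustment can be sketched as below; the input p-values are invented for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

adj = bh_adjust([0.01, 0.04, 0.03, 0.005])
print(adj)
```

The running minimum is why neighbouring probes can share an adjusted value, as the top two rows of the table do (both 7.19E-06).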
Enriched signals at the QTL position
• 16 differentially expressed probes in 1Mb region around QTL
• QTL acting at chromatin or methylation level?
• pH simply one of the downstream effects.
Next Step: Re-Sequencing the QTL region
• 5 birds of each QTL genotype
• Selected DNA from 1 Mb around QTL with Agilent SureSelect Target Enrichment
• One lane on Illumina GA flow cell: 151 bp paired-end
• 4.9 Gbase of raw sequencing reads
• ~200 x coverage of each individual chicken
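A back-of-the-envelope check of these numbers, assuming the 4.9 Gbase is shared equally by all ten birds and that the target is exactly 1 Mb (both are assumptions, not stated on the slide). The raw figure is an upper bound; the gap to the quoted ~200 x would then reflect reads lost to off-target capture, duplicates or filtering, which is also an assumption.

```python
raw_bases = 4.9e9   # 4.9 Gbase of raw reads (from the slide)
n_birds = 5 * 2     # 5 birds of each QTL genotype (from the slide)
target_bp = 1e6     # ~1 Mb captured region (from the slide)

# Upper bound on depth if every raw base mapped uniquely onto the target
raw_depth_per_bird = raw_bases / n_birds / target_bp

# Fraction of raw bases that would have to be usable to reach ~200 x
implied_usable_fraction = 200 / raw_depth_per_bird

print(round(raw_depth_per_bird), round(implied_usable_fraction, 2))
```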
To be continued
• YOU will work with this NGS data today!
• The work up to the NGS has been published
Over to you!
• Then coffee
Process for identifying candidate SNPs
Re-sequencing with 200 X
SNP calling
SNP analysis in non-coding regions
SNP analysis in coding regions
CpG islands UTR regions
Missense in exons
Get gene data from Ensembl
Candidate genes with candidate SNPs
List of top rank SNPs
Splicing sites
Non-synonymous mutations
ACOT9
Description: acyl-Coenzyme A thioesterase 2, mitochondrial [Source:RefSeq peptide;Acc:NP_001012841]
Mutations:
Coordinate in chr   Position in protein   Ref codon vs. mutated codon
QQ
121615732           290                   [ TAC / TAT  V / I ]
qq
121615675           309                   [ CGC / CGA  A / S ]
121615732           290                   [ TAC / TAT  V / I ]
Mutation predictions:
Change        Conservation score   Physico-chemical properties   Final effect
A - S (309)   0.08                 0.54                          0.3726
V - I (290)   0.09                 0.201                         0.201
All scores are in the range 0 to 1. The higher the score, the larger the predicted effect on protein function.
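The listed codons only give the listed amino acid changes when read on the opposite strand: TAC / TAT both encode tyrosine as written, but their reverse complements GTA / ATA encode V / I. A minimal sketch of that check, assuming the codons are reported on the forward strand of a reverse-strand gene (an inference from the numbers, not stated on the slide) and using the standard genetic code:

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def amino_acid_change(ref_codon, alt_codon, reverse_strand=True):
    """Return (ref amino acid, mutated amino acid, is_non_synonymous)."""
    if reverse_strand:
        ref_codon = reverse_complement(ref_codon)
        alt_codon = reverse_complement(alt_codon)
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    return ref_aa, alt_aa, ref_aa != alt_aa

change_290 = amino_acid_change("TAC", "TAT")
change_309 = amino_acid_change("CGC", "CGA")
print(change_290, change_309)
```

Calling the same function with reverse_strand=False shows why strand matters: both ACOT9 changes would then look synonymous.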
ACOT9
• Most significant gene from eQTL study
• Mitochondrial gene
• Function of this particular gene not clear.
• “Acyl-CoA thioesterases are a group of enzymes that catalyze the hydrolysis of acyl-CoAs to the free fatty acid and coenzyme A (CoASH), providing the potential to regulate intracellular levels of acyl-CoAs, free fatty acids and CoASH.”
F1NR19
Description: DNA polymerase
Source: UniProtKB/TrEMBL F1NR19
Mutations:
Coordinate in chr   Position in protein   Ref codon vs. mutated codon
121338134           1396                  [ CGT / CAT  T / M ]
121464309           523                   [ AGG / AAG  P / L ]
121466769           466                   [ CGC / CAC  A / V ]
121478463           289                   [ ATG / ACG  H / R ]
qq
121464309           523                   [ AGG / AAG  P / L ]
121466769           466                   [ CGC / CAC  A / V ]
Mutation predictions:
Change         Conservation score   Physico-chemical properties   Final effect
H - R (289)    0.04                 0.543                         0.02172
A - V (466)    0.05                 0.304                         0.0152
P - L (523)    0.11                 0.703                         0.07733
T - M (1396)   0.03                 0.412                         0.01236
ENSGALT0000026337
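For the four F1NR19 mutations, the listed final effect equals the conservation score multiplied by the physico-chemical score. This is an observation about the numbers on the slide, not a documented formula, and the ACOT9 values do not follow the same pattern. A short check, which also ranks the mutations by final effect:

```python
import math

# (change, conservation score, physico-chemical score, listed final effect)
f1nr19 = [
    ("H - R (289)", 0.04, 0.543, 0.02172),
    ("A - V (466)", 0.05, 0.304, 0.0152),
    ("P - L (523)", 0.11, 0.703, 0.07733),
    ("T - M (1396)", 0.03, 0.412, 0.01236),
]

# Does the product of the two component scores reproduce the final effect?
matches = {name: math.isclose(cons * phys, final, rel_tol=1e-6)
           for name, cons, phys, final in f1nr19}

# Rank mutations from highest to lowest predicted effect
ranked = sorted(f1nr19, key=lambda row: row[3], reverse=True)
print(matches)
print([row[0] for row in ranked])
```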
PRDX4
• Peroxiredoxin 4
• Antioxidant enzyme, regulates NFκB
• Highly differentially expressed but no candidate SNPs
• 2 probes up, 1 down => Splicing?
• Still a strong functional candidate