Array Genotyping to Dissect Quantitative Trait Loci in Arabidopsis thaliana
Justin Borevitz, Ecology and Evolution, University of Chicago, http://naturalvariation.org
If you haven’t voted, leave now,
I’ll/We’ll forgive you
Talk Outline
• QTL Intro
• Transcription based Cloning
• Single Feature Polymorphisms (SFPs)
  – Potential deletions
• Bulk Segregant Mapping
  – Extreme Array Mapping
• Haplotype analysis
• New arrays, new models: Aquilegia
Light Affects the Entire Plant Life Cycle
de-etiolation
hypocotyl
Quantitative Trait Loci
Experimental Design → Mapping population → Phenotyping → QTL Analysis → Fine Mapping → QTL gene → Confirmation
Genomics path:
• Marker identification, genotyping
• Candidate gene: polymorphisms, gene expression, loss of function
With the Aid of Genomics
Genomics to Clone QTL
• Recombination Fine Mapping
• Gene Expression Variation
• Hybridization Polymorphism
• Association Testing, LD mapping
• Direct Sequencing of Candidate Gene
• Quantitative Complementation
• Transgenic Complementation
Transcription based cloning
• Look for gene expression differences between genotypes
• Identify candidate genes that map to mutation
• Downstream targets that map elsewhere
differences may be due to expression or hybridization
PAG1 down regulated in Cvi
PALE GREEN1 knockout has long hypocotyl in red light
What is Array Genotyping?
• Affymetrix expression GeneChips contain 202,806 unique 25 bp oligonucleotides.
• 11 features per probe set for 21,546 genes
• New arrays have even more
• Genomic DNA is randomly labeled with biotin, product ~50 bp.
• 3 independent biological replicates compared to the reference strain Col
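As a rough illustration of how one feature could be scored under this design, the sketch below compares log intensities of a single 25 bp feature in three Col replicates against three replicates of another accession, using a SAM-style moderated statistic. The function name, the constant `s0`, and the intensities are assumptions made up for the example, not values from the talk.

```python
import math
from statistics import mean, variance

def sfp_statistic(ref_log_intensities, test_log_intensities, s0=0.1):
    """SAM-style d statistic for one array feature.

    Positive values mean the feature hybridizes better in the reference
    (Col) than in the test accession, as expected for an SFP.
    s0 is an assumed small constant that damps features with tiny variance.
    """
    n1, n2 = len(ref_log_intensities), len(test_log_intensities)
    diff = mean(ref_log_intensities) - mean(test_log_intensities)
    # pooled standard error of the difference between the two group means
    pooled_var = ((n1 - 1) * variance(ref_log_intensities)
                  + (n2 - 1) * variance(test_log_intensities)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff / (se + s0)

# Hypothetical log2 intensities for one feature (3 replicates per strain)
col = [9.1, 9.0, 9.2]
other_with_sfp = [7.0, 7.2, 6.9]
other_without_sfp = [9.0, 9.2, 9.1]
d_sfp = sfp_statistic(col, other_with_sfp)
d_same = sfp_statistic(col, other_without_sfp)
```

A feature would be called an SFP when its statistic exceeds a threshold chosen for a target false discovery rate.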
GeneChip
Potential Deletions
Spatial Correction
Spatial Artifacts
Improved reproducibility
Next: Quantile Normalization
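Quantile normalization itself can be sketched in a few lines: every array is forced onto a shared reference distribution, the mean of the k-th smallest intensity across arrays. This is a generic minimal version with made-up intensities, not necessarily the exact procedure used for these GeneChips.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array)."""
    n_features = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # reference distribution: mean of the k-th smallest value across arrays
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays)
                 for k in range(n_features)]
    normalized = []
    for a in arrays:
        # rank of each feature within its own array (ties broken by position)
        order = sorted(range(n_features), key=lambda i: a[i])
        out = [0.0] * n_features
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Two hypothetical chips with three features each
chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(chips)
```

After normalization all arrays share the same set of values, so remaining differences at a feature reflect rank changes rather than overall chip brightness.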
False Discovery and Sensitivity
PM only

SAM threshold 5% FDR
                   Total   SFPs    nonSFPs
GeneChip                   3806    89118
Sequence           817     121     696
  Polymorphic      340     117     223
  Non-polymorphic  477     4       473
Cereon marker accuracy 100%: Sensitivity 34%
False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold 18% FDR
                   Total   SFPs    nonSFPs
GeneChip                   10627   82297
Sequence           817     223     594
  Polymorphic      340     195     145
  Non-polymorphic  477     28      449
Cereon marker accuracy 100%: Sensitivity 57%
False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
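The sensitivity, false discovery rate and chi-square values on this slide can be recomputed from the four sequence-verified counts. The sketch below assumes a plain Pearson chi-square without continuity correction, which reproduces the reported statistics; the function name is made up for the example.

```python
def evaluate_calls(poly_sfp, poly_non, nonpoly_sfp, nonpoly_non):
    """Sensitivity, false discovery rate and Pearson chi-square (df = 1,
    no continuity correction) for a 2x2 table of sequence status vs SFP call."""
    a, b, c, d = poly_sfp, poly_non, nonpoly_sfp, nonpoly_non
    n = a + b + c + d
    sensitivity = a / (a + b)   # polymorphic features that were called SFPs
    fdr = c / (a + c)           # SFP calls that are not polymorphic
    chisq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return sensitivity, fdr, chisq

strict = evaluate_calls(117, 223, 4, 473)     # SAM threshold 5% FDR
relaxed = evaluate_calls(195, 145, 28, 449)   # SAM threshold 18% FDR
```

Relaxing the threshold raises sensitivity from about 34% to about 57% while the observed false discovery rate rises from about 3% to about 13%.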
3/4 Cvi markers were also confirmed in PHYB
90% 80% 70%
41% 53% 85%
90% 80% 70%
67% 85% 100%
Cereon may be a sequencing error
TIGR match is a match
Chip genotyping of a Recombinant Inbred Line
29kb interval
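One way to read a recombination breakpoint off chip genotypes is to walk the SFP calls in chromosome order and report the gap between the last marker of one parental type and the first marker of the other. The sketch below does that; the positions and the parent labels "A" and "B" are hypothetical and chosen only so that the interval is 29 kb wide.

```python
def find_breakpoints(positions_bp, calls):
    """Return (left_bp, right_bp) intervals where the parental call switches.

    positions_bp: SFP positions sorted along the chromosome.
    calls: parental genotype call for each SFP; None marks an uncalled SFP.
    """
    intervals = []
    last_pos, last_call = None, None
    for pos, call in zip(positions_bp, calls):
        if call is None:
            continue  # skip SFPs without a confident call
        if last_call is not None and call != last_call:
            intervals.append((last_pos, pos))
        last_pos, last_call = pos, call
    return intervals

# Hypothetical SFP positions and calls for one recombinant inbred line
positions = [7_400_000, 7_440_000, 7_600_000, 7_629_000, 7_790_000]
calls = ["A", "A", "A", "B", "B"]
breakpoints = find_breakpoints(positions, calls)
```

The resolution of each breakpoint is limited by the spacing of the flanking SFPs, which is why dense array markers narrow the interval.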
Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP
Typing: 1 replicate × $500 / 12,000 SFPs = $0.042 per SFP
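The per-marker cost is simply the array cost spread over the SFPs scored, using the $500 per array and 12,000 SFPs quoted on the slide. The helper name below is made up for the example.

```python
def cost_per_sfp(replicates, cost_per_array, n_sfps):
    """Total array cost divided over the SFPs scored on those arrays."""
    return replicates * cost_per_array / n_sfps

discovery = cost_per_sfp(6, 500, 12_000)  # dollars per SFP for discovery
typing = cost_per_sfp(1, 500, 12_000)     # dollars per SFP for typing one line
```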
[Figure: marker map of the interval]
Markers: SNP377, SM184, SM50, SM35, SM106, G2395, SNP65, SM40, SEQ8298, TH1, MSAT7964, MAT7787, CER45
Positions (Mb): 5.50, 5.87, 6.34, 7.01, 7.30, 7.44, 7.60, 7.79, 7.96, 8.13, 8.29, 8.65, 9.32
Near-Isogenic Lines for LIGHT1 Ler / Cvi #3
[Figure: LIGHT1 NIL genotypes, with RVE7 and GI marked]
Line: 81N-J, 17A-A/J, 114, 124, 189, Ler, 194
Plants: 6, 2, 4, 3, 3, 3, 3
Phenotype (mm): 5.0, 5.8, 5.8, 5.1, 5.9, 5.7, 5.8
Potential Deletions
>500 potential deletions
45 confirmed by Ler sequence
23 (of 114) transposons
Disease Resistance(R) gene clusters
Single R gene deletions
Genes involved in Secondary metabolism
Unknown genes
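The talk does not state how a potential deletion was defined. One plausible criterion, assumed here purely for illustration, is a run of several consecutive features along the chromosome that are all called as SFPs. The function name, `min_run`, and the flags are hypothetical.

```python
def find_deletion_runs(is_sfp, min_run=3):
    """Return (start, end) index ranges (end exclusive) of consecutive
    features flagged as SFPs; long runs suggest a deleted segment."""
    runs, start = [], None
    for i, flag in enumerate(is_sfp):
        if flag and start is None:
            start = i                      # a run begins
        elif not flag and start is not None:
            if i - start >= min_run:       # keep only sufficiently long runs
                runs.append((start, i))
            start = None
    if start is not None and len(is_sfp) - start >= min_run:
        runs.append((start, len(is_sfp)))  # run reaching the last feature
    return runs

# Hypothetical SFP calls for ten features in chromosome order
flags = [False, True, False, True, True, True, True, False, True, True]
runs = find_deletion_runs(flags)
```

Isolated SFPs are left out because a single weak feature is more likely a point polymorphism than a deletion.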
Potential Deletions Suggest Candidate Genes
FLOWERING1 QTL
Chr1 (bp)
Flowering Time QTL caused by a natural deletion in MAF1
MAF1
MAF1 natural deletion
Fast Neutron deletions
FKF1 80kb deletion, CHR1
cry2 10kb deletion, CHR1
Het
Map bibb: 100 bibb mutant plants vs 100 wt plants
bibb mapping
ChipMap AS1
Bulk segregant mapping using chip hybridization
bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1
BIBB = ASYMMETRIC LEAVES1
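Bulk segregant mapping on arrays can be pictured as follows: estimate a parental allele frequency at each SFP in the mutant pool and in the wild-type pool, smooth the difference along the chromosome, and look for the peak. This is a minimal sketch with invented frequencies and positions; the window size and function name are assumptions.

```python
def peak_of_pool_difference(positions, mutant_pool, wt_pool, window=3):
    """Locate the SFP where the smoothed difference between pools is largest.

    mutant_pool / wt_pool: estimated frequency of one parental allele at
    each SFP in the pooled mutant and pooled wild-type DNA (0 to 1).
    """
    diffs = [abs(m - w) for m, w in zip(mutant_pool, wt_pool)]
    half = window // 2
    smoothed = []
    for i in range(len(diffs)):
        lo, hi = max(0, i - half), min(len(diffs), i + half + 1)
        smoothed.append(sum(diffs[lo:hi]) / (hi - lo))  # moving average
    best = max(range(len(smoothed)), key=smoothed.__getitem__)
    return positions[best], smoothed[best]

# Hypothetical SFP positions (cM) and pool allele frequencies
positions_cm = [10, 30, 50, 64, 80, 90, 100]
mutant = [0.50, 0.40, 0.20, 0.00, 0.10, 0.30, 0.50]
wild_type = [0.50, 0.55, 0.60, 0.67, 0.62, 0.55, 0.50]
peak_position, peak_height = peak_of_pool_difference(positions_cm, mutant, wild_type)
```

Unlinked regions show similar frequencies in both pools, so only the region around the mutation stands out.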
Sequenced AS1 coding region from bib-1 …found g -> a change that would introduce a stop codon in the MYB domain
bibb, as1-101
MYB
bib-1: W49*
as1-101: Q107*
as1, bibb
AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM
eXtreme Array Mapping
[Figure: histogram of hypocotyl length (mm) in red light for Kas/Col RILs; y-axis: counts]
15 tallest RILs pooled vs 15 shortest RILs pooled
eXtreme Array Mapping
Allele frequencies determined by SFP genotyping. Thresholds set by simulations
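A threshold of this kind could be obtained by simulating pools under the null hypothesis of no linked QTL. The sketch below assumes each RIL is homozygous for either parental allele with probability 0.5 and ignores array measurement noise; the function name, number of simulations, quantile and seed are illustrative choices, not the settings used in the talk.

```python
import random

def null_threshold(pool_size=15, n_sim=10_000, quantile=0.99, seed=1):
    """Simulated threshold for the allele-frequency difference between two
    pools of RILs at a marker unlinked to any QTL."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sim):
        # fraction of lines in each pool carrying one parental allele
        tall = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        short = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        diffs.append(abs(tall - short))
    diffs.sort()
    return diffs[int(quantile * (n_sim - 1))]

threshold = null_threshold()
```

With pools of 15 lines the null differences are large, so only regions where the tall and short pools are nearly fixed for opposite alleles will exceed the threshold.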
[Figure: LOD (0 to 16) along the chromosome (0 to 100 cM)]
Composite Interval Mapping
RED2 QTL
Chromosome 2
RED2 QTL 12cM
Red light QTL RED2 from 100 Kas/Col RILs
eXtreme Array Mapping BurC F2
XAM: Lz x Col F2
QTL: Lz x Ler F2
RED2 QTL
Select recombinants by PCR: >200 from >1250 plants
High / Low
Interval between mark1 and mark2: ~2 Mb, ~8 cM, >400 SFPs
Approximate number of plants in each genotype class:
mark1 \ mark2    Col     het     Kas
Col              ~268    ~43     ~2
het              ~43     ~539    ~43
Kas              ~2      ~43     ~268
eXtreme Array Fine Mapping
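The genotype-class counts for the two flanking markers are close to what an F2 of 1250 plants would give for markers ~8 cM apart. The sketch below recomputes them, assuming Haldane's map function to convert cM to a recombination fraction (the slide does not say which map function was used) and no segregation distortion.

```python
import math

def expected_f2_counts(n_plants, map_distance_cm):
    """Expected plants per two-marker genotype class in an F2.

    Keys are (mark1 genotype, mark2 genotype). Haldane's map function is
    an assumption used to turn cM into a recombination fraction r.
    """
    r = 0.5 * (1 - math.exp(-2 * map_distance_cm / 100))
    # gamete haplotypes (allele at mark1, allele at mark2) and frequencies
    gametes = {("Col", "Col"): (1 - r) / 2, ("Kas", "Kas"): (1 - r) / 2,
               ("Col", "Kas"): r / 2, ("Kas", "Col"): r / 2}
    counts = {}
    for (a1, a2), p in gametes.items():
        for (b1, b2), q in gametes.items():
            g1 = a1 if a1 == b1 else "het"
            g2 = a2 if a2 == b2 else "het"
            counts[(g1, g2)] = counts.get((g1, g2), 0.0) + n_plants * p * q
    return counts

counts = expected_f2_counts(1250, 8)
```

Plants that are not in the three diagonal classes carry at least one recombination event between the markers and are the ones worth selecting by PCR.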
Array Haplotyping
• What about Diversity/selection across the genome?
• A genome-wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ
• LD decay, haplotype block size
• Deep population structure?
• Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
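For a set of inbred accessions scored at biallelic markers, θw, π and Tajima's D follow from the standard formulas. The sketch below is a generic implementation on a made-up 0/1 matrix; it does not correct for the ascertainment of SFPs against the Col reference, does not estimate ρ, and needs at least four accessions for the variance term to be positive.

```python
import math

def diversity_stats(haplotypes):
    """Watterson's theta, pi and Tajima's D from a 0/1 haplotype matrix.

    haplotypes: one row per inbred accession, one column per marker.
    Values are totals for the region, not per base pair.
    """
    n = len(haplotypes)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    seg_sites, pi = 0, 0.0
    for column in zip(*haplotypes):
        k = sum(column)
        if 0 < k < n:                       # segregating marker
            seg_sites += 1
            pi += 2 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return 0.0, 0.0, float("nan")
    theta_w = seg_sites / a1
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return theta_w, pi, (pi - theta_w) / math.sqrt(var)

# Hypothetical calls: 4 accessions x 3 markers
example = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0]]
theta_w, pi, tajima_d = diversity_stats(example)
```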
A star phylogeny
163 markers 73 accessions ~ 750kb/marker
Array Haplotyping
Inbred lines
Low effective recombination due to partial selfing
Extensive LD blocks
[Figure: haplotypes of Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd along Chromosome 1; scale ~500 kb]
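LD between two array markers in inbred lines can be summarized by r², the squared correlation of their 0/1 calls; blocks of markers with high pairwise r² correspond to the extended haplotypes described here. The genotypes below are invented for eight accessions.

```python
def r_squared(marker_a, marker_b):
    """Squared correlation (r^2) between two biallelic markers scored 0/1
    in the same set of inbred (homozygous) accessions."""
    n = len(marker_a)
    p_a = sum(marker_a) / n
    p_b = sum(marker_b) / n
    p_ab = sum(1 for a, b in zip(marker_a, marker_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")

# Hypothetical calls in eight accessions
m1 = [0, 0, 1, 1, 0, 1, 0, 1]
m2 = [0, 0, 1, 1, 0, 1, 0, 1]   # same pattern as m1
m3 = [0, 1, 0, 1, 0, 1, 0, 1]   # partly shuffled relative to m1
ld_same = r_squared(m1, m2)
ld_partial = r_squared(m1, m3)
```

Plotting r² against the physical distance between marker pairs gives the LD decay curve.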
Universal Whole Genome Array
RNA and DNA applications:
• Transcriptome Atlas: expression levels, tissue specificity
• Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription
• Alternative Splicing
• Comparative Genome Hybridization (CGH): insertions/deletions
• Methylation
• Chromatin Immunoprecipitation (ChIP chip)
• Polymorphism SFPs: discovery/genotyping
~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced
Transcriptome Viewer: http://signal.salk.edu
[Figure: tracks along Chromosome (bp) showing conservation, Transcriptome Atlas signal, gene models ORFa and ORFb (start, AAAAA), and SNP, SFP, M and deletion marks]
Improved Genome Annotation
Review
• Transcription Based Cloning
• Single Feature Polymorphisms (SFPs) can be used for
  – Potential deletions (candidate genes)
  – Identifying recombination breakpoints
  – eXtreme Array Mapping
• Haplotyping
  – Diversity/Selection
  – Association Mapping
Scott Hodges (UCSB)
Elena Kramer (Harvard)
Magnus Nordborg (USC)
Justin Borevitz (U Chicago)
Jeff Tompkins (Clemson)
NSF Genomics of Adaptation to the Biotic and Abiotic Environment in Aquilegia
Aquilegia (Columbines)
Recent adaptive radiation, 350Mb genome
NSF Genomics of Adaptation to the Biotic and Abiotic Environment in Aquilegia
• 35,000 ESTs 5’ and 3’
• 350 arrays, RNA and genotyping
  – High density SFP Genetic Map
• Physical Map (BAC tiling path)
  – Physical assignment of ESTs
• QTL for pollinator preference
  – and abiotic stress
  – QTL fine mapping/LD mapping
• Develop transformation techniques
NaturalVariation.org
Salk
Jon Werner, Sarah Liljegren, Huaming Chen, Joanne Chory, Detlef Weigel, Joseph Ecker
UC San Diego
Charles Berry
Scripps
Sam Hazen, Steve Kay, Elizabeth Winzeler
University of Chicago
Xu Zhang, Evadne Smith
Syngenta
Hur-Song Chang, Tong Zhu
UC Davis
Julin Maloof
University of Guelph, Canada
Dave Wolyn
Sainsbury Laboratory
Jonathan Jones