Microarrays for mapping and expression analysis: Toward the genetic determinants of light response adaptation in Arabidopsis and Aquilegia

Justin Borevitz
Ecology & Evolution
University of Chicago
naturalvariation.org

Post on 20-Dec-2015



Light Affects the Entire Plant Life Cycle

de-etiolation

hypocotyl


Light response variation can be seen under constant conditions in the lab

Seasons in the Growth Chamber

• Changing Day length

• Cycle Light Intensity

• Cycle Light Colors

• Cycle Temperature

Sweden Spain

Day Length

[Plot: hours (0:00 to 22:00) by month, September through August; series: Sweden, Spain, standard]

Light Intensity

[Plot: W/m² (0 to 1400) by month, September through August; series: Sweden, Spain, standard]

Temperature

[Plot: degrees C (-10 to 35) by month, September through August; series: Spain High, Spain Low, Sweden High, Sweden Low, standard]

Local Population Variation

Talk Outline

• Single Feature Polymorphisms (SFPs)
  – Potential deletions
  – Bulk segregant / eXtreme Mapping
• Haplotype analysis
• Tiling arrays
• Aquilegia

What is Array Genotyping?

• Affymetrix tiling array GeneChips have ~35 bp spacing and 1.67 million unique features

• Genomic DNA is randomly labeled with biotin dCTP; the product is ~50 bp

• 3 independent biological replicates are compared to the reference strain Col

GeneChip

Potential Deletions

False Discovery and Sensitivity

PM only

SAM threshold 5% FDR

GeneChip                 SFPs    nonSFPs   Cereon marker accuracy
                         3806    89118     100%
Sequence          817    121     696       Sensitivity
Polymorphic       340    117     223       34%
Non-polymorphic   477    4       473

False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold 18% FDR

GeneChip                 SFPs    nonSFPs   Cereon marker accuracy
                         10627   82297     100%
Sequence          817    223     594       Sensitivity
Polymorphic       340    195     145       57%
Non-polymorphic   477    28      449

False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
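The sensitivity, false discovery rate, and chi-square values reported for the two SAM thresholds can be recomputed from the 2×2 counts of array calls against sequence-verified features. The sketch below is a minimal check, not the original analysis: the function name `summarize` is hypothetical, and the use of an uncorrected Pearson chi-square is an assumption (the slide does not name the test, though this choice does reproduce the reported statistics).

```python
import math

def summarize(sfp_poly, non_poly, sfp_nonpoly, non_nonpoly):
    """Summarize a 2x2 table of array calls versus sequenced features.

    sfp_poly    : polymorphic features called SFP (true positives)
    non_poly    : polymorphic features called nonSFP (false negatives)
    sfp_nonpoly : non-polymorphic features called SFP (false positives)
    non_nonpoly : non-polymorphic features called nonSFP (true negatives)
    """
    n = sfp_poly + non_poly + sfp_nonpoly + non_nonpoly
    sensitivity = sfp_poly / (sfp_poly + non_poly)
    fdr = sfp_nonpoly / (sfp_poly + sfp_nonpoly)
    # Pearson chi-square for a 2x2 table, no continuity correction (assumption).
    det = sfp_poly * non_nonpoly - non_poly * sfp_nonpoly
    row_product = (sfp_poly + non_poly) * (sfp_nonpoly + non_nonpoly)
    col_product = (sfp_poly + sfp_nonpoly) * (non_poly + non_nonpoly)
    chi2 = n * det ** 2 / (row_product * col_product)
    # Upper tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return sensitivity, fdr, chi2, p_value

# Counts taken from the 5% FDR and 18% FDR tables.
strict = summarize(117, 223, 4, 473)
relaxed = summarize(195, 145, 28, 449)

for label, (sens, fdr, chi2, p) in (("5% FDR", strict), ("18% FDR", relaxed)):
    print(f"{label}: sensitivity={sens:.0%} FDR={fdr:.0%} chisq={chi2:.2f} p={p:.3e}")
```

Relaxing the threshold raises sensitivity at the cost of a higher observed false discovery rate, which is the trade-off the two tables illustrate.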

3/4 Cvi markers were also confirmed in PHYB

90% 80% 70%

41% 53% 85%

90% 80% 70%

67% 85% 100%

Cereon may be a sequencing error

TIGR match is a match

Chip genotyping of a Recombinant Inbred Line

29kb interval

Discovery: 6 replicates × $500 / 120,000 SFPs = $0.025 per SFP

Typing: 1 replicate × $500 / 120,000 SFPs = $0.0041 per SFP
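The per-marker cost figures follow from dividing the total array cost by the number of SFPs scored. A minimal sketch of that arithmetic, assuming the slide's $500 per array and 120,000 SFPs (the function name `cost_per_sfp` is hypothetical):

```python
def cost_per_sfp(replicates, cost_per_array=500.0, n_sfps=120_000):
    """Total array cost divided by the number of SFPs scored."""
    return replicates * cost_per_array / n_sfps

discovery = cost_per_sfp(6)  # six replicate arrays for SFP discovery
typing = cost_per_sfp(1)     # one array to genotype a line

print(f"discovery: ${discovery:.4f} per SFP")
print(f"typing:    ${typing:.4f} per SFP")
```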

Potential Deletions

>500 potential deletions

45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance (R) gene clusters

Single R gene deletions

Genes involved in Secondary metabolism

Unknown genes

Potential Deletions Suggest Candidate Genes

FLOWERING1 QTL

Chr1 (bp)

Flowering Time QTL caused by a natural deletion in FLM (Werner et al, Genetics 2005)

MAF1

FLM natural deletion

Fast Neutron deletions

FKF1 80 kb deletion, Chr1

cry2 10 kb deletion, Chr1

Het

Hazen et al Plant Physiology 2005

Map bibb

100 bibb mutant plants

100 wt mutant plants

Array Mapping

Hazen et al Plant Physiology 2005

LUX ARRHYTHMO encodes a Myb domain protein essential for circadian rhythms

Hazen et al PNAS, 2005

Cloned with Array Mapping

eXtreme Array Mapping

15 tallest RILs pooled vs 15 shortest RILs pooled

Wolyn et al Genetics 2004

LOD

eXtreme Array Mapping

Allele frequencies determined by SFP genotyping. Thresholds set by simulations
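One way to picture how simulations can set a threshold: under the null hypothesis of no linked QTL, each RIL in a pool carries either parental allele with probability 0.5, so the allele-frequency difference between the two pools has a known sampling distribution. The sketch below is an illustration under those assumptions, not the simulation used in the talk. The function name `null_threshold` is hypothetical, it models a single marker rather than a genome-wide scan, and it ignores array measurement noise.

```python
import random

def null_threshold(n_per_pool=15, n_sims=10_000, alpha=0.05, seed=0):
    """Per-marker threshold on |freq(high pool) - freq(low pool)| under the null.

    Assumes fully homozygous RILs and a marker unlinked to any QTL, so each
    line carries parent A's allele with probability 0.5.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        high = sum(rng.random() < 0.5 for _ in range(n_per_pool)) / n_per_pool
        low = sum(rng.random() < 0.5 for _ in range(n_per_pool)) / n_per_pool
        diffs.append(abs(high - low))
    diffs.sort()
    # Empirical (1 - alpha) quantile of the simulated null differences.
    index = min(n_sims - 1, int((1 - alpha) * n_sims))
    return diffs[index]

threshold_15 = null_threshold()  # pools of 15 RILs, as on the slide
print(f"per-marker 5% threshold for pools of 15: {threshold_15:.3f}")
```

With only 15 lines per pool the null differences are large, so only markers near a strong QTL would clear the threshold; a genome-wide threshold would need to account for the many linked markers tested.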

[Plot: LOD (0 to 16) vs. position (0 to 100 cM)]

Composite Interval Mapping

RED2 QTL

Chromosome 2

RED2 QTL 12cM

Red light QTL RED2 from 100 Kas/Col RILs

Array Haplotyping

• What about diversity/selection across the genome?

• A genome-wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ

• LD decay, haplotype block size

• Deep population structure?

• Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
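Three of the parameters listed, π, θw, and Tajima's D, have standard closed-form estimators. The sketch below applies them to a matrix of biallelic calls, one row per accession. Treating SFP calls as 0/1 biallelic sites is an assumption made for illustration, the function name `diversity_stats` is hypothetical, values are per region rather than per base pair, and ρ is not estimated here.

```python
import math

def diversity_stats(haplotypes):
    """Return (pi, theta_w, tajima_d) for rows of 0/1 calls, one row per line."""
    n = len(haplotypes)
    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    pi = 0.0
    for j in range(len(haplotypes[0])):
        k = sum(h[j] for h in haplotypes)  # lines carrying allele 1 at site j
        if 0 < k < n:
            seg_sites += 1
            pi += k * (n - k) / n_pairs    # fraction of pairs differing here
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = seg_sites / a1               # Watterson's estimator
    # Variance terms from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:                      # no segregating sites, or n < 4
        return pi, theta_w, float("nan")
    return pi, theta_w, (pi - theta_w) / math.sqrt(variance)

# Toy example: 4 lines, 3 sites (the last site is monomorphic).
calls = [
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
pi, theta_w, tajima_d = diversity_stats(calls)
print(f"pi={pi:.4f} theta_w={theta_w:.4f} D={tajima_d:.3f}")
```

Applying the same function to the calls within successive 50 kb windows would give a chromosome-wide profile of each statistic.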

Array Haplotyping

Inbred lines

Low effective recombination due to partial selfing

Extensive LD blocks

Col Ler Cvi Kas Bay Shah Lz Nd

Chromosome 1, ~500 kb

SFPs for reverse genetics

http://naturalvariation.org/sfp

14 Accessions, 30,950 SFPs

Chromosome Wide Diversity

Diversity 50kb windows

Tajima’s D like 50kb windows

R genes vs bHLH

Review

• Single Feature Polymorphisms (SFPs) can be used for:
  – Potential deletions (candidate genes)
  – Identifying recombination breakpoints
  – eXtreme Array Mapping
  – Haplotyping
  – Diversity/selection
  – Association mapping

RNA DNA

Universal Whole Genome Array

Transcriptome Atlas: expression levels, tissue specificity

Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription

Alternative Splicing

Comparative Genome Hybridization (CGH): insertions/deletions

Methylation

Chromatin Immunoprecipitation (ChIP chip)

Polymorphism SFPs: discovery/genotyping

~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced

[Figure: schematic along the chromosome (bp) with conservation and Transcriptome Atlas tracks; labeled features: SNP, SFP, M marks, ORFa (start, AAAAA), ORFb, deletion]

Improved Genome Annotation

cDNA raw intensity 10% smoothed

Aquilegia (Columbines)

Recent adaptive radiation, 350 Mb genome

• 300 F3 RILs growing (Evadne Smith)

• >50,000 ESTs in the TIGR gene index and GenBank, arrays being designed by Nimblegen

Aquilegia (Columbines)

Genetics of Speciation along a Hybrid Zone

Species with >20k ESTs, 11/14/2003

Animal lineage: good coverage

Plant lineage: crop plant coverage

NSF Genome Complexity

• 52,000 ESTs 5’ and 3’
  – >9k contigs, 4k singletons
  – >500 SNPs
• 350 arrays, RNA and genotyping
  – High density SFP Genetic Map
• Physical Map (BAC tiling path)
  – Physical assignment of ESTs
• QTL for pollinator preference
  – ~400 RILs, map abiotic stress
  – QTL fine mapping / LD mapping
• Develop transformation techniques
• http://www.AQgenome.org

Scott Hodges (UCSB)

Elena Kramer (Harvard)

Magnus Nordborg (USC)

Justin Borevitz (U Chicago)

Jeff Tompkins (Clemson)

University of Chicago

Xu Zhang
Evadne Smith

University of Guelph, Canada

Dave Wolyn

Sainsbury Laboratory

Jonathan Jones

NaturalVariation.org

Salk

Jon Werner
Joanne Chory
Detlef Weigel
Joseph Ecker

UC Davis

Julin Maloof

UC San Diego

Charles Berry

Scripps

Sam Hazen
Elizabeth Winzeler

differences may be due to expression or hybridization

PAG1 down regulated in Cvi

PALE GREEN1 knockout has long hypocotyl in red light