opportunities and challenges of statistical genetics in ...zhu and yu 2009, genetics 182:875-888. yu...

30
Opportunities and Challenges of Statistical Genetics in Genome-wide Association Studies Jianming Yu, Department of Agronomy, Kansas State University The 8 th International Purdue Symposium on Statistics Session: Interactions Between Omics and Statistics: Analyzing High Dimensional Data Sunday, June 24, 11:00am - 11:30am

Upload: others

Post on 15-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Opportunities and Challenges of Statistical

Genetics in Genome-wide Association StudiesJianming Yu,

Department of Agronomy, Kansas State University

The 8th International Purdue Symposium on Statistics

Session: Interactions Between Omics and Statistics: Analyzing High Dimensional Data

Sunday, June 24, 11:00am - 11:30am

Page 2: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Outline

• GWAS: Opportunities and challenges

• Rare allele (CR-GWAS)– Arabidopsis

• Genomic distribution of trait-associated SNPs– Maize

• Examples of functional haplotypes– Sorghum

Page 3: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Geno→Pheno & Geno↔Pheno

• Genome-wide association study (GWAS)– Gene identification

– Finding association “signals” with a large set of SNPs across diverse germplasm

– Array-based genotyping & resequencing

• Genome-wide selection (GS)– “Prediction”, breeding, genetic gain

– Selection of individuals based on predicted phenotypic values using all markers, rather than only significant markers linked to QTL

– Integration of genomic technology with plant breeding• Bernardo and Yu 2007, Crop Science 47:1082-1090

I. GWAS: Opportunities and Challenges

Page 4: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Opportunities of GWAS

• An additional strategy/tool in gene identification– Initiation/validation QTL cloning

– Hypothesis of new functions of known genes

– New genes/pathways

• The ability of rapidly nailing down genes for human diseases with relatively simple inheritance is impressive!

Genetic diversity

y = Xb + Sa + Qv +Zu + e

Methodology development

σ

Σ

α

Genomic technology

A

TG

C

A

G

D r2

Genetics of complex traits

p = 1e-7

Many groups

Many groups

Genetic diversity

y = Xb + Sa + Qv +Zu + e

Methodology development

σ

Σ

α

Genomic technology

A

TG

C

A

G

D r2

Genetics of complex traits

p = 1e-7

Genetic diversity

y = Xb + Sa + Qv +Zu + e

Methodology development

σ

Σ

α

y = Xb + Sa + Qv +Zu + e

Methodology development

σ

Σ

α

Genomic technology

A

TG

C

A

G

Genomic technology

A

TG

C

A

G

D r2

Genetics of complex traits

p = 1e-7

Many groups

Many groups

• Appreciation of the complexity and beauty of the natural variation

• Has the potential to offer a global view of genetic architecture of complex traits

Page 5: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Challenges of GWAS

• Population structure and relative kinship– Human genetics, plant and animal genetics

• Structure; Mixed model QK; Dimension determination/model testing; Accuracy of variance-covariance matrix, R2

LR for mixed model

0.00

0.10

0.20

0.30

0.40

0.50

0.60

0.70

I II III IV V

Simple PCA

nMDS K

PCA+K nMDS+K

Av

era

ge

po

wer

Sample type

b

0.00

0.20

0.40

0.60

0.80

1.00

I II III IV V

Simple PCA

nMDS K

PCA+K nMDS+K

Fals

e p

osit

ive r

ate

Sample type

a

0.00

0.20

0.40

0.60

0.80

1.00

I II III IV V

Simple PCA

nMDS K

PCA+K nMDS+K

Fals

e p

osit

ive r

ate

Sample type

a

Zhu and Yu 2009, Genetics 182:875-888.

Yu et al, 2006. A unified mixed-model method for

association mapping that accounts for multiple levels

of relatedness. Nature Genetics 38, 203-208

y = X + S + e

Environments,

etc.Candidate

SNP effectsTrait

values

QK model

Subpopulation

effectsQ model

+ Qv

Background

Genetic effects,

Var(u) = 2KVA

K model

+ Zu

Simple model

y = X + S + e

Environments,

etc.

Environments,

etc.Candidate

SNP effects

Candidate

SNP effectsTrait

values

Trait

values

QK modelQK model

Subpopulation

effectsQ model

+ Qv

Subpopulation

effectsQ modelQ model

+ Qv

Background

Genetic effects,

Var(u) = 2KVA

K model

+ Zu

Background

Genetic effects,

Var(u) = 2KVA

K model

Background

Genetic effects,

Var(u) = 2KVA

K modelK model

+ Zu

Simple model

Page 6: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Challenges of GWAS

• Computational demand for mixed model– P3D, Compression, EMMA, EMMAX, and Fast-LMM

– GEMMA, MLMM (Online first Nature Genetics)

• Multiple testing issue and significance threshold

• Missing heritability

• Rare allele and epistasis

• Genome, genetics, gene, coding region, allele, hapoltype, SNP, structural variation, etc.

Blame “complex biology and genetics”

Page 7: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Hypothesis of GWAS: CDCV or CDRV

Rare allele: 4% from Genotyping versus 50% from Resequencing

Single-locus test after controlling for population stratification

CDCV CDRV

CASE

CONTROL

Single locusGenomic

regionInteraction Pathway

GWAS Functional NetworkCommunity Resource

Bioinformatics

Nat Rev Genet 11, 773-785

I. Composite Resequencing-based GWAS (CR-GWAS)

Page 8: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Resequencing: Huang et al. Nat Genet 2010 42:961-967

Genotyping: Atwell et al. Nature 2010 465:627-631

Resequencing: Zhao et al. PLoS Genet 2007 3:e3GWAS in Plants

Genetic design + Resequencing: Tian et al. Nat Genet 2011 43:159-162

Genetic design + Resequencing: Kump et al. Nat Genet 2011 43:163-168

Page 9: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Critical Advances

• With next generation sequencing technologies, exome sequencing or whole genome resequencing is now possible (Shendure and Ji, 2008;

Ansorge, 2009; Ng et al., 2010).

• Biological functions of nucleotide polymorphisms can be predicted with the context sequence of genes (Ramensky et al., 2002;

Kumar et al., 2009).

• Attention has been given to the rare allele issue (Cohen et al., 2004;

Bodmer and Bonilla, 2008; Nejentsev et al., 2009) and some specific statistics have been developed to assess the significance of rare variants (Morgenthaler and Thilly, 2007; Li and Leal, 2008; Madsen and Browning, 2009;

Morris and Zeggini, 2010).

• Genome databases and gene networks have been developed to aid the search and confirmation processes of gene-trait associations (Lee et al., 2008; Lee et al., 2010; Lee et al., 2010).

Page 10: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Resequencing or genotyping-by-sequencing

Resequenced regions

Gene network(e.g., AraNet)

A priori

candidate genes

Genome database(e.g., TAIR)

Significant

candidate regions

A posterior

candidate genes

Function prediction(e.g., Polyphen, SIFT)

Validating and identifying

GWAS

Linkage analysis

QTL cloning

Mutation

Other research

Systematic testing

Combined multivariate

Pooled method

Single SNP

Multiple common

variant Sum testWeighted Sum

Function-aided Sum

Pooled rare variant

Composite Resequecing-based GWAS (CR-GWAS) integrates function prediction,

genome database and gene network information, and common and rare variant testing

Page 11: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

16840

450

680

546

5371

5820

4243

4150

0 5000 10000 15000 20000

Indel

Tri-allelic

3-UTR

5-UTR

Intergenetic

Intron

Synonymous

NonsynonymousPolyPhen

SIFT

0

500

1000

1500

2000

2500

Deleterious Tolerated

Page 12: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Distribution of SNPs with different function prediction across different minor allele frequency

SIFT-predicted function class, deleterious or tolerant, and Polyphen-predicted score value.

0 1 2 3

To

lera

nt

D

ele

teri

ou

s

Score

Benign Probably

damagingPo

ss

ibly

da

ma

gin

gf

0.00

0.10

0.20

0.30

0.40

5 10 15 20 25 30 35 40 45 50

Intronic

Syn

Benign

Pos D

Prob D

Minor allele frequency (%)

SN

P f

req

ue

ncy

c

Page 13: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Functional SNP frequency & minor allele frequency

y = 0.3562x-1.4162

R2 = 0.846

0

0.1

0.2

0.3

0.4

0.5

0.6

2.5

7.5

12.5

17.5

22.5

27.5

32.5

37.5

42.5

47.5

Minor allele frequency

Fu

ncti

on

al

SN

P f

req

uen

cy

d

y = 0.4163x-1.4524

R2 = 0.7816

0

0.1

0.2

0.3

0.4

0.5

0.6

2.5

7.5

12.5

17.5

22.5

27.5

32.5

37.5

42.5

47.5

Minor allele frequency (%)

Fu

ncti

on

al

SN

P f

req

uen

cy

e

PolyPhen-predicted functional SNP frequency SIFT-predicted functional SNP frequency

Page 14: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Candidate genes are over-represented among statistically significant associations.

Results for three genes (FLM, SPL5, and FY) with rare variants were consistent.

0 1i i iy z e

1

mij

ij

xz

m

1 (1 )

mij

ij i i

xz

np p

1

mF

i j j ijj

z S p x

1.41620.3562( )Fp p

S = average

probabilistic score

of each class

0

1

2

3

0 1 2 3

0

1

2

3

0 1 2 3

0

1

2

3

0 1 2 3

Sum test

We

igh

ted

su

m t

est

Weighted sum test Sum test

Fu

nc

tio

n-a

ide

d s

um

te

st

Fu

nc

tio

n-a

ide

d s

um

te

st

FY

SPL5

FLM

FLM

SPL5

FY

FY

SPL5

FLM

Rare allele statistics

Page 15: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Gene &

Gene IDSum test Weighted sum

Function-

aided sum

A priori

candidate

gene

Connected

in

AraNet

Supporting evidence

FLM

AT1G77080

LD (3.32)

JIC2W (4.78)

JIC4W (3.23)

JIC2W (1.45) JIC2W (3.12) yes yes Scortecci et al. 2001

Werner et al. 2005

BAS1

AT2G26710

LD (4.19)

SD (2.91)

JIC2W (3.52)

LD (4.55)

SD (3.47)

JIC2W (3.69)

LD (4.33)

SD (3.12)

JIC2W (2.23)

yes yes Turk et al. 2005

SPL5

AT3G15270

JIC/USC (3.22) JIC/USC (3.47) JIC/USC (3.57) yes yes Wu et al. 2009

Wu and Poethig 2006

FY

AT5G13480

JIC2W (3.55)

JIC8W (2.24)

JIC2W (4.05)

JIC8W (2.31)

JIC2W (3.67) yes yes Simpson et al. 2003

[Brachi et al. 2010]

Genes with rare variants showing associations to flowering time

Page 16: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Gene &

Gene ID

Single SNP

test

Multiple

common

Combined

multivariate

Pooled

A priori

candidate

Gene

Connected

in

AraNet

Supporting evidence

[GWAS]

AP1

AT1G69120

JIC0W (2.91)

FLC (2.85)

JIC0W (3.17)

FLC (3.35)

JIC4W (3.37)

JIC8W (3.09)

JIC0W (3.28)

FLC (3.72)

JIC4W (3.48)

VERN (3.77)

yes yes Gustafson-Brown et al.

1994

[Brachi et al. 2010]

Mouradov et al. 2002

CR88

AT2G04030

JIC/USC (1.75) JIC/USC (2.59) JIC/USC (2.72) yes no Cao et al. 2000

TIC

AT3G22380

JIC4W (5.31) JIC4W (1.87) JIC4W (3.27) yes no Ding et al. 2007

DCL2

AT3G03300

SDV (3.08) SDV (3.55) SDV (3.94) no yes Henderson et al. 2006

FCA

AT4G16280

±V(SD) (4.19) ±V(SD) (3.34) ±V(SD) (3.62) yes yes Macknight et al. 1997

[Atwell et al. 2010]

[Brachi et al. 2010]

[Zhao et al. 2007]

FRI

AT4G00650

FRI (14.78)

FLC (4.13)

FRI (12.34)

FLC (4.77)

FRI (9.43)

FLC (3.68)

JIC4W (4.23)

yes yes Johanson et al. 2000

Shindo et al. 2005

[Atwell et al. 2010]

[Zhao et al. 2007]

FLC

AT5G10140

SD/LD(V) (3.81) SD/LD(V)

(3.14)

SDV (3.59)

SD/LD(V) (4.34)

SDV (4.81)

yes yes Ratcliffe et al. 2001

[Atwell et al. 2010]

[Zhao et al. 2007]

Genes with common variants showing associations to flowering time

Page 17: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Summary• We presented a CR-GWAS strategy to systematically exploit

the collective biological information and analytical tools association analysis

• With the proposed strategy, we confirmed several well-known true positives and identified several new promising associations

• We identified AT3G03300 (DCL2) as a new target candidate in regulating flowering time in Arabidopsis

• We demonstrated that both common and rare variants contributed to the variation of flowering-time related traits

Page 18: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

• Genetic Architecture of Complex Traits– How many loci?

– What is the frequency of different alleles?

– Genomic location of these loci?

– Genetic effects (e.g., homozygous, heterozygous, epistatic and pleiotropic effects) of these loci?

– Gene x gene, gene x background, gene x environment interaction?

• Tabulation of previous QTL mapping results, information from QTL cloning experiments

• Tabulation of Trait-Associated SNPs (TASs) from NAM-GWAS– Which part of the genome should be prioritized?

II. Genomic Distribution of Trait-Associated SNPs (TAS)

Page 19: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

NAM

1

2

200

B73×

F1s

SSD

25 DL

B97

CM

L103

CM

L228

CM

L247

CM

L277

CM

L322

CM

L333

CM

L52

CM

L69

Hp30

1

Il14

H

Ki1

1

Ki3

Ky2

1

M1

62W

M3

7W

Mo

18W

MS

71

NC

350

NC

358

Oh

43

Oh

7B

P39

Tx3

03

Tzi8

An integrated mapping strategy, NAM

NAM

1

2

200

NAM

1

2

200

B73× B73×

F1s

SSD

F1s

SSD

25 DL

B97

CM

L103

CM

L228

CM

L247

CM

L277

CM

L322

CM

L333

CM

L52

CM

L69

Hp30

1

Il14

H

Ki1

1

Ki3

Ky2

1

M1

62W

M3

7W

Mo

18W

MS

71

NC

350

NC

358

Oh

43

Oh

7B

P39

Tx3

03

Tzi8

25 DL

B97

CM

L103

CM

L228

CM

L247

CM

L277

CM

L322

CM

L333

CM

L52

CM

L69

Hp30

1

Il14

H

Ki1

1

Ki3

Ky2

1

M1

62W

M3

7W

Mo

18W

MS

71

NC

350

NC

358

Oh

43

Oh

7B

P39

Tx3

03

Tzi8

An integrated mapping strategy, NAM

Yu et al., Genetics 138:539-551 (2008)

Buckler et al., Science 325:714-718 (2009)

McMullen et al., Science 325:737-740 (2009)

Tian et al. Nat Genet (2011)

Kump et al. Nat Genet (2011)

Page 20: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

High Resolution of NAM-GWAS

Page 21: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Two-stage, targeted dissection genome scan

Page 22: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Genic + Promoter 5kb = 13% of maize genome, account for 71% of TASs

Page 23: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Summary• Genic and non-genic TASs contribute approximately

equally to phenotypic variation for maize quantitative traits

• Genic+promoter region, while comprises only 13% of maize genome, account for 71% of identified TASs and explain 79% of PVE of all TASs– Evolutionary alterations in protein sequence appear to be

quantitatively less important than changes in gene regulation in shaping the wide natural variation observed in maize.

• Genotyping methods designed to discover SNPs in genes and their upstream regions would be the most economical approach for detecting genome-wide association signals– The combination of RNA-seq and exome capture experiments

using a long read (e.g., 454) and paired end (Illumina and 454) technologies

Page 24: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Tan1

tan1-a

tan1-b

tan1-a

tan1-b

Tan1

Tan1

Tan1

tan1-b

tan1-a

tan1-a

tan1-b

353 aa

318 aa

298 aa

Tannin1 Gene Cloning, Wu et al., 2012 PNAS (online first)

III. Multiple functional haplotypes at a single locus

Page 25: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

0

2

4

6

8

10

12

0 500 1000 1500 2000 2500

Promoter Coding

-Lo

g1

0(P

-va

lue

)Multi-allele test

All these SNPs may get you into the right gene

Tests of three functional

polymorphisms

Tannin1 Gene Cloning, Wu et al., 2012 PNAS (online first)

Tannin Sorghum

Non-tannin sorghum

Tannin Sorghum

Non-tannin sorghum

Non-tannin sorghum

Normal Bleached

Page 26: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Shattering1 Gene Cloning, Lin et al., 2012 Nature Genetics 44:720–724

SV 1 MSAQQIAPVPEHVCYVHCNFCNTILAVSVPSHSMLNIVTVRCGHCTSLLSVNLRGLLQSL 60

Tx430 1 MSAQQIAPVPEHVCYVHCNFCNTILAVSVPSHSMLNIVTVRCGHCTSLLSVNLRGLLQSL 60

SC265 1 MSAQQIAPVPEHVCYVHCNFCNTILAVSVPSHSMLNIVTVRCGHCTSLLSVNLRGLLQSL 60

Tx623 1 MSAQQIAPVPEHVCYVHCNFCNTILALQRRGNVFLQHITDLLRKRYEGLKQATQT----- 55

**************************. . * *

SV 61 PVQNHYSQENNFKVQNFSFTENYPEYAPSSSKYRMPTMLSAKGDLDHMLHVRAPEKRQRV 120

Tx430 61 PVQNHYSQENNFKVQNFSFTENYPEYAPSSSKYRMPTMLSAKGDLDHMLHVRAPEKRQRV 120

SC265 61 PVQNHYSQENNFKVQNFSFTENYPEYAPSSSKYRMPTMLSAKGDLDHMLHVRGKRYEGLK 120

Tx623 55 ------------------------------------------------------------ 55

Sv 121 PSAYNRFIKEEIRRIKASNPDISHREAFSTAAKNWAHFPNIHFGLGPYESSNKLDEAIGA 180

Tx430 121 PSAYNRFIKEEIRRIKASNPDISHREAFSTAAKNWAHFPNIHFGLGPYESSNKLDEAIGA 180

SC265 121 QATQT------------------------------------------------------- 125

Tx623 55 ------------------------------------------------------------ 55

Sv 181 TGHPQKVQDLY 191

Tx430 181 TGHPQKVQDLY 191

SC265 125 ----------- 125

Tx623 55 ----------- 55

SH1

Actin

576bpE1 E2 E3 E4 E5 E6

E1 E2 E3 E4 E5 E6

E1 E4 E5 E6

E1 E2 E3 E5 E6

SV

Tx430

Tx623

SC265

576bp

317bp

527bp

a

b

Page 27: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Phenotype

1

1

1

0

0

0

0

0

0

0

0

0

0

Multi-haplotype test

Shattering1 Gene Cloning, Lin et al., 2012 Nature Genetics 44:720–724

Page 28: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Overall Summary

• Traits are complex! Going genome-wide presents opportunities and challenges

• Many “biological” and “statistical” tools are available and it sure helps to use a combination of them– Geno→Pheno & Geno↔Pheno

• QTL → → TAS → → functional polymorphisms → → “haplotypes”– GWAS and GS

• Genic or non-genic region harbors functional polymorphisms?– Different functional polymorphisms/haplotypes for different allellic

series, complex versus less complex traits

• How do we validate the findings of TAS/QTN through GWAS, with small-moderate effects, for complex traits?– ZFN, TALEs?

Page 29: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Communication

Statisticians Biologists

These are the significant variables.

All we can do so far.

Why my favorite genes are not

tagged? No other choices?

These are the cut-off threshold

that was used.

What if we changed that a bit? Will

it make my genes significant?

These are the ones that are worth

follow-up studies.

I only have one postdoc/student to

chase after the strongest one.

We can look into other newly

developed methods.

Well, I will have “HapMap X” data ready

next week.

Statistical methods (generalized, less

assumptions, higher power)

Experimental design

Data collection (x, y, expression,

protein, independent experiment,

trangenic complementation, mutants)

Biological questions

Page 30: Opportunities and Challenges of Statistical Genetics in ...Zhu and Yu 2009, Genetics 182:875-888. Yu et al, 2006. A unified mixed-model method for association mapping that accounts

Acknowledgements

Guihua Bai (USDA-ARS)

MingLi Wang (USDA-ARS)

Gary Pederson (USDA-ARS)

Tom Clemente (UNL)

Harold Trick (KSU)

John Doebley (Wisconsin)

Ed Buckler (USDA-ARS)

James Holland (USDA-ARS)

Steve Kresovich (Cornell)

Mike McMullen (USD-ARS)

Zhiwu Zhang (Cornell)

Rex Bernardo (U of Minn.)

Ismail Dweikat (UNL)

Yiwei Jiang (Purdue)

Jianxin Ma (Purdue)

Min Zhang (Purdue)

Dabao Zhang (Purdue)

Weixing Song (KSU)

Hans-Peter Piepho (Hohenheim)

Frank White (KSU)

Clare Nelson (KSU)

Mitch Tuinstra (Purdue)

Mike Scanlon (Cornell)

Patrick Schnable (ISU)

Gary Muehlbauer (Cornell)

Marja Timmermans (CSHL)

Tesfaye Tesso (KSU)

Donghai Wang (KSU)

Scott Staggenborg (KSU)

Vara Prasad (KSU)

Ramasamy Perumal (KSU)

Scott Bean (USDA-ARS)

Lean Miller

Kallen Kershner

Chris Fontenot

Adeyanju Adedayo

Raymond Mutava