Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Page 1: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Paul VanRaden and Melvin Tooker*
Animal Improvement Programs Laboratory [email protected]

Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Page 2: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

ADSA / ASAS annual meeting, July 2010 (2) Melvin Tooker

Introduction

Having more genetic markers can increase both reliability and cost of genomic selection.

Fewer markers can be used to trace chromosome segments within a population once identified by high-density haplotyping.

Combinations of marker densities can improve reliability at lower cost.

Page 3: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Introduction cont.

Accurate genomic evaluations will be less costly if many animals are genotyped at less than the highest density

Missing genotypes are imputed (filled) using genotypes or haplotypes of relatives or from matching allele patterns in the general population

Page 4: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

How does imputation work?

Identify haplotypes in population using many markers

Track haplotypes with fewer markers

e.g., use 5 SNP to track 25 SNP
• 5 SNP: 22020
• 25 SNP: 2022020002002002000202200
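The 5-SNP pattern in this example equals every fifth marker of the 25-SNP pattern, starting with the first. A minimal Python sketch of tracking a dense pattern with a sparse subset follows. It assumes a small library of dense patterns already identified from high-density genotypes. The second library entry and the names `DENSE_LIBRARY`, `LOW_DENSITY_POSITIONS`, `low_density_view`, and `impute` are hypothetical and do not come from the presentation; real imputation also uses pedigree information, which this lookup ignores.

```python
# Dense patterns identified from high-density genotypes.
# The first entry is the 25-SNP example from the slide; the second is made up.
DENSE_LIBRARY = [
    "2022020002002002000202200",
    "0200220200020020220000202",
]

# Assumption: the low-density subset is every fifth marker, starting at the first.
LOW_DENSITY_POSITIONS = range(0, 25, 5)


def low_density_view(dense: str) -> str:
    """Return the markers of a dense pattern that the low-density subset observes."""
    return "".join(dense[i] for i in LOW_DENSITY_POSITIONS)


def impute(sparse: str) -> list[str]:
    """Return every dense pattern in the library consistent with the sparse pattern."""
    return [d for d in DENSE_LIBRARY if low_density_view(d) == sparse]


print(impute("22020"))
```

With this library, `impute("22020")` returns only the slide's 25-SNP pattern. A sparse pattern that matches several library entries would stay ambiguous until relatives or neighboring segments resolve it.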

Page 5: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Imputed Dams

If progeny and sire both genotyped
• First progeny inherits 1 of dam’s 2 haplotypes
• Second progeny has 50:50 chance to get same or other haplotype
• Haplotypes known with 1, 2, 3, etc. progeny are ~50%, 75%, 87%, etc.

Dam genotype used if >90% known
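The 50%, 75%, 87% sequence follows from each additional progeny having a 50:50 chance of carrying the dam haplotype not yet observed, so the expected share known after n progeny is 1 - 0.5^n. A small Python sketch of that arithmetic, assuming independent inheritance at a given location and ignoring recombination (the function names are illustrative, not from the presentation):

```python
def expected_fraction_known(n_progeny: int) -> float:
    """Expected share of the dam's two haplotypes seen in at least one of n progeny."""
    return 1.0 - 0.5 ** n_progeny


def progeny_needed(threshold: float = 0.90) -> int:
    """Smallest number of progeny whose expected share exceeds the threshold."""
    n = 1
    while expected_fraction_known(n) <= threshold:
        n += 1
    return n


for n in range(1, 5):
    print(n, f"{expected_fraction_known(n):.1%}")
```

Under these assumptions, the expected share first exceeds the 90% cutoff at four genotyped progeny.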

Page 6: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Using Imputed Genotypes, July 2010

47,645 Holsteins
• 2,035 cows imputed

4,575 Jerseys
• 153 cows imputed

1,604 Brown Swiss
• 66 cows imputed

Page 7: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Marker Combinations Tested

Actual genotypes for 43,382 SNP combined with 3,209 SNP subset
• All 40,351 Holsteins with 43,382 or 3,209
• Or half of young animals with low density
• Or half of all animals with low density

Simulated genotypes for 500,000 SNP combined with 50,000 SNP subset
• All 33,414 Holsteins with 500,000 or 50,000
• Or 1,500 or 3,726 or 7,398 bulls with 500,000, remaining animals with 50,000
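For any of these mixed designs, a rough estimate of how much of the animal-by-marker table is unobserved before imputation is the share of animals on the low-density subset times the share of markers that subset lacks. The Python sketch below is an approximation under that assumption only. It ignores markers that fail genotyping on either chip, and the function name and parameters are illustrative, not from the presentation.

```python
def missing_before_imputation(n_low: int, n_total: int, snp_low: int, snp_high: int) -> float:
    """Approximate share of genotypes unobserved before imputation.

    n_low animals carry only snp_low of the snp_high markers; the remaining
    animals are assumed to carry all snp_high markers.
    """
    share_low_animals = n_low / n_total
    share_unobserved_markers = 1.0 - snp_low / snp_high
    return share_low_animals * share_unobserved_markers


# Half of all animals on the 3,209 SNP subset of 43,382 SNP.
print(f"{missing_before_imputation(1, 2, 3209, 43382):.1%}")

# 3,726 of 33,414 animals at 500,000 SNP, the rest at 50,000 SNP.
print(f"{missing_before_imputation(33414 - 3726, 33414, 50000, 500000):.1%}")
```

Only the ratio of animals matters in the first call, so it is passed as 1 of 2.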

Page 8: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Real Data Tests

Half of young animals assigned 3K
• Proven bulls, cows all had 43K
• Dams imputed using 43K and 3K

Half of ALL animals assigned 3K
• Could 3K reference animals help?
• 10,000 proven bulls yet to genotype
• Should cows with 3K be predictors?

Page 9: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Results from 3K, 43K Actual

Density           Single   Mixed               Single
Chips             3K       3K and 43K          43K
# 43K, N =        0        ½ All     ½ Young   40,351
Missing before    1%       47%       27%       1%
Missing after     .05%     3%        1%        .05%
REL               57       64        66        70
PA REL            36

Page 10: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Correlations² of 3K and PA with 43K
Half of YOUNG animals had 3K PTA, half 43K PTA

Consistent gains across traits
• Corr(3K,43K)² ranged from .90-.94
• Corr(PA,43K)² ranged from .42-.56
• Reliability gain from progeny with 3K was 79-87% of gain from 43K
  – Gain % = [Corr(3K,43K)² - Corr(PA,43K)²] / [1 - Corr(PA,43K)²]

Large benefits for smaller cost
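The gain formula on this slide can be written as a short function. The Python sketch below uses illustrative inputs of .92 and .50, which lie inside the reported ranges but are not values for any specific trait; the function and argument names are assumptions, not from the presentation.

```python
def gain_fraction(corr2_low_density: float, corr2_parent_average: float) -> float:
    """Share of the 43K gain over parent average that is recovered with 3K.

    Both arguments are squared correlations with the 43K evaluation.
    """
    return (corr2_low_density - corr2_parent_average) / (1.0 - corr2_parent_average)


# Illustrative values only: Corr(3K,43K)^2 = .92 and Corr(PA,43K)^2 = .50.
print(f"{gain_fraction(0.92, 0.50):.0%}")
```

With these inputs the function gives 84%, within the 79-87% range reported on the slide. The result depends on how the two correlations pair up within a trait, so the ends of the reported ranges cannot simply be combined.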

Page 11: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Using 3K as Reference Genotypes
Half of ALL animal NM$ were from 3K, half 43K

Gains in reliability as compared to genotyping all animals at 43K
• 90% for young animals with 43K
• 73% for young animals with 3K
• 36% for dams imputed with 3K and 43K progeny instead of all 43K

Can use 3K reference genotypes

Page 12: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Simulated 500K Tests

How many 500K genotypes needed?

Three subsets of mixed 500K and 50K:
• Of 33,414 HO, only 1,586 (young) had 500K
• Also bulls > 99% REL, total 3,726
• Also bulls > 90% REL, total 7,398

Linkage generated in base population
• Hopefully similar to actual linkage

Page 13: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Results from 500K Simulation

Density           Single   Mixed                       Single
Chips             50K      50K and 500K                500K
# 500K, N =       0        1,586    3,726    7,398     33,414
Missing before    1%       88%      80%      70%       1%
Missing after     .05%     5.3%     2.3%     1.5%      .05%
REL               82.6     83.4     83.6     83.7      84.0
PA REL            36
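One way to read the REL values in these results is to express each mixed design as a share of the full step from all-50K (82.6) to all-500K (84.0). The Python sketch below does that arithmetic on the reported reliabilities. The resulting shares are derived here and are not figures reported in the presentation; the names are illustrative.

```python
REL_50K_ONLY = 82.6   # all animals at 50K
REL_500K_ALL = 84.0   # all 33,414 animals at 500K

# Number of 500K genotypes in each mixed design -> reported REL.
REL_MIXED = {1586: 83.4, 3726: 83.6, 7398: 83.7}


def share_of_full_gain(rel_mixed: float) -> float:
    """Share of the 50K-to-500K reliability gain achieved by a mixed design."""
    return (rel_mixed - REL_50K_ONLY) / (REL_500K_ALL - REL_50K_ONLY)


for n_500k, rel in REL_MIXED.items():
    print(n_500k, f"{share_of_full_gain(rel):.0%}")
```

Because the reported reliabilities are rounded to one decimal and the full gain is only 1.4 points, these shares are coarse.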

Page 14: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

REL Using Only 3K, 50K, or 500K with increasing numbers of bulls

[Figure: Reliability (0 to 100) by Number of Bulls (PA only, 2,500, 10,000, 25,000, 100,000) for 3K, 50K, and 500K]

Page 15: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Conclusions

Genomic evaluations can mix different chip densities to save $

Only a few thousand of the highest-density genotypes are needed; other animals can be imputed

More animals can be genotyped to increase selection differential and size of reference population

Breeders must optimize chip choice

Page 16: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Better Communication is Needed

“Progeny genotypes should affect dam, but programs are not yet available” Jan 2009 USDA Changes Memo

“Programs are available to impute 1300 dams” Oct 2009 USDA report to Council

“Encourage USDA to use genotypes, derived by imputation, in genetic evaluation” Oct 2009 Holstein USA Board of Directors (in Holstein Pulse)

Page 17: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Better Communication is Needed (cont.)

“…new genetic calculations should not be published when using female DNA (which is the intellectual property of each respective breeder) unless approved by the Holstein Association and its board of directors.” Resolution #2 – Adopted by Holstein Association USA Delegates, 125th annual meeting, July 2010

Page 18: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Mixing Different Chips

Page 19: Gains in reliability from combining subsets of 500, 5,000, 50,000 or 500,000 genetic markers

Acknowledgments

Curt Van Tassell (BFGL) selected the 3,209 low density SNP

Bob Schnabel (U. Missouri), Jeff O’Connell (U. Maryland), and George Wiggans fixed map locations for several SNP