breeding designs for recombinant inbred advanced intercross lines · 2018. 8. 3. · 1070 m. v....

10
Copyright Ó 2008 by the Genetics Society of America DOI: 10.1534/genetics.107.083873 Breeding Designs for Recombinant Inbred Advanced Intercross Lines Matthew V. Rockman 1 and Leonid Kruglyak Lewis-Sigler Institute for Integrative Genomics and Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey 08544 Manuscript received October 27, 2007 Accepted for publication March 31, 2008 ABSTRACT Recombinant inbred lines derived from an advanced intercross, in which multiple generations of mating have increased the density of recombination breakpoints, are powerful tools for mapping the loci underlying complex traits. We investigated the effects of intercross breeding designs on the utility of such lines for mapping. The simplest design, random pair mating with each pair contributing exactly two offspring to the next generation, performed as well as the most extreme inbreeding avoidance scheme at expanding the genetic map, increasing fine-mapping resolution, and controlling genetic drift. Circular mating designs offer negligible advantages for controlling drift and exhibit greatly reduced map expansion. Random-mating designs with variance in offspring number are also poor at increasing mapping resolution. Given equal contributions of each parent to the next generation, the constraint of monogamy has no impact on the qualities of the final population of inbred lines. We find that the easiest crosses to perform are well suited to the task of generating populations of highly recombinant inbred lines. D ESPITE decades of effort, the molecular variants that underlie complex trait variation have largely eluded detection. Quantitative genetics has mapped these fundamental units of the genotype–phenotype re- lationship to broad chromosomal regions and quanti- tative trait loci (QTL), but dissecting QTL into their constituent quantitative trait nucleotides (QTNs) has been challenging. Historically, a major impediment to QTL mapping was the development of molecular markers, but as the time and cost of marker development and genotyping have declined, the bottleneck in QTN discovery has shifted to the mapping resolution permit- ted by a finite number of meiotic recombination events. Simple, practical methods for increasing mapping resolution will help break the QTN logjam. We in- vestigated aspects of inbred line cross designs to identify such simple, practical methods. Many techniques increase the number of informative meioses contributing to a mapping population derived from a cross between inbred strains. The simplest tech- nique is merely to increase the number of individuals sampled in an F 2 or backcross (BC) population. A more appealing approach is to increase the number of meioses per individual, which increases the mapping resolution without increasing the number of individuals to be phenotyped and genotyped and permits the separation of linked QTL. The production of recombinant inbred lines (RILs), derived from an F 2 population by gener- ations of sib mating or selfing (Bailey 1971; Soller and Beckman 1990), allows for the accumulation of re- combination breakpoints during the inbreeding phase, increasing their number by twofold in the case of selfing and fourfold in the case of sib mating (Haldane and Waddington 1931). The accumulation of breakpoints in RILs is limited by the fact that each generation of inbreeding makes the recombining chromosomes more similar to one an- other, so that meiosis ceases to generate new recom- binant haplotypes. As an alternative to RILs, Darvasi and Soller (1995) proposed randomly mating the F 2 progeny of a cross between inbred founders and using successive generations of random mating (RM) to promote the accumulation of recombination break- points in the resulting advanced intercross lines (AILs). Linkage disequilibrium decays each generation as sex does the work of producing new haplotypes. A trivial extension of the AIL design preserves the diversity in the intercross population by inbreeding to yield highly re- combinant inbred lines, as Darvasi and Soller (1995) noted. Such a design had previously been employed by Ebert et al. (1993). This recombinant inbred advanced intercross line design has obvious appeal in its union of the advantages of both AILs and RILs and has been employed in the production of mapping populations in several species (Ebert et al. 1993; Liu et al. 1996; Lee et al. 2002; Peirce et al. 2004). The inbred lines derived by inbreeding from an advanced intercross population have many names in the literature, including intermated recombinant in- 1 Corresponding author: 110 Carl Icahn Laboratory, Princeton University, Princeton, NJ 08544. E-mail: [email protected] Genetics 179: 1069–1078 ( June 2008)

Upload: others

Post on 18-Mar-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

Copyright � 2008 by the Genetics Society of AmericaDOI: 10.1534/genetics.107.083873

Breeding Designs for Recombinant Inbred Advanced Intercross Lines

Matthew V. Rockman1 and Leonid Kruglyak

Lewis-Sigler Institute for Integrative Genomics and Department of Ecology andEvolutionary Biology, Princeton University, Princeton, New Jersey 08544

Manuscript received October 27, 2007Accepted for publication March 31, 2008

ABSTRACT

Recombinant inbred lines derived from an advanced intercross, in which multiple generations ofmating have increased the density of recombination breakpoints, are powerful tools for mapping the lociunderlying complex traits. We investigated the effects of intercross breeding designs on the utility of suchlines for mapping. The simplest design, random pair mating with each pair contributing exactly twooffspring to the next generation, performed as well as the most extreme inbreeding avoidance scheme atexpanding the genetic map, increasing fine-mapping resolution, and controlling genetic drift. Circularmating designs offer negligible advantages for controlling drift and exhibit greatly reduced mapexpansion. Random-mating designs with variance in offspring number are also poor at increasingmapping resolution. Given equal contributions of each parent to the next generation, the constraint ofmonogamy has no impact on the qualities of the final population of inbred lines. We find that the easiestcrosses to perform are well suited to the task of generating populations of highly recombinant inbredlines.

DESPITE decades of effort, the molecular variantsthat underlie complex trait variation have largely

eluded detection. Quantitative genetics has mappedthese fundamental units of the genotype–phenotype re-lationship to broad chromosomal regions and quanti-tative trait loci (QTL), but dissecting QTL into theirconstituent quantitative trait nucleotides (QTNs) hasbeen challenging. Historically, a major impedimentto QTL mapping was the development of molecularmarkers, but as the time and cost of marker developmentand genotyping have declined, the bottleneck in QTNdiscovery has shifted to the mapping resolution permit-ted by a finite number of meiotic recombination events.Simple, practical methods for increasing mappingresolution will help break the QTN logjam. We in-vestigated aspects of inbred line cross designs to identifysuch simple, practical methods.

Many techniques increase the number of informativemeioses contributing to a mapping population derivedfrom a cross between inbred strains. The simplest tech-nique is merely to increase the number of individualssampled in an F2 or backcross (BC) population. A moreappealing approach is to increase the number of meiosesper individual, which increases the mapping resolutionwithout increasing the number of individuals to bephenotyped and genotyped and permits the separationof linked QTL. The production of recombinant inbredlines (RILs), derived from an F2 population by gener-

ations of sib mating or selfing (Bailey 1971; Soller andBeckman 1990), allows for the accumulation of re-combination breakpoints during the inbreeding phase,increasing their number by twofold in the case of selfingand fourfold in the case of sib mating (Haldane andWaddington 1931).

The accumulation of breakpoints in RILs is limited bythe fact that each generation of inbreeding makes therecombining chromosomes more similar to one an-other, so that meiosis ceases to generate new recom-binant haplotypes. As an alternative to RILs, Darvasi

and Soller (1995) proposed randomly mating the F2

progeny of a cross between inbred founders and usingsuccessive generations of random mating (RM) topromote the accumulation of recombination break-points in the resulting advanced intercross lines (AILs).Linkage disequilibrium decays each generation as sexdoes the work of producing new haplotypes. A trivialextension of the AIL design preserves the diversity in theintercross population by inbreeding to yield highly re-combinant inbred lines, as Darvasi and Soller (1995)noted. Such a design had previously been employed byEbert et al. (1993). This recombinant inbred advancedintercross line design has obvious appeal in its union ofthe advantages of both AILs and RILs and has beenemployed in the production of mapping populations inseveral species (Ebert et al. 1993; Liu et al. 1996; Lee

et al. 2002; Peirce et al. 2004).The inbred lines derived by inbreeding from an

advanced intercross population have many names inthe literature, including intermated recombinant in-

1Corresponding author: 110 Carl Icahn Laboratory, Princeton University,Princeton, NJ 08544. E-mail: [email protected]

Genetics 179: 1069–1078 ( June 2008)

Page 2: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

bred populations (IRIP; Liu et al. 1996) or intermatedrecombinant inbred lines (IRIL; Sharopova et al.2002). These names risk confusion with recombinantinbred intercross populations (Zou et al. 2005), whichare populations of heterozygotes derived by intermatingrecombinant inbred lines. IRIP also implies that thepopulation is inbred, although in fact it is the individuallines that are inbred. The term advanced recombinantinbreds (ARI) (Peirce et al. 2004) also fosters confusionwith RILs in an advanced stage of inbreeding. Wetherefore use the simplest abbreviation, combining theRILs with AILs: RIAILs, or recombinant inbred advancedintercross lines (Valdar et al. 2006).

RIAILs have become increasingly popular for QTLmapping, but one aspect of the approach seldom ad-dressed is the mating design best suited to the inter-crossing phase of RIAIL production. Analytical studieshave generally assumed random mating in an infinitepopulation. In practice, finite populations suffer rela-tive to the ideal case because of genetic drift and be-cause of the reduced effective recombination rate thatresults from mating among relatives (Darvasi andSoller 1995; Teuscher and Broman 2007). Moreover,the exact pattern of mating among relatives has con-sequences for the rate of loss of heterozygosity withinindividuals and populations, as worked out by SewallWright (1921). We investigated the practical conse-quences of mating pattern, including regular-matingschemes and variations on random mating, on the utilityof RIAILs for fine mapping.

In regular-mating schemes, each individual mateswith a specified relative each generation, so that thepedigree is independent of the generation. Selfing andsibmating are examples of regular mating in which thepopulation is subdivided into separate lines that do notcross with one another, but many regular-mating sys-tems involve intermating among an entire population.Wright (1921) described schemes for the ‘‘least possi-ble inbreeding’’ in finite populations, where all matesare necessarily related. In a population of size 4, forexample, the scheme calls for matings between individ-uals that are first cousins twice over; they share noparents, but necessarily share all grandparents. Matingsinvolve quadruple second cousins in populations of size8, octuple third cousins in populations of 16, and soforth.

Although Wright’s inbreeding avoidance (IA) designeliminates drift resulting from variance in parental con-tributions to subsequent generations, it increases thedrift due to segregation in heterozygotes because itmaximizes the frequency of heterozygotes. Alternativeregular-mating schemes that force matings among closerelatives increase inbreeding in the short term but, bydecreasing segregation drift, may reduce allele-frequencydrift at the population level. Kimura and Crow (1963)introduced two simple mating designs, circular mating(CM) and circular pair mating (CPM), whose reliance

on matings among relatives preserves more diversity in apopulation than Wright’s (1921) scheme. The unin-tuitive relationship between inbreeding and diversity isillustrated by the fact that optimal designs for maintain-ing diversity in a population are those that eliminateintermating and instead divide the population intoindependent highly inbred lineages by, for example,selfing (Robertson 1964). Among regular breedingsystems that do not subdivide the population, includingall systems that would be useful for RIAIL construction,the optimal design for retaining diversity is unknown(Boucher and Cotterman 1990).

In addition to the extreme forms of regular inbreed-ing defined by Wright’s (1921) inbreeding avoidancescheme and Kimura and Crow’s (1963) circular de-signs, variations on the theme of random mating mayalso alter the outcome of RIAIL production. Moreover,regular inbreeding designs may be difficult to carry outin practice. The formal designs require a truly constantpopulation size. The loss of individuals due to accidentor to segregating sterility, for example, would preventthe application of the scheme, unlike a random schemewhere losses merely shrink the population. One mod-ification of random mating is the imposition of monog-amy in which each mate has offspring with only a singlepartner. This is often a practical necessity, but it mightalso have consequences for the relatedness of mates insubsequent generations. A second possible modifica-tion of random mating is to equalize the contributionsof each individual to the next generation. Eliminatingthe variance in offspring number, by permitting exactlytwo offspring per parent, has the effect of doubling theeffective population size relative to the truly randomcase, which exhibits Poisson variance in offspringnumber (Crow and Kimura 1970).

METHODS

We simulated populations of recombinant inbredadvanced intercross autosomes generated under avariety of breeding designs. Additional variables werethe number of generations of intercrossing, whichranged from 2 (for RILs derived from an F2 population)to 20 (for RIAILs derived from an F20 population), andthe population size (N ), which ranged from 8 to 512 bydoublings. The final number of RIAILs derived from anintercross of size N is N/2, as each mating pair in thefinal generation of the intercross yields a single RIAIL.Motivated by an interest in Caenorhabditis elegans, weconsidered an androdioecious population with malesand self-fertile hermaphrodites. Hermaphrodites be-have solely as outcrossing females during the intercrossphase of our simulations, but self during the inbreedingphase. The model is directly applicable to any systemwith both outcrossing and selfing. Chromosomes werealso modeled to resemble those of C. elegans, with asingle obligate crossover event at each meiosis, yielding

1070 M. V. Rockman and L. Kruglyak

Page 3: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

one breakpoint or none in each gametic chromosome.This meiotic mechanism corresponds to 50-cM chro-mosomes with total interference. Each chromosomewas tracked as a string of 10,000 markers, all differingbetween the parental lines. To generate F2 individuals,we simulated meiosis in F1 individuals heterozygous forthe 10,000 markers and drew two gametes randomly. Wethen considered eight breeding designs, illustrated inFigure 1, as well as a backcross population, in which eachindividual was represented by a chromosome drawn atrandom from a simulated F2 genome. We used thebackcross rather than an F2 population as a baseline forour results because it has, like an inbred line popula-tion, only two genotypes per marker. Every cross designwas simulated 1000 times to yield distributions of eval-uation statistics.

Breeding designs: We confine our attention to breed-ing designs with discrete, nonoverlapping generationsand constant population size. We simulated the followingeight breeding designs.

1. RIL: Recombinant inbred lines by selfing. Each F2

individual capable of selfing (i.e., hermaphrodites,N/2 individuals) produced two gametes by indepen-dent meioses and these two chromosomes formedthe genotype of an F3 individual. We simulated 10generations of selfing and considered chromosomesfrom the final generation. This inbreeding phase isshared by all the RIAIL breeding designs below.

2. CM–RIAIL: Circular mating. F2 males and hermaph-rodites alternate around a circle, and each individualmates with each of its neighbors, according to

Figure 1.—Mating schemes. Each mating scheme is represented in the M-matrix format of Boucher and Nagylaki (1988) andas a three-generation pedigree. In the M matrix, each row represents a progeny of the parental generation represented by thecolumns. Entry mij is the number of gametes from individual j contributing to the zygote of individual i. Row totals are always 2. Inthe regular-mating schemes (left), the matrix is fixed from generation to generation, while in random schemes (right) it is arandom realization each generation. Some schemes have zero variance in offspring number (equal contributions), so that columntotals are also constrained to equal 2. In the pair mating schemes, each column must share all entries with another. In the ped-igrees, hermaphrodites are circles, males are squares, and the lines depict parent–offspring relationships. For the random-matingschemes, the M matrix corresponds to the relationships depicted in the first two generations of the pedigree.

Recombinant Inbred Advanced Intercross Lines 1071

Page 4: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

Kimura and Crow’s (1963) scheme. The F3 individ-uals assume their parents’ place in the circle and matewith their neighbors, which are their half-siblings.

3. CPM–RIAIL: Circular pair mating. Individuals arepaired and the pairs are arranged in a circle,according to Kimura and Crow’s (1963) scheme.Each pair produces one male and one hermaphro-dite, and the pairings in the next generation shift byone individual around the circle.

4. IA–RIAIL: Inbreeding avoidance. Wright’s (1921)design for ‘‘maximum possible avoidance of consan-guineous mating.’’ In a population of size 2m, withequal numbers of males and hermaphrodites, matesshare no ancestors in the previous m� 1 generations.

5. RM–RIAIL: Random mating. F3 individuals wereformed by randomly drawing independent gametesfrom the F2 population, one from a male and onefrom a hermaphrodite, and subsequent generationswere formed similarly.

6. RMEC–RIAIL: Random mating with equal contribu-tions of each parent to the next generation. That is,variance in offspring number is zero. For eachgeneration, males were formed by drawing gametessequentially from randomly ordered lists of malesand hermaphrodites in the previous generation. Thelists were then randomly reordered, and gametes toform hermaphrodites drawn in turn. Each parentalindividual thereby contributes gametes to a singlemale and a single hermaphrodite in the nextgeneration.

7. RPM–RIAIL: Random pair mating. F2 males andhermaphrodites were randomly paired and F3 indi-viduals drew gametes independently from a randomparental pair. Mating pairs were chosen randomly ineach generation. Pair mating is equivalent to a re-quirement for monogamy.

8. RPMEC–RIAIL: Random pair mating with equalcontributions. F2 individuals were paired randomly.Each pair then generated gametes to form one maleand one hermaphrodite for the F3 population.

Evaluation statistics: Map expansion: The accumula-tion of breakpoints during intercrossing or inbreedingresults in an increase in the recombination fractionbetween markers and consequently an apparent expan-sion of the genetic map (Haldane and Waddington

1931; Darvasi and Soller 1995; Teuscherand Broman

2007). Teuscheret al. (2005) provided a simple formulafor the expected map expansion in a RIAIL populationin the infinite population, random-mating case. Forlines derived by selfing for i generations, starting froman advanced intercross population after t generations ofintercrossing, the expected genetic map expands by afactor of t/2 1 (2i � 1)/2i, approaching t/2 1 1 withinbreeding to fixation. In finite populations, both allele-frequency drift and matings between relatives diminishthe map expansion.

We compared the observed map expansion for eachsimulation to that expected in the infinite population,random-mating case. Our simulations feature a maplength of 0.5 M and a final population size of N/2, so theexpected number of breakpoints after a single meiosis,N/4 (0.5 M 3 N/2 chromosomes), provides a simplerelation between the count of breakpoints and the mapsize. After t generations of intercrossing and 10 gen-erations of selfing, the expected number of breakpointsin the ideal case is (N/4)(t/2 1 ½210 � 1�/210). We mea-sured the observed number of chromosomal break-points present in the final population at the end of eachsimulation. Breakpoints were recognized as switches inthe parental origin of markers in sequence along eachchromosome. The ratio of observed-to-expected num-ber of breakpoints is our statistic for the approach ofeach simulation to the ideal map expansion.

Expected bin size: Map expansion is a useful guide tothe improved resolution achievable under differentcrossing designs. However, the cross design of a RIAILpopulation, unlike that of a BC, F2, or RIL population,partly uncouples mapping resolution from map ex-pansion. Because each recombination breakpointarising during the intercross phase of RIAIL construc-tion can be inherited by multiple descendant lines,replicated breakpoints can increase the map expan-sion without improving mapping resolution. Thelimiting resolution is determined by the spacing ofindependent breakpoints, which we can assess fromthe distribution of bin sizes (Vision et al. 2000). A binis an interval within which there are no recombinationbreakpoints in the sampled population and whoseends are defined by the presence of a breakpoint inany individual in the sample or by the end of thechromosome. A bin is thus the smallest interval towhich a Mendelian trait could be mapped in thepopulation on which the bins are defined. FollowingVision et al. (2000), we focus on the expectation of thesize of a QTL-containing bin. This expected bin sizeincorporates aspects of both the mean and thevariance of bin size. On a chromosome of length Lwith n bins with lengths li, a random marker (or QTL)falls in bin i with probability li/L. The expectation forbin size is the sum over the n bins of the product of theprobability that a marker falls within the bin (li/L)and the bin length, li. The expected bin size is thus thesum of the squares of the bin lengths (SSBL, Vision

et al. 2000) divided by the chromosome length L, 1/LSli

2, with the summation over the n bins. Thisexpected bin size is larger than the average bin sizebecause a randomly chosen marker is more likely tofall into a larger bin; this phenomenon is known aslength-biased sampling or the waiting-time paradox.In fact, standard results from renewal theory showthat the expected QTL-containing bin size can beapproximated by M 1 V/M, where M is the averagebin size and V is the variance of bin size, and the

1072 M. V. Rockman and L. Kruglyak

Page 5: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

approximation is increasingly accurate for a largernumber n of bins (Cooper et al. 1998).

We measured the bin lengths for each simulatedpopulation by first identifying each pair of markersbetween which a chromosome in the population ex-hibits a breakpoint and then counting the markersbetween pairs of breakpoints or between breakpointsand chromosome ends. The number of bins and theirsizes are constrained by the number of markers trackedin the simulations (recall that the 10,000 markers arespaced uniformly every 0.005 cM). We then convertedthe expected bin lengths from marker units intocentimorgans using the physical–genetic distance ratioof 200 markers/cM. We also found the largest bin foreach population of simulated chromosomes. This sta-tistic is highly correlated with the expected bin size(supplemental Table 1).

Allele-frequency drift: The power to detect a QTL isstrongly influenced by the allele frequencies at the QTL,and genetic drift during intercrossing permits thealleles to depart from the expected 0.5 frequency. Foreach simulated population, we found the minor allelefrequency of a single arbitrary marker and calculated itsabsolute departure from 0.5. We also calculated theallele frequency at each marker along the chromosomeand found the marker exhibiting the most extremedeparture from 0.5. We used the expected single-locusdeparture and expected extreme departure from 0.5 assummary statistics to assess the influence of cross designon drift.

RESULTS

Map expansion: At large population sizes, with $256individuals, random-mating designs and inbreedingavoidance all achieve close approximation to the infin-ite-population case, averaging .98% of the ideal mapexpansion even with 20 generations of intercrossing(Figure 2, supplemental Table 1), with small variance.The circular mating schemes exhibit much less mapexpansion, just 67% in the case of circular mating for 20generations and 83% for circular pair mating. The loss ofmap expansion in the circular designs is nearly inde-pendent of population size but highly sensitive to thenumber of generations of intercrossing.

At modest population sizes, inbreeding avoidanceexhibits slightly greater map expansion than the otherdesigns, and IA, RMEC, and RPMEC exhibit slightly lowervariance in map expansion than the random-mating (RMand RPM) designs. These effects are negligible, particu-larly given the large variances in map expansion for eachdesign at small population sizes. For N $ 64, all but thecircular designs achieve roughly the expected level ofmap expansion.

Expected bin size: Inbreeding avoidance and randommating with equal contributions, with or without mo-nogamy, are indistinguishable from one another and are

superior to the other designs at reducing expected binsize (Figure 3, supplemental Table 1). At large popula-tion sizes, circular pair mating is next best, althoughsignificantly worse, and random mating is the worstdesign for ,10 generations of intercrossing. After 10generations, circular mating is even worse than randommating. In the largest population that we considered,with 512 individuals, 15 generations of intercrossingwith circular mating yields the same expected bin size(0.153 cM) as just 10 generations of intercrossing withRPMEC, RMEC, and IA designs. At smaller populationsizes, random mating is uniformly the worst design forreducing expected bin size, and at very low populationsizes (,32), random-mating designs actually lose map-ping resolution with additional generations of inter-crossing because unique breakpoints are lost as a singlerecombinant chromosome drifts to fixation.

Allele-frequency drift: Naturally, population size hasa dominant effect on genetic drift, as illustrated by theexpected departure from allele-frequency equity in theBC or RIL populations, in which a single generation ofsegregation is its only cause (Figure 4, supplementalTable 1). In this case, the expected departure is 0.185 forN ¼ 8, but only 0.025 for N ¼ 512 (i.e., the expectedminor allele frequency at a QTL is 0.315 in the formercase and 0.475 in the latter). The BC and RIL expecteddepartures place a lower bound on the possible allele-frequency departure in a population of recombinantinbred advanced intercross lines.

Within each population size, the two random-matingdesigns with variance in offspring number are signifi-cantly worse than other designs, as expected; eliminatingvariance in family size doubles the effective populationsize with respect to genetic drift (Crow and Kimura

1970). Also in keeping with expectation, circular matingis better than the others at preventing drift. However, thedifferences among the designs are modest, particularly atlarger population sizes. In a population of 512, theexpected departure is 0.036 after 20 generations of CMintercrossing and 10 of selfing vs. 0.043 in the same crosswith RMEC intercrossing and 0.058 for RM. In each case,these departures are acceptably close to the baseline0.025. Only when the population drops to 128 does anydesign yield an expected departure of 0.1, and then onlyfor RM and RPM and only in the case of $15 generationsof intercrossing. For smaller populations, although CM issubstantially better than other designs, it is still inade-quate for preventing significant drift. In a population of64, the BC baseline in our simulations has an expecteddeparture of 0.072. After 10 generations of intercrossing,CM holds the departure down to 0.088 vs. 0.096 forRPMEC and 0.127 for RPM.

The dominant influence of population size is alsoseen in the extremes of the allele-frequency departures.We found the point on each chromosome with the mostextreme allele frequency in each simulation. This pointis the locus at which QTL detection power will be most

Recombinant Inbred Advanced Intercross Lines 1073

Page 6: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

compromised by genetic drift. In populations of 512,with #10 generations of intercrossing, the expectedextreme departure for all designs with equal parentalcontributions is ,0.1 (supplemental Table 1). That is,on average, no marker will have a minor allele frequency,0.4. For smaller populations, extreme allele frequen-cies like these are the norm; for N ¼ 256, the expectedextreme departure exceeds 0.1 for every design withintercrossing past F4; in such a population, the baselineexpected extreme departure, in a BC population, isalready 0.074. Circular mating retains its advantages inthe context of extreme departures, but again they aresmall; for a 10-generation intercross in a population of64, the expected extreme departure is 0.218 for CM,0.250 for RPMEC, 0.254 for IA, and 0.295 for RPM.

DISCUSSION

The three measures of RIAIL quality—map expan-sion, bin size reduction, and drift minimization—reflectdifferent aspects of the utility of a mapping population.Map expansion is important for separating closely linkedQTL, bin size reduction is critical for localizing QTL,and drift minimization is necessary to preserve detec-tion power. The crossing designs vary in their effectson each of these qualities, but some designs are clearlyworse and others better across the spectrum. Mostimportantly, our study suggests that one of the simplestand most practical designs—random pair mating withequal contributions of each parent to the next gener-ation—is an effective way to generate RIAILs for high-resolution mapping.

Figure 2.—The realized map expan-sion, relative to the infinite populationexpectation. Given a map size for theBC population, the expected map ex-pansion factor is 2 for RILs, and �t/21 1 for Ft RIAILs. Although populationsize has an obvious influence on howwell a cross approximates the infinitepopulation case and on the varianceamong realizations, the loss of map ex-pansion in the circular mating designsis the dominant pattern. The scale ofthe y-axis varies among the panels dueto the higher variances at smaller popu-lation sizes. The shading indicates therange of outcomes within 20% of the ex-pected expansion. The box plots indi-cate medians and interquartile ranges,with the whiskers encompassing allpoints within 1.5 times the interquartilerange. Outliers are not plotted.

1074 M. V. Rockman and L. Kruglyak

Page 7: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

The simulation results make clear that a clever crossdesign is no substitute for a large population. In anypopulation large enough to be useful for mapping(intercross N $ 128, yielding $64 RIAILs), the effectsof genetic drift are modest for all designs with equalparental contributions. Although circular designs areslightly better than the others at reducing drift, thedifferences are negligible and the magnitude of driftnot a serious impediment to mapping. In a panel of 64RIAILs, for example, the typical QTL will be present atan allelic ratio of 28:36 in F20 CM–RIAILs and 27:37 inF20 RPMEC–RIAILs. The importance of drift diminishesto irrelevance in larger intercross populations.

Differences among designs are more substantial formap expansion and expected bin size. Circular designsare dramatically worse than others at map expansion,and they join random-mating designs with variance inoffspring number in their inadequacy in reducing binsize. The best designs are the inbreeding avoidancescheme and the random-mating designs with equal con-tributions. Surprisingly, these designs are indistinguish-able in their efficacy for reducing bin size and expandingthe genetic map. Although random-mating designs in-clude some matings among close relatives, they alsoinclude matings among very distantly related individuals.Inbreeding avoidance, on the other hand, while avoiding

Figure 3.—Expected bin size in cen-timorgans. Inbreeding avoidance andrandom-mating schemes with equalcontributions are superior (producethe smallest bins) to the other designs.The scale of the y-axis varies amongthe panels. The box plots indicate me-dians and interquartile ranges, withthe whiskers encompassing all pointswithin 1.5 times the interquartile range.Outliers are not plotted.

Recombinant Inbred Advanced Intercross Lines 1075

Page 8: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

matings among close relatives, has a lower limit on thelevel of relatedness among mates, determined by pop-ulation size. Apparently, inbreeding avoidance excludesmatings among very close relatives at the cost of exclud-ing matings among very distant relatives.

Monogamy is a practical necessity for performingcrosses in many species, and although multiple matesper individual may seem intuitively to provide moreopportunities for effective recombination, monogamyincurs no cost in RIAIL construction for any of themeasures considered.

Although most analytical and simulation studies haveconsidered designs that we find undesirable, particu-larly random mating (Darvasi and Soller 1995;Teuscher et al. 2005) or circular mating (Valdar et al.2006), few experimentalists have employed these ap-

proaches (e.g., Ebert et al. 1993; Yu et al. 2006). Instead,the most common experimental design for AILs andRIAILs is an intuitively appealing blend of randommating with equal contributions and inbreeding avoid-ance (Liu et al. 1996; Lee et al. 2002; Peirce et al. 2004).However, many articles provide ambiguous accounts ofthe variance in offspring number. Our results suggestthat, given equal parental contributions, special effortto avoid matings between relatives are neither necessarynor harmful.

We have considered only a few of the possible designsfor RIAIL construction. Peirce et al. (2004) proposed ascheme for constructing RIAILs one strain at a time, amethod that removes the possibility that breakpointswill be inherited by multiple strains. The design com-pletely eliminates shared ancestry among mates, so that

Figure 4.—Allele-frequency drift, asmeasured by the departure from 50%frequency of an arbitrary marker. Theinfluence of population size swampsthe effects of breeding design. The scaleof the y-axis varies among the panels.The shading indicates the range of out-comes with a ,10 percentage point de-parture from allele-frequency balance.The box plots indicate medians and in-terquartile ranges, with the whiskers en-compassing all points within 1.5 timesthe interquartile range. Outliers arenot plotted. The blockiness of the plotsis due to the discrete nature of the pos-sible allele frequencies, particularly atlow population sizes.

1076 M. V. Rockman and L. Kruglyak

Page 9: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

each inbred strain derived from an Ft intercross wouldrequire a separate initial population of 2t�1. Theapproach requires a radically larger number of individ-uals than the approach of building RIAILs from apopulation, but the crosses can be done sequentiallyrather than simultaneously, eliminating the constraintsimposed by labor and space costs. Given the closeapproximation of large population RIAILs to theinfinite-population case, the slight benefits of the one-at-a-time scheme in terms of eliminating breakpointsshared by descent is unlikely to justify the exceptionaloutlay of resources required for each strain. Forexample, construction of 256 F10 population-derivedRIAILs requires 10 rounds of mating, each entailing256 crosses. The equivalent number of F10 one-at-a-timeRIAILs would require 256 rounds of mating entailing256 crosses each, plus 256 rounds of mating entailing128 crosses each, and so forth. The ultimate differenceis between 2560 crosses for population-derived RIAILsand 130,836 crosses for the one-at-a-time RIAILs.

An alternative approach is marker-assisted selection,which can be used to apply balancing selection at everymarker in the genome (Wang and Hill 2000). For afinite number of strains, this is even better than RIAILconstruction in a intercross population of infinite sizebecause heterozygosity in the final, finite RIAIL panelcan exceed the binomial sampling expectation. Ofcourse, marker-assisted selection is currently labor- andcost-intensive because it requires genotyping at eachgeneration. The added expense may be justified in thecase of long-lived or commercial species or species inwhich selection is likely to drive strong departures fromallele-frequency balance during intercrossing. An eco-nomical alternative to marker-assisted selection is se-lective phenotyping, in which a larger-than-necessarymapping population is culled according to genotype tomaximize the mapping potential of a given number ofindividuals (Vision et al. 2000; Jin et al. 2004; Jannink

2005; Xu et al. 2005). In this case, only a single gen-eration of genotyping is required.

Two concerns for any experimental RIAIL populationare mutation and selection, forces neglected in oursimulations. Although all cross designs that employmany generations of breeding, including RILs, provideopportunity for selection to alter allele frequencies andfor spontaneous mutations to arise, RIAILs are partic-ularly at risk. In recombinant inbred lines, any newmutation is confined to a single line and at worst inflatesthe nonadditive phenotypic variance. Because a muta-tion can be inherited by multiple lines in an RIAILpopulation, it can be actively misleading for efforts tomap segregating variants present in the parental strains.Each additional generation of intercrossing also per-mits the population to respond to selection, with thepotential to exacerbate allele-frequency skews due tosegregation distortion and unintentional directionalselection.

Breeding designs for RIAILs have different conse-quences for mutation and selection. Equal contributiondesigns conform to the middle-class neighborhoodmodel for mutation accumulation (Shabalina et al.1997), in which the absence of selection among familiespermits mutations that would otherwise be deleteriousto behave as though neutral. As a result, selection onstanding variation is minimized (only genotypes causingsterility or lethality are eliminated), but the potential forspontaneous mutations to spread is increased. Experi-mental results on the impact of mutation accumulationin middle-class neighborhood populations are equivo-cal but suggest that the effects are modest (Shabalina

et al. 1997; Keightley et al. 1998; Fernandez andCaballero 2001; Reed and Bryant 2001; Rodriguez-Ramilo et al. 2006). Designs with variance in family sizeavoid mutation accumulation at the expense of expo-sure to selection-driven allele-frequency skew. Becauseallele-frequency skews are extremely common even in F2

populations (segregation distortion), the loss of map-ping power due to selection is likely to be a more seriousconcern for RIAIL populations than mutation accumu-lation, and consequently the best equal contributiondesigns (IA, RMEC, and RPMEC) remain preferable tothe alternatives.

Our results suggest that the most effective designs forRIAIL construction are inbreeding avoidance andrandom mating with equal contributions of each parentto the next generation. These results are encouraging,as random pair mating with equal contributions is likelyto be the most practical and convenient design for mostorganisms.

We thank Daniel Promislow for drawing our attention to relevantliterature. M.V.R. is supported by a Jane Coffin Childs Fellowship andL.K. by the National Institutes of Health (grants R37 MH059520 andR01 HG004321) and a James S. McDonnell Foundation CentennialFellowship.

LITERATURE CITED

Bailey, D. W., 1971 Recombinant-inbred strains: an aid to findingidentity, linkage, and function of histocompatibility and othergenes. Transplantation 11: 325–327.

Boucher, W., and C. W. Cotterman, 1990 On the classification ofregular systems of inbreeding. J. Math. Biol. 28: 293–305.

Boucher, W., and T. Nagylaki, 1988 Regular systems of inbreed-ing. J. Math. Biol. 26: 121–142.

Cooper, R. B., S.-C. Niu and M. M. Srinivasan, 1998 Some reflec-tions on the renewal-theory paradox in queueing theory. J. Appl.Math. Stochastic Anal. 11: 355–368.

Crow, J. F., and M. Kimura, 1970 An Introduction to Population Genet-ics Theory. Harper & Row, New York.

Darvasi, A., and M. Soller, 1995 Advanced intercross lines, an ex-perimental population for fine genetic mapping. Genetics 141:1199–1207.

Ebert, R. H., II, V. A. Cherkasova, R. A. Dennis, J. H. Wu, S. Ruggles

et al., 1993 Longevity-determining genes in Caenorhabditis elegans:chromosomal mapping of multiple noninteractive loci. Genetics135: 1003–1010.

Fernandez, J., and A. Caballero, 2001 Accumulation of deleteri-ous mutations and equalization of parental contributions inthe conservation of genetic resources. Heredity 86: 480–488.

Recombinant Inbred Advanced Intercross Lines 1077

Page 10: Breeding Designs for Recombinant Inbred Advanced Intercross Lines · 2018. 8. 3. · 1070 M. V. Rockman and L. Kruglyak. one breakpoint or none in each gametic chromosome. This meiotic

Haldane, J. B. S., and C. H. Waddington, 1931 Inbreeding andlinkage. Genetics 16: 357–374.

Jannink, J.-L., 2005 Selective phenotyping to accurately map quan-titative trait loci. Crop Sci. 45: 901–908.

Jin, C., H. Lan, A. D. Attie, G. A. Churchill, D. Bulutuglo et al.,2004 Selective phenotyping for increased efficiency in geneticmapping studies. Genetics 168: 2285–2293.

Keightley, P. D., A. Caballero and A. Garcia-Dorado, 1998 Pop-ulation genetics: surviving under mutation pressure. Curr. Biol.8: R235–R237.

Kimura, M., and J. F. Crow, 1963 On the maximum avoidance ofinbreeding. Genet. Res. 4: 399–415.

Lee, M., N. Sharopova, W. D. Beavis, D. Grant, M. Katt et al.,2002 Expanding the genetic map of maize with the intermatedB73 3 Mo17 (IBM) population. Plant Mol. Biol. 48: 453–461.

Liu, S. C., S. P. Kowalski, T. H. Lan, K. A. Feldmann and A. H.Paterson, 1996 Genome-wide high-resolution mapping by re-current intermating using Arabidopsis thaliana as a model. Genet-ics 142: 247–258.

Peirce, J. L., L. Lu, J. Gu, L. M. Silver and R. W. Williams, 2004 Anew set of BXD recombinant inbred lines from advanced inter-cross populations in mice. BMC Genet. 5: 7.

Reed, D. H., and E. H. Bryant, 2001 Relative effects of mutationaccumulation versus inbreeding depression on fitness in experi-mental populations of the housefly. Zoo Biol. 20: 145–156.

Robertson, A., 1964 The effect of non-random mating within in-bred lines on the rate of inbreeding. Genet. Res. 5: 164–167.

Rodriguez-Ramilo, S. T., P. Moran and A. Caballero, 2006 Re-laxation of selection with equalization of parental contributionsin conservation programs: an experimental test with Drosophilamelanogaster. Genetics 172: 1043–1054.

Shabalina, S. A., L. Yampolsky and A. S. Kondrashov, 1997 Rapiddecline of fitness in panmictic populations of Drosophila mela-nogaster maintained under relaxed natural selection. Proc. Natl.Acad. Sci. USA 94: 13034–13039.

Sharopova, N., M. D. McMullen, L. Schultz, S. Schroeder, H.Sanchez-Villeda et al., 2002 Development and mapping ofSSR markers for maize. Plant Mol. Biol. 48: 463–481.

Soller, M., and J. Beckman, 1990 Marker-based mapping of quan-titative trait loci using replicated progenies. Theor. Appl. Genet.80: 205–208.

Teuscher, F., and K. W. Broman, 2007 Haplotype probabilities formultiple-strain recombinant inbred lines. Genetics 175: 1267–1274.

Teuscher, F., V. Guiard, P. E. Rudolph and G. A. Brockmann,2005 The map expansion obtained with recombinant inbredstrains and intermated recombinant inbred populations for fi-nite generation designs. Genetics 170: 875–879.

Valdar, W., J. Flint and R. Mott, 2006 Simulating the collabora-tive cross: power of quantitative trait loci detection and mappingresolution in large sets of recombinant inbred strains of mice.Genetics 172: 1783–1797.

Vision, T. J., D. G. Brown, D. B. Shmoys, R. T. Durrett and S. D.Tanksley, 2000 Selective mapping: a strategy for optimizingthe construction of high-density linkage maps. Genetics 155:407–420.

Wang, J., and W. G. Hill, 2000 Marker-assisted selection to increaseeffective population size by reducing Mendelian segregation var-iance. Genetics 154: 475–489.

Wright, S., 1921 Systems of mating. II. The effects of inbreed-ing on the genetic composition of a population. Genetics 6:124–143.

Xu, Z., F. Zou and T. J. Vision, 2005 Improving quantitative trait locimapping resolution in experimental crosses by the use of geno-typically selected samples. Genetics 170: 401–408.

Yu, X., K. Bauer, P. Wernhoff, D. Koczan, S. Moller et al.,2006 Fine mapping of collagen-induced arthritis quantitativetrait loci in an advanced intercross line. J. Immunol. 177:7042–7049.

Zou, F., J. A. Gelfond, D. C. Airey, L. Lu, K. F. Manly et al.,2005 Quantitative trait locus analysis using recombinant inbredintercrosses: theoretical and empirical considerations. Genetics170: 1299–1311.

Communicating editor: W. R. Doerge

1078 M. V. Rockman and L. Kruglyak