Quantitative genetics: past and present
Prem Narain
Received: 26 June 2009 / Accepted: 4 February 2010 / Published online: 18 February 2010
© Springer Science+Business Media B.V. 2010
Abstract Most characters of economic importance
in plants and animals, and complex diseases in
humans, exhibit quantitative variation, the genetics
of which has been a fascinating subject of study since
Mendel’s discovery of the laws of inheritance. The
classical genetic basis of continuous variation based
on the infinitesimal model of Fisher and mostly using
statistical methods has since undergone major modi-
fications. The advent of molecular markers and their
extensive mapping in several species has enabled
detection of genes of metric characters known as
quantitative trait loci (QTL). Modeling the high-
resolution mapping of QTL by association analysis at
the population level as well as at the family level has
indicated that incorporation of a haplotype of a pair of
single-nucleotide polymorphisms (SNPs) in the model
is statistically more powerful than a single marker
approach. High-throughput genotyping technology
coupled with micro-arrays has allowed expression of
thousands of genes with known positions in the genome
and has provided an intermediate step with mRNA
abundance as a sub-phenotype in the mapping of
genotype onto phenotype for quantitative traits. Such
gene expression profiling has been combined with
linkage analysis in what is known as eQTL mapping.
The first study of this kind was on budding yeast. The
associated genetic basis of protein abundance using
mass spectrometry has also been attempted in the same
population of yeast. A comparative picture of tran-
script vs. protein abundance levels indicates that
functionally important changes in the levels of the
former are not necessarily reflected in changes in the
levels of the latter. Genes and proteins must therefore
be considered simultaneously to unravel the complex
molecular circuitry that operates within a cell. One has
to take a global perspective on life processes instead of
individual components of the system. The network
approach connecting data on genes, transcripts, pro-
teins, metabolites etc. indicates the emergence of a
systems quantitative genetics. It seems that the inter-
play of the genotype-phenotype relationship for
quantitative variation is not only complex but also
requires a dialectical approach for its understanding in
which ‘parts’ and ‘whole’ evolve as a consequence of
their relationship and the relationship itself evolves.
Keywords Quantitative characters · Genetic basis · Molecular markers · Quantitative trait loci (QTL) · High-resolution mapping · Power of statistical
modeling · eQTL · mRNA abundance · Protein
abundance · Systems quantitative genetics · Dialectical approach
P. Narain
IASRI, New Delhi, India
Mol Breeding (2010) 26:135–143
DOI 10.1007/s11032-010-9406-4
Introduction
Most traits of economic importance in plants and
animals as well as disease traits in humans have an
underlying genetic basis involving several genes and
are subject to modification by environmental factors.
Statistical considerations have been predominant in
dissecting such complex traits into estimable com-
ponents. The heritability of a trait as the proportion of
phenotypic variation that is attributed to genetic
causes has been a prime indicator helpful in making
decisions for the genetic improvement of economic
traits. The prediction of response to artificial selection
based on intensity and accuracy of selection and the
existence of genetic variability has been successful
across several crop plants, livestock, poultry and
fisheries. However, the relationship between pheno-
type and genotype has been like a black box, where
an inferential approach has been the only way to look
into it. This scenario is now changing with the advent
of modern technologies of gene sequencing, micro-
array experiments and enormous advances in
attempts to understand gene and protein expression
within a cell of an organism. Information on molec-
ular markers has been extremely helpful in identify-
ing the regions on chromosomes that bring about
variation in the trait (quantitative trait loci; QTL),
thereby providing tools that can lead to much more
accurate selection procedures for genetic improve-
ment of economic traits. Saturated genetic maps of
markers, giving their order along a chromosome and
relative distances between them, have been devel-
oped. The map distance is based on the total number
of crossovers between the two markers, whereas the
physical distance between them is in terms of
nucleotide base pairs (bp). A centiMorgan (cM),
corresponding to a recombination frequency of about 1%, can span
10–1,000 kbp, and this span varies across species. The gene
transcript data from microarray experiments can be
integrated with molecular marker information to map
expression traits (eQTL) that can possibly lead to
causal networks. In this paper we discuss briefly some
of these developments and indicate how the evolution
of the quantitative genetics from the past to the
present is heading towards a systems quantitative
genetics.
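The prediction of response to selection mentioned above is commonly written as R = i · r · σ_A, where i is the selection intensity, r the accuracy of selection and σ_A the additive genetic standard deviation. The following is a minimal sketch, assuming mass selection on a single trait (so that r equals the square root of the heritability); the function names and the numerical values are illustrative and are not taken from the paper.

```python
import math

def heritability(var_additive, var_phenotypic):
    """Narrow-sense heritability h^2 = V_A / V_P."""
    return var_additive / var_phenotypic

def response_to_selection(intensity, accuracy, sd_additive):
    """Predicted response per generation, R = i * r * sigma_A."""
    return intensity * accuracy * sd_additive

# Illustrative values (assumed, not from the paper)
var_a, var_p = 4.0, 10.0
h2 = heritability(var_a, var_p)
i = 1.4                       # selection intensity, roughly the top 20% selected
r = math.sqrt(h2)             # accuracy of mass selection equals h
R = response_to_selection(i, r, math.sqrt(var_a))

# Equivalent form R = h^2 * S, with selection differential S = i * sigma_P
S = i * math.sqrt(var_p)
```

Under these assumptions the two forms agree, which is a convenient check when the accuracy term is replaced by that of another selection scheme.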
QTL mapping
Since the marker genotypes can be followed in their
inheritance through generations, they can serve as
molecular tags for following the QTL provided they
are tightly linked with the QTL. The first problem is
therefore to detect the marker–QTL linkage. Once
this is established, the next problem is to estimate the
QTL map position on the chromosome and estimate
the effect of allelic substitution. However, these
problems depend on whether we have data on
experimental populations obtained from controlled
crosses, as in plants and animals, or on natural
populations like humans where controlled crosses
cannot be made. It is, however, important to note that
the markers chosen for the QTL analysis should not
show any segregation distortion, as that may lead to
biased marker-trait association. Also, the phenotypic
data on the quantitative trait should follow a normal
distribution. One has therefore to verify these
assumptions for the data under consideration before
embarking on the QTL analysis.
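The check for segregation distortion can be sketched as a chi-square goodness-of-fit test of the observed marker genotype counts against their Mendelian expectation. The example below assumes a backcross marker with two genotype classes expected in a 1:1 ratio; the counts and the function names are hypothetical.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def segregation_distortion_test(observed, expected_ratio):
    """Chi-square goodness-of-fit test for two genotype classes (1 df)."""
    n = sum(observed)
    total_ratio = sum(expected_ratio)
    expected = [n * r / total_ratio for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_sf_1df(chi2)

# Hypothetical backcross marker: 130 heterozygotes and 107 homozygotes, expected 1:1
chi2, p = segregation_distortion_test([130, 107], [1, 1])
distorted = p < 0.05   # flag the marker if the 1:1 expectation is rejected
```

Markers flagged in this way would be examined or dropped before the QTL analysis. The normality of the phenotypic data needs a separate check, which is not shown here.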
The detection of marker–QTL linkage is based on
a statistical test of a null hypothesis (H0) against an
alternative hypothesis (H1). It is therefore subject to
two types of error. H0 postulates that there is no QTL
and hence no linkage exists between the marker and
the QTL. Rejecting it when it is true is a Type I error
which means that we detect marker–QTL linkage
when in fact no QTL is present. This is termed a false
positive and the probability of such a contingency (α)
is kept at 5% or less. On the other hand, if we
accept H0 when in fact a QTL is present, we commit a
Type II error. This means that our test misses the
QTL. As in any statistical test, the strategy is to
minimize the probability of committing a Type II
error (β) for a fixed value of α. The statistical power
for QTL detection is then (1 − β). In QTL studies, such
testing is done at several points or intervals where
markers are located on each of the several chromo-
somes across the genome. Such multiple testing poses
a challenging problem that is primarily statistical.
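The effect of multiple testing on the Type I error can be illustrated with the Bonferroni and Šidák corrections. This is only a sketch under the assumption of independent tests; markers on the same chromosome are correlated, so these corrections are conservative in practice. The function names are illustrative.

```python
def familywise_error(alpha_single, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha_single) ** n_tests

def bonferroni_threshold(alpha_genome, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha_genome / n_tests

def sidak_threshold(alpha_genome, n_tests):
    """Per-test significance level under the Sidak correction."""
    return 1.0 - (1.0 - alpha_genome) ** (1.0 / n_tests)

# Hypothetical genome scan with 50 independent intervals
n_tests = 50
uncorrected = familywise_error(0.05, n_tests)     # far above 5%
per_test_bonferroni = bonferroni_threshold(0.05, n_tests)
per_test_sidak = sidak_threshold(0.05, n_tests)
```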
The most common method of QTL mapping is that
of interval mapping. The whole chromosome is
divided into short intervals of about 20 cM each
and each interval is treated separately for QTL
detection and estimation. The maximum likelihood
method leading to LOD score statistics is used for
this purpose. A LOD score threshold T is chosen for
comparing with the observed value. An observed
value greater than T indicates significance. The LOD
score values obtained for each interval are plotted
against the chromosome position to give a Likelihood
Map. The maximum value of the significant LOD
scores provides a possible position of the QTL for the
given genomic region.
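The construction of a likelihood map can be sketched with a simplified single-marker LOD score. This is not interval mapping proper, which additionally infers the QTL genotype at positions between flanking markers; it assumes a backcross with two genotype classes, normally distributed errors with common variance, and hypothetical data.

```python
import math

def marker_lod(phenotypes, genotypes):
    """LOD score at one marker with genotype classes coded 0/1.

    With normal errors of common variance the likelihood ratio reduces to
    LOD = (n / 2) * log10(RSS0 / RSS1).
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss0 = sum((y - grand_mean) ** 2 for y in phenotypes)   # no-QTL model
    rss1 = 0.0                                              # one mean per genotype
    for g in (0, 1):
        group = [y for y, m in zip(phenotypes, genotypes) if m == g]
        if group:
            mean_g = sum(group) / len(group)
            rss1 += sum((y - mean_g) ** 2 for y in group)
    return (n / 2.0) * math.log10(rss0 / rss1)

def lod_profile(phenotypes, marker_columns):
    """LOD score at each marker, in map order."""
    return [marker_lod(phenotypes, column) for column in marker_columns]

# Hypothetical data: 8 backcross plants, 2 markers
phenotypes = [10.1, 9.8, 10.4, 12.0, 12.3, 11.9, 10.0, 12.2]
marker_a = [0, 0, 0, 1, 1, 1, 0, 1]
marker_b = [0, 1, 0, 1, 0, 1, 0, 1]
profile = lod_profile(phenotypes, [marker_a, marker_b])
threshold = 2.4
significant = [lod > threshold for lod in profile]
```

Plotting `profile` against map position gives the kind of likelihood map described above; the position of the largest significant value is the candidate QTL location.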
Although simple interval mapping (SIM) is the
most widely used method of QTL mapping and has proved
useful in several practical situations, it ignores
the fact that most quantitative traits are influenced by
numerous QTL. This is overcome either by adopting
a model of multiple QTL mapping (MQM) or by
combining SIM with the method of multiple linear
regression, a procedure known as composite interval
mapping (CIM). In all these methods, one uses the
approach of maximum likelihood which produces
only point estimates of the parameters such as the
number of QTL, their location, and effects. The
corresponding confidence intervals are required to be
determined separately by re-sampling methods. Fur-
ther, the correct number of QTL is difficult to
determine using traditional methods. Their incorrect
specification leads to distortion of the estimates of
locations and effects of QTL. To address these
problems a Bayesian approach is adopted wherein the
joint posterior distribution of all unknown parameters
given their prior distributions and the observed data
is computed. This is done using iterative simulation
procedures on high-speed computers.
The first application of interval mapping in plant
breeding was to an inter-specific backcross in
tomato. The parents for the backcross were the
domestic tomato Lycopersicon esculentum (E) with
fruit mass 65 g and a wild South American green-fruited
tomato L. chmielewskii (CL) with fruit mass
5 g. A total of 237 backcross plants were assayed for
continuously varying characters like fruit mass,
soluble-solids concentration and pH, and 63 RFLP
and 20 isozyme markers spaced at approximately
20 cM intervals were selected for QTL mapping. The
methods of maximum likelihood and LOD scores
were used through the software MAPMAKER-QTL
to implement the interval mapping. A threshold
T = 2.4, giving the probability of less than 5% that
even a single false positive will occur anywhere in
the genome, was used. This corresponds approxi-
mately to the significance level for any single test of
0.001. The resulting QTL likelihood maps revealed
multiple QTL for each trait (6 for fruit weight, 4 for
concentration of soluble solids and 5 for fruit pH) and
estimated their location to within 20–30 cM.
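The correspondence between the LOD threshold and the single-test significance level can be checked numerically. The sketch below assumes that the likelihood-ratio statistic, 2 ln(10) × LOD, follows a chi-square distribution with one degree of freedom (one QTL effect in a backcross); this distributional assumption is ours and is only approximate.

```python
import math

def lod_to_lrt(lod):
    """Likelihood-ratio statistic: LRT = 2 * ln(10) * LOD."""
    return 2.0 * math.log(10.0) * lod

def pointwise_p_value(lod):
    """Nominal p-value, assuming the LRT is chi-square with 1 df."""
    return math.erfc(math.sqrt(lod_to_lrt(lod) / 2.0))

p_single = pointwise_p_value(2.4)   # close to the 0.001 level quoted in the text
```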
In regard to fruit weight, the above type of
investigation was continued, with more and more
QTL for this trait being identified. In another study, at
least 28 QTL controlling the difference in fruit
weight between wild and cultivated tomato were
identified, one of them being fw2.2 on chromosome 2.
Using refined mapping studies, this QTL was local-
ized to a narrow chromosomal region of the order of
1/10,000 of the genome. Using a map-based
approach, fw2.2 was cloned and a 19-kb segment of
DNA containing it was sequenced. This made it
possible to identify a single gene, ORFX, responsible
for the QTL effect. When the wild version
of the gene was transformed into a cultivated tomato,
fruit weight in the transformed plants decreased by around
30%, as predicted, thus confirming that there are no
additional fruit weight QTL nearby on the chromosome.
In yet another experiment, the population
under study was derived from a cross between the
wild species L. pimpinellifolium with average
fruit weight of 1 g and the L. esculentum cultivar
Giant Heirloom with fruit weight in excess of
1,000 g. The same six major loci on chromosomes
1–3 and 11 as in the previous experiments were
identified, accounting for as much as 67% of the
phenotypic variation in fruit mass. The two most significant
QTL detected in this study are fw11.3 and fw2.1 on
chromosomes 11 and 2 respectively.
Linkage disequilibrium or association mapping
Association studies that involve linkage disequilib-
rium (LD) between markers and genes underlying
complex traits are being undertaken in different parts
of the world, but mostly in human genetics. The key
idea is that a disease mutation assumed to have arisen
once on the ancestral haplotype of a single chromo-
some in the past history of the population of interest
is passed on from generation to generation together
with markers at tightly linked loci resulting in LD.
The usual method adopted in human genetic studies
is that of case–control analysis wherein genotype or
allele frequencies of candidate genes are compared in
unrelated cases and controls. However, when the
population is composed of a recent admixture of
different ethnic groups that differ in marker allele
frequencies and disease frequencies, the method of
case–control comparison leads to spurious associa-
tion between the marker genotypes and the disease
traits. Family-based association methods such as the
transmission/disequilibrium test (TDT) can circum-
vent such problems.
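The TDT in its basic single-marker form compares how often heterozygous parents transmit each of the two marker alleles to affected offspring. A minimal sketch with hypothetical counts follows; the statistic is referred to a chi-square distribution with one degree of freedom.

```python
import math

def tdt(transmitted_m1, transmitted_m2):
    """TDT statistic from transmissions by heterozygous M1/M2 parents.

    transmitted_m1: number of times M1 was passed to an affected child
    transmitted_m2: number of times M2 was passed to an affected child
    """
    b, c = transmitted_m1, transmitted_m2
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))   # chi-square, 1 df
    return chi2, p_value

chi2, p = tdt(78, 46)            # hypothetical counts with excess transmission of M1
chi2_null, p_null = tdt(50, 50)  # equal transmission, no evidence of association
```

Because only transmissions within families are compared, population admixture does not by itself produce a significant result.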
Several studies on modeling the high-resolution
mapping of QTL by association analysis at the
population level as well as at the family level have
been conducted (Spielman et al. 1993; Luo et al.
1997; Luo et al. 2000; Fan et al. 2006 and several
others). Because of the difficulty in ascertaining the
phase of a haplotype consisting of several single-
nucleotide polymorphisms (SNPs), these models
considered marker genotypes at each locus sepa-
rately, thus losing information on their joint charac-
teristics. Narain (2007, 2009) therefore considered
the full genotypic model at a pair of flanking diallelic
SNPs, in the context of a family-based approach like
the TDT for testing the association in the presence of
LD. It led to a more powerful test when expressed in
terms of non-centrality parameters. This strategy for
high-resolution mapping of QTL by association
analysis was also investigated at the population level
and led to increased power of the corresponding tests.
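The LD between a pair of diallelic loci, on which these association tests rely, can be quantified from haplotype and allele frequencies. The sketch below computes the standard measures D, D' and r²; it is a generic illustration with hypothetical frequencies, not the full genotypic model of Narain (2007, 2009).

```python
def ld_measures(p_ab, p_a, p_b):
    """D, D' and r^2 for two diallelic loci (both assumed polymorphic).

    p_ab: frequency of the haplotype carrying alleles A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical frequencies
d, d_prime, r2 = ld_measures(p_ab=0.4, p_a=0.5, p_b=0.5)
```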
Joint linkage and LD mapping
While linkage mapping can readily detect chromo-
somal regions harboring QTL, it is difficult to locate
them precisely. Also, since this approach depends on
the cross between two true breeding parents, it
captures only a tiny fraction of the genetic diversity
in the population. Association mapping, on the other
hand, samples genetic diversity widely and
requires fewer individuals, but has less power to
detect QTL alleles that are not common. The advantages
of the two approaches can, however, be
combined by initially detecting QTL using linkage
mapping with a moderate number of markers fol-
lowed by a second stage of high-resolution associa-
tion mapping in QTL regions that capitalizes on a
high-density marker map.
The benefits of linkage and association mapping
have recently been combined in a single population
of maize by adopting a nested association mapping
(NAM) approach. The maize NAM population was
derived by crossing a common reference sequence
strain to 25 different maize lines. Individuals result-
ing from each of the 25 crosses were self-fertilized
for four further generations, to produce 5,000 NAM
recombinant inbred lines (RILs). This population was
first used for initial detection of QTL using a linkage
mapping approach. Subsequently, within each diverse
strain, high-resolution association mapping was
adopted with a high-density marker map. It is
significant to note that within each RIL all individuals
are genetically nearly identical. This means we can
estimate the true breeding value of each line much
more accurately by averaging the phenotypic mea-
surements of a given trait taken on several individuals
with the same genotype.
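The gain in accuracy from averaging over genetically identical individuals can be expressed through the heritability of a line mean, V_G / (V_G + V_E / n), where n is the number of individuals measured per line. The sketch below uses assumed variance components and an illustrative function name.

```python
def line_mean_heritability(var_genetic, var_error, n_replicates):
    """Heritability of the mean of n individuals of one line: V_G / (V_G + V_E / n)."""
    return var_genetic / (var_genetic + var_error / n_replicates)

# Assumed variance components: V_G = 1, V_E = 4
single_plant = line_mean_heritability(1.0, 4.0, 1)
mean_of_eight = line_mean_heritability(1.0, 4.0, 8)
```

As n grows, the environmental contribution to the line mean shrinks and the mean approaches the genotypic value of the line.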
In a recent experiment, the genetic architecture of
flowering time in Zea mays (maize) was dissected
using NAM. About 1 million plants were assayed in
eight environments to map the QTL. Between 29 and 56
QTL were found to affect flowering time. These were
small-effect QTL shared among the diverse families.
The analysis showed, surprisingly, the absence of any
single large-effect QTL. Moreover, no evidence of
epistasis or environmental interactions was found.
Flowering time controls adaptation of plants to
their local environment in the outcrossing species Zea
mays. A simple additive genetic model accurately
predicting the flowering time in this species is thus in
sharp contrast to what has been observed in several
plant species which practice self-fertilization.
Mapping of QTL for gene expression profile
(eQTL)
The advent of DNA chip technology in the form of
cDNA and oligonucleotide microarrays has provided
huge and complex datasets on gene expression
profiles of different cell lines from different organ-
isms. Such gene expression profiles have recently
been combined with linkage analysis based on QTL
mapping through molecular markers in what has been
termed ‘genetical genomics’ (Jansen and Nap 2001).
Gene expression levels for each individual of a
segregating population are phenotypes that are cor-
related with markers, genotyped for that individual,
to identify the QTL and their locations on the genome
to which the expression traits are linked. Such
expression quantitative trait loci (eQTL) studies are
similar to traditional multi-trait QTL studies but with
thousands of phenotypes. It is also important to note
that, underlying the gene expression differences,
there are two types of regulatory sequence variation.
One is cis-regulatory variation, which affects the expression of the gene itself,
and the other is trans-acting or protein-coding variation, which
affects the expression of other genes. The first study
in which transcript abundance was used to study the
linkage with the QTL was on budding yeast (Brem
et al. 2002) based on a cross between a laboratory
strain and a wild strain, the parents being haploid
derivatives. The heritability estimation was based on
haploid segregants and the linkage with a marker was
tested by partitioning the segregants into two groups
according to marker genotypes and comparing the
expression levels between the groups with the
Wilcoxon–Mann–Whitney test. They found eight
trans-acting loci, each affecting the expression of a
group of 7–94 genes of related function. Since then,
several eQTL studies have been published in species
like mice, maize, humans, rats and Arabidopsis
thaliana (Schadt et al. 2003; Lan et al. 2003; Morley
et al. 2004; DeCook et al. 2006). These have led to
some general principles of genetic mapping of
genome-wide gene expression as reviewed by Rock-
man and Kruglyak (2006).
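The marker test described for the yeast study, in which segregants are partitioned by marker genotype and the expression levels of the two groups are compared, can be sketched as follows. The implementation uses the normal approximation to the Wilcoxon–Mann–Whitney statistic without a tie correction to the variance, which is our simplification; the data and function names are hypothetical.

```python
import math

def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney(group_a, group_b):
    """Two-sided test via the normal approximation; returns (U for group_a, p-value)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2.0))

def eqtl_marker_test(expression, marker_genotypes):
    """Split segregants by marker genotype (0/1) and compare expression levels."""
    a = [e for e, g in zip(expression, marker_genotypes) if g == 0]
    b = [e for e, g in zip(expression, marker_genotypes) if g == 1]
    return mann_whitney(a, b)

# Hypothetical expression levels of one transcript in 8 haploid segregants
expression = [1.0, 1.2, 0.9, 1.1, 2.0, 2.3, 2.1, 1.9]
genotypes = [0, 0, 0, 0, 1, 1, 1, 1]
u, p = eqtl_marker_test(expression, genotypes)
```

In an eQTL study this test would be repeated for every transcript and marker combination, so the multiple-testing considerations discussed earlier apply with even greater force.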
Conducting experiments to identify QTL for
organismal phenotype (P) as well as for the corre-
sponding transcript phenotype (Ps) can indicate the
genetic relationship between them, as borne out by
the study of Lan et al. (2003) on type 2 diabetes in a
population of F2-ob/ob mice from a cross of two
mouse strains. The study included 8 mRNA traits (several Ps);
fasting plasma glucose, insulin and body mass measured at
8 and 10 weeks, giving the six physiological phenotypic
traits (several P) for diabetes; and known
genotypes of 192 microsatellite markers.
In addition, of course, each transcript had a
known position on the genome, as is true for any
microarray experiment. The clustering of the two
types of phenotypes together led to two groups of 4
each of the 8 mRNA traits due to their mutual
correlations, with one of the groups containing SCD1
transcript (Ps), showing strong association with the
insulin trait. eQTL mapping of the first principal
component of this group revealed two loci DMC1 and
DMC2 that were significantly associated with SCD1.
The region of the former, on chromosome 2, over-
lapped with the locus t2dm3 that was found to be
associated with fasting insulin levels (P), using
traditional QTL mapping. Similarly, the region of
the DMC2 gene, on chromosome 5, overlapped with a
locus associated with fasting glucose levels (P). Thus
SCD1 mRNA expression was shown to be linked to
the loci that are associated with type 2 diabetes using
both multi- as well as single-trait QTL mapping. This
study points out that the phenotypic correlation
between P and Ps is due to the genetic correlation
between the corresponding genotypes—the DNA
sequence variation—and the possible correlation
between their corresponding environmental compo-
nents. As we will see later, such data can develop into
causal networks.
QTL for protein levels in yeast
In each cell of an organism, most of the day-to-day
work in terms of metabolism and structure is
performed by proteins consisting of long polypeptide
chains of amino acids that are of 20 types. It is well
known that the function of a protein is coded in a 20-
letter-alphabet language of amino acids and the type
of amino acid is dictated by the genetic code that
consists of successive triplets of nucleotides along the
DNA. The relationship between DNA and proteins is
provided by the manner in which the 4-letter
language of DNA is translated into the 20-letter
language of protein. It is therefore expected that
functionally important changes in transcript levels
should be reflected in changes in the levels of the
corresponding proteins.
Proteome profiling based on mass spectrometry
has been used for quantitative measurement of
protein abundance to study the genetic basis of
protein level in a cross between two diverse strains of
the budding yeast, the two strains differing at 0.6% of
base pairs (Foss et al. 2007). The same cross was also
used earlier to understand the genetic basis of
transcript levels (Brem et al. 2002). This therefore
allowed the comparison of the genetics of protein and
transcript levels in the same population. Just as
transcript levels are compared across samples by
measurements of corresponding spot hybridization
intensities on micro-arrays, levels of peptides in an
output of a mass spectrometry experiment consisting
of a matrix of peaks, each of which represents a
peptide, are measured in terms of ion intensities after
appropriate alignment of the matrices.
Total proteins from eight independent logarithmic-
phase cultures of each parent and from two indepen-
dent cultures of each of 98 segregants were isolated,
digested with trypsin and analyzed by mass
spectrometry. Only the best peptide for a given
protein was selected. This led to 221 unique peptides
with high quality data and corresponded to 278
proteins. The genetic contribution to the observed
variability in protein abundance was estimated from a
subset of 156 of these proteins for which high-quality
data from the parent strains were also available. The
heritability of protein abundance was found to be
0.62. The comparison between genetic regulation of
proteins and that of the transcripts revealed more
differences than similarities; the average correlation
between them was found to be only 0.186. The
parental strains differed in both proteins as well as
transcripts to the extent of about 33%. However, only
43% of proteins that differed between the parents
corresponded to transcripts that were different
between the parents. Linkage analysis detected loci
for 156 of 278 transcripts (56%) compared to 85 of
221 peptides (38%). Most loci affected either peptide
abundance or transcript abundance but not both.
Since the genes for the linked traits are not physically located in the linked region,
the corresponding hotspots are trans-acting. Protein
linkages were found to be concentrated in fewer hotspots
than the transcript linkages. The overall
conclusion of this study was startling in that the loci
that influenced protein abundance differed from those
that influenced transcript levels, much against
expectations.
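A heritability estimate of the kind reported above can be sketched by comparing the variance among segregants with the variance among replicate cultures of the parents, the latter serving as an estimate of the non-genetic variance. Whether this matches the exact estimator used in the cited study is an assumption on our part; the data below are hypothetical.

```python
from statistics import variance

def pooled_within_variance(replicate_sets):
    """Pooled variance across replicate measurements of each parent."""
    numerator = sum((len(s) - 1) * variance(s) for s in replicate_sets)
    denominator = sum(len(s) - 1 for s in replicate_sets)
    return numerator / denominator

def heritability_from_cross(segregant_values, parent_replicate_sets):
    """(V_segregants - V_environment) / V_segregants, truncated at zero."""
    var_segregants = variance(segregant_values)
    var_environment = pooled_within_variance(parent_replicate_sets)
    return max(0.0, (var_segregants - var_environment) / var_segregants)

# Hypothetical abundance of one peptide
segregants = [1.0, 2.0, 3.0, 4.0, 5.0]
parents = [[1.0, 1.2, 0.8], [5.0, 5.2, 4.8]]
h2 = heritability_from_cross(segregants, parents)
```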
Systems quantitative genetics
The relationship between genotype and phenotype is
viewed by Rockman (2008) as a reverse engineering
process in which observations from segregating
populations on genes, transcript abundance, QTL
for transcript abundance and molecular markers are
used to infer causal networks to understand how the
system works as an integrated whole. Based on the
premise that genetic variation occurring naturally in a
population is a source of multi-factorial perturbation,
he reviews the recent literature to show how models
of probabilistic causal networks can be built up to
establish the genotype–phenotype map. In a way, the
review indicates the emergence of a systems quan-
titative genetics.
We illustrate the causal network with the help of a
hypothetical example given in Rockman (2008). The
existence of a large segregating population derived
from a cross between two pure breeding lines, such as
F2 and F3 progeny, recombinant inbred lines (RILs),
or backcross (BC) progeny etc. is assumed. Each
individual of the population is used for analysis by
micro-array profiling based on RNA, protein or
metabolites as well as molecular marker analysis.
We take the transcript abundances (Ps) corresponding
to 10 genes, denoted as 1–10, as sub-phenotypes of
the main phenotype such as a disease (P). For each
transcript we carry out a linkage analysis in which
each point along the genome is tested to find out
whether it affects the abundance of the transcript.
QTL are detected as peaks of statistical significance
that exceed a certain threshold as one walks along the
chromosome (Table 1). The correlation between genes
and transcript phenotypes is studied by plotting the
physical position of the gene corresponding to each
transcript against the position of the QTL associated
with variation in the transcript abundance (Fig. 1).
The data points aligning on the diagonal represent
local linkages typically due to cis-acting regulatory
polymorphisms whereas vertical alignments indicate
linkage hotspots—regions in the genome where
variation changes the abundance of a large number
of transcripts, a pleiotropic effect. Such plots lead to a
causal network in which QTL variation is the cause
of variation in the transcript abundance phenotypes
(Figs. 2 and 3). Transcript abundance (Ps) QTL can
co-localize with QTL for organismal phenotypes (P)
such as a disease. The reverse engineering process
lies in finding out whether Ps that is affected by QTL
Table 1 QTL analysis of 10 hypothetical transcripts
cDNA Marker
A B C D E F G
1 *
2 *
3 * *
4 * *
5 * *
6 * *
7 * *
8 *
9 *
10
* denotes a significant linkage, indicating the presence of a QTL in the
region
of P is a cause or effect of the disease. Several other
approaches like Bayesian networks, structural equa-
tion modeling, or a simple network derived from pair-
wise trait correlations could be adopted to study the
systems level understanding of the biological cause
and effect relationship. But each of them has its own
limitations. A complete reverse engineering of the
genotype–phenotype map, however, does not seem to
be feasible unless we can include all possible causal
variables in the network-inference methodology.
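The reading of the plot of gene position against QTL position can be sketched in code: linkages near the diagonal are labelled local, and QTL positions linked to many transcripts are reported as hotspots. The positions, the window size and the function names below are hypothetical and serve only to illustrate the logic; they do not reproduce Table 1.

```python
from collections import Counter

def classify_linkages(linkages, gene_positions, window):
    """Label each (transcript, qtl_position) pair as 'local' or 'distant'.

    A linkage is local when the QTL lies within `window` map units of the
    gene encoding the transcript (the diagonal of the plot).
    """
    labels = {}
    for transcript, qtl_position in linkages:
        distance = abs(qtl_position - gene_positions[transcript])
        labels[(transcript, qtl_position)] = "local" if distance <= window else "distant"
    return labels

def find_hotspots(linkages, min_transcripts):
    """QTL positions linked to at least `min_transcripts` transcripts."""
    counts = Counter(position for _, position in linkages)
    return [position for position, n in counts.items() if n >= min_transcripts]

# Hypothetical positions on a single genome-wide coordinate
gene_positions = {1: 10.0, 2: 25.0, 3: 40.0, 4: 55.0, 5: 70.0}
linkages = [(1, 10.0), (2, 40.0), (3, 40.0), (4, 40.0), (5, 70.0)]
labels = classify_linkages(linkages, gene_positions, window=5.0)
hotspots = find_hotspots(linkages, min_transcripts=3)
```

Such a classification is only a first step. Inferring the direction of causation between a transcript and the organismal phenotype requires the network methods discussed above.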
Discussion
There have been two major developments in recent
times that have changed the way we
look at the mapping of genotype onto phenotype
for quantitative characters. The first is the advent of
molecular markers, their extensive mapping in sev-
eral species and their incorporation in statistical
models as covariates. In addition to classical herita-
bility as the proportion of phenotypic variation in the
character that is due to additive effects of QTL, we
have now the proportion of additive genetic variation
that is associated with the markers. The larger this
proportion, the greater is our ability to detect QTL.
However, the regions to which the QTL are mapped
are usually large, of the order of 10–20 cM or even
greater, making candidate gene evaluation impossi-
ble. High-resolution mapping based on association
genetics must then be undertaken for which various
models have been developed, most of which consider
a single marker at a time thereby losing valuable
information due to linkage between them. For family-
based association methods like the TDT, Narain
(2007, 2009) developed the theory with haplotypes
instead of a single marker and proposed that one can
study the putative gene at any given location on the
Fig. 1 Plot of QTL positions against the physical positions of
the genes (horizontal axis: marker position along the genome, markers A–G; vertical axis: genes 1–10)
Fig. 2 Co-variation in transcript phenotype due to variation in
QTLs
Fig. 3 Causal network
chromosome by considering only a pair of markers
around it rather than the whole set of markers.
The second development is high-throughput geno-
typing technology which, coupled with micro-arrays,
has allowed expression of thousands of genes with
known positions in the genome and has provided an
intermediate step with mRNA abundance as a sub-
phenotype (Ps) in the mapping of genotype onto
phenotype for quantitative traits. Such gene expres-
sion profiling has been combined with linkage
analysis, termed eQTL mapping. Recently, the asso-
ciated genetic basis of protein abundance using mass
spectrometry has also been attempted. A comparative
picture of transcript vs. protein abundance levels in
the same population in the case of budding yeast,
however, indicates that functionally important
changes in the levels of the former are not necessarily
reflected in changes in the levels of the latter. It may
be worthwhile to discuss it from a conceptual angle.
As we know, the central dogma of molecular
biology stipulates that the sequence information flows
from DNA to RNA to protein but not in the reverse
direction. Rockman (2008) has also indicated that
many causal orderings in the network analysis are
prohibited by the central dogma, at least within an
individual, as phenotype does not feed back to affect
genotype, though between individuals phenotypes do
feed back by selection to shape genes. But Kimchi-Sarfaty
et al. (2007) reported data indicating that a
protein’s three-dimensional structure is not necessarily
determined by its amino acid sequence, which has
been specified by the DNA sequence. An mRNA, if
subjected to translational braking, can generate a
protein with a different structure than specified by the
DNA sequence. This has been termed the ‘transla-
tion-dependent folding’ (TDF) hypothesis (Newman
and Bhat 2007). Differential gene expression result-
ing in transcripts as sub-phenotypes could then lead
to different proteins and could give results similar to
those obtained in the yeast experiment. Genes and
proteins are therefore required to be considered
simultaneously to unravel the complex molecular
circuitry that operates within a cell. One has to take
a global perspective on the genotype–phenotype
relationship instead of looking at individual components like
DNA or proteins of a cellular system.
It seems the interplay of genotype–phenotype
relationship for quantitative variation is not only
complex but also needs a closer look at how we view
this relationship—whether purely at the DNA–RNA
level as in the reductionist approach or at the level of
the cell as a whole where DNA–RNA are just parts of
the cellular system, with other contextual forces
present in the micro-environments of the cell also
playing their own important roles. Such situations
have also been noticed in agricultural experimenta-
tion where a dialectical approach has been advocated
(Narain 2006). In the grain production process, it is
as important to study how this process affects
soil health and the ecosystem surrounding the plant
as it is to study the effect of the inputs on
production. In the dialectical approach, this relationship
between the plant and its environment is studied
both ways—input to output as well as output to
input—a sort of feedback. A similar possibility seems
to exist in the genotype–phenotype relationship
within a cell. The protein, as a phenotype, is
determined by the DNA sequence, as the genotype,
but the reverse phenomenon of protein affecting
DNA could also take place, albeit in violation of
the central dogma. In fact, studies are being
conducted to explore biochemical signaling pathways
that regulate the function of living cells through
regulatory networks with positive and negative
feedback loops (Ray 2008), though it is unclear how
genetics can be incorporated into them. These feedback
loops are basically cybernetic concepts that are
inherent in the dialectical approach. This approach
also takes into account the dynamics of the system
over time, in which development is a consequence of
opposing forces, an idea based on the concept of
contradiction inherent in the meaning of
dialectics. Things change because of the action of the
opposing forces on them, and things remain how they
are because of the temporary balance of opposing
forces. The opposing forces are seen as contradictory
in the sense that each taken separately would have
opposite effects, but their joint action may be
different from the result of either acting alone. These
forces are, however, part of self-regulation, and the
development of the object is regarded as a network of
positive and negative feedback loops, the incorporation
of which in the genetic context would violate the
central dogma. Genes, transcripts, proteins,
metabolites, physical components, etc. can be regarded as
‘parts’ of the cellular system, and the ‘whole’ is
regarded as a relation of these parts, which acquire
properties by virtue of being parts of a particular
whole. As soon as the parts acquire properties by
being together, they impart to the whole new
properties that are in turn reflected in changes in
the parts, and so on. Parts and whole therefore evolve
as a consequence of their relationship, and the
relationship itself evolves. Genes are fixed, but their
expression, the transcript, is not. At any given
moment, genes are expressed according to
the requirement of the cell and through the information
contained in the DNA. At this moment
the cellular system is said to be in a particular state.
At the next moment the same genes are
expressed, but differently, depending on the
requirement of the cell at that time and on the feedback, if
any, from the system’s state at the previous time
point, assuming that the process is Markovian. This
gives the next state of the system which might or
might not be different from the previous state. And so
the process goes on, continually modifying the
relationship between the different parts of the system
based on the interactions and feedbacks. It seems that a
dialectical approach could provide a clue to
understanding how ‘parts’ of a system and the
‘whole’ system behave in the genetic context. How to
model such a process, however, remains to be seen.
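The state-to-state description above can be illustrated with a toy calculation. The Python sketch below is not a model proposed in this paper. It merely assumes, for illustration, that a transcript level is raised by a constant synthesis term (standing in for the requirement of the cell) and lowered by a decay term proportional to the current level (the feedback), and that the next state depends on the current state alone (the Markovian assumption). The names next_state, simulate, synthesis and decay are hypothetical.

```python
from typing import List


def next_state(level: float, synthesis: float, decay: float) -> float:
    """Return the transcript level at time t+1 from the level at time t only.

    Hypothetical toy rule, not taken from the paper:
    - `synthesis` is a constant force raising the level,
    - `decay * level` is an opposing force that grows with the current level.
    """
    gain = synthesis
    loss = decay * level
    return level + gain - loss


def simulate(level0: float, synthesis: float, decay: float, steps: int) -> List[float]:
    """Iterate next_state; each state is computed from the previous state alone."""
    levels = [level0]
    for _ in range(steps):
        levels.append(next_state(levels[-1], synthesis, decay))
    return levels


# Each force acting alone has an opposite effect on the level.
synthesis_only = simulate(level0=0.0, synthesis=1.0, decay=0.0, steps=3)
decay_only = simulate(level0=4.0, synthesis=0.0, decay=0.5, steps=2)

# Acting jointly, the forces approach a balance at synthesis / decay.
trajectory = simulate(level0=0.0, synthesis=1.0, decay=0.5, steps=40)
balance = 1.0 / 0.5
```

Under these assumptions the level settles where the two forces balance, at synthesis / decay, whereas synthesis alone would raise it without limit and decay alone would drive it toward zero. Changing synthesis between time points would stand in for a change in the requirement of the cell and would move the system to a different state.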
References
Brem RB, Yvert G, Clinton R, Kruglyak L (2002) Genetic
dissection of transcriptional regulation in budding yeast.
Science 296:752–755
DeCook R, Lall S, Nettleton D, Howell SH (2006) Genetic
regulation of gene expression during shoot development
in Arabidopsis. Genetics 172:1155–1164
Fan RZ, Jung JS, Jin L (2006) High-resolution association
mapping of quantitative trait loci: a population-based
approach. Genetics 172:663–686
Foss EJ, Radulovic D, Shaffer SA, Ruderfer DM, Bedalov A,
Goodlett DR, Kruglyak L (2007) Genetic basis of prote-
ome variation in yeast. Nat Genet 39:1369–1375
Jansen RC, Nap J-P (2001) Genetical genomics: the added
value from segregation. Trends Genet 17(7):388–391
Kimchi-Sarfaty C, Oh JM, Kim IW, Sauna ZE, Calcagno AM,
Ambudkar SV, Gottesman MM (2007) A “silent”
polymorphism in the MDR1 gene changes substrate specificity.
Science 315:525–528
Lan H, Stoehr JP, Nadler ST, Schueler KL, Yandell BS, Attie
AD (2003) Dimension reduction for mapping mRNA
abundance as quantitative traits. Genetics 164:1607–1614
Luo ZW, Thompson R, Woolliams JA (1997) A population
genetics model of marker-assisted selection. Genetics
146:1173–1183
Luo ZW, Tao SH, Zeng Z-B (2000) Inferring linkage dis-
equilibrium between a polymorphic marker locus and a
trait locus in natural populations. Genetics 156:457–467
Morley M et al (2004) Genetic analysis of genome-wide var-
iation in human gene expression. Nature 430:743–747
Narain P (2006) Dialectical agriculture. Natl Acad Sci Lett
29:253–260
Narain P (2007) A theoretical treatment of interval mapping of
a disease gene using transmission disequilibrium tests. J
Biosci 32(7):1317–1324
Narain P (2009) Transmission Disequilibrium Test (TDT) for a
pair of linked marker loci. Comput Stat Data Anal
53(5):1883–1893
Newman SA, Bhat R (2007) Genes and proteins: dogmas in
decline. J Biosci 32(6):1041–1043
Ray LB (2008) Getting your loops straight. Science 322:389
Rockman MV (2008) Reverse engineering the genotype-phe-
notype map with natural genetic variation. Nature
456:738–744
Rockman MV, Kruglyak L (2006) Genetics of global gene
expression. Nat Rev Genet 7:862–872
Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo
V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS,
Mao M, Stoughton RB, Friend SH (2003) Genetics of
gene expression surveyed in maize, mouse and man.
Nature 422:297–302
Spielman RS, McGinnis RE, Ewens WJ (1993) Transmission
test for linkage disequilibrium: the insulin gene region
and insulin-dependent diabetes mellitus (IDDM). Am J
Hum Genet 52:506–516