The Plant Journal (1998) 14(3), 273–284
More than 80 R2R3-MYB regulatory genes in the genome of Arabidopsis thaliana
I. Romero1, A. Fuertes1, M. J. Benito1, J. M. Malpica2,
A. Leyva1 and J. Paz-Ares1,*
1Centro Nacional de Biotecnología-CSIC, Campus de
Cantoblanco, 28049-Madrid, Spain, and 2Instituto Nacional de Investigaciones Agrarias, ctra. de
La Coruña, Km. 7,5, 28040-Madrid, Spain
Summary
Transcription factors belonging to the R2R3-MYB family
contain the related helix-turn-helix repeats R2 and R3. The
authors isolated partial cDNA and/or genomic clones of
78 R2R3-MYB genes from Arabidopsis thaliana and found
accessions corresponding to 31 Arabidopsis genes of
this class in databanks, seven of which were not
represented in the authors’ collection. Therefore, there
are at least 85, and probably more than 100, R2R3-MYB
genes present in the Arabidopsis thaliana genome,
representing the largest regulatory gene family currently
known in plants. In contrast, no more than three R2R3-
MYB genes have been reported in any organism from
other phyla. DNA-binding studies showed that there
are differences but also frequent overlaps in binding
specificity among plant R2R3-MYB proteins, in line with
the distinct but often related functions that are beginning
to be recognized for these proteins. This large
gene family may contribute to the regulatory flexibility
underlying the developmental and metabolic plasticity
displayed by plants.
Introduction
Transcription factors play a central role in the regulation
of developmental and metabolic programs. Despite the
large differences in these programs among
organisms from different eukaryotic phyla, their transcription
factors are quite conserved, and most of them can
be grouped into a few families according to the structural
features of the DNA-binding domain they contain. One
of these families is that of the R2R3-MYB proteins, whose
complexity in plants is addressed in this study.
Received 18 August 1997; revised 26 January 1998; accepted 28 January 1998.
*For correspondence (fax 133 41585 4506; e-mail [email protected]).
© 1998 Blackwell Science Ltd
The prototype of this family is the product of the
animal c-MYB proto-oncogene, whose DNA-binding
domain consists of three related helix-turn-helix motifs of
about 50 amino acid residues, the so-called R1, R2 and
R3 repeats. The repeat most proximal to the N-terminus
(R1) does not affect DNA-binding specificity and is
missing in oncogenic variants of c-MYB, such as v-MYB,
and in the known plant R2R3-MYB proteins (Graf, 1992;
Lipsick, 1996; Luscher and Eisenman, 1990; Martin and
Paz-Ares, 1997; Thompson and Ramsay, 1995). R2R3-
MYB proteins belong to the MYB superfamily, which
also includes proteins with two or three more distantly
related repeats (e.g. of the R1/2 type, the progenitor of
the R1 and R2 repeats), and proteins with one repeat,
either of the R1/2 type (Feldbrugge et al., 1997) or of the
R3 type (Bilaud et al., 1996; Kirik and Baumlein, 1996).
Genes of the MYB superfamily have been found in all
eukaryotic organisms in which their presence has been
investigated. However, the R2R3-type is not present in
Saccharomyces cerevisiae and only 1–3 copies of R2R3-
MYB genes per haploid genome have been described in
protists and animals (Graf, 1992; Lipsick,
1996; Luscher and Eisenman, 1990; Thompson and Ramsay,
1995). In contrast, preliminary evidence suggests that
plants contain a much larger number of these genes (Avila
et al., 1993; Jackson et al., 1991; Marocco et al., 1989;
Oppenheimer et al., 1991).
Little is known about the function of most plant R2R3-
MYB genes although, in those few cases in which functions
are known, these are different from those of their animal
counterparts, which are mostly associated with the control
of cell proliferation, prevention of apoptosis, and commit-
ment to development (Graf, 1992; Lipsick, 1996; Luscher
and Eisenman, 1990; Martin and Paz-Ares, 1997; Taylor
et al., 1996; Thompson and Ramsay, 1995; Toscani et al.,
1997). Thus, most members of the plant R2R3-MYB family
with known functions have been implicated in the regula-
tion of the synthesis of different phenylpropanoids (Cone
et al., 1993; Franken et al., 1994; Grotewold et al., 1994;
Moyano et al., 1996; Paz-Ares et al., 1987; Quattrocchio
et al., 1993; Quattrocchio, 1994; Sablowski et al., 1994;
Solano et al., 1995a). Phenylpropanoids are a large class
of chemically different metabolites originating from
phenylalanine, which includes flavonoids, coumarins and
cinnamyl alcohols among others (Hahlbrock and Scheel,
1989). Despite their chemical diversity, these compounds
are biosynthetically related, as their synthesis includes
common enzymatic steps. Other functions associated with
members of the plant R2R3-MYB gene family include
the control of cell differentiation (Noda et al., 1994;
Oppenheimer et al., 1991) and the mediation of responses
to signalling molecules such as salicylic acid and the
phytohormones abscisic acid (ABA) and gibberellic acid
(GA) (Gubler et al., 1995; Urao et al., 1993; Yang and
Klessig, 1996).
Sequence-specific DNA binding has been demonstrated
for several R2R3-MYB proteins, in agreement with their
role in transcriptional control (Biedenkapp et al., 1988;
Grotewold et al., 1994; Gubler et al., 1995; Howe and
Watson, 1991; Li and Parish, 1995; Moyano et al., 1996;
Sablowski et al., 1994; Sainz et al., 1997; Solano et al.,
1995a; Solano et al., 1997; Stober-Grasser et al., 1992; Urao
et al., 1993; Watson et al., 1993; Yang and Klessig, 1996).
The information available indicates that these proteins
bind to one or more of the following types of site: I,
CNGTTR; II, GKTWGTTR; and IIG, GKTWGGTR (where N
indicates A, G, C or T; K, G or T; R, A or G; W, A or T).
For instance, animal R2R3-MYB proteins recognize type I
sequences (Biedenkapp et al., 1988; Howe and Watson,
1991; Stober-Grasser et al., 1992; Watson et al., 1993), the
ZmMYBP (also known as P) protein binds to type IIG
sequences, the ZmMYBC1 (also known as C1) and
AmMYB305 proteins bind to both type II and type IIG, and
the PhMYB3 protein can bind to types I and II (Grotewold
et al., 1994; Sablowski et al., 1994; Sainz et al., 1997; Solano
et al., 1995a; Solano et al., 1997). Recent studies with
protein PhMYB3 from Petunia, including molecular
modelling based on the solved structure of the mouse c-
MYB protein (MmMYB), have highlighted the importance of
residues Lys67, Leu71, Lys121 and Asn122 in determining
recognition specificity (Ogata et al., 1994; Solano et al.,
1997). These residues are fully conserved in all known
plant R2R3-MYB proteins. In contrast, protein AtMYBCDC5,
which has two R1/2-type repeats and does not
conserve these residues, has a completely different
specificity (CTCAGCG; Hirayama and Shinozaki, 1996).
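The three site types defined above are written with degenerate (IUPAC) codes, so checking whether a given sequence contains one of them amounts to a pattern match. The following is a minimal sketch of that check, not part of the original study: the function names `to_regex` and `site_types` are hypothetical, only the forward strand is scanned, and the consensus strings are taken directly from the text.

```python
import re

# IUPAC codes used in the text: N = A/G/C/T, K = G/T, R = A/G, W = A/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "K": "[GT]", "R": "[AG]", "W": "[AT]",
}

# Consensus sites exactly as given in the text.
SITE_TYPES = {
    "I": "CNGTTR",
    "II": "GKTWGTTR",
    "IIG": "GKTWGGTR",
}


def to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def site_types(sequence):
    """Return the names of the site types found on the given strand (hypothetical helper)."""
    seq = sequence.upper()
    return [name for name, consensus in SITE_TYPES.items()
            if to_regex(consensus).search(seq)]


# Sequence reported in the Discussion as bound by AtMYBGl1.
print(site_types("AAAGTTAGTTA"))  # ['II']
```

A fuller treatment would also scan the reverse complement, since the proteins can bind either orientation of a double-stranded site; that step is omitted here to keep the sketch short.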
To evaluate the number of R2R3-MYB genes in plants,
and as a first step towards determining the full range of
functions associated with these genes using a reverse
genetic approach, we have carried out a PCR-based
systematic search for R2R3-MYB genes in the model
species Arabidopsis thaliana. We estimate that it contains
at least 85, and probably more than 100, R2R3-MYB genes,
representing the largest family of regulatory genes
described thus far in any plant species. In addition, we have
investigated the DNA-binding specificity of representative
R2R3-MYB proteins and have shown that there may be
differences but also considerable similarities in binding
specificity between R2R3-MYB proteins, particularly
among members of the same phylogenetic group, which is
in agreement with the recognizable functional relationships
between the members of the R2R3-MYB family.
Results
Isolation of R2R3-MYB clones
Figure 1. Consensus amino acid sequence of the two repeats comprising
the DNA binding domain of plant R2R3-MYB proteins as described by
Avila et al. (1993), and oligonucleotide mixtures used in the isolation of
the R2R3-MYB genes (N1–6 and C1–3).
Upper case indicates residues fully conserved in all proteins used to
derive the consensus. Lower case indicates residues identical in at least
80% of the proteins. Other symbols are: +, basic amino acid; –, acidic
amino acid; #, hydrophobic amino acid. New sequences (published since
this alignment, see Figure 2) have not altered this consensus sequence
in the regions from which the oligonucleotide sequences were derived,
with the exception of PhMYBAn2 which has a D/A substitution in the
region corresponding to oligonucleotide mixtures C1–C3, although they
have increased the variability of residues in variable positions. This
variability was taken into account in the design of the oligonucleotide
mixtures (R = A + G, Y = C + T, S = G + C, D = A + G + T, N =
A + G + C + T) and so the oligonucleotide mixtures should have
recognized all the more recent additions to the R2R3-MYB gene family.
All known plant R2R3-MYB proteins contain highly conserved
stretches of amino acid residues within the
recognition helices of the R2 and R3 repeats from which
R2R3-MYB-specific mixtures of oligonucleotides can be
derived (Avila et al., 1993; Figure 1). These oligonucleotide
mixtures do not recognize the AtMYBCDC5 gene encoding
a MYB protein with two highly divergent repeats of the
R1/2-type (Hirayama and Shinozaki, 1996; Lipsick, 1996).
To search for R2R3-MYB genes, we first prepared cDNA
and genomic DNA libraries (of 1000 and 3000 clones,
respectively) enriched in these genes using PCR with
R2R3-MYB-specific oligonucleotides. Sequencing of all the
different clones present in each of these libraries (for
details, see Experimental procedures), revealed that 36 and
74 different R2R3-MYB genes were represented in the
cDNA and genomic DNA libraries, respectively, and that
32 were represented in both libraries. A total of 78 different
R2R3-MYB genes were therefore represented in our collec-
tion. A computer search revealed that there were 31 R2R3-
MYB genes from Arabidopsis described in databanks, of
which seven were not represented in the set of 78 isolated
in this study. There are, therefore, at least 85 (78 + 7), and
probably more than 100 (78 × 31/24, see Experimental
procedures) R2R3-MYB genes in the Arabidopsis thaliana
genome.
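The counts in this paragraph can be retraced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and reading the expression 78 × 31/24 as a two-sample capture-recapture (Lincoln-Petersen type) estimate is an assumption, since the Experimental procedures that justify it are not reproduced here. The value 24 is the number of databank genes also present in the isolated set (31 minus the 7 that were missing).

```python
def union_size(n_a, n_b, n_both):
    """Number of distinct items found in at least one of two overlapping sets."""
    return n_a + n_b - n_both


def two_sample_estimate(n_first, n_second, n_both):
    """Population size estimate of the form n1 * n2 / overlap (assumed reading of 78 x 31/24)."""
    return n_first * n_second / n_both


cdna, genomic, in_both_libraries = 36, 74, 32
databank, databank_not_isolated = 31, 7

isolated = union_size(cdna, genomic, in_both_libraries)   # genes in the collection
minimum = isolated + databank_not_isolated                # lower bound on family size
estimate = two_sample_estimate(isolated, databank,
                               databank - databank_not_isolated)

print(isolated, minimum, estimate)  # 78 85 100.75
```

The estimate rests on the usual assumption of such calculations, namely that the two samples (the PCR-derived collection and the databank entries) are drawn independently from the same pool of genes.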
More than half of the R2R3-MYB genes identified in this
study were characterized only at the genomic DNA level,
raising the possibility that many of these R2R3-MYB
genomic sequences might represent pseudogenes rather
than active genes. However, in no case was the reading
frame of the exonic sequences (represented in the genomic
clones) prematurely terminated. In addition, the number
of fully conserved residues in plant R2R3-MYB proteins is
the same independently of whether those protein
sequences from the R2R3-MYB genes characterized only
at the genomic DNA level are considered in the estimation.
On the other hand, pseudogenes usually show higher
rates of non-synonymous substitutions (Kns) relative to
synonymous substitutions (Ks) than active genes (Satta,
1993). We calculated the Kns/Ks ratio for all possible pairs
of R2R3-MYB genes in this population and these ratios
were compared to those in the population of R2R3-MYB
genes known to be expressed (i.e. those for which a cDNA
clone was available), using the method of Nei and Gojobori
(1986). The Kns/Ks values in the two populations (Kns/Ks
in genomic DNA population: 0.393 ± 0.016; Kns/Ks in cDNA
population: 0.392 ± 0.115) were not significantly different
in a t-test (P = 0.83 ≫ 0.10). Collectively, these data are in
agreement with the conclusion that most, if not all, plant
R2R3-MYB sequences represent active genes.
Phylogenetic analysis of R2R3-MYB proteins
A phylogram of R2R3-MYB proteins was constructed with
the neighbor-joining method (Saitou and Nei, 1987) using
the sequences of the proteins in Figure 2 (except HvMYB33,
LeMYB1, AtMYB67, AtMYB41 and AtMYB45; Figure 3).
Three major groups were distinguished in the phylogram,
A, B and C (Figure 3). The bootstrap support for the node
corresponding to group C was not very high (30%), perhaps
due to the short size of the sequences used. However,
when the analysis was made using the whole R2R3-MYB
domain from the proteins for which this sequence was
available, the bootstrap support of this node was more
than 75% (see Figure 3). In addition, the existence of the
three groups was also supported by the tree constructed
using parsimony (Eck and Dayhoff, 1966) (not shown) and
by the different intron/exon structure of the genes encoding
the proteins of each group, with the exception of AtMYB67
(see Figure 3). Group A (accounting for about 10% of the
A. thaliana proteins), which also includes the animal and
protist R2R3-MYB proteins, represents genes with no intron
in the region sequenced, with the exception of AtMYB1
which has an intron at position 1. Group B (5% of the A.
thaliana proteins) represents proteins encoded by genes
with an intron at position 3. Finally, group C (85% of A.
thaliana proteins) contains genes with an intron at position
2. As shown below (see Discussion), this classification is
also in agreement with the data on DNA-binding specificity
of R2R3-MYB proteins, as similarities in this property were
usually higher between proteins belonging to the same
group than between proteins belonging to different groups.
Each group, particularly group C, can be further subdivided
into subgroups of more closely related members. Many of
these subgroups contain R2R3-MYB proteins from other
plant species (although the search for this type of MYB
genes in these species has not been exhaustive), consistent
with the high functional similarity of regulatory systems
among plants (Benfey and Chua, 1989).
DNA-binding specificity of representative R2R3-MYB
proteins
To evaluate the degree of similarity in DNA binding
specificity between different Arabidopsis R2R3-MYB
proteins, we isolated cDNA clones containing the entire
coding region of four representative R2R3-MYB proteins,
AtMYB15, AtMYB77, AtMYB84 and AtMYBGl1 (see
Methods). Full length and deletion derivatives of these
proteins were produced by in vitro transcription and
translation. To determine their DNA-binding specificity,
an EMSA (electrophoretic mobility shift assay)-based
random-site selection procedure was used (Blackwell and
Weintraub, 1990; Solano et al., 1995a). Selection experi-
ments were performed with two oligonucleotide mixtures,
OI and OII, which had a partially random core sequence
representing the three types of sites defined for R2R3-MYB
proteins: OI, type I; OII, types II and IIG (Biedenkapp et al.,
1988; Grotewold et al., 1994; Gubler et al., 1995; Howe and
Watson, 1991; Li and Parish, 1995; Moyano et al., 1996;
Sablowski et al., 1994; Sainz et al., 1997; Solano et al.,
1995a; Solano et al., 1997; Stober-Grasser et al., 1992; Urao
et al., 1993; Watson et al., 1993; Yang and Klessig, 1996;
Figure 4; see Introduction). In fact, the nucleotides (or their
counterparts in the complementary strand) present in the
non-randomized positions (–2, +1 and +3) are contacted
by residues fully conserved in all plant R2R3-MYB proteins
(Leu71, Lys121 and Asn122, respectively, in PhMYB3; the
G in the complementary strand of position –2 in type I
targets is contacted by another fully conserved residue,
Lys67) (Solano et al., 1997).
AtMYB15 and AtMYB84 bound the partially randomized
oligonucleotide mixture OII and, to a lesser extent, the OI
oligonucleotide mixture, and the reciprocal was true with
a carboxy-terminal deletion derivative of AtMYB77
(AtMYB77∆C1) which bound better to OI (data not shown).
AtMYB77∆C1 was used because the full size protein had
lower binding affinity, as is the case with other R2R3-MYB
proteins (PhMYB3 and MmMYB) (Ramsay et al., 1992;
Solano et al., 1995a). In contrast, neither AtMYBGl1 nor its
carboxy-terminal deletion derivatives showed detectable
binding to either of these oligonucleotide mixtures (not
shown). A similar result was obtained with an increased
amount of probe and/or a decreased amount of non-
specific competitor DNA, independently of the type of
probe used, the partially randomized oligonucleotide
mixtures OI and OII, or a fully randomized mixture (O, data
not shown). Protein phosphatase treatments, which have
been shown to increase binding affinity of one R2R3-MYB
protein (Moyano et al., 1996), were also ineffective (not
shown). Collectively, these data suggest limited in vitro
DNA-binding affinity for this protein. It is possible that low
DNA-binding affinity is an intrinsic property of AtMYBGl1
and that it might be increased in vivo after interaction(s)
with other protein(s). For example, there is evidence that
maize C1 protein (ZmMYBC1), which also shows low bind-
ing affinity in vitro (Sainz et al., 1997), requires an inter-
action with a second protein (the MYC protein R, Goff
et al., 1992) to activate flavonoid biosynthetic genes. A
similar interaction is possibly necessary for the activity of
AtMYBGl1 in vivo (Lloyd et al., 1992).
After four cycles of enrichment, oligonucleotides selected
by the R2R3-MYB proteins were cloned and sequenced. In
all instances, despite using two target oligonucleotide
mixtures, only one type of sequence was recovered for
each protein, indicating strong preference for one of the
types of sequences (Figure 4). For instance, in the case of
protein AtMYB77∆C1, which preferred type I sequences,
the sequences selected from oligonucleotide OII were also
of type I (generated in variable positions of OII, not shown)
and the reciprocal was true for proteins AtMYB15 and
AtMYB84 (not shown). These results argue against a bias
in the binding site selection experiments due to the use of
partially degenerate oligonucleotide mixtures, although
this possibility cannot be fully excluded.
Next, we used oligonucleotides representing the defined
optimal target sites and mutants of these sites in binding
experiments with each of the above Arabidopsis proteins
and with carboxy-terminal deletion derivatives of PhMYB3
(PhMYB3∆C1), AmMYB305 (AmMYB305∆C1) and MmMYB
(MmMYB∆C2R1; Solano et al., 1997) as controls (Figure 5a).
The results of these experiments agreed with those from
site selection experiments, but revealed that AtMYB77∆C1
also recognised certain type II sequences, although with
reduced affinity compared to that for type I sequences. In
addition, they also showed specific DNA binding affinity
for AtMYBGl1, as it could weakly bind to oligonucleotide
II-1. In an apparent discrepancy with binding site selection
experiments, protein AtMYB77∆C1 bound better to the
oligonucleotide containing one of the optimal binding sites
of PhMYB3 (MBSI, oligonucleotide I-1; Solano et al., 1995a)
than to that containing its deduced optimal binding
sequence (oligonucleotide I-2). Discrepancies between a
sequence derived from binding site selection and the optimal
binding site have also been reported for MADS box proteins
(Riechmann et al., 1996). A difference between the two
oligonucleotides (I-1 and I-2) is that I-1 is flanked by three
extra As, which would increase its ability to bend, a
property known to greatly influence binding by DNA-
distorting/bending proteins, such as R2R3-MYB proteins
and MADS proteins (Parvin et al., 1995; Riechmann et al.,
1996; Solano et al., 1995b; Thanos and Maniatis, 1992). To
test whether this difference could be the cause of the
preference of AtMYB77∆C1 for oligonucleotide I-1 versus
I-2, DNA binding experiments were conducted with new
oligonucleotides in which the three extra As of oligonucleo-
tide I-1 had been removed. The binding by AtMYB77∆C1
to this deletion version of I-1 (I-1∆) was similar to that
obtained for the oligonucleotide derived from binding site
selection experiments (Figure 5b). This result underscored
the importance of DNA conformational properties in binding
by transcriptional factors.
Figure 2. Deduced amino acid sequences of Arabidopsis R2R3-MYB proteins.
For comparison, the sequences of R2R3-MYB proteins from other plant species and from representative organisms of other phyla are also given. The
region shown is that flanked by the sequences used to derive the oligonucleotide mixtures shown in Figure 1. The clones corresponding to AtMYB41
and to AtMYB45 did not encode the carboxy-terminal part of their sequence due to mispriming events. For protein (and gene) names, a standardized
nomenclature has been used (Martin and Paz-Ares, 1997) whereby the name of each protein includes a two-letter prefix as species identifier, the term
MYB, and then a term describing the particular family member. The codes for the species identifier are: Am, Antirrhinum majus; At, Arabidopsis
thaliana; Cp, Craterostigma plantagineum; Dd, Dictyostelium discoideum; Dm, Drosophila melanogaster; Gh, Gossypium hirsutum; Hv, Hordeum vulgare;
Le, Lycopersicon esculentum; Mm, Mus musculus; Nt, Nicotiana tabacum; Os, Oryza sativa; Ph, Petunia hybrida; Pm, Picea mariana; Pp, Physcomitrella
patens; Ps, Pisum sativum; Xl, Xenopus laevis; Zm, Zea mays. As family member identifier we have always used a number except where the previously
given name was based on functional information, such as the phenotype of mutants (e.g. the Gl1 (Glabrous1) protein from Arabidopsis is named
AtMYBGl1). Thus, all the genes identified in this study have been given a standardized number independent of whether a different non-standardized
name has been given by other authors. This has occurred in the following cases: AtMYB13, also named AtMYBlfgn (accession number Z50869);
AtMYB15, also named Y19 (X90384); AtMYB16, also named AtMIXTA (X99809); AtMYB23, also named AtMYBrtf (Z68158); AtMYB31, also named Y13
(X90387); AtMYB44, also named AtMYBR1 (Z54136); AtMYB77, also named AtMYBR2 (Z54137). In addition, the following R2R3-MYB genes, which were
not identified in this study, were renamed (with the agreement of the authors who first described them): AtMYB101 (M1); AtMYB102 (M4). AtMYB90
is described in the EMBL databank as an anonymous EST (H76020). The column on the right of the amino acid sequence gives the accession number
from which the sequences were derived. The accession numbers of the cDNAs encoding the full-size proteins AtMYB15, AtMYB77 and AtMYB84 are
Y14207, Y14208 and Y14209, respectively. In case of PhMYBAn2, the sequence was copied directly from Quattrocchio (1994). The second column shows
the position of the intron interrupting that part of coding sequence represented in the figure: –, unknown; 0, no intron; the localization of introns 1, 2
and 3 is shown relative to the consensus sequence. The third column shows the type of clone isolated in this study: a, cDNA clone; b, genomic clone.
Other letters in this column indicate that the sequence shown in the figure was previously described in databanks or published (c) or that only part
of the sequence shown was previously described (d). Two additional sequences (accession numbers H36793 and T42245), each corresponding to a
novel Arabidopsis R2R3-MYB gene, were found in the EST databank, but are not represented in the figure because they were incomplete. These
sequences were, however, used for the estimation of the size of the R2R3-MYB gene family. Asterisks indicate proteins for which the sequence of the
whole R2R3-MYB domain is known. Symbols in the consensus sequence are as in Figure 1.
Discussion
Genes of the R2R3-MYB family are quite widespread in
eukaryotes, with the exception of yeast, and in plants the
number of these genes is especially high. Whereas no
more than three R2R3-MYB genes have been described in
any organism from other eukaryotic phyla, here we isolated
partial cDNA and/or genomic clones corresponding
to 78 different R2R3-MYB genes from Arabidopsis and
estimated that there are probably more than 100 R2R3-
MYB genes in this species. The different size of regulatory
gene families in different groups of eukaryotes, a situation
which is not exclusive to R2R3-MYB genes (for instance,
see the case of MADS box proteins; Theissen et al., 1996),
might reflect major differences in developmental and meta-
bolic programs generated during evolution of these groups,
which largely involved a different use of pre-existing regu-
latory systems rather than the generation of new systems
(Martin and Paz-Ares, 1997).
According to recent estimates on the number of genes
in Arabidopsis (16 000–43 000; Gibson and Somerville,
1993), members of the R2R3-MYB family would
represent at least 0.2–0.6% of the total Arabidopsis genes,
the largest proportion of genes thus far assigned to a
single regulatory gene family (and even to a gene family
encoding any type of protein) in plants. In other types of
eukaryotes there are families of equal, or even larger, size;
for instance, it is estimated that genes encoding zinc-finger
proteins represent about 1% of the human genes (Hoovers
et al., 1992) and, in Caenorhabditis elegans, about 0.4% of
its genes contain homeoboxes (Burglin, 1995). However,
in these families overall sequence conservation is very low
and variability in DNA-binding specificity is high (Klug
and Schwabe, 1995; Treisman et al., 1992). In contrast,
members of the plant R2R3-MYB family share higher amino
acid sequence similarity, particularly in their recogni-
tion helices (Figure 1) and display considerable DNA-
recognition similarities (Figures 3 and 5).
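The proportion of the genome quoted above can be checked directly. The sketch below is one reading that reproduces the stated 0.2–0.6% range, and the pairing is an assumption rather than something the text spells out: the lower figure takes the minimum family size (85) against the largest gene-number estimate (43 000), and the upper figure takes about 100 genes against the smallest estimate (16 000). The function name is hypothetical.

```python
def percent_of_genome(family_size, total_genes):
    """Share of all genes that belong to the family, as a percentage."""
    return 100.0 * family_size / total_genes


# Assumed pairing of family size and gene-number estimate (see lead-in).
low = percent_of_genome(85, 43_000)
high = percent_of_genome(100, 16_000)

print(f"{low:.2f}% to {high:.2f}%")
```

With the minimum of 85 genes and the smallest genome estimate the share would be about 0.53%, so the upper end of the quoted range depends on taking the family to contain roughly 100 members.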
These similarities in recognition specificity are par-
ticularly noticeable between members of the same
phylogenetic group, although in some cases overlaps in
binding specificity between members belonging to differ-
ent groups have been observed (Figures 3 and 5). Thus,
in the cases studied here or elsewhere (Biedenkapp et al.,
1988; Grotewold et al., 1994; Gubler et al., 1995; Howe
and Watson, 1991; Li and Parish, 1995; Moyano et al.,
1996; Sablowski et al., 1994; Sainz et al., 1997; Solano
et al., 1995a, 1997; Stober-Grasser et al., 1992; Urao et al.,
1993; Watson et al., 1993; Yang and Klessig, 1996)
members from group A (including both those from plants
and from organisms from other phyla) prefer (or bind
to) a type I sequence, members of group B bind equally
well to both type I and type II, and most members of
group C prefer (or bind to) a type IIG. Possible exceptions
are the group C proteins AtMYB2, reported to bind
type I sequences (Urao et al., 1993), and GLABROUS1
(AtMYBGl1), which only bound to a type II sequence
(Figure 5) although, in the first case, binding to IIG
sequences was not studied and, in the second case,
binding site selection experiments failed to provide
information on its optimal binding site (see Results).
However, it is striking that the only sequence bound by
AtMYBGl1 (AAAGTTAGTTA) perfectly conforms to the
sequence of gibberellic acid responsive elements, and
gibberellic acid is known to affect the AtMYBGl1-
controlled trait trichome formation (Oppenheimer et al.,
1991; Telfer et al., 1997).
In line with these similarities in binding specificity, and
despite the fact that target selectivity is usually also
influenced by interactions with other factors, most of the
R2R3-MYB proteins studied so far, which are scattered
throughout groups B and C, have been implicated in the
control of phenylpropanoid biosynthetic genes (Cone
et al., 1993; Franken et al., 1994; Grotewold et al., 1994;
Moyano et al., 1996; Paz-Ares et al., 1987; Quattrocchio
et al., 1993, 1994; Sablowski et al., 1994; Solano et al.,
1995a; Figure 3). Nevertheless, there are some R2R3-MYB
proteins that have been implicated in other functions,
including the control of cell differentiation and the
mediation of plant responses to several signal molecules
(Gubler et al., 1995; Noda et al., 1994; Oppenheimer et al.,
1991; Urao et al., 1993; Yang and Klessig, 1996).
Figure 3. Phylogenetic tree of the R2R3-MYB family using the neighbor-joining method (Saitou and Nei, 1987).
The phylogram shown was constructed with the sequences given in Figure 2, except HvMYB33, LeMYB1, AtMYB67, AtMYB41 and AtMYB45. The first
two were excluded because they were the only ones out of the 57 known complete-MYB-domain sequences which grouped differently (with bootstrap
support > 50%) depending on whether the complete MYB domains or the portion characterized in this study was used in the calculations. Protein
AtMYB67 was the only one which was not grouped with the other proteins encoded by genes with the same intron/exon structure. Proteins AtMYB41
and AtMYB45 were not used because only partial sequence data were available, although their probable position in the phylogram, inferred from a
tree constructed also using their incomplete sequences (not shown), is indicated in the tree with dashed lines. Exclusion of these five proteins increased
the bootstrap support of the major nodes (not shown). Names of R2R3-MYB proteins from non-plant species are shown in red. The three major nodes,
A, B and C, are denoted. Numbers (0, 1, 2 or 3) in some branches indicate the type of intron in the cloned portion of the genes encoding proteins
originating from the respective branch, as far as the genes for which this information is available are concerned (Figure 2). Nodes with high bootstrap
support are indicated (empty symbols, bootstraps > 50%; filled symbols, bootstraps > 75%). Circles refer to bootstraps data corresponding to the
represented tree. Squares refer to bootstraps data corresponding to the tree constructed with the sequence of the whole MYB domain of the proteins
for which this information was available (Figure 2). The known functions associated with some plant R2R3-MYB proteins are indicated: Ph, regulation
of phenylpropanoid biosynthetic genes (proteins ZmMYBC1, ZmMYBPl, ZmMYBP, ZmMYB38, ZmMYB1, AmMYB305, AmMYB340, PhMYBAn2, PhMYB3;
Cone et al., 1993; Franken et al., 1994; Grotewold et al., 1994; Moyano et al., 1996; Paz-Ares et al., 1987; Quattrocchio et al., 1993; Quattrocchio, 1994;
Sablowski et al., 1994; Solano et al., 1995a); CD, control of cell differentiation (proteins AtMYBGl1 and AmMYBMx, Noda et al., 1994; Oppenheimer
et al., 1991); SA, GA and ABA, involved in signal transduction pathway, respectively, salicylic acid (gene NtMYB1; Yang and Klessig, 1996), gibberellic
acid (proteins HvMYBGa, Gubler et al., 1995) and abscisic acid (proteins AtMYB2 and ZmMYBC1; Hattori et al., 1992; Urao et al., 1993). Capital letters
are used when the functions associated are based on genetic evidence (i.e. analysis of mutants). Also indicated is the available information on
DNA-binding specificity of some of the R2R3-MYB proteins, (arrowheads indicate the proteins examined in this study): I, CNGTTR (proteins MmMYB,
MmMYBA, MmMYBB, DdMYB, AtMYB1, AtMYB2, AtMYB77, PhMYB3, HvMYBGa, NtMYB1; Biedenkapp et al., 1988; Howe and Watson, 1991; Solano
et al., 1995a; Stober-Grasser et al., 1992; Urao et al., 1993; Watson et al., 1993); II, GTTWGTTR (proteins PhMYB3, HvMYBGa, AmMYB305, ZmMYBC1,
AtMYBGl1; Gubler et al., 1995; Sainz et al., 1997; Solano et al., 1995a; Solano et al., 1997); IIG, GKTWGGTR (proteins AmMYB305, AmMYB340, ZmMYBP,
ZmMYBC1, AtMYB6, AtMYB7, AtMYB15, AtMYB84, NtMYB1; Grotewold et al., 1994; Li and Parish, 1995; Moyano et al., 1996; Sablowski et al., 1994;
Sainz et al., 1997; Solano et al., 1995a; Yang and Klessig, 1996) (where N indicates A or G or C or T; K, G or T; R, A or G; W, A or T). Capital letters
are used in those cases in which the sequences are known to be the optimal binding site. When a given protein is able to bind to more than one
type of site, the size of the letter reflects the relative binding affinity for these sites.
Target genes of these latter R2R3-MYB genes are mostly
unknown, thus precluding definite conclusions about
whether they are functionally related between themselves
or indeed with the R2R3-MYB genes regulating phenyl-
propanoid biosynthetic genes. However, the signal
molecules salicylic acid, ABA and GA influence, among
others, the expression of phenylpropanoid biosynthetic
genes, in several instances through cis-acting elements
resembling R2R3-MYB binding sites (Dixon and Paiva,
1995; Hahlbrock and Scheel, 1989; Hattori et al., 1992;
Sablowski et al., 1994; Shirasu et al., 1997; Weiss et al.,
1990, 1992). In addition, GA affects trichome formation,
another trait under the control of an R2R3-MYB
gene, AtMYBGl1 (Telfer et al., 1997). Moreover, the MIXTA
gene (AmMYBMx) controls the specialized shape of inner
epidermal petal cells of Antirrhinum flowers, and these
changes in cell shape correlate with changes in the cell
wall, a structure containing phenylpropanoid derivatives
(Noda et al., 1994).
The number of R2R3-MYB genes with distinct but
related functions might therefore be extraordinarily high,
particularly with regard to the regulation of different
phenylpropanoid biosynthetic genes, although some of
these genes could also (or alternatively) act on other
types of targets (e.g. the barley gibberellic acid-induced
α-amy gene is a likely target of HvMYBGa; Gubler et al.,
1995). In any case, the broad phylogenetic distribution
of the R2R3-MYB genes for which there is evidence of
involvement in the regulation of phenylpropanoid
metabolism suggests that a very early plant-specific
R2R3-MYB ancestor already had this function, and that
this was probably the ancestor of at least the genes
belonging to groups B and C (information on the function
of genes belonging to group A is currently lacking).
Figure 4. DNA-binding specificity of the Arabidopsis proteins AtMYB77,
AtMYB15 and AtMYB84, obtained using binding site selection
experiments.
(a) Sequence of the partially random core of the oligonucleotide mixtures
(O-I and O-II) used in the binding site selection experiments.
(b) Summary of the nucleotide sequences of the oligonucleotides selected
by the different R2R3-MYB proteins. The base composition around the
consensus is indicated as a percentage. Asterisks indicate the positions at
which the nucleotide sequence was fixed in the original oligonucleotide
mixture. The type of binding site (I, II or IIG; see Figure 3) of each
protein is indicated. The part of the sequence determining the type of
binding site is underlined.
The existence of functional relationships between
members of the same family of transcription factors
has been documented for virtually all of these families.
For instance, HOX proteins control related developmental
pathways and in some cases they share some target
genes (Botas, 1993). However, the full extent of
the relationship is difficult to define in most cases,
owing to the limited information on the genes involved in
the pathways they regulate. In contrast, the phenylpropanoid
biosynthetic pathway, which is regulated by the R2R3-MYB
gene family, is biochemically well characterized and thus can
be used as a reporter to evaluate functional relationships
(as well as functional diversity) between several, and
potentially many, members of this gene family (because
the synthesis of each phenylpropanoid involves common
as well as specific enzymatic steps). The clones isolated
in this work should allow the use of reverse genetic
approaches to carry out such studies.
Figure 5. DNA-binding specificity of different R2R3-MYB proteins as
shown by EMSA.
The proteins used in the experiment were the full-size AtMYB84, AtMYB15
and AtMYBGl1, and the deletion derivatives PhMYB3∆C1 (amino acid
residues 1–180 of PhMYB3), MmMYB∆C2R1 (amino acid residues 89–236
of MmMYB), AmMYB305∆C1 (amino acid residues 1–159 of AmMYB305)
and AtMYB77∆C1 (amino acid residues 1–200 of AtMYB77, see Methods).
The core sequences of the oligonucleotides used in the assay (a) are
shown on top of each lane. New sequences used in (b), corresponding
to the deletion derivatives of I-1 and II-1 lacking three As, are I-1∆ (AAACGGTTA) and II-1∆ (AGTTAGTTA). All reactions contained an
equimolar amount of protein as well as DNA. The autoradiograph
corresponding to the protein AtMYBGl1 was threefold over-exposed.
Plants, as sessile organisms, have evolved a great
plasticity in their developmental and metabolic programs
to cope with changing environmental conditions (Steeves
and Sussex, 1990). This requires very flexible regulatory
mechanisms whereby patterns of gene expression can be
continuously adjusted in response to any environmental
change. The presence of large-sized regulatory gene
families such as the R2R3-MYB family, whose members
often share some target genes on which each regulatory
gene may exert a different effect, could have contributed
to these flexible control mechanisms.
Experimental procedures
Plant material
Arabidopsis thaliana, Landsberg ecotype, was used in this study.
Standard molecular procedures
All methods, including screening of cDNA libraries, RNA and
genomic DNA isolation, labelling of DNA and oligonucleo-
tides, etc., were performed as described previously (Avila et al.,
1993; Sambrook et al., 1989), except where indicated. The
vectors for cloning were pUC19 (Yanisch-Perron et al., 1985)
and pBluescriptII (Alting-Mees and Short, 1989).
The search for R2R3-MYB genes was performed in two
sequential stages. In the first stage, we prepared a cDNA library
enriched in R2R3-MYB genes using PCR with the R2R3-MYB-
specific oligonucleotides described in Figure 1, in all possible
pairwise combinations that included one oligonucleotide mixture
corresponding to the R2 repeat and one corresponding to the
R3 repeat. Since the size of the amplified R2R3-MYB cDNA
fragments was predictable (about 180 bp), the PCR-amplified
cDNA was size selected prior to cloning. To discard genes
already sequenced, an iterative procedure was used consisting
of hybridization to the library at high stringency using as a
probe the inserts of 20 previously sequenced clones. In a
second stage, the same PCR procedure was applied to genomic
DNA to reduce biases due to differential expression of different
R2R3-MYB genes. The amplified genomic DNA was cloned
directly, since the presence of introns in the amplified region
precluded size selection-based enrichment. Alternatively, enrich-
ment in R2R3-MYB clones was carried out by low stringency
hybridization using a mixture of the previously isolated R2R3-
MYB cDNAs as a probe. The MYB-enriched genomic DNA
library was screened following the same iterative procedure
adopted for the cDNA library. The cDNA used in the PCR
reactions was derived from poly(A)+ RNA prepared from a
mixture of plants grown in soil or in MS (0.5×) medium
(Murashige and Skoog, 1962) and collected at different
developmental stages (from seedling to flowering stages),
including plants treated with ABA and GA. The hormonal
treatments were performed on plants germinated and grown
without hormone in liquid MS (0.5×) medium for 7 days,
after which the corresponding hormone was added (final
concentrations: ABA, 100 µM; GA, 100 µM) and kept for 8 h
before the plants were collected. The poly(A)+ RNA (20 ng µl–1)
was reverse transcribed with AMV reverse transcriptase
(0.7 U µl–1) in the presence of the ribonuclease inhibitor RNasin
(0.5 U µl–1), using oligo(dT)15 (25 ng µl–1) as a primer; the
reaction mixture was incubated at 42°C for 1.5 h. PCR amplification
of R2R3-MYB genes was performed as follows: the DNA
(cDNA or genomic DNA, 20–200 pg µl–1) was amplified for 30
cycles using polymerase (0.025 U µl–1). Each cycle of amplification
consisted of 1 min at 94°C, 90 sec at 55°C and 2 min at
72°C, except the first two cycles, in which the annealing
temperature was 40–42°C instead of 55°C. cDNA clones encoding
the full-size proteins AtMYB15, AtMYB77 and AtMYB84 were
isolated by screening a whole-plant Arabidopsis cDNA library in
vector λNM1149 (10^5 pfu; M. Sanchez, unpublished observations),
using a mixture of the available R2R3-MYB cDNA fragments as
a probe. That of AtMYBGl1 was obtained by reverse transcription
and PCR amplification with oligonucleotides Ngl1
(GAATGAGAATAAGGAGAAGAG) and Cgl1 (CTAAAGGCAGTACTCAATATC),
designed on the basis of the previously reported
sequence of this gene (Oppenheimer et al., 1991). The conditions
of PCR were the same as above, except that the annealing
temperature was always 55°C.
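The cycling conditions described above can be written out as a simple data structure, which makes the two low-stringency opening cycles explicit. This is only an illustrative sketch: the names `Step`, `cycle` and `build_program` are hypothetical, the annealing range is stored as text exactly as reported, and ramp times between steps are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: str    # kept as text so the reported 40-42 range is not narrowed
    seconds: int


def cycle(annealing_temp_c: str) -> List[Step]:
    """One amplification cycle: denaturation, annealing, extension."""
    return [
        Step("denaturation", "94", 60),
        Step("annealing", annealing_temp_c, 90),
        Step("extension", "72", 120),
    ]


def build_program(total_cycles: int = 30,
                  low_stringency_cycles: int = 2) -> List[List[Step]]:
    """First cycles anneal at 40-42 C, the remaining cycles at 55 C."""
    program = []
    for i in range(total_cycles):
        annealing = "40-42" if i < low_stringency_cycles else "55"
        program.append(cycle(annealing))
    return program


program = build_program()
# Hold time only; instrument ramping between temperatures is not modelled.
total_seconds = sum(step.seconds for cyc in program for step in cyc)
print(f"{len(program)} cycles, {total_seconds / 60:.0f} min of hold time")
```

For the AtMYBGl1 amplification, where annealing was always at 55°C, the same sketch would be called as `build_program(low_stringency_cycles=0)`.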
Plasmid constructs and in vitro synthesis of proteins
Constructs coding for deletion derivatives of PhMYB3
(PhMYB3∆C1), MmMYB (MmMYB∆C2R1) and AmMYB305
(AmMYB305∆C1) were previously reported (Solano et al.,
1997). Constructs coding for AtMYB15, AtMYB77, AtMYB84 and
AtMYBGl1 were prepared by cloning the cDNA corresponding
to these proteins into vector pBluescriptII, and transcripts were
synthesized with T3 or T7 polymerase. Transcripts coding for deletion
derivatives of these proteins were obtained by digestion with
restriction enzymes within the coding region of the proteins
before in vitro transcription (e.g. AtMYB77∆C1 which contains
amino acid residues 1–200 of the wild-type protein was
obtained by predigestion with BamHI). In vitro translation and
standardization of protein amount was as described previously
(Solano et al., 1997).
DNA-binding assays
Binding site selection experiments were performed as
described previously (Solano et al., 1995a), except that rabbit
reticulocyte extract (2 µl) containing the in vitro synthesized
protein was substituted for the bacterial extracts. In addition,
two oligonucleotide mixtures with a partially degenerate core
were also used (O-I: 5′-ACCGCTCGAGTCGACN6CNGNTN2CGGATCCTGCAGAATTCGCG-3′;
O-II: 5′-ACCGCTCGAGTCGACN6TNGNTN2CGGATCCTGCAGAATTCGCG-3′; Figure 4). DNA binding
assays with selected oligonucleotides were carried out as in
Solano et al. (1997). Oligonucleotides I-1 and II-1 (MBSI and
MBSII in Solano et al., 1997) represent the optimal binding sites
defined for PhMYB3. Oligonucleotides I-2
(5′-CGCGAATTCTGCAGGATCCGTGACAGTTACGTCGACTCGAGCGGT-3′) and II-2
(5′-CGCGAATTCTGCAGGATCCGCGGTAGGTGGGTCGACTCGAGCGGT-3′)
represent the optimal binding sites of AtMYB77,
and of AtMYB15 and AtMYB84, respectively. The core sequences
of other oligonucleotides representing variants of I-2 and of II-2
are shown in Figure 5.
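To make the site-type nomenclature concrete, the following sketch checks which of the consensus sequences given in the legend of Figure 3 (I, CNGTTR; II, GTTWGTTR; IIG, GKTWGGTR) occur in an oligonucleotide, scanning both strands. It is an illustration only and not part of the original analysis: the function names are hypothetical, and a plain string match says nothing about binding affinity, which was determined experimentally.

```python
import re

# IUPAC codes used in the Figure 3 legend: N = A/C/G/T, K = G/T, R = A/G, W = A/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "K": "[GT]", "R": "[AG]", "W": "[AT]",
}
SITE_TYPES = {"I": "CNGTTR", "II": "GTTWGTTR", "IIG": "GKTWGGTR"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate an IUPAC consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def site_types(seq: str) -> list:
    """Return the site types whose consensus occurs on either strand."""
    found = []
    for name, consensus in SITE_TYPES.items():
        pattern = consensus_to_regex(consensus)
        if pattern.search(seq) or pattern.search(reverse_complement(seq)):
            found.append(name)
    return found


# Oligonucleotides quoted in the text and in the Figure 5 legend.
I_2 = "CGCGAATTCTGCAGGATCCGTGACAGTTACGTCGACTCGAGCGGT"
II_2 = "CGCGAATTCTGCAGGATCCGCGGTAGGTGGGTCGACTCGAGCGGT"
I_1_DELTA = "AAACGGTTA"
II_1_DELTA = "AGTTAGTTA"

for label, seq in [("I-2", I_2), ("II-2", II_2),
                   ("I-1 delta", I_1_DELTA), ("II-1 delta", II_1_DELTA)]:
    print(label, site_types(seq))
```

Note that the full-length I-2 and II-2 oligonucleotides include constant flanking sequence, so a match outside the core is possible in principle; restricting the scan to the core would be a stricter variant of the same idea.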
Estimation of the size of the R2R3-MYB family in
Arabidopsis
To estimate the total number of R2R3-MYB genes in Arabidopsis,
we reasoned as follows. If two samples of $n_1$ and $n_2$ individuals
are extracted randomly from a population of $N$ individuals, the
probability that a particular individual will be present in both
samples is $P = n_1 n_2 / N^2$. On the other hand, if $n_3$ is the
number of individuals present in both samples, the above probability
can also be estimated as $P = n_3 / N$. From these two equations,
it follows that $N = n_1 n_2 / n_3$. Should there be any common bias
in the two samples, the calculated $N$ would be an underestimate
(as $n_3$ would be higher than that expected for random samples).
The two samples of R2R3-MYB genes used in this study are
likely to share some bias. The R2R3-MYB sequences in databanks
are enriched in abundantly/moderately expressed genes, as
most correspond to EST/cDNA sequences. In the case of our
collection, although the bias towards the most highly expressed
genes has been alleviated by using an R2R3-MYB enriched
library from PCR-amplified genomic DNA, there is still some of
this type of bias since all genes identified had to cross-hybridize
to R2R3-MYB genes isolated from cDNA libraries.
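As a worked example of this estimate (a two-sample estimator of the kind usually called Lincoln-Petersen in the capture-recapture literature), the sketch below plugs in the counts quoted in the Summary: 78 genes in the authors' collection and 31 in databanks, of which seven were absent from the collection. The overlap of 24 is inferred here as 31 minus 7, and the function name is hypothetical.

```python
def estimate_population(n1: int, n2: int, n3: int) -> float:
    """Two-sample estimate N = n1 * n2 / n3, where n3 is the overlap."""
    if n3 <= 0:
        raise ValueError("the two samples must share at least one individual")
    return n1 * n2 / n3


n1 = 78        # genes in the authors' collection (Summary)
n2 = 31        # genes found in databanks (Summary)
n3 = n2 - 7    # assumption: overlap inferred as 31 - 7 = 24

estimate = estimate_population(n1, n2, n3)
observed_minimum = n1 + n2 - n3   # distinct genes actually seen in either sample

print(f"distinct genes observed: {observed_minimum}")
print(f"estimated family size:   {estimate:.2f}")
```

Because any shared bias inflates the overlap, the value obtained this way should be read as a lower-leaning estimate, consistent with the statement that the family has at least 85 and probably more than 100 members.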
Computer programs for protein and nucleic acid
analysis
Alignments, tree construction by the neighbour-joining method
and its bootstrapping (1000 samples) were performed with
CLUSTALW (Thompson et al., 1994). Using the matrices BLOSUM
(Henikoff and Henikoff, 1992) or PAM 250 (Dayhoff et al., 1978)
did not make any difference to the results. In the case of the
parsimony method (Eck and Dayhoff, 1966), the PHYLIP package
(Felsenstein, 1989) was used. Multiple most parsimonious trees
were found and the consensus tree was built with the
CONSENSUS program of PHYLIP. Rates of synonymous and of
non-synonymous substitutions were calculated according to Nei
and Gojobori (1986) using the Ina program (Ina, 1995).
Acknowledgements
We are very grateful to the other members of the European
MYB function search consortium (the groups led by Michael
Bevan, Cathie Martin, Sjef Smeekens, Chiara Tonelli and Bernd
Weisshaar) for ongoing interest and stimulating discussions.
We thank Cathie Martin and Roger Watson for providing us
with the AmMYB305 and MmMYB progenitor constructs. We
also thank Francisco García Olmedo, Cathie Martin, Miguel
Ángel Peñalva, Santiago Rodríguez de Córdoba, and Bernd
Weisshaar for critical reading of the manuscript. This work was
financed by grants from the EU (BIO2-CT93–0101; BIO4-CT95–
0129) and from the Spanish CICYT (BIO96–1115).
References
Alting-Mees, M.A. and Short, J.M. (1989) pBluescript II: gene
mapping vectors. Nucl. Acids Res. 17.
Avila, J., Nieto, C., Canas, L., Benito, M.J. and Paz-Ares, J.
(1993) Petunia hybrida genes related to the maize regulatory
C1 gene and to animal myb proto-oncogenes. Plant J. 3,
553–562.
Benfey, P.N. and Chua, N.H. (1989) Regulated genes in transgenic
plants. Science, 244, 174–181.
Biedenkapp, H., Borgmeyer, U., Sippel, A.E. and Klempnauer,
K.H. (1988) Viral myb oncogene encodes a sequence-specific
DNA-binding activity. Nature, 335, 835–837.
Bilaud, T., Koering, C.E., Binet-Brasselet, E., Ancelin, K., Pollice,
A., Gasser, S.M. and Gilson, E. (1996) The telobox, a Myb-
related telomeric DNA binding motif found in proteins from
yeast, plants and human. Nucl. Acids Res. 24, 1294–1303.
Blackwell, T.K. and Weintraub, H. (1990) Differences and
similarities in DNA-binding preferences of MyoD and E2A
protein complexes revealed by binding site selection. Science,
250, 1104–1110.
Botas, J. (1993) Control of morphogenesis and differentiation
by HOM/Hox genes. Curr. Opin. Cell. Biol. 5, 1015–1022.
Burglin, T.R. (1995) The evolution of Homeobox genes. In
Biodiversity and Evolution (Arai, R., Kato, M. and Doi, Y.,
eds). Tokyo: The National Science Museum Foundation,
pp. 291–336.
Cone, K.C., Cocciolone, S.M., Burr, F.A. and Burr, B. (1993)
Maize anthocyanin regulatory gene pl is a duplicate of c1
that functions in the plant. Plant Cell, 5, 1795–1805.
Dayhoff, M.O., Schwartz, R.M. and Orcutt, B.C. (1978) A model
of evolutionary change in proteins. In Atlas of Protein Sequence
and Structure, Volume 5 (Dayhoff, M.O., ed.). Silver Spring,
Maryland: National Biomedical Research Foundation, pp.
345–352.
Dixon, R.A. and Paiva, N.L. (1995) Stress-induced
phenylpropanoid metabolism. Plant Cell, 7, 1085–1097.
Eck, R.V. and Dayhoff, M.O. (1966) Atlas of Protein Sequences
and Structure. Silver Spring, Maryland: National Biomedical
Research Foundation.
Feldbrugge, M., Sprenger, M., Hahlbrock, K. and Weisshaar, B.
(1997) PcMYB1, a novel plant protein containing a DNA-
binding domain with one MYB repeat, interacts in vivo with
a light-regulatory promoter unit. Plant J. 11, 1079–1093.
Felsenstein, J. (1989) PHYLIP: Phylogeny inference package.
Cladistics, 5, 164–166.
Franken, P., Schrell, S., Peterson, P.A., Saedler, H. and Wienand,
U. (1994) Molecular analysis of protein domain function
encoded by the myb-homologous maize genes C1, Zm 1 and
Zm 38. Plant J. 6, 21–30.
Gibson, S. and Somerville, C. (1993) Isolating plant genes.
TIBTECH, 11, 306–313.
Goff, S.A., Cone, K.C. and Chandler, V.L. (1992) Functional
analysis of the transcriptional activator encoded by the maize
B gene: evidence for a direct functional interaction between
two classes of regulatory proteins. Genes Dev. 6, 864–875.
Graf, T. (1992) Myb: a transcriptional activator linking proliferation
and differentiation in hematopoietic cells. Curr. Opin. Genet.
Dev. 2, 249–255.
Grotewold, E., Drummond, B.J., Bowen, B. and Peterson, T.
(1994) The myb-homologous P gene controls phlobaphene
pigmentation in maize floral organs by directly activating a
flavonoid biosynthetic gene subset. Cell, 76, 543–553.
Gubler, F., Kalla, R., Roberts, J.K. and Jacobsen, J.V. (1995)
Gibberellin-regulated expression of a myb gene in barley
aleurone cells: Evidence for Myb transactivation of a high-pI
alpha-amylase gene promoter. Plant Cell, 7, 1879–1891.
Hahlbrock, K. and Scheel, D. (1989) Physiology and molecular
biology of phenylpropanoid metabolism. Annu. Rev. Plant
Physiol. Plant Mol. Biol. 40, 347–369.
Hattori, T., Vasil, V., Rosenkrans, L., Hannah, L.C., McCarty, D.R.
and Vasil, I.K. (1992) The Viviparous-1 gene and Abscisic acid
activate the C1 regulatory gene for anthocyanin biosynthesis
during seed maturation in maize. Genes Dev. 6, 609–618.
Henikoff, S. and Henikoff, J.G. (1992) Amino acid substitution
matrices from protein blocks. Proc. Natl Acad. Sci. USA, 89,
10915–10919.
Hirayama, T. and Shinozaki, K. (1996) A cdc5+ homolog of a
higher plant, Arabidopsis thaliana. Proc. Natl Acad. Sci. USA,
93, 13371–13376.
Hoovers, J.M., Mannens, M., John, R., Bliek, J., van Heyningen,
V., Porteous, D.J., Leschot, N.J., Westerveld, A. and Little,
P.F. (1992) High-resolution localization of 69 potential human
zinc finger protein genes: a number are clustered. Genomics,
12, 254–263.
Howe, K.M. and Watson, R.J. (1991) Nucleotide preferences in
sequence-specific recognition of DNA by c-myb protein. Nucl.
Acids Res. 19, 3913–3919.
Ina, Y. (1995) New methods for estimating the numbers of
synonymous and nonsynonymous substitutions. J. Mol. Evol.
40, 190–226.
Jackson, D., Culianez, M.F., Prescott, A.G., Roberts, K. and
Martin, C. (1991) Expression patterns of myb genes from
Antirrhinum flowers. Plant Cell, 3, 115–125.
Kirik, V. and Baumlein, H. (1996) A novel leaf-specific myb-
related protein with a single binding repeat. Gene, 183,
109–113.
Klug, A. and Schwabe, J.W. (1995) Protein motifs 5. Zinc fingers.
FASEB J. 9, 597–604.
Li, S.F. and Parish, R.W. (1995) Isolation of two novel myb-like
genes from Arabidopsis and studies on the DNA-binding
properties of their products. Plant J. 8, 963–972.
Lipsick, J.S. (1996) One billion years of Myb. Oncogene, 13,
223–235.
Lloyd, A.M., Walbot, V. and Davis, R.W. (1992) Arabidopsis
and Nicotiana anthocyanin production activated by maize
regulators R and C1. Science, 258, 1773–1775.
Luscher, B. and Eisenman, R.N. (1990) New light on Myc and
Myb. Part II. Myb. Genes Dev. 4, 2235–2241.
Marocco, A., Wissenbach, M., Becker, D., Paz-Ares, J., Saedler,
H., Salamini, F. and Rohde, W. (1989) Multiple genes are
transcribed in Hordeum vulgare and Zea mays that carry the
DNA-binding domain of the MYB oncoproteins. Mol. Gen.
Genet. 216, 183–187.
Martin, C. and Paz-Ares, J. (1997) MYB transcription factors in
plants. Trends Genet. 13, 67–73.
Moyano, E., Martínez, G.J. and Martin, C. (1996) Apparent
redundancy in myb gene function provides gearing for the
control of flavonoid biosynthesis in Antirrhinum flowers. Plant
Cell, 8, 1519–1532.
Murashige, T. and Skoog, F. (1962) A revised medium for rapid
growth and bioassays with tobacco tissue cultures. Physiol.
Plant. 15, 473–497.
Nei, M. and Gojobori, T. (1986) Estimating synonymous and
nonsynonymous substitution rates. Mol. Biol. Evol. 3, 105–114.
Noda, K., Glover, B.J., Linstead, P. and Martin, C. (1994) Flower
colour intensity depends on specialized cell shape controlled
by a Myb-related transcription factor. Nature, 369, 661–664.
Ogata, K., Morikawa, S., Nakamura, H., Sekikawa, A., Inoue,
T., Kanai, H., Sarai, A., Ishii, S. and Nishimura, Y. (1994)
Solution structure of a specific DNA complex of the Myb
DNA-binding domain with cooperative recognition helices.
Cell, 79, 639–648.
Oppenheimer, D.G., Herman, P.L., Sivakumaran, S., Esch, J. and
Marks, M.D. (1991) A myb gene required for leaf trichome
differentiation in Arabidopsis is expressed in stipules. Cell,
67, 483–493.
Parvin, J.D., McCormick, R.J., Sharp, P.A. and Fisher, D.F. (1995)
Prebending of a promoter sequence enhances affinity for the
TATA-binding factor. Nature, 373, 724–727.
Paz-Ares, J., Ghosal, D., Wienand, U., Peterson, P.A. and Saedler,
H. (1987) The regulatory c1 locus of Zea mays encodes a
protein with homology to myb proto-oncogene products and
with structural similarities to transcriptional activators. EMBO
J. 6, 3553–3558.
Quattrocchio, F. (1994) Regulatory genes controlling flower
pigmentation in Petunia hybrida. PhD thesis. Amsterdam:
Vrije Universiteit te Amsterdam.
Quattrocchio, F., Wing, J.F., Leppen, H.T.C., Mol, J.N.M. and
Koes, R.E. (1993) Regulatory genes controlling anthocyanin
pigmentation are functionally conserved among plant species
and have distinct sets of target genes. Plant Cell, 5, 1497–1512.
Ramsay, R.G., Ishii, S. and Gonda, T.J. (1992) Interaction of the
Myb protein with specific DNA binding sites. J. Biol. Chem.
267, 5656–5662.
Riechmann, J.L., Wang, M. and Meyerowitz, E.M. (1996) DNA-
binding properties of Arabidopsis MADS domain homeotic
proteins APETALA1, APETALA3, PISTILLATA and AGAMOUS.
Nucl. Acids Res. 24, 3134–3141.
Sablowski, R.W.M., Moyano, E., Culianez-Macia, F.A., Schuch,
W., Martin, C. and Bevan, M. (1994) A flower-specific Myb
protein activates transcription of phenylpropanoid biosynthetic
genes. EMBO J. 13, 128–137.
Sainz, M.B., Grotewold, E. and Chandler, V.L. (1997) Evidence
for direct activation of an anthocyanin promoter by the maize
C1 protein and comparison of DNA binding by related myb
domain proteins. Plant Cell, 9, 611–625.
Saitou, N. and Nei, M. (1987) The neighbor-joining method: a
new method for reconstructing phylogenetic trees. Mol. Biol.
Evol. 4, 406–425.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular
Cloning: A Laboratory Manual, 2nd edn. Cold Spring Harbor,
NY: Cold Spring Harbor Laboratory Press.
Satta, Y. (1993) How the ratio of nonsynonymous to synonymous
pseudogene substitutions can be less than one.
Immunogenetics, 38, 450–454.
Shirasu, K., Nakajima, H., Rajasekhar, V.K., Dixon, R.A. and
Lamb, C. (1997) Salicylic acid potentiates an agonist-dependent
gain control that amplifies pathogen signals in the activation
of defense mechanisms. Plant Cell, 9, 261–270.
Solano, R., Fuertes, A., Sanchez, L., Valencia, A. and Paz-Ares,
J. (1997) A single residue substitution causes a switch from
the dual DNA binding specificity of plant transcription factor
MYB.Ph3 to the animal c-MYB specificity. J. Biol. Chem. 272,
2889–2895.
Solano, R., Nieto, C., Avila, J., Canas, L., Díaz, I. and Paz-Ares,
J. (1995a) Dual DNA-binding specificity of petal epidermis
specific MYB transcription factor (MYB.Ph3) from Petunia
hybrida. EMBO J. 14, 1773–1784.
Solano, R., Nieto, C. and Paz-Ares, J. (1995b) MYB.Ph3
transcription factor from Petunia hybrida induces similar DNA-
bending/distortions on its two types of binding site. Plant J.
8, 673–682.
Steeves, T.A. and Sussex, I.M. (1990) Patterns in Plant
Development, 2nd edn. Cambridge: Cambridge University
Press.
Stober-Grasser, U., Brydolf, B., Bin, X., Grasser, F., Firtel, R.A.
and Lipsick, J.S. (1992) The Myb DNA-binding domain is
highly conserved in Dictyostelium discoideum. Oncogene, 7,
589–596.
Taylor, D., Badiani, P. and Weston, K.A. (1996) A dominant
interfering Myb mutant causes apoptosis in T cells. Genes
Dev. 10, 2732–2744.
Telfer, A., Bollman, K.M. and Poethig, R.S. (1997) Phase change
and the regulation of trichome distribution in Arabidopsis
thaliana. Development, 124, 645–654.
Thanos, D. and Maniatis, T. (1992) The high mobility group
protein HMG I (Y) is required for NF-κB-dependent virus
induction of the human IFN-β gene. Cell, 71, 777–789.
Theissen, G., Kim, J.T. and Saedler, H. (1996) Classification and
phylogeny of the MADS-box multigene family suggest defined
roles of MADS-box gene subfamilies in the morphological
evolution of eukaryotes. J. Mol. Evol. 43, 484–516.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL
W: improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position specific gap
penalties and weight matrix choice. Nucl. Acids Res. 22,
4673–4680.
Thompson, M.A. and Ramsay, R.G. (1995) Myb: An old
oncoprotein with new roles. Bioessays, 17, 341–350.
Toscani, A., Mettus, R.V., Coupland, R., Simpkins, H., Litvin, J.,
Orth, J., Hatton, K.S. and Reddy, E.P. (1997) Arrest of
spermatogenesis and defective breast development in mice
lacking A-myb. Nature, 386, 713–717.
Treisman, J., Harris, E., Wilson, D. and Desplan, C. (1992) The
homeodomain: a new face for the helix-turn-helix? Bioessays,
14, 145–150.
Urao, T., Yamaguchi, S.K., Urao, S. and Shinozaki, K. (1993) An
Arabidopsis myb homolog is induced by dehydration stress
and its gene product binds to the conserved MYB recognition
sequence. Plant Cell, 5, 1529–1539.
Watson, R.J., Robinson, C. and Lam, E.W. (1993) Transcription
regulation by murine B-myb is distinct from that by c-myb.
Nucl. Acids Res. 21, 267–272.
Weiss, D., van Blockland, R., Kooter, J.M., Mol, J.N.M. and
van Tunen, A.J. (1992) Gibberellic acid regulates chalcone
synthase gene expression in the corolla of Petunia hybrida.
Plant Physiol. 98, 191–197.
Weiss, D., van Tunen, A.J., Halevy, A.H., Mol, J.N.M. and Gerats,
A.G.M. (1990) Stamens and gibberellic acid in the regulation
of flavonoid gene expression in the corolla of Petunia hybrida.
Plant Physiol. 94, 511–515.
Yang, Y. and Klessig, D.F. (1996) Isolation and characterization
of a tobacco mosaic virus-inducible myb oncogene homolog
from tobacco. Proc. Natl Acad. Sci. USA, 93, 14972–14977.
Yanisch-Perron, C., Vieira, J. and Messing, J. (1985) Improved
M13 phage cloning vectors and host strains: nucleotide
sequences of the M13mp18 and pUC19 vectors. Gene, 33,
103–119.