overview and concepts - cold spring harbor laboratory

18
23 C H A P T E R 3 Overview and Concepts C. David Allis, 1 Thomas Jenuwein, 2 and Danny Reinberg 3 1 The Rockefeller University, New York, New York; 2 Research Institute of Molecular Pathology, Vienna, Austria; 3 UMDNJ-Robert Wood Johnson Medical School, Piscataway, New Jersey 1. Genetics Versus Epigenetics, 25 2. Model Systems for the Study of Epigenetics, 26 3. Defining Epigenetics, 28 4. The Chromatin Template, 29 5. Higher-Order Chromatin Organization, 31 6. The Distinction between Euchromatin and Heterochromatin, 34 7. Histone Modifications and the Histone Code, 36 8. Chromatin-remodeling Complexes and Histone Variants, 39 9. DNA Methylation, 41 10. RNAi and RNA-directed Gene Silencing, 42 11. From Unicellular to Multicellular Systems, 44 12. Polycomb and Trithorax, 45 13. X Inactivation and Facultative Heterochromatin, 47 14. Reprogramming of Cell Fates, 49 15. Cancer, 50 16. What Does Epigenetic Control Actually Do? , 52 17. Big Questions in Epigenetic Research, 55 References, 56 CONTENTS Copyright 2007 Cold Spring Harbor Laboratory Press.

Upload: others

Post on 03-Feb-2022

0 views

Category:

Documents


0 download

TRANSCRIPT

23

C H A P T E R 3

Overview and ConceptsC. David Allis,1 Thomas Jenuwein,2 and Danny Reinberg3

1The Rockefeller University, New York, New York; 2Research Institute of Molecular Pathology, Vienna, Austria; 3UMDNJ-Robert Wood Johnson Medical School, Piscataway, New Jersey

1. Genetics Versus Epigenetics, 25

2. Model Systems for the Study of Epigenetics, 26

3. Defining Epigenetics, 28

4. The Chromatin Template, 29

5. Higher-Order Chromatin Organization, 31

6. The Distinction between Euchromatin andHeterochromatin, 34

7. Histone Modifications and the Histone Code, 36

8. Chromatin-remodeling Complexes andHistone Variants, 39

9. DNA Methylation, 41

10. RNAi and RNA-directed Gene Silencing, 42

11. From Unicellular to Multicellular Systems, 44

12. Polycomb and Trithorax, 45

13. X Inactivation and FacultativeHeterochromatin, 47

14. Reprogramming of Cell Fates, 49

15. Cancer, 50

16. What Does Epigenetic Control Actually Do? , 52

17. Big Questions in Epigenetic Research, 55

References, 56

C O N T E N T S

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 23

Copyright 2007 Cold Spring Harbor Laboratory Press.

The DNA sequencing of the human genome and thegenomes of many model organisms has generated consid-erable excitement within the biomedical community andthe general public over the past several years. Thesegenetic “blueprints” that exhibit the well-accepted rules ofMendelian inheritance are now readily available for closeinspection, opening the door to improved understandingof human biology and disease. This knowledge is also gen-erating renewed hope for novel therapeutic strategies andtreatments. Many fundamental questions nonethelessremain. For example, how does normal development pro-ceed, given that every cell has the same genetic informa-tion, yet follows a different developmental pathway,realized with exact temporal and spatial precision? Howdoes a cell decide when to divide and differentiate, orwhen to retain an unchanged cellular identity, respondingand expressing according to its normal developmentalprogram? Mistakes made in the above processes can leadto the generation of disease states such as cancer. Arethese mistakes encoded in faulty genetic blueprints thatwe inherited from one or both of our parents, or are thereother layers of regulatory information that are not beingproperly read and decoded?

In humans, the genetic information (DNA) is organ-ized into 23 chromosome pairs consisting of approxi-mately 25,000 genes. These chromosomes can becompared to libraries with different sets of books thattogether instruct the development of a complete humanbeing. The DNA sequence of our genome is composed ofabout 3 x 109 bases, abbreviated by the four letters (orbases) A, C, G, and T within its sequence, giving rise towell-defined words (genes), sentences, chapters, andbooks. However, what dictates when the different booksare read, and in what order, remains far from clear. Meet-ing this extraordinary challenge is likely to reveal insightsinto how cellular events are coordinated during normaland abnormal development.

When summed across all chromosomes, the DNA mol-ecule in higher eukaryotes is about 2 meters long andtherefore needs to be maximally condensed about10,000-fold to fit into a cell’s nucleus, the compartment ofa cell that stores our genetic material. The wrapping ofDNA around “spools” of proteins, so-called histone pro-teins, provides an elegant solution to this packaging prob-lem, giving rise to a repeating protein:DNA polymerknown as chromatin. However, in packaging DNA to bet-ter fit into a confined space, a problem develops, much as

when one packs too many books onto library shelves: Itbecomes harder to find and read the book of choice, andthus, an indexing system is needed. Chromatin, as agenome-organizing platform, provides this indexing.Chromatin is not uniform in structure; it comes in differentpackaging designs from a highly condensed chromatinfiber (known as heterochromatin) to a less compactedtype where genes are typically expressed (known aseuchromatin). Variation can enter into the basic chromatinpolymer through the introduction of unusual histone pro-teins (known as histone variants), altered chromatin struc-tures (known as chromatin remodeling), and the additionof chemical flags to the histone proteins themselves(known as covalent modifications). Moreover, addition ofa methyl group directly to a cytosine (C) base in the DNAtemplate (known as DNA methylation) can provide dock-ing sites for proteins to alter the chromatin state or affectthe covalent modification of resident histones. Recent evi-dence suggests that noncoding RNAs can “guide” special-ized regions of the genome into more compactedchromatin states. Thus, chromatin should be viewed as adynamic polymer that can index the genome and poten-tiate signals from the environment, ultimately determiningwhich genes are expressed and which are not.

Together, these regulatory options provide chromatinwith an organizing principle for genomes known as “epi-genetics,” the subject of this book. In some cases, epige-netic indexing patterns appear to be inherited throughcell divisions, providing cellular “memory” that mayextend the heritable information potential of the genetic(DNA) code. Epigenetics can thus be narrowly defined aschanges in gene transcription through modulation ofchromatin, which is not brought about by changes in theDNA sequence.

In this overview, we explain the basic concepts ofchromatin and epigenetics, and we discuss how epigenetic control may give us the clues to solve somelong-standing mysteries, such as cellular identity,tumorigenesis, stem cell plasticity, regeneration, andaging. As readers comb through the chapters that follow,we encourage them to note the wide range of biologicalphenomena uncovered in a diverse range of experimen-tal models that seem to have an epigenetic (non-DNA)basis. Understanding how epigenetics operates in mech-anistic terms will likely have important and far-reachingimplications for human biology and human disease in this“post-genomic” era.

G E N E R A L S U M M A R Y

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 24

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 25

1 Genetics Versus Epigenetics

Determining the structural details of the DNA doublehelix stands as one of the landmark discoveries in all ofbiology. DNA is the prime macromolecule that storesgenetic information (Avery et al. 1944), and it propagatesthis stored information to the next generation throughthe germ line. From this and other findings, the “centraldogma” of modern biology emerged. This dogma encap-sulates the processes involved in maintaining and trans-lating the genetic template required for life. The essentialstages are (1) the self-propagation of DNA by semicon-servative replication; (2) transcription in a unidirectional5′ to 3′ direction, templated by the genetic code (DNA),generation of an intermediary messenger RNA (mRNA);(3) translation of mRNA to produce polypeptides con-sisting of linear amino to carboxyl strings of amino acidsthat are colinear with the 5′ to 3′ order of DNA. In simpleterms: DNA ↔ RNA → protein. The central dogmaaccommodates feedback from RNA to DNA by theprocess of reverse transcription, followed by integrationinto existing DNA (as demonstrated by retroviruses andretrotransposons). However, this dogma disavows feed-back from protein to DNA, although a new twist to thegenetic dogma is that rare proteins, known as prions, canbe inherited in the absence of a DNA or RNA template.Thus, these specialized self-aggregating proteins haveproperties that resemble some properties of DNA itself,including a mechanism for replication and informationstorage (Cohen and Prusiner 1998; Shorter and Lindquist2005). Additionally, emerging evidence suggests that aremarkably large fraction of our genome is transcribedinto “noncoding” RNAs. The function of these noncodingRNAs (i.e., non-protein-encoding except tRNAs, rRNAs,snoRNAs) is under active investigation and is only begin-ning to become clear in a limited number of cases.

The origin of epigenetics stems from long-standingstudies of seemingly anomalous (i.e., non-Mendelian)and disparate patterns of inheritance in many organisms(see Chapters 1 and 2 for a historical overview). ClassicMendelian inheritance of phenotypic traits (e.g., peacolor, number of digits, or hemoglobin insufficiency)results from allelic differences caused by mutations of theDNA sequence. Collectively, mutations underlie the defi-nition of phenotypic traits, which contributes to thedetermination of species boundaries. These boundariesare then shaped by the pressures of natural selection, asexplained by Darwin’s theory of evolution. Such conceptsplace mutations at the heart of classic genetics. In con-trast, non-Mendelian inheritance (e.g., variation ofembryonic growth, mosaic skin coloring, random X inac-

tivation, plant paramutation) (Fig. 1) can manifest, totake one example, from the expression of only one (oftwo) alleles within the same nuclear environment. Impor-tantly, in these circumstances, the DNA sequence is notaltered. This is distinct from another commonly referredto non-Mendelian inheritance pattern that arises fromthe maternal inheritance of mitochondria (Birky 2001).

The challenge for epigenetic research is captured bythe selective regulation of one allele within a nucleus.What distinguishes two identical alleles, and how is thisdistinction mechanistically established and maintainedthrough successive cell generations? What underlies dif-ferences observed in monozygotic (“identical”) twins thatmake them not totally identical? Epigenetics is sometimescited as one explanation for the differences in outwardtraits, by translating the influence of the environment,diet, and potentially other external sources to the expres-sion of the genome (Klar 2004; see Chapters 23 and 24).Determining what components are affected at a molecu-lar level, and how alterations in these components affecthuman biology and human disease, is a major challengefor future studies.

Another key question in the field is, How important isthe contribution of epigenetic information for normaldevelopment? How do normal pathways become dysfunc-tional, leading to abnormal development and neoplastictransformation (i.e., cancer)? As mentioned above, “iden-tical” twins share the same DNA sequence, and as such,their phenotypic identity is often used to underscore thedefining power of genetics. However, even twins such asthese can exhibit outward phenotypic differences, likelyimparted by epigenetic modifications that occur over thelifetime of the individuals (Fraga et al. 2005). Thus, theextent to which epigenetics is important in defining cellfate, identity, and phenotype remains to be fully under-stood. In the case of tissue regeneration and aging, itremains unclear whether these processes are dictated byalterations in the genetic program of cells or by epigeneticmodifications. The intensity of research on a global scaletestifies to the recognition that the field of epigenetics is acritical new frontier in this post-genomic era.

In the words of others, “We are more than the sum ofour genes” (Klar 1998), or “You can inherit somethingbeyond the DNA sequence. That’s where the real excite-ment in genetics is now” (Watson 2003). The overridingmotivation for deciding to edit this book was the generalbelief that we and all the contributors to this volumecould transmit this excitement to future generations ofstudents, scientists, and physicians, most of whom weretaught genetic, but not epigenetic, principles governinginheritance and chromosome segregation.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 25

Copyright 2007 Cold Spring Harbor Laboratory Press.

26 ■ C H A P T E R 3

2 Model Systems for the Study of Epigenetics

The study of epigenetics necessarily requires good experi-mental models, and as often is the case, these models seemat first sight far removed from studies using human (ormammalian) cells. Collectively, however, results from manysystems have yielded a wealth of knowledge. The historicaloverviews (Chapters 1 and 2) make reference to severalimportant landmark discoveries that have emerged fromearly cytology, the growth of genetics, the birth of molecu-lar biology, and relatively new advances in chromatin-mediated gene regulation. Different model organisms (Fig.2) have been pivotal in addressing and solving the variousquestions raised by epigenetic research. Indeed, seeminglydisparate epigenetic discoveries made in various modelorganisms have served to unite the research community.The purpose of this section is to highlight some of thesemajor findings, which are discussed in more detail in thefollowing chapters of this book. As readers note these dis-coveries, they should focus on the fundamental principlesthat investigations using these model systems haveexposed; their collective contributions point more often tocommon concepts than to diverging details.

Unicellular and “lower” eukaryotic organisms—Sac-charomyces cerevisiae, Schizosaccharomyces pombe, and

Neurospora crassa—permit powerful genetic analyses, inpart facilitated by a short life cycle. Mating-type (MAT)switching that occurs in S. cerevisiae (Chapter 3) and S.pombe (Chapter 6) has provided remarkably instructiveexamples, demonstrating the importance of chromatin-mediated gene control. In the budding yeast S. cerevisiae,the unique silent information regulator (SIR) proteinswere shown to engage specific modified histones. Thiswas preceded by elegant experiments using genetics todocument the active participation of histone proteins ingene regulation (Clark-Adams et al. 1988; Kayne et al.1988). In the fission yeast S. pombe, the patterns of his-tone modification operating as activating and repressingsignals are remarkably similar to those in metazoanorganisms. This has opened the door for powerful geneticscreens being employed to look for gene products thatsuppress or enhance the silencing of genes. Most recently,a wealth of mechanistic insights linking the RNA interfer-ence (RNAi) machinery to the induction of histone mod-ifications acting to repress gene expression was discoveredin fission yeast (Hall et al. 2002; Volpe et al. 2002). Shortlyafterward, the RNAi machinery was also implicated intranscriptional gene silencing in the plant Arabidopsisthaliana, underscoring the potential importance of thisregulation in a wide range of organisms (see Section 10).

Barr body

polytenechromosomes

blood smear

tumor tissue

mutant plant

twins

epigenetic

biology

yeast mating typescloned cat

Figure 1. Biological Examples of EpigeneticPhenotypes

Epigenetic phenotypes in a range of organismsand cell types, all attributable to non-genetic dif-ferences. Twins: Slight variations partially attribut-able to epigenetics (©Randy Harris, New York).Barr body: The epigenetically silenced X chromo-some in female mammalian cells, visible cytologi-cally as condensed heterochromatin. Polytenechromosomes: Giant chromosomes in Drosophilasalivary glands, ideally suited for correlating geneswith epigenetic marks (reprinted from Schotta etal. 2003 [©Springer]). Yeast mating type: Sex isdetermined by the active MAT locus, while copiesof both mating-type genes are epigeneticallysilenced (©Alan Wheals, University of Bath). Bloodsmear: Heterogeneous cells of the same genotype,but epigenetically determined to serve differentfunctions (courtesy Prof. Christian Sillaber). Tumortissue: Metastatic cells (left) showing elevated lev-els of epigenetic marks in the tissue section(reprinted, with permission, from Seligson et al.2005 [©Macmillan]). Mutant plant: Arabidopsisflower epiphenotypes, genetically identical, withepigenetically caused mutations (reprinted, withpermission, from Jackson et al. 2002 [©Macmil-lan]). Cloned cat: Genetically identical, but withvarying coat-color phenotype (reprinted, withpermission, from Shin et al. 2002 [©Macmillan]).

03_EPIGEN_p023-062.qxd 9/13/06 11:43 AM Page 26

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 27

Other “off-beat” organisms have also made dispropor-tionate contributions toward unraveling epigenetic path-ways that at first seemed peculiar. The fungal species, N.crassa, revealed the unusual non-Mendelian phenomenonof repeat-induced point mutation (RIP) as a model forstudying epigenetic control (Chapter 6). Later, this organ-ism was used to demonstrate the first functional connec-tion between histone modifications and DNA methylation(Tamaru and Selker 2001), a finding later extended to“higher” organisms (Jackson et al. 2002). Ciliated protozoa,such as Tetrahymena and Paramecium, commonly used inbiology laboratories as convenient microscopy specimens,facilitated important epigenetic discoveries because oftheir unique nuclear dimorphism. Each cell carries twonuclei: a somatic macronucleus that is transcriptionallyactive, and a germ-line micronucleus that is transcription-ally inactive. Using macronuclei as an enriched startingsource of “active” chromatin, the biochemical purificationof the first nuclear histone-modifying enzyme—a histoneacetyltransferase or HAT—was made (Brownell et al.1996). Ciliates are also well known for their peculiar phe-nomenon of programmed DNA elimination during theirsexual life cycle, triggered by small noncoding RNAs andhistone modifications (Chapter 7).

In multicellular organisms, genome size and organis-mal complexity generally increase from invertebrate(Caenorhabditis elegans, Drosophila melanogaster) or

plant (A. thaliana) species to “higher,” and to some,“morerelevant,” vertebrate organisms (mammals). Plants, how-ever, have been pivotal to the field of epigenetics, provid-ing a particularly rich source of epigenetic discoveries(Chapter 9) ranging from transposable elements andparamutation (McClintock 1951) to the first descriptionof noncoding RNAs involved in transcriptional silencing(Ratcliff et al. 1997). Crucial links between DNA methy-lation, histone modification, and components of theRNAi machinery came through plant studies. The discov-ery of plant epialleles, with comic names such as SUPER-MAN and KRYPTONITE (e.g., Jackson et al. 2002), andseveral vernalizing genes (Bastow et al. 2004; Sung andAmasino 2004) have further provided the research fieldwith insights into understanding the developmental roleof epigenetics and cellular memory. Plant meristem cellshave also offered the opportunity to study crucial ques-tions such as somatic regeneration and stem cell plasticity(see Chapters 9 and 11).

For understanding animal development, Drosophilahas been an early and continuous genetic powerhouse.Based on the pioneering work of Muller (1930), manydevelopmental mutations were generated, including thehomeotic transformations and position-effect variegation(PEV) mutants explained below (also see Chapter 5). Thehomeotic transformation mutants led to the idea that therecould be regulatory mechanisms for establishing and

epigenetic

model

organisms

Arabidopsis

S.cerevisiae

MATa

S.pombe

Tetrahymena

mic

mac

mic

mac

maize

Drosophila

mammalsXa Xi

Neurospora

C. elegans

Figure 2. Model Organisms Used in Epigenetic Research

Schematic representation of model organ-isms used in epigenetic research. S. cere-visiae: Mating-type switching to studyepigenetic chromatin control. S. pombe:Variegated gene silencing manifests ascolony sectoring. Neurospora crassa: Epige-netic genome defense systems includerepeat-induced point mutation, quelling,and meiotic silencing of unpaired DNA,revealing an interplay between RNAi path-ways, DNA and histone methylation.Tetrahymena: Chromatin in somatic andgerm-line nuclei are distinguished by epige-netically regulated mechanisms. Arabidop-sis: Model for repression by DNA, histone,and RNA-guided silencing mechanisms.Maize: Model for imprinting, paramutation,and transposon-induced gene silencing. C.elegans: Epigenetic regulation in the germline. Drosophila: Position-effect variegation(PEV) manifest by clonal patches of expres-sion and silencing of the white gene in theeye. Mammals: X-chromosome inactivation.

03_EPIGEN_p023-062.qxd 9/13/06 11:43 AM Page 27

Copyright 2007 Cold Spring Harbor Laboratory Press.

28 ■ C H A P T E R 3

maintaining cellular identity/memory which was latershown to be regulated by the Polycomb and trithorax sys-tems (see Chapters 11 and 12). For PEV, gene activity is dic-tated by the surrounding chromatin structure and not byprimary DNA sequence. This system has been a particu-larly informative source for dissecting factors involved inepigenetic control (Chapter 5). Over 100 suppressors ofvariegation [Su(var)] genes are believed to encode compo-nents of heterochromatin. Without the foundation estab-lished by these landmark studies, the discovery of the firsthistone lysine methyltransferases (HKMTs) (Rea et al.2000) and the resultant advances in histone lysine methy-lation would not have been possible. As is often the case inbiology, comparable screens have been carried out in fis-sion yeast and in plants, identifying silencing mutants withfunctional conservation with the Drosophila Su(var) genes.

The use of reverse genetics via RNAi libraries in thenematode worm C. elegans has contributed to our under-standing of epigenetic regulation in metazoan develop-ment. There, comprehensive cell-fate tracking studies,detailing all the developmental pathways of each cell, havehighlighted the fact that Polycomb and trithorax systemsprobably arose with the emergence of multicellularity(see Sections 12 and 13). In particular, these mechanismsof epigenetic control are essential for gene regulation inthe germ line (see Chapter 15).

The role of epigenetics in mammalian developmenthas mostly been elucidated in the mouse, although a num-ber of studies have been translated to diverse human celllines and primary cell cultures. The advent of gene“knock-out” and “knock-in” technologies has been instru-mental for the functional dissection of key epigenetic reg-ulators. For instance, the Dnmt1 DNA methyltransferasemutant mouse provided functional insight for the role ofDNA methylation in mammals (Li et al. 1992). It isembryonic-lethal and shows impaired imprinting (seeChapter 18). Disruption of DNA methylation has alsobeen shown to cause genomic instability and reanimationof transposon activity, particularly in germ cells (Walsh etal. 1998; Bourc’his and Bestor 2004). There are approxi-mately 100 characterized chromatin-regulating factors(i.e., histone and DNA-modifying enzymes, componentsof nucleosome remodeling complexes and of the RNAimachinery) that have been disrupted in the mouse. Themutant phenotypes affect cell proliferation, lineage com-mitment, stem cell plasticity, genomic stability, DNArepair, and chromosome segregation processes, in bothsomatic and germ cell lineages. Not surprisingly, most ofthese mutants are also involved in disease developmentand cancer. Thus, many of the key advances in epigenetic

control took advantage of unique biological featuresexhibited by many, if not all, of the above-mentionedmodel organisms. Without these biological processes andthe functional analyses (genetic and biochemical) thatdelved into them, many of the recent advances in epige-netic control would have remained elusive.

3 Defining Epigenetics

The above discussion begs the question, What is the com-mon thread that allows diverse eukaryotic organisms tobe connected with respect to fundamental epigeneticprinciples? Different epigenetic phenomena are linkedlargely by the fact that DNA is not “naked” in all organ-isms that maintain a true nucleus (eukaryotes). Instead,the DNA exists as an intimate complex with specializedproteins, which together comprise chromatin. In its sim-plest form, chromatin—i.e., DNA spooled around nucle-osomal units consisting of small histone proteins(Kornberg 1974)—was initially regarded as a passivepackaging molecule to wrap and organize the DNA. Dis-tinctive forms of chromatin arise, however, through anarray of covalent and non-covalent mechanisms that arebeing uncovered at a rapid pace (see Section 6). Thisincludes a plethora of posttranslational histone modifica-tions, energy-dependent chromatin-remodeling stepsthat mobilize or alter nucleosome structures, the dynamicshuffling of new histones (variants) in and out of nucleo-somes, and the targeting role of small noncoding RNAs.DNA itself can also be modified covalently in manyhigher eukaryotes, by methylation at the cytosine residue,usually but not always, of CpG dinucleotides. Together,these mechanisms provide a set of interrelated pathwaysthat all create variation in the chromatin polymer (Fig. 3).

Many, but not all, of these modifications and chro-matin changes are reversible and, therefore, are unlikely tobe propagated through the germ line. Transitory marks areattractive because they impose changes to the chromatintemplate in response to intrinsic and external stimuli(Jaenisch and Bird 2003), and in so doing, regulate theaccess and/or processivity of the transcriptional machin-ery, needed to “read” the underlying DNA template (Simset al. 2004; Chapter 10). Some histone modifications (likelysine methylation), methylated DNA regions, and alterednucleosome structures can, however, be stable through sev-eral cell divisions. This establishes “epigenetic states” ormeans of achieving cellular memory, which remain poorlyappreciated or understood. From this perspective, chro-matin “signatures” can be viewed as a highly organized sys-tem of information storage that can index distinct regions

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 28

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 29

of the genome and accommodate a response to environ-mental signals that dictate gene expression programs.

The significance of having a chromatin template thatcan potentiate the genetic information is that it providesmultidimensional layers to the readout of DNA. This isperhaps a necessity, given the vast size and complexity ofthe eukaryotic genome, particularly for multicellularorganisms (see Section 11 for further details). In suchorganisms, a fertilized egg progresses through develop-ment, starting with a single genome that becomes epige-netically programmed to generate a multitude of distinct“epigenomes” in more than 200 different types of cells(Fig. 4). This programmed variation has been proposed toconstitute an “epigenetic code” that significantly extendsthe information potential of the genetic code (Strahl andAllis 2000; Turner 2000; Jenuwein and Allis 2001).Although this is an attractive hypothesis, we stress thatmore work is needed to test this and related provocativetheories. Other alternative viewpoints are being advancedwhich argue that clear combinatorial “codes,” like thetriplet genetic code, are not likely in histones or are farfrom established (Schreiber and Bernstein 2002; Henikoff2005). Despite these uncertainties, we favor the generalview that a combination of covalent and non-covalentmechanisms will act to create chromatin states that can be

templated through cell division and development bymechanisms that are just beginning to be defined. Exactlyhow these altered chromatin states are faithfully propa-gated during DNA replication and mitosis remains one ofthe fundamental challenges of future studies.

The phenotypic alterations that occur from cell to cellduring the course of development in a multicellularorganism were described by Waddington as the “epige-netic landscape” (Waddington 1957). Yet the spectrum ofcells, from stem cells to fully differentiated cells, all shareidentical DNA sequences but differ remarkably in theprofile of genes that they actually express. With thisknowledge, epigenetics later came to be defined as the“Nuclear inheritance which is not based on differences inDNA sequence” (Holliday 1994).

Since the discovery of the DNA double helix and theearly explanations of epigenetics, our understanding ofepigenetic control and its underlying mechanisms hasgreatly increased, causing some to describe it in more loftyterms as a “field” rather than just “phenomena” (see Wolffeand Matzke 1999; Roloff and Nuber 2005; Chapter 1). Inthe past decade, considerable progress has been gainedregarding the many enzyme families that actively modifychromatin (see below). Thus, in today’s modern terms,epigenetics can be molecularly (mechanistically) definedas “The sum of the alterations to the chromatin templatethat collectively establish and propagate different patternsof gene expression (transcription) and silencing from thesame genome.”

4 The Chromatin Template

The nucleosome is the fundamental repeating unit ofchromatin (Kornberg 1974). On the one hand, the basicchromatin unit consists of a protein octamer containingtwo molecules of each canonical (or core) histone (H2A,H2B, H3, and H4), around which is wrapped 147 bp ofDNA. Detailed intermolecular interactions between thecore histones and the DNA were determined from land-mark studies leading to an atomic (2.8 Å) resolution X-ray picture of the nucleosome assembled from recom-binant parts (Fig. 5) (Luger et al. 1997). Higher-resolutionimages of mononucleosomes, as well as emerging higher-order structures (tetranucleosomes) (Schalch et al. 2005),continue to capture our attention, promising to betterexplain the physiologically relevant substrate upon whichmost, if not all, of the chromatin remodeling and tran-scriptional machinery operates.

The core histone proteins that make up the nucleo-some are small and highly basic. They are composed of a

Figure 3. Genetics Versus Epigenetics

GENETICS: Mutations (red stars) of the DNA template (green helix)are heritable somatically and through the germ line. EPIGENETICS:Variations in chromatin structure modulate the use of the genomeby (1) histone modifications (mod), (2) chromatin remodeling(remodeler), (3) histone variant composition (yellow nucleosome), (4)DNA methylation (Me), and (5) noncoding RNAs. Marks on thechromatin template may be heritable through cell division and col-lectively contribute to determining cellular phenotype.

mod

Me

alterationsmutations

GENETICS EPIGENETICS

inherited

germ line

species

stable?

soma

variability

remodeler

ncRNAs

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 29

Copyright 2007 Cold Spring Harbor Laboratory Press.

30 ■ C H A P T E R 3

globular domain and flexible (relatively unstructured)“histone tails,” which protrude from the surface of thenucleosome (Fig. 5). Based on amino acid sequence, his-tone proteins are highly conserved from yeast to humans.Such a high degree of conservation lends support to thegeneral view that these proteins, even the unstructuredtail domains, are likely to serve critical functions. Thetails, particularly of histones H3 and H4, in fact holdimportant clues to nucleosomal variability (and hencechromatin), as many of the residues are subject to exten-sive posttranslational modifications (see back end paperfor standard nomenclature used in this textbook andAppendix 2 for a listing of known histone modifications).

Acetylation and methylation of core histones, notablyH3 and H4, were among the first covalent modificationsto be described, and were long proposed to correlate withpositive and negative changes in transcriptional activity.Since the pioneering studies of Allfrey and coworkers(Allfrey et al. 1964), many types of covalent histone mod-ifications have been identified and characterized; theseinclude histone phosphorylation, ubiquitination, sumoy-lation, ADP-ribosylation, biotinylation, proline isomer-ization, and likely others that await description (Vaqueroet al. 2003). These modifications occur at specific sitesand residues, some of which are illustrated in Figure 6and listed in Appendix 2. Specific enzymes and enzymaticcomplexes, some of which are highlighted in the follow-

ing overview and individual chapters, catalyze these cova-lent markings. Because these lists will continue to grow inyears to come, our intent was to mention only individualmarks and enzymes that can illustrate what we feel areimportant general concepts and principles.

In certain chromatin regions, nucleosomes may con-tain histone variant proteins in place of a core (canonical)histone. Ongoing research is showing that this composi-tional difference contributes to marking regions of the

Figure 5. Nucleosome Structure

(Left) A 2.8 Å model of a nucleosome. (Right) A schematic represen-tation of histone organization within the octamer core aroundwhich the DNA (black line) is wrapped. Nucleosome formationoccurs first through the deposition of an H3/H4 tetramer on theDNA, followed by two sets of H2A/H2B dimers. Unstructured amino-terminal histone tails extrude from the nucleosome core, which con-sists of structured globular domains of the eight histone proteins.

H4

H4H4

H2BH2A

>25,000 genes identicalDNA sequence

>200 different cell types

organizedinformation

storedinformation

chromatin

DNA

1 g e n o m e

e p i g e n o m e s

mod

Me

remodeler

ncRNAs

stemcell committed

cell

cell A

cell C

cell B

Figure 4. DNA Versus Chromatin

The genome: Invariant DNA sequence(green double helix) of an individual. Theepigenome: The overall chromatin com-position, which indexes the entire genomein any given cell. It varies according to celltype, and response to internal and exter-nal signals it receives. (Lower panel)Epigenome diversification occurs duringdevelopment in multicellular organisms asdifferentiation proceeds from a single stemcell (the fertilized embryo) to more com-mitted cells. Reversal of differentiation ortransdifferentiation (blue lines) requires thereprogramming of the cell’s epigenome.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 30

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 31

chromosomes for specialized functions. Variant proteinsfor core histones H2A and H3 are currently known, butnone exists for histones H2B and H4. We suspect that his-tone variants, although often minor in terms of amountand accordingly more difficult to study, are bountiful inthe information they contain and essential to contribut-ing to epigenetic regulation (for more detail, see Section 8and Chapter 13).

5 Higher-Order Chromatin Organization

Chromatin, the DNA-nucleosome polymer, is a dynamicmolecule existing in many configurations. Historically,chromatin has been classified as either euchromatic orheterochromatic, stemming from the nuclear stainingpatterns of dyes used by cytologists to visualize DNA.Euchromatin is decondensed chromatin, although it may be transcriptionally active or inactive. Heterochro-matin can broadly be defined as highly compacted andsilenced chromatin. It may exist as permanently silentchromatin (constitutive heterochromatin), where geneswill rarely be expressed in any cell type of the organism,or repressed (facultative heterochromatin) in some cellsduring a specific cell cycle or developmental stage. Thus,there is a spectrum of chromatin states and a long-standing literature suggesting that chromatin is a highlydynamic macromolecular structure, prone to remodel-

ing and restructuring as it receives physiologically rele-vant input from upstream signaling pathways. Onlyrecently, however, has excellent progress been madeunraveling molecular mechanisms that govern theseremodeling steps.

The textbook, 11-nm “beads on a string” templaterepresents an active and largely “unfolded” interphaseconfiguration wherein DNA is periodically wrappedaround repeating units of nucleosomes (Fig. 7). The chro-matin fiber, however, is not always made up of regularlyspaced nucleosomal arrays. Nucleosomes may be irregu-larly packed and fold into higher-order structures that areonly beginning to be observed at atomic resolution (Kho-rasanizadeh 2004). Differential and higher-order chro-matin conformations occur in diverse regions of thegenome during cell-fate specification or in distinct stagesof the cell cycle (interphase versus mitotic chromatin).

The arrangement of nucleosomes on the 11-nm tem-plate can be altered by cis-effects and trans-effects of cova-lently modified histone tails (Fig. 8). cis-Effects are broughtabout by changes in the physical properties of modifiedhistone tails, such as a modulation in the electrostaticcharge or tail structure that, in turn, alters internucleoso-mal contacts. A well-known example, histone acetylation,has long been suspected to neutralize positive charges ofhighly basic histone tails, generating a localized expansionof the chromatin fiber, thereby enabling better access of

P

PP

Ac

Me

MeAc Ac Ac

Ac Ac Ac Ac

Ac Ac

Ac Ac

Me

MeAc MeMephosphorylation

acetylation

methylation

(arginine)

methylation

(active lysine)

methylation

(repressive lysine)

ubiquitylation

P

P

P

120

119

Figure 6. Sites of Histone Tail Modifications

The amino-terminal tails of histones account for a quarter of the nucleosome mass. They host the vast majority of knowncovalent modification sites as illustrated. Modifications do also occur in the globular domain (boxed), some of which areindicated. In general, active marks include acetylation (turquoise Ac flag), arginine methylation (yellow Me hexagon), andsome lysine methylation such as H3K4 and H3K36 (green Me hexagon). H3K79 in the globular domain has anti-silencingfunction. Repressive marks include H3K9, H3K27, and H4K20 (red Me hexagon), Green = active mark, red = repressive mark.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 31

Copyright 2007 Cold Spring Harbor Laboratory Press.

32 ■ C H A P T E R 3

transcription machinery to the DNA double helix. Phos-phorylation, through the addition of net negative charge,can generate “charge patches” (Dou and Gorovsky 2000)that are believed to alter nucleosome packaging or toexpose histone amino termini by altering the higher-orderfolded state of the chromatin polymer (Wei et al. 1999;Nowak and Corces 2004). In much the same way, linkerhistones (H1) are believed to promote the packaging ofhigher-order fibers by shielding the negative charge oflinker DNA between adjacent nucleosomes (Thomas 1999;Khochbin 2001; Harvey and Downs 2004; Kimmins andSassone-Corsi 2005). The addition of bulky adducts, suchas ubiquitin and ADP-ribose, may also induce differentarrangements of the histone tails and open up nucleosomearrays. The extent to which histone tails can induce chro-matin compaction through modification-dependent and -independent mechanisms is not clear.

Histone modifications may also elicit what we refer toas trans-effects by the recruitment of modification-bind-

ing partners to the chromatin. This can be viewed as“reading” a particular covalent histone mark in a con-text-dependent fashion. Certain binding partners have aparticular affinity and hence are known to “dock” ontospecific histone tails and often do so by serving as thechromatin “Velcro” for one polypeptide within a muchlarger enzymatic complex that needs to engage the chro-matin polymer. For instance, the bromodomain—amotif that recognizes acetylated histone residues—isoften, but not always, part of a histone acetyltransferase(HAT) enzyme that exists to acetylate target histones (seeFig. 10 in Section 7) as part of a larger chromatin-remod-eling complex (Dhalluin et al. 1999; Jacobson et al. 2000).Similarly, methylated lysine residues embedded in his-tone tails can be read by chromodomains (Bannister etal. 2001; Lachner et al. 2001; Nakayama et al. 2001) orsimilar domains (e.g., MBT, tudor) (Maurer-Stroh et al.2003; Kim et al. 2006) to facilitate downstream chro-matin-modulating events. In some cases, for instance,the association of chromodomain proteins precipitatesthe spreading of heterochromatin by the histone methyl-transferase (HKMT)-catalyzed methylation of adjacenthistones which can then be read by chromodomain pro-teins (Chapter 5).

Histone modifications of both the tail regions and theglobular core region (Cosgrove et al. 2004) can also targetATP-dependent remodeling complexes to the 11-nm fiberrequired for the transition from poised euchromatin to atranscriptionally active state. This mobilization of nucle-osomes may occur by octamer sliding, alteration of nucle-osome structure by DNA looping (for more detail, seeChapter 12) or replacement of specific core histones withhistone variants (Chapter 13). ATP-dependent chromatinremodelers (such as SWI/SNF, an historically importantexample) hydrolyze energy to bring about significantchanges in histone:DNA contacts, resulting in looping,twisting, and sliding of nucleosomes. These non-covalentmechanisms have been shown to be critically importantfor gene regulatory events (Narlikar et al. 2002) as much asthose involving covalent histone modifications (see Chap-ter 10). The finding that specific ATP-dependent remodel-ers can shuffle histone variants into and out of chromatinprovides a means to link cis, trans, and remodeling mech-anisms. Understanding, in turn, how these interconnectedmechanisms act in a concerted fashion to vary epigeneticstates in chromatin is far from complete.

More compact and repressive higher-order chromatinstructures (30-nm) can also be achieved through therecruitment of linker histone H1 and/or modification-dependent or “architectural” chromatin-associated factors

Figure 7. Higher-Order Structuring of Chromatin

The 11-nm fiber represents DNA wrapped around nucleosomes. The30-nm fiber is further compacted into an as-yet-unconfirmed struc-ture (illustrated as solenoid conformation here), involving linker his-tone H1. The 300–700-nm fiber represents dynamic higher-orderlooping that occurs in both interphase and metaphase chromatin. The1.5-μm condensed chromosome represents the most compactedform of chromatin that occurs only during nuclear division (mitosis ormeiosis). It is not yet clear how mitotic chromosome-banding patterns(i.e., G- or R-banding) correlate with particular chromatin structures.

nucleosomes

domain organization

length: 2 m

mitotic condensation

chromosome

11 nm

30 nm

300-700 nm

DNA

histone H1histone modifications

1.5 μmlength: 10 μm

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 32

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 33

such as heterochromatin protein 1 (HP1) or Polycomb(PC). Although it is commonly held that compaction ofnucleosomal chromatin (11-nm) into a 30-nm transcrip-tionally incompetent conformation is accomplished by theincorporation of linker histone H1 during interphase, thefunctional and structural dissection of this histone has,until recently, been difficult (Fan et al. 2005). One likelyproblem underlying these studies is the fact that histoneH1 occurs as different isoforms (~8 in mammals), makingit difficult to do detailed genetic analyses. Thus, there isredundancy between some H1 isoforms whereas othersmay hold tissue-specific functions (Kimmins and Sassone-Corsi 2005). Interestingly, H1 itself can be covalently mod-ified (phosphorylated, methylated, poly(ADP) ribosylated,etc.), raising the possibility that cis and trans mechanismscurrently being dissected on core histones may well extendto this important class of linker histone, and also to non-histone proteins (Sterner and Berger 2000).

Considerable debate has taken place over the details ofthe way in which the 30-nm chromatin fiber is organized.In general, either “solenoid” (one-start helix) models,wherein the nucleosomes are gradually coiled around acentral axis (6–8 nucleosomes/turn), or more open“zigzag” models, which adopt higher-order self-assem-blies (two-start helix), have been described. New evi-dence, including that collected from X-ray structure usinga model system containing four nucleosomes, suggests afiber arrangement more consistent with a two-start,zigzag arrangement of linker DNA connecting two stacksof nucleosome particles (Khorasanizadeh 2004; Schalchet al. 2005). Despite this progress, we note that linker his-

tone is not present in the current structures, and even if itwere present, the 30-nm chromatin fiber compacts theDNA only approximately 50-fold. Thus, considerablymore levels of higher-order chromatin organization existthat have yet to be resolved outside of light- and electron-microscopic examination, whether leading to interphaseor mitotic chromatin states. Despite structural uncertain-ties, recent results in living cells have now established theexistence of multiple levels of chromatin folding abovethe 30-nm fiber within interphase chromosomes. A note-worthy advance was the development of new approachesto label specific DNA sequences in live cells, making itpossible to study the dynamics of chromatin opening andclosing in vivo in real time. Interestingly, these resultsreveal a dynamic interplay of positive and negative chro-matin-remodeling factors in setting higher-order chro-matin structures for states more or less compatible withgene expression (Fisher and Merkenschlager 2002;Felsenfeld and Groudine 2003; Misteli 2004).

Organization into larger looped chromatin domains(300–700 nm) occurs, perhaps through anchoring thechromatin fiber to the nuclear periphery or other nuclearscaffolds via chromatin-associated proteins such asnuclear lamins. The extent to which these associationsgive rise to meaningful functional “chromosome territo-ries” remains unclear, but numerous reports are showingthat this concept deserves serious attention. For instance,clustering of multiple active chromatin sites to RNA poly-merase II (RNA pol II) transcription factors has beenobserved, and similar concepts seem to apply to the clus-tering around replicating DNA and DNA polymerase. In

modifyingenzymecis-effects

histonereplacement

trans-effects modifyingenzyme

nucleosomeremodeling

modbinder

histone variant

regular histone

modification binder

mod

mod

Figure 8. Transitions in the Chromatin Template(cis/trans)

cis-effects: A covalent modification of a histone tailresidue results in an altered structure or charge thatmanifests as a change in chromatin organization.trans-effects: The enzymatic modification of a his-tone tail residue (e.g., H3K9 methylation) results inan affinity for chromatin-associated protein (modbinder, e.g., HP1). The association of a mod binder(or associated protein complexes) causes down-stream alterations in chromatin structure. Histonereplacement: A covalent histone modification (orother stimulus) can signal the replacement of acore histone with a histone variant through anucleosome-remodeling exchanger complex.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 33

Copyright 2007 Cold Spring Harbor Laboratory Press.

34 ■ C H A P T E R 3

contrast, clustering of “silent” heterochromatin (particu-larly pericentromeric foci) and genes localized in transhas also been documented (see Chapters 4 and 21). Howthese associations are controlled and the extent to whichnuclear localization of chromatin domains affectsgenome regulation are not yet clear. There is, nonetheless,an increasing body of evidence showing correlations of anactive or silent chromatin configuration with a particularnuclear territory (Cremer and Cremer 2001; Gilbert et al.2004; Janicki et al. 2004; Chakalova et al. 2005).

The most condensed DNA structure is observed dur-ing the metaphase stage of mitosis or meiosis. This per-mits the faithful segregation of exact copies of our genome(one or two copies of each chromosome, depending onthe division at hand), via chromosomes, to each daughtercell. This condensation involves a dramatic restructuringof the DNA from a 2-m molecule when fully extended,into discrete chromosomes measuring on average 1.5 μmin diameter (Fig. 7). This is no less than a 10,000-foldcompaction and is achieved by the hyperphosphorylationof linker (H1) and core histone H3, and the ATP-depend-ent action of the condensin and cohesin complexes, andtopoisomerase II. Exactly how non-histone complexesengage mitotic chromatin (or M-phase chromatin modifi-cations), and what rules dictate their association andrelease from chromatin in a cell-cycle-regulated fashion,remain to be determined (Bernard et al. 2001; Watanabe etal. 2001). Here, the well-known mitotic phosphorylationof histone H3 (i.e., serines 10 and 28) and members of theH1 family may provide important clues, but genetic andbiochemical experiments have yet to yield full insights intowhat the function of these mitotic marks is. Interestingly,a formal theory has been proposed that specific methyla-tion marks, when paired with more dynamic andreversible phosphorylation marks, may act as a “binaryswitch” in histone proteins, governing the binding andrelease of downstream effectors that engage the chromatintemplate (Fischle et al. 2003a). Using HP1 binding tomethylated histone H3 on lysine 9 (H3K9me) and mitoticserine 10 phosphorylation (H3S10ph) as a paradigm, evi-dence in support of a mitotic “methyl/phos switch” hasrecently been provided (Daujat et al. 2005; Fischle et al.2005; Hirota et al. 2005).

Specialized chromosomal domains, such as telomeresand centromeres, serve distinct functions dedicated toproper chromosome dynamics. Telomeres act as chromo-somal ends, providing protection and unique solutions tohow the very ends of DNA molecules are replicated. Cen-tromeres provide an attachment anchor for spindle micro-tubules during nuclear division. Both of these specialized

domains have a fundamental role in the events that lead tofaithful chromosome segregation. Interestingly, bothtelomeric and centromeric heterochromatin is distin-guishable from euchromatin, and even other heterochro-matic regions (see below), by the presence of uniquechromatin structures that are largely repressive for geneactivity and recombination. Moving expressed genes fromtheir normal positions in euchromatin to new positions ator near centromeric and telomeric heterochromatin (seeChapters 4–6) can silence these genes, giving rise to pow-erful screens described earlier that sought to identify sup-pressors or enhancers of position-effect variegation (PEV)or telomere-position effects (TPE; Gottschling et al. 1990;Aparicio et al. 1991). Centromeres and telomeres havemolecular signatures that include, for example, hypo-acetylated histones. Interestingly, centromeres are also“marked” by the presence of the histone variant CENP-A,which plays an active role in chromosome segregation(Chapter 14). Thus, the proper assembly and maintenanceof distinct centromeric and pericentromeric heterochro-matin is critical for the completion of mitosis or meiosis,and hence, cellular viability. In addition to the well-stud-ied centromeric and pericentromeric forms of constitutiveheterochromatin, progress is being made into mechanismsof epigenetic control for centromeric (and telomeric)“identity.” Clever experiments have shown that “neocen-tromeres” can function in place of normal centromeres,demonstrating that DNA sequences do not dictate theidentity of centromeres (Chapters 13 and 14). Instead, epi-genetic hallmarks, including centromere-specific modifi-cation patterns and histone variants, mark this specializedchromosomal domain. Considerable progress is beingmade into how other coding, noncoding, and repetitiveregions of chromatin contribute to these epigenetic signa-tures. How any of these mechanisms relate, if at all, tochromosomal banding patterns is not known, but remainsan intriguing possibility. Achieving an understanding ofthe epigenetic regulation of these portions of unique chro-mosomal regions is needed, highlighted by the fact thatnumerous human cancers are characterized by genomicinstability, which is a hallmark of certain disease progres-sion and neoplasia.

6 The Distinction between Euchromatin and Heterochromatin

This overview has been divided into discussions ofeuchromatin and heterochromatin, although weacknowledge that multiple forms of both classes of chro-matins exist. Euchromatin, or “active” chromatin, consists

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 34

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 35

largely of coding sequences, which only account for asmall fraction (less than 4%) of the genome in mammals.What molecular signals then mark coding sequenceswith the potential for productive transcription, and howdoes chromatin structure contribute to the process? Anextensive literature has suggested that euchromatin existsin an “open” (decompacted), more nuclease-sensitiveconfiguration, making it “poised” for gene expression,although not necessarily transcriptionally active. Someof the genes are ubiquitously expressed (housekeepinggenes); others are developmentally regulated or stress-induced in response to environmental cues. The cooper-ation of selected cis-acting DNA sequences (promoters,enhancers, and locus control regions), bound by combi-nations of trans-acting factors, triggers gene transcriptionin concert with RNA polymerase and associated factors(Sims et al. 2004). Together these factors have been highlyselected during evolution to orchestrate an elaborateseries of biochemical reactions that must occur in theappropriate spatial and temporal setting. Does chromatinprovide an “indexing system” which better ensures thatthe above machinery can access its target sequences in theappropriate cell type?

At the DNA level, the AT-rich vicinity of promoters isoften devoid of nucleosomes and may exist in a rigidnoncanonical B-form DNA configuration, promotingtranscription factor (TF) occupancy (Mito et al. 2005;Sekinger et al. 2005). However, TF occupancy is notenough to ensure transcription. The recruitment ofnucleosome-remodeling machines, through the induc-tion of activating histone modifications (e.g., acetylationand H3K4 methylation), facilitates the engagement ofthe transcription machinery by pathways that are cur-rently being defined (Fig. 9 and Chapter 10). Exchange ofdisplaced histones with histone variants after the tran-scription machinery has unraveled and transcribed thechromatin fiber ensures integrity of the chromatin tem-plate (Ahmad and Henikoff 2002). Achieving fullymature mRNAs, however, also requires posttranscrip-tional processes involving splicing, polyadenylation, andnuclear export. Thus, the collective term “euchromatin”likely represents a complex chromatin state(s) thatencompasses a dynamic and elaborate mixture of dedi-cated machines that interact together and closely withthe chromatin fiber to bring about the transcription offunctional RNAs. Learning the “rules” as to how, in themost general sense, the “activating machinery” interactswith the transcription apparatus as well as the chromatintemplate is an exciting area of current research, althoughdue to its dynamic nature, it may not strictly classify as

epigenetics, but more as transcription and chromatindynamics studies.

What then defines “heterochromatin?” Although it ishistorically less well studied than euchromatin, newinsights suggest that heterochromatin plays a criticallyimportant role in the organization and proper functioningof genomes from yeast to humans (although S. cerevisiaehas a distinct form of heterochromatin). Underscoring itspotential importance is the fact that 96% of the mam-malian genome consists of noncoding and repetitivesequences. New mechanistic insights, underlying the for-mation of heterochromatin, have revealed unexpectedfindings. For example, non-sequence-specific transcrip-tion, which produces double-stranded RNA (dsRNA), issubject to silencing by an RNA interference (RNAi)-likemechanism (see Section 10 below). The production of suchdsRNAs acts as an “alarm signal” reflecting the fact that theunderlying DNA sequence cannot generate a functionalproduct, or has been invaded by RNA transposons orviruses. The dsRNA is then processed by Dicer and targetedto chromatin by complexes dedicated to initiating a cas-cade of events leading to the formation of heterochro-matin. Using a variety of model systems, remarkableprogress has been made dissecting what appears to be ahighly conserved pathway leading to a heterochromatin“locked-down” state. Although the exact order and detailsmay vary, this general pathway involves histone taildeacetylation, methylation of specific lysine residues (e.g.,H3K9), recruitment of heterochromatin-associated pro-teins (e.g., HP1), and establishment of DNA methylation(Fig. 9). It is likely that sequestering of selective genomicregions to repressive nuclear domains or territories mayenhance heterochromatin formation. Interestingly, in-creasing evidence suggests that heterochromatin may bethe “default state,” at least in higher organisms, and that thepresence of a strong promoter or enhancer, producing aproductive transcript, can override heterochromatin. Evenin lower eukaryotes, the general concepts of heterochro-matin assembly seem to apply. Hallmark features includehypoacetylated histone tails, followed by the binding ofacetylation-sensitive heterochromatin proteins (e.g., SIRproteins; for details, see Chapter 4). Depending on the fun-gal species (e.g., budding vs. fission yeast), varyingamounts of histone methylation and HP1-like proteinsexist. Even though these genomes are more set to a generaldefault state of being poised for transcription, some hete-rochromatin-like genomic regions are present (matingloci, telomeres, centromeres, etc.) that are able to suppressgene transcription and genetic recombination when testgenes are placed in these new neighborhoods.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 35

Copyright 2007 Cold Spring Harbor Laboratory Press.

36 ■ C H A P T E R 3

What useful functions might heterochromatin serve?The definition of centromeres, a region of constitutive het-erochromatin, correlates well with a heritable epigeneticstate and is thought to be evolutionarily driven by thelargest clustering of repeats and repetitive elements on achromosome. This partitioning ensures large and relativelystable heterochromatic domains marked by repressive “epi-genetic signatures,” facilitating chromosome segregationduring mitosis and meiosis (Chapter 14). Here, it is note-worthy that centromeric repeats and the correspondingepigenetic marks that associate with them have been dupli-cated and moved onto other chromosome arms to create“silencing domains” in organisms such as fission yeast.Constitutive heterochromatin at telomeres (the protectiveends of chromosomes) similarly ensures stability of thegenome by serving as chromosomal “caps.” Last, hete-rochromatin formation is known to be a defense mecha-nism against invading DNA. Collectively, these findingsunderscore a general view that heterochromatin servesimportant genome maintenance functions which may rivaleven that of euchromatin itself.

In summary, the broad functional distinction betweeneuchromatin and heterochromatin can thus far be attrib-uted to three known characteristics of chromatin. First isthe nature of the DNA sequence—e.g., whether it con-tains AT-rich “rigid” DNA around promoters, repetitivesequences and/or repressor-binding sequences that signalfactor association. Second, the quality of the RNA pro-duced during transcription determines whether it is fullyprocessed into an mRNA that can be translated, orwhether the RNA is degraded or earmarked for use by theRNAi machinery to target heterochromatinization. Third,spatial organization within the nucleus can play a signifi-cant sequestering role for the maintenance of localizedchromatin configurations.

7 Histone Modifications and the Histone Code

We have explored how histone modifications may changethe chromatin template by cis-effects that alter internu-cleosomal contacts and spacing, or the trans-effectscaused by histone and non-histone protein associations

nucleosomeremodeling

AcAc

i d

restricted

information

accessible

information

heterochromatineuchromatin

transcription factorbinding

remodeling complexrecruitment

activating histonemodifications

histone variants

noncodingdsRNAs

RITS complexrecruitment

repressive histonemethylation

DNA methylation

messenger RNA

gene

t r a n s c r i p t i o n u n i t( g e n e )

D N A r e p e a t s( n o n c o d i n g )

ncRNAs

Me

TF Me MeMe

Me Me MeHP1HP1Me

MeMe

MeMeMe

Figure 9. Distinction between Euchromatic and HeterochromaticDomains

Summary of common differences betweeneuchromatin and constitutive heterochro-matin. This includes differences in the typeof transcripts produced, recruitment ofDNA-binding proteins (i.e., transcriptionfactor [TF]), chromatin-associated proteinsand complexes, covalent histone modifi-cations, and histone variant composition.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 36

Copyright 2007 Cold Spring Harbor Laboratory Press.

O V E R V I E W A N D C O N C E P T S ■ 37

with the template. What is the contribution and biologi-cal output of histone modifications? Patterns of chro-matin structure that correlate with histone tailmodifications have emerged from studies using bulk his-tones, suggesting that epigenetic marks may provide“ON” (i.e., active) or “OFF” (inactive) signatures. This hascome through a long history of mostly correlative studiesshowing that certain histone modifications, notably his-tone acetylation, are associated with active chromatindomains or regions that are generally permissive for tran-scription. In contrast, other marks, such as certain phos-phorylated histone residues, have long been associatedwith condensed chromatin that, in general, fails to sup-port transcriptional activity. The histone modificationsshown in Appendix 2 summarize the sites of modificationthat are known at this time. Here, we stress that thesereflect modifications and sites that may well not be exhib-ited by every organism.

How are histone modifications established orremoved in the first place? A wealth of work in the chro-matin field has suggested that histone tail modificationsare established (“written”) or removed (“erased”) by thecatalytic action of chromatin-associated enzymatic sys-tems. However, the identity of these enzymes eludedresearchers for years. Over the last decade, a remarkablylarge number of chromatin-modifying enzymes havebeen identified from many sources, most of which arecompiled in Appendix 2. This has been achieved throughnumerous biochemical and genetic studies. The enzymesoften reside in large multi-subunit complexes that cancatalyze the incorporation or removal of covalent modifi-cations from both histone and non-histone targets. More-over, many of these enzymes catalyze their reactions withremarkable specificity to target residue and cellular con-text (i.e., dependent on external or intrinsic signals). Forclarity, and by way of example, we discuss briefly the fourmajor enzymatic systems that catalyze histone modifica-tions, together with their counterpart enzymatic systemsthat reverse the modifications (Fig. 10) (Vaquero et al.2003; Holbert and Marmorstein 2005). Together, theseantagonistic activities govern the steady-state balance ofeach modification in question.

Histone acetylases (HATs) acetylate specific lysineresidues in histone substrates (Roth et al. 2001) and arereversed by the action of histone deacetylases (HDACs)(Grozinger and Schreiber 2002). The histone kinase fam-ily of enzymes phosphorylate specific serine or threonineresidues, and the phosphatases (PPTases) remove phos-phorylation marks. Particularly well known are themitotic kinases, such as cyclin-dependent kinase or

aurora kinase, which catalyze the phosphorylation of core(H3) and linker (H1) histones. Less clear in each case arethe opposing PPTases that act to reverse these phosphory-lations as cells exit mitosis.

Two general classes of methylating enzymes have beendescribed: the PRMTs (protein arginine methyltrans-ferases) whose substrate is arginine (Lee et al. 2005), andthe HKMTs (histone lysine methyltransferases) that acton lysine residues (Lachner et al. 2003). Arginine methy-lation is indirectly reversed by the action of deiminases,which convert methyl-arginine (or arginine) to a cit-rulline residue (Bannister and Kouzarides 2005). Methyl-ated lysine residues appear to be chemically more stable.Lysine methylation has been shown to be present inmono-, di-, or tri-methylated states. Several tri-methy-lated residues in the H3 and H4 animo termini appear tohave the potential to be stably propagated during celldivisions (Lachner et al. 2004), as well as the H4K20me1mark in Drosophila imaginal discs (Reinberg et al. 2004).Recently, a lysine-specific “demethylase” (LSD1) wasdescribed as an amine oxidase that is able to removeH3K4 methylation (Shi et al. 2004). The enzyme acts byFAD-dependent oxidative destabilization of the amino-methyl bond, resulting in the formation of unmodifiedlysine and formaldehyde. LSD1 was shown to be selectivefor the activating H3K4 methylation mark and can onlydestabilize mono- and di-, but not tri-methylation. Thisdemethylase is part of a large repressive protein complexthat also contains HDACs and other enzymes. Other evi-dence suggests that LSD1 can associate in a complextogether with the androgen receptor at target loci anddemethylate the H3K9me2 repressive histone mark tocontribute to transcriptional activation (Metzger et al.2005). A different class of histone demethylases has beencharacterized to work via a more potent mechanism—radical attack—known as hydroxylases or dioxygenases(Tsukada et al. 2006). One of these only destabilizesH3K36me2 (an active mark), but not in the tri-methylstate. This novel jumonji histone demethylase (JHDM1)contains the conserved jumonji domain, of which thereare around 30 known in the mammalian genome, sug-gesting that some of these enzymes may also be able toattack other residues as well as a tri-methyl state (Fodoret al 2006; Whetstine et al 2006).

Considerable progress has been made in dissecting theenzyme systems that govern the steady-state balance ofthese modifications, and we suspect that much moreprogress will be made in this exciting area. It remains achallenge to understand how these enzyme complexes areregulated and how their physiologically relevant substrates

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 37

Copyright 2007 Cold Spring Harbor Laboratory Press.

38 ■ C H A P T E R 3

and sites are targeted. In addition, it remains unclear howcovalent mechanisms affect epigenetic phenomena.

Histone modifications do not occur in isolation, butrather in a combinatorial manner as proposed for modi-fication cassettes (i.e., covalent modifications in adjacentresidues of a particular histone tail, e.g., H3K9me andH3S10ph or H4S1ph, H4R3me, and H4K4ac) and transhistone pathways (covalent modifications between differ-ent histone tails or nucleosomes; see Fig. 11). Intriguingly,almost all of the known histone modifications correlatewith activating or repressive function, dependent onwhich amino acid residue(s) in the histone amino terminiis modified. Both synergistic and antagonistic pathwayshave been described (Zhang and Reinberg 2001; Berger2002; Fischle et al. 2003b) that can progressively inducecombinations of active marks, while simultaneouslycounteracting repressive modifications. It is, however, notknown how many distinct combinations of modificationsacross the various amino-terminal histone positions existfor any given nucleosome, because most of the studieshave been carried out on bulk histone preparations. Inaddition to the amino termini, modifications in the glob-ular histone fold domains have recently been shown toaffect chromatin structure and assembly (Cosgrove et al.2004), thereby influencing gene expression and DNAdamage repair (van Attikum and Gasser 2005; Vidanes etal. 2005). It is also worth noting that several of the his-tone-modifying enzymes also target non-histone sub-strates (Sterner and Berger 2000; Chuikov et al. 2004).Figure 11 illustrates two examples of established hierar-chies of histone modifications that seem to index tran-

scription of active chromatin or, in contrast, pattern het-erochromatic domains.

These studies provoke the question of whether there isa “histone code” or even an “epigenetic code.” Althoughthis theoretical concept has been highly stimulating, andhas been shown to be correct in some of its predictions,the issue as to whether a code actually exists has remainedlargely open. As a comparison, the genetic code hasproven extremely useful, because of its predictability andnear universality. It uses for the most part a four-base“alphabet” in the DNA (i.e., nucleotides), forming what isgenerally an invariant and nearly universal language. Incontrast, current evidence suggests that histone-modifi-cation patterns are likely to vary considerably from oneorganism to the next, especially between lower and highereukaryotes, such as yeast and humans. Thus, even if a his-tone code exists, it is not likely to be universal. This situ-ation is made considerably more complicated when oneconsiders the dynamic nature of histone modifications,varying in space and in time. Furthermore, the chromatintemplate engages a staggering array of remodeling factors(Vignali et al. 2000; Narlikar et al. 2002; Langst andBecker 2004; Smith and Peterson 2005). However, chro-matin immunoprecipitation assays (ChIP), when exam-ined on genome-wide levels (ChIP on chip), have begunto decipher nonrandom and somewhat predictable pat-terns in several genomes (e.g., S. pombe, A. thaliana,mammalian cells), such as strong correlations ofH3K4me3 with activated promoter regions (Strahl et al.1999; Santos-Rosa et al. 2002; Bernstein et al. 2005) andof H3K9 (Hall et al. 2002; Lippman et al. 2004; Martens et

Me Me

bromo

Ac Ac

bromo

P

chromo? ? ?

Me

HAT kinase PRMT HKMT

? ? ?

(lysine) (serine, threonine) (arginine) (lysine)

HDAC PPTase deiminase amine oxidase(citrulline) hydroxylase

Figure 10. Histone-modifying Enzymes

Covalent histone modifications are transduced by histone-modifying enzymes (“writers”) and removed by antagonizingactivities. They are classified into families according to the type of enzymatic action (e.g., acetylation or phosphorylation).Protein domains with specific affinity for a histone tail modification are termed “readers.” (HAT) Histone acetyltransferase;(PRMT) protein arginine methyltransferase; (HKMT) histone lysine methyltransferase; (HDAC) histone deacetylase; (PPTase)protein phosphatases; (Ac) acetylation; (P) phosphorylation; (Me) methylation.

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 38

Copyright 2007 Cold Spring Harbor Laboratory Press.

al. 2005) and H3K27 (Litt et al. 2001; Ringrose et al. 2004)methylation with silenced heterochromatin. Perhaps thelimitation of the histone code is that one modificationdoes not invariantly translate to one biological output.However, modifications combinatorially or cumulativelydo appear to define and contribute to biological functions(Henikoff 2005).

8 Chromatin-remodeling Complexes and Histone Variants

Another major mechanism by which transitions in thechromatin template are induced is by signaling therecruitment of chromatin “remodeling” complexes thatuse energy (ATP-hydrolysis) to change chromatin andnucleosome composition in a non-covalent manner.Nucleosomes, particularly when bound by repressivechromatin-associated factors, often impose an intrinsicinhibition to the transcription machinery. Hence, onlysome sequence-specific transcription factors and regula-tors (although not the basal transcription machinery) areable to gain access to their binding site(s). This accessibil-ity problem is solved, in part, by protein complexes that

mobilize nucleosomes and/or alter nucleosomal struc-ture. Chromatin-remodeling activities often work in con-cert with activating chromatin-modifying enzymes andcan generally be categorized into two families: the SNF2Hor ISWI, and the Brahma or SWI/SNF family. TheSNF2H/ISWI family mobilizes nucleosomes along theDNA (Tsukiyama et al. 1995; Varga-Weisz et al. 1997),whereas Brahma/SWI/SNF transiently alter the structureof the nucleosome, thereby exposing DNA:histone con-tacts in ways that are currently being unraveled (seeChapter 12).

Additionally, some of the ATP-hydrolyzing activitiesresemble “exchanger complexes” that are themselves ded-icated to the replacement of conventional core histoneswith specialized histone “variant” proteins. This ATP-costing shuffle may actually be a means by which existingmodified histone tails are replaced with a clean slate ofvariant histones (Schwartz and Ahmad 2005). Alterna-tively, recruitment of chromatin-remodeling complexes,such as SAGA (Spt-Ada-Gcn5-acetyltransferase) can alsobe enhanced by preexisting histone modifications toensure transcriptional competence of targeted promoters(Grant et al. 1997; Hassan et al. 2002).

O V E R V I E W A N D C O N C E P T S ■ 39

Figure 11. Coordinated Modification ofChromatin

The transition of a naïve chromatin tem-plate to active euchromatin (left) or theestablishment of repressive heterochro-matin (right), involving a series of coordi-nated chromatin modifications. In the caseof transcriptional activation, this is accom-panied by the action of nucleosome-remod-eling complexes and the replacement ofcore histones with histone variants (yellow,namely H3.3).

P

H2BK123ub

H3K79me

H3K4me H3K9me

H3K9ac

H3R17me

H4K16ac H4K20me

HP1HP1

HP1

HAT

H3K9ac H3K9me

H3S10phHP1

H4K20meH4K16ac

DNA methylation

a c t i v e re p re s s e d

bromo

S I G N A L S

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 39

Copyright 2007 Cold Spring Harbor Laboratory Press.

40 ■ C H A P T E R 3

In addition to transcriptional initiation and estab-lishing the primary contact with a promoter region, thepassage of RNA pol II (or of RNA pol I) during tran-scriptional elongation is further obstructed by the pres-ence of nucleosomes. Mechanisms are thereforerequired to ensure the completion of nascent transcripts(particularly of long genes). In particular, a series of his-tone modifications and docking effectors act in concertwith chromatin-remodeling complexes such as SAGAand FACT (for facilitate chromatin transcription)(Orphanides et al. 1998) to allow RNA pol II passagethrough nucleosomal arrays. These concerted activitieswill, for example, induce increased nucleosomal mobil-ity, displace H2A/H2B dimers, and promote theexchange of core histones with histone variants. As such,they provide an excellent example of the close interplaybetween histone modifications, chromatin remodeling,and histone variant exchange to facilitate transcriptionalinitiation and elongation (Sims et al. 2004). Otherremodeling complexes have also been characterized, suchas Mi-2 (Zhang et al. 1998; Wade et al. 1999) and INO-80 (Shen et al. 2000), which are involved in stabilizingrepressed rather than active chromatin.

Compositional differences of the chromatin fiber thatoccur through the presence of histone variants contributeto the indexing of chromosome regions for specializedfunctions. Each histone variant represents a substitute fora particular core histone (Fig. 12), although histone vari-ants are often a minor proportion of the bulk histone con-tent, and thus more difficult to study than regularhistones. An increasing body of literature (for review, seeHenikoff and Ahmad 2005; Sarma and Reinberg 2005)documents that histone variants have their own pattern ofsusceptibility to modifications, likely specified by the smallnumber of amino acid changes that distinguish them fromtheir family members. On the other hand, some histonevariants have distinct amino- and carboxy-terminaldomains with unique chromatin-regulating activity anddifferent affinities to binding factors. By way of example,transcriptionally active genes have general histone H3exchanged by the H3.3 variant, in a transcription-coupledmechanism that does not require DNA replication(Ahmad and Henikoff 2002). The replacement of corehistone H2A with the H2A.Z variant correlates with tran-scriptional activity and can index the 5′ end of nucleo-some-free promoters. However, H2A.Z has also beenassociated with repressed chromatin. CENP-A, the cen-tromere-specific H3 variant, is essential for centromericfunction and hence chromosome segregation. H2A.X,together with other histone marks, is associated with

sensing DNA damage and appears to index a DNA lesionfor recruitment of DNA repair complexes. MacroH2A is ahistone variant that specifically associates with the inactiveX chromosome (Xi) in mammals (for more details on his-tone variants, see Chapter 13).

Importantly, and in contrast to the commonly heldtextbook notion that histones are synthesized anddeposited only during S phase, synthesis and substitutionof many of these histone variants occurs independently ofDNA replication. Hence, the replacement of core histonesby histone variants is not restricted to cell cycle stages (i.e.,S phase), but can take immediate effect in response toongoing mechanisms (e.g., transcriptional activity orkinetochore tension during cell division) or stress signals(e.g., DNA damage or nutrient starvation). Elegant bio-chemical studies have documented chromatin remodelingor exchanger complexes that are specific for replacementof distinct histone variants, such as H3.3, H2A.Z, orH2A.X (Cairns 2005; Henikoff and Ahmad 2005; Sarmaand Reinberg 2005). For instance, replacement of H3 withthe H3.3 variant occurs via the action of the HIRA (his-tone regulator A) exchanger complex (Tagami et al. 2004),and H2A is replaced by H2A.Z through the activity of theSWR1 (Swi2/Snf2-related ATPase 1) exchanger complex

Figure 12. Histone Variants

Protein domain structure for the core histones (H3, H4, H2A,H2B), linker histone H1, and variants of histones H3 and H2A. Thehistone fold domain (HFD) where histone dimerization occurs, andregions of the protein that differ in histone variants (shown in red)are indicated.

histone H3

histone H4

histone H2A

histone H2B

histone H1

H3

H3.3

CenpA

H4

H2A

H2A.Z

H2A.X

macroH2A

H2B

(7 isoforms)

HFD

HFD

HFD

HFD

HFD

HFD

HFD

HFD

H15

MACRO

135

135

133

102

129

127

142

355

125

179-223

HFD

03_EPIGEN_p023-062.qxd 9/12/06 11:57 AM Page 40

Copyright 2007 Cold Spring Harbor Laboratory Press.