epigenome analyses using bac microarrays identify evolutionary conservation of tissue-specific...

7
Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3 Tsui-Ting Ching 1,6 , Alika K Maunakea 1,6 , Peter Jun 1 , Chibo Hong 1 , Giuseppe Zardo 1 , Daniel Pinkel 2 , Donna G Albertson 3 , Jane Fridlyand 4 , Jian-Hua Mao 3 , Ksenya Shchors 3 , William A Weiss 5 & Joseph F Costello 1 CpG islands are present in one-half of all human and mouse genes and typically overlap with promoters or exons. We developed a method for high-resolution analysis of the methylation status of CpG islands genome-wide, using arrays of BAC clones and the methylation-sensitive restriction enzyme NotI. Here we demonstrate the accuracy and specificity of the method. By computationally mapping all NotI sites, methylation events can be defined with single-nucleotide precision throughout the genome. We also demonstrate the unique expandability of the array method using a different methylation-sensitive restriction enzyme, BssHII. We identified and validated new CpG island loci that are methylated in a tissue-specific manner in normal human tissues. The methylation status of the CpG islands is associated with gene expression for several genes, including SHANK3, which encodes a structural protein in neuronal postsynaptic densities. Defects in SHANK3 seem to underlie human 22q13 deletion syndrome. Furthermore, these patterns for SHANK3 are conserved in mice and rats. DNA methylation is essential for a properly functioning genome 1–6 . DNA methylation involves transfer of a methyl group to cytosine in a CpG dinucleotide by DNA methyltransferases that create (DNMT3a, DNMT3b) or maintain (DNMT1) methylation patterns. CpGs are five times more abundant in CpG islands than in the rest of the genome 7–9 . CpG islands are B200 bp to several kilobases in length and nearly always encompass gene promoters or exons 9–11 . CpG islands asso- ciated with imprinted genes 12 , genes on the inactive X chromosome in females 13 and some tissue-specific genes can be differentially methy- lated 14–18 , whereas many CpG islands are thought to be unmethylated in all normal tissues. The methylation status of most of the estimated 30,000 CpG islands 19–21 has not yet been assessed in multiple tissues. Array comparative genomic hybridization is a quantitative, high- resolution method for measuring copy number across the genome 22–25 . The arrays consist of thousands of BAC clones distributed across the genome and are capable of accurately distinguishing single-copy gain and loss in cancer, as well as copy-number polymorphisms in normal tissue 26 . Because many BAC clones contain CpG islands, we propose that these arrays are ideally suited for large-scale CpG island methyla- tion analyses. We designed a method to use methylation-sensitive restriction enzymes and BAC clone arrays to determine the methyla- tion status of CpG islands genome-wide in different tissues. The method is applicable to any organism for which BAC arrays have been developed and to normal and disease tissue. The method can also be expanded to assess many methylation-sensitive restriction enzyme sites and can be applied to updated tiling path arrays 27 . We demon- strate the utility of the array by identifying new CpG island loci that are subject to tissue-specific methylation and gene expression in human tissue. Using a comparative epigenomics approach, we also show that the differential methylation and expression of one of the genes is conserved in rodents. RESULTS Methylation analysis using BAC microarrays We designed a method to use arrays of BAC clones 22,23,25 and methylation-sensitive enzymes for analysis of CpG island methylation status (Fig. 1a). To assess the methylation status of CpG islands, we used NotI sites as indicators, because its endonuclease activity is inhibited by methylation and 94% of NotI sites are within CpG islands 28,29 . Using a custom Perl script, we estimate that 2,500 NotI sites can be analyzed using this prototype methylation array. To prove that the signal intensities across the array are from unmethylated NotI fragments, we compared a self-self hybridization to a hybridization of peripheral blood lymphocyte (PBL) DNA that was first methylated in vitro and cohybridized with untreated PBL DNA. Following in vitro methylation, the sample was spiked with DNA from two unmethylated BAC clones that contain NotI sites. We then prepared the methylated genomic DNA–unmethylated BAC mixture (Fig. 1a) and labeled it with Cy3. For the self-self hybridiza- tion, equivalent amounts of Cy3 (red) and Cy5 (green) signal resulted Published online 15 May 2005; doi:10.1038/ng1563 1 The Brain Tumor Research Center, Department of Neurological Surgery and the Biomedical Sciences Program, 2 Department of Laboratory Medicine, 3 Cancer Research Institute, 4 Department of Epidemiology and Biostatistics and 5 Department of Neurology, University of California San Francisco, San Franciso, California 94143, USA. 6 These authors contributed equally to this work. Correspondence should be addressed to J.F.C. ([email protected]). NATURE GENETICS VOLUME 37 [ NUMBER 6 [ JUNE 2005 645 TECHNICAL REPORTS © 2005 Nature Publishing Group http://www.nature.com/naturegenetics

Upload: ksenya

Post on 21-Jul-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

Epigenome analyses using BAC microarrays identifyevolutionary conservation of tissue-specific methylationof SHANK3Tsui-Ting Ching1,6, Alika K Maunakea1,6, Peter Jun1, Chibo Hong1, Giuseppe Zardo1, Daniel Pinkel2,Donna G Albertson3, Jane Fridlyand4, Jian-Hua Mao3, Ksenya Shchors3, William AWeiss5 & Joseph F Costello1

CpG islands are present in one-half of all human and mousegenes and typically overlap with promoters or exons. Wedeveloped a method for high-resolution analysis of themethylation status of CpG islands genome-wide, using arraysof BAC clones and the methylation-sensitive restriction enzymeNotI. Here we demonstrate the accuracy and specificity ofthe method. By computationally mapping all NotI sites,methylation events can be defined with single-nucleotideprecision throughout the genome. We also demonstrate theunique expandability of the array method using a differentmethylation-sensitive restriction enzyme, BssHII. We identifiedand validated new CpG island loci that are methylated ina tissue-specific manner in normal human tissues. Themethylation status of the CpG islands is associated with geneexpression for several genes, including SHANK3, whichencodes a structural protein in neuronal postsynaptic densities.Defects in SHANK3 seem to underlie human 22q13 deletionsyndrome. Furthermore, these patterns for SHANK3 areconserved in mice and rats.

DNA methylation is essential for a properly functioning genome1–6.DNA methylation involves transfer of a methyl group to cytosine in aCpG dinucleotide by DNA methyltransferases that create (DNMT3a,DNMT3b) or maintain (DNMT1) methylation patterns. CpGs are fivetimes more abundant in CpG islands than in the rest of the genome7–9.CpG islands are B200 bp to several kilobases in length and nearlyalways encompass gene promoters or exons9–11. CpG islands asso-ciated with imprinted genes12, genes on the inactive X chromosome infemales13 and some tissue-specific genes can be differentially methy-lated14–18, whereas many CpG islands are thought to be unmethylatedin all normal tissues. The methylation status of most of the estimated30,000 CpG islands19–21 has not yet been assessed in multiple tissues.

Array comparative genomic hybridization is a quantitative, high-resolution method for measuring copy number across the genome22–25.The arrays consist of thousands of BAC clones distributed across the

genome and are capable of accurately distinguishing single-copy gainand loss in cancer, as well as copy-number polymorphisms in normaltissue26. Because many BAC clones contain CpG islands, we proposethat these arrays are ideally suited for large-scale CpG island methyla-tion analyses. We designed a method to use methylation-sensitiverestriction enzymes and BAC clone arrays to determine the methyla-tion status of CpG islands genome-wide in different tissues. Themethod is applicable to any organism for which BAC arrays havebeen developed and to normal and disease tissue. The method can alsobe expanded to assess many methylation-sensitive restriction enzymesites and can be applied to updated tiling path arrays27. We demon-strate the utility of the array by identifying new CpG island loci that aresubject to tissue-specific methylation and gene expression in humantissue. Using a comparative epigenomics approach, we also show thatthe differential methylation and expression of one of the genes isconserved in rodents.

RESULTSMethylation analysis using BAC microarraysWe designed a method to use arrays of BAC clones22,23,25 andmethylation-sensitive enzymes for analysis of CpG island methylationstatus (Fig. 1a). To assess the methylation status of CpG islands, weused NotI sites as indicators, because its endonuclease activity isinhibited by methylation and 94% of NotI sites are within CpGislands28,29. Using a custom Perl script, we estimate that 2,500 NotIsites can be analyzed using this prototype methylation array.

To prove that the signal intensities across the array are fromunmethylated NotI fragments, we compared a self-self hybridizationto a hybridization of peripheral blood lymphocyte (PBL) DNA thatwas first methylated in vitro and cohybridized with untreated PBLDNA. Following in vitro methylation, the sample was spiked withDNA from two unmethylated BAC clones that contain NotI sites. Wethen prepared the methylated genomic DNA–unmethylated BACmixture (Fig. 1a) and labeled it with Cy3. For the self-self hybridiza-tion, equivalent amounts of Cy3 (red) and Cy5 (green) signal resulted

Published online 15 May 2005; doi:10.1038/ng1563

1The Brain Tumor Research Center, Department of Neurological Surgery and the Biomedical Sciences Program, 2Department of Laboratory Medicine, 3Cancer ResearchInstitute, 4Department of Epidemiology and Biostatistics and 5Department of Neurology, University of California San Francisco, San Franciso, California 94143, USA.6These authors contributed equally to this work. Correspondence should be addressed to J.F.C. ([email protected]).

NATURE GENETICS VOLUME 37 [ NUMBER 6 [ JUNE 2005 64 5

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 2: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

in a yellow signal (Fig. 1b). In contrast, the in vitro methylation of PBLDNA significantly reduced the Cy3 (red) signal for most spots,whereas the Cy5 (green) signal from untreated PBL DNA was main-tained (Figs. 1c). The unmethylated NotI fragments copurified fromthe two admixed BAC clones hybridized to their corresponding spotsspecifically (Fig. 1c). Consistent with the array image, the average log2

ratio of BACs containing NotI sites in the control self-self hybridiza-tion was –0.07, indicating an equivalent signal from test and referenceDNA. In contrast, the average log2 ratio of these BACs after in vitromethylation was significantly lower (average log2 ratio–0.82, Po 0.01;Figs. 1c,d). As expected, the log2 ratios of the purified BACs added tothe Cy3 channel were significantly higher than those in the controlself-self hybridization (Fig. 1d). We obtained very similar resultswhen we repeated this experiment (Fig. 1d). Hybridization signalson the array were therefore predominately generated from unmethy-lated NotI fragments.

To evaluate the quality of the array data, we looked for the presenceof spatial effects of a considerable size and for intensity dependenceof the log2 ratios. We normalized the log2 ratios using global medianshift to ease the across-array comparisons. We evaluated intensitydependence by considering the shape of the lowess smoothers in eachsubarray (Fig. 2). We evaluated the geometric effects by comparingprint-tip–specific log2 ratio distributions in each array (Fig. 2).The geometric aspect of the array was judged to be satisfactory ifthe variability in individual subarrays was approximately constantand less than 0.2 and if the interquartile range of each print-tipdistribution contained zero. We accepted the array if none of thelowess lines showed considerable departure from a horizontal line andthere was no significant geometric dependence. We found a lack of an

intensity-dependent effect and a lack of strong spatial print-specificeffects in the arrays (Fig. 2).

We determined a conservative sample-specific cutoff for declaring aclone to be differentially methylated. For each sample, we calculatedthe median absolute deviation (MAD) computed over the autosomalclones present in at least 50% of the samples. In the second iteration ofthe MAD calculation, we used only those clones that were within 2.5MAD of zero. We computed the conservative threshold for callingmethylation differences as the maximum of either three times theMAD for a given sample or 0.5. Among repeated self-self hybridiza-tions of acceptable quality, between zero and five autosomal clonesexceeded the resulting threshold with the median of 1 (Fig. 3a). This iswell within the typical range for the BAC arrays used in standard arraycomparative genomic hybridization25.

Tissue-specific methylation at CpG islandsTo test for tissue- and cell-type-specific methylation of CpG islands, wecarried out a self-self hybridization (PBL) and a comparative hybridi-zation (astrocytes versus PBLs; Fig. 3). In the self-self hybridization, thedistribution of the log2 ratios was narrow (Fig. 3a). In contrast,comparison of astrocyte DNA versus PBL DNA showed that asignificant subset of clones had log2 ratios outside the sample threshold(Fig. 3b). These results suggest that there were differences in themethylation status of many NotI sites, possibly CpG islands, betweenthe astrocytes and PBLs.

To extend these results to a larger set of CpG islands, we developed aprotocol for using BssHII as the methylation-sensitive enzyme. Weobserved specific hybridization to BAC clones containing BssHII sites,many of which do not contain NotI sites. The results of cohybridiza-

1. Restriction digestion

2. Biotin labeling and binding

3. Washing and elution

4. Dye labeling

5. Cohybridization

Not I EcoRV EcoRVNotI

Test DNA

Cy3 Cy3 Cy3 Cy3

Reference DNA

Cy5 Cy5 Cy5 Cy5

A

1

5

10

15

20

1

5

10

15

20

B C D E F G H I J K L M N

Experiment 1

PBLPBLPosition

G14

L19

B17

F11

F3

F9

G7

H5

I7

I9

K8

L15

–0.10

–0.22

0.01

–0.06

–0.21

–0.15

–0.09

–0.01

–0.19

0.00

0.06

–0.17

4.19

2.67

–1.98

–1.44

–2.13

–1.42

–1.37

–1.53

–1.21

–1.74

–1.84

–1.78

–0.20

0.08

–0.07

–0.32

–0.14

–0.39

0.26

0.00

–0.09

–0.18

–0.15

–0.09

3.84

3.64

–2.40

–1.78

–2.31

–1.81

–1.49

–1.62

–1.63

–2.64

–2.20

–2.01

RP11-15M23

RP11-23D23

RP11-72M14

CTD-2100I14

RP11-55M16

RP11-192A12

RP11-29P24

RP11-206I1

RP11-105N15

CTD-2027I18

RMC20P117

RP11-282J24

Clone ID PBL PBL PBLPBLPBL-m + BAC PBL-m + BAC

Experiment 2

a b d

c

Figure 1 Detection of methylation differences using a BAC clone array. (a) The methylation array procedure. Genomic DNA was digested with EcoRV and a

methylation-sensitive enzyme, such as NotI. The cleaved ends of unmethylated NotI fragments were filled in with biotin-dNTPs and isolated on streptavidin-

coated magnetic beads, whereas the methylated fragments and most non-CpG-island DNA were eliminated. The purified fragments were hybridized to arrays

of BAC clones printed in triplicate. The BAC arrays consisted of 2,413 unique BAC clones with an average interval of B1 Mb across the human genome25.

m, methylation. (b) Hybridization signals on the array were derived from unmethylated DNA. Two array sections from a control PBL-PBL hybridization

showing equivalent amounts of Cy3 (red) and Cy5 (green) hybridization signal. BACs without NotI sites had little or no signal intensity, whereas most BACs

containing NotI sites had a signal. (c) The same array sections as in b from untreated PBL DNA labeled with Cy5 (green) and cohybridized with in vitro

methylated PBL DNA spiked with 5 ng of two unmethylated BAC DNAs (RP11-15M23 at position G14 and RP11-23D23 at L19; in PBLs, both these sites

are normally methylated; data not shown) and then labeled with Cy3 (red). The in vitro methylation significantly reduced the Cy3 signal for most spots,

whereas the Cy5 signal from untreated PBL DNA was maintained. (d) Average log2 ratios of ten BACs containing NotI sites with the highest intensity in b

and c (experiment 1). This M.SssI array experiment was repeated with very similar results (experiment 2). m, methylated by M.SssI.

6 46 VOLUME 37 [ NUMBER 6 [ JUNE 2005 NATURE GENETICS

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 3: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

tions of BssHII fragments from PBL versus PBL and brain versus PBLyielded overall results similar to those of the arrays hybridized withNotI fragments (Fig. 3c,d and Supplementary Fig. 1 online).Together, the NotI and BssHII protocols will allow us to interrogatedifferent CpG islands in a single BAC or different CpGs in a singleCpG island, depending on the restriction-site distribution.

To confirm that these differences were generated by changes inmethylation, we carried out 7–12 bisulfite-sequencing reactions foreach of 25 data points on the arrays for loci that seemed to bedifferentially methylated in a small panel of human tissues andprimary cell cultures, including brain, fetal astrocytes, keratinocytesand PBLs. We found an association between the log2 ratios from thearray and the degree of methylation in the NotI sites or BssHII site andadjacent CpGs (Figs. 4 and 5). The NotI site of RP11-218N6 isadjacent to a CpG island at the 5¢ end of C2orf32, a gene of unknownfunction (Supplementary Fig. 2 online). Bisulfite sequencing of theNotI site and adjacent CpGs in each direction showed tissue-specificmethylation (Fig. 4b). To determine whether the differential methyla-tion was associated with gene expression status, we assessed levels ofC2orf32 mRNA by real-time RT-PCR analysis of the same set of tissues(Fig. 4b). C2orf32 mRNA was undetectable in PBLs and keratinocytes,where this region was methylated, but expression was moderate inastrocytes and high in brain, where the region remained unmethy-

lated. We obtained similar results for another CpG-rich region in thegene PRKCZ (Fig. 4c), which contains two NotI sites. A candidatefrom the BssHII arrays, BAC clone RP11-96D22, had a consistentlypositive log2 ratio 40.5 in uncultured brain samples relative to PBLs,suggestive of more methylation in PBLs than in brain at this BssHIIsite (Fig. 4d). The BssHII site is in an intragenic CpG island associatedwith ARHGEF17, a gene encoding a Rho guanine nucleotide exchangefactor. Consistent with the array results, the ARHGEF17 CpG islandwas unmethylated at nearly all CpGs (in two contiguous regions ofbisulfite analysis) in brain, where it is expressed, and was denselymethylated in PBLs, which do not express ARHGEF17 (Fig. 4d). TheCpG island was also unmethylated in astrocytes and keratinocytes,which express low levels of ARHGEF17. These data validate the arraymethod in detecting methylation differences and identify new exam-ples of tissue-specific methylation associated with specific gene-expression states.

Another candidate for differential CpG island methylation, CTA-799F10, had high log2 ratios in brain, astrocytes and keratinocytesrelative to PBLs, suggesting that there is more methylation in PBLs.We identified a CpG island containing a NotI site in this region in thegene SHANK3 (refs. 30,31; Fig. 5a), which encodes a structuralprotein found in neuronal postsynaptic densities. Haploinsufficiencythrough deletion or intragenic translocation of SHANK3 is thought

21

0–1

–2

7 8 89 10 1011 12 12 1413A

7 8 9 10 11 12 13AA

M2

10

–1–

2

(1,1

)(1

,2)

(1,3

)(1

,4)

(2,1

)(2

,2)

(2,3

)(2

,4)

(3,1

)(3

,2)

(3,3

)(3

,4)

(4,1

)(4

,2)

(4,3

)(4

,4)

(1,1

)(1

,2)

(1,3

)(1

,4)

(2,1

)(2

,2)

(2,3

)(2

,4)

(3,1

)(3

,2)

(3,3

)(3

,4)

(4,1

)(4

,2)

(4,3

)(4

,4)

(1,1

)(1

,2)

(1,3

)(1

,4)

(2,1

)(2

,2)

(2,3

)(2

,4)

(3,1

)(3

,2)

(3,3

)(3

,4)

(4,1

)(4

,2)

(4,3

)(4

,4)

M

21

0–1

–2

M

21

0–1

–2

M

21

0–1

–2

M

21

0–1

–2

MPrint tip Print tip Print tip

Figure 2 The log2 ratios of cohybridizations of NotI fragments did not have either intensity-dependent effects or geographic location-specific effects.

Plots of the relationship between log2 ratio of the two signals (M value) and total intensity (log2 of average Cy3 and Cy5 signals; A value; top row) and plots

comparing the distributions of the M values across the subarrays and showing the sizes of the print tip–specific effects, if any (bottom row) of self-self

hybridizations (plot 1 of each row, PBL versus PBL) and either normal brain versus PBLs or astrocytes (Astro) versus PBLs (plots 2 and 3 of each row).The M values were shifted by their global median to ease the effect-size interpretation. The dotted horizontal curves are the lowess-fitted print tip–specific

regressions. Because variability across the subarrays is similar, we used global median normalization in our subsequent analysis.

NATURE GENETICS VOLUME 37 [ NUMBER 6 [ JUNE 2005 64 7

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 4: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

to underlie the severe neurologic symptoms of 22q13 deletion syn-drome in humans31–33. A chromosomal breakpoint in one affectedindividual occurs precisely in this CpG island31,34. Consistent withthe array results, bisulfite sequencing showed that this CpG islandwas extensively methylated in PBLs, whereas methylation in othertissues was low or absent (Fig. 5b). SHANK3 is expressed in braintissue, where it is primarily unmethylated, but it is not detectable in

PBLs, where the CpG island is methylated. Thus, the unmethylatedstate seems to be permissive for SHANK3 expression whereas methy-lation of the CpG island appears repressive. The methylation andexpression status of SHANK3, and the association of defects inSHANK3 with a severe disease phenotype in humans, suggests thatthe tissue-specific methylation status of this CpG island is of somefunctional importance.

2

1

0

Log2

rat

io

0 500Position

1,5001,000 2,000

0 500Position

1,5001,000 2,000

0 500Position

1,5001,000 2,000

0 500Position

1,5001,000 2,000

–1

–2

2

1

0

Log2

rat

io

–1

–2

2

1

0

Log2

rat

io

–1

–2

2

1

0

Log2

rat

io

–1

–2

a

b

c

d

Figure 3 Detection of methylation differences between cell types using NotI or BssHII. (a) Log2 ratios from an array hybridization of NotI fragments

independently isolated from two aliquots of a male PBL DNA (red dots). (b) Log2 ratios from the self-self hybridization in a, and a comparative hybridization

between normal human astrocytes and PBL DNA (blue dots). (c) Log2 ratios from an array hybridization of BssHII fragments independently isolated from two

aliquots of a male PBL DNA (red dots). (d) Log2 ratios from the self-self hybridization of BssHII fragments, and a comparative hybridization between normal

human brain and PBL DNA (blue dots). Data are plotted as the mean log2 ratio of triplicate spots for each clone, and only those log2 ratios with a standard

deviation less than 0.1 were considered valid. The BACs are ordered from left to right by their position in the genome starting at chromosome 1p (see alsoSupplementary Fig. 1 online).

Brain1

Log2 ratio n Not I

Brain2

Astrocyte-F

Keratinocyte

PBL1

PBL2

0.29

0.88

1.40

0.64

0.13

–0.13

8

8

8

8

8

9

Brain1

Log2 ratio n nBssHII

Brain2

Astrocyte-F

Keratinocyte

PBL1

PBL2

1.20

1.20

NA

NA

0.06

0.13

8

8

9

9

8

11

6.7

6.6

4.8

0.6

91.7

91.8

1.4

0.7

6.3

0.0

87.1

98.4

1.4 ± 0.5

1.1 ± 0.6

10

10

9

10

10

9

Brain1Log2 ratio n Not I n Not I

Total methylation(%)

Total methylation(%)

Total methylation(%)

Total methylation(%)

Expression(%)

Expression(%)

Brain2

PBL1

PBL2

1.17

0.70

0.04

–0.07

7

7

12

7

72.8

62.6

97.6

95.9

18.6 5051 ± 274

12.7 5759 ± 1240

96.4 182 ± 27

96.4 354 ± 61

172.6 ± 11.7

189.7 ± 4.2

276.4 ± 15.3

23.7 ± 0.2

10

9

8

10

Brain1Log2 ratio n Not I

Expression(%)

Brain2

Astrocyte-F

Keratinocyte

PBL1

PBL2

0.62

0.93

1.63

–0.06

–0.09

–0.03

8

8

8

8

8

8

2162.0 ± 151.0

2752.0 ± 798.0

738.0 ± 93.0

ND

ND

ND

I II

III

a b

c

d

Figure 4 Validation of the array results by bisulfite-sequencing and gene-expression analysis. Log2 ratios from the array correlate well with the methylation

status of the NotI (a–c) and BssHII (d) sites, and adjacent CpGs (see also Supplementary Fig. 2 online). (a) Methylation analysis in and around the NotI site

of BAC clone RP11-60M20 located on chromosome 2q14.2. (b) Methylation analysis in and around the NotI site of BAC clone RP11-218N6 located onchromosome 2p14, and quantitative gene-expression analysis of the associated gene C2orf32. (c) Methylation analysis in and around the NotI site of BAC

clone RP11-82D16 located on chromosome 1p36.33, and quantitative gene-expression analysis of the associated gene PRKCZ. We assessed methylation

in both NotI sites in this local region by carrying out two PCRs on contiguous regions (I and II). (d) Methylation analysis in and around the BssHII site of

BAC clone RP11-96D22 located on chromosome 11q13.4, and quantitative gene-expression analysis of the associated gene ARHGEF17. We assessed

methylation across 53 CpGs by carrying out two PCRs on contiguous regions (I and II). Each row of circles represents a summary of the methylation status

of cytosines of CpGs after bisulfite treatment and sequencing. n, number of independent clones sequenced. Open circles, 0–33% clones were methylated

at the CpG; gray circles, 34–66% methylated; black circles, 67–100% methylated. The total percentage methylation was calculated considering the

methylation status of all CpGs in all sequence reactions for the given sample. The relative mRNA levels of each gene relative to control GUSB were

determined by real-time RT-PCR, done in triplicate (mean 7 s.d.). ND, not detectable after 45 PCR cycles. The expression analysis was repeated two

additional times and showed similar results. NA, not analyzed.

6 48 VOLUME 37 [ NUMBER 6 [ JUNE 2005 NATURE GENETICS

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 5: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

Comparative epigenomics of SHANK3Identification of gene-regulatory elements is a main goal of currentresearch and has been advanced at the DNA sequence level throughcomparative genomics. Here we reasoned that a comparative epige-nomics approach could help determine whether the differentialmethylation of the CpG island of SHANK3 is conserved betweendivergent species and whether the methylation status has a functionalimpact on gene expression. We therefore assessed the methylationstatus and mRNA levels of Shank3 in analogous mouse tissues(Fig. 5c,d). Shank3 was densely methylated in PBLs, whereas partialor no methylation was observed in brain, astrocytes or keratinocytes, apattern very similar to that observed in the corresponding humantissues. The methylation and expression pattern in PBLs and brain wasvery similar in both C57BL/6J and FVBN mice. In rat PBLs, Shank3was similarly densely methylated, whereas in rat brain it was partiallymethylated (Supplementary Fig. 3 online). The tissue-specific expres-sion of SHANK3 was also conserved in the mouse (Fig. 5d) and rat(Supplementary Fig. 3 online). The apparent conservation of tissue-specific CpG island methylation and gene-expression patterns forSHANK3 over B75 million years suggests that they are of somefunctional importance.

DISCUSSIONOur method allows methylation analysis across the genome with single-nucleotide precision. Because even generic BAC arrays can be used forthe methylation array method, it is applicable to all species24 and canprovide comprehensive methylation analysis using complete tiling-patharrays27 and alternative methylation-sensitive enzyme combinations. Aswith restriction landmark genomic scanning and other genome-scalemethylation screens35,36, one possible limitation of our BAC arraymethod is the presence of SNPs, which might create or eliminate arestriction site, leading to a false positive or negative result. To addressthis problem, we wrote a Perl script that maps all known SNPs to therestriction sites being tested on the array. Thus, we can know a prioriwhether a common polymorphism might influence the array results.This limitation will be irrelevant when comparing multiple samplesfrom one individual or from a panel of genetically similar organisms.

We applied the logic of comparative genomics to our epigenomicdata from the array and found that the tissue-specific pattern of CpGisland methylation of SHANK3 was very similar in mouse, humanand rat tissues. Two different mouse strains also yielded similarmethylation and gene-expression results, further supporting ourconclusions. Conservation suggests function, which can be tested byfurther in-depth understanding of other cis- and trans-acting factorsthat regulate SHANK3. The presence or absence of relevant transcrip-tion factors, modifications to other portions of SHANK3 or thehistones bound to it and structural changes in positioning of cis-acting elements in each cell type might also have a role. We also testedArhgef17 in mice and found a lack of conservation of the CpG islandmethylation (data not shown). These data suggest that SHANK3methylation patterns may be evolutionarily conserved in a gene-specific manner rather than simply being one part of mechanismthat is entirely conserved. This does not necessarily imply thatARHGEF17 methylation is nonfunctional in humans, particularly ifthe pattern is conserved across a large population. Conservation ofprotein (histone) methylation and acetylation patterns between miceand humans has also been reported37.

Despite the general knowledge that methylation patterns are tissuespecific, the genomic location and function of most of these sites isunknown. Early studies suggested that there were substantial differ-ences in the total 5-methylcytosine content of different tissues38. Ourresults show that the intragenic CpG islands of SHANK3 andARHGEF17 have tissue-specific methylation and gene expression.The mechanism by which intragenic methylation might modulatelevels of gene expression is not known, although a recent reportsuggests that intragenic methylation may result in exclusion of RNApolymerase II, reducing the efficiency of transcription elongation39.Our results extend recent reports from the human epigenomeproject16, restriction landmark genomic scanning15 and elsewhere18,40

that CpG islands can be methylated in specific tissues. Nevertheless,a substantial fraction of the genome remains to be explored in anytissue, and very few genes have been studied for methylation in morethan a handful of normal tissues or specific cell types18. The methodwe report here should be useful for further identification of such sites,including 5¢ and intragenic CpG islands, as well as of non-CpG-islandsequences. By determining the distribution of other methylation-sensitive restriction enzymes in relation to genes, CpG islands orrepeat sequences, as we have for NotI and BssHII, we can examinenew sets of CpG islands or non-CpG-island sequences. Our approachis applicable to many important issues, including normal epigeneticvariation in the population, epigenetic reprogramming in clonedorganisms, tissue-specific methylation, imprinting and aberrantmethylation in cancer and other diseases.

Brain1Log2 ratio n NotI

NotI siteCpG island

CpG island

Brain2Astrocyte-FKeratinocytePBL1PBL2

Brain1Brain2AstrocyteKeratinocytePBL1PBL2

0.880.711.131.020.160.01

10101010

89

n88

198

139

17.935.417.9

1.196.997.6

Expression(%)

Methylation(%)

41.741.323.2

6.799.296.7

Methylation(%)

1428.0 ± 286.0NA

17.0 ± 2.00.9 ± 0.45.0 ± 2.0

NA

Expression(%)

833.0 ± 392.0184.0 ± 59.0

9.0 ± 0.319.0 ± 5.0

NDND

CTA-799F10

Shank3

20 kb

SHANK3

20 kb

C

C T

T

a

b

c

d

Figure 5 Evolutionary conservation of tissue-specific CpG island methylation

and gene expression in SHANK3. (a) The SHANK3 locus aligned to BAC

clone CTA-799F10 at chromosome 22q13.33. (b) Tissue-specific

methylation and gene expression of SHANK3. The arrow indicates the

position of the putative transcription start site. (c) The mouse Shank3 locus

at chromosome 15qE3. The nucleotide sequences of the CpG island in

mouse was 485% identical to that in human DNA and satisfied the criteria

for CpG islands. (d) Tissue-specific methylation and gene expression of

Shank3 were conserved in the mouse. PBL1 and brain1 were from strain

C57BL/6J, and PBL2 and brain2 were from strain FVBN. Methylation data

are shown as in Figure 4. The percentage methylation was calculated

considering the methylation status of all CpGs in all sequence reactions for

the given sample. The mRNA levels of human and mouse SHANK3 were

assessed in triplicate using quantitative RT-PCR, relative to human and

mouse GUSB (mean 7 s.d.). The RT-PCR was repeated two more times with

similar results. NA, not analyzed; ND, not detectable after 45 PCR cycles.

NATURE GENETICS VOLUME 37 [ NUMBER 6 [ JUNE 2005 64 9

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 6: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

METHODSTissues. We obtained normal brain samples from the Neurosurgery Tissue

Bank at the University of California San Francisco (UCSF). We obtained

samples with informed consent, and their use was approved by the Committee

on Human Research at UCSF. We collected adult PBLs from healthy volunteers.

We purchased normal human primary adult keratinocytes and normal human

fetal astrocytes from Cambrex and cultured them for fewer than three passages.

We obtained mouse tissues from the FVBN, C57BL/6J and NIH/Ola strains.

Mouse whole brain, purified primary astrocytes (postnatal day 10) and PBLs

were derived from normal FVBN mice. We also analyzed an additional brain

sample from a C57BL/6J mouse and obtained similar results. We isolated

keratinocytes from the skin of normal newborn NIH/Ola pups by physical

separation of the epidermal layer from whole skin. We obtained rat brain and

PBLs from Sprague-Dawley rats.

Methylation array analysis. We blocked the ends of genomic DNA (2 mg) from

normal tissue samples in a 10-ml reaction by adding nucleotide analogs

(aS-dGTP, aS-dCTP, ddATP and ddTTP) with DNA polymerase followed by

heat inactivation (65 1C for 30 min). We digested DNA with 20 U of NotI

(or BssHII) and 20 U of EcoRV for 4 h at 37 1C. We filled in the cleaved NotI (or

BssHII) ends with biotin-dGTP and biotin-dCTP (NEN) using sequenase V2.0

(USB). We separated the biotinylated fragments by incubating them with

magnetic streptavidin beads (M-280; Dynal) at 25 1C for 30 min. We removed

the supernatant and then washed the beads three times with B&W buffer

(10 mM Tris-HCl (pH 7.5), 1 mM EDTA and 2 M NaCl). We released the DNA

fragments from the beads in 98% formamide-containing 10 mM EDTA at

65 1C for 5 min. Amino-allyl-dUTP (Sigma) was incorporated into the purified

fragments using the Bioprime DNA Labeling System (Life Technologies). We

removed free nucleotides using PCR purification columns (Qiagen). Cy5 and

Cy3 dye esters were coupled to amino-allyl-dUTP–labeled test and reference

samples, respectively. We washed slides using the established protocols for array

comparative genomic hybridization22,23,25.

We quantified and summarized array images using UCSF SPOT and SPROC

software with default parameters41. We removed from further analysis those

clones for which fewer than two replicate spots were retained or for which the

standard deviation of replicate spots exceeded 0.1. We also filtered out those

clones present in fewer than 50% of the samples. We assessed the quality of the

samples using tools available in the R/Bioconductor42 freely available package

‘marray.’ We calculated the conservative sample-specific threshold as described

in the Results.

Predicting fragment hybridization computationally. We used the University

of California Santa Cruz July 2003 freeze of the human genome for our in silico

analysis. We downloaded the whole-chromosome assemblies, the BAC end-

sequences, CpG islands and National Institutes of Health SNP annotations and

analyzed them using a custom Perl program on a Windows XP workstation.

The program simulates the array experiment and generates a list of restriction

fragments using the enzyme pairs across the genome and the BAC clones they

overlap. Fragments residing on CpG islands are also identified, and restriction

sites residing on recognized polymorphisms are flagged. The program can be

adjusted for any restriction-enzyme combination and can process the entire

human genome in less than 30 min. We predict that 870 BACs on the existing

array contain one or more NotI sites and that B1,700 contain BssHII sites. The

Perl script is available on request.

In vitro methylation of genomic DNA. We digested 2 mg of total genomic

DNA from normal human PBLs with 4 U of EcoRV (NEB) for 2 h at 37 1C

followed by heat inactivation for 20 min at 80 1C and phenol-chloroform-

isoamyl (25:24:1) extraction. We then precipitated DNA with 3 M sodium

acetate and ethanol and resuspended the pellet in 20 ml of water. We treated the

digested DNA with 100 U of M.SssI (NEB) for 4 h at 37 1C in accordance with

the manufacturer’s instructions. The enzyme was heat-inactivated for 20 min at

65 1C, and DNA was purified and processed for array hybridization. We used

the M.SssI-treated DNA as a test sample along with untreated PBL DNA as a

reference. To ensure that appropriate labeling and hybridization of NotI-selected

fragments could occur in M.SssI-treated DNA, we added 5 ng of untreated

DNA from two human BAC clones (RP11-15M23 and RP11-23D23) to the

M.SssI-methylated PBL DNA before NotI digestion and fragment selection. We

verified these clones containing NotI sites by end-sequencing; they correspond

to chromosome 7: 40858466–40697704 (RP11-15M23) and chromosome 7:

870718–698268 (RP11-23D23; July 2003 freeze of the human genome from the

University of California Santa Cruz genome browser).

Bisulfite treatment, PCR and sequencing. We treated total genomic DNA with

sodium bisulfite as described43, carried out PCR using primers listed in

Supplementary Table 1 online and cloned products into pCR2.1/TOPO

(Invitrogen). We selected 7–13 individual colonies and sequenced inserts using

the ABI 3700 automated DNA sequencer.

Real-time RT-PCR. We measured expression of all genes analyzed from mouse,

rat and human samples by real-time RT-PCR using the Opticon2 Continuous

Fluorescence Detector (MJ Research). We purchased primers and probes from

IDT and Applied Biosystems (Supplementary Table 1 online).

URL. The R/Bioconductor package ‘marray’ is available at http://www.

bioconductor.org/repository/devel/vignette/marray.pdf.

Note: Supplementary information is available on the Nature Genetics website.

ACKNOWLEDGMENTSWe thank J. Nakamura for providing rat tissues. This work was supported by theCalifornia Breast Cancer Research Program (T.T.C.), Howard Hughes MedicalInstitute (P.J.), American Brain Tumor Association (C.H.) and US NationalInstitutes of Health (J.F.C.).

COMPETING INTERESTS STATEMENTThe authors declare that they have no competing financial interests.

Received 28 December 2004; accepted 22 March 2005

Published online at http://www.nature.com/naturegenetics/

1. Okano, M., Bell, D.W., Haber, D.A. & Li, E. DNA methyltransferases Dnmt3a andDnmt3b are essential for de novo methylation and mammalian development. Cell 99,247–257 (1999).

2. Trasler, J.M., Trasler, D.G., Bestor, T.H., Li, E. & Ghibu, F. DNA methyltransferase innormal and Dnmtn/Dnmtn mouse embryos. Dev. Dyn. 206, 239–247 (1996).

3. Chen, R.Z., Pettersson, U., Beard, C., Jackson-Grusby, L. & Jaenisch, R. DNAhypomethylation leads to elevated mutation rates. Nature 395, 89–93 (1998).

4. Maraschio, P., Zuffardi, O., Dalla Fior, T. & Tiepolo, L. Immunodeficiency, centromericheterochromatin instability of chromosomes 1, 9, and 16, and facial anomalies: theICF syndrome. J. Med. Genet. 25, 173–180 (1988).

5. Hansen, R.S. et al. The DNMT3B DNA methyltransferase gene is mutated in the ICFimmunodeficiency syndrome. Proc. Natl. Acad. Sci. USA 96, 14412–14417 (1999).

6. Xu, G.L. et al. Chromosome instability and immunodeficiency syndrome caused bymutations in a DNA methyltransferase gene. Nature 402, 187–191 (1999).

7. McClelland, M. & Ivarie, R. Asymmetrical distribution of CpG in an ‘average’ mamma-lian gene. Nucleic Acids Res. 10, 7865–7877 (1982).

8. Bird, A., Taggart, M., Frommer, M., Miller, O.J. & Macleod, D. A fraction of the mousegenome that is derived from islands of nonmethylated, CpG-rich DNA. Cell 40, 91–99(1985).

9. Gardiner-Garden, M. & Frommer, M. CpG islands in vertebrate genomes. J. Mol. Biol.196, 261–282 (1987).

10. Larsen, F., Gundersen, G., Lopez, R. & Prydz, H. CpG islands as gene markers in thehuman genome. Genomics 13, 1095–1107 (1992).

11. Takai, D. & Jones, P.A. Comprehensive analysis of CpG islands in human chromosomes21 and 22. Proc. Natl. Acad. Sci. USA 99, 3740–3745 (2002).

12. Surani, M.A. Imprinting and the initiation of gene silencing in the germ line. Cell 93,309–312 (1998).

13. Gartler, S.M., Dyer, K.A. & Goldman, M.A. Mammalian X chromosome inactivation.Mol. Genet. Med. 2, 121–160 (1992).

14. Hershko, A.Y., Kafri, T., Fainsod, A. & Razin, A. Methylation of HoxA5 and HoxB5 andits relevance to expression during mouse development. Gene 302, 65–72 (2003).

15. Song, F. et al. Association of tissue-specific differentially methylated regions(TDMs) with differential gene expression. Proc. Natl. Acad. Sci. USA 102, 3336–3341 (2005).

16. Rakyan, V.K. et al. DNA methylation profiling of the human major histocompatibilitycomplex: a pilot study for the human epigenome project. PLoS Biol. 2, e405 (2004).

17. De Smet, C., Lurquin, C., Lethe, B., Martelange, B. & Boon, T. DNA methylation is theprimary silencing mechanism for a set of germ line- and tumor-specific genes with aCpG-rich promoter. Mol. Cell. Biol. 19, 7327–7335 (1999).

18. Futscher, B.W. et al. Role for DNA methylation in the control of cell type-specificmaspin expression. Nat. Genet. 31, 175–179 (2002).

19. Venter, J.C. et al. The sequence of the human genome. Science 291, 1304–13511289 (2001).

6 50 VOLUME 37 [ NUMBER 6 [ JUNE 2005 NATURE GENETICS

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s

Page 7: Epigenome analyses using BAC microarrays identify evolutionary conservation of tissue-specific methylation of SHANK3

20. Antequera, F. & Bird, A. Number of CpG islands and genes in human and mouse. Proc.Natl. Acad. Sci. USA 90, 11995–11999 (1993).

21. Lander, E.S. et al. Initial sequencing and analysis of the human genome. Nature 409,860–921 (2001).

22. Pinkel, D. et al. High resolution analysis of DNA copy number variationusing comparative genomic hybridization to microarrays. Nat. Genet. 20, 207–211(1998).

23. Albertson, D.G. et al. Quantitative mapping of amplicon structure by array CGHidentifies CYP24 as a candidate oncogene. Nat. Genet. 25, 144–146 (2000).

24. Hodgson, G. et al. Genome scanning with array CGH delineates regional alterations inmouse islet carcinomas. Nat. Genet. 29, 459–464 (2001).

25. Snijders, A.M. et al. Assembly of microarrays for genome-wide measurement of DNAcopy number. Nat. Genet. 29, 263–264 (2001).

26. Snijders, A.M. et al. Mapping segmental and sequence variations among laboratorymice using BAC array CGH. Genome Res. 15, 302–311 (2005).

27. Ishkanian, A.S. et al. A tiling resolution DNA microarray with complete coverage of thehuman genome. Nat. Genet. 36, 299–303 (2004).

28. Lindsay, S. & Bird, A.P. Use of restriction enzymes to detect potential gene sequencesin mammalian DNA. Nature 327, 336–338 (1987).

29. Costello, J.F. et al. Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat. Genet. 24, 132–138 (2000).

30. Qualmann, B., Boeckers, T.M., Jeromin, M., Gundelfinger, E.D. & Kessels, M.M.Linkage of the actin cytoskeleton to the postsynaptic density via direct interac-tions of Abp1 with the ProSAP/Shank family. J. Neurosci. 24, 2481–2495(2004).

31. Wilson, H.L. et al. Molecular characterisation of the 22q13 deletion syndromesupports the role of haploinsufficiency of SHANK3/PROSAP2 in the major neurologicalsymptoms. J. Med. Genet. 40, 575–584 (2003).

32. Watt, J.L. et al. A familial pericentric inversion of chromosome 22 with a recombinantsubject illustrating a ‘pure’ partial monosomy syndrome. J. Med. Genet. 22, 283–287(1985).

33. Phelan, M.C. et al. 22q13 deletion syndrome. Am. J. Med. Genet. 101, 91–99(2001).

34. Bonaglia, M.C. et al. Disruption of the ProSAP2 gene in a t(12;22)(q24.1;q13.3) isassociated with the 22q13.3 deletion syndrome. Am. J. Hum. Genet. 69, 261–268(2001).

35. Lippman, Z. et al. Role of transposable elements in heterochromatin and epigeneticcontrol. Nature 430, 471–476 (2004).

36. Huang, T.H. et al. Identification of DNA methylation markers for human breastcarcinomas using the methylation-sensitive restriction fingerprinting technique.Cancer Res. 57, 1030–1034 (1997).

37. Bernstein, B.E. et al. Genomic maps and comparative analysis of histone modificationsin human and mouse. Cell 120, 169–181 (2005).

38. Ehrlich, M. et al. Amount and distribution of 5-methylcytosine in human DNA fromdifferent types of tissues and cells. Nucleic Acids Res. 10, 2709–2721 (1982).

39. Lorincz, M.C., Dickerson, D.R., Schmitt, M. & Groudine, M. Intragenic DNA methyla-tion alters chromatin structure and elongation efficiency in mammalian cells. Nat.Struct. Mol. Biol. 11, 1068–1075 (2004).

40. Desmet, C. et al. The activation of human gene Mage-1 in tumor cells is correlated withgenome-wide demethylation. Proc. Natl. Acad. Sci. USA 93, 7149–7153 (1996).

41. Jain, A.J. et al. Fully automatic quantification of microarray image data. Genome Res.12, 325–332 (2002).

42. Ihaka, R. & Gentleman, R. A language for data analysis and graphics. J. Comput.Graph. Stat. 5, 299–314 (1996).

43. Grunau, C., Clark, S.J. & Rosenthal, A. Bisulfite genomic sequencing: systematicinvestigation of critical experimental parameters. Nucleic Acids Res. 29, e65 (2001).

NATURE GENETICS VOLUME 37 [ NUMBER 6 [ JUNE 2005 65 1

TECHN I CAL REPORTS©

2005

Nat

ure

Pub

lishi

ng G

roup

ht

tp://

ww

w.n

atur

e.co

m/n

atur

egen

etic

s