DNA methylation coverage in two tissues of the Pacific Oyster Claire Ellis Bioinformatics Terminal Product 3/14/13

Upload: chellis6

Post on 06-Jul-2015

Page 1: Bioinformatics Final Product claire

DNA methylation coverage in two tissues of the Pacific Oyster

Claire Ellis

Bioinformatics Terminal Product

3/14/13

Page 2: Bioinformatics Final Product claire

DNA methylation patterns in Crassostrea gigas

Epigenetics describes DNA modifications that change gene expression without altering nucleotide sequence.

DNA methylation is extremely diverse across organisms, varies among species, and can change genome function in response to external influences.

[Figure: DNA methylation (CH3 label). Source: http://www.nist.gov/pml/div689/dna_011911.cfm]

Page 3: Bioinformatics Final Product claire

Sequencing Approaches

Bisulfite sequencing was used to examine DNA methylation in gonad tissue

MBD-Seq was used to examine DNA methylation in gill tissue (Mackenzie)

Bisulfite sequencing

Cm = methylated cytosine
C = unmethylated cytosine

5’ ACmGTTCGCTTGAG 3’
3’ TGCmAAGCGAACTC 5’

Bisulfite Treatment

5’ ACmGTTUGUTTGAG 3’
3’ TGCmAAGUGAAUTU 5’
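The conversion in the bisulfite treatment diagram can be illustrated with a minimal Python sketch. It assumes the slide's notation, where "Cm" marks a methylated cytosine and a bare "C" is unmethylated; the function name bisulfite_convert is hypothetical and is not part of any tool used in this project.

```python
import re


def bisulfite_convert(strand: str) -> str:
    """Replace unmethylated C with U; leave methylated 'Cm' untouched.

    Assumes the slide's notation: 'Cm' = methylated cytosine,
    bare 'C' = unmethylated cytosine.
    """
    # The negative lookahead skips any C that is followed by 'm'.
    return re.sub(r"C(?!m)", "U", strand)


top = "ACmGTTCGCTTGAG"     # 5' -> 3' strand from the slide
bottom = "TGCmAAGCGAACTC"  # 3' -> 5' strand from the slide

print(bisulfite_convert(top))     # ACmGTTUGUTTGAG
print(bisulfite_convert(bottom))  # TGCmAAGUGAAUTU
```

Both outputs match the post-treatment strands shown on the slide.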

Page 4: Bioinformatics Final Product claire

Approach

Bisulfite-converted reads were aligned to the genome, and the % methylation value per base was calculated by processing the alignments

methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing
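As a rough illustration of the per-base calculation: after bisulfite treatment and amplification, reads from a methylated cytosine still show C, while reads from an unmethylated cytosine show T. The % methylation at a position is therefore the share of C calls among all C and T calls. The Python sketch below assumes the C and T read counts at a position are already available; the function and parameter names are hypothetical.

```python
def percent_methylation(num_c: int, num_t: int) -> float:
    """Percent methylation at one cytosine position.

    num_c: reads still showing C (methylated) at this position
    num_t: reads showing T (unmethylated, converted) at this position
    """
    coverage = num_c + num_t
    if coverage == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * num_c / coverage


# Hypothetical counts: 9 reads with C and 1 read with T.
print(percent_methylation(9, 1))  # 90.0
```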

Page 5: Bioinformatics Final Product claire

methylKit

Goal: obtain methylation coverage and examine differential methylation between gonad and gill tissues

methylKit reads annotation files and performs basic statistical analyses for differentially methylated regions or bases

Page 6: Bioinformatics Final Product claire
Page 7: Bioinformatics Final Product claire

Methylation Statistics

[Figure: Histogram of % CpG methylation, one panel each for Gonad and Gill. x-axis: % methylation per base; y-axis: Frequency. Bar labels give the percentage of bases in each bin; the largest label is 94.7 in the gonad panel and 16.8 in the gill panel.]
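The bar labels in these histograms are percentages of bases per bin. A minimal Python sketch of that binning follows, using made-up values rather than the oyster data; the function name, the bin width, and the input list are all assumptions for illustration.

```python
def percent_per_bin(values, bin_width=10.0, upper=100.0):
    """Return the percentage of values falling in each bin of [0, upper]."""
    if not values:
        raise ValueError("values must not be empty")
    n_bins = int(upper / bin_width)
    counts = [0] * n_bins
    for v in values:
        # A value equal to the upper limit goes into the last bin.
        idx = min(int(v // bin_width), n_bins - 1)
        counts[idx] += 1
    return [round(100.0 * c / len(values), 1) for c in counts]


# Hypothetical % methylation values for ten bases.
example = [0, 5, 95, 100, 100, 100, 100, 100, 100, 100]
print(percent_per_bin(example))
# [20.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 80.0]
```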

Page 8: Bioinformatics Final Product claire

Coverage Statistics

[Figure: Histogram of CpG coverage, one panel each for Gonad and Gill. x-axis: log10 of read coverage per base; y-axis: Frequency.]

Page 9: Bioinformatics Final Product claire

CpG base correlation

[Figure: Pairwise scatter plot of % methylation per CpG base, with the CpG base Pearson correlation. Gonad: ~/Desktop/TJGR_GonadPE_BS_v9_90_CG_methylkit_modified.txt; Gill: ~/Desktop/TJGR_gillMBD_BS_v9_10x_methylkit_modified.tabular.txt. Pearson correlation between Gonad and Gill: 0.068.]
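For reference, the Pearson correlation reported in this figure is the covariance of the two samples' per-base % methylation values divided by the product of their standard deviations. A minimal Python sketch is given below. It assumes the two lists cover the same CpG positions in the same order, and the example values are invented, not taken from the gonad or gill files.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical % methylation at four shared CpG positions.
gonad_example = [100.0, 100.0, 95.0, 100.0]
gill_example = [20.0, 80.0, 55.0, 90.0]
print(pearson(gonad_example, gill_example))
```

A value near 0, such as the 0.068 reported here, means the per-base methylation levels in the two samples show almost no linear relationship.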

Page 10: Bioinformatics Final Product claire

Methylation Clustering

[Figure: CpG methylation clustering dendrogram. Distance method: "correlation"; Clustering method: "ward". x-axis: Samples; y-axis: Height. Samples: ~/Desktop/TJGR_GonadPE_BS_v9_90_CG_methylkit_modified.txt and ~/Desktop/TJGR_gillMBD_BS_v9_10x_methylkit_modified.tabular.txt.]

Blue = Gonad

Red = Gill

Page 11: Bioinformatics Final Product claire

PCA - Principal Component Analysis

[Figure: CpG methylation PCA Analysis. x-axis: PC1; y-axis: PC2. Point label: ~/Desktop/TJGR_GonadPE_BS_v9_90_CG_methylkit_modified.txt.]

Blue = Gonad

Red = Gill

Page 12: Bioinformatics Final Product claire

Conclusions

Additional analyses examined the type of differential methylation (hypo- and hypermethylation)

Bases with a q-value < 0.01 and a % methylation difference > 25% were extracted

The methylKit package was successfully used to characterize DNA methylation

Differences between gonad and gill methylation profiles may be due to library preparation

The R script will be reused for future analyses comparing the methylation profiles of different samples
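The extraction thresholds above (q-value < 0.01, % methylation difference > 25%) and the hypo/hyper split can be sketched in Python as follows. This is an illustration of the filtering logic only, not the methylKit call used in the analysis. The DiffSite record, its field names, and the example rows are hypothetical.

```python
from typing import NamedTuple


class DiffSite(NamedTuple):
    chrom: str        # scaffold or chromosome name (hypothetical)
    pos: int          # base position
    qvalue: float     # multiple-testing-adjusted p-value
    meth_diff: float  # % methylation difference between the two samples


def select_differential(sites, q_cutoff=0.01, diff_cutoff=25.0):
    """Split sites passing both cutoffs into (hyper, hypo) lists."""
    hyper, hypo = [], []
    for site in sites:
        if site.qvalue < q_cutoff and abs(site.meth_diff) > diff_cutoff:
            # Positive difference = hypermethylated, negative = hypomethylated.
            (hyper if site.meth_diff > 0 else hypo).append(site)
    return hyper, hypo


# Hypothetical rows, not taken from the oyster data.
sites = [
    DiffSite("scaffold1", 101, 0.001, 40.0),   # passes, hyper
    DiffSite("scaffold1", 250, 0.001, -30.0),  # passes, hypo
    DiffSite("scaffold2", 77, 0.200, 60.0),    # fails the q-value cutoff
    DiffSite("scaffold2", 90, 0.001, 10.0),    # fails the difference cutoff
]
hyper, hypo = select_differential(sites)
print(len(hyper), len(hypo))  # 1 1
```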

Page 13: Bioinformatics Final Product claire

Methylation Statistics

> getMethylationStats(gonad, plot = FALSE, both.strands = FALSE)

methylation statistics per base

summary (gonad):
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
   9.091  100.000  100.000   97.360  100.000  100.000

Percentiles (gonad):
    0%   10%   20%   30%   40%   50%   60%   70%   80%   95% 99.5% 99.9%  100%
  9.09   100   100   100   100   100   100   100   100   100   100   100   100

summary (gill):
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
    0.00    54.55    78.70    69.56    91.89  100.000

Percentiles (gill):
    0%   10%   20%   30%   40%   50%   60%   70%   80%   95% 99.5% 99.9%  100%
     0  21.4  46.6  61.1  70.9  78.7  84.6    90  93.7   100   100   100   100