DOI: 10.1186/1471-2199-15-22

Chromatin structure is distinct between coding and non-coding single nucleotide polymorphisms

Hongde Liu 1,*; Jingchen Zhai 1; Kun Luo 2; Lingjie Liu 1

1 State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China

2 Department of Neurosurgery, Xinjiang Evidence-Based Medicine Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi 830054, China

Supplementary Materials


Table S1. Linear classifier parameters (Col. 2) and discriminating capacity (Col. 3) of each of the HMs

HM           Parameter in the classifier    -log10(P-value) of a single HM
Const        1.0
H3K36me3     0.052                          189
H4K20me1     -0.130                         159
H3K9me2      -0.108                         132
H4K16ac      -1.398                         129
H2BK5me1     0.395                          124
H3K79me1     -0.164                         82
H3R2me1      -0.819                         75
H3K27me3     -4.089                         65
PolII        -0.356                         54
H3K9me3      0.433                          53
H3K4me3      -0.768                         44
H4K91ac      -2.121                         44
H3K9me1      -0.0142                        42
H3K27me1     -0.714                         30
H3K27me2     -0.711                         25
H3K23ac      1.514                          20
H3K14ac      -5.423                         19
H2BK120ac    0.254                          17
H3K4me2      -0.228                         16
H2BK5ac      -1.599                         16
H3K4me1      0.00146                        14
H3K79me3     -0.0648                        14
H3K4ac       0.130                          14
H3K27ac      0.331                          13
H3K9ac       -0.0319                        12
H4K12ac      -1.0626                        12
H2BK20ac     2.370                          11
H3R2me2      0.321                          11
H4K8ac       -1.139                         9
H2AK9ac      -5.144                         9
H3K18ac      0.273                          9
H4K20me3     -1.204                         4
H4K5ac       -1.278                         4
H3K36me1     0.866                          3
H2AZ         3.246                          1.5
H4R3me2      0.0722                         1.2
H3K79me2     -1.237                         1.255
H2BK12ac     0.100                          1.0
H2AK5ac      0.364                          1.0
CTCF         2.240                          1.0
H3K36ac      -2.539                         0.03
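As a rough illustration of how the parameters in Table S1 could enter a linear classifier, the sketch below computes a decision score as the constant term plus a weighted sum of chromatin-mark signals. It uses only the four marks with the largest -log10(P-value). The function name `linear_score`, the example signal values, and the treatment of "Const" as an intercept are assumptions made for illustration; how the signals were scaled and which cutoff was applied are not stated in this supplement.

```python
# Illustrative sketch only: a linear decision score built from a few Table S1 parameters.
# Signal scaling and the decision threshold used in the study are NOT given here.

# Subset of Table S1: chromatin mark -> parameter in the classifier.
PARAMS = {
    "H3K36me3": 0.052,
    "H4K20me1": -0.130,
    "H3K9me2": -0.108,
    "H4K16ac": -1.398,
}
CONST = 1.0  # assumption: the "Const" row is the intercept of the linear model


def linear_score(signals, params=PARAMS, const=CONST):
    """Return const + sum(parameter * signal); marks missing from `signals` count as 0."""
    return const + sum(w * signals.get(mark, 0.0) for mark, w in params.items())


# Hypothetical signal values for one SNP site (made up for the example).
example_signals = {"H3K36me3": 2.0, "H4K20me1": 1.0, "H3K9me2": 0.5, "H4K16ac": 0.1}
score = linear_score(example_signals)
print(f"decision score: {score:.4f}")
```

A site would then be called risk or neutral by comparing the score with a threshold; sweeping that threshold is what produces a ROC curve.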


Figure S1. A: Scheme indicating genomic regions for nine categories of SNPs; B: Number of SNPs in the nine categories of SNPs; C: SNP frequencies around transcription start sites (TSSs); D: Number of risk-associated SNPs (risk SNPs); the risk SNP data were retrieved from http://www.genome.gov/gwastudies/ (Hindorff et al., 2009); E: Distribution of risk SNPs in different genomic regions.

Panel A labels (5' to 3'): Near-gene-5 (2 kbp), TSS, 5'UTR, Exon, Intron, 3'UTR, TTS, Near-gene-3 (500 bp).

Panel B: Number of nine types of SNPs

Types of SNPs    Number     Total
Intron           4707162    4707162
Near-gene-5      218436
Near-gene-3      50817
UTR5             20648
UTR3             89937
Frameshift       538
*Coding-synon    45601
Missense         47557
Nonsense         663
UTR5 through Nonsense combined: 204944

* Randomly selected 14508 coding-synonymous SNPs as neutral SNPs

Panel D: Distribution of risk SNPs in different genomic loci

Distribution     Number
Exon             1898
Intron           6721
Inter-gene       2237
UTR5 & UTR3      3652
Total            14508

Panel E: Percentage of risk SNPs in different genomic loci

Intron: 46.3%
UTR5 & UTR3: 25.2%
Inter-gene: 15.4%
Exon: 13.1%
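The percentages in panel E follow directly from the risk-SNP counts in panel D. The short check below recomputes them; the variable names are illustrative and not taken from the original analysis.

```python
# Recompute the panel E percentages from the panel D counts of Figure S1.
risk_snp_counts = {
    "Exon": 1898,
    "Intron": 6721,
    "Inter-gene": 2237,
    "UTR5 & UTR3": 3652,
}

total = sum(risk_snp_counts.values())

# Share of risk SNPs per genomic region, rounded to one decimal as in the figure.
percentages = {
    region: round(100 * count / total, 1)
    for region, count in risk_snp_counts.items()
}

for region, pct in percentages.items():
    print(f"{region}: {pct}%")
```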


Figure S2. Profiles of nucleosome occupancy around random genomic loci. Random genomic loci were selected in exons, introns, UTR5, UTR3, and the 5' and 3' flanks of genes. The nucleosome occupancy profile was calculated around each set of random loci separately.

Panels: Exon, UTR5, Near-gene-3, UTR3, Near-gene-5, Intron.


Figure S3. Profiles of histone modifications near SNP sites in lymphoblastoid cells (GM12878 cells). A-D: Histone methylation around 5'-untranslated region (UTR5) SNP sites (A), 3'-untranslated region (UTR3) SNP sites (B), coding-synonymous SNP sites (C) and intron SNP sites (D), respectively, in lymphoblastoid cells; E and F: Binding of histone acetylases and deacetylases at neutral SNP sites (E) and risk SNP sites (F), respectively, in CD4+ T cells.

Panel titles: A: UTR5 (GM12878 Cells); B: UTR3 (GM12878 Cells); C: Coding-synon (GM12878 Cells); D: Intron (GM12878 Cells); E: Neutral SNPs (CD4+ T Cell); F: Risk SNPs (CD4+ T Cell).


Figure S4. Correlation coefficients of the profiles of HMs, H2AZ and CTCF between CD4+ T cells and lymphoblastoid cells (GM12878 cells). Each profile covers a 3000-bp genomic region around the SNPs. The calculation was done for four types of SNPs: UTR5-SNPs, UTR3-SNPs, coding-synonymous SNPs and intron-SNPs.
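A correlation coefficient between two profiles of the same mark can be computed as in the minimal sketch below. It assumes a Pearson coefficient, which the caption does not specify, and the two short profiles are made-up numbers standing in for signals binned across the 3000-bp window.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical binned profiles of one mark around SNPs in two cell types.
profile_cd4 = [0.8, 1.1, 1.9, 2.4, 1.7, 1.0]
profile_gm12878 = [0.7, 1.0, 2.1, 2.2, 1.5, 0.9]
r = pearson(profile_cd4, profile_gm12878)
print(f"r = {r:.3f}")
```

Values near 1 would indicate that the mark has a similar profile shape around SNPs in both cell types.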


Figure S5. Histone modifications can be used to distinguish risk SNPs from neutral SNPs. A: Histone modifications that differ between risk SNPs and neutral SNPs; P-values indicating the significance of the difference were calculated with a two-sample t-test; B: Receiver operating characteristic (ROC) curves of the linear classifier models that identify risk SNPs. The linear classifier parameters are listed in Table S1. Features refer to chromatin marks. The features are sorted by the significance of their P-values (Table S1). The first 4, 8, 16 and all features in this order were chosen, respectively, to construct the models. The area under the ROC curve (AUC) is indicated for each of the models.
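The AUC reported for each model can be read as the probability that a randomly chosen risk SNP receives a higher classifier score than a randomly chosen neutral SNP. The sketch below computes it by direct pairwise comparison, which is equivalent to the area under the ROC curve; the function name `roc_auc` and the scores are hypothetical and not taken from the study.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for risk (positive) and neutral (negative) SNPs.
risk_scores = [0.9, 0.8, 0.4]
neutral_scores = [0.5, 0.3, 0.2]
auc = roc_auc(risk_scores, neutral_scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of sites; for large SNP sets a rank-based computation gives the same value more efficiently.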


Figure S6. A: GC-content profiles around the nine categories of SNPs; B: Average GC-content in a 2-kbp region around the nine categories of SNPs; C: Nucleosome occupancy profiles for both base transitions (A/G and C/T) and base transversions (G/T, A/C, C/G and A/T) in both risk SNPs and neutral SNPs.
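The two quantities behind this figure, the GC-content of a sequence window and the transition/transversion class of a SNP, can be sketched as follows. The helper names and the toy sequence are illustrative assumptions; the substitution classes follow the definition in the caption (A/G and C/T are transitions, all other pairs are transversions).

```python
VALID_BASES = set("ACGT")
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def substitution_class(allele_1, allele_2):
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    a, b = allele_1.upper(), allele_2.upper()
    if a not in VALID_BASES or b not in VALID_BASES or a == b:
        raise ValueError("alleles must be two different bases from A, C, G, T")
    return "transition" if frozenset((a, b)) in TRANSITIONS else "transversion"


def gc_content(sequence):
    """Fraction of G and C among the A/C/G/T bases of a sequence window."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / counted


# Toy window standing in for the 2-kbp region around a SNP.
window = "GGCCAATT"
print(gc_content(window), substitution_class("A", "G"), substitution_class("C", "G"))
```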