
Post on 28-Dec-2015


Identification of fusion transcripts with retroviral elements and its application as a cancer biomarker

Yun-Ji Kim1, Jae-Won Huh2, Dae-Soo Kim3, Hong-Seok Ha1, Kung Ahn1, Ja-Rang Lee1, Yi-Deun Jung1, and Heui-Soo Kim1

1 Division of Biological Sciences, College of Natural Sciences, Pusan National University, Busan 609-735, Republic of Korea
2 National Primate Research Center (NPRC), KRIBB, Ochang, Chungbuk 363-883, Republic of Korea
3 Korea Bioinformation Center, KRIBB, Daejeon 305-806, Korea

http://www.primate.or.kr

Abstract

The human genome is estimated to be composed of 45% transposable elements (TEs), which have been reported to affect adjacent genes by altering their transcriptional regulation. Most TEs are transcriptionally silent in normal tissues, but some have been found to be expressed specifically in cancer cell lines. Here we investigated cancer-specific fusion transcripts containing TEs using bioinformatic and experimental approaches. To identify candidate cancer markers, we adopted an analysis pipeline that screens expressed human sequences for cancer-specific expression, and we developed a database of the results. In total, 999 genes fused with transposable elements were found to be cancer-specific in our analysis of the EST database. To confirm the candidate marker transcripts, experimental validation was conducted by RT-PCR analysis in tumor/adjacent normal tissues and corresponding cancer cell lines. Our results could contribute to understanding human cancers in relation to transposable elements.

References

1. Kim TH, Jeon YJ, Kim WY, Kim HS: HESAS: HERVs expression and structure analysis system. Bioinformatics 2005, 15:1699-1970.

2. Kim DS, Kim TH, Huh JW, Kim IC, Kim SW, Park HS, Kim HS: LINE FUSION GENES: a database of LINE expression in human genes. BMC Genomics 2006, 7:139.

Figure. Hypothetical model for retroelements in human genome. Panels: transcription change by supplying the promoter or enhancer (promoter region, exon 1); exonization in UTR and CDS region (exon 1, exon 2); alternative promoter (exon 1, exon 2); alternative polyadenylation (last exon).

Figure. Classification of retroelements (RNA intermediate). Branch labels: - RT / + RT, - LTR element / + LTR element, - env / + env. Classes: Retroposon, SINE, LINE, Retrotransposon, Retrovirus. Example structures: Human Alu (P, poly(A)); L1 (P, ORF1, ORF2, poly(A)); Human THE1 (LTR, LTR); Yeast Ty1/copia/truncated HERVs (LTR, ORF1, ORF2, LTR); full-length HERVs/exogenous retrovirus (LTR, gag, pol, env, LTR).

Figure. Location of transposable element fusion ESTs (y-axis: percent of exons, %): 5′UTR 11%, CDS 82%, 3′UTR 6%.

Figure. Transposable element fusion EST counts per gene (x-axis: EST counts 1, 2, 3, 4, 5, 6, 11, 17; y-axis: genes, %). Bar values, largest to smallest: 79.8%, 13.6%, 3.8%, 1.6%, 0.7%, 0.2%, 0.1%, 0.1%.

Aims

Most TEs are transcriptionally silent in normal human tissues; however, some TEs have been found to be expressed in placental tissues and cancer cell lines. L1 antisense promoter-driven transcription has been detected in both human tumor cells and normal cells, while HERV LTR elements have shown bidirectional promoter activity (Medstrand et al., 2001; Nigumann et al., 2002; Dunn et al., 2003; Sin et al., 2006). These elements could contribute to organismal complexity through transcriptional diversity (Landry et al., 2003). Here, we developed a database for understanding the mechanism of cancer development in relation to TEs in human EST sequences, and conducted experimental validation using RT-PCR in tumor/adjacent normal tissues and corresponding cancer cell lines to confirm the candidate marker transcripts.
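The poster does not spell out the screening rule used to call a fusion transcript cancer-specific. The sketch below illustrates one simple rule, offered only as an assumption: a gene is a candidate when every TE-fused EST mapped to it comes from a cancer library. The record fields, gene names, and EST identifiers are hypothetical and are not taken from the database described here.

```python
from collections import defaultdict
from typing import NamedTuple


class EstRecord(NamedTuple):
    est_id: str     # hypothetical EST identifier
    gene: str       # gene the EST maps to
    source: str     # library of origin: "cancer" or "normal"
    te_fused: bool  # True if the EST contains a TE-derived segment


def cancer_specific_genes(records):
    """Return genes whose TE-fused ESTs come only from cancer libraries."""
    sources_by_gene = defaultdict(set)
    for rec in records:
        if rec.te_fused:
            sources_by_gene[rec.gene].add(rec.source)
    return sorted(g for g, s in sources_by_gene.items() if s == {"cancer"})


# Toy records, invented purely for illustration.
records = [
    EstRecord("EST_A", "GENE1", "cancer", True),
    EstRecord("EST_B", "GENE1", "cancer", True),
    EstRecord("EST_C", "GENE2", "cancer", True),
    EstRecord("EST_D", "GENE2", "normal", True),
    EstRecord("EST_E", "GENE3", "normal", False),
]

print(cancer_specific_genes(records))
```

In this toy input only GENE1 passes, because GENE2 also has a TE-fused EST from a normal library and GENE3 has no TE-fused EST at all. A real screen would likely also need a minimum EST count per gene, which this sketch leaves out.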

RT-PCR & Real-time PCR

Bioinformatics

NCBI, BLAST, MEGA3

Table. Distribution of transposable element families by region of transposable element exonization

Fusion region within genes   SINE family   LINE family   LTR family   DNA family   Others
5′UTR                        76            30            33           5            0
CDS                          619           280           85           76           1
3′UTR                        44            20            14           5            0
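The 11%, 82%, and 6% reported for the 5′UTR, CDS, and 3′UTR appear to follow from the row totals of this table. The short calculation below shows the arithmetic; the variable names are illustrative, and the assumption that the percentages were derived this way is ours.

```python
# Counts copied from the table: region -> exons per TE family
# in the order SINE, LINE, LTR, DNA, Others.
counts = {
    "5'UTR": [76, 30, 33, 5, 0],
    "CDS": [619, 280, 85, 76, 1],
    "3'UTR": [44, 20, 14, 5, 0],
}

# Row totals and the overall number of TE-derived exons.
region_totals = {region: sum(row) for region, row in counts.items()}
grand_total = sum(region_totals.values())

for region, total in region_totals.items():
    share = 100 * total / grand_total
    print(f"{region}: {total} exons, {share:.1f}%")
```

The row totals are 144, 1061, and 83 out of 1288, which round to the 11%, 82%, and 6% given for the three regions.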

Figure. AKR1C2 (aldo-keto reductase family 1, member C2), Chr. 10p15.1. Sequences shown: NM_2058453.1, NM_001354.4, and CB106780 (exon labels 1 and 10; 1 and 11). Annotated elements: LTR/MaLR MLT1L, LINE/L1, LTR/MaLR MSTA. RT-PCR in liver normal (N) and cancer (C) tissues at 30, 32, and 34 cycles; product 300 bp; GAPDH 120 bp.

Figure. TJP2 (tight junction protein 2, zona occludens 2), Chr. 9q21.11. Sequences shown: NM_004817.2, NM_201629.1, and AW604158 (exon labels 1 and 23; 1 and 21). Annotated element: AluJo/FRAM. Legend: coding region, untranslated region. RT-PCR in colon normal (N) and cancer (C) tissues; product 168 bp; GAPDH 120 bp.


Table. Potential splice sites utilized by transposable element fusion exons

Type of potential splicing site   SINE family   LINE family   LTR family   DNA family
Acceptor & donor                  83            68            50           12
Acceptor site                     271           110           33           28
Donor site                        216           80            43           18

Table. Transposable element occurrences by subfamily

Family   Subfamily   Occurrences   Percent (%)
SINE     Alu         20            1.44
SINE     AluJ        171           12.35
SINE     AluS        244           17.62
SINE     MIR         250           18.05
SINE     FAM         2             0.14
SINE     FRAM        18            1.30
SINE     FLAM        37            2.67
LINE     HAL         13            0.94
LINE     L1HS        1             0.07
LINE     L1P         18            1.30
LINE     L1M         153           11.05
LINE     L2          151           10.90
LINE     L3          25            1.81
LTR      MaLR        67            4.84
LTR      ERV1        40            2.89
LTR      ERVL        27            1.95
LTR      ERVK        6             0.43
DNA      Charlie     9             0.65
DNA      HSMAR2      2             0.14
DNA      Kanga1      1             0.07
DNA      MARNA       3             0.22
DNA      MER         61            4.40
DNA      Tigger      14            1.01
DNA      Zaphod2     1             0.07
Others   Charlie     1             0.07

Table. Transposable element fusion in gene region

Family   Subfamily   5′UTR   CDS   3′UTR
SINE     Alu         0       20    0
SINE     AluJ        20      131   12
SINE     AluS        13      190   15
SINE     AluY        3       37    5
SINE     MIR         33      198   7
SINE     FAM         0       2     0
SINE     FRAM        0       16    2
SINE     FLAM        7       25    3
LINE     HAL         0       11    0
LINE     L1HS        0       1     0
LINE     L1P         1       12    5
LINE     L1M         6       125   6
LINE     L2          22      111   7
LINE     L3          1       20    2
LTR      MaLR        16      40    6
LTR      ERV1        13      23    3
LTR      ERVL        4       16    5
LTR      ERVK        0       6     0
DNA      Charlie     0       9     0
DNA      HSMAR2      0       2     0
DNA      Kanga1      0       0     1
DNA      MARNA       0       3     0
DNA      MER         5       50    3
DNA      Tigger      0       11    1
DNA      Zaphod2     0       1     0
Others   Charlie     0       1     0
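The subfamily counts by gene region can be checked against the family-level distribution table by summing each family's rows. The sketch below does that with the numbers copied from the two tables; the grouping of subfamilies into families is inferred from the order of the rows and should be read as an assumption.

```python
# Subfamily counts per gene region as (5'UTR, CDS, 3'UTR), copied from the table.
subfamily_counts = {
    "SINE": {
        "Alu": (0, 20, 0), "AluJ": (20, 131, 12), "AluS": (13, 190, 15),
        "AluY": (3, 37, 5), "MIR": (33, 198, 7), "FAM": (0, 2, 0),
        "FRAM": (0, 16, 2), "FLAM": (7, 25, 3),
    },
    "LINE": {
        "HAL": (0, 11, 0), "L1HS": (0, 1, 0), "L1P": (1, 12, 5),
        "L1M": (6, 125, 6), "L2": (22, 111, 7), "L3": (1, 20, 2),
    },
    "LTR": {
        "MaLR": (16, 40, 6), "ERV1": (13, 23, 3),
        "ERVL": (4, 16, 5), "ERVK": (0, 6, 0),
    },
    "DNA": {
        "Charlie": (0, 9, 0), "HSMAR2": (0, 2, 0), "Kanga1": (0, 0, 1),
        "MARNA": (0, 3, 0), "MER": (5, 50, 3), "Tigger": (0, 11, 1),
        "Zaphod2": (0, 1, 0),
    },
    "Others": {"Charlie": (0, 1, 0)},
}

# Family-level counts as (5'UTR, CDS, 3'UTR), from the distribution table.
family_counts = {
    "SINE": (76, 619, 44),
    "LINE": (30, 280, 20),
    "LTR": (33, 85, 14),
    "DNA": (5, 76, 5),
    "Others": (0, 1, 0),
}


def column_sums(rows):
    """Sum a collection of equal-length tuples column by column."""
    return tuple(sum(col) for col in zip(*rows))


for family, subs in subfamily_counts.items():
    sums = column_sums(subs.values())
    status = "matches" if sums == family_counts[family] else "differs"
    print(family, sums, status)
```

With this grouping, every family's column sums agree with the family-level table.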

Figure labels: Experimental data (tumor/adjacent normal tissues); Computational data; DATABASE.
