rna in development and regulatory roles of noncoding...

52
ACTA UNIVERSITATIS UPSALIENSIS UPPSALA 2013 Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 940 Regulatory Roles of Noncoding RNA in Development and Disease GAURAV KUMAR PANDEY ISSN 1651-6206 ISBN 978-91-554-8786-7 urn:nbn:se:uu:diva-209596

Upload: leduong

Post on 28-Apr-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2013

Digital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 940

Regulatory Roles of NoncodingRNA in Development andDisease

GAURAV KUMAR PANDEY

ISSN 1651-6206ISBN 978-91-554-8786-7urn:nbn:se:uu:diva-209596

Dissertation presented at Uppsala University to be publicly examined in Rudbecksalen,Rudbeck laboratory, Uppsala, Wednesday, 4 December 2013 at 09:30 for the degree of Doctorof Philosophy (Faculty of Medicine). The examination will be conducted in English. Facultyexaminer: Professor Karl Ekwall (Department of Biosciences and Nutrition, KarolinskaInstitute, Stockholm).

AbstractPandey, G. K. 2013. Regulatory Roles of Noncoding RNA in Development and Disease.Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine940. 51 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-8786-7.

Long noncoding RNAs (lncRNAs) are being realized as important players in gene regulation andtheir misregulation has been considered as one of the underlying causes for tumor initiation andprogression in many human pathologies. In the current thesis, I have addressed the functionalrole of lncRNAs in development and disease model systems.

Genomic imprinting is an epigenetic phenomenon by which subset of genes are expressedin a parent of origin-specific manner. The Kcnq1 imprinted locus is epigenetically regulated byKcnq1ot1 lncRNA. Deletion of an 890bp region at the 5’ end of Kcnq1ot1 in mouse resultedin the loss of silencing of neighboring ubiqui-tously imprinted genes (UIGs). In addition,we observed loss of DNA methylation at the UIG promoters. We have shown that Kcnq1ot1RNA establishes CpG methylation by interacting with DNMT1. To explore the stability oflncRNA mediated silencing pathways, we have conditionally deleted Kcnq1ot1 in the mouse ina stage and tissue-specific manner. We have shown that Kcnq1ot1 is continuously required formaintaining the silencing of UIGs, whereas the silencing of the placental im-printed genes ismaintained in an RNA independent manner.

To identify chromatin-associated lncRNA (CARs) on a genome-wide scale, we purified RNAfrom the sucrose gradient fractionated chromatin and subjected it to RNA sequencing. Our studyhas identified 141 intronic and 74 long intergenic CARs. Characterization of one of the CARsrevealed that it regulates the expression of neighboring genes in cis by modulating the chromatinstructure.

We have explored the functional role of lncRNA in tumor progression and initiation byusing pediatric neuroblastoma. By transcriptional profiling of low- and high-risk tumors, wehave identified several lncRNAs differentially expressed between these subtypes. We report anuncharacterized RNA NBAT-1, expressed at lower levels in high-risk tumors relative to low-risk tumors. Using neuroblastoma cell culture system, we demonstrated that NBAT-1 has anti-cell proliferative and anti-invasive properties. In addition, it promotes differentiation of neuronsfrom undifferentiated neuroblastoma cell lines.

In summary, by employing mouse genetics, cell culture based model system and expressionprofiling in tumors, we have uncovered new roles of lncRNA in gene regulation.

Keywords: Noncoding RNA, Genomic Imprinting, Epigenetics, Neuroblastoma

Gaurav Kumar Pandey, Department of Immunology, Genetics and Pathology,Rudbecklaboratoriet, Uppsala University, SE-751 85 Uppsala, Sweden.

© Gaurav Kumar Pandey 2013

ISSN 1651-6206ISBN 978-91-554-8786-7urn:nbn:se:uu:diva-209596 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-209596)

Dedicated to my family

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Mohammad F, Mondal T, Guseva N, Pandey GK, Kanduri C.

Kcnq1ot1 noncoding RNA mediates transcriptional gene silenc-ing by interacting with Dnmt1. Development. 2010 Aug 1; 137(15):2493-9.

II Mohammad F*, Pandey GK*, Mondal T*, Enroth S, Redrup L,

Gyllensten U, Kanduri C. Long noncoding RNA-mediated maintenance of DNA methylation and transcriptional gene si-lencing. Development. 2012 Aug; 139(15):2792-803.

III Mondal T, Rasmussen M, Pandey GK, Isaksson A, Kanduri C. Characterization of the RNA content of chromatin. Genome Res. 2010 Jul; 20(7):899-907.

IV Pandey GK, Mitra S, Subhash S, Mishra K, Fransson S, Gane-

shram A, Mondal T, Kanduri M, Abrahamsson J, Pfeifer S, Larsson E, Martinsson T, Kogner P, Hedborg F, Kanduri C. Transcriptomic profiling of primary neuroblastomas reveals a high-risk tumor associated long noncoding RNA NBAT1, with functional roles in cell proliferation and neuronal differentia-tion. (Manuscript)

*-These authors contributed equally to the work. Reprints were made with permission from the respective publishers.

Contents

Introduction ................................................................................................... 11 Complexity of the eukaryotic gene regulation ......................................... 11 Epigenetic regulators of gene expression ................................................. 12

DNA Methylation ................................................................................ 13 Histone Modification ........................................................................... 14 Noncoding RNA .................................................................................. 15

High-throughput strategies to identify ncRNAs on a genome wide scale .......................................................................................................... 16

Array based approaches ....................................................................... 16 Next generation sequencing methods .................................................. 17 Immunoprecipitation based approaches ............................................... 19

LncRNAs as regulators of developmental processes ............................... 20 Genomic Imprinting and ncRNA ......................................................... 20 ncRNAs and Dosage compensation ..................................................... 24 Other roles of lncRNAs in animal development .................................. 26

LncRNAs in disease ................................................................................. 28 ANRIL lncRNA .................................................................................... 28 HOTAIR lncRNA ................................................................................. 28 MALAT1 lncRNA ................................................................................ 29 LncRNAs in prostate cancer ................................................................ 29 LncRNAs in neuroblastoma ................................................................ 30

Mechanisms of LncRNA mediated gene regulation ................................ 31 1. Recruitment of Chromatin modifying enzymes ............................... 31 2. Regulation of tumor suppressor or oncogene signaling ................... 33 3. Sequestration of microRNA ............................................................ 33 4. Regulators of mRNA processing, stability and translation .............. 33 5. Decoys of transcription factors ........................................................ 34

Aims .............................................................................................................. 35

Results and Discussion ................................................................................. 36 Paper I: Kcnq1ot1 noncoding RNA mediates transcriptional gene silencing by interacting with Dnmt1. ....................................................... 36 Paper II: Long noncoding RNA mediated maintenance of DNA methylation and transcriptional gene silencing. ....................................... 37 Paper III: Characterization of the RNA content of chromatin.................. 39

PAPER IV: Transcriptomic profiling of primary neuroblastomas reveals a high-risk tumor associated long noncoding RNA NBAT1, with functional roles in cell proliferation and neuronal differentiation. ... 40

Acknowledgement ........................................................................................ 42

References ..................................................................................................... 45

Abbreviations

ncRNA Non coding RNA

lncRNA Long noncoding RNA

PRC2 Polycomb repressive complex 2

PRE Polycomb responsive element

RdDM RNA-directed DNA methylation

RDR2 RNA-dependent RNA polymerase

DCL3 DICER-like 3

DRM2 DNA methyltransferase

TGS Transcriptional gene silencing

ES Embryonic stem cells

iPSCs Induced pluripotent stem cells

CAR Chromatin associated RNA

LincRNA Large intergenic noncoding RNA

ICR Imprinting control region

DMR Differentially methylated region

UIG Ubiquitously imprinted genes

PIG Placental imprinted genes

XCI X chromosome inactivation

DCC Dosage compensation complex

HOX Homeobox

PCAT Prostate cancer associated transcript

11

Introduction

The cellular complexity of higher eukaryotic organisms has generated a great deal of scientific interest since several years. Many theories and hy-pothesis have prevailed through the course of time explaining the intricate complexity, cellular diversity and tight gene regulation present in the highly evolved organisms like mammals. The initial hypothesis prevalent before the genomic revolution, were based on the assumptions that the number of genes linearly increase with the organismal size and developmental complexity. This increase was presumed mainly in terms of protein coding genes, as only they were envisaged to regulate the eukaryotic gene regulatory network. A dramatic change in this perception about the basis for the organismal com-plexity in higher organisms occurred after the genomic revolution in 90s and early 2000s. A very significant realization based on the results of high- throughput studies came in the form of the number of protein coding genes present in higher eukaryotes. The human genome contains around 20-25000 protein coding genes (far less than the earlier estimates) which is almost the same in number as present in nematode, Caenorhabditis elegans[1, 2]. This is termed as the ‘G-value’ paradox. It is very hard to visualize that similar number of protein coding genes could regulate, both a worm of 1000 cells and humans with diverse cell types and complex body plan. This contradic-tion in number of genes has made us to focus on the aspect of gene regula-tion, in order to understand and decipher the basis for complexities of the cellular diversity and development in higher organisms.

Complexity of the eukaryotic gene regulation The conventional view regarding eukaryotic gene regulation has been largely protein centric. However, the evidences from the high-throughput studies have suggested that the eukaryotic genome is pervasively transcribed, and proteins coding RNAs form a very minor part (1.5%) of the transcriptional output. The major part of the transcribed genome is noncoding and is consti-tuted of RNAs lacking coding potential, termed as noncoding RNAs (ncRNAs)[3-5]. High-throughput approaches have lead to identification of several classes of ncRNAs transcribed in the eukaryotic genome in diverse tissue types and developmental stages. The ncRNAs are mainly comprised of ribosomal RNAs (rRNA) and transfer RNAs (tRNAs) involved in house-

12

keeping functions, small RNAs such as microRNAs (miRNA), snoRNA, piwi-interacting RNAs (piRNA) and long noncoding RNAs (lncRNAs). The small RNAs are highly conserved and are involved in transcriptional and post-transcriptional gene silencing. In contrast, the lncRNAs (defined as RNAs greater than 200 nt in length) are poorly conserved and regulate gene expression through diverse mechanisms not yet fully understood[6-8]. How-ever, lncRNAs have been reported in a myriad of biological processes such as genomic imprinting, RNA maturation and transport, cellular differentia-tion, pluripotency maintenance, X inactivation and signal transduction[5, 9-13]. With lncRNAs being implicated as key players in regulating gene ex-pression in diverse cellular and developmental processes, it is widely accept-ed that lncRNAs form a major component of the eukaryotic gene regulatory network. Understanding their roles in greater detail and uncovering the mechanisms behind their functions would provide us with much improved understanding of the eukaryotic gene regulation and help us explain in part, the organismal and developmental complexity of higher organisms[14]. In the current thesis, I will begin with describing the different strategies that have been used to identify several different classes of lncRNAs. With in-creasing repertoire of lncRNAs being reported, it becomes relevant to under-stand their functional roles. Hence, I will discuss how lncRNAs are playing crucial roles in diverse developmental processes. Due to their significant roles in regulating these processes, it is highly probable that misregulation in expression of lncRNAs could be one of the underlying causes of disease such as cancer. Elaborating more on this point, I would state examples of various lncRNAs which have been reported to be misregulated in disease states and what kind of roles they are involved in disease progression. Final-ly, I will summarize the current known mechanisms of lncRNA mediated gene regulation. Before, I start discussing the lncRNAs in detail, it is relevant to introduce the known epigenetic modifications and how in combination with these modifi-cations, ncRNA bring about gene regulation.

Epigenetic regulators of gene expression The diverse cell types in multicellular organisms have the same genetic composition, yet are distinctly different from each other in expression of genes. This differential expression can be understood, in part through the phenomenon of epigenetics. It is defined as the study of the inheritable mod-ifications of the DNA or DNA associated proteins without any changes in the DNA sequences. Two well-known epigenetic modifications, DNA meth-

13

ylation and histone modification are known to act in conjugation with ncRNAs to bring about transcriptional regulation.

DNA Methylation

DNA methylation is one of the most well studied epigenetic mark present predominantly in the eukaryotic genome on the cytosine nucleotide (5-methyl cytosine) and, it is present as CpG dinucleotides across the genome[15].

The mammalian DNA methylation machinery consists of three families of DNA methyltransferases (DNMTs). DNMT3A and DNMT3B are the de novo methyltransferases involved primarily in the establishment of the methylation pattern whereas DNMT1 is the maintenance methyltransferase required for the faithful propagation of the methylation imprint over cellular divisions. Another member of the DNMT3 family, DNMT3L has been ob-served to be essential for establishment of methylation at most of the im-printed loci[16]. This DNMT lacks methyltransferase activity but can stimu-late the activity of the de novo DNMTs like DNMT3A and DNMT3B[17, 18]. The classical view regarding methylation has been that once established, it is successfully perpetuated across generations by DNMT1 through its abil-ity to specifically recognize the hemimethylated DNA. However, recent studies are providing evidences that other factors are required in addition to DNMT1, for maintenance of methylation. For example, UHRF1, a chroma-tin associated protein, which specifically binds to hemimethylated CpG nu-cleotides, has been shown to interact with DNMT1. In addition, loss of UHRF1 in embryonic stem (ES) cells leads to a drastic decrease in methyla-tion indicating its role in DNA methylation[19]. It has been reported that UHRF1 is involved in maintenance of DNA methylation by regulating the stability of DNMT1, through its association with methylated histone H3 on lysine 9 residue[20].

The CpG dinucleotides are underrepresented in the eukaryotic genomes. It is due to the reason that, methylated cytosines are prone to deamination. They can get converted to uracil, which is removed by the base excision pathway or to thymine which makes it susceptible to mutagenesis. The majorities (50-70%) of the CpG dinucleotides are methylated in the vertebrate genome with an exception of hypomethylated regions called as CpG islands[21, 22]. These CpG islands are characterized as, the unmethylated regions present mostly near the gene promoters, which are around 1000 base pairs in length and show high GC content. The only exception to the unmethylated CpG islands, are the imprinted genes and genes undergoing X inactivation. NcRNAs, transcriptional events and transcription factors are known to be involved in methylation of these CpG islands present at imprinted genes and genes undergoing X inactivation [23, 24].

14

The unmethylated CpG islands are believed to keep the transcription initia-tion of the gene promoters going and methylation at these regions could lead to repression of the associated gene transcription. Indeed, these islands have been shown to be present over the transcription start site of the entire house keeping genes and on a significant number of tissue-specific or regulatory genes[25]. Also, a genome wide study in humans has found that the majority (72%) of the annotated gene promoters (Refseq database) are associated with high CpG content[26]. An interesting aspect associated with these islands is, how do CpG islands escape de novo methylation? It has been observed that CpG islands are enriched with active histone marks, histone H3 trimethyl lysine 4 (H3K4me3) suggesting that active chromatin at these islands could be crucial for their unmethylated state. Indeed, it has been reported that H3K4me3 prevents the access of DNMT3L, which can directly interact with umethylated tail of histone H3[27]. In addition, the protein Cfp1 has been found to be present over the unmethylated CpG islands, whose depletion leads to reduced levels of H3K4me3 indicating its involvement in maintain-ing the unmethylated state of the CpG islands by regulating the local active chromatin structure[28].

Histone Modification In eukaryotes the DNA is packed with proteins known as histones and forms the basic structural components of the eukaryotic chromatin. The dimers of core histone proteins H2A, H2B, H3 and H4 constitute the core octameric nucleosomal molecule with 147bp of DNA wrapped around it. Along with these four core histone proteins there is histone protein H1 which forms an important component of higher order chromatin organization. The N-terminal tails of the histones undergo post translational modifications (PTM). These PTM include modifications such as methylation, acetylation, ubiquitylation, phosphorylation and sumoylation. Most of these modifica-tions occur mainly on the lysine (K) and arginine (R) residues. These modi-fications affect the accessibility of various regulatory factors and complexes to chromatin and thus play a vital role in regulating gene expression. H3K4 di or tri methylation (me) are mainly associated with active promoters, H3K36me3 and H3K79me2 are associated with the transcribed regions whereas H3K27me3, H3K9me2 and H3K9me3 are present mainly over re-pressed genes[29]. The distal regulatory elements such as enhancers are en-riched mainly with H3K4 monomethylation and H3K27 acetylation. The active mark (H3K4me3) and repressive mark (H3K27me3) can be present together in ES cells as a bivalent chromatin[30]. This bivalent chromatin feature is thought to provide genes a flexibility to commit itself to an active or inactive state in specific lineages during subsequent differentiation. These histone modifications are catalyzed by specific enzymes such as histone methyltransferases (HMTs), histone acetylases (HATs) and histone deacety-

15

lases (HDACs). Polycomb repressive complex 1 (PRC1) and 2 (PRC2) are involved in regulation of repressive chromatin mark, H3K27me3 [31]. The PRC2 complex is composed of three major proteins, SUZ12, EED and EZH2 which is a HMT. These PRC2 members are known to be recruited by poly-comb repressive elements (PRE) in Drosophila for PRC2 mediated gene repression[32]. However not much is known about presence of such ele-ments in mammals. Another interesting aspect associated with PRC2 mem-bers is, how does the faithful propagation of the repressive marks take place. Biochemical and structural studies have revealed that the repressive mark H3K27me3 promotes its own propagation on the newly incorporated nucleo-somes at the end of mitosis[33]. This is done by the allosteric activation of the PRC2 activity by the c-terminal domain of the PRC2 member EED[34].

Noncoding RNA NcRNAs are emerging as novel epigenetic regulators which can work in conjugation with both DNA methylation and histone modifications machin-eries to bring about gene regulation. The role of small RNAs in transcrip-tional gene silencing (TGS) and post-transcriptional gene silencing is very well studied. We will discuss here the examples of mainly small RNAs and how they drive TGS via DNA methylation or histone modifications. Also, we will try to correlate it with the known information about their longer counterparts (lncRNAs) in TGS. (A) RNA and DNA methylation Role of ncRNAs in regulating gene expression via DNA methylation is re-ported in both plants and animals. RNA-directed DNA methylation (RdDM) was first reported in plants, following the observation that upon introduction of viroid cDNAs in tobacco genome, only the viroid sequences became methylated whereas the neighboring genomic DNA remained unaltered [35]. Since then, mechanisms of RdDM have been well studied in Arabidopsis. In Arabidopsis, single stranded long transcripts are transcribed from the repeti-tive heterochromatic regions by a specialized polymerase, RNA Pol IV. The single strands of these transcripts are used to produce the complementary strand by a RNA-dependent RNA polymerase (RDR2), through its associa-tion with RNA Pol IV. Once double stranded (ds) RNAs are produced, DIC-ER-like 3 (DCL3) enzymes cleave these ds transcripts to generate 24 nt, small interfering RNAs (siRNAs). These siRNAs once loaded with the Ar-gonaute 4 protein (AGO4) during their nuclear-cytoplasmic-nuclear transport, target the denovo DNA methyltransferase (DRM2) to methylate and silence the target genes[7, 36].

Along with small RNAs, examples of lncRNA are also known to be in-volved in regulating DNA methylation are also known. Kcnq1ot1 lncRNA has been shown to interact with DNMT1 to regulate the differential methyla-

16

tion at the somatic DMRs of Cdkn1c and Slc22a18 genes[37]. In fact, using a conditional knockout approach, it was shown that continuous expression of RNA is required to maintain the differential methylation of these imprinted genes across cell generations during mouse development[38]. Similarly, Tsix lncRNA is reported to silence the Xist promoter by associating with DNMT3A[39, 40]. In another example, DNMT3b has been observed to as-sociate with triplex formed by promoter RNAs and ribosomal DNA (rDNA) promoter to methylate and silence rDNA[41]. (B) RNA and histone modifications- In yeast, small RNAs have been shown to be critical for heterochromatin initiation and maintenance at the centro-meric repeats. Removal of the components of RNAi machinery affected the silent state of the transgenes integrated at the centromeres. Loss of repressive histone mark H3K9 methylation coupled with increase in H3K4me3 and transcriptional activation of the repeat sequence was also observed. Accord-ing to the current model of TGS in yeast, dsRNA transcribed from these repetitive regions are processed into siRNAs by Dcr1. These siRNAs guide the RNA-induced transcriptional silencing complex (RITSC) to pericentro-meric regions along with Chp1 (chromodomain protein). Once localized, it silences the corresponding loci through recruitment of the H3K9me3 methyl-transferase, Clr4[7, 36]. Similar to small RNAs, lncRNAs such as Kcnq1ot1, Air have been shown to interact with mammalian H3K9me3 HMT, G9a [42, 43]. Additionally, lncRNAs (RepA, Kcnq1ot1, HOTAIR, ANRIL) have also been observed to associate with other repressive modifiers such as PRC2 complex[42, 44-46].

High-throughput strategies to identify ncRNAs on a genome wide scale Research groups have employed various strategies to globally identify and catalogue the lncRNAs in mammalian transcriptome. We would discuss some of these approaches which have evolved through the years to identify lncRNAs and the current methods or criteria which have been used to effi-ciently annotate them.

Array based approaches At the beginning of 21st century, the sequencing of the human genome was nearly complete and the next challenge ahead was to decode the hidden bio-logical information in these sequences. To interrogate these sequences at a global scale, high-throughput approaches were needed to be employed and the advent of microarray technology proved to be a very useful tool at that

17

time. The initial glimpse into presence of noncoding transcribed regions in mammalian genome was provided with use of microarrays. Kapranov et al used an oligonucleotide array to map transcribed region of human chromo-some 22, in 11 different cell lines and observed that many unannotated re-gions are transcribed in comparison to annotated regions [47]. Similar results were described in another study using polyA(+) placental RNA on chromo-some 22 where, unannotated transcribed regions (putative noncoding re-gions) were observed in equal number to annotated protein coding genes[48]. With technological advances, arrays have become one of most commonly used methods to identify or screen lncRNAs associated to a specific biologi-cal question. There are several reports which have used microarrays or tiling arrays to identify lncRNAs. Using a customized microarray, Dinger et al identified 945 lncRNAs expressed during differentiation of mouse ES cells to embryoid bodies[49]. Out of these, 174 lncRNAs were observed to be differentially expressed and correlated with pluripotency or differentiation events[50]. In additional studies, Mercer et al reported systematic analysis of lncRNA expression during neuronal and oligodendrocyte lineage specifica-tion, neuronal-glial fate switching and neural linage maturation [51]. Anoth-er elegant approach has used information generated from genome-wide chromatin maps of H3K4me3-H3K36me3 (active marks at promoter and gene body respectively) domains across four mouse cell types, to generate a tilling array with 2.1 million oligonucleotide probes representing the identi-fied 350 K4-K36 domains. Around 1,600 long intervening noncoding RNAs (lincRNAs) were identified from the analyzed four mouse cell types [52, 53]. Extending the same tilling array, around 3,300 human lincRNAs were identi-fied mapping to K4-K36 chromatin signatures. Using this catalog of identi-fied lincRNAs, Loewer et al designed a microarray of 900 lincRNA and hybridized it to RNA from fibroblast cells, derived iPS cells and ES cells. They detected 133 and 104 lincRNAs as induced or repressed across all iP-SCs and ES cells as compared to fibroblast cells[54].

Next generation sequencing methods RNA sequencing or famously known as RNA-seq is a powerful tool which provides in depth sequencing of transcripts at a single nucleotide resolution allowing identification, detection and quantification of known as well as novel lncRNAs. The steps followed in a standard RNA-seq procedure are presented below.

18

Figure1: Flow chart depicting the steps in RNA-sequencing, used commonly for identification of known as well as novel lncRNAs.

RNA-seq provides selective advantage over microarrays in identification of novel transcripts and has proved to be a handy tool in unbiased identification of diverse classes of noncoding transcripts across cell types. We would discuss some of these classes of ncRNAs whose genome-wide view/presence has been possible through use of these next generation sequencing (NGS) approaches.

(A) Antisense transcripts- These classes of ncRNAs are transcribed from the complementary strand of the sense transcript and overlap in part with it. Large scale cDNA sequencing in the FANTOM project have revealed the presence of widespread antisense transcription in mammalian genome. It was observed that at least 25-40% of mam-malian protein coding genes have overlapping transcription[55]. The function of majority of these antisense transcripts remains unkown, however there are many very well studied examples of antisense lncRNAs of protein coding genes or noncoding RNAs in the mam-malian genome such as Kcnq1/Kcnq1ot1, Igf2r/Air, Xist/Tsix[10].

(B) Long intronic RNAs- These classes of noncoding transcripts are pre-sent within the introns of either protein coding genes or noncoding RNAs. Mainly introns have been known to harbor small RNAs such as snoRNAs or microRNAs. However, recent large scale tran-scriptomic and computational analysis have reported that many in-tronic lncRNAs are present in mammalian genome[56, 57]. An ex-ample of such RNA is COLDAIR, an intronic transcript implicated in vernalization in plants. It is involved in epigenetic suppression of

19

FLOWERING LOCUS (FLC) through its interaction with PRC2 [58].

(C) Bidirectional promoter transcription- With high-throughput ap-proaches such as NGS, it has been realized that initiation of RNA polymerase (RNA Pol II) transcripts can be bidirectional. Cryptic unstable transcripts (CUTs) are products of such a bidirectional tran-scription reported in unicellular eukaryotes[59]. Two studies using NGS or tiling arrays, have observed that CUTs are well defined tran-scription units and are rapidly subjected to degradation by exosomes. In addition, divergent transcription from known promoters can give rise to stable unannotated transcripts (SUTs). Similar bidirectional transcription has also been observed in higher eukaryotes[60, 61]. Exosome depletion and subsequent tiling array analysis in human fi-broblasts has revealed a new class of short, poly-adenylated and un-stable RNAs; promoter upstream transcripts or PROMPTs[62].

(D) Enhancer RNAs- Enhancers are regulatory DNA elements which can activate gene expression independent of their proximity or orien-tation to the target gene. NGS studies have revealed that several en-hancers express lncRNAs, resulting into a new class of lncRNAs; enhancer RNAs or eRNAs. For example, based on ChIP profiles of p300/CBP binding, H3K4me1 and RNA PolII occupancy, Kim et al identified 2000 enhancers which encode enhancer RNAs[63]. A sim-ilar approach was employed in human macrophages to identify sev-eral transcribed enhancers[64].

Immunoprecipitation based approaches

LncRNAs like Xist, Kncq1ot1 and many others have been reported to associ-ate with epigenetic regulators like EZH2 to regulate gene expression. Im-munoprecipitation followed by microarray or sequencing has been used to identify such chromatin modifiying enzymes associated RNAs. Using such an approach Zhao et al identified > 9000 PRC2 interacting RNAs[65]. An-other modified method called as CLIP (cross-linking and immunoprecipita-tion) uses UV crosslinking followed by RNase digestion to retain only the protein–interacting RNA region, which is then immunoprecipitated and puri-fied fragments are subjected to sequencing. CLIP has been used to identify intronic lncRNAs directly associated with PRC2 complex[66]. Many of the lncRNAs such as Kcnq1ot1, Xist, and HOTAIR use chromatin as a substrate to execute their biological functions. Thus, it can be postulated that there could be many other lncRNAs regulating gene expression through their chromatin association property. Based on this presumption, around 200 chromatin associated RNAs (CARs) were identified upon sequencing of RNAs from the chromatin fraction of human fibroblast cells. Functional analysis of one of the intergenic CARs, intergenic 10 showed that, it epige-

20

netically regulates the expression of neighboring genes[67]. With the num-ber of identified lncRNAs constantly increasing, it becomes necessary to explore their plausible biological functions.

LncRNAs as regulators of developmental processes A number of lncRNAs have been reported in developmental processes of multicellular organisms. In the following section, I will discuss some of these developmental processes which have served as model system for the understanding of lncRNAs mediated gene regulation.

Genomic Imprinting and ncRNA Genomic imprinting is an epigenetic process that causes mono-allelic ex-pression of genes in a parent of origin-specific manner. Such genes are re-ferred as ‘imprinted ’. Some of these genes are expressed only from the pa-ternal allele, and the others are expressed from the maternal allele. Genomic imprinting was observed by researchers in plants, insects and mammals around 50 years ago. Historically, the term ‘imprinting’ was first coined by Helen Crouse in 1960s during her studies in Sciara coprophila[68]. Based on her observation that there is a selective loss of paternally inherited, one or two X chromosomes in females and males respectively, She mentions in her article that— “the ‘imprint’ a chromosome bears is unrelated to the genic constitution of the chromosome and is determined only by the sex of the germ line through which the chromosome has been inherited.” Since then, subsequent genetic studies performed in mouse over the years have established genomic imprinting as a paradigm for epigenetic research. Currently, there are 150 imprinted genes reported in mouse and humans. The characteristic feature associated with imprinted genes is that they are mostly present in clusters. The presence of these genes in cluster indicates towards presence of coordinated regulation of imprinted genes in a given imprinted locus. Indeed, these imprinted clusters are under control of a cis- acting con-trol element known as the imprint control region (ICR). The ICR is methyl-ated on one of the parental alleles, whereas the other paternal allele remains unmethylated and hence the term differentially methylated region or DMR. This DNA methylation ‘imprint’ is established by denovo methyltransferase DNMT3A/3L during germ cell development at these ICRs and subsequently maintained by maintenance methyltransferase DNMT1 in diploid cells. The methylation imprint at ICRs is erased in primordial germ cells in the devel-

21

oping embryo and a new methylation mark is re-established based on the sex of the organism.

A distinct feature associated with the majority of imprinted gene clusters is, the presence of one or more lncRNAs. In this respect, imprinted clusters provide us with an excellent model system for understanding the mechanistic aspects of lncRNA mediated gene regulation. However, mono-allelic silenc-ing occurs by different mechanisms amongst these clusters. ICR can acts as an insulator and control the access of imprinted genes to the regulatory ele-ments (H19/Igf2 locus) or, act as a promoter for a lncRNA which is involved in silencing of the imprinted genes (Kcnq1& Igf2r locus). We will discuss in detail about some well-studied imprinted clusters and how nicely they have contributed in our understanding of functional roles played by lncRNAs in regulating gene expression— H19/Igf2 locus: The H19/Igf2 locus resides on mouse chromosome 7, and the orthologous region is present on chromosome 11 in humans. Insulin growth factor 2 (Igf2) and H19 were the initial identified imprinted genes in mammals during early 90s. Igf2 is expressed from the paternal allele and is present 90kb upstream of the maternally expressed H19 gene. Both H19 and Igf2 share common tissue-specific enhancers located downstream of the H19 gene. The ICR regulating mono-allelic expression of both these genes is present upstream of the H19 gene. The ICR is maternally unmethylated, and contains multiple binding sites for the insulator protein CTCF. Binding of CTCF at the ICR blocks the Igf2 promoter from accessing the downstream enhancers and thus keeps Igf2 maternally repressed. This, in turn, allows the maternally expressed H19 to access the downstream enhancers. On the pa-ternal allele, the ICR is methylated and prevents the binding of CTCF. In the absence of insulator protein CTCF, Igf2 can access the downstream enhanc-ers and express from the paternally derived allele[69, 70]. Later studies, us-ing 3C approach have validated the presence of higher order chromatin in-teractions (looping) between the promoters and enhancers which are influ-enced by the presence of parent specific differential methylation and CTCF binding[71].

22

Figure2: Model depicting the insulator based regulation at the H19/Igf2 locus. The ICR is unmethylated on the maternal allele and allows binding of CTCF, preventing access of enhancer to Igf2. On the paternal allele, binding of CTCF is prevented due to methylation of ICR and therefore allows Igf2 to interact with downstream en-hancers.

Kcnq1 locus: The Kcnq1 imprinted locus is present on chromosome 7 in mouse and, the orthologous region is present on chromosome 11 in humans. It is constituted of 8-10 maternally expressed protein coding genes and an lncRNA, Kcnq1ot1. The ICR at this locus is located in the intron 10 of the Kcnq1 gene where it serves as promoter for the expression of lncRNA. Kcnq1ot1 RNA is 91Kb, nuclear localized, unspliced and expressed only from the paternal allele as its promoter is unmethylated, whereas it is re-pressed on the maternal allele due to methylation of its promoter[42, 72]. The imprinted genes at the Kcnq1 locus show varied gene repression in dif-ferent tissue types and developmental stages. Based on the extent of gene repression, the imprinted genes at this locus can be categorized into three distinct classes; ubiquitously imprinted genes (UIGs) which are repressed in both embryonic and extra-embryonic lineages, placental imprinted genes (PIGs) which show repression only in the placental tissue, and non-imprinted genes (NIGs) which escape the RNA mediated paternal silencing at the Kcnq1 locus.

The paternal expression of Kcnq1ot1 RNA and silencing of imprinted genes on the paternal allele indicate towards a role of Kcnq1ot1 in silencing of its neighboring genes in cis. Deletion, truncation and destabilization stud-ies on the Kcnq1ot1 lncRNA in mouse and cell based systems have shown that Kcnq1ot1 RNA is required for the allelic silencing of the protein coding genes in cis[42, 73, 74]. Furthermore, using an episomal system, a 890bp functional sequence, crucial for its silencing has been reported which under-scores the role of RNA in gene silencing[75].

The Kcnq1ot1 adopts distinct mechanisms to bring about silencing of the imprinted genes at the Kcnq1 locus[9]. Silencing of PIGs has been observed to be regulated primarily by histone modifications. Mouse lacking histone modifiers such as G9a, Ezh2 and Eed have an effect on the silencing of PIGs. Furthermore, the Kcnq1ot1 lncRNA has been demonstrated to interact

23

with chromatin modifying enzymes such as G9a and Ezh2 and recruit them to the gene promoters of the imprinted genes in cis in a lineage specific manner. Unlike PIGs, DNA methylation is reported to be involved in silenc-ing of UIGs. Moreover, the lncRNA Kcnq1ot1 has been observed to interact with maintenance methyltransferase DNMT1, and recruit it to somatic DMRs of UIGs, Cdkn1c and Slc22a18[37].

Taken together it has been suggested that, the lncRNA Kcnq1ot1 through its association with DNA and chromatin modifying enzymes is involved in regulating multilayered silencing mechanisms operational at the Kcnq1 im-printed locus.

Figure3: Schematic representation of the imprinted loci (A) Kcnq1 locus and (B) Igf2r locus. The Kcnq1 locus is constituted of paternally expressed RNA, Kcnq1ot1 which is involved in silencing of the neighboring genes. Genes colored in red are imprinted in all tissues(UIGs), blue in only placental tissue (PIGs) and green are non-imprinted genes(NIGs). (B) The Igf2r locus contains a paternally expressed lncRNA, Air. The genes colored in red represent the imprinted genes whereas genes colored in green represent non-imprinted genes.

The Igf2r locus- The Igf2r locus is present on chromosome 17 in mouse. An unspliced, polyadenylated, 108Kb lncRNA, Air is transcribed from this lo-cus. Similar to the Kcnq1 locus, the ICR is unmethylated on the paternal allele and serves as a promoter for the expression of Air lncRNA. The pa-ternal expression of Air lncRNA correlates with the silencing of its three neighboring genes Igf2r, Sl22a2 and Slc22a3.

Studies using deletion of the ICR or the paternally inherited truncated al-lele of the Air transcript have directly implicated the RNA or its transcription in gene repression over the Ig2fr locus[76, 77]. Furthermore, Air lncRNA has been shown to interact with H3K9 methyltransferase G9a and specifical-

24

ly target it to the promoter of Slc22a3 gene in a lineage specific manner[43]. Therefore, similar to Kcnq1ot1 RNA, Air too is involved in epigenetic si-lencing by targeting repressive histone machinery to specific gene promoters in a lineage specific manner.

ncRNAs and Dosage compensation Sex of an organism is determined by the composition of the sex chromo-some, such as XO or XY in males and XX in females. Distinct regulatory mechanisms are present, to balance the dosage of X-linked gene products in cells of either sex. This process of dosage compensation is well studied in flies (Drosophila) and mammals (mouse) and serves as an excellent example, where significant insights into regulatory roles of lncRNAs have been eluci-dated. We will discuss these roles in greater detail in the following section- ncRNAs in mammalian X -inactivation— In mammals, there are two X chromosomes (XX) in females as compared to one in males (XY). There-fore, X –chromosome inactivation (XCI) takes place in females to equalize the dosage of X- linked gene products between both the sexes. During fe-male development, XCI occurs in two forms: imprinted XCI, observed in pre-implantation embryo where paternally inherited X chromosome (Xp) is preferentially silenced, and random XCI, observed in post-implantation epi-blast lineage where either X chromosome is silenced randomly.

Studies on X chromosome have postulated the presence of a X inactiva-tion center (Xic), which is essential to trigger the XCI. This region is around 500Kb in size and harbors five lncRNAs known to regulate X inactivation. The most well-known lncRNA transcribed from the Xic is Xist. It is a 17kb, nuclear localized, spliced transcript and expressed exclusively from the inac-tive X chromosome (Xi)[78]. It coats the entire Xi chromosome in cis- from which it is expressed and its upregulation is necessary for the onset of XCI. The Xist lncRNA contains several repeats of which repeat A has been shown to be crucial for its silencing activity. Interestingly, a shorter RNA, ‘RepA’ (Repeat A RNA), around 1.6Kb is size is transcribed from the repeat A re-gion of Xist and is believed to interact and recruit the PRC2 complex to Xi [44]. The repressive epigenetic memory once initiated by Xist expression, can induce several changes on the Xi chromatin by recruitment of factors such as PcG proteins, Ash2l, hnRNP U (SAF-A), macroH2a and epigenetic modifications such as histoneH4 hypoacetylation and DNA methylation, which are believed to contribute in faithful maintenance of the repressed state of Xi[40, 79-81].

An intriguing question in relation to XCI is, whether Xist lncRNA is also required for maintenance of the repressed state of Xi during development or its role is limited only in establishment of XCI? Using mouse/human somatic cell hybrids, Brown et al, observed that, X-linked genes remained silent in

25

somatic hybrids lacking the Xic consisting of Xist RNA [82]. Later, Csankovszki et al noted that, conditionally deleting Xist after XCI in MEFs did not reactivate the X-linked genes[83]. These observations suggest that, Xist expression is dispensable for maintenance of XCI. However, the contin-ued expression and association of Xist RNA with Xi in somatic cells of fe-males during entire lifespan make us to revisit its role during maintenance. In fact, the lack of Xist causes a decrease in levels of histone macroH2A1.2, characteristic of Xi in conditionally mutant Xi cells. Reactivation of two genes; GFP (transgene) and Hprt (endogenous) due to destabilization of inactive state of X chromosome was also reported in conditional allele lack-ing Xist[84]. Furthermore, it resulted in an increase in the active chromatin marks; histone H4 acetylation and H3K4me3 throughout Xi[85, 86]. In a recent study in mouse hematopoietic stem cells, the conditional deletion of Xist at E10.5 (after establishment of XCI) resulted in upregulation of 86 X-linked genes. Collectively, it can be summarized that, Xist is indispensable for the establishment of XCI whereas it could be involved in maintaining specific features of the Xi chromatin.

The Xic harbors another nuclear localized lncRNA, Tsix, with antagonis-tic function to Xist. It is transcribed in the antisense orientation to Xist and its deletion in the mouse ES cells results in activation of Xist in cis. Therefore, in combination with Xist, Tsix is involved in regulating XCI by maintaining the repression of Xist on the active X chromosome. Another ncRNA, Xite is located upstream of the Tsix promoter and is believed to positively regulate Tsix[86].

Another important and unique aspect in XCI is positive and negative reg-ulation of the Xist by lncRNAs. For example, Tsix functions as a repressor of Xist on the Xa, another lncRNA upstream of the Xist, Jpx acts as a trans-activator of the Xist expression on the Xi. A recent investigation has demon-strated that Jpx regulates Xist activation by binding and titrating away the CTCF protein from the Xist promoter[87]. roX RNAs and dosage compensation in Drosophila melanogaster — Unlike mammals, dosage compensation in Drosophila is achieved by hyperactiva-tion of the single X chromosome in males. Although the ways to achieve dosage compensation are exactly opposite to the mammals, nevertheless lncRNAs act as a common theme regulating dosage compensation in mam-mals and Drosophila. Two lncRNAs; RNA on X1 and 2(roX1 and roX2) are known to be transcribed from the X chromosome. These two RNAs along with male specific proteins (male specific lethal proteins-MSL1, MSL2 and MSL3; maleless helicase-MLE; and histone acetyltransferase males absent on the first-MOF) form the dosage compensation complex (DCC) which is spread across the X chromosome. This coating of the DCC complex acety-lates the H4K16, resulting in an open chromatin and thus brings about an increased transcriptional output of the single X- chromosome in males[88].

26

How these two lncRNAs interact and function with the MSL complex to bring about the X chromosome activation has remained enigmatic? Two recent studies have shed light on the contributions of these two lncRNAs. Using CLIP or RNA affinity purification strategy, critical binding sites for MLE helicase on roX RNA have been reported. The MLE helicase once bound to the conserved motifs on roX RNAs leads to a conformational change in the interacting roX RNAs. This leads to an unmasking of binding sites for MSL2 subunit and thus triggers DCC assembly[89, 90]. Taken to-gether, roX RNAs present an interesting RNA based mechanism of gene regulation.

Other roles of lncRNAs in animal development In addition to classically studied functions of lncRNAs in genomic imprint-ing and dosage compensation, several examples of lncRNAs and their roles in body pattern formation and cellular differentiation have come into light in recent years. I would discuss briefly the regulatory roles of lncRNAs in these developmental processes. lncRNAs in Homeobox (Hox) gene regulation: Hox genes are a group of related genes organized in clusters and are crucial for early embryonic de-velopment in animals. There are 39 Hox genes in vertebrates, clustered into four chromosomal loci ( HoxA-HoxD) that are involved in regulating the anterio-posterior axis formation during embryogenesis. The differential ex-pression of Hox genes is under regulation of epigenetic modifiers such as Trithorax (Trx) and PcG group of proteins. In addition, several lncRNAs have been reported at these loci, some of which have been functionally im-plicated in repressing and activating target Hox genes. The first example is, HOTAIR, a 2.2Kb transcript which is expressed from the HoxC cluster is involved in gene repression of the HoxD locus in trans- in mammalian cells. It has been shown that, the HOTAIR lncRNA interacts with PRC2 complex at its 5’ end and with LSD1, a H3K4 demethylase at its 3’ end. Therefore, the HOTAIR lncRNA serves as a scaffold for assembly of both these com-plexes and this property allows the lncRNA to regulate the targeting of these complexes to silence target HoxD genes[45, 91].

Other lncRNAs at the Hox locus, HOTTIP and Mistral have been reported to activate transcription of Hox genes[92, 93]. HOTTIP is a 3.7 Kb transcript and expressed from the anatomical distal region ie 5’ tip of the HOXA locus. HOTTIP has been shown to activate the transcription of HoxA genes by as-sociating with the WDR5-MLL complex and targeting it to these genes. In mouse ES cells, another lncRNA, Mistral is involved in epigenetic activation of homeotic genes Hoxa6 and Hoxa7. This unspliced and polyadenylated 798 nt RNA interacts with WDR5-MLL1 complex resulting in a changed chromosomal conformation and hence activation of its target genes.

27

LncRNA in pluripotency maintenance and cellular differentiation: ES cells are pluripotent cells with unique ability to differentiate into all the lin-ages derived from any of the three primary germ layer cells (ectoderm, mes-oderm and endoderm). This pluripotent state of ES cells is tightly regulated and maintained by key transcription factors, such as Oct4, Sox2 and Nanog. In addition to these well-known transcription factors, several lncRNAs have been implicated in maintenance of pluripotency. Using a loss-of-function approach, Guttman et al have tested the role of 147 lncRNAs expressed in the mouse ES cells[94]. They found 26 lncRNA with roles in the mainte-nance of pluripotency, whereas expression of 30 lncRNAs was crucial for the repression of lineage specific programs associated with the differentia-tion. A significant number of these RNAs are under the regulation of ES cell specific transcription factors such as Oct4, Sox2 and Nanog, and many of them associate with diverse chromatin modifiers, potentially revealing their role in regulating gene expression programs by forming specific RNA-protein complexes.

In addition to maintenance of pluripotent state of ES cells, many lncRNAs have been reported to be crucial for differentiation into specific lineages. Ng etal have identified 4 lncRNAs crucial for neuronal differentia-tion, through their association with proteins such as SUZ12 and REST[95]. One of these RNAs, RMST has been reported to interact with SOX2 and this RMST-SOX2 complex regulates many downstream genes during neurogene-sis[96]. On a similar line, a 590nt lncRNA, Braveheart (Bvht) was found to be crucial for mouse ES cells differentiation towards cardiovascular lineage. Bvht epigenetically regulates cardiovascular lineage commitment through its association with PRC2 member, SUZ12[97]. All these mentioned lncRNAs having roles during lineage commitment act as molecular scaffold for pro-tein complexes, however other examples of lncRNAs employing different regulatory mechanisms during differentiation have also been characterized. A cytoplasmic lncRNA, linc-MD1, controls muscle differentiation in mouse and human cells by acting as a sponge for micro RNAs miR-133 and miR-135. Overexpression of linc-MD1 up regulates transcription factors, MAML1 and MEF2C, known to activate genes related to muscle differentia-tion. These transcription factors are targets of miR-133 and miR-135 and thus linc-MD1 acts as a decoy for these microRNAs and contributes in regulation of muscle differentiation[98]. Another lncRNA, TINCR (terminally differen-tiation-induced ncRNA) is essential for human epidermal differentiation. Analyzing TINCR RNA- and protein interactome revealed that TINCR inter-acts with mRNAs of genes involved in differentiation (through a 25 nucleo-tide ‘TINCR motif’) and staufen (STAU1) protein. It is proposed that, TINCR stabilizes expression of mRNAs essential for epidermal differentia-tion, through its association with STAU1[49].

28

With all these examples, it is evident that lncRNAs are an integral compo-nent of regulatory circuit during embryonic development and cellular differ-entiation.

LncRNAs in disease With several examples of lncRNAs functions discussed in earlier section, it is clear that these molecules play a significant role in mammalian develop-mental processes. In this section I would discuss the examples of lncRNAs which have been reported to be misregulated in disease conditions. Also, I would discuss how understanding these diseases in the context of lncRNA expression are contributing to our understanding of lncRNA mediated gene regulation.

ANRIL lncRNA

This lncRNA was first reported from genetic analysis of familial melanoma-neural system tumors with large germ line deletion (42Kb) encompassing INK4 locus[99]. This locus harbors three tumors suppressor genes; p15/CDKN2B/INK4b, p16/CDKN2A/INK4a, p14/ARF along with antisense lncRNA ANRIL (antisense RNA in INK4 locus). Altered expression of these genes has been observed in many tumors. Additionally, the regions encom-passing ANRIL gene contains disease associated SNPs[100, 101].

ANRIL lncRNA is transcribed in an antisense direction to INK4b. It has been reported that ANRIL interacts with chromobox 7 (CBX7) protein, a member of PRC1 complex, and controls the transcriptional repression of the INK4a tumor suppressor genes. Increased levels of both ANRIL and CBX7 have been reported in prostate cancer correlating with reduced levels of INK4a. Another study observed that, ANRIL interacts with SUZ12 and re-cruits this PRC2 member to repress INK4b expression. This suggests that ANRIL is involved in regulating epigenetic repression of INK4 locus through its association with both PRC1 and PRC2 members. These observations also indicate that, altered expression of ANRIL is involved in tumor initiation via silencing the three tumor suppressor genes. However the detailed role of ANRIL in INK4 locus regulation and mechanisms by which it initiates cancer progression need to be fully elucidated[46, 102].

HOTAIR lncRNA HOTAIR is a very well-known example of lncRNA involved in cancer pro-gression. Gupta et al first reported that increased expression of HOTAIR was associated with primary breast cancer and metastasis. Over expression of HOTAIR lncRNA resulted in an increased invasiveness and metastasis in

29

epithelial cancer cells. This increased expression of HOTAIR lncRNA leads to modified occupancy of PRC2 complex to many metastasis suppressor genes resulting in their aggressive phenotype. Conversely, depletion of HO-TAIR decreased the invasive potential of the MCF7 cell line. In addition to breast cancer, increased expression of HOTAIR has been reported in several other cancers such as adenocarcinoma, melanoma, squamous cell carcinoma, hepatocellular carcinoma etc. It is therefore believed that HOTAIR contrib-utes to cancer progression and metastasis by remodeling the chromatin land-scape through its interaction with the PRC2 complex[103].

MALAT1 lncRNA MALAT1 was initially reported as a prognostic parameter associated with patient survival and metastasis in stage I non-small cell lung cancers[104]. Since its identification several studies have reported elevated levels of MA-LAT1 in cancers of liver, breast, cervix and bladder and linked its differential expression to cancer recurrence and metastasis. In addition, overexpression experiments observed that increased levels of MALAT1 resulted in increased cell proliferation, migration and tumor formation in nude mice and converse-ly, depleted levels of MALAT1 resulted in impaired migration and reduced tumor formation in mouse xenografts[105]. The mechanisms behind these effects of MALAT1 and its role in cancer can be partly understood by its regulatory functions.

It is a highly expressed, conserved nuclear RNA and is localized to nucle-ar speckles, although it is not crucial for the integrity of these nuclear speck-les. MALAT1 has been shown to interact with pre-splicing factors (SR pro-teins) and regulate their level and distribution in nuclear speckles domains, known to be enriched for pre-mRNA splicing factors[106]. In extension to this observation, Tripathi et al have recently shown that MALAT1 promotes cell proliferation by modulating levels of cell-cycle regulated oncogenic transcription factors such as B-MYB[107]. It is therefore proposed that MA-LAT1 acts as molecular sponge and impairs the levels of pre-mRNA splicing factors leading to misregulation of cell cycle progression genes and thus contributes in tumor progression.

LncRNAs in prostate cancer PCA3 is one of the initially reported lncRNA observed to be highly ex-pressed in primary prostate cancers. Recently, FDA has approved the use of PCA3 for prostate cancer detection[100, 108]. After its identification many other lncRNAs in prostate cancer have been reported. A study using tran-scriptomic profiling approach has sequenced RNA from a cohort of 102 prostate tissue and cell lines, and identified 121 unannotated prostate cancer associated transcripts (PCATs). Functional characterization of one of these

30

lncRNAs, PCAT-1 revealed that this lncRNA is selectively unregulated in prostate cancer[109]. It is involved in repression of genes involved in mitosis and cell division such as BRCA2, a known tumor suppressor gene. Another recent study has identified another lncRNA, PCAT-114 or also named as second chromosome locus associated with prostate-1 (SchLAP1) as a strong predictor of clinical outcomes in aggressive prostate cancer. Over expression and depletion experiments suggests that, SchLAP1 contributes to cancer cells invasiveness and metastasis. It was observed that SchLAP1 antagonizes SWI/SNF complex and in turn impairs its genome wide occupancy and regu-latory functions. These observations indicate that SCHLAP1 promotes ag-gressive cancer in part by regulating downstream targets of SWI/SNF com-plex[110]. In addition, two lncRNAs PRNCR1 and PCGEM1 have been recently reported to be highly expressed in aggressive prostate cancer. Both these lncRNAs bind to androgen receptors (AR) and enhance AR mediated gene expression programs resulting in an increased proliferation and metas-tasis in prostate cancers[111].

LncRNAs in neuroblastoma Neuroblastoma is a disease of sympathetic nervous system affecting children below the age of 1.5 years. These tumors occur generally in adrenal medulla but can occur in other organs such as chest, abdomen and pelvis. Neuroblas-tomas are highly heterogeneous tumors with variable clinical outcomes and hence can be classified into various subtypes. However, in general these tumors can be divided into two broad categories- Low-risk tumors and high-risk tumors. Low-risk tumors are characterized by whole chromosomal changes, favorable prognosis and have the propensity to undergo regression. On the other hand high-risk tumors have characteristic chromosomal aberra-tions such as deletion of 11q and 1p arm, gain of 17q and amplification of MYCN oncogene, unfavorable prognosis and are highly aggressive. Many studies have reported mutations, epigenetic alterations and expression signa-tures for efficient stratification of these tumors[112].

Few lncRNAs have also been reported to be associated with neuroblasto-ma. Transcribed ultra-conserved regions or T-UCRs are lncRNAs, tran-scribed from 481 regions, which were found to be highly conserved between human, mouse and rat and have been found to be dysregulated in tumors such as hepatocellular carcinoma and leukemia[113]. In a study, expression of all 481 T-UCRs was investigated in primary neuroblastomas. 7 T-UCRs were found to be overexpressed in MYCN amplified tumors. In addition, three of these lncRNAs were found to be upregulated upon MYCN induction, pointing towards their MYCN dependent regulation[114]. Another lncRNA, ncRAN (noncoding RNA expressed in aggressive neuroblastomas) has been reported in the aggressive neuroblastoma associated 17q region[115]. Deple-tion of ncRAN in neuroblastoma cell lines inhibited cell growth. Also, over-

31

expression of ncRAN in bladder cancer cell line has been reported to result in increased cell proliferation and invasiveness. However, the mechanisms and exact roles of these T-UCRs and ncRAN in neuroblastoma progression are largely unknown. Therefore, identifying and functionally characterization lncRNA signatures in neuroblastoma would lead to a much improved under-standing of the disease.

Mechanisms of LncRNA mediated gene regulation In the current thesis, I have discussed so far various examples of lncRNA in different aspects of development and maintenance of cellular states. In addi-tion, I have provided examples of misregulated lncRNAs in diseases. In this section of the thesis, I will discuss various mechanisms adopted by lncRNA for regulating gene expression.

1. Recruitment of Chromatin modifying enzymes One of the most well-known mechanisms adopted by lncRNAs is through their ability to interact with chromatin modifying enzymes and recruit these epigenetic modifiers to target gene loci in cis or guide these modifiers to distant loci in trans. Cis regulatory RNAs such as Xist/RepA, Kcnq1ot1, and ANRIL have been observed to bind to repressive complex PRC2 to regulate gene expression in a localized manner. The recruitment of these complexes could be through putative protein binding domains/motifs in the RNA. A GC rich double stem loop structure (28 nt) in Rep A lncRNA directly interacts with EZH2[44]. Such structural motifs or functional domains in the RNA could serve as scaffolds for assembly of protein complexes and facilitate them to efficiently target the repressive gene machinery. For example, the Kcnq1ot1 RNA is reported to interact with both PRC2 & G9a as well as with the maintenance methyltrasferase, DNMT1 and regulate lineage specific gene repression via recruiting these complexes to specific gene promoters and somatic DMRs respectively[37, 42]. Germ line deletion of 890 bp si-lencing domain, present at its 5’ end significantly reduces the chromatin association and interaction of the Kcnq1ot1 RNA with DNMT1, resulting in loss of gene repression. The structural conformations/ binding motifs which recruit and the mechanisms through which lncRNAs stabilize the assembly of these protein complexes are yet to be fully characterized in mammals. In relation to this aspect of lncRNA mechanism, an interesting example of lncRNA recruitment and protein assembly has been recently reported in Drosophila, where binding of one protein factor to conserved motif in roX RNA leads to change in structural confirmation of the interacting RNA and triggers assembly of other proteins[89, 90].

32

In contrast to the local gene repression by cis regulating RNAs, the trans regulatory RNAs such as HOTAIR and linc-p21 might adopt additional mechanism to regulate distant gene repression. It is proposed that, they can target the repressive gene machinery in trans by binding to the specific genes as RNA: DNA duplexes or RNA: DNA: triplexes[116]. For example, recruitment of Dnmt3b by promoter RNA (pRNAs) generated from the in-tergenic spacer regions (IGS) of the rDNA repeats is proposed to be depend-ent on the DNA: RNA triplex between pRNA and rDNA promoter[41].

Figure 4: Schematic representing mechanisms of lncRNA mediated gene regulation. LncRNA can act as Scaffold for chromatin modifying enzymes(A), regulators of signaling pathways of tumor suppressor genes(B), sponges for microRNAs (C), decoys of transcription factors(D) and can regulate mRNA processing, translation and stability(E).

33

2. Regulation of tumor suppressor or oncogene signaling LncRNAs can play critical regulatory roles in disease states through regulat-ing expression of tumor suppressor genes or their downstream signaling. For example, gene repression of tumor suppressor genes, p15 and p16 is ob-served in many cancers and is associated with ANRIL mediated targeting of the repressive chromatin machinery to these target tumor suppressors [102, 117]. Another lncRNA, linc-p21 has emerged as a crucial p53 response gene. In mouse lung, sarcoma and lymphoma tumors, linc-p21 was observed to be induced upon activation of p53 signaling and transcriptionally repress p53 target genes through its association with heterogeneous ribonulceopro-tein K (hnRNP-K)[118]. Similarly, an lncRNA PANDA has been reported to be overexpressed in metastatic ductal carcinoma and is activated in a p53 dependent manner. It limits expression of pro-apoptotic genes by sequester-ing away the transcription factor NF-Y from promoters of these genes[119].

3. Sequestration of microRNA Several lncRNAs have micro-RNA binding sites and thus can serve as ‘sponges’ of these molecules and regulate their levels during cellular pro-cesses. An interesting ‘competing endogenous RNA (ceRNA)’ hypothesis has been proposed, where an intricate network of cross talk between coding and noncoding RNAs is present, and changes in levels of one microRNAs can lead to concomitant changes in expression of other member of the net-work[120]. For example, H19 lncRNA has binding sites for let-7 microRNA and regulates muscle differentiation by regulating levels of let-7[121]. Simi-larly, linc-MD1 lncRNA regulates levels of transcription factors, MAML1 and MEF2C involved in muscle differentiation by acting as a sponge for micro-RNAs, mir-133 and mir-135[98]. Along with lncRNAs, pseudogenes such as PTENP1 can also act as sponges for microRNAs and in turn regulate the expression of their target genes[122].

4. Regulators of mRNA processing, stability and translation LncRNAs can also exert their regulatory control through their roles at the level of mRNA processing, stability or translational control. As discussed earlier, lncRNA MALAT1 controls mRNA splicing by regulating the levels of pre-mRNA splicing (SR) factor. Similarly, a relatively new class of lncRNA, sno-lncRNAs is transcribed from the intronic region of 15q11-q13 (Prader-willy syndrome region) locus. This too can influence splicing events through its association with the splicing factor FOXC2 [123]. In addition, lncRNAs can influence stability of transcripts through their interaction with target mRNAs. It has been observed that Alu-repeat containing cytoplasmic lncRNAs can interact with Alu elements in the 3’ UTR of a mRNA and

34

these double stranded structures formed through RNA-RNA interaction can create binding sites for Staufen-mediated decay (SMD) and in turn activate the mRNA decay pathway[124]. Another important aspect of lncRNA medi-ated regulation is through translational control. Antisense lncRNAs like Uch1As have been observed to upregulate the translation of its sense mRNA, Uch1 through a repeat SINEB2 element. Several other sense–antisense lncRNAs with SINEB2 elements have been identified and translation control could be employed by such RNAs[125].

5. Decoys of transcription factors An interesting theme in lncRNA mediated gene regulation is through acting as ‘molecular decoys’ of transcription factors or chromatin modifiers or other protein factors. For example, PANDA lncRNA regulates the expres-sion of pro-apoptotic gene through titrating away transcription factor NF-Y [119]. Similarly, MALAT1 acts as a decoy for the SR proteins and regulates splicing.

35

Aims

The main objective of the current thesis was to understand the role of lncRNA in development and disease. Using Kcnq1 imprinted locus as a model system, we have investigated the role of Kcnq1ot1 RNA in establish-ment and maintenance of the epigenetic memory. Next, using genome wide approaches and pediatric tumors as disease model, we have identified regula-tory RNAs in normal and disease condition. The specific aims of the thesis are as follows-

Paper I 1. To functionally characterize the critical sequences within Kcnq1ot1 RNA required for epigenetic silencing of neighboring genes, in vivo.

Paper II 1. To study the role of Kcnq1ot1 RNA in the maintenance of transcriptional gene silencing at different stages of mouse embryonic development. 2. To investigate the plausible mechanism by which non-imprinted genes escape noncoding RNA mediated silencing.

Paper III 1. Identification and characterization of Chromatin associated RNAs (CARs) from human fibroblast cells.

Paper IV 1. Identification of differentially expressed lncRNAs between low- and high-risk neuroblastomas. 2. To functionally characterize the regulatory roles of these identified lncRNAs in different subtypes of neuroblastomas.

36

Results and Discussion

Paper I: Kcnq1ot1 noncoding RNA mediates transcriptional gene silencing by interacting with Dnmt1. Kcnq1ot1 is an lncRNA paternally expressed from the intron 10 of the Kcnq1 gene. Earlier studies, using cell based episomal system and mouse knockouts have demonstrated that Kcnq1ot1 is involved in bidirectional, epigenetic silencing at the Kcnq1 locus, in a lineage specific manner. Further attempts have been made to identify and characterize the crucial sequences in the Kcnq1ot1 RNA, necessary for its silencing activity. A cell based epi-somal system was used to search for functional sequences within the noncoding RNA. The episomal system comprises of two reporter genes Hy-gromycin (Hyg) and H19. In this construct the Hyg gene is located upstream to the promoter of Kcnq1ot1 whereas the H19 gene is present downstream of the Kcnq1ot1 transcription start site. The Kcnq1ot1 RNA upon transcription is involved in bidirectional silencing of the two reporter genes. During iden-tification of the functional sequences within the RNA, it was found that dele-tion of 890bp from the RNA resulted in the loss of bidirectional silencing of both the reporter genes within the episomal system. Further characterization of the 890bp sequence demonstrated that, it could silence the reporter genes in an orientation dependent and position independent manner.

In the current study, we wanted to investigate the role of 890bp present at the 5’ end of the Kcnq1ot1 on the silencing status of imprinted genes at the Kcnq1 locus, in an in vivo condition. Towards this end, we have generated a knockout mouse lacking the critical 890 base pairs through homologous recombination. Homozygous mice lacking 890bp (∆890 -/-) were normal and fertile. In addition, no effect on the stability, localization, expression and promoter methylation of Kcnq1ot1 RNA was observed in ∆890 -/-. In order to analyze the effect of 890 bp deletion on imprinted genes, we performed allele- specific imprinting assays on the RNAs collected from placenta (ex-tra-embryonic) and liver (embryonic) tissues of mouse embryos. We ob-served a loss of silencing of the ubiquitously imprinted genes (UIGs-Kcnq1, Slc22a18, Phlda2, Cdkn1c) in the paternally derived Kcnq1ot1 allele lacking 890bp. However, the UIGs showed variations in their loss of imprinting in a lineage specific manner. Kcnq1 showed a loss of imprinting in both placenta

37

and liver, Phlda2 and Cdkn1c showed loss of imprinting only in placental tissue whereas Slc22a18 showed loss of imprinting only in liver tissue. In addition, we did not observe any changes on the imprinting status of the placental imprinted genes (PIGs-Ascl2, Tssc4, Cd81 and Osbpl5). Taken together, our results suggest that 890bp of the Kcnq1ot1 RNA are essential for the proper imprinting of UIGs.

The UIGs are known to be regulated via DNA methylation. Mouse lack-ing the maintenance methyltranferase, DNMT1 shows a loss of silencing of UIGs. In order to investigate the reasons behind loss of silencing of UIGs in ∆890 mice, we analyzed the methylation status of somatic DMRs of Cdkn1c and Slc22a18 genes. We observed a loss of methylation at the DMRs of both the genes in placenta as well as liver tissues indicating that methylation could be important for the silencing at these UIGs. To further investigate, the role of RNA in regulating the methylation, we performed RNA immunopre-cipitation (RIP) with DNMT1 antibody. We found that RNA interacts with the maintenance methyltrasferase, DNMT1and this interaction was signifi-cantly reduced in mice lacking 890bp, explaining the loss of methylation observed at the somatic DMRs of Cdkn1c and Slc22a18 genes. We also found significant enrichment of DNMT1 at the somatic DMRs of Cdkn1c and Slc22a18, which was reduced in ∆890 mice.

Kcnq1ot1 is known to associate with chromatin and it mediates its tran-scriptional silencing by organizing a three dimensional, contracted repressive compartment. We analyzed the effect of deletion of 890bp on these proper-ties of the noncoding RNA using Chromatin RNA immunoprecipitation (ChRIP) and 3C. We observed a reduced chromatin association of the RNA and release in the contracted state of the repressive compartment in mice lacking 890bp. This effect on the higher order chromatin structure in ∆890 mice could explain the loss of silencing of Kcnq1 and Phlda2 gene which are not known to have any DMRs.

In summary, the current investigation elucidates that, 890bp of the Kcnq1ot1 RNA is required for RNA mediated transcriptional silencing of the UIGs, through its association with the DNA methylation machinery.

Paper II: Long noncoding RNA mediated maintenance of DNA methylation and transcriptional gene silencing. In Paper I we have investigated the role of RNA in establishment of epige-netic silencing. The Kcnq1ot1 lncRNA starts expressing from 2 cell stage onwards, however the gene silencing at the Kcnq1 locus occurs in a lineage and developmental stage specific manner. The imprinting of UIGs is estab-lished at the blastocyst stage (E3.5) whereas PIGs start showing imprinted expression from the post-implantation stages (E5.5). In addition, there are

38

non-imprinted genes (NIGs) in this locus which remain biallelically ex-pressed in all developmental stages. The Kcnq1ot1 RNA has been shown to associate with DNA and chromatin modifiers and establish multilayered silencing pathways at the Kcnq1 locus. However, the role of lncRNA in maintenance of epigenetic memory once established has remained mainly unknown. Another intriguing question is how certain genes escape transcrip-tional gene silencing by noncoding RNAs?

In the current investigation, using conditional knockout mouse, we have attempted to answer these questions. We have generated a transgenic mouse where the Kcnq1 ICR was flanked with two loxP sequences (ICR2lox). We have deleted the Kcnq1ot1 RNA at specific embryonic developmental stages by crossing the ICR2lox mice with two different mouse lines expressing, the cre recombinase enzyme. Firstly, we crossed the male ICR 2lox mice with Meox cre/+ females, that expressed cre recombinase only in the epiblast line-ages which give rise to embryonic tissue. Conditional deletion of the Kcnq1ot1 at E 5.5 resulted in a loss of silencing of UIGs in the embryonic lineages (liver, brain and heart) suggesting that RNA is required to maintain the imprinting of UIGs. In line with earlier observation, we observed re-duced weight in mice lacking Kcnq1ot1 RNA explained the increased ex-pression of Cdkn1c. Furthermore, we wanted to investigate whether RNA is also required to maintain silencing at later stages of development or there are developmental stage specific differences in the requirement of RNA in maintenance of epigenetic memory. Therefore, we crossed the male ICR 2lox

mice with CAG cre/+ female mice and fed the pregnant female with 5mg of tamoxifen at E8.5d to conditionally delete the RNA in both embryonic and extra-embryonic lineages. Conditionally deleting the RNA at later stages of development resulted in loss of UIGs, however the silencing of PIGs re-mained unchanged.

As ubiquitously imprinted genes such as Cdkn1c and Slc2218 are known to have somatic DMRs, we analyzed the status of methylation over these genes. A loss of methylation at these DMRs was observed correlating with the loss of silencing of these UIGs at two different developmental stages. We also observed a decreased enrichment of DNMT1 at these DMRs explaining the decreased methylation. Taken together, we observed that Kcnq1ot1 RNA is required to maintain gene silencing at the UIGs. Moreover, continuous presence of RNA over the two somatic DMRs is essential to stably maintain the DNA methylation across cell generations, contrary to the general percep-tion that DNA methylation is a stable mark. We also investigated the repres-sive histone modification at the Kcnq1 locus, upon deletion of RNA at E8.5. The repressive histone modifications remained unchanged indicating that silencing of the PIGs is maintained in a RNA independent manner.

Finally, we have investigated the plausible mechanisms by which NIGs escape the silencing mediated by Kcnq1ot1 ncRNA. The repressive modifi-cations H3K27me3 were found to be absent over the NIGs, Nap1l4 and

39

Cars. As the H3K27me3 modifications and the active histone modifications like H3K27act are known to be present in a mutually exclusive fashion, we profiled the enhancer specific marks H3K27act and H3K4me1 along the one mega base Kcnq1 domain using ChIP on chip. The non-imprinted genes were found to be enriched with enhancer specific modifications, H3K27act and H3K4me1. These results indicate that, enhancer specific active histone marks over the NIGs could be involved in evading RNA mediated silencing.

Paper III: Characterization of the RNA content of chromatin. With the advent of high throughput analysis it has been well accepted that majority of the genome is transcribed. This has indicated towards the pres-ence of large number of noncoding RNAs as compared to protein coding genes. There have been several evidences from studies which have implicat-ed noncoding RNAs in epigenetic regulation. For example, Kcnq1ot1 and Airn noncoding RNAs at their respective imprinted locus are involved in epigenetic silencing by recruiting repressive machinery to target genes by associating with the chromatin. Similarly Xist has been implicated in recruit-ing histone modifiers during X chromosome inactivation. These studies have pointed towards the presence of many more noncoding RNAs in the genome which might be epigenetically regulating diverse functions through their association with the chromatin.

In line with this assumption, we have identified chromatin associated RNAs (CARs) on a genome wide scale. Using sucrose gradient purified chromatin, we have isolated RNA from the soluble chromatin fractions in human fibroblast cells and sequenced it using Illumina platform. Analysis of the sequenced reads revealed that, majority of the reads (71%) mapped to intergenic and intronic regions whereas only 29% mapped to exons of pro-tein coding genes. By further analyzing the intergenic and intronic reads, we identified 141 CARs mapping to intronic regions and 74 CARs mapping to intergenic regions. We observed that 52% of the intronic CARs overlapped with the known totally intronic (TIN) and partially intronic (PIN) transcripts suggesting that CARs are independent transcripts. In addition, we observed that 15 of the intergenic CARs were part of identified lincRNAs of which 9 have been shown to associate with PRC2 members.

The identified CARs were found to be evolutionary conserved across pla-cental mammals. We validated the chromatin association of selected CARs using complementary technique, Chromatin RNA immunoprecipitation (ChRIP). One of the intergenic CARs, Intergenic 10 was functionally char-acterized. We show that, Intergenic 10 regulates the expression of its two

40

neighboring genes by modulating the active chromatin structure over their gene promoters.

PAPER IV: Transcriptomic profiling of primary neuroblastomas reveals a high-risk tumor associated long noncoding RNA NBAT1, with functional roles in cell proliferation and neuronal differentiation. LncRNAs are being reported in regulatory roles in many biological path-ways. Therefore, misregulation of lncRNA mediated functions could have an adverse effect on the cell and might lead to diseases such as cancer. Indeed, many examples of misregulated lncRNAs in diseases are being reported. Identifying such lncRNAs could serve as important information for better understanding of the disease conditions. In the current investigation, we have aimed to identify lncRNAs with probable roles in neuroblastoma tumors and characterize them in order to better understand lncRNA mediated gene regu-lation.

Neuroblastoma is a tumor of sympathetic nervous system affecting mostly children below the age group of 1.5 years. These tumors are highly hetero-geneous in nature and can be categorized into distinct clinical subtypes based on their clinical outcomes. However, broadly these tumors can be classified as either aggressive, high-risk tumors (characteristic chromosomal aberra-tions such as 1p, 11q- deletion or MYCN amplification) or regressive, low-risk tumors (only copy number chromosomal variations). With an aim to identify differentially expressed lncRNAs between low- and high-risk tu-mors, we have sequenced the transcriptomes of tumors representing, low-risk and two high-risk subtypes (11q- deletion and MYCN amplification). We obtained around 584, 756 and 70 differentially expressed genes (both protein coding genes and lncRNAs) between the low-risk vs. 11q-, low-risk vs. MNA (MYCN amplified) and low-risk vs. high-risk tumors respectively. Furthermore, we obtained 24 (out of 70 differentially expressed genes) novel lncRNAs as differentially expressed between low-and high-risk subtypes. We have validated the expression of some of these lncRNAs using qRT-PCR in an independent cohort of neuroblastoma tumors.

Many characteristic aberrations are associated with aggressive neuroblas-tomas such as amplification on chromosome 2, deletions on chromosome 1p arm and 11q arm. These regions are believed to have putative tumor sup-pressor genes or oncogenes which could contribute towards neuroblastoma aggressiveness. Therefore, we wanted to investigate if any of the differen-tially expressed lncRNA between these subtypes mapped to these character-istic nonrandom chromosomal locations. We have found lncRNAs mapping to MNA amplified region, 11q arm and 1p arm as differentially expressed

41

between the neuroblastoma subtypes. These RNAs could be involved in tumor suppressor or oncogenic functions in these neuroblastoma subtypes.

We observed an uncharacterized lncRNA, which we named as NBAT-1 (Neuroblastoma associated transcript-1), as one of the top ranked candidate to be differentially expressed between low-risk and high-risk neuroblastoma tumors. NBAT-1 has a relatively lower expression in high-risk tumors as compared to low-risk tumors. We have validated its expression in an inde-pendent cohort of 37 primary neuroblastoma tumors using qRT-PCR. Inter-estingly, an aggressive neuroblastoma associated SNP is known to reside in the intron of NBAT-1. Our analysis on SNP and its effect on NBAT-1 expres-sion revealed that risk allele status at the SNP correlated with lowered ex-pression of NBAT-1 in high-risk tumors. In order to further understand the role of NBAT-1 in neuroblastoma progression, we have depleted the RNA using sh-RNAs against NBAT-1 in SH-SY5Y cells. Knocking down of the RNA resulted in an increased proliferation and invasiveness in the SH-SY5Y cells. We have sequenced the RNA obtained from NBAT-1 depleted as well as control cells. We found an enrichment of genes involved in cell prolifera-tion, migration and neuronal differentiation. The change in expression of genes in cell proliferation and migration upon NBAT-1 knockdown indicates towards its tumor suppressor role and explains the observed phenotype of increased proliferation and invasiveness. As we also observed neuronal differentiation as another important biological process to be affected upon NBAT-1 knockdown, we investigated the role of NBAT-1 in neuronal differ-entiation in SH-SY5Y cells. We observed that NBAT-1 is significantly up-regulated during neuronal differentiation and depletion of the RNA resulted in perturbed differentiation process.

Taken together, in the current investigation, we have identified several lncRNAs which could serve as potential biomarkers in neuroblastoma prog-nosis. Moreover, we report an lncRNA, NBAT-1 as differentially expressed between low- and high-risk tumors. Our functional studies on NBAT-1 indi-cate that it could promote neuronal differentiation along with inhibiting in-creased cell proliferation and invasiveness in primary neuroblastomas.

42

Acknowledgement

The work presented here in this thesis was carried out at Department of Im-munology, genetics and pathology (IGP), Uppsala University, as well as in Department of Medical and Clinical genetics, Gothenburg University. Firstly, I would like to sincerely thank my supervisor, Prof Chandrasekhar Kanduri for all his guidance, support, and motivation during this work. I am also grateful to him for letting me work on some very interesting projects in his laboratory. I will always be thankful to him for all the discussions and his valuable suggestions, which have immensely helped me to improve as a student.

Thanks to my Co-supervisor, Prof. Claes Wadelius for organizing chroma-tin journal club. It was a great source of keeping up to date on recent and interesting scientific studies.

Thanks to Dr. Fredrik Öberg, Dr Lena Åslund and other members of the chromatin journal club for the interesting articles and discussions. Thanks to Prof. Helena Jerenberg Wiklund for suggestions whenever needed.

I would like to thank Dr. Fredrik Hedborg with his help and support with the tumor project.

I would like to thank, head of the department, Prof. Lena Claesson-Welsh for the constructive research environment at the IGP. Thanks to all the past and present members of administration for their help whenever needed. Special thanks to Christina for all her help with my applications and forms. I would also like to thank all the members of animal facility for taking good care of the mouse. I am thankful to my senior lab colleague, Pandey Ji for his professional help during my initial days in the lab. It was also a pleasure to listen to your cool ideas. Thanks to Pratima Bhabhi for the good food at your place. Many thanks to Faizaan bhai for teaching me fine details of mouse work. I am thankful to you for your valuable suggestion and tips which were of great

43

use in my experiments. You also have been an inspiration for cooking good food. Thanks to Fatima Bhabhi for all the nice parties at your home.

Thanks to Natalia, for teaching me ice-skating, for all the nice parties and organizing many wonderful trips. I would like to thank, Tanmoy (Dada) for always being a very nice and dear friend. Thanks to you for your moral support and valuable suggestions throughout my work. I have always admired your scientific intellect. Thanks to Matthieu for always being ready to help whenever required. It has been very nice to have you as a colleague in the lab and know your thoughts on various topics. Thanks for coming to my wedding. Thanks to Kankadeb (KK) for his unconditional help in and outside the lab. Thanks for being very colorful and jovial member of the lab. It has been really great to have you around. Thanks to Sanhita for her timely help in my experiments. I would like to sincerely thank Santhilal for your constant support with the data analysis and patiently answering all my questions. Thanks to Sirisha for her help whenever required. Thanks to the past members of the lab, Roshan (Bhaiya) and Abiarchana for their cooperation in my experiments. Thanks to Arif for his suggestions in the lab. I would like to thank Meena for all the good food and parties at your home.

I would like to thank all the members of the third floor at Rudbeck lab for nice chats and discussion during lunch and tea time. Thanks to Nicola, Mi, Lesley, Larry, Annika, Tobias, Akhtar, Amar, Demet, Jelena, Sara, Naren da, Sneganshu Dada, Soumi di, Geeta, and Lothar for making my stay at IGP wonderful. It has been very nice to know you all. Thanks to all the members of our neighboring lab at Uppsala. Thanks to Charlotte, Antonia, Pernilla, Inger for being very nice to us and for all your help whenever needed. Special thanks to Prasoon (Baba) for your eternal knowledge on all the subjects. Thanks for your cooperation with all the things at Uppsala. Thanks to members of the Wadelius lab- Mehdi, Ola and Madhu for the discussions during journal club.

44

I would like to thank, Prof. Per Elias and Prof. Claes Gustafsson for their support and cooperation during shifting to our new lab in Gothenburg. Thanks to Dr. Erik Larsson for all his ideas and discussions with the data analysis.

Thanks to Sashidhar for all his help with the mouse work. I want to thank Prof. Saumitra Das for providing me with the opportunity to work in his lab at IISc, Bangalore at the initial stages of my research carrier. Thanks to all the lab members at IISc, Sharath, Rammana garu, Nithya, and Bhupi singh for your nice company. Special thanks to Bhupi singh for your support and cooperation during my stay at IIsc. I want to spe-cially thank Deb dada (Janab) for always being there for me whenever needed and always suggesting me constructively. I want to thank all my MSc teachers at KUD Dharwad for their encourage-ment and belief in me. I want to thanks all my MSc batch mates and friends for the very nice and memorable days spent with them at Dharwad. Thanks to Dr. Narahari for his guidance after my MSc.

I want to thank the Indian community at Uppsala for making my stay at Uppsala, very memorable. The years spent with Kali bhai, Jeetu mama, Puneet ji and Dada at Doblensgatan have been one of the best and memora-ble times of my life. Thanks to all of you for the great time. I will always cherish these memories. Thanks to Kali and Sulena, Nimesh and Smitha, Sandeep and Navya for all the get together and social gatherings. I also want to specially thank, Umash ji and Noopur Ji for all the nice dinners and interesting chats at Rudbeck. It has been wonderful to known you all. I also want to thank, my Indian friends in Gothenburg, Pankaj and Prisha, Rakesh and Pooja. I want to thank all my family members for their support and encouragement. I want to specially thank my parents and my younger brother Saurabh, for their unconditional love, affection, encouragement and moral support without which nothing would have been possible.

I want to thank my wife, Nalinee for all her love, support and understanding during my research studies. And lastly, I want to thank my little daughter, Nushi for filling my life with lots of joy and happiness. You are the cutest baby in the world.

45

References

1. Taft, R.J., M. Pheasant, and J.S. Mattick, The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays, 2007. 29(3): p. 288-99.

2. Mattick, J.S., The central role of RNA in human development and cognition. FEBS Lett, 2011. 585(11): p. 1600-16.

3. Cheng, J., et al., Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science, 2005. 308(5725): p. 1149-54.

4. Carninci, P., et al., The transcriptional landscape of the mammalian genome. Science, 2005. 309(5740): p. 1559-63.

5. Mercer, T.R., M.E. Dinger, and J.S. Mattick, Long non-coding RNAs: insights into functions. Nat Rev Genet, 2009. 10(3): p. 155-9.

6. Ponting, C.P., P.L. Oliver, and W. Reik, Evolution and functions of long noncoding RNAs. Cell, 2009. 136(4): p. 629-41.

7. Sabin, L.R., M.J. Delas, and G.J. Hannon, Dogma derailed: the many influences of RNA on the genome. Mol Cell, 2013. 49(5): p. 783-94.

8. Whitehead, J., G.K. Pandey, and C. Kanduri, Regulation of the mammalian epigenome by long noncoding RNAs. Biochim Biophys Acta, 2009. 1790(9): p. 936-47.

9. Mohammad, F., T. Mondal, and C. Kanduri, Epigenetics of imprinted long noncoding RNAs. Epigenetics, 2009. 4(5): p. 277-86.

10. Lee, J.T. and M.S. Bartolomei, X-inactivation, imprinting, and long noncoding RNAs in health and disease. Cell, 2013. 152(6): p. 1308-23.

11. Rinn, J.L. and H.Y. Chang, Genome regulation by long noncoding RNAs. Annu Rev Biochem, 2012. 81: p. 145-66.

12. Bernstein, E. and C.D. Allis, RNA meets chromatin. Genes Dev, 2005. 19(14): p. 1635-55.

13. Amaral, P.P. and J.S. Mattick, Noncoding RNA in development. Mamm Genome, 2008. 19(7-8): p. 454-92.

14. Mattick, J.S., Challenging the dogma: the hidden layer of non-protein-coding RNAs in complex organisms. Bioessays, 2003. 25(10): p. 930-9.

15. Razin, A. and A.D. Riggs, DNA methylation and gene function. Science, 1980. 210(4470): p. 604-10.

16. Bourc'his, D., et al., Dnmt3L and the establishment of maternal genomic imprints. Science, 2001. 294(5551): p. 2536-9.

17. Suetake, I., et al., DNMT3L stimulates the DNA methylation activity of Dnmt3a and Dnmt3b through a direct interaction. J Biol Chem, 2004. 279(26): p. 27816-23.

18. Chen, Z.X., et al., Physical and functional interactions between the human DNMT3L protein and members of the de novo methyltransferase family. J Cell Biochem, 2005. 95(5): p. 902-17.

19. Sharif, J., et al., The SRA protein Np95 mediates epigenetic inheritance by recruiting Dnmt1 to methylated DNA. Nature, 2007. 450(7171): p. 908-12.

46

20. Rothbart, S.B., et al., Association of UHRF1 with methylated H3K9 directs the maintenance of DNA methylation. Nat Struct Mol Biol, 2012. 19(11): p. 1155-60.

21. Bird, A.P., DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res, 1980. 8(7): p. 1499-504.

22. Gruenbaum, Y., et al., Methylation of CpG sequences in eukaryotic DNA. FEBS Lett, 1981. 124(1): p. 67-71.

23. Arnaud, P., Genomic imprinting in germ cells: imprints are under control. Reproduction, 2010. 140(3): p. 411-23.

24. Guseva, N., T. Mondal, and C. Kanduri, Antisense noncoding RNA promoter regulates the timing of de novo methylation of an imprinting control region. Dev Biol, 2012. 361(2): p. 403-11.

25. Larsen, F., et al., CpG islands as gene markers in the human genome. Genomics, 1992. 13(4): p. 1095-107.

26. Saxonov, S., P. Berg, and D.L. Brutlag, A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc Natl Acad Sci U S A, 2006. 103(5): p. 1412-7.

27. Ooi, S.K., et al., DNMT3L connects unmethylated lysine 4 of histone H3 to de novo methylation of DNA. Nature, 2007. 448(7154): p. 714-7.

28. Thomson, J.P., et al., CpG islands influence chromatin structure via the CpG-binding protein Cfp1. Nature, 2010. 464(7291): p. 1082-6.

29. Zhou, V.W., A. Goren, and B.E. Bernstein, Charting histone modifications and the functional organization of mammalian genomes. Nat Rev Genet, 2011. 12(1): p. 7-18.

30. Cedar, H. and Y. Bergman, Linking DNA methylation and histone modification: patterns and paradigms. Nat Rev Genet, 2009. 10(5): p. 295-304.

31. Schuettengruber, B., et al., Genome regulation by polycomb and trithorax proteins. Cell, 2007. 128(4): p. 735-45.

32. Chan, C.S., L. Rastelli, and V. Pirrotta, A Polycomb response element in the Ubx gene that determines an epigenetically inherited state of repression. EMBO J, 1994. 13(11): p. 2553-64.

33. Hansen, K.H., et al., A model for transmission of the H3K27me3 epigenetic mark. Nat Cell Biol, 2008. 10(11): p. 1291-300.

34. Margueron, R., et al., Role of the polycomb protein EED in the propagation of repressive histone marks. Nature, 2009. 461(7265): p. 762-7.

35. Wassenegger, M., et al., RNA-directed de novo methylation of genomic sequences in plants. Cell, 1994. 76(3): p. 567-76.

36. Castel, S.E. and R.A. Martienssen, RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nat Rev Genet, 2013. 14(2): p. 100-12.

37. Mohammad, F., et al., Kcnq1ot1 noncoding RNA mediates transcriptional gene silencing by interacting with Dnmt1. Development, 2010. 137(15): p. 2493-9.

38. Mohammad, F., et al., Long noncoding RNA-mediated maintenance of DNA methylation and transcriptional gene silencing. Development, 2012. 139(15): p. 2792-803.

39. Sun, B.K., A.M. Deaton, and J.T. Lee, A transient heterochromatic state in Xist preempts X inactivation choice without RNA stabilization. Mol Cell, 2006. 21(5): p. 617-28.

47

40. Lee, J.T., Gracefully ageing at 50, X-chromosome inactivation becomes a paradigm for RNA and chromatin control. Nat Rev Mol Cell Biol, 2011. 12(12): p. 815-26.

41. Schmitz, K.M., et al., Interaction of noncoding RNA with the rDNA promoter mediates recruitment of DNMT3b and silencing of rRNA genes. Genes Dev, 2010. 24(20): p. 2264-9.

42. Pandey, R.R., et al., Kcnq1ot1 antisense noncoding RNA mediates lineage-specific transcriptional silencing through chromatin-level regulation. Mol Cell, 2008. 32(2): p. 232-46.

43. Nagano, T., et al., The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science, 2008. 322(5908): p. 1717-20.

44. Zhao, J., et al., Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science, 2008. 322(5902): p. 750-6.

45. Rinn, J.L., et al., Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell, 2007. 129(7): p. 1311-23.

46. Kotake, Y., et al., Long non-coding RNA ANRIL is required for the PRC2 recruitment to and silencing of p15(INK4B) tumor suppressor gene. Oncogene, 2011. 30(16): p. 1956-62.

47. Kapranov, P., et al., Large-scale transcriptional activity in chromosomes 21 and 22. Science, 2002. 296(5569): p. 916-9.

48. Rinn, J.L., et al., The transcriptional activity of human Chromosome 22. Genes Dev, 2003. 17(4): p. 529-40.

49. Kretz, M., et al., Control of somatic tissue differentiation by the long non-coding RNA TINCR. Nature, 2013. 493(7431): p. 231-5.

50. Dinger, M.E., et al., Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation. Genome Res, 2008. 18(9): p. 1433-45.

51. Mercer, T.R., et al., Long noncoding RNAs in neuronal-glial fate specification and oligodendrocyte lineage maturation. BMC Neurosci, 2010. 11: p. 14.

52. Guttman, M., et al., Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature, 2009. 458(7235): p. 223-7.

53. Khalil, A.M., et al., Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci U S A, 2009. 106(28): p. 11667-72.

54. Loewer, S., et al., Large intergenic non-coding RNA-RoR modulates reprogramming of human induced pluripotent stem cells. Nat Genet, 2010. 42(12): p. 1113-7.

55. Katayama, S., et al., Antisense transcription in the mammalian transcriptome. Science, 2005. 309(5740): p. 1564-6.

56. Louro, R., A.S. Smirnova, and S. Verjovski-Almeida, Long intronic noncoding RNA transcription: expression noise or expression choice? Genomics, 2009. 93(4): p. 291-8.

57. Rearick, D., et al., Critical association of ncRNA with introns. Nucleic Acids Res, 2011. 39(6): p. 2357-66.

58. Heo, J.B. and S. Sung, Vernalization-mediated epigenetic silencing by a long intronic noncoding RNA. Science, 2011. 331(6013): p. 76-9.

59. Wyers, F., et al., Cryptic pol II transcripts are degraded by a nuclear quality control pathway involving a new poly(A) polymerase. Cell, 2005. 121(5): p. 725-37.

48

60. Neil, H., et al., Widespread bidirectional promoters are the major source of cryptic transcripts in yeast. Nature, 2009. 457(7232): p. 1038-42.

61. Xu, Z., et al., Bidirectional promoters generate pervasive transcription in yeast. Nature, 2009. 457(7232): p. 1033-7.

62. Preker, P., et al., RNA exosome depletion reveals transcription upstream of active human promoters. Science, 2008. 322(5909): p. 1851-4.

63. Kim, T.K., et al., Widespread transcription at neuronal activity-regulated enhancers. Nature, 2010. 465(7295): p. 182-7.

64. De Santa, F., et al., A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biol, 2010. 8(5): p. e1000384.

65. Zhao, J., et al., Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol Cell, 2010. 40(6): p. 939-53.

66. Guil, S., et al., Intronic RNAs mediate EZH2 regulation of epigenetic targets. Nat Struct Mol Biol, 2012. 19(7): p. 664-70.

67. Mondal, T., et al., Characterization of the RNA content of chromatin. Genome Res, 2010. 20(7): p. 899-907.

68. Crouse, H.V., The nature of the influence of x-translocations on sex of progeny in Sciara coprophila. Chromosoma, 1960. 11: p. 146-66.

69. Bell, A.C. and G. Felsenfeld, Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature, 2000. 405(6785): p. 482-5.

70. Kanduri, C., et al., Functional association of CTCF with the insulator upstream of the H19 gene is parent of origin-specific and methylation-sensitive. Curr Biol, 2000. 10(14): p. 853-6.

71. Kurukuti, S., et al., CTCF binding at the H19 imprinting control region mediates maternally inherited higher-order chromatin conformation to restrict enhancer access to Igf2. Proc Natl Acad Sci U S A, 2006. 103(28): p. 10684-9.

72. Kanduri, C., Kcnq1ot1: a chromatin regulatory RNA. Semin Cell Dev Biol, 2011. 22(4): p. 343-50.

73. Mancini-Dinardo, D., et al., Elongation of the Kcnq1ot1 transcript is required for genomic imprinting of neighboring genes. Genes Dev, 2006. 20(10): p. 1268-82.

74. Shin, J.Y., G.V. Fitzpatrick, and M.J. Higgins, Two distinct mechanisms of silencing by the KvDMR1 imprinting control region. EMBO J, 2008. 27(1): p. 168-78.

75. Mohammad, F., et al., Kcnq1ot1/Lit1 noncoding RNA mediates transcriptional silencing by targeting to the perinucleolar region. Mol Cell Biol, 2008. 28(11): p. 3713-28.

76. Wutz, A., et al., Imprinted expression of the Igf2r gene depends on an intronic CpG island. Nature, 1997. 389(6652): p. 745-9.

77. Sleutels, F., R. Zwart, and D.P. Barlow, The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature, 2002. 415(6873): p. 810-3.

78. Jeon, Y., K. Sarma, and J.T. Lee, New and Xisting regulatory mechanisms of X chromosome inactivation. Curr Opin Genet Dev, 2012. 22(2): p. 62-71.

79. Pullirsch, D., et al., The Trithorax group protein Ash2l and Saf-A are recruited to the inactive X chromosome at the onset of stable X inactivation. Development, 2010. 137(6): p. 935-43.

80. Hasegawa, Y., et al., The matrix protein hnRNP U is required for chromosomal localization of Xist RNA. Dev Cell, 2010. 19(3): p. 469-76.

49

81. Augui, S., E.P. Nora, and E. Heard, Regulation of X-chromosome inactivation by the X-inactivation centre. Nat Rev Genet, 2011. 12(6): p. 429-42.

82. Brown, C.J. and H.F. Willard, The human X-inactivation centre is not required for maintenance of X-chromosome inactivation. Nature, 1994. 368(6467): p. 154-6.

83. Csankovszki, G., et al., Conditional deletion of Xist disrupts histone macroH2A localization but not maintenance of X inactivation. Nat Genet, 1999. 22(4): p. 323-4.

84. Csankovszki, G., A. Nagy, and R. Jaenisch, Synergism of Xist RNA, DNA methylation, and histone hypoacetylation in maintaining X chromosome inactivation. J Cell Biol, 2001. 153(4): p. 773-84.

85. Diaz-Perez, S.V., et al., A deletion at the mouse Xist gene exposes trans-effects that alter the heterochromatin of the inactive X chromosome and the replication time and DNA stability of both X chromosomes. Genetics, 2006. 174(3): p. 1115-33.

86. Yildirim, E., et al., Xist RNA is a potent suppressor of hematologic cancer in mice. Cell, 2013. 152(4): p. 727-42.

87. Sun, S., et al., Jpx RNA activates Xist by evicting CTCF. Cell, 2013. 153(7): p. 1537-51.

88. Conrad, T. and A. Akhtar, Dosage compensation in Drosophila melanogaster: epigenetic fine-tuning of chromosome-wide transcription. Nat Rev Genet, 2011. 13(2): p. 123-34.

89. Ilik, I.A., et al., Tandem stem-loops in roX RNAs act together to mediate X chromosome dosage compensation in Drosophila. Mol Cell, 2013. 51(2): p. 156-73.

90. Maenner, S., et al., ATP-dependent roX RNA remodeling by the helicase maleless enables specific association of MSL proteins. Mol Cell, 2013. 51(2): p. 174-84.

91. Tsai, M.C., et al., Long noncoding RNA as modular scaffold of histone modification complexes. Science, 2010. 329(5992): p. 689-93.

92. Wang, K.C., et al., A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature, 2011. 472(7341): p. 120-4.

93. Bertani, S., et al., The noncoding RNA Mistral activates Hoxa6 and Hoxa7 expression and stem cell differentiation by recruiting MLL1 to chromatin. Mol Cell, 2011. 43(6): p. 1040-6.

94. Guttman, M., et al., lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature, 2011. 477(7364): p. 295-300.

95. Ng, S.Y., R. Johnson, and L.W. Stanton, Human long non-coding RNAs promote pluripotency and neuronal differentiation by association with chromatin modifiers and transcription factors. EMBO J, 2012. 31(3): p. 522-33.

96. Ng, S.Y., et al., The long noncoding RNA RMST interacts with SOX2 to regulate neurogenesis. Mol Cell, 2013. 51(3): p. 349-59.

97. Klattenhoff, C.A., et al., Braveheart, a long noncoding RNA required for cardiovascular lineage commitment. Cell, 2013. 152(3): p. 570-83.

98. Cesana, M., et al., A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell, 2011. 147(2): p. 358-69.

99. Pasmant, E., et al., Characterization of a germ-line deletion, including the entire INK4/ARF locus, in a melanoma-neural system tumor family: identification of ANRIL, an antisense noncoding RNA whose expression coclusters with ARF. Cancer Res, 2007. 67(8): p. 3963-9.

50

100. Prensner, J.R. and A.M. Chinnaiyan, The emergence of lncRNAs in cancer biology. Cancer Discov, 2011. 1(5): p. 391-407.

101. Pasmant, E., et al., ANRIL, a long, noncoding RNA, is an unexpected major hotspot in GWAS. FASEB J, 2011. 25(2): p. 444-8.

102. Yap, K.L., et al., Molecular interplay of the noncoding RNA ANRIL and methylated histone H3 lysine 27 by polycomb CBX7 in transcriptional silencing of INK4a. Mol Cell, 2010. 38(5): p. 662-74.

103. Gupta, R.A., et al., Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature, 2010. 464(7291): p. 1071-6.

104. Ji, P., et al., MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene, 2003. 22(39): p. 8031-41.

105. Li, L., et al., Role of human noncoding RNAs in the control of tumorigenesis. Proc Natl Acad Sci U S A, 2009. 106(31): p. 12956-61.

106. Tripathi, V., et al., The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol Cell, 2010. 39(6): p. 925-38.

107. Tripathi, V., et al., Long noncoding RNA MALAT1 controls cell cycle progression by regulating the expression of oncogenic transcription factor B-MYB. PLoS Genet, 2013. 9(3): p. e1003368.

108. Bussemakers, M.J., et al., DD3: a new prostate-specific gene, highly overexpressed in prostate cancer. Cancer Res, 1999. 59(23): p. 5975-9.

109. Prensner, J.R., et al., Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol, 2011. 29(8): p. 742-9.

110. Prensner, J.R., et al., The long noncoding RNA SChLAP1 promotes aggressive prostate cancer and antagonizes the SWI/SNF complex. Nat Genet, 2013.

111. Yang, L., et al., lncRNA-dependent mechanisms of androgen-receptor-regulated gene activation programs. Nature, 2013. 500(7464): p. 598-602.

112. Maris, J.M., Recent advances in neuroblastoma. N Engl J Med, 2010. 362(23): p. 2202-11.

113. Scaruffi, P., The transcribed-ultraconserved regions: a novel class of long noncoding RNAs involved in cancer susceptibility. ScientificWorldJournal, 2011. 11: p. 340-52.

114. Mestdagh, P., et al., An integrative genomics screen uncovers ncRNA T-UCR functions in neuroblastoma tumours. Oncogene, 2010. 29(24): p. 3583-92.

115. Yu, M., et al., High expression of ncRAN, a novel non-coding RNA mapped to chromosome 17q25.1, is associated with poor prognosis in neuroblastoma. Int J Oncol, 2009. 34(4): p. 931-8.

116. Wang, K.C. and H.Y. Chang, Molecular mechanisms of long noncoding RNAs. Mol Cell, 2011. 43(6): p. 904-14.

117. Yu, W., et al., Epigenetic silencing of tumour suppressor gene p15 by its antisense RNA. Nature, 2008. 451(7175): p. 202-6.

118. Huarte, M., et al., A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell, 2010. 142(3): p. 409-19.

119. Hung, T., et al., Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters. Nat Genet, 2011. 43(7): p. 621-9.

120. Salmena, L., et al., A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell, 2011. 146(3): p. 353-8.

51

121. Kallen, A.N., et al., The Imprinted H19 LncRNA Antagonizes Let-7 MicroRNAs. Mol Cell, 2013. 52(1): p. 101-12.

122. Poliseno, L., et al., A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature, 2010. 465(7301): p. 1033-8.

123. Yin, Q.F., et al., Long noncoding RNAs with snoRNA ends. Mol Cell, 2012. 48(2): p. 219-30.

124. Gong, C. and L.E. Maquat, lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 3' UTRs via Alu elements. Nature, 2011. 470(7333): p. 284-8.

125. Carrieri, C., et al., Long non-coding antisense RNA controls Uchl1 translation through an embedded SINEB2 repeat. Nature, 2012. 491(7424): p. 454-7.

Acta Universitatis UpsaliensisDigital Comprehensive Summaries of Uppsala Dissertationsfrom the Faculty of Medicine 940

Editor: The Dean of the Faculty of Medicine

A doctoral dissertation from the Faculty of Medicine, UppsalaUniversity, is usually a summary of a number of papers. A fewcopies of the complete dissertation are kept at major Swedishresearch libraries, while the summary alone is distributedinternationally through the series Digital ComprehensiveSummaries of Uppsala Dissertations from the Faculty ofMedicine.

Distribution: publications.uu.seurn:nbn:se:uu:diva-209596

ACTAUNIVERSITATIS

UPSALIENSISUPPSALA

2013