Gene Regulation Sonja J. Prohaska Computational EvoDevo, University of Leipzig Santa Fe Institute, Santa Fe, NM, U.S. WS 2017/18 Sonja J. Prohaska Gene Regulation

Gene Regulation

Sonja J. Prohaska

Computational EvoDevo, University of Leipzig

Santa Fe Institute, Santa Fe, NM, U.S.

WS 2017/18

Sonja J. Prohaska Gene Regulation

How can we find out that a gene is expressed?

... by measuring the (amount of) RNA transcript.

check if transcripts are present

- expression of a single gene – sequence is known

→ (Q-)PCR (polymerase chain reaction)

- expression of many genes – sequences are known

→ DNA microarrays (also called DNA chips)

- expression of all genes – even unknown ones

→ RNA-seq (transcriptome sequencing)

check if RNA polymerase is “at the gene”

- ChIP-seq (chromatin immunoprecipitation with an antibody against RNA polymerase)

check if RNA polymerase is transcribing the gene just now!

- GRO-seq (global run-on sequencing)

check for the presence of transcription factors at the gene and its regulatory regions

- ChIP-seq (chromatin immunoprecipitation with antibodies against a transcription factor and subsequent sequencing)


How can we find the target transcripts?

... make it glow, so you can see it.

RNA-FISH (imaging method)

- RNA fluorescence in situ hybridization

- synthesize a short sequence that will hybridize uniquely to the target transcript

- label these sequences with a fluorescent dye

- let these so-called “probes” hybridize to the target RNA

- watch for the dye under a fluorescence microscope
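
Whether a candidate probe would hybridize uniquely can be checked on the computer before it is synthesized. The following is only a minimal sketch under simplifying assumptions: sequences are written in the DNA alphabet, hybridization means perfect complementarity, and all known transcripts fit in a list. The function names, the window length `k` and the example sequences are made up for illustration; real probes are considerably longer than 6 nt.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def unique_probes(target, others, k=6):
    """Return a probe for every k-long window of target that occurs exactly
    once in target and in none of the other transcripts."""
    probes = []
    for i in range(len(target) - k + 1):
        window = target[i:i + k]
        # first occurrence is at i, and there is no later (overlapping) occurrence
        once_in_target = target.find(window) == i and target.find(window, i + 1) == -1
        absent_elsewhere = all(window not in other for other in others)
        if once_in_target and absent_elsewhere:
            probes.append(reverse_complement(window))
    return probes

target = "GGGAAACACACA"      # made-up target transcript
others = ["TTTGGGCCCAAA"]    # made-up background transcript
print(unique_probes(target, others))
```

The probe is the reverse complement of the chosen window, because that is the strand that pairs with the transcript.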


How can we find the target transcripts?

... amplify it, so it becomes the only one you see.

PCR (amplification method)

- extract the RNA from the sample

- synthesize a primer pair that hybridizes uniquely to the sequence of the target transcript

- do PCR, i.e. amplify the sequence fragment (from primer to primer)

- use gel electrophoresis to separate the resulting DNA by size

- make the DNA visible (ethidium bromide, UV light)

- the large amount of product and the observed length of the fragment show you that the transcript of interest is present

[Figure: PCR workflow with the labels gene expression, RNA extraction, reverse transcription (RNA to DNA, giving cDNA) and primer design. It shows the DNA region of the gene of interest as a double strand, 5’−GGGAAA … CACACA−3’ over 3’−CCCTTT … GTGTGT−5’, together with the primers 5’−GGGAAA and GTGTGT−5’ and a “?” label.]
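
The "from primer to primer" idea can be mimicked in a few lines. This is a rough sketch, assuming perfect primer matches on a single known template strand; the function names and the template sequence are invented for illustration, and only the primer sequences GGGAAA and TGTGTG (GTGTGT−5’ read 5’ to 3’) come from the figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(template, forward, reverse):
    """Return the fragment spanned by the two primers, or None if a primer
    finds no perfect binding site. All sequences are given 5' to 3'."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer binds the opposite strand; on the template strand
    # its binding site reads as the reverse complement of the primer.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]

template = "TTGGGAAACCCGGGTTTCACACAGG"   # made-up cDNA sequence
product = amplicon(template, "GGGAAA", "TGTGTG")
print(product, len(product))
```

The length of the returned fragment corresponds to the band position expected on the gel.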


What if we want to look at many genes at a time?

... extend the ideas from above!

(blackboard)


What if we want to look at many genes at a time?

... extend the ideas thinking computationally!

How would you solve the following problem?

- somebody rolled a die a thousand times in total

- it was a 20-sided, loaded die

- he wrote down the numbers one by one in an unordered list

How would you represent his results in a data structure?

- you could use an array

- with the dimensions 1 × 20 (i = {1, ..., 20})

- then you would look at every number x in the person’s list exactly once

- you would check the identity of the number x with i by

- checking for equality of x and i (i.e. "x == i")

- AND if so, you would add ’1’ to the value at position i of the array

- final step: read out the values per position of the array
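
The counting procedure above could look as follows. This is a sketch only: the way the die is loaded (face 20 five times as likely as any other face) and the fixed random seed are assumptions made to have some data to count.

```python
import random

def tally(rolls, sides=20):
    """Count how often each face 1..sides occurs in an unordered list."""
    counts = [0] * sides                  # the 1 x 20 array, position i-1 holds face i
    for x in rolls:                       # look at every number exactly once
        for i in range(1, sides + 1):
            if x == i:                    # identity check "x == i"
                counts[i - 1] += 1        # add 1 at position i
    return counts

# Hypothetical loaded die: face 20 is five times as likely as the others.
rng = random.Random(0)
weights = [1] * 19 + [5]
rolls = rng.choices(range(1, 21), weights=weights, k=1000)

counts = tally(rolls)
for face, n in enumerate(counts, start=1):    # final step: read out the array
    print(face, n)
```

The inner loop compares x with every i on purpose, because it mirrors the slide. On a computer one would simply write `counts[x - 1] += 1`.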


What if we want to look at many genes at a time?

... extend the ideas thinking biochemically!

Back to our original problem!

- a thousand die rolls = the total amount of RNA

- 20 sides of the die = total number of genes n

- loadedness = genes are expressed with different frequencies

- unordered list = soup of RNA in the cell

How would you represent the results in a data structure?

- you could use a microarray

- with the dimensions 1 × n (i = {1, ..., n})

- then you would look at every single RNA x

- you would check the identity of the RNA x with i by

- checking the ability of x to hybridize to i

- AND if so, it would stay attached to one of the thousands of copies of probe i in the small well i of the microarray

- final step: read out the amount of hybridized RNAs per well of the array
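
The same loop as for the die, with equality replaced by hybridization, gives a toy microarray. This is a deliberately crude sketch: hybridization is assumed to mean perfect complementarity, sequences are written in the DNA alphabet, and the three "genes", their copy numbers and the 6-nt probes are all invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hybridizes(transcript, probe):
    """Toy rule: the probe pairs perfectly somewhere along the transcript."""
    return reverse_complement(probe) in transcript

def read_out(transcripts, probes):
    """Count, per well i, how many transcripts stick to probe i."""
    signal = [0] * len(probes)            # the 1 x n array
    for x in transcripts:                 # every single RNA x
        for i, probe in enumerate(probes):
            if hybridizes(x, probe):      # replaces the test "x == i"
                signal[i] += 1
    return signal

genes = ["GGGAAACACACA", "TTTGGGCCCAAA", "ACGTTGCAACGT"]   # made-up transcripts
probes = [reverse_complement(g[:6]) for g in genes]        # one probe per gene
soup = [genes[0]] * 5 + [genes[1]] * 2 + [genes[2]] * 1    # unordered RNA soup

print(read_out(soup, probes))
```

On a real chip all comparisons happen in parallel in the hybridization solution, so there is no nested loop to wait for.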


What if we need to look at many genes at a time?

Congratulations! You just designed a microarray!


What if we need to look at many genes at a time?

Major problems and solutions

- long sequences will break, fold, entangle, ...

→ work with shorter fragments on the chip

- hybridization depends on the nucleotide composition

meaning: probes on the chip hybridize with different efficiency, so probes are not comparable with respect to the amount of transcripts bound

→ measure relative amounts (changes)

- cross-hybridization

meaning: parts of the sample transcripts can hybridize to many probes on the chip

→ use multiple, unique probes per gene and average afterwards
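
Averaging over several probes per gene is a one-liner once the probe signals are grouped by gene. The sketch below assumes such a grouping already exists; the gene names and intensity values are invented, and a plain arithmetic mean is used because the slide does not say which average is meant.

```python
from statistics import mean

def average_per_gene(probe_signals):
    """Collapse the signals of all probes of a gene into one value per gene."""
    return {gene: mean(values) for gene, values in probe_signals.items()}

# Hypothetical intensities, several unique probes per gene
probe_signals = {
    "geneA": [100.0, 120.0, 110.0],
    "geneB": [40.0, 60.0],
}
gene_signal = average_per_gene(probe_signals)
print(gene_signal)
```

A single cross-hybridizing probe then shifts the gene's value less than it would if it were the only probe.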


What if we need to look at many genes at a time?

... extend the ideas even further!

green light + red light = yellow light
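
The color rule is additive mixing of light, which is easy to reproduce with RGB values. This is an illustrative sketch only: the function name, the intensity scale from 0 to `e_max` and the 8-bit RGB encoding are assumptions, not part of the lecture.

```python
def spot_color(e_red, e_green, e_max=1.0):
    """Map the red and green emission of one spot to an (R, G, B) triple."""
    r = round(255 * min(e_red / e_max, 1.0))     # clamp at full brightness
    g = round(255 * min(e_green / e_max, 1.0))
    return (r, g, 0)                             # no blue channel involved

print(spot_color(1.0, 0.0))   # red light only
print(spot_color(0.0, 1.0))   # green light only
print(spot_color(1.0, 1.0))   # both: red + green mix to yellow
```

A spot that emits equally in both channels therefore appears yellow.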


How do we analyze a microarray experiment?

- imaging: read out the emission after fluorophore excitation with laser light for both colors (red (= sample), green (= control)) separately

- normalization (average per probe)

- pair control and sample

- calculate log ratios: log(E_red / E_green)

- determine significantly over- and underexpressed genes

Color code: red = sample, green = control; a red spot indicates overexpression, a green spot underexpression in the sample.
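
The log ratio step can be sketched as below. Two assumptions are made. The slide does not name the base of the logarithm, so base 2 is used here, where +1 means twice as much and -1 half as much transcript in the sample. The fixed cutoff `threshold` is a hypothetical stand-in for a proper significance test, which would need replicate measurements. The intensity values are invented.

```python
import math

def log_ratio(e_red, e_green):
    """log2 of sample (red) over control (green) emission."""
    return math.log2(e_red / e_green)

def classify(e_red, e_green, threshold=1.0):
    """Label a gene by a simple cutoff on the log ratio (not a statistical test)."""
    m = log_ratio(e_red, e_green)
    if m >= threshold:
        return "overexpressed"      # spot looks red
    if m <= -threshold:
        return "underexpressed"     # spot looks green
    return "unchanged"              # spot looks yellow

# Hypothetical (E_red, E_green) pairs per gene
spots = {"geneA": (400.0, 100.0), "geneB": (100.0, 400.0), "geneC": (210.0, 200.0)}
for gene, (red, green) in spots.items():
    print(gene, round(log_ratio(red, green), 2), classify(red, green))
```

Taking the logarithm makes over- and underexpression symmetric around zero: a fourfold increase gives +2 and a fourfold decrease gives -2.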


What if we don’t know our gene sequences?

... you could sequence all transcripts.

RNA-seq (RNA/transcriptome sequencing)


What if we don’t know our gene sequences?

- Step 1: Library preparation


What if we don’t know our gene sequences?

... you could sequence all transcripts.

RNA-seq (RNA/transcriptome sequencing)

- Step 2: Sequencing

- use a high-throughput sequencing technology, e.g. Illumina

- it will sequence 50 nt (or 100 nt) from one end of your fragments

- what to do with those short “reads”?

- Step 3: Read mapping

– map the reads back to the reference genome, or

– assemble de novo
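
Mapping reads back to a reference can be illustrated with exact string matching. This is a naive sketch and not how production read mappers work: it assumes error-free reads from the forward strand only and a reference short enough to search directly. The function name and all sequences are made up, and the reads are much shorter than the 50 nt mentioned above.

```python
def map_reads(reads, reference):
    """Return, for each read, the list of 0-based positions where it
    matches the reference exactly (an empty list means unmapped)."""
    mapped = []
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:                            # collect every occurrence
            hits.append(start)
            start = reference.find(read, start + 1)
        mapped.append((read, hits))
    return mapped

reference = "ACGTACGTTTGACC"        # made-up reference genome
reads = ["ACGT", "TTGA", "GGGG"]    # made-up short reads
for read, hits in map_reads(reads, reference):
    print(read, hits)
```

A read with several hits cannot be placed unambiguously, and a read with no hit may contain a sequencing error or come from a region missing in the reference.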

