Statistics for Microarrays


Upload: upton

Post on 16-Mar-2016


Acknowledgements: http://www.accessexcellence.org/AB/GG and http://www.oup.co.uk/best.textbooks/biochemistry/genesvii


Page 1: Statistics for Microarrays

Biological background: Molecular Biology

Class web site: http://statwww.epfl.ch/davison/teaching/Microarrays/

Statistics for Microarrays

Page 3: Statistics for Microarrays

Two types of organisms*

* Every biological ‘rule’ has exceptions!

Page 4: Statistics for Microarrays

Timeline of Genetics Highlights

Page 6: Statistics for Microarrays

Human Chromosomes

Page 7: Statistics for Microarrays

Human Chromosome Banding Patterns

Page 8: Statistics for Microarrays

Chromosomes and DNA

Page 9: Statistics for Microarrays

Cell Division -- Mitosis

Page 10: Statistics for Microarrays

Cell Division -- Meiosis

Page 11: Statistics for Microarrays

Crossing over and Recombination

Page 12: Statistics for Microarrays

Mitosis and Meiosis Compared

Page 13: Statistics for Microarrays

(BREAK)

Page 14: Statistics for Microarrays

Nature (1953), 171:737

“We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.”

DNA Structure Discovery

Page 15: Statistics for Microarrays

DNA

• A deoxyribonucleic acid or DNA molecule is a double-stranded linear polymer built from four types of molecular subunits called nucleotides

• Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogen bases: adenine (A), guanine (G), cytosine (C), or thymine (T)

• The two strands are held together by weak hydrogen bonds between complementary bases

• Base-pairing occurs according to the rule: G pairs with C, and A pairs with T
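The pairing rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the example sequence are hypothetical, and the reverse-complement step assumes the standard fact that the two strands of a duplex run in opposite (antiparallel) directions.

```python
# Watson-Crick pairing from the slide: G pairs with C, A pairs with T.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'->3' direction
    (assumes antiparallel strands, which the slide does not state)."""
    return complement(strand)[::-1]

example = "GATTACA"  # hypothetical sequence, for illustration only
print(complement(example))          # CTAATGT
print(reverse_complement(example))  # TGTAATC
```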

Page 16: Statistics for Microarrays

DNA A-type (140D) (low water content)

DNA B-type (7BNA) (Watson-Crick form)

DNA Z-type (2ZNA) (high salt concentration)

Polymorphic DNA Tertiary Structures

Page 17: Statistics for Microarrays

Genes are linearly arranged along chromosomes

Page 18: Statistics for Microarrays

DNA Structure (overview)

Page 19: Statistics for Microarrays

A nucleotide is a phosphate, a sugar, and a purine (A, G) or a pyrimidine (T, C) base.

The monomeric units of nucleic acids are called nucleotides.

DNA Structure

Page 20: Statistics for Microarrays

Purines: Adenine (A), Guanine (G)

Pyrimidines: Cytosine (C), Thymine (T) (DNA), Uracil (U) (RNA)

Nucleotide Bases

Page 21: Statistics for Microarrays

Nucleotide codes

A  Adenine                 W  Weak (A or T)
G  Guanine                 S  Strong (G or C)
C  Cytosine                M  Amino (A or C)
T  Thymine                 K  Keto (G or T)
U  Uracil                  B  Not A (G or C or T)
R  Purine (A or G)         H  Not G (A or C or T)
Y  Pyrimidine (C or T)     D  Not C (A or G or T)
N  Any nucleotide          V  Not T (A or G or C)
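One way to use the ambiguity codes listed above is to expand a coded pattern into every concrete sequence it stands for. The sketch below is illustrative only: the `expand` helper and the example patterns are hypothetical, and `N` is taken to mean any of the four DNA bases.

```python
from itertools import product

# Each code maps to the set of bases it can stand for (from the slide).
IUPAC = {
    "A": "A", "G": "G", "C": "C", "T": "T", "U": "U",
    "W": "AT", "S": "GC", "M": "AC", "K": "GT",
    "R": "AG", "Y": "CT",
    "B": "GCT", "H": "ACT", "D": "AGT", "V": "AGC",
    "N": "ACGT",  # assumption: DNA alphabet
}

def expand(pattern: str) -> list[str]:
    """List every concrete sequence matched by an ambiguity-coded pattern."""
    choices = [IUPAC[code] for code in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(expand("GAY"))       # ['GAC', 'GAT']
print(len(expand("NNN")))  # 64
```

The number of expansions grows multiplicatively with each ambiguous position, so this is only practical for short patterns.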

Page 22: Statistics for Microarrays

Base Pairing

Page 23: Statistics for Microarrays

Proteins

• Proteins: macromolecules composed of one or more chains of amino acids

• Amino acids: class of 20 different organic compounds containing a basic amino group (-NH2) and an acidic carboxyl group (-COOH)

• The order of amino acids is determined by the base sequence of nucleotides in the gene coding for the protein

• Proteins function as enzymes, antibodies, structures, etc.

Page 24: Statistics for Microarrays

Amino acid codes

Ala  A  Alanine
Arg  R  Arginine
Asn  N  Asparagine
Asp  D  Aspartic acid
Cys  C  Cysteine
Gln  Q  Glutamine
Glu  E  Glutamic acid
Gly  G  Glycine
His  H  Histidine
Ile  I  Isoleucine
Leu  L  Leucine
Lys  K  Lysine
Met  M  Methionine
Phe  F  Phenylalanine
Pro  P  Proline
Ser  S  Serine
Thr  T  Threonine
Trp  W  Tryptophan
Tyr  Y  Tyrosine
Val  V  Valine
Asx  B  Asn or Asp
Glx  Z  Gln or Glu
Sec  U  Selenocysteine
Unk  X  Unknown
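The three-letter and one-letter codes above line up position by position, so a lookup table can be built directly from them. A small sketch follows; the hyphen-separated input format and the `to_one_letter` name are assumptions made for illustration.

```python
# Three-letter and one-letter codes in the same order as on the slide.
THREE = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met Phe Pro "
         "Ser Thr Trp Tyr Val Asx Glx Sec Unk").split()
ONE = "ARNDCQEGHILKMFPSTWYVBZUX"
THREE_TO_ONE = dict(zip(THREE, ONE))

def to_one_letter(residues: str) -> str:
    """Convert a hyphen-separated three-letter sequence such as
    'Met-Lys-Trp' (hypothetical format) to one-letter codes."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues.split("-"))

print(to_one_letter("Met-Lys-Trp"))  # MKW
```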

Page 25: Statistics for Microarrays

Primary Protein Structure

Page 26: Statistics for Microarrays

Multiple Levels of Protein Structure

(Protein folding)

Page 27: Statistics for Microarrays

Tertiary Structure of Sperm whale myoglobin (1MBN)

Page 28: Statistics for Microarrays
Page 29: Statistics for Microarrays

(RT)

Page 30: Statistics for Microarrays

Nature (1953), 171:737

“It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

DNA Replication

Page 31: Statistics for Microarrays
Page 32: Statistics for Microarrays

DNA Replication

• The DNA strand that is copied to form a new strand is called a template

• In the replication of a double-stranded or duplex DNA molecule, both original (parental) DNA strands are copied

• When copying is finished, the two new duplexes, each consisting of one of the original strands plus its copy, separate from each other (semiconservative replication)
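Semiconservative replication can be mimicked with a toy model in which each parental strand serves as the template for one new strand. The code below is a sketch under simplifying assumptions: strand direction is ignored, the duplex is a plain pair of aligned strings, and all names and the example duplex are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def copy_strand(template: str) -> str:
    """Synthesize the strand complementary to a template, base by base."""
    return "".join(PAIR[base] for base in template)

def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Return two daughter duplexes, each holding one parental strand.

    Both strands are written left to right in aligned order, so strand
    direction is deliberately ignored in this toy model.
    """
    top, bottom = duplex
    daughter_1 = (top, copy_strand(top))        # parental top + new copy
    daughter_2 = (copy_strand(bottom), bottom)  # new copy + parental bottom
    return [daughter_1, daughter_2]

parent = ("ATGC", "TACG")  # hypothetical duplex for illustration
for daughter in replicate(parent):
    print(daughter)  # each daughter carries the same sequence as the parent
```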

Page 33: Statistics for Microarrays

Semiconservative Replication

Page 34: Statistics for Microarrays

DNA Replication, ctd

• DNA synthesis occurs in the chemical direction 5’→3’

• Nucleic acid chains are assembled from 5’ triphosphates of deoxyribonucleosides (the triphosphates supply energy)

• DNA polymerases are enzymes that copy (replicate) DNA

• DNA polymerases require a short preexisting DNA strand (primer) to begin chain growth. With a primer base-paired to the template strand, a DNA polymerase adds nucleotides to the free hydroxyl group at the 3’ end of the primer.

• DNA replication requires assembly of many proteins (at least 30) at a growing replication fork: helicases to unwind, primases to prime, ligases to ligate (join), topoisomerases to remove supercoils, RNA polymerase, etc.
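The primer requirement and the direction of chain growth can be illustrated with a short sketch. It is a simplified model, not a description of any real polymerase: the template is written 3'→5' so that it aligns left to right with the growing strand, and the function name and sequences are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template, adding bases only at the 3' end.

    template_3to5 is written 3'->5' so it aligns left to right with the
    new strand, which is written 5'->3'. The primer must pair with the
    start of the template.
    """
    for p, t in zip(primer_5to3, template_3to5):
        if PAIR[t] != p:
            raise ValueError("primer is not base-paired to the template")
    new_strand = primer_5to3
    for t in template_3to5[len(primer_5to3):]:
        new_strand += PAIR[t]  # each addition goes on the free 3' end
    return new_strand

print(extend_primer("TACGGATC", "ATG"))  # ATGCCTAG
```

Without a correctly paired primer the function refuses to start, mirroring the bullet on primers above.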

Page 35: Statistics for Microarrays
Page 36: Statistics for Microarrays

DNA Replication Fork

Page 37: Statistics for Microarrays

DNA is unwinding

DNA Synthesis

Page 38: Statistics for Microarrays
Page 39: Statistics for Microarrays
Page 40: Statistics for Microarrays

RNA

• RNA, or ribonucleic acid, is similar to DNA, but

-- RNA is single-stranded

-- the sugar is ribose rather than deoxyribose

-- uracil (U) is used instead of thymine

• RNA is important for protein synthesis and other cell activities

• There are several classes of RNA molecules, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and other small RNAs
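At the sequence level, the only difference a program sees between DNA and RNA is the substitution of U for T. A minimal sketch, with hypothetical function names and an illustrative sequence:

```python
def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")

def rna_to_dna(rna: str) -> str:
    """Rewrite an RNA sequence in the DNA alphabet (U becomes T)."""
    return rna.upper().replace("U", "T")

print(dna_to_rna("ATGGCCTAA"))  # AUGGCCUAA
```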

Page 41: Statistics for Microarrays

The Genetic Code

• DNA: sequence of four different nucleotides

• Protein: sequence of twenty different amino acids

• The correspondence between the four-letter DNA alphabet and the twenty-letter protein alphabet is specified by the genetic code, which relates nucleotide triplets, or codons, to amino acids
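The mapping from codons to amino acids is easy to express as a lookup table. The sketch below builds the standard code from a compact 64-letter string (codons ordered with bases U, C, A, G, third position varying fastest) and translates a hypothetical mRNA. The helper name and the convention of stopping at the first stop codon are assumptions for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for UUU, UUC, UUA, UUG, UCU, ... ; '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(mrna: str) -> str:
    """Translate an mRNA read from its first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = STANDARD_CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGGCCUGGUAA"))  # MAW
```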

Page 42: Statistics for Microarrays

Standard Genetic Code

Page 43: Statistics for Microarrays

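Variation of genetic codes

(- means the same as the standard code, T1)

Codon  T1    T2    T3    T4    T5    T6    T9    T10   T12   T13   T14   T15
CUU    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUC    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUA    Leu   -     Thr   -     -     -     -     -     -     -     -     -
CUG    Leu   -     Thr   -     -     -     -     -     Ser   -     -     -
AUU    Ile   -     -     -     -     -     -     -     -     -     -     -
AUC    Ile   -     -     -     -     -     -     -     -     -     -     -
AUA    Ile   Met   Met   -     Met   -     -     -     -     Met   -     -
AUG    Met   -     -     -     -     -     -     -     -     -     -     -
UAU    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAC    Tyr   -     -     -     -     -     -     -     -     -     -     -
UAA    Stop  -     -     -     -     Gln   -     -     -     -     Tyr   -
UAG    Stop  -     -     -     -     Gln   -     -     -     -     -     Gln
AAU    Asn   -     -     -     -     -     -     -     -     -     -     -
AAC    Asn   -     -     -     -     -     -     -     -     -     -     -
AAA    Lys   -     -     -     -     -     Asn   -     -     -     Asn   -
AAG    Lys   -     -     -     -     -     -     -     -     -     -     -
UGU    Cys   -     -     -     -     -     -     -     -     -     -     -
UGC    Cys   -     -     -     -     -     -     -     -     -     -     -
UGA    Stop  Trp   Trp   Trp   Trp   -     Trp   Cys   -     Trp   Trp   -
UGG    Trp   -     -     -     -     -     -     -     -     -     -     -
AGU    Ser   -     -     -     -     -     -     -     -     -     -     -
AGC    Ser   -     -     -     -     -     -     -     -     -     -     -
AGA    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -
AGG    Arg   Stop  -     -     Ser   -     Ser   -     -     Gly   Ser   -

T1: standard
T2: vertebrate mitochondrial
T3: yeast mitochondrial
T4: other mitochondrial
T5: invertebrate mitochondrial
T6: ciliate etc. nuclear
T9: echinoderm mitochondrial
T10: euplotid nuclear
T12: alternative yeast nuclear
T13: ascidian mitochondrial
T14: flatworm mitochondrial
T15: Blepharisma nuclear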
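Because each variant differs from the standard code in only a few codons, a variant can be stored as a small set of overrides. The sketch below does this for three of the tables above (T2, T3, T6), using only the differences shown on the slide; the real tables may differ in codons the slide does not list. The names `VARIANTS`, `genetic_code` and `translate`, and the example mRNA, are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Differences from the standard code, read off the table above ('*' = stop).
VARIANTS = {
    "T2": {"AUA": "M", "UGA": "W", "AGA": "*", "AGG": "*"},  # vertebrate mt
    "T3": {"CUU": "T", "CUC": "T", "CUA": "T", "CUG": "T",
           "AUA": "M", "UGA": "W"},                          # yeast mt
    "T6": {"UAA": "Q", "UAG": "Q"},                          # ciliate nuclear
}

def genetic_code(table: str = "T1") -> dict[str, str]:
    """Return the codon table for T1 or one of the sketched variants."""
    code = dict(STANDARD)
    if table != "T1":
        code.update(VARIANTS[table])  # KeyError for tables not sketched here
    return code

def translate(mrna: str, table: str = "T1") -> str:
    """Translate from the first base, stopping at the first stop codon."""
    code = genetic_code(table)
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = "AUGAUAUGAUGG"  # hypothetical sequence
print(translate(mrna, "T1"))  # MI
print(translate(mrna, "T2"))  # MMWW
```

The same message yields different proteins under the two tables because UGA is a stop codon in the standard code but encodes Trp in vertebrate mitochondria.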

Page 44: Statistics for Microarrays

Protein Synthesis