
Post on 26-Dec-2015


Genomes

Chapter 21

Genomes
• Sequencing of DNA
• Human Genome Project
• 1990-2003
• 6 countries
• 20 research centers

Genome
• J. Craig Venter in 1992
• Whole-genome shotgun approach
• Sequences random DNA fragments directly

Fig. 21-3-3 (whole-genome shotgun approach)
1. Cut the DNA into overlapping fragments short enough for sequencing.
2. Clone the fragments in plasmid or phage vectors.
3. Sequence each fragment.
4. Order the sequences into one overall sequence with computer software.
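The last step, ordering overlapping fragments with software, can be illustrated with a toy greedy merge. This is only a sketch under simplifying assumptions (error-free reads, one strand, no repeats); the function names `overlap` and `greedy_assemble` and the example reads are hypothetical, and real assemblers use far more robust methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to merge; remaining pieces stay separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return max(frags, key=len)  # longest assembled piece


# Hypothetical reads that overlap by four bases each
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))  # ATGGCGTACCTGA
```

Greedy merging can go wrong when a genome contains repeated sequences, which is one reason real assembly software is much more involved than this sketch.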

Genomes
• Complete genome sequences
• Human, chimpanzee
• E. coli, brewer’s yeast
• Nematode, fruit fly, house mouse

Genomes
• Genomics:
• Study of whole sets of genes & their interactions
• Bioinformatics:
• Application of computers
• Storage & analysis of biological data

Genomes
• Metagenomics
• DNA from entire groups of species in an environmental sample is sequenced
• Example: the human “microbiome”

Figure 21.1

Genomes
• Comparison
• Evolutionary history of genes
• Taxonomic groups

Genome
• Phenotype to genotype
• Red-eyed fruit flies (w+w or w+w+)
• Computer analysis of genome
• Identifies sequences likely to encode proteins
• Genotype to phenotype
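One simple way software can flag sequences likely to encode proteins is to scan for open reading frames: a start codon (ATG) followed by in-frame codons up to a stop codon. The sketch below is an assumption-laden illustration, not the method any particular annotation tool uses; it checks only the forward strand, and `find_orfs` and `min_codons` are hypothetical names.

```python
import re

# ATG, then as few whole codons as possible, then the first in-frame stop codon
ORF_PATTERN = re.compile(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)")


def find_orfs(dna: str, min_codons: int = 3) -> list[str]:
    """Return forward-strand ORFs (start codon through stop codon).

    Every ATG that reaches an in-frame stop is reported, so ORFs
    nested inside a longer ORF appear as separate entries.
    """
    dna = dna.upper()
    orfs = []
    for start in range(len(dna)):
        m = ORF_PATTERN.match(dna, start)  # match must begin exactly at `start`
        if m and len(m.group()) // 3 >= min_codons:
            orfs.append(m.group())
    return orfs


# Hypothetical sequence containing one short ORF
print(find_orfs("ccATGAAATTTTAGgg"))  # ['ATGAAATTTTAG']
```

Real gene prediction also has to handle the reverse strand, introns, and statistical signals such as codon usage, none of which this sketch attempts.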

Genomes

Genome
• NCBI
• GenBank
• BLAST
• Compare DNA sequences
• Compare predicted protein sequences
• Domains (known amino acid sequences)
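Comparing sequences can be illustrated by counting the short "words" two sequences share. This is not BLAST itself, only a rough sketch of the idea that similar sequences share many short exact matches; the names `kmers` and `shared_kmer_fraction` and the word length `k=4` are assumptions made for the example.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_fraction(a: str, b: str, k: int = 4) -> float:
    """Fraction of distinct k-mers shared by a and b (Jaccard index)."""
    ka, kb = kmers(a.lower(), k), kmers(b.lower(), k)
    if not ka or not kb:
        return 0.0  # a sequence shorter than k has no k-mers
    return len(ka & kb) / len(ka | kb)


# Hypothetical sequences: identical, one base changed, and unrelated
query = "tatggagagaataaaagaac"
print(shared_kmer_fraction(query, query))
print(shared_kmer_fraction(query, "tatggagacaataaaagaac"))
print(shared_kmer_fraction(query, "cccccccccccccccccccc"))
```

A score near 1 suggests closely related sequences and a score near 0 suggests unrelated ones; a real search tool goes on to align the matching regions and assess statistical significance.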

Fig. 21-4

  1 tatggagaga ataaaagaac tgagagatct aatgtcgcag tcccgcactc gcgagatact
 61 cactaagacc actgtggacc atatggccat aatcaaaaag tacacatcag gaaggcaaga
121 gaagaacccc gcactcagaa tgaagtggat gatggcaatg agatacccaa ttacagcaga
181 caagagaata atggacatga ttccagagag gaatgaacaa gggcaaaccc tctggagcaa
241 aacaaacgat gctggatcag accgagtgat ggtatcacct ctggccgtaa catggtggaa
301 taggaatggc ccaacaacaa gtacagttca ttaccctaag gtatataaaa cttatttcga
361 aaaggtcgaa aggttgaaat atggtacctt cggccctgtc cacttcagaa atcaagttaa
421 aataaggagg agagttgata caaaccctgg ccatgcagat ctcagtgcca aggaggcaca
481 ggatgtgatt atggaagttg ttttcccaaa tgaagtgggg gcaagaatac tgacatcaga
541 gtcacagctg gcaataacaa aagagaagaa agaagagctc caggattgta aaattgctcc
601 cttgatggtg gcgtacatgc tagaaagaga attggtccgt aaaacaaggt ttctcccagt
661 agccggcgga acaggcagtg tttatattga agtgttgcac ttaacccaag ggacgtgctg
721 ggagcagatg tacactccag gaggagaagt gagaaatgat gatgttgacc aaagtttgat
781 tatcgctgct agaaacatag taagaagagc agcagtgtca gcagacccat tagcatctct
841 cttggaaatg tgccacagca cacagattgg aggagtaagg atggtggaca tccttagaca
901 gaatccaact gaggaacaag ccgtagacat atgcaaggca gcaatagggt tgaggattag
961 ctcatctttc agttttggtg ggttcacttt caaaaggaca agcggatcat cagtcaagaa

Genome
• Proteomics:
• Systematic study of all proteins encoded by a genome
• Proteins carry out most of the cell’s activities

Application
• Finding DNA sequences of organisms
• Predict structure & function of new proteins & RNA sequences
• Families of related proteins
• Phylogenetic trees show evolutionary relationships

Application
• The Cancer Genome Atlas project
• Monitors 2,000 genes in cancer cells for changes
• Mutations & rearrangements
• Lung cancer, ovarian cancer, and glioblastoma
• Compare to normal cells

Application
• DNA sequencing
• Highlight diseases
• Specialize treatment (tx)

Genome size
• Bacteria range from 1 to 6 million base pairs (Mb)
• Eukaryotes usually larger
• Humans have 3,200 Mb

Table 21-1

Fig. 21-UN1

Genome size
• Bacteria and Archaea: Most are 1–6 Mb
• Eukarya: Most are 10–4,000 Mb, but a few are much larger

Number of genes
• Bacteria and Archaea: 1,500–7,500
• Eukarya: 5,000–40,000

Gene density
• Bacteria and Archaea: Higher than in eukaryotes
• Eukarya: Lower than in prokaryotes (Within eukaryotes, lower density is correlated with larger genomes.)

Introns
• Bacteria: None in protein-coding genes
• Archaea: Present in some genes
• Eukarya: Unicellular eukaryotes: present, but prevalent only in some species. Multicellular eukaryotes: present in most genes

Other noncoding DNA
• Bacteria and Archaea: Very little
• Eukarya: Can be large amounts; generally more repetitive noncoding DNA in multicellular eukaryotes

Genome
• Gene density:
• Number of genes in a given length of DNA
• Humans & other mammals have the lowest gene density
• Multicellular eukaryotes have many introns
• “Junk DNA”
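Gene density as defined above is a simple ratio, sketched here as genes per Mb. The function name `gene_density` is hypothetical, and the gene counts in the example are illustrative values chosen for the arithmetic, not figures from these slides (only the 3,200 Mb human genome size comes from the slides).

```python
def gene_density(num_genes: int, genome_size_mb: float) -> float:
    """Genes per million base pairs (Mb)."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return num_genes / genome_size_mb


# Illustrative inputs only (assumed gene counts, not measured data)
print(gene_density(4_000, 4))        # 1000.0 genes per Mb, a compact genome
print(gene_density(20_000, 3_200))   # 6.25 genes per Mb, a large genome
```

The comparison shows why a large genome with many introns and much noncoding DNA ends up with a far lower density even when it has more genes in total.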

Genome
• Genomes of humans, rats, & mice
• 500 noncoding regions are the same in all three
• 98.5% of the genome does not code for proteins, rRNAs, or tRNAs
• 24% is regulatory sequences & introns

Fig. 21-7
• Exons (regions of genes coding for protein or giving rise to rRNA or tRNA): 1.5%
• Introns and regulatory sequences: 24%
• Repetitive DNA that includes transposable elements and related sequences: 44%, including L1 sequences (17%) and Alu elements (10%)
• Repetitive DNA unrelated to transposable elements: 15%, including simple sequence DNA (3%) and large-segment duplications (5–6%)
• Unique noncoding DNA: 15%

Genome
• Pseudogenes:
• Former genes that have mutated and no longer function
• Repetitive DNA:
• Sequences present in multiple copies

Genome
• Transposable elements:
• DNA that can move from one site to another
• Prokaryotes & eukaryotes
• Discovered by Barbara McClintock

Fig. 21-8

Genome
• Eukaryotic transposable elements
• 1. Transposons:
• Move within a genome
• DNA intermediate
• 2. Retrotransposons:
• Move by means of an RNA intermediate

Fig. 21-9a: (a) Transposon movement (“copy-and-paste” mechanism). Labels: DNA of genome; transposon; transposon is copied; mobile transposon; insertion; new copy of transposon.

Fig. 21-9b: (b) Retrotransposon movement. Labels: retrotransposon; RNA; reverse transcriptase; insertion; new copy of retrotransposon.
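The retrotransposon pathway in the figure (element, RNA copy, reverse transcription, insertion at a new site) can be mimicked on a plain string. This is a toy model under heavy assumptions: a single strand, no enzymes, and a caller-chosen insertion point. The function name `retrotranspose` and the example sequences are hypothetical.

```python
def retrotranspose(genome: str, element: str, insert_at: int) -> str:
    """Insert a new copy of `element` at position `insert_at`.

    The original copy stays in place, so the genome grows by len(element).
    """
    if element not in genome:
        raise ValueError("element not found in genome")
    if not 0 <= insert_at <= len(genome):
        raise ValueError("insertion point outside genome")
    rna = element.replace("T", "U")    # transcription, simplified to T -> U
    dna_copy = rna.replace("U", "T")   # reverse transcription back to DNA
    return genome[:insert_at] + dna_copy + genome[insert_at:]


# Hypothetical genome carrying one copy of a made-up element
genome = "AAAAGATTACACCCC"
new_genome = retrotranspose(genome, "GATTACA", insert_at=len(genome))
print(new_genome)                    # AAAAGATTACACCCCGATTACA
print(new_genome.count("GATTACA"))   # 2
```

Because each round adds a copy rather than moving one, repeated rounds of this kind are how such elements can come to make up a large share of a genome.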

Genome
• Alu elements
• 10% of genome
• Transposable elements
• 300 nucleotides
• Do not code for protein
• Code for RNA

Genome
• LINE-1 or L1
• 17% of genome
• Retrotransposons
• 6,500 nucleotides
• Low rate of transposition
• Regulate gene expression
• Developing neurons

Genome
• Repetitive DNA not related to transposons
• 15%
• 1. Long sequences of DNA
• 2. Simple sequence DNA
• Many copies of repeated short sequences
• GTTACGTTACGTTACGTTACGTTAC

Genome
• Short tandem repeat (STR)
• Repeating units of 2 to 5 nucleotides
• Number of repeats varies among individuals
• Found at centromeres
• Found at telomeres
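Because the number of repeats in an STR varies among individuals, a common task is counting how many times a unit repeats back to back. The sketch below does that with a regular expression, using the GTTAC example from the simple sequence DNA slide; the function name `longest_tandem_run` and the second test sequence are hypothetical.

```python
import re


def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of consecutive copies of `unit` found in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group()) // len(unit) for m in pattern.finditer(seq)]
    return max(runs, default=0)


print(longest_tandem_run("GTTACGTTACGTTACGTTACGTTAC", "GTTAC"))  # 5
print(longest_tandem_run("AAGTGTGTCC", "GT"))                    # 3
```

Comparing such repeat counts at several sites is the basic idea behind distinguishing individuals by their STRs.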

Genome
• Multigene families:
• Collections of identical or very similar genes on a haploid set of chromosomes
• Example:
• Code for rRNA products
• Single transcript makes all rRNA molecules
• Transcript sequence repeated many times

Fig. 21-10a: (a) Part of the ribosomal RNA gene family. Labels: DNA; RNA transcripts; nontranscribed spacer; transcription unit; rRNA (18S, 5.8S, 28S).

Genome
• Nonidentical genes
• Hemoglobin
• Chromosome 16: α-globin
• Chromosome 11: β-globin
• Coded separately
• Animal development

Fig. 21-10b: (b) The human α-globin and β-globin gene families. Labels: heme; hemoglobin; α-globin; β-globin; α-globin gene family (chromosome 16); β-globin gene family (chromosome 11); stages of expression: embryo, fetus, adult.

Evolution
• Human & chimpanzee genomes differ by 1.2%
• More Alu elements in humans
• Several genes are evolving faster in humans
• Genes involved in defense against malaria and tuberculosis
• Gene that regulates brain size
• Genes that code for transcription factors

Fig. 21-15: Upper tree: Bacteria, Eukarya, and Archaea diverging from the most recent common ancestor of all living things (axis: billions of years ago, 4 to 0). Lower tree: chimpanzee, human, and mouse (axis: millions of years ago, 70 to 0).

Evolution
• FOXP2 gene
• Vocalization
• Mutation causes speech impairment
• 2 amino acid differences between chimpanzees and humans

Evolution
• Humans: 23 pairs of chromosomes
• Chimpanzees: 24 pairs
• Humans & chimpanzees diverged from a common ancestor
• 2 ancestral chromosomes fused in humans
• Duplications & inversions result from mistakes during meiotic recombination

Figure 21.11: Human chromosome 2 and chimpanzee chromosomes 12 and 13. Labels: telomere sequences; centromere sequences; telomere-like sequences; centromere-like sequences.

Figure 21.12: Human chromosome 16; mouse chromosomes 7, 8, 16, and 17.

Figure 21.14: From an ancestral globin gene to the α-globin gene family on chromosome 16 and the β-globin gene family on chromosome 11. Steps along the evolutionary time axis:
1. Duplication of ancestral gene
2. Mutation in both copies
3. Transposition to different chromosomes
4. Further duplications and mutations
Gene labels shown include ζ, α2, α1, θ, ϵ, and β.

Figure 21.16: Portions of ancestral genes and the TPA gene as it exists today.
• Epidermal growth factor gene with multiple EGF exons: one EGF exon in TPA (exon shuffling)
• Fibronectin gene with multiple “finger” (F) exons: one F exon in TPA (exon shuffling)
• Plasminogen gene with a “kringle” (K) exon: two K exons in TPA (exon shuffling and exon duplication)

Evolution
• Evo-devo
• Evolutionary developmental biology
• Developmental processes in multicellular organisms
• Genomic information shows minor differences in gene sequence or regulation
• Results in major differences in form

Evolution
• Homeotic genes
• Body segments (fruit fly)
• 180-nucleotide sequence
• Homeobox
• Related homeobox sequences have been found in regulatory genes of yeasts, plants, and even prokaryotes

Fig. 21-17. Labels: adult fruit fly; fruit fly embryo (10 hours); fly chromosome; mouse chromosomes; mouse embryo (12 days); adult mouse.
