proteins. what is a protein? a protein is a molecule consisting of amino acids linked in a linear...

Post on 17-Dec-2015

Proteins

What is a protein?

• A protein is a molecule consisting of amino acids linked in a linear chain through peptide bonds.

Protein primary structure

Peptide formation

There are many kinds of proteins.

• Structural--determine shape and function of cells

• Enzymes--speed up chemical reactions

• Ligand-binding--bind small molecules and transport them to other locations

Cells

• muscle

• nerve

Structural proteins

• collagen -- in connective tissue such as cartilage

• elastin -- in connective tissue such as cartilage

• keratin -- in hair and nails

• actin -- in muscle

• myosin -- in muscle to generate mechanical forces

Enzymes

• glucose isomerase--convert glucose into fructose

• rennin--make cheese

• cellulase--break down cellulose into sugars to make ethanol

• amylase--detergent for machine dish washing

Ligand-binding proteins.

• hemoglobin--transport oxygen from the lungs

• antibodies--bind foreign substances for destruction

The string of amino acids tends to “fold” into a shape.

Hemoglobin structure

Heart of Steel (Hemoglobin) by Julian Voss-Andreae

Protein views (Triose phosphate isomerase)

Visualizing proteins

Amino acids

• There are 20 different standard amino acids

• The different amino acids differ in chemical properties.

Amino Acid      3-Letter  1-Letter  Polarity  Acidity    Hydrophobicity index
Alanine         Ala       A         nonpolar  neutral     1.8
Arginine        Arg       R         polar     basic (s)  -4.5
Asparagine      Asn       N         polar     neutral    -3.5
Aspartic acid   Asp       D         polar     acidic     -3.5
Cysteine        Cys       C         nonpolar  neutral     2.5
Glutamic acid   Glu       E         polar     acidic     -3.5
Glutamine       Gln       Q         polar     neutral    -3.5
Glycine         Gly       G         nonpolar  neutral    -0.4
Histidine       His       H         polar     basic (w)  -3.2
Isoleucine      Ile       I         nonpolar  neutral     4.5
Leucine         Leu       L         nonpolar  neutral     3.8
Lysine          Lys       K         polar     basic      -3.9
Methionine      Met       M         nonpolar  neutral     1.9
Phenylalanine   Phe       F         nonpolar  neutral     2.8
Proline         Pro       P         nonpolar  neutral    -1.6
Serine          Ser       S         polar     neutral    -0.8
Threonine       Thr       T         polar     neutral    -0.7
Tryptophan      Trp       W         nonpolar  neutral    -0.9
Tyrosine        Tyr       Y         polar     neutral    -1.3
Valine          Val       V         nonpolar  neutral     4.2

Hydrophobicity index.

• The larger the index, the stronger the tendency to be internal in the protein; the lower the index, the stronger the tendency to appear near the protein surface.

• Amino acids with a high index are called hydrophobic; those with a low index are called hydrophilic.
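As a small illustration of how the index can be used, the sketch below stores the table values in a dictionary and labels a residue as hydrophobic or hydrophilic. The cutoff of 0 and the function names `classify` and `mean_hydrophobicity` are assumptions made for this example; the slides do not fix a threshold.

```python
# Hydrophobicity index values copied from the amino acid table.
HYDROPHOBICITY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "E": -3.5, "Q": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def classify(residue, threshold=0.0):
    """Label one residue; the threshold of 0 is an assumption, not a standard."""
    if HYDROPHOBICITY[residue] > threshold:
        return "hydrophobic"
    return "hydrophilic"


def mean_hydrophobicity(sequence):
    """Average index over a string of 1-letter amino acid codes."""
    return sum(HYDROPHOBICITY[r] for r in sequence) / len(sequence)


if __name__ == "__main__":
    for residue in "IVKR":
        print(residue, HYDROPHOBICITY[residue], classify(residue))
```

A residue such as isoleucine (I) would be expected to sit inside the folded protein, while arginine (R) would tend to appear near the surface.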

What is the shape of the protein?

• This is the “protein folding problem.”

• The geometry and chemistry of the parts of the protein determine how it behaves in the cell.

DNA

• DNA is deoxyribonucleic acid.

• It occurs as long molecules in a double helix.

DNA is a long molecule in a double helix

What makes DNA?

• DNA consists of sequences of nucleotides.

• There are 4 kinds of nucleotide:

• Adenine (A), Cytosine (C), Guanine (G), and Thymine (T)

Matching

• Each A has weak (“hydrogen”) bonds with T on the other chain.

• Each C has weak (“hydrogen”) bonds with G on the other chain.

A single chain carries the information

• For example, the two strings might be

ACGGTCAG

TGCCAGTC

• Hence all the information is in the order of A, C, G, T in one of the chains.

• We write DNA as a (long) string of A, C, G, T for example AGGCTACATAG…
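Because A always pairs with T and C with G, one chain can be computed from the other. A minimal sketch of that matching rule follows; the names `PAIR` and `complement` are chosen for this example only.

```python
# Matching rule from the slides: A pairs with T, C pairs with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the base on the other chain at each position of the strand."""
    return "".join(PAIR[base] for base in strand)


print(complement("ACGGTCAG"))  # TGCCAGTC
```

Applying `complement` twice gives back the original string, which is one way to see that a single chain carries all the information.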

Human DNA

• Humans have 46 chromosomes.

• Each chromosome is essentially a double helix of DNA, with variable numbers of nucleotides, from 50,000,000 to 250,000,000 base pairs.

• There are a total of about 2,860,000,000 nucleotide pairs.

Genes

• A gene is a portion of the DNA that tells how to make a protein.

DNA for beta hemoglobin

• ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA

DNA determines the order of amino acids

• ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGGTGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCCACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAAAGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGGGCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGATCCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCATCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAGTGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA

Primary structure for beta hemoglobin--the order

• MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH

Hemoglobin structure

How does DNA determine the order of amino acids?

• Three successive nucleotides form a “codon.”

• Different codons stand for different amino acids.

Translating codons

Amino acid  Codons
Ala/A       GCT, GCC, GCA, GCG
Arg/R       CGT, CGC, CGA, CGG, AGA, AGG
Asn/N       AAT, AAC
Asp/D       GAT, GAC
Cys/C       TGT, TGC
Gln/Q       CAA, CAG
Glu/E       GAA, GAG
Gly/G       GGT, GGC, GGA, GGG
His/H       CAT, CAC
Ile/I       ATT, ATC, ATA
Leu/L       TTA, TTG, CTT, CTC, CTA, CTG
Lys/K       AAA, AAG
Met/M       ATG
Phe/F       TTT, TTC
Pro/P       CCT, CCC, CCA, CCG
Ser/S       TCT, TCC, TCA, TCG, AGT, AGC
Thr/T       ACT, ACC, ACA, ACG
Trp/W       TGG
Tyr/Y       TAT, TAC
Val/V       GTT, GTC, GTA, GTG
START       ATG
STOP        TAG, TGA, TAA
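The codon table can be applied mechanically: read the DNA three letters at a time and look each codon up. The sketch below does this for the first 30 nucleotides of the beta hemoglobin gene. It assumes that reading starts at the first letter of the string and stops at the first STOP codon; the names `CODONS`, `CODON_TO_AA` and `translate` are illustrative.

```python
# Codon table from the slides, keyed by 1-letter amino acid code.
# "*" is used here as a marker for STOP.
CODONS = {
    "A": "GCT GCC GCA GCG", "R": "CGT CGC CGA CGG AGA AGG",
    "N": "AAT AAC", "D": "GAT GAC", "C": "TGT TGC",
    "Q": "CAA CAG", "E": "GAA GAG", "G": "GGT GGC GGA GGG",
    "H": "CAT CAC", "I": "ATT ATC ATA",
    "L": "TTA TTG CTT CTC CTA CTG", "K": "AAA AAG", "M": "ATG",
    "F": "TTT TTC", "P": "CCT CCC CCA CCG",
    "S": "TCT TCC TCA TCG AGT AGC", "T": "ACT ACC ACA ACG",
    "W": "TGG", "Y": "TAT TAC", "V": "GTT GTC GTA GTG",
    "*": "TAG TGA TAA",
}

# Invert the table so that each codon maps to its amino acid.
CODON_TO_AA = {
    codon: aa for aa, codons in CODONS.items() for codon in codons.split()
}


def translate(dna):
    """Translate DNA codon by codon, stopping at the first STOP codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TO_AA[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGTGCATCTGACTCCTGAGGAGAAGTCT"))  # MVHLTPEEKS
```

Running `translate` on the full gene string should give the primary structure of beta hemoglobin in the same way.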

The order of amino acids is important

• Consider what may happen when the “wrong” amino acid is in a certain position.

Primary structure for beta hemoglobin

• MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH

Sickle cell anemia beta hemoglobin

• MVHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
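The two sequences are hard to compare by eye. A short sketch like the following can locate the change; it uses only the first ten amino acids of each sequence, counts positions from the leading M as written in the slides, and the helper name `differences` is an assumption for this example.

```python
def differences(seq_a, seq_b):
    """List (1-based position, residue in seq_a, residue in seq_b) per mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


normal = "MVHLTPEEKS"  # start of normal beta hemoglobin
sickle = "MVHLTPVEKS"  # start of sickle cell beta hemoglobin

print(differences(normal, sickle))  # [(7, 'E', 'V')]
```

The single change replaces glutamic acid (E, hydrophobicity index -3.5) with valine (V, index 4.2), so a hydrophilic amino acid becomes a strongly hydrophobic one.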

Simple model

• Pretend there are only 2 kinds of amino acid--H and P.

• H stands for “hydrophobic”; P stands for “polar” (hydrophilic).

• Pretend that they must be placed on a grid.

• Example: HHPPPPPPPHH

A folding of HHPPPPPPPHH

[Figure: the chain HHPPPPPPPHH drawn on the grid]

Another folding of HHPPPPPPPHH

[Figure: the chain HHPPPPPPPHH drawn on the grid in a different shape]

Energy

• Each pair of amino acids at neighboring grid points has an energy.

• An HH pair has energy -1.

• A PP, HP, or PH pair has energy 0.

• The protein folds so as to minimize the total energy.

A folding of HHPPPPPPPHH with energy -2

[Figure: the chain HHPPPPPPPHH drawn on the grid]

A folding of HHPPPPPPPHH with energy -4

[Figure: the chain HHPPPPPPPHH drawn on the grid in a different shape]

A folding of HHPPPPPPPHH with ? energy

[Figure: a third way of drawing the chain HHPPPPPPPHH on the grid]
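For a chain this short, the lowest possible energy can be found by trying every folding. The following is a brute-force sketch, not an efficient method: it grows the chain one grid step at a time, never revisiting a point, and counts every pair of neighboring H's (including chain neighbors, as assumed from the energies on the slides). The name `best_energy` is hypothetical.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def best_energy(sequence):
    """Lowest energy over all self-avoiding grid foldings of the sequence."""
    best = 0

    def extend(i, last, occupied, h_points, energy):
        nonlocal best
        if i == len(sequence):
            best = min(best, energy)
            return
        for dx, dy in MOVES:
            point = (last[0] + dx, last[1] + dy)
            if point in occupied:
                continue  # the chain may not cross itself
            is_h = sequence[i] == "H"
            gained = 0
            if is_h:
                # Each H already on a neighboring point forms one new HH pair.
                gained = sum(
                    (point[0] + ax, point[1] + ay) in h_points
                    for ax, ay in MOVES
                )
                h_points.add(point)
            occupied.add(point)
            extend(i + 1, point, occupied, h_points, energy - gained)
            occupied.remove(point)
            if is_h:
                h_points.remove(point)

    start = (0, 0)
    h_start = {start} if sequence[0] == "H" else set()
    extend(1, start, {start}, h_start, 0)
    return best


print(best_energy("HHPPPPPPPHH"))  # -4
```

The number of foldings grows exponentially with the length of the chain, so this approach is only practical for short examples.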

The real problem

• There are 20 amino acids.

• Pairs have different energies.

• Typically a protein has about 100 amino acids.

• The protein is in 3 dimensions.

• It does not need to be on a grid.

• It must be worked on a computer.

The Direct Approach

• Write down a formula for the energy E, taking into account the (variable) locations of all amino acids, all charges and electrostatic attractions and repulsions, and all constraints.

• Minimize E.

Indirect Methods

• Statistics of amino acids in known structures

• Neural network models

• Nearest neighbor methods

• Hidden Markov models

Does a method work?

• We want to be able to check some answers, to see whether a method appears to work.

• Professor Zhijun Wu works on some problems related to this.

NMR

• NMR is Nuclear Magnetic Resonance

• Using NMR one can often find the distances between some particular atoms in a protein.

Distances

• Here d(1,4) is the distance between the first and fourth atoms.

[Figure: four atoms A1, A2, A3, A4, with the distances d(1,4) and d(2,3) marked]

Locations

• A1 is at (x11, x12, x13).

• A2 is at (x21, x22, x23).

• A3 is at (x31, x32, x33).

• A4 is at (x41, x42, x43).

• Once you know all the locations, you know the shape of the protein.

Position Matrix

• Form the matrix X

$$X = \begin{pmatrix} x_{11} & x_{12} & x_{13} \\ x_{21} & x_{22} & x_{23} \\ x_{31} & x_{32} & x_{33} \\ x_{41} & x_{42} & x_{43} \end{pmatrix}$$

Matrix Equation

• It turns out that $X X^T = D$, where D is a matrix that can be obtained using only the numbers d(i,j).

The matrix D

• If there are n atoms and the last is at the origin, then the entry of D in the ith row and jth column is

$$D_{ij} = \frac{d(i,n)^2 - d(i,j)^2 + d(j,n)^2}{2}$$
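The formula for D can be checked numerically. The sketch below picks made-up coordinates for four atoms (with the last at the origin), computes all pairwise distances, builds D from the distances alone, and compares it with the product of X and its transpose. The coordinates and the name `gram_from_distances` are assumptions for illustration; recovering X from D is a separate step not shown here.

```python
import math


def gram_from_distances(d):
    """Build D from a full table of distances d[i][j] (0-based indices).

    Assumes the last atom sits at the origin, as on the slide.
    """
    n = len(d)
    last = n - 1
    return [
        [(d[i][last] ** 2 - d[i][j] ** 2 + d[j][last] ** 2) / 2
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical positions of A1..A4; A4 is placed at the origin.
X = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]

# Pairwise distances, the kind of data NMR can partly provide.
d = [[math.dist(p, q) for q in X] for p in X]

D = gram_from_distances(d)

# X times its transpose: entry (i, j) is the dot product of rows i and j.
XXT = [[sum(a * b for a, b in zip(p, q)) for q in X] for p in X]

matches = all(
    abs(D[i][j] - XXT[i][j]) < 1e-9 for i in range(4) for j in range(4)
)
print(matches)  # True
```

The check works because, with the last atom at the origin, d(i,n) is the length of the position vector of atom i, so the formula is the law of cosines rewritten as a dot product.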

Solving the matrix equation

• Professor Zhijun Wu studies ways to solve such matrix equations rapidly.

Energy

• HH has energy -1.

• PP has energy 0.

• HP has energy 0.

• PH has energy 0.

• The protein folds so as to minimize the energy.

What is the best folding of

• HPPHPPHPHPPHPHPHHH

• (Careful: answer is on the next slide)

HPPHPPHPHPPHPHPHHH

[Figure: the chain HPPHPPHPHPPHPHPHHH folded on the grid]

with energy -11
