Biochemistry 301
Principles of Protein Structure
Walter Chazin
5140 BIOSCI/MRBIII
E-mail: Walter.Chazin
http://structbio.vanderbilt.edu/chazin
Jan. 8-10, 2003
Text Books
Branden and Tooze, Introduction to Protein Structure
Voet, Voet and Pratt, Fundamentals of Biochemistry
Stryer, Biochemistry
Proteins: Polymers of Amino Acids
• 20 different amino acids: many combinations
• Proteins are made in the RIBOSOME
Amino Acid Chemistry
Amino acid: NH2-CH(R)-COOH (20 different types)
Two amino acids, NH2-CH(R1)-COOH and NH2-CH(R2)-COOH, join through a peptide bond: NH2-CH(R1)-CO-NH-CH(R2)-COOH
Amino acid → Polypeptide → Protein
Amino Acid Chemistry
Amino acid: NH2-CH(R)-COOH
The free amino and carboxylic acid groups have pKa's:
COOH ⇌ COO-    pKa ~ 2.2
NH3+ ⇌ NH2     pKa ~ 9.4
At physiological pH, amino acids are zwitterions: +NH3-CH(R)-COO-
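The zwitterion statement follows from the Henderson-Hasselbalch relation. The sketch below is only an illustration: it uses the approximate pKa values quoted on the slide and assumes a physiological pH of 7.4; the function name is hypothetical.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a titratable group in its deprotonated form.

    Rearranged Henderson-Hasselbalch: [A-]/([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


PKA_CARBOXYL = 2.2  # approximate alpha-carboxyl pKa from the slide
PKA_AMINO = 9.4     # approximate alpha-amino pKa from the slide
PH = 7.4            # assumed physiological pH

# Carboxyl group: the deprotonated form is COO-
carboxylate = fraction_deprotonated(PH, PKA_CARBOXYL)
# Amino group: the protonated form is NH3+
ammonium = 1.0 - fraction_deprotonated(PH, PKA_AMINO)

print(f"fraction as COO-: {carboxylate:.5f}")
print(f"fraction as NH3+: {ammonium:.5f}")
```

Both fractions come out close to 1, so nearly every molecule carries one negative and one positive charge at the same time.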
Amino Acid Chemistry
Note the axes
Also titratable groups in side chain
Amino Acids with Aliphatic R-Groups
Amino acid    Code      pKa (α-COOH)  pKa (α-NH3+)
Glycine       Gly - G   2.4           9.8
Alanine       Ala - A   2.4           9.9
Valine        Val - V   2.2           9.7
Leucine       Leu - L   2.3           9.7
Isoleucine    Ile - I   2.3           9.8
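Amino Acids with Polar R-Groups
Non-Aromatic Amino Acids with Hydroxyl R-Groups
Amino acid    Code      pKa (α-COOH)  pKa (α-NH3+)  pKa (side chain)
Serine        Ser - S   2.2           9.2           ~13
Threonine     Thr - T   2.1           9.1           ~13
Amino Acids with Sulfur-Containing R-Groups
Amino acid    Code      pKa (α-COOH)  pKa (α-NH3+)  pKa (side chain)
Cysteine      Cys - C   1.9           10.8          8.3
Methionine    Met - M   2.1           9.3           -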
Acidic Amino Acids and Amide Conjugates
Amino acid      Code      pKa (α-COOH)  pKa (α-NH3+)  pKa (side chain)
Aspartic Acid   Asp - D   2.0           9.9           3.9
Asparagine      Asn - N   2.1           8.8           -
Glutamic Acid   Glu - E   2.1           9.5           4.1
Glutamine       Gln - Q   2.2           9.1           -
Basic Amino Acids
Amino acid    Code      pKa (α-COOH)  pKa (α-NH3+)  pKa (side chain)
Arginine      Arg - R   1.8           9.0           12.5
Lysine        Lys - K   2.2           9.2           10.8
Histidine     His - H   1.8           9.2           6.0
Aromatic Amino Acids and Proline
Amino acid      Code      pKa (α-COOH)  pKa (α-NH3+)  pKa (side chain)
Phenylalanine   Phe - F   2.2           9.2           -
Tyrosine        Tyr - Y   2.2           9.1           10.1
Tryptophan      Trp - W   2.4           9.4           -
Proline         Pro - P   2.0           10.6          -
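As a rough illustration of how the side-chain pKa values listed for the individual amino acids combine, the sketch below estimates the net charge of a short peptide at a given pH. It assumes each group titrates independently with its free amino acid pKa, uses generic terminal values of ~2.2 and ~9.4, and ignores the ~13 hydroxyl pKa's of Ser and Thr; the function names are hypothetical.

```python
# Side-chain pKa values as listed for the free amino acids.
ACIDIC = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}  # neutral when protonated, -1 when deprotonated
BASIC = {"H": 6.0, "K": 10.8, "R": 12.5}            # +1 when protonated, neutral when deprotonated
PKA_C_TERM = 2.2  # generic alpha-carboxyl value (assumption)
PKA_N_TERM = 9.4  # generic alpha-amino value (assumption)


def deprotonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch fraction of a group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def net_charge(sequence: str, pH: float) -> float:
    """Approximate net charge of a peptide given as one-letter codes."""
    # Termini: C-terminal carboxylate is negative, N-terminal ammonium is positive.
    charge = -deprotonated(pH, PKA_C_TERM) + (1.0 - deprotonated(pH, PKA_N_TERM))
    for residue in sequence.upper():
        if residue in ACIDIC:
            charge -= deprotonated(pH, ACIDIC[residue])
        elif residue in BASIC:
            charge += 1.0 - deprotonated(pH, BASIC[residue])
    return charge


for peptide in ("G", "K", "DE", "H"):
    print(peptide, round(net_charge(peptide, 7.0), 2))
```

In a folded protein the real pKa of a buried group can differ substantially from these free amino acid values, so the result is only a first estimate.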
Hierarchy of Protein Structure
• 20 different amino acids: many combinations
Primary Structure: the order of amino acids (protein sequence)
Secondary Structure: local conformation, depends on sequence
Tertiary/Quaternary Structure: overall structure of the chain(s) in full 3D
Beyond Primary Structure: The Peptide Bond
Resonance structures: O=C-N-H ↔ -O-C=N+-H
Peptide bond has partial double-bond character
Peptide plane is flat: ω angle ~180°
Implications of Peptide Planes
ω angle varies little; φ and ψ angles vary a lot
Many φ/ψ combinations cause atoms to collide
Each residue is sandwiched between two peptide planes
[Figure: peptide planes joined at the Cα atoms, each Cα carrying H and R]
Polypeptide Backbone
Backbone restricted → limited conformations
Collisions with side chain groups further limit φ/ψ combinations
[Figure: polypeptide backbone with three consecutive Cα atoms, each carrying H and R]
Secondary Structure
Local Conformation of Consecutive Residues
• Three low energy backbone combinations
1. Right-hand helix: α-helix (-40°, -60°)
2. Extended: antiparallel β-sheet (140°, -140°)
3. Left-hand helix (rare): α-helix (45°, 45°)
Glycine is special: it has no side chain!
• Hydrogen bonds between backbone atoms provide stability to secondary structures
• Amino acids have specific preferences
Secondary Structure- α-Helix
[Figure: helix with backbone H-bonds marked]
Secondary Structure- β-Sheet
[Figure legend: oxygen, nitrogen, R group, hydrogen, carbon, carbonyl C, H bond]
Secondary Structure- Turn
[Figure: turn formed by residues 1 to 4]
Reverses direction of the chain
Ribbon and Topology Diagrams
Representations of Secondary Structures
Sheets (arrows), Helices (cylinders)
B/T- Figure 2.17
Ribbon and Topology Diagrams
Organization of Secondary Structures
helix
B/T- Figure 2.11
Beyond Secondary Structure
Supersecondary structure (motifs): small, discrete, commonly observed aggregates of secondary structures
  sheet, helix-loop-helix
Domains: independent units of structure
  barrel, four-helix bundle
*Domains and motifs sometimes interchanged*
Protein Motifs
V/V/P- Figure 6.28
Hairpin Motif
B/T- Figure 2.14
Helix-Loop-Helix (H-L-H) Motif
B/T- Figure 2.12
EF-Hand H-L-H Motif
B/T- Figure 2.13
Greek Key Motif
B/T- Figure 2.15
Multi-Domain (Modular) Proteins
[Figure: proteins drawn as strings of domains; domain types shown: EGF, Protease, Kringle, Ca-binding]
Tertiary Structure
Definition: Overall 3D form of a molecule
Organization of the secondary structures/motifs/domains
Optimization of interactions between residues
A specific 3D structure is formed
All proteins have multiple secondary structures, almost always multiple motifs, and
in some cases multiple domains
Tertiary Structure
Specific structures result from long-range interactions
Electrostatic (charged) interactions
Hydrogen bonds (O-H, N-H, S-H)
Hydrophobic interactions
Soluble proteins have an inside (core) and outside
Folding driven by water: hydrophilic/hydrophobic
Side chain properties specify core/exterior
Some interactions inside, others outside
Tertiary Structure
I. Ionic Interactions (exterior)
Form between 2 charged side chains:
1 Negative: Glu, Asp
1 Positive: Lys, Arg, His
Also called "salt bridges".
Ionic interactions are pH-dependent (pKa).
Occur at the exterior
NOTE: pKa's of groups in the interior of a protein may be
very different from those of the free amino acid.
Tertiary Structure
II. Hydrogen bonds (interior and exterior)
Form between side chains/backbone/water:
Charged side chains: Glu, Asp, His, Lys, Arg
Polar side chains: Ser, Thr, Cys, Asn, Gln, [Tyr, Trp]
Not a specific covalent bond – lower energy.
Occurs inside, at the exterior, and with water.
Tertiary Structure
III. Hydrophobic Interactions (interior)
Form between side chains of non-polar residues:
Aliphatic (Ala, Val, Leu, Ile, Pro, Met)
Aromatic (Phe, Trp, [Tyr])
Clusters of side chains, but with no requirement for a specific orientation as in an H-bond
In the protein interior, away from water
Not pH dependent
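The side chain groupings used for hydrogen bonding and hydrophobic interactions can be applied directly to a sequence. The sketch below tallies residues by those groupings; it is only an illustration. Tyr is bracketed on the slides and Gly is not listed, so both are counted as "other" here (an assumption), and the example sequence is arbitrary.

```python
from collections import Counter

# Groupings taken from the slides (one-letter codes).
HYDROPHOBIC = set("AVLIPM") | set("FW")  # aliphatic + aromatic
CHARGED = set("EDHKR")
POLAR = set("STCNQ")


def classify(residue: str) -> str:
    """Assign a residue to one of the slide's side-chain classes."""
    if residue in HYDROPHOBIC:
        return "hydrophobic"
    if residue in CHARGED:
        return "charged"
    if residue in POLAR:
        return "polar"
    return "other"  # Gly, Tyr, and anything unrecognized (assumption)


def composition(sequence: str) -> Counter:
    """Count how many residues of a sequence fall into each class."""
    return Counter(classify(residue) for residue in sequence.upper())


example = composition("ATVRLLEWEDL")  # arbitrary 11-residue example
print(dict(example))
```

A high hydrophobic count suggests many residues that would prefer the protein core, while charged and polar residues are candidates for the exterior.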
Tertiary Structure
IV. Disulfide Bonds (interior and exterior)
Form between Cys residues:
Cys-SH + HS-Cys → Cys-S-S-Cys
Catalyzed by specific enzymes, oxidizing agents
Restricts flexibility of the protein
Usually within a protein, less for linking proteins
Disulfide Bonding
V/V/P- Figure 16.6
Quaternary Structure
Definition: Organization of multiple chain associations
Oligomerization- Homo (self), Hetero (different)
Used in organizing single proteins and protein
machines
Specific structures result from long-range interactions
Electrostatic (charged) interactions
Hydrogen bonds (O-H, N-H, S-H)
Hydrophobic interactions
Disulfides only VERY infrequently
Quaternary Structure
The classic example- hemoglobin α2β2
B/T- Figure 3.7
END OF PART 1
Protein Structure from Sequence
The pattern of amino acid side chains determines the local conformation and the global structure
*Pattern is more important than exact sequence*
Reporting/Comparing Protein Sequences
        5         10
A T V R L L E W E D L
A T V R L L E Y K D L
Sequences compared: h-CaM, b-CaM
Substitutions are marked as conservative or non-conservative
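The comparison of the two sequences can be sketched in code. This is a minimal illustration, not a standard scoring scheme: the residue groups are an assumption (real alignments normally use substitution matrices), and a substitution is called conservative only when both residues fall in the same group.

```python
# Hypothetical residue groups used to judge whether a substitution is conservative.
GROUPS = {
    "aliphatic": "AVLIPM",
    "aromatic": "FWY",
    "acidic": "DE",
    "basic": "KRH",
    "polar": "STCNQ",
    "glycine": "G",
}
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def compare(seq_a: str, seq_b: str):
    """List (position, residue_a, residue_b, kind) for every mismatch.

    Positions are 1-based, matching the residue numbering on the slide.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            same_group = GROUP_OF[a] == GROUP_OF[b]
            kind = "conservative" if same_group else "non-conservative"
            differences.append((position, a, b, kind))
    return differences


first = "ATVRLLEWEDL"
second = "ATVRLLEYKDL"
for difference in compare(first, second):
    print(difference)
```

With these groups the two sequences differ at positions 8 and 9: Trp/Tyr keeps an aromatic side chain, while Glu/Lys reverses the charge.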
Proteins Fold To Their Native Structure
Folded proteins are only marginally stable!!
~0.4 kJ·mol⁻¹ per residue required to unfold (cf. ~20 kJ·mol⁻¹ per H-bond)
Balance loss of entropy vs. stabilizing forces
Protein fold is specified by sequence
Reversible reaction: denature (unfold)/renature (fold)
Even single mutations can cause changes
Recent discovery that amyloid diseases (e.g.
CJD, Alzheimer's) are due to unstable protein folding
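To see how marginal this stability is, the arithmetic can be written out. The sketch below assumes that the ~0.4 kJ/mol figure is a per-residue value and takes ~20 kJ/mol for one hydrogen bond, as quoted on the slide; the function names are hypothetical.

```python
DG_PER_RESIDUE = 0.4  # kJ/mol, slide value read as a per-residue figure (assumption)
HBOND_ENERGY = 20.0   # kJ/mol for one hydrogen bond, as quoted on the slide


def net_stability(n_residues: int) -> float:
    """Approximate free energy (kJ/mol) needed to unfold a protein of n residues."""
    return n_residues * DG_PER_RESIDUE


def equivalent_hbonds(n_residues: int) -> float:
    """Express the net stability as a number of hydrogen bonds."""
    return net_stability(n_residues) / HBOND_ENERGY


print(net_stability(100), "kJ/mol for a 100-residue protein")
print(equivalent_hbonds(100), "hydrogen-bond equivalents")
```

Under these assumptions a 100-residue protein is held in its native fold by the energetic equivalent of only about two hydrogen bonds.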
How Does a Protein Find Its Fold?
A protein of n residues: 20^n possible sequences!
A 100-residue protein has 20^100 possibilities: 1.3 × 10^130!
The latest estimates indicate < 40,000 sequences in the human genome
THERE MUST BE RULES!
• 20 different amino acids: many combinations
[Figure: chain running from the amino terminus (N) to the carboxyl terminus (C), with residues numbered 1, 2, 3, 4, ...]
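The size of sequence space quoted for a 100-residue protein is easy to check with exact integer arithmetic. A minimal sketch (the function name is hypothetical):

```python
def sequence_count(n_residues: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences of a given length over the amino acid alphabet."""
    return alphabet_size ** n_residues


count = sequence_count(100)
print(f"{count:.2e}")   # 1.27e+130
print(len(str(count)))  # number of decimal digits: 131
```

Compared with fewer than 40,000 sequences in the human genome, only a vanishing fraction of the possible sequences is actually used.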
Limitations on Protein Sequence
Minimum length based on ability to perform a
biochemical function: ~40 residues (e.g. inhibitors)
Maximum length based on complexity of assembly:
Conversion of DNA code and production of proteins
is carried out by molecular machines that are not
perfect. If the sequence gets too long, too many
errors will build up.
*Length is generally 100-1000 residues*
Protein Folding
The hydrophobic effect is the major driving force
Hydrophobic side chains cluster/exclude water
Release of water cages in unfolded state
Other forces providing stability to the folded state
Hydrogen bonds
Electrostatic interactions
Chemical cross links- Disulfides, metal ions
Protein Folding
Random folding has too many possibilities
• Backbone restricted but side chains not
• A 100-residue protein would require 10^87 s to search all conformations (age of universe < 10^18 s)
• Most proteins fold in less than 10 s!!
Proteins must fold along specific pathways!!
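The 10^87 s figure can be reproduced with a back-of-the-envelope estimate. The slide does not state its inputs, so the sketch below assumes 10 conformations per residue and 10^13 conformations sampled per second; both numbers are assumptions chosen because they give the quoted value.

```python
import math


def search_time_log10_seconds(
    n_residues: int,
    conformations_per_residue: float = 10.0,  # assumption, not from the slide
    samples_per_second: float = 1e13,         # assumption, not from the slide
) -> float:
    """log10 of the time (s) needed to visit every backbone conformation once."""
    log_conformations = n_residues * math.log10(conformations_per_residue)
    return log_conformations - math.log10(samples_per_second)


exponent = search_time_log10_seconds(100)
print(f"random search time ~ 10^{exponent:.0f} s")
print(f"age of universe    < 10^18 s")
```

Working with logarithms avoids handling numbers like 10^100 directly and makes the gap to the observed folding time (under 10 s) explicit.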
Protein Folding Pathways
Usual order of folding events
Secondary structures formed quickly (local)
Secondary structures aggregate to form motifs
Hydrophobic collapse to form domains
Coalescence of domains
Molecular chaperones assist folding in vivo
Complexity of large chains/multi-domains
Cellular environment is rich in interacting molecules
Chaperones sequester proteins and allow time to fold
Progressive Folding of Proteins
From Disordered to Native State
Protein Folding Funnel
V/V/P- Figures 6.37/38
Functional Classes of Proteins
• Receptors- sense stimuli, e.g. in neurons
• Channels- control cell contents
• Transport- e.g. hemoglobin in blood
• Storage- e.g. ferritin in liver
• Enzyme- catalyze biochemical reactions
• Cell function- multi-protein machines
• Structural- collagen in skin
• Immune response- antibodies
Structural Classes of Proteins
1. Globular proteins (enzymes, molecular machines)
Variety of secondary structures
Approximately spherical shape
Water soluble
Function in dynamic roles (e.g. catalysis,
regulation, transport, gene processing)
Globular Proteins
V/V/P- Figure 6.27
Hemoglobin, Concanavalin A, Triose Phosphate Isomerase
Structural Classes of Proteins
2. Fibrous Proteins (fibrils, structural proteins)
One dominating secondary structure
Typically narrow, rod-like shape
Poor water solubility
Function in structural roles (e.g. cytoskeleton,
bone, skin)
Collagen: A Fibrous Protein
V/V/P- Figures 6.17/18
Triple Helix
Gly-Pro-Pro Repeat
Stabilizing inter-strand H-bonds
Structural Classes of Proteins
3. Membrane Proteins (receptors, channels)
Inserted into (through) membranes
Multi-domain- membrane spanning,
cytoplasmic, and extra-cellular domains
Poor water solubility
Function in cell communication (e.g. cell
signaling, transport)
Photosynthetic Reaction Center
B/T Figure 13.6
[Figure labels: extracellular, membrane-spanning, intracellular (cytoplasmic)]
In the physical sense, the progression of living organisms results from the communication
between molecules.
Interaction between molecules is determined by binding affinities.
Binding Classification of Proteins
• Structural- other structural proteins
• Receptors- regulatory proteins, transmitters
• Toxins- receptors
• Transport- O2/CO2, cholesterol, metals, sugars
• Storage- metals, amino acids
• Enzymes- substrates, inhibitors, co-factors
• Cell function- proteins, RNA, DNA, metals, ions
• Immune response- foreign matter (antigens)
Surface Determines What Binds
1. Steric access
2. Shape
3. Hydrophobic accessible surface
4. Electrostatic surface
Sequence and structure optimized to generate surface properties for requisite binding event(s)
Determinants of Protein Surface
Function requires specific amino acid properties
Not all amino acids are equally useful
Abundant: Leu, Ala, Gly, Ser, Val, Glu
Rare: Trp, Cys, Met, His
Post-translational modifications
  Addition of co-factors- metals, hemes, etc.
  Chemical modification- phosphorylation, glycosylation, acetylation, ubiquitination, sumoylation
Binding Alters Protein Structure
Mechanisms of Achieving Functional Properties
1. Allosteric Control- binding at one site effects changes in conformation or chemistry at a point distant in space
2. Stimulation/inhibition by control factors- proteins, ions, metals control progression of a biochemical process (e.g. controlling access to active site)
3. Reversible covalent modification- chemical bonding, e.g. phosphorylation (kinase/phosphatase)
4. Proteolytic activation/inactivation- irreversible, involves cleavage of one or more peptide bonds
Calcium Signal Transduction
Allostery & Stimulation by Control Factor
Target
Ca2+
Calmodulin
Sequence, Structure, Function
Many sequences can give same structure
  Side chain pattern more important than sequence
When homology is high (>50%), likely to have same structure and function (Structural Genomics)
  Cores conserved
  Surfaces and loops more variable
*3-D shape more conserved than sequence*
*There are a limited number of structural frameworks*
Varied Relationships Between Sequence, Structure and Function
I. Homologous: similar sequence (cytochrome c)
  Same structure
  Same function
  Modeling structure from homology
V/V/P Figure 6.31
C-Type Cytochromes
Same structure/function- Different Sequence
Heme
Constant structural elements and basic architecture
Varied Relationships Between Sequence, Structure and Function
I. Homologous: very similar sequence (cytochrome c)
  Same structure
  Same function
  Modeling structure from homology
II. Similar function- different sequence (dehydrogenases)
  One domain same structure
  One domain different
B/T Figure 10.8
NAD-Binding Domains
Conserved Domains/Functional Elements
Lactate Dehydrogenase, Alcohol Dehydrogenase
Varied Relationships Between Sequence, Structure and Function
I. Homologous: very similar sequence (cytochrome c)
  Same structure
  Same function
  Modeling structure from homology
II. Similar function- different sequence (dehydrogenases)
  One domain same structure
  One domain different
III. Similar structure- different function (cf. thioredoxin)
  Same 3-D structure
  Not same function
B/T Figures 10.8/2.7
NADH-Binding and Redox
Same structure- Different Function
Alcohol Dehydrogenase, Lactate Dehydrogenase
Thioredoxin