The Structure and Function of Proteins
Bioinformatics Ch 7
The many functions of proteins
• Mechanoenzymes: myosin, actin
• Rhodopsin: allows vision
• Globins: transport oxygen
• Antibodies: immune system
• Enzymes: pepsin, renin, carboxypeptidase A
• Receptors: transmit messages through membranes
• Vitellogenin: molecular velcro
– And hundreds of thousands more…
Complex Chemistry Tutorial
• Molecules are made of atoms!
• There is a lot of hydrogen out there!
• Atoms make a “preferred” number of covalent (strong) bonds
– C – 4
– N – 3
– O, S – 2
• Atoms will generally “pick up” enough hydrogens to “fill their valence capacity” in vivo.
• Molecules also “prefer” to have a neutral charge
Biochemistry
• In the context of a protein…
– Oxygen tends to exhibit a slight negative charge
– Nitrogen tends to exhibit a slight positive charge
– Carbon tends to remain neutral/uncharged
• Atoms can “share” a hydrogen atom, each making “part” of a covalent bond with the hydrogen
– Oxygen: H-Bond donor or acceptor
– Nitrogen: H-Bond donor
– Carbon: Neither
Proteins are chains of amino acids
• Polymer – a molecule composed of repeating units
Amino acid composition
• Basic amino acid structure:
– The side chain, R, varies for each of the 20 amino acids
– Amino & carboxyl groups, plus the central (alpha) carbon, make the “backbone” of the amino acid
[Figure: amino acid structure, with the amino group, carboxyl group, and side chain (R) labeled]
The Peptide Bond
• Dehydration synthesis
• Repeating backbone: N–Cα–C–N–Cα–C
– Convention – start at amino terminus and proceed to carboxy terminus
Peptidyl polymers
• A few amino acids in a chain are called a peptide; longer chains are called polypeptides. A protein is usually composed of 50 to 400+ amino acids.
• Since part of the amino acid is lost during dehydration synthesis, we call the units of a protein amino acid residues.
[Figure labels: carbonyl carbon, amide nitrogen]
Side chain properties
• Recall that the electronegativity of carbon is at about the middle of the scale for light elements
– Carbon does not make hydrogen bonds with water easily: hydrophobic
– O and N are generally more likely than C to H-bond to water: hydrophilic
• We group the amino acids into three general groups:
– Hydrophobic
– Charged (positive/basic & negative/acidic)
– Polar
The Hydrophobic Amino Acids
Proline severely limits allowable conformations!
The Charged Amino Acids
The Polar Amino Acids
More Polar Amino Acids
And then there’s…
Planarity of the peptide bond
Phi (φ) – the angle of rotation about the N–Cα bond.
Psi (ψ) – the angle of rotation about the Cα–C bond.
The planar bond angles and bond lengths are fixed.
Phi and psi
φ = ψ = 180° is the extended conformation
φ: Cα to N–H; ψ: C=O to Cα
[Figure: backbone segment with Cα, C=O, and N–H labeled]
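Phi and psi are ordinary dihedral (torsion) angles, so they can be computed from four atom positions. The following is a minimal sketch, assuming coordinates are given as (x, y, z) tuples; the helper names (dihedral, phi, psi) are illustrative and not from the slides. By the usual convention, phi is taken over C(i-1), N, Cα, C and psi over N, Cα, C, N(i+1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _scale(a, s):
    return (a[0] * s, a[1] * s, a[2] * s)

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Parts of the outer bonds that are perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def phi(c_prev, n, ca, c):
    """Rotation about the N-C-alpha bond."""
    return dihedral(c_prev, n, ca, c)

def psi(n, ca, c, n_next):
    """Rotation about the C-alpha-C bond."""
    return dihedral(n, ca, c, n_next)
```

Four atoms in a planar zigzag (trans) arrangement give an angle of magnitude 180°, which is the extended conformation named on the slide.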
The Ramachandran Plot
• G. N. Ramachandran – first calculations of sterically allowed regions of phi and psi
• Note the structural importance of glycine
[Figure: Ramachandran plots. Panels: Calculated, Observed (non-glycine), Observed (glycine)]
Primary & Secondary Structure
• Primary structure = the linear sequence of amino acids comprising a protein: AGVGTVPMTAYGNDIQYYGQVT…
• Secondary structure
– Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the α-helix and the β-sheet
– The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
The alpha helix: φ ≈ ψ ≈ −60°
Properties of the alpha helix: φ ≈ ψ ≈ −60°
• Hydrogen bonds between C=O of residue n and N–H of residue n+4
• 3.6 residues/turn
• 1.5 Å/residue rise
• 100°/residue turn
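The three numbers above are tied together by simple arithmetic: 360° divided by 3.6 residues per turn gives 100° per residue, and 3.6 residues times 1.5 Å gives the rise per full turn (the pitch). A small sketch, with hypothetical function names, makes the relationship explicit:

```python
RESIDUES_PER_TURN = 3.6   # from the slide
RISE_PER_RESIDUE = 1.5    # in angstroms, from the slide

def turn_per_residue():
    """Rotation about the helix axis contributed by one residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN

def helix_pitch():
    """Rise along the helix axis for one full turn, in angstroms."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE

def helix_length(n_residues):
    """Approximate axial length of an ideal helix of n residues, in angstroms."""
    return n_residues * RISE_PER_RESIDUE
```

Under these idealized values, one turn rises about 5.4 Å and a 20-residue helix is about 30 Å long.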
Properties of α-helices
• 4 – 40+ residues in length
• Often amphipathic or “dual-natured”
– Half hydrophobic and half hydrophilic
– Mostly when surface-exposed
• If we examine many α-helices, we find trends…
– Helix formers: Ala, Glu, Leu, Met
– Helix breakers: Pro, Gly, Tyr, Ser
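One way to put a number on “amphipathic” is a hydrophobic moment: place each residue around the helix axis at 100° steps, weight it by a hydrophobicity value, and take the length of the vector sum. This is a sketch of that idea, not a method from the slides. The Kyte-Doolittle hydropathy scale used here is an assumption brought in for illustration, and the function name is hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (an assumed scale, not part of the slides).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(seq, turn_deg=100.0):
    """Per-residue magnitude of the hydropathy-weighted vector sum around the helix axis."""
    x = sum(KD[r] * math.cos(math.radians(turn_deg * k)) for k, r in enumerate(seq))
    y = sum(KD[r] * math.sin(math.radians(turn_deg * k)) for k, r in enumerate(seq))
    return math.hypot(x, y) / len(seq)
```

A helix with hydrophobic residues on one face and hydrophilic residues on the other scores high. A helix of identical residues scores near zero over 18 residues, because 18 steps of 100° cover the circle evenly.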
The beta strand (& sheet): φ ≈ −135°, ψ ≈ +135°
Properties of beta sheets
• Formed of stretches of 5-10 residues in extended conformation
• Pleated – each Cα a bit above or below the previous
• Parallel/antiparallel, contiguous/non-contiguous
Parallel and anti-parallel β-sheets
• Anti-parallel is slightly energetically favored
[Figure panels: Anti-parallel, Parallel]
Turns and Loops
• Secondary structure elements are connected by regions of turns and loops
• Turns – short regions of non-α, non-β conformation
• Loops – larger stretches with no secondary structure. Often disordered.
– “Random coil”
– Sequences vary much more than secondary structure regions
Levels of Protein Structure
• Secondary structure elements combine to form tertiary structure
• Quaternary structure occurs in multienzyme complexes
– Many proteins are active only as homodimers, homotetramers, etc.
Secondary Structure Prediction
• Based on backbone flexibility
• Various methods
– Statistical, neural networks, evolutionary computation.
– Conserved aligned sequences as input (degree calculated)
– PHD can get 70-75% accuracy
Chou-Fasman Parameters
Name           Abbrv  P(a)  P(b)  P(turn)  f(i)   f(i+1)  f(i+2)  f(i+3)
Alanine        A      142   83    66       0.060  0.076   0.035   0.058
Arginine       R      98    93    95       0.070  0.106   0.099   0.085
Aspartic Acid  D      101   54    146      0.147  0.110   0.179   0.081
Asparagine     N      67    89    156      0.161  0.083   0.191   0.091
Cysteine       C      70    119   119      0.149  0.050   0.117   0.128
Glutamic Acid  E      151   37    74       0.056  0.060   0.077   0.064
Glutamine      Q      111   110   98       0.074  0.098   0.037   0.098
Glycine        G      57    75    156      0.102  0.085   0.190   0.152
Histidine      H      100   87    95       0.140  0.047   0.093   0.054
Isoleucine     I      108   160   47       0.043  0.034   0.013   0.056
Leucine        L      121   130   59       0.061  0.025   0.036   0.070
Lysine         K      114   74    101      0.055  0.115   0.072   0.095
Methionine     M      145   105   60       0.068  0.082   0.014   0.055
Phenylalanine  F      113   138   60       0.059  0.041   0.065   0.065
Proline        P      57    55    152      0.102  0.301   0.034   0.068
Serine         S      77    75    143      0.120  0.139   0.125   0.106
Threonine      T      83    119   96       0.086  0.108   0.065   0.079
Tryptophan     W      108   137   96       0.077  0.013   0.064   0.167
Tyrosine       Y      69    147   114      0.082  0.065   0.114   0.125
Valine         V      106   170   50       0.062  0.048   0.028   0.053
Chou-Fasman Algorithm
• Identify α-helices
– 4 out of 6 contiguous amino acids that have P(a) > 100
– Extend the region until 4 amino acids with P(a) < 100 found
– Compute ΣP(a) and ΣP(b) over the region; if the region is >5 residues and ΣP(a) > ΣP(b), identify as a helix
• Repeat for β-sheets [use P(b)]
• If an α and a β region overlap, the overlapping region is predicted according to P(a) and P(b)
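The helix steps above can be sketched in a few lines. This is a simplified reading of the slide, not the published algorithm in full. It assumes that “until 4 amino acids with P(a) < 100 found” means four consecutive residues that are each below 100, and that the region simply runs to the end of the sequence when fewer than four residues remain. The function names are hypothetical; the P(a) and P(b) values are copied from the parameter table.

```python
# P(a) and P(b) values copied from the Chou-Fasman parameter table.
P_A = dict(A=142, R=98, D=101, N=67, C=70, E=151, Q=111, G=57, H=100, I=108,
           L=121, K=114, M=145, F=113, P=57, S=77, T=83, W=108, Y=69, V=106)
P_B = dict(A=83, R=93, D=54, N=89, C=119, E=37, Q=110, G=75, H=87, I=160,
           L=130, K=74, M=105, F=138, P=55, S=75, T=119, W=137, Y=147, V=170)

def all_breakers(segment, run=4):
    """True when segment is a full run of residues that each have P(a) < 100."""
    return len(segment) == run and all(P_A[r] < 100 for r in segment)

def find_helices(seq, window=6, needed=4, run=4):
    """Return (start, end, sum_pa, sum_pb) per predicted helix; end is exclusive."""
    helices = []
    i = 0
    while i + window <= len(seq):
        # Nucleation: at least `needed` of `window` residues have P(a) > 100.
        if sum(P_A[r] > 100 for r in seq[i:i + window]) < needed:
            i += 1
            continue
        start, end = i, i + window
        # Extension: grow in each direction until the adjacent `run` residues
        # are all below 100 (assumed reading of the slide's stop rule).
        while end < len(seq) and not all_breakers(seq[end:end + run], run):
            end += 1
        while start > 0 and not all_breakers(seq[max(0, start - run):start], run):
            start -= 1
        sum_pa = sum(P_A[r] for r in seq[start:end])
        sum_pb = sum(P_B[r] for r in seq[start:end])
        # Accept only regions longer than 5 residues where helix propensity wins.
        if end - start > 5 and sum_pa > sum_pb:
            helices.append((start, end, sum_pa, sum_pb))
        i = end  # resume scanning after this region
    return helices
```

For the sequence CAENKLDHVRGPTCILFMTWYNDGP, the first region this sketch reports is CAENKLDHV, with summed P(a) of 972 and summed P(b) of 843.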
Chou-Fasman, cont’d
• Identify hairpin turns:
– P(t) = f(i) of the residue * f(i+1) of the next residue * f(i+2) of the following residue * f(i+3) of the residue at position (i+3)
– Predict a hairpin turn starting at positions where:
  • P(t) > 0.000075
  • The average P(turn) for the four residues > 100
  • P(a) < P(turn) > P(b), averaged over the four residues
• Accuracy 60-65%
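The turn rule is a product of four positional frequencies plus two comparisons of averages, which can be sketched as follows. The dictionary holds the table values per residue, and the function names (turn_stats, is_turn, find_turns) are illustrative rather than taken from the slides. The sketch assumes that the P(a), P(turn), and P(b) comparison is made on four-residue averages.

```python
# Per residue: (P(a), P(b), P(turn), f(i), f(i+1), f(i+2), f(i+3)),
# copied from the Chou-Fasman parameter table.
CF = {
    "A": (142, 83, 66, 0.060, 0.076, 0.035, 0.058),
    "R": (98, 93, 95, 0.070, 0.106, 0.099, 0.085),
    "D": (101, 54, 146, 0.147, 0.110, 0.179, 0.081),
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),
    "C": (70, 119, 119, 0.149, 0.050, 0.117, 0.128),
    "E": (151, 37, 74, 0.056, 0.060, 0.077, 0.064),
    "Q": (111, 110, 98, 0.074, 0.098, 0.037, 0.098),
    "G": (57, 75, 156, 0.102, 0.085, 0.190, 0.152),
    "H": (100, 87, 95, 0.140, 0.047, 0.093, 0.054),
    "I": (108, 160, 47, 0.043, 0.034, 0.013, 0.056),
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.070),
    "K": (114, 74, 101, 0.055, 0.115, 0.072, 0.095),
    "M": (145, 105, 60, 0.068, 0.082, 0.014, 0.055),
    "F": (113, 138, 60, 0.059, 0.041, 0.065, 0.065),
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),
    "S": (77, 75, 143, 0.120, 0.139, 0.125, 0.106),
    "T": (83, 119, 96, 0.086, 0.108, 0.065, 0.079),
    "W": (108, 137, 96, 0.077, 0.013, 0.064, 0.167),
    "Y": (69, 147, 114, 0.082, 0.065, 0.114, 0.125),
    "V": (106, 170, 50, 0.062, 0.048, 0.028, 0.053),
}

def turn_stats(seq, i):
    """Return (P(t), avg P(a), avg P(b), avg P(turn)) for the four residues at i."""
    tetra = seq[i:i + 4]
    p_t = 1.0
    for k, residue in enumerate(tetra):
        p_t *= CF[residue][3 + k]  # f(i), f(i+1), f(i+2), f(i+3) in order
    avg_a = sum(CF[r][0] for r in tetra) / 4
    avg_b = sum(CF[r][1] for r in tetra) / 4
    avg_turn = sum(CF[r][2] for r in tetra) / 4
    return p_t, avg_a, avg_b, avg_turn

def is_turn(seq, i, cutoff=0.000075):
    """Apply the three turn criteria to the four residues starting at i."""
    p_t, avg_a, avg_b, avg_turn = turn_stats(seq, i)
    return p_t > cutoff and avg_turn > 100 and avg_a < avg_turn > avg_b

def find_turns(seq):
    """Start positions (0-based) of every four-residue window predicted as a turn."""
    return [i for i in range(len(seq) - 3) if is_turn(seq, i)]
```

For the four residues VRGP, this gives P(t) of about 0.000085, an average P(turn) of 113.25, and averages of 79.5 for P(a) and 98.25 for P(b), so the window passes all three criteria.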
Chou-Fasman Example
• CAENKLDHVRGPTCILFMTWYNDGP
• CAENKL – Potential helix (4 of 6 have P(a) > 100; C and N do not)
• Residues with P(a) < 100: RNCGPSTY
– Extend: When we reach RGPT, we must stop
– CAENKLDHV: ΣP(a) = 972, ΣP(b) = 843
– Declare alpha helix
• Identifying a hairpin turn
– VRGP: P(t) = 0.000085
– Average P(turn) = 113.25
– Avg P(a) = 79.5, Avg P(b) = 98.25
Protein Structure Examples
Views of a protein
[Figure panels: Wireframe, Ball and stick]
Views of a protein
[Figure panels: Spacefill, Cartoon]
CPK colors
Carbon = green, black, or grey
Nitrogen = blue
Oxygen = red
Sulfur = yellow
Hydrogen = white