![Page 1: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/1.jpg)
Secondary Structure Prediction of Proteins
Protein Sequence +
Structure VIJAY
![Page 2: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/2.jpg)
INTRODUCTION
Primary structure (Amino acid sequence)
↓
Secondary structure (α-helix, β-sheet)
↓
Tertiary structure (Three-dimensional structure formed by assembly of secondary
structures)
↓
Quaternary structure (Structure formed by more than one polypeptide chain)
![Page 3: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/3.jpg)
Secondary Structure
Defined as the local conformation of the protein backbone
Primary structure → (folding) → Secondary structure
α-helix and β-sheet
Types of secondary structure:
• Regular secondary structure (α-helices, β-sheets)
• Irregular secondary structure (tight turns, random coils, bulges)
![Page 4: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/4.jpg)
![Page 5: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/5.jpg)
α-helix
•Common conformation
•Spiral structure
•Tightly packed, coiled polypeptide backbone with side chains extending outward
•Forms spontaneously
•Stabilized by H-bonding between amide hydrogens and carbonyl oxygens of peptide bonds
•R-groups lie on the exterior of the helix and perpendicular to its axis
•One complete turn of the helix contains 3.6 aminoacyl residues and spans 0.54 nm
e.g. the keratins: entirely α-helical
myoglobin: 80% helical
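The two helix numbers above (3.6 residues per turn, 0.54 nm per turn) imply a rise and a rotation per residue. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, not from the slides.

```python
RESIDUES_PER_TURN = 3.6   # from the slide
PITCH_NM = 0.54           # length of one complete turn, from the slide

# Rise along the helix axis contributed by each residue.
RISE_PER_RESIDUE_NM = PITCH_NM / RESIDUES_PER_TURN

# Rotation about the helix axis contributed by each residue.
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN


def helix_length_nm(n_residues):
    """Approximate axial length of an ideal alpha helix of n_residues."""
    return n_residues * RISE_PER_RESIDUE_NM


if __name__ == "__main__":
    print(round(RISE_PER_RESIDUE_NM, 3), "nm per residue")
    print(round(DEGREES_PER_RESIDUE, 1), "degrees per residue")
    print(round(helix_length_nm(18), 2), "nm for an 18-residue helix")
```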
![Page 6: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/6.jpg)
•Glycine and proline, bulky amino acids, and charged amino acids favor disruption of the helix.
![Page 7: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/7.jpg)
β-sheet
•β-sheets are composed of 2 or more different regions of the polypeptide, each a stretch of at least 5-10 amino acids.
•The folding and alignment of stretches of the polypeptide backbone alongside one another to form β-sheets is stabilized by H-bonding between amide hydrogens and carbonyl oxygens.
•The peptide backbone of the β-sheet is highly extended.
•R groups of adjacent residues point in opposite directions.
•β-sheets are either parallel or antiparallel.
![Page 8: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/8.jpg)
β-sheet
(parallel, anti-parallel)
![Page 9: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/9.jpg)
![Page 10: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/10.jpg)
What is secondary structure prediction?
The first step in predicting protein structure.
A technique for determining the secondary structure of a given polypeptide by locating the coils, alpha helices, and beta strands in the polypeptide.
Given a protein sequence (primary structure):
GHWIATRGQLIREAYEDYRHFSSECPFIP
predict its secondary structure content (C = coil, H = alpha helix, E = beta strand):
CEEEEECHHHHHHHHHHHCCCHHCCCCCC
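A prediction in this three-letter form is easy to summarise. The sketch below takes the example sequence and prediction from the slide and reports the fraction of each state and the contiguous segments; the helper names `composition` and `segments` are illustrative.

```python
from collections import Counter
from itertools import groupby

sequence = "GHWIATRGQLIREAYEDYRHFSSECPFIP"
predicted = "CEEEEECHHHHHHHHHHHCCCHHCCCCCC"


def composition(ss):
    """Fraction of residues in each state (H, E, C)."""
    counts = Counter(ss)
    return {state: counts[state] / len(ss) for state in "HEC"}


def segments(ss):
    """Contiguous runs as (state, start, end), 0-based, end exclusive."""
    runs = []
    pos = 0
    for state, run in groupby(ss):
        length = len(list(run))
        runs.append((state, pos, pos + length))
        pos += length
    return runs


if __name__ == "__main__":
    print(composition(predicted))
    for state, start, end in segments(predicted):
        print(state, sequence[start:end])
```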
![Page 11: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/11.jpg)
Why secondary structure prediction?
o Secondary structure prediction as a step toward tertiary structure prediction
o Protein function prediction
o Protein classification
o Predicting structural change
o Detection and alignment of remote homology between proteins
o Detecting transmembrane regions, solvent-accessible residues, and other important features of molecules
o Detection of hydrophobic and hydrophilic regions
![Page 12: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/12.jpg)
Prediction methods
o Statistical method
  o Chou-Fasman method, GOR I-IV
o Nearest neighbors
  o NNSSP, SSPAL
o Neural network
  o PHD, Psi-Pred, J-Pred
o Support vector machine (SVM)
o HMM
![Page 13: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/13.jpg)
Chou-Fasman algorithm
Proposed by Chou and Fasman in 1978.
It assigns a set of prediction values to each amino acid residue in the polypeptide and applies an algorithm to the conformational parameters and positional frequencies.
The conformational parameter (also called the preference parameter) of each amino acid is calculated from the relative frequency of each of the 20 amino acids in proteins.
From these, coils (C), alpha helices (H), and beta strands (E) are determined.
![Page 14: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/14.jpg)
• A table of prediction values (preference parameters) for each of the 20 amino acids in alpha helix, beta sheet, and turn has already been calculated and standardised.
• To obtain the prediction value, the frequency of amino acid (i) in the structure is divided by the frequency of all residues in the protein (s): i/s
• The resulting structural parameters p(alpha), p(beta), p(turn) vary from about 0.5 to 1.5 across the 20 amino acids.
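One way to read the i/s ratio above is: the fraction of a given amino acid found in a structure, divided by the fraction of all residues found in that structure. The sketch below computes it from labelled sequences. This reading is an assumption, and the function name and toy data are illustrative only.

```python
from collections import Counter


def propensities(sequences, structures, state):
    """Preference parameter of each amino acid for `state`.

    Assumed definition: (fraction of this amino acid observed in `state`)
    divided by (fraction of all residues observed in `state`).
    """
    total = Counter()
    in_state = Counter()
    for seq, ss in zip(sequences, structures):
        for aa, s in zip(seq, ss):
            total[aa] += 1
            if s == state:
                in_state[aa] += 1
    n_all = sum(total.values())
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError("no residues observed in state %r" % state)
    overall = n_state / n_all
    return {aa: (in_state[aa] / total[aa]) / overall for aa in total}


if __name__ == "__main__":
    # Toy data, not real proteins: two alanines and two glycines.
    print(propensities(["AAGG"], ["HHHC"], "H"))
```

A value above 1 means the residue occurs in that structure more often than average; a value below 1 means less often.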
![Page 15: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/15.jpg)
![Page 16: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/16.jpg)
RULES
A window is scanned along the sequence to find a short run of amino acids with a high probability of forming one type of structure.
When 4 out of 6 amino acids have a high probability (>1.03): alpha helix
When 3 out of 5 amino acids have a high probability (>1.03): beta sheet
![Page 17: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/17.jpg)
ALGORITHM
o Note the preference parameters for the 20 amino acids in the peptide
o Scan the window and identify regions where 4 out of 6 contiguous residues have p(alpha helix) > 1.00
o Continue scanning in both directions until 4 contiguous residues have an average p(alpha helix) < 1.00; this marks the end of the helix
o If the segment is longer than 5 residues and p(alpha helix) > p(beta sheet), the whole segment is predicted as alpha helix
o Scan the remaining segments and identify further alpha helices
![Page 18: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/18.jpg)
o Identify regions where 3 out of 5 residues have p(beta sheet) > 1.00; the region is predicted as beta sheet
o Continue scanning in both directions until 4 residues have an average p(beta sheet) < 1.00; this marks the end of the beta sheet
o If the average p(beta sheet) > 1.05 and p(beta sheet) > p(alpha helix), consider the complete segment a β-pleated sheet
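The helix and sheet scans described above share one shape: find a nucleus, extend it, then accept or reject the segment. A minimal sketch under stated assumptions: propensities are on the 1.00 scale (so the sheet cut-off is read as 1.05), extension stops when the four residues at the growing end average below 1.00, and the two small tables are rounded placeholder values for illustration, not the published Chou-Fasman table. The name `scan` and its parameters are hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def scan(seq, p, p_other, window, min_formers, min_length, min_mean,
         threshold=1.00):
    """Return accepted (start, end) segments, 0-based, end exclusive."""
    n = len(seq)
    found = []
    i = 0
    while i + window <= n:
        formers = sum(1 for aa in seq[i:i + window] if p[aa] > threshold)
        if formers < min_formers:
            i += 1
            continue
        start, end = i, i + window
        floor = found[-1][1] if found else 0  # do not grow into the previous segment
        # Extend right while the 4 residues ending at the new position average >= threshold.
        while end < n and mean(p[aa] for aa in seq[end - 3:end + 1]) >= threshold:
            end += 1
        # Extend left in the same way.
        while start > floor and mean(p[aa] for aa in seq[start - 1:start + 3]) >= threshold:
            start -= 1
        own = mean(p[aa] for aa in seq[start:end])
        other = mean(p_other[aa] for aa in seq[start:end])
        if end - start >= min_length and own > min_mean and own > other:
            found.append((start, end))
        i = end
    return found


# Placeholder propensities for a few residues (illustrative values only).
P_ALPHA = {"A": 1.4, "E": 1.5, "L": 1.2, "G": 0.6, "P": 0.6, "V": 1.1, "I": 1.1, "Y": 0.7}
P_BETA = {"A": 0.8, "E": 0.4, "L": 1.3, "G": 0.75, "P": 0.55, "V": 1.7, "I": 1.6, "Y": 1.5}


def helices(seq):
    # 4 of 6 formers, segment longer than 5 residues, p(alpha) > p(beta).
    return scan(seq, P_ALPHA, P_BETA, window=6, min_formers=4, min_length=6, min_mean=0.0)


def sheets(seq):
    # 3 of 5 formers, average p(beta) > 1.05, p(beta) > p(alpha).
    return scan(seq, P_BETA, P_ALPHA, window=5, min_formers=3, min_length=5, min_mean=1.05)


if __name__ == "__main__":
    print(helices("GGPAELAELGGP"))
    print(sheets("GGPVIVYVGGP"))
```

Segments found by the two scans can still overlap each other; resolving that is a separate step.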
![Page 19: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/19.jpg)
If regions overlap, consider the overlap alpha helix if average p(alpha helix) > p(beta sheet), or beta sheet if p(alpha helix) < p(beta sheet).
To identify a turn:
p(t) = f(j) × f(j+1) × f(j+2) × f(j+3)
j = residue number
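The turn formula multiplies four positional frequencies, one for each residue of the tetrapeptide starting at j. A minimal sketch, assuming `f` maps each amino acid to its frequency at turn positions 1 to 4; the numbers in `F_TURN` are placeholders for illustration, not published values.

```python
def turn_probability(seq, j, f):
    """p(t) = f(j) * f(j+1) * f(j+2) * f(j+3) for the tetrapeptide at index j.

    f[aa] is a 4-tuple: the frequency of aa at turn positions 1..4.
    """
    tetrapeptide = seq[j:j + 4]
    if j < 0 or len(tetrapeptide) < 4:
        raise ValueError("need four residues starting at index j")
    p = 1.0
    for position, aa in enumerate(tetrapeptide):
        p *= f[aa][position]
    return p


# Placeholder positional frequencies (illustrative only).
F_TURN = {
    "N": (0.20, 0.10, 0.20, 0.10),
    "P": (0.10, 0.30, 0.05, 0.05),
    "G": (0.10, 0.10, 0.20, 0.15),
    "S": (0.10, 0.15, 0.10, 0.10),
}

if __name__ == "__main__":
    print(turn_probability("NPGS", 0, F_TURN))
```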
![Page 20: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/20.jpg)
Result
Accuracy: ~50-60%
Helix formers: alanine, glutamine, leucine, methionine
Helix breakers: proline and glycine
Beta sheet formers: isoleucine, valine, tyrosine
Beta sheet breakers: proline, asparagine, glutamine
Turns contain proline (30%), serine (14%), lysine, asparagine (10%),
glycine (19%), aspartic acid (18%), serine (13%), tyrosine (11%)
http://www.accelrys.com/product/gcg-wisconsin-package/program-list.html
![Page 21: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/21.jpg)
Output of Chou-Fasman
![Page 22: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/22.jpg)
![Page 23: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/23.jpg)
GOR METHOD
• GOR (Garnier, Osguthorpe, Robson), 1978
• The Chou-Fasman method assumes that each amino acid individually influences the secondary structure of the sequence
• GOR assumes that the amino acids flanking the central amino acid also influence its secondary structure
• Consider a peptide as a central amino acid plus the side (flanking) amino acids
• It assumes that amino acids up to 8 residues on either side influence the secondary structure of the central residue
• The 4th version (GOR IV) is about 64% accurate
![Page 24: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/24.jpg)
ALGORITHM
•It uses a sliding window of 17 amino acids
•The flanking amino acid sequence and alignment are used to predict the secondary structure of the central residue
•Better for helices than sheets, because beta sheets have more inter-sequence hydrogen bonding
•36.5% accurate for beta sheets
•Input: any amino acid sequence
•Output: the predicted secondary structure
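The 17-residue window can be sketched as a sum of per-position contributions: for each central residue, every residue from 8 before to 8 after adds a score for each state, and the highest total wins. The slides do not give the scoring formula, so the additive form and the tiny `INFO` table below are assumptions for illustration; a real table would be derived from known structures.

```python
def gor_predict(seq, info, half_window=8, states="CHE"):
    """Predict one state per residue from a window of 2 * half_window + 1.

    info[state][(offset, amino_acid)] is the assumed contribution of that
    amino acid at that offset from the centre; missing entries count as 0.
    Ties go to the first state listed in `states`.
    """
    prediction = []
    for j in range(len(seq)):
        scores = {}
        for state in states:
            total = 0.0
            for offset in range(-half_window, half_window + 1):
                k = j + offset
                if 0 <= k < len(seq):
                    total += info[state].get((offset, seq[k]), 0.0)
            scores[state] = total
        prediction.append(max(states, key=lambda s: scores[s]))
    return "".join(prediction)


# Hypothetical toy table: alanine favours helix, valine strand, glycine coil.
INFO = {
    "H": {(0, "A"): 1.0, (-1, "A"): 0.5, (1, "A"): 0.5},
    "E": {(0, "V"): 1.0, (-1, "V"): 0.5, (1, "V"): 0.5},
    "C": {(0, "G"): 1.0},
}

if __name__ == "__main__":
    print(gor_predict("GAAAGVVVG", INFO))
```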
![Page 25: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/25.jpg)
![Page 26: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/26.jpg)
NEAREST NEIGHBOUR METHOD
o Based on the idea that short homologous sequences of amino acids have the same secondary structure
o It predicts the secondary structure of the central residue of a segment from neighbouring homologous sequences
o Using a structural database, find sequences of known secondary structure that may be homologous to the target sequence
o Naturally evolved proteins with 35% identical amino acid sequences will generally have the same secondary structure
o Find sequences that match the target sequence
o Tools: scoring matrix, MSA (multiple sequence alignment)
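The idea above can be sketched as a vote: for each window of the target, find the most similar windows in a database of known structure and take the most common state of their central residues. This sketch scores similarity by simple identity rather than a scoring matrix, which is a simplification; the function names and the two-entry database are hypothetical.

```python
from collections import Counter


def identity(a, b):
    """Number of positions at which two equal-length windows agree."""
    return sum(x == y for x, y in zip(a, b))


def nearest_neighbour_predict(target, database, window=5, k=3):
    """database: list of (sequence, structure) pairs of known structure."""
    half = window // 2
    # Every database window, paired with the state of its central residue.
    examples = []
    for seq, ss in database:
        for i in range(len(seq) - window + 1):
            examples.append((seq[i:i + window], ss[i + half]))
    padded = "-" * half + target + "-" * half  # pad so every residue has a window
    prediction = []
    for j in range(len(target)):
        query = padded[j:j + window]
        nearest = sorted(examples, key=lambda ex: identity(query, ex[0]),
                         reverse=True)[:k]
        votes = Counter(state for _, state in nearest)
        prediction.append(votes.most_common(1)[0][0])
    return "".join(prediction)


# Hypothetical toy database: one all-helix and one all-strand fragment.
DATABASE = [
    ("AELAELAEL", "HHHHHHHHH"),
    ("VIVYVIVYV", "EEEEEEEEE"),
]

if __name__ == "__main__":
    print(nearest_neighbour_predict("AELAE", DATABASE))
    print(nearest_neighbour_predict("VIVYV", DATABASE))
```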
![Page 27: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/27.jpg)
“Singleton” score matrix
| Residue | Helix Buried | Helix Inter | Helix Exposed | Sheet Buried | Sheet Inter | Sheet Exposed | Loop Buried | Loop Inter | Loop Exposed |
|---|---|---|---|---|---|---|---|---|---|
| ALA | -0.578 | -0.119 | -0.160 | 0.010 | 0.583 | 0.921 | 0.023 | 0.218 | 0.368 |
| ARG | 0.997 | -0.507 | -0.488 | 1.267 | -0.345 | -0.580 | 0.930 | -0.005 | -0.032 |
| ASN | 0.819 | 0.090 | -0.007 | 0.844 | 0.221 | 0.046 | 0.030 | -0.322 | -0.487 |
| ASP | 1.050 | 0.172 | -0.426 | 1.145 | 0.322 | 0.061 | 0.308 | -0.224 | -0.541 |
| CYS | -0.360 | 0.333 | 1.831 | -0.671 | 0.003 | 1.216 | -0.690 | -0.225 | 1.216 |
| GLN | 1.047 | -0.294 | -0.939 | 1.452 | 0.139 | -0.555 | 1.326 | 0.486 | -0.244 |
| GLU | 0.670 | -0.313 | -0.721 | 0.999 | 0.031 | -0.494 | 0.845 | 0.248 | -0.144 |
| GLY | 0.414 | 0.932 | 0.969 | 0.177 | 0.565 | 0.989 | -0.562 | -0.299 | -0.601 |
| HIS | 0.479 | -0.223 | 0.136 | 0.306 | -0.343 | -0.014 | 0.019 | -0.285 | 0.051 |
| ILE | -0.551 | 0.087 | 1.248 | -0.875 | -0.182 | 0.500 | -0.166 | 0.384 | 1.336 |
| LEU | -0.744 | -0.218 | 0.940 | -0.411 | 0.179 | 0.900 | -0.205 | 0.169 | 1.217 |
| LYS | 1.863 | -0.045 | -0.865 | 2.109 | -0.017 | -0.901 | 1.925 | 0.474 | -0.498 |
| MET | -0.641 | -0.183 | 0.779 | -0.269 | 0.197 | 0.658 | -0.228 | 0.113 | 0.714 |
| PHE | -0.491 | 0.057 | 1.364 | -0.649 | -0.200 | 0.776 | -0.375 | -0.001 | 1.251 |
| PRO | 1.090 | 0.705 | 0.236 | 1.249 | 0.695 | 0.145 | -0.412 | -0.491 | -0.641 |
| SER | 0.350 | 0.260 | -0.020 | 0.303 | 0.058 | -0.075 | -0.173 | -0.210 | -0.228 |
| THR | 0.291 | 0.215 | 0.304 | 0.156 | -0.382 | -0.584 | -0.012 | -0.103 | -0.125 |
| TRP | -0.379 | -0.363 | 1.178 | -0.270 | -0.477 | 0.682 | -0.220 | -0.099 | 1.267 |
| TYR | -0.111 | -0.292 | 0.942 | -0.267 | -0.691 | 0.292 | -0.015 | -0.176 | 0.946 |
| VAL | -0.374 | 0.236 | 1.144 | -0.912 | -0.334 | 0.089 | -0.030 | 0.309 | 0.998 |
![Page 28: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/28.jpg)
![Page 29: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/29.jpg)
Neural Network Method
•Prediction is done using information from different databases
•Linear sequence → 3D structure of polypeptide
![Page 30: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/30.jpg)
Neural network
Input signals are summed and turned into zero or one
[Figure: a neuron with input weights J1, J2, J3, J4]
Feed-forward multilayer network
[Figure: neurons arranged in an input layer, a hidden layer, and an output layer]
![Page 31: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/31.jpg)
Neural network training
Enter sequences → Compare prediction to reality → Adjust weights
![Page 32: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/32.jpg)
Simple neural network with hidden layer
$$out_i = f\left(\sum_j J^{(2)}_{ij}\, f\left(\sum_k J^{(1)}_{jk}\, in_k\right)\right)$$
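A forward pass through such a network can be sketched in a few lines, reading the slide's formula as: each hidden unit applies f to a weighted sum of the inputs, and each output unit applies f to a weighted sum of the hidden units. The weight names `J1` and `J2` follow that reading; the choice of a sigmoid for f and the example weights are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def step(x):
    """Hard threshold: the summed input is turned into zero or one."""
    return 1.0 if x > 0 else 0.0


def forward(inputs, J1, J2, f=sigmoid):
    """J1[j][k] connects input k to hidden unit j; J2[i][j] connects hidden j to output i."""
    hidden = [f(sum(w * x for w, x in zip(row, inputs))) for row in J1]
    return [f(sum(w * h for w, h in zip(row, hidden))) for row in J2]


if __name__ == "__main__":
    J1 = [[1.0, -1.0], [-1.0, 1.0]]   # 2 inputs -> 2 hidden units
    J2 = [[1.0, 1.0]]                 # 2 hidden units -> 1 output
    print(forward([1.0, 0.0], J1, J2, f=step))
    print(forward([1.0, 0.0], J1, J2))
```

Training then consists of adjusting the entries of `J1` and `J2` until the outputs match known structures.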
![Page 33: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/33.jpg)
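Neural network for secondary structure
[Figure: input units for the 20 amino acids (A C D E F G H I K L M N P Q R S T V W Y) plus a spacer (.); output units H, E, L]
Example input window with the state of each residue: D (L), R (E), Q (E), G (E), F (E), V (E), P (E), A (H), A (H), Y (H), V (E), K (E), K (E)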
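The input layer in the figure has one unit per amino acid plus a spacer, repeated for each position of the window. A minimal sketch of that encoding, assuming a 13-residue window to match the example and using the spacer for positions that fall off either end of the sequence; the function name `encode_window` is illustrative.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY."   # 20 amino acids plus the spacer "."


def encode_window(seq, center, half_window=6):
    """One-hot encode the window around `center` as a flat list of 0/1 values."""
    vector = []
    for k in range(center - half_window, center + half_window + 1):
        symbol = seq[k] if 0 <= k < len(seq) else "."
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(symbol)] = 1
        vector.extend(one_hot)
    return vector


if __name__ == "__main__":
    x = encode_window("DRQGFVPAAYVKK", center=6)
    print(len(x), "inputs,", sum(x), "of them set to 1")
```

The network's three outputs (H, E, L) are then read for the central residue of the window.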
![Page 34: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/34.jpg)
Summary
Introduction
What is secondary structure prediction
Why
Chou-Fasman method
GOR I-IV
Nearest neighbors
Neural network
![Page 35: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/35.jpg)
![Page 36: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/36.jpg)
![Page 37: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/37.jpg)
Reference
Suggested reading:
Chapter 15 in "Current Topics in Computational Molecular Biology", edited by Tao Jiang, Ying Xu, and Michael Zhang. MIT Press, 2002.
Bioinformatics by Cynthia and Per Jambeck
Bioinformatics by S. C. Rastogi
Bioinformatics by Andreas
Optional reading: review by Burkhard Rost:
http://cubic.bioc.columbia.edu/papers/2003_rev_dekker/paper.html
![Page 38: Secondary Structure Prediction of proteins](https://reader031.vdocuments.site/reader031/viewer/2022021423/587c6f811a28abd04e8b55e7/html5/thumbnails/38.jpg)