Prediction of structural and functional features in proteins
starting from the residue sequence
INTRODUCTION TO NEURAL NETWORKS

MAPPING PROBLEMS: Secondary structure
Covalent structure (Nt to Ct):
TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
Secondary structure:
EEEE..HHHHHHHHHHHH....HHHHHHHH.EEEE...........
3D structure

MAPPING PROBLEMS: Topology of transmembrane proteins
Topography: position of transmembrane segments along the sequence
Outer membrane: β-barrel, e.g. porin (Rhodobacter capsulatus)
Inner membrane: α-helices, e.g. bacteriorhodopsin (Halobacterium salinarum)
[Figure: the two proteins embedded in the bilayer]
ALALMLCMLTYRHKELKLKLKK ALALMLCMLTYRHKELKLKLKK ALALMLCMLTYRHKELKLKLKK

First generation methods: single residue statistics
Propensity scales
For each residue
•The association between each residue and the different features is statistically evaluated
•Physical and chemical features of residues
A propensity value for any structure can be associated to any residue
HOW?

Secondary structure: Chou-Fasman propensity scale
Given a set of known structures, we can count how many times a residue is associated with a structure.
Example:
ALAKSLAKPSDTLAKSDFREKWEWLKLLKALACCKLSAAL
hhhhhhhhccccccccccccchhhhhhhhhhhhhhhhhhh
N(A,h) = 7, N(A,c) = 1, N = 40
P(A,h) = 7/40, P(A,c) = 1/40
Is that enough for estimating a propensity?
We need to estimate how far the residue-to-structure association departs from independence.
P(h) = 27/40, P(c) = 13/40
If the structure is independent of the residue: P(A,h) = P(A) P(h)
The ratio P(A,h) / (P(A) P(h)) is the propensity
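The counting above can be sketched in a few lines of Python. This is a minimal illustration on the single 40-residue example from the slide, not the original Chou-Fasman procedure; the function name `propensities` is hypothetical.

```python
from collections import Counter

def propensities(sequence, structure):
    """Return {(residue, state): P(r,s) / (P(r) * P(s))} from two aligned strings."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    n = len(sequence)
    pair_counts = Counter(zip(sequence, structure))   # N(r, s)
    residue_counts = Counter(sequence)                # N(r)
    state_counts = Counter(structure)                 # N(s)
    return {
        (r, s): (count / n) / ((residue_counts[r] / n) * (state_counts[s] / n))
        for (r, s), count in pair_counts.items()
    }

# Example from the slide: 8 helix, 13 coil, 19 helix positions (N = 40).
seq = "ALAKSLAKPSDTLAKSDFREKWEWLKLLKALACCKLSAAL"
sst = "h" * 8 + "c" * 13 + "h" * 19
table = propensities(seq, sst)
print(round(table[("A", "h")], 2))  # (7/40) / ((8/40) * (27/40))
print(round(table[("A", "c")], 2))  # (1/40) / ((8/40) * (13/40))
```

A value above 1 means the residue is found in that structure more often than independence would predict; a value below 1 means less often.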

Secondary structure: Chou-Fasman propensity scale
Given a LARGE set of examples, a propensity value can be computed for each residue and each structure type
Name            P(H)   P(E)
Alanine         1.42   0.83
Arginine        0.98   0.93
Aspartic Acid   1.01   0.54
Asparagine      0.67   0.89
Cysteine        0.70   1.19
Glutamic Acid   1.51   0.37
Glutamine       1.11   1.10
Glycine         0.57   0.75
Histidine       1.00   0.87
Isoleucine      1.08   1.60
Leucine         1.21   1.30
Lysine          1.14   0.74
Methionine      1.45   1.05
Phenylalanine   1.13   1.38
Proline         0.57   0.55
Serine          0.77   0.75
Threonine       0.83   1.19
Tryptophan      1.08   1.37
Tyrosine        0.69   1.47
Valine          1.06   1.70

Secondary structure: Chou-Fasman propensity scale
Given a new sequence, a secondary structure prediction can be obtained by plotting the propensity values for each structure, residue by residue (values x 100):
Residue   T    S    P    T    A    E    L    M    R    S    T    G
P(H)     83   77   57   83  142  151  121  145   98   77   83   57
P(E)    119   75   55  119   83   37  130  105   93   75  119   75
Considering three secondary structures (H, E, C), the overall accuracy, as evaluated on an uncorrelated set of sequences with known structure, is very low: Q3 = 50-60%
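As a rough sketch, the residue-by-residue profile can be produced by a table lookup. The values are the ones listed in the propensity table of this deck; the `naive_call` rule (take the larger propensity if it exceeds 1, otherwise coil) is an assumption made here for illustration only and is not the full Chou-Fasman procedure.

```python
# (P(H), P(E)) per residue, copied from the propensity table in this deck.
CHOU_FASMAN = {
    "A": (1.42, 0.83), "R": (0.98, 0.93), "D": (1.01, 0.54), "N": (0.67, 0.89),
    "C": (0.70, 1.19), "E": (1.51, 0.37), "Q": (1.11, 1.10), "G": (0.57, 0.75),
    "H": (1.00, 0.87), "I": (1.08, 1.60), "L": (1.21, 1.30), "K": (1.14, 0.74),
    "M": (1.45, 1.05), "F": (1.13, 1.38), "P": (0.57, 0.55), "S": (0.77, 0.75),
    "T": (0.83, 1.19), "W": (1.08, 1.37), "Y": (0.69, 1.47), "V": (1.06, 1.70),
}

def propensity_profile(sequence):
    """Return one (residue, P(H), P(E)) tuple per position."""
    return [(r,) + CHOU_FASMAN[r] for r in sequence]

def naive_call(sequence, threshold=1.0):
    """Hypothetical rule: larger propensity wins if above threshold, else coil (C)."""
    calls = []
    for _, p_h, p_e in propensity_profile(sequence):
        if max(p_h, p_e) <= threshold:
            calls.append("C")
        elif p_h >= p_e:
            calls.append("H")
        else:
            calls.append("E")
    return "".join(calls)

for residue, p_h, p_e in propensity_profile("TSPTAELMRSTG"):
    print(residue, round(100 * p_h), round(100 * p_e))
print(naive_call("TSPTAELMRSTG"))
```

A per-residue rule of this kind ignores the sequence context entirely, which is one reason single-residue statistics reach only a low Q3.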

http://www.expasy.ch/cgi-bin/protscale.pl

Transmembrane alpha-helices: Kyte-Doolittle scale
It is computed taking into consideration the octanol-water partition coefficient, combined with the propensity of the residues to be found in known transmembrane helices
Ala:  1.800   Arg: -4.500   Asn: -3.500   Asp: -3.500   Cys:  2.500
Gln: -3.500   Glu: -3.500   Gly: -0.400   His: -3.200   Ile:  4.500
Leu:  3.800   Lys: -3.900   Met:  1.900   Phe:  2.800   Pro: -1.600
Ser: -0.800   Thr: -0.700   Trp: -0.900   Tyr: -1.300   Val:  4.200
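A scale like this is typically applied as a sliding-window average along the sequence. The sketch below assumes an odd window and reports the mean hydropathy at each window centre; the default of 19 residues is a value often quoted for transmembrane helices and is an assumption here, as is the function name.

```python
# Kyte-Doolittle values as listed on the slide, keyed by one-letter code.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Return (centre index, mean hydropathy) for every full window."""
    if window % 2 == 0 or window > len(sequence):
        raise ValueError("window must be odd and no longer than the sequence")
    half = window // 2
    scores = [KD[r] for r in sequence]
    return [
        (centre, sum(scores[centre - half:centre + half + 1]) / window)
        for centre in range(half, len(sequence) - half)
    ]

# Toy sequence from the topology slide, with a short window for readability.
for centre, value in hydropathy_profile("ALALMLCMLTYRHKELKLKLKK", window=7):
    print(centre, round(value, 2))
```

Stretches whose windowed average stays high are candidate membrane-spanning segments; the cut-off to use is a separate choice that this sketch leaves open.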

Second generation methods: GOR
The structure of a residue in a protein strongly depends on the sequence context
It is possible to estimate the influence of a residue in determining the structure of another residue close to it along the sequence. Usually windows from ±8 to ±13 residues around the central position are considered.
Coefficients P(A,s,i) estimate the contribution of the residue A in determining the structure s for a residue that is i positions apart along the sequence

Secondary structure: GOR method
Q3 = 65% (considering three secondary structures (H, E, C), and evaluating the overall accuracy on an uncorrelated set of sequences with known structure)
The contribution of each position in the window is independent of the other ones. No correlation among the positions in the window is taken into account.
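The independence assumption means the score of a structure at one position is simply a sum of per-position coefficients over the window. The sketch below illustrates that summation only; the coefficient values are made up for the example, and the dictionary layout `coeff[(residue, state, offset)]` is a hypothetical stand-in for the P(A,s,i) table.

```python
def gor_predict(sequence, coeff, states="HEC", half_window=8):
    """Predict one state per residue by summing window coefficients.

    coeff[(residue, state, offset)] is the contribution of `residue`, found
    `offset` positions from the centre, to `state` at the centre.
    Missing entries count as 0; ties go to the first state in `states`.
    """
    prediction = []
    for pos in range(len(sequence)):
        totals = {}
        for state in states:
            total = 0.0
            for offset in range(-half_window, half_window + 1):
                j = pos + offset
                if 0 <= j < len(sequence):
                    total += coeff.get((sequence[j], state, offset), 0.0)
            totals[state] = total
        prediction.append(max(states, key=totals.get))
    return "".join(prediction)

# Made-up coefficients, for illustration only.
toy_coeff = {
    ("A", "H", 0): 1.0, ("A", "H", 1): 0.5,
    ("V", "E", 0): 1.0, ("G", "C", 0): 1.0,
}
print(gor_predict("GAAV", toy_coeff, half_window=1))
```

Because each window position contributes independently, no term in the sum can express that two particular residues occurring together favour a structure.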

A more efficient method: Neural Networks
Alternative computing algorithm: analogies with the computation in the nervous system.
1) The nervous system is constituted of elementary computing units: neurons
2) The electric signal flows in a determined direction (dendrites -> axon) (Principle of dynamic polarization)
3) There is no cytoplasmic continuity among the neurons. Each neuron specifically communicates with some neighboring neurons by means of synapses (Principle of connective specificity)

Neural Networks can learn the mapping from sequence to secondary structure
Tools out of machine learning approaches
Data base subset with known mapping (training):
TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
EEEE..HHHHHHHHHHHH....HHHHHHHH.EEEE
Training -> General rules
New sequence -> Prediction

Neural network for secondary structure prediction
Sequence: M P I L K Q K P I H Y H P N H G E A K G
Input (window of 9 residues, one column per residue):
   K P I H Y H P N H
A  0 0 0 0 0 0 0 0 0
C  0 0 0 0 0 0 0 0 0
D  0 0 0 0 0 0 0 0 0
E  0 0 0 0 0 0 0 0 0
F  0 0 0 0 0 0 0 0 0
G  0 0 0 0 0 0 0 0 0
H  0 0 0 1 0 1 0 0 1
I  0 0 1 0 0 0 0 0 0
K  1 0 0 0 0 0 0 0 0
L  0 0 0 0 0 0 0 0 0
M  0 0 0 0 0 0 0 0 0
N  0 0 0 0 0 0 0 1 0
P  0 1 0 0 0 0 1 0 0
Q  0 0 0 0 0 0 0 0 0
R  0 0 0 0 0 0 0 0 0
S  0 0 0 0 0 0 0 0 0
T  0 0 0 0 0 0 0 0 0
V  0 0 0 0 0 0 0 0 0
W  0 0 0 0 0 0 0 0 0
Y  0 0 0 0 1 0 0 0 0
Output: C
Usually: input 17-23 residues, hidden neurons 4-15
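The binary input matrix can be built as follows. This is a minimal sketch of one-hot window encoding; treating positions beyond the chain termini as all-zero columns is an assumption made here, and the function name is hypothetical.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # row order used on the slide

def one_hot_window(sequence, centre, half_window=4):
    """Return a 20 x (2*half_window + 1) matrix as a list of rows.

    Row r, column c is 1 when the residue at window position c is ALPHABET[r].
    Positions outside the sequence give an all-zero column (assumption).
    """
    columns = []
    for j in range(centre - half_window, centre + half_window + 1):
        residue = sequence[j] if 0 <= j < len(sequence) else None
        columns.append([1 if a == residue else 0 for a in ALPHABET])
    return [list(row) for row in zip(*columns)]  # transpose to rows

seq = "MPILKQKPIHYHPNHGEAKG"
matrix = one_hot_window(seq, centre=seq.index("Y"))
for letter, row in zip(ALPHABET, matrix):
    print(letter, *row)
```

With the 9-residue window centred on Y this reproduces the matrix above; a network fed with such a window has 20 x 9 = 180 binary inputs.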

Input alphabet: ACDEFGHIKLMNPQRSTVWY.
Output units: H, E, L
Sequence:  D R Q G F V P A A Y V K K
Structure: L E E E E E E H H H E E E

Third generation methods: evolutionary information
Multiple sequence alignment:
 1  Y K D Y H S - D K K K G E L - -
 2  Y R D Y Q T - D Q K K G D L - -
 3  Y R D Y Q S - D H K K G E L - -
 4  Y R D Y V S - D H K K G E L - -
 5  Y R D Y Q F - D Q K K G S L - -
 6  Y K D Y N T - H Q K K N E S - -
 7  Y R D Y Q T - D H K K A D L - -
 8  G Y G F G - - L I K N T E T T K
 9  T K G Y G F G L I K N T E T T K
10  T K G Y G F G L I K N T E T T K
Sequence profile (% of each residue at each position):
Position
      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
A     0   0   0   0   0   0   0   0   0   0   0  10   0   0   0   0
C     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
D     0   0  70   0   0   0   0  60   0   0   0   0  20   0   0   0
E     0   0   0   0   0   0   0   0   0   0   0   0  70   0   0   0
F     0   0   0  10   0  33   0   0   0   0   0   0   0   0   0   0
G    10   0  30   0  30   0 100   0   0   0   0  50   0   0   0   0
H     0   0   0   0  10   0   0  10  30   0   0   0   0   0   0   0
K     0  40   0   0   0   0   0   0  10 100  70   0   0   0   0 100
I     0   0   0   0   0   0   0   0  30   0   0   0   0   0   0   0
L     0   0   0   0   0   0   0  30   0   0   0   0   0  60   0   0
M     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
N     0   0   0   0  10   0   0   0   0   0  30  10   0   0   0   0
P     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Q     0   0   0   0  40   0   0   0  30   0   0   0   0   0   0   0
R     0  50   0   0   0   0   0   0   0   0   0   0   0   0   0   0
S     0   0   0   0   0  33   0   0   0   0   0   0  10  10   0   0
T    20   0   0   0   0  33   0   0   0   0   0  30   0  30 100   0
V     0   0   0   0  10   0   0   0   0   0   0   0   0   0   0   0
W     0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Y    70  10   0  90   0   0   0   0   0   0   0   0   0   0   0   0
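A profile of this kind can be derived from the alignment column by column. The sketch below assumes that gaps are left out of the denominator, which is what the 33/33/33 split at position 6 suggests; the function name is hypothetical and rounding to whole percentages is a choice made here.

```python
from collections import Counter

def alignment_profile(alignment, gap="-"):
    """Return one {residue: percent} dict per alignment column, ignoring gaps."""
    profile = []
    for column in zip(*alignment):
        counts = Counter(r for r in column if r != gap)
        total = sum(counts.values())
        if total == 0:
            profile.append({})  # column made of gaps only
        else:
            profile.append({r: round(100 * c / total) for r, c in counts.items()})
    return profile

# The ten aligned sequences shown on the slide.
alignment = [
    "YKDYHS-DKKKGEL--",
    "YRDYQT-DQKKGDL--",
    "YRDYQS-DHKKGEL--",
    "YRDYVS-DHKKGEL--",
    "YRDYQF-DQKKGSL--",
    "YKDYNT-HQKKNES--",
    "YRDYQT-DHKKADL--",
    "GYGFG--LIKNTETTK",
    "TKGYGFGLIKNTETTK",
    "TKGYGFGLIKNTETTK",
]
profile = alignment_profile(alignment)
for position, column in enumerate(profile, start=1):
    print(position, column)
```

Each column of the profile then replaces the single 1 of the one-hot encoding, so the network sees which substitutions evolution tolerates at that position.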

The Network Architecture for Secondary Structure Prediction
The First Network (Sequence to Structure)
Input (sequence profile):
SeqNo  No    V   L   I   M   F   W   Y   G   A   P   S   T   C   H   R   K   Q   E   N   D
    1   1   80   0  20   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    2   2    0   0   0   0   0   0   0   0   0  20   0   0   0   0   0   0   0   0   0  80
    3   3   50   0   0   0   0   0   0   0  33   0   0   0   0   0   0   0   0  17   0   0
    4   4    0   0   0   0   0   0   0   0  13  63  13   0   0   0   0   0   0  13   0   0
    5   5   13   0   0   0   0   0   0  13  75   0   0   0   0   0   0   0   0   0   0   0
    6   6    0   0   0  13   0   0   0   0   0  13   0  13   0   0   0   0   0   0   0  63
    7   7    0   0   0  38   0   0   0  38   0   0   0   0   0   0   0  25   0   0   0   0
    8   8   25  13   0   0   0   0   0   0  50   0  13   0   0   0   0   0   0   0   0   0
    9   9    0  13  13   0   0   0   0   0   0  25   0   0   0   0   0  50   0   0   0   0
   10  10    0   0  25  13   0   0   0   0  13  13   0   0   0   0   0  38   0   0   0   0
   11  11    0   0   0   0   0   0   0   0  25   0   0   0   0   0   0  13  13   0   0  50
   12  12    0   0   0   0  43   0   0  29   0  29   0   0   0   0   0   0   0   0   0   0
   13  13    0  14  29   0   0   0   0   0  29   0   0   0   0   0   0   0   0  14   0  14
   14  14    0   0   0   0   0   0   0  43  29   0   0   0   0   0   0  29   0   0   0   0
Output units: H E C
Output: CCHHEHHHHCHHCCEECCEEEEHHHCC

The Network Architecture for Secondary Structure Prediction
The Second Network (Structure to Structure)
Input: CCHHEHHHHCHHCCEECCEEEEHHHCC
Output units: H E C

The Performance on the Task of Secondary Structure Prediction
The cross validation procedure
Protein set -> Training set 1 + Testing set 1
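The splitting step of cross validation can be sketched as below. This shows a generic k-fold split only; how the original study assigned proteins to sets (for example, keeping homologous chains in the same set) is not stated on the slide, and the function name is hypothetical.

```python
def k_fold_splits(items, k):
    """Yield (training set, testing set) pairs; every item is tested exactly once."""
    if not 1 < k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    folds = [items[i::k] for i in range(k)]  # round-robin assignment to k folds
    for i in range(k):
        testing = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, testing

proteins = [f"protein_{i}" for i in range(10)]  # placeholder identifiers
for n, (training, testing) in enumerate(k_fold_splits(proteins, 5), start=1):
    print(f"set {n}: train on {len(training)}, test on {testing}")
```

The accuracy reported for the predictor is then the average over the k testing sets, none of which was seen during the corresponding training run.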

Efficiency of the Neural Network-Based Predictors on the 822 Proteins of the Testing Set
Input                          Q3 (%)  SOV   Q[H]  Q[E]  Q[C]  P[H]  P[E]  P[C]  C[H]  C[E]  C[C]
Single sequence                66.3    0.62  0.69  0.61  0.66  0.70  0.54  0.71  0.54  0.44  0.45
Multiple sequence (MaxHom)     72.4    0.69  0.75  0.65  0.75  0.77  0.64  0.73  0.64  0.54  0.53
Multiple sequence (PSI-BLAST)  73.4    0.70  0.75  0.70  0.73  0.80  0.63  0.75  0.67  0.56  0.53
Combining different networks: Q3 = 76-78%
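The per-residue indices in the table can be computed from an observed and a predicted structure string. The sketch assumes the usual definitions: Q3 is the percentage of correctly predicted residues, Q[s] the fraction of observed s that is predicted as s, P[s] the fraction of predicted s that is correct, and C[s] the Matthews correlation coefficient. SOV, the segment overlap measure, is not covered here.

```python
import math

def per_residue_scores(observed, predicted, states="HEC"):
    """Return Q3 (percent) plus Q, P and C for each state."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty strings of equal length")
    n = len(observed)
    pairs = list(zip(observed, predicted))
    result = {"Q3": 100 * sum(o == p for o, p in pairs) / n}
    for s in states:
        tp = sum(o == s and p == s for o, p in pairs)
        fn = sum(o == s and p != s for o, p in pairs)
        fp = sum(o != s and p == s for o, p in pairs)
        tn = n - tp - fn - fp
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        result[s] = {
            "Q": tp / (tp + fn) if tp + fn else float("nan"),
            "P": tp / (tp + fp) if tp + fp else float("nan"),
            "C": (tp * tn - fp * fn) / denom if denom else float("nan"),
        }
    return result

# Made-up strings, for illustration only.
scores = per_residue_scores("HHHHEEEECC", "HHHCEEECCC")
print(scores["Q3"], scores["H"], scores["C"])
```

Q and P can differ widely for the same state, which is why the table reports both: a predictor that rarely calls strands may have a high P[E] and a low Q[E].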

Secondary Structure Prediction
From sequence:
TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
To secondary structure:
EEEE..HHHHHHHHHHHH....HHHHHHHH.EEEE...........
And to the reliability of the prediction:
7997688899999988776886778999887679956889999999

SERVERS
PredictProtein: Burkhard Rost (Columbia Univ.) http://cubic.bioc.columbia.edu/predictprotein/
PsiPRED: David Jones (UCL) http://bioinf.cs.ucl.ac.uk/psipred/
JPred: Geoff Barton (Dundee Univ.)
SecPRED: http://www.biocomp.unibo.it

Chameleon sequences
QEALEIA
1TIF: Translation Initiation Factor 3, Bacillus stearothermophilus
……GIKSKQEALEIAARRN……
1WTUA: Transcription Factor 1, Bacteriophage Spo1
……FNPQTQEALEIAPSVGV……

From a set of 822 non-homologous proteins (174,192 residues) we extract:
• 2,452 5-mer chameleons
• 107 6-mer chameleons
• 16 7-mer chameleons
• 1 8-mer chameleon
In total: 2,576 couples
The total number of residues in chameleons is 26,044, from 755 protein chains (~15% of all residues)
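A search for such couples between two chains can be sketched as follows. The working definition used here, an identical k-mer whose secondary structure strings differ, is an assumption; the criteria of the original study are not given on the slide. Only the structure of the shared segment is taken from the deck (helix in one chain, coil in the other); the flanking positions are filled with a `?` placeholder.

```python
def find_chameleons(chain_a, chain_b, k):
    """Return (k-mer, structure in a, structure in b) for identical k-mers
    whose secondary structure strings differ. Chains are (sequence, structure)."""
    seq_a, ss_a = chain_a
    seq_b, ss_b = chain_b
    index = {}
    for i in range(len(seq_a) - k + 1):
        index.setdefault(seq_a[i:i + k], []).append(i)
    found = []
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        for i in index.get(kmer, []):
            if ss_a[i:i + k] != ss_b[j:j + k]:
                found.append((kmer, ss_a[i:i + k], ss_b[j:j + k]))
    return found

def annotate(sequence, segment, state):
    """Hypothetical helper: mark `segment` with `state`, everything else with '?'."""
    start = sequence.index(segment)
    return "?" * start + state * len(segment) + "?" * (len(sequence) - start - len(segment))

if3 = "NGDQLGIKSKQEALEIAARRNLDLVLVAP"
tf1 = "ARKGFNPQTQEALEIAPSVGVSVKPG"
couples = find_chameleons(
    (if3, annotate(if3, "QEALEIA", "H")),
    (tf1, annotate(tf1, "QEALEIA", "C")),
    k=7,
)
print(couples)
```

Running the same search over all pairs of chains in a non-redundant set, for k from 5 upwards, gives counts of the kind listed above.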

Prediction of the Secondary Structure of Chameleon sequences with Neural Networks
NGDQLGIKSKQEALEIAARRNLDLVLVAP
QEALEIA -> HHHHHHH
ARKGFNPQTQEALEIAPSVGVSVKPG
QEALEIA -> CCCCCCC

The Prediction of Chameleons with Neural Networks

Other neural network-based predictors
• Secondary structure
• Topology of transmembrane proteins
• Cysteine bonding state
• Contact maps of proteins
• Interaction sites on protein surface

Prediction of the cysteine bonding state
Tryparedoxin-I from Crithidia fasciculata (1QK8)
Cysteines: Cys40, Cys43, Cys68
[Figure: 3D structure with free cysteines and disulphide bonded cysteines highlighted]
MSGLDKYLPGIEKLRRGDGEVEVKSLAGKLVFFYFSASWCPPCRGFTPQLIEFYDKFHES KNFEVVFCTWDEEEDGFAGYFAKMPWLAVPFAQSEAVQKLSKHFNVESIPTLIGVDADSG DVVTTRARATLVKDPEGEQFPWKDAP

A neural network-based method for predicting the disulfide connectivity in proteins

The Protein Folding
TTCCPSIVARSNFNVCRLPGTPEALCATYTGCIIIPGATCPGDYAN
The Protein Folding
RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGGCRAKRNNFKSAEDCMRTCGGA

Disulfide bonds
2 -SH -> -S-S- + 2 H+ + 2 e-
S-S distance: 2.2 Å
Torsion angle C-S-S-C: 90°
Bond energy: 3 kcal/mol

Intra-chain disulfide bonds in proteins
Of 1259 proteins (a non-redundant PDB subset):
• 23% of the chains have disulfide bonds (SS)
• SS distribution (between secondary structures), %:
      H    E    C
H     7    9   14
E         17   27
C              26

Intra-chain disulfide bonds in proteins
Distribution of disulfide bonds in the SCOP domains
• Distribution:
Type             %
All-α           13
All-β           31
α/β             11
α+β             13
Small domains   29
Others           3
• 99% of the disulfide bonds are intra-domain

Prediction of the disulfide-bonding state of cysteines in proteins
Problem no. 1:
Starting from the protein sequence, can we discriminate whether a cysteine residue is disulfide-bonded?

Perceptron (input: sequence profile)
Input: NGDQLGIKSKQEALCIAARRNLDLVLVAP
Output: bonded / non bonded

Plotting the trained weights
[Figure: Hinton's plots of the trained weights for the bonding state and for the non bonding state. Axes: residue (V L I M F W Y G A P S T C H R K Q E N D 0 & #) versus position in the window (-5 to 5)]

Is it possible to add a syntax?
[Figure: state diagram with Begin, states 1 to 4 (bonded states and free states) and End]

A path
[Figure: state diagram with Begin, states 1 to 4 and End]
Residue   State   Bonding state
C40       1       F
C43       2       B
C68       4       B
P(seq) = P(1 | Begin) P(C40 | 1) ... P(2 | 1) P(C43 | 2) ... P(4 | 2) P(C68 | 4) ... P(End | 4)

4 possible paths
(state and bonding state assigned to each cysteine; every path runs from Begin to End)
Path   C40   C43   C68
1      1 F   1 F   1 F
2      1 F   2 B   4 B
3      2 B   4 B   1 F
4      2 B   3 F   4 B
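The effect of the syntax can be checked by enumerating the paths a small state machine allows. The transition table below is an assumption: it contains only the transitions that can be read off the four paths listed above, and the original model may allow more. States 1 and 3 are taken as free (F), states 2 and 4 as bonded (B).

```python
# Allowed transitions, inferred from the four listed paths (assumption).
NEXT = {"Begin": (1, 2), 1: (1, 2, "End"), 2: (3, 4), 3: (4,), 4: (1, "End")}
LABEL = {1: "F", 2: "B", 3: "F", 4: "B"}

def paths(n_cysteines, state="Begin", prefix=()):
    """Yield every state sequence of length n_cysteines that can reach End."""
    if len(prefix) == n_cysteines:
        if "End" in NEXT[state]:
            yield prefix
        return
    for nxt in NEXT[state]:
        if nxt != "End":
            yield from paths(n_cysteines, nxt, prefix + (nxt,))

for path in paths(3):
    print(path, "".join(LABEL[s] for s in path))
```

For three cysteines the enumeration gives four paths, each with an even number of bonded cysteines, whereas three independent B/F decisions would allow eight labelings.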

Prediction of bonding state of cysteines: hybrid system
MYSFPNSFRFGWSQAGFQCEMSTPGSEDPNTDWYKWVHDPENMAAGLCSGDLPENGPGYWGNYKTFHDNAQKMCLKIARLNVEWSRIFPNP...
Windows W1, W2, W3 along the sequence -> neural network outputs P(B|W1), P(F|W1); P(B|W2), P(F|W2); P(B|W3), P(F|W3)
Hidden Markov model (Begin, free Cys states, bonded Cys states, End) -> Viterbi path

Prediction for Tryparedoxin
Residue   NN output B   NN output F   NN pred   HMM Viterbi path   HMM pred
C40       99            1             B         2                  B
C43       82            18            B         4                  B
C68       61            39            B         1                  F
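The way the syntax changes the network's call for C68 can be reproduced with a small Viterbi search. This is a sketch under stated assumptions: the transition table contains only the transitions readable from the four paths shown earlier, every allowed transition is given the same weight, and the network outputs are used directly as emission scores. The original hybrid system may weight things differently.

```python
import math

# Allowed transitions (assumption, inferred from the listed paths).
NEXT = {"Begin": (1, 2), 1: (1, 2, "End"), 2: (3, 4), 3: (4,), 4: (1, "End")}
LABEL = {1: "F", 2: "B", 3: "F", 4: "B"}

def viterbi(nn_outputs):
    """nn_outputs: one {"B": p, "F": p} dict per cysteine.
    Returns (best state path, bonding labels) among paths that can reach End."""
    best = {"Begin": (0.0, [])}  # state -> (log score, path ending in state)
    for out in nn_outputs:
        step = {}
        for prev, (score, path) in best.items():
            for state in NEXT[prev]:
                if state == "End" or out[LABEL[state]] <= 0:
                    continue
                cand = (score + math.log(out[LABEL[state]]), path + [state])
                if state not in step or cand[0] > step[state][0]:
                    step[state] = cand
        best = step
    finals = [v for s, v in best.items() if "End" in NEXT[s]]
    if not finals:
        raise ValueError("no admissible path")
    _, path = max(finals, key=lambda v: v[0])
    return path, [LABEL[s] for s in path]

# NN outputs for C40, C43, C68 from the table above, as fractions.
outputs = [{"B": 0.99, "F": 0.01}, {"B": 0.82, "F": 0.18}, {"B": 0.61, "F": 0.39}]
print(viterbi(outputs))
```

Taken one by one, the three outputs say B, B, B, which is an odd number of bonded cysteines. Under the path constraint the cheapest repair is to flip the least confident call, C68, to F.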

Performance
Table I. Performance of the NN predictor (20-fold cross validation): Neural Network
Set   Q2     C      Q(B)   Q(F)   P(B)   P(F)   Q2prot
WD    80.4   0.56   67.2   87.5   74.3   83.2   56.9
RD    80.1   0.56   67.2   87.6   75.7   82.2   49.7
Table II. Performance of the Hidden NN predictor (20-fold cross validation): Hybrid system
Set   Q2     C      Q(B)   Q(F)   P(B)   P(F)   Q2prot
WD    88.0   0.73   78.1   93.3   86.3   88.8   84.0
RD    87.4   0.73   78.1   92.8   86.3   88.0   80.2
B = cysteine bonding state, F = cysteine free state. WD = whole database (969 proteins, 4136 cysteines). RD = reduced database, in which the chains containing only one cysteine are removed (782 proteins, 3949 cysteines).
Martelli PL, Fariselli P, Malaguti L, Casadio R. Prediction of the disulfide bonding state of cysteines in proteins with hidden neural networks. Protein Eng. 15:951-953 (2002)

Prediction of the connectivity of disulfide bonds in proteins
Problem no. 2:
When the bonding state of cysteines is known, can we predict the connectivity pattern of disulfide bonds?

Prediction of disulfide connectivity in proteins
Bovine trypsin inhibitor (6PTI)
Cysteines along the sequence: 5 14 30 38 51 55
[Figure: connectivity pattern drawn on the sequence and on the 3D structure (N to C), with cysteines 5, 14, 30, 38, 51 and 55 labelled]

Prediction of disulfide connectivity in proteins as a problem of maximum-weight perfect matching
Representation:
The undirected weighted graph with V = 2B vertices (number of cysteines) and E = 2B(2B-1)/2 undirected edges (strength of the interaction W)
[Figure: protein sequence (N to C) with Cys1 to Cys4 drawn as a complete graph; edges W12, W13, W14, W23, W24, W34]

From the Graph Theory:
• It is not necessary to compute all the possible connectivity patterns, $\prod_{i \le B} (2i-1)$
• Given a complete graph G = (2B, E), the matching with the maximum weight can be computed in $O(B^3)$ time with the Edmonds-Gabow algorithm*
* Gabow, H.N. (1975). Technical Report CU-CS-075-75, Dept. of Comp. Sci., Colorado University
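To make the size of the search space concrete, the sketch below counts the connectivity patterns and finds the best one by exhaustive enumeration. This is deliberately not the Edmonds-Gabow algorithm: it is a brute-force reference that is only practical for a handful of cysteines, and the edge weights in the example are made up.

```python
def count_patterns(b):
    """Number of ways to pair 2*b cysteines: product of (2i - 1) for i = 1..b."""
    total = 1
    for i in range(1, b + 1):
        total *= 2 * i - 1
    return total

def perfect_matchings(vertices):
    """Yield every way to pair up an even-sized, sorted list of vertices."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail

def best_matching(weights):
    """weights: {(i, j): w} with i < j on a complete graph. Exhaustive search."""
    vertices = sorted({v for edge in weights for v in edge})
    return max(perfect_matchings(vertices),
               key=lambda m: sum(weights[e] for e in m))

# Made-up weights for four cysteines (B = 2), for illustration only.
toy_weights = {(1, 2): 0.1, (1, 3): 0.8, (1, 4): 0.3,
               (2, 3): 0.2, (2, 4): 0.9, (3, 4): 0.1}
print([count_patterns(b) for b in range(1, 6)])
print(best_matching(toy_weights))
```

The pattern count grows as 1, 3, 15, 105, 945 for B = 1 to 5, so the chance of guessing the right pattern at random falls quickly, and a polynomial-time matching algorithm becomes worthwhile for chains with many bonds.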

How to assign the costs (W) of the edges in the graph
[Figure: protein sequence (N to C) with Cys1 to Cys4 and the candidate pairings over the edges W12, W13, W14, W23, W24, W34]

Assumption: for each cysteine, all its sequence nearest neighbours make contacts
[Figure: Cys i with its neighbours (Ni) and Cys j with its neighbours (Nj) along the chain (N to C); all possible interactions using 1 nearest neighbour]
![Page 58: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/58.jpg)
Frequency distribution of disulfide bonds with respect to sequence separation (726 proteins)
[Figure: x axis, sequence separation (0 to 450); y axis, frequency (%) (0 to 16)]
![Page 59: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/59.jpg)
Neural Networks for predicting the edge values
Disulfide pair propensity (output = wij)
Input (212 nodes): each pair in the neighbours of 4 residues + sequence separation + number of SS bonds (210 + 2 input nodes)
Hidden nodes (6 nodes)
Output (1 node)
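A minimal sketch of how a 212-6-1 network of this shape could be fed. The encoding is an assumption, not taken from the slide: the 210 inputs are read as the 20 × 21 / 2 unordered residue-pair types, counted between two windows of 4 neighbours on each side of the two cysteines, and the last two inputs are the sequence separation and the number of SS bonds. The weights are random and untrained, biases are omitted for brevity, and the two windows are hypothetical.

```python
import math
import random
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# 20 * 21 / 2 = 210 unordered residue pairs, one input node each (assumed encoding).
PAIR_INDEX = {pair: k for k, pair in
              enumerate(combinations_with_replacement(AMINO_ACIDS, 2))}


def encode_pair(window_i, window_j, separation, n_bonds):
    """Build the 210 + 2 input vector for one candidate cysteine pair."""
    x = [0.0] * len(PAIR_INDEX)
    for a in window_i:
        for b in window_j:
            x[PAIR_INDEX[tuple(sorted((a, b)))]] += 1.0
    return x + [float(separation), float(n_bonds)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """One forward pass: 212 inputs -> 6 hidden nodes -> 1 output (w_ij)."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))


rng = random.Random(0)  # untrained placeholder weights
w_hidden = [[rng.uniform(-0.01, 0.01) for _ in range(212)] for _ in range(6)]
w_out = [rng.uniform(-0.01, 0.01) for _ in range(6)]

# Hypothetical 9-residue windows centred on two cysteines.
x = encode_pair("PSIVCRSNF", "GTPECIATY", separation=10, n_bonds=3)
w_ij = forward(x, w_hidden, w_out)
```

The output w_ij would then be used as the weight of the edge between cysteines i and j in the matching graph.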
![Page 60: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/60.jpg)
Accuracy (Qp) of EG vs NN
Chains   B   Random   EG     NN
158      2   0.333    0.46   0.68
153      3   0.067    0.17   0.21
103      4   0.009    0.11   0.20
44       5   0.001    0.00   0.02
![Page 61: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/61.jpg)
The state of the art:
•Prediction of bonding states is quite satisfactory
•Prediction of connectivity needs to be improved
![Page 62: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/62.jpg)
Prediction of Foldons
Piero Fariselli
![Page 63: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/63.jpg)
The Folding Problem as a Mapping Problem
Covalent structure: TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
Secondary structure: EEEE..HHHHHHHHHHHH....HHHHHHHH.EEEE...........
3D structure
[Figure: 3D structure with the Nt and Ct termini marked]
![Page 64: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/64.jpg)
From the PDB database we can collect some 1500 chains of known structure, from which to derive non-redundant information relating sequence to:
• secondary structure
• structural and functional motifs
• 3D structure
![Page 65: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/65.jpg)
Evolutionary information
• Multiple Sequence Alignment (MSA) of similar sequences
• Sequence profile: for each position a 20-valued vector contains the amino acid composition of the aligned sequences.

MSA
 1  Y K D Y H S - D K K K G E L - -
 2  Y R D Y Q T - D Q K K G D L - -
 3  Y R D Y Q S - D H K K G E L - -
 4  Y R D Y V S - D H K K G E L - -
 5  Y R D Y Q F - D Q K K G S L - -
 6  Y K D Y N T - H Q K K N E S - -
 7  Y R D Y Q T - D H K K A D L - -
 8  G Y G F G - - L I K N T E T T K
 9  T K G Y G F G L I K N T E T T K
10  T K G Y G F G L I K N T E T T K

Sequence profile (rows: residue type; columns: sequence position)
A    0   0   0   0   0   0   0   0   0   0   0  10   0   0   0   0
C    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
D    0   0  70   0   0   0   0  60   0   0   0   0  20   0   0   0
E    0   0   0   0   0   0   0   0   0   0   0   0  70   0   0   0
F    0   0   0  10   0  33   0   0   0   0   0   0   0   0   0   0
G   10   0  30   0  30   0 100   0   0   0   0  50   0   0   0   0
H    0   0   0   0  10   0   0  10  30   0   0   0   0   0   0   0
K    0  40   0   0   0   0   0   0  10 100  70   0   0   0   0 100
I    0   0   0   0   0   0   0   0  30   0   0   0   0   0   0   0
L    0   0   0   0   0   0   0  30   0   0   0   0   0   0   0   0
M    0   0   0   0   0   0   0   0   0   0   0   0   0  60   0   0
N    0   0   0   0  10   0   0   0   0   0  30  10   0   0   0   0
P    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Q    0   0   0   0  40   0   0   0  30   0   0   0   0   0   0   0
R    0  50   0   0   0   0   0   0   0   0   0   0   0   0   0   0
S    0   0   0   0   0  33   0   0   0   0   0   0  10  10   0   0
T   20   0   0   0   0  33   0   0   0   0   0  30   0  30 100   0
V    0   0   0   0  10   0   0   0   0   0   0   0   0   0   0   0
W    0  10   0   0   0   0   0   0   0   0   0   0   0   0   0   0
Y   70   0   0  90   0   0   0   0   0   0   0   0   0   0   0   0
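The step from an MSA to a sequence profile can be sketched as follows. The function name is hypothetical, and one choice is an assumption: gaps are ignored and each column is normalised over its non-gap residues, which is what the first and seventh profile columns of the slide suggest. The values computed here need not match every entry printed on the slide.

```python
from collections import Counter

# The ten aligned sequences shown on the slide, one string per row.
MSA = [
    "YKDYHS-DKKKGEL--",
    "YRDYQT-DQKKGDL--",
    "YRDYQS-DHKKGEL--",
    "YRDYVS-DHKKGEL--",
    "YRDYQF-DQKKGSL--",
    "YKDYNT-HQKKNES--",
    "YRDYQT-DHKKADL--",
    "GYGFG--LIKNTETTK",
    "TKGYGFGLIKNTETTK",
    "TKGYGFGLIKNTETTK",
]


def sequence_profile(msa):
    """Return one dict per alignment column: residue -> frequency in percent."""
    profile = []
    for col in range(len(msa[0])):
        counts = Counter(seq[col] for seq in msa if seq[col] != "-")
        total = sum(counts.values())
        # An all-gap column yields an empty dict.
        profile.append({aa: 100.0 * n / total for aa, n in counts.items()})
    return profile


profile = sequence_profile(MSA)
```

Each column dict can be expanded into a 20-valued vector by listing the residue types in a fixed order and filling missing types with zero.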
![Page 66: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/66.jpg)
The Folding Process
Prediction of Initiation Sites of Protein Folding
[Figure: the unfolded chain; the early stages of folding (initiation sites); the folded protein]
![Page 67: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/67.jpg)
Frustration in proteins
• The simultaneous minimisation of all the interaction energies is impossible
![Page 68: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/68.jpg)
The network architecture
[Figure: feed-forward network with an input window sliding along the sequence (..ALS.......QGFLLIARQPPFTYFTV......HW..), a hidden layer, and two outputs (α and non-α)]
![Page 69: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/69.jpg)
The prediction efficiency of the network
Q2 = 0.85   Q(H) = 0.67    Q(nonH) = 0.93    Sov_pred = 0.85
C = 0.63    Pc(H) = 0.80   Pc(nonH) = 0.86   Sov_obs = 0.76
![Page 70: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/70.jpg)
Theoretical background
The conformation of residue R depends both on local (window W) and non-local (context C) interactions.
The convergence theorem ensures that: $O_i = P(\mathrm{Structure}_R = i \mid W)$
If, for any i, $O_i \approx 1$, then the structure of residue R depends mainly on W and only slightly on C.
[Figure: the neural network reads the window W centred on residue R, embedded in the context C, and returns the outputs O_α and O_non-α]
![Page 71: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/71.jpg)
• Anfinsen's hypothesis (residue R, window W, context C):
$P(\alpha_i \mid W, C) = \delta(\alpha_i, \alpha_{nat}(W, C))$
• Averaging over all the contexts (performed by NN):
$P(\alpha_i \mid W) = \sum_C P(\alpha_i \mid W, C)\, P(C)$
• When the pattern is self-stabilising (W dependent):
$P(\alpha_i \mid W, C) = P(\alpha_i \mid W)$
• Then Anfinsen's hypothesis can be cast in a local form:
$P(\alpha_i \mid W) = \sum_C P(\alpha_i \mid W, C)\, P(C) = \delta(\alpha_i, \alpha_{nat}(W))$
![Page 72: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/72.jpg)
Relationship between the reliability index and the Shannon entropy
[Figure: x axis, entropy (S5) (0 to 0.7); y axis, reliability index (0 to 1)]
![Page 73: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/73.jpg)
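$S = -\sum_i O_i \log O_i$
[Figure: the network reads the INPUT window sliding along the sequence (MAS..... QLMLKDFLNRTPL.........GHI) and returns the outputs O_α and O_non-α, from which S is computed at each position]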
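The entropy of the network outputs can be sketched in a few lines. Two points are assumptions rather than statements from the slides: the outputs are rescaled to sum to 1 before taking the entropy, and the smoothed entropy S5 is read as a running average over 5 neighbouring positions. The output values in the example are hypothetical.

```python
import math


def shannon_entropy(outputs):
    """S = -sum_i O_i ln O_i over the network outputs for one residue."""
    total = sum(outputs)
    probs = [o / total for o in outputs]  # assumed normalisation
    return -sum(p * math.log(p) for p in probs if p > 0)


def smoothed(values, width=5):
    """Running average over `width` positions, truncated at the chain ends."""
    half = width // 2
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical (O_alpha, O_non-alpha) outputs along a short stretch of chain.
outputs = [(0.9, 0.1), (0.95, 0.05), (0.5, 0.5), (0.2, 0.8), (0.1, 0.9)]
entropy = [shannon_entropy(o) for o in outputs]
s5 = smoothed(entropy)
```

A confident prediction (one output close to 1) gives an entropy close to 0, while an undecided one (both outputs near 0.5) gives the maximum, ln 2.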
![Page 74: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/74.jpg)
Protein segments correctly predicted in α-helical structure
Entropy = Shannon entropy in (ln 2)/10 units ($S = -\sum_i o_i \ln o_i$)
NC = Number of protein segments correctly predicted in α-helix
NT = Total number of protein segments predicted in α-helix
[Figure: NC / NT (%) (0 to 100) as a function of entropy (1 to 9) and segment length (1 to 29)]
![Page 75: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/75.jpg)
Profile of the smoothed entropy (S5) for the hen egg lysozyme (132L)
[Figure: x axis, protein chain (residues 1 to 121); y axis, S5 (0 to 0.7); series: entropy, predicted helices, extracted fragments]
![Page 76: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/76.jpg)
Hen egg lysozyme (132L)
C-terminus
N-terminus
![Page 77: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/77.jpg)
Frequency distribution of predicted helical segments as a function of their entropy value
[Figure: x axis, entropy (S5) (0.0 to 0.7); y axis, frequency (-0.1 to 0.2); series: correct, wrong, differences; the threshold value is marked]
![Page 78: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/78.jpg)
An example from the database of minimally frustrated protein fragments
http://www.biocomp.unibo.it/DB/
![Page 79: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/79.jpg)
Training set from PDB
Number of proteins   Number of amino acids   Number of α-helices   Average length
822                  174191                  4783                  11.6

Data base of minimally frustrated α-helical segments
Number of proteins   Number of amino acids   Number of α-helical segments   Average length
626                  21553                   3000                           7.2
![Page 80: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/80.jpg)
Comparison of minimally frustrated segments with putative folding initiation sites experimentally determined
*Not yet experimentally detected
![Page 81: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/81.jpg)
Comparison of minimally frustrated segments with peptides extracted from proteins
Code*     Peptides*                     % Helix in solution*   Entropy (S5)   Extracted segment
3FXC      TYKVTELINEAEGINETIDCDD        1                      #####          ####
3LZM      GFTNSLRMLQQKRWDEAVNLAKS       10                     0.262          WDEAVNL
“                                       10                     0.329          LRMLQQK
3LZM-2    GVAGFTNSLRMLQQKRWDEAAVNLAKS   12                     0.203          SLRMLQ
“                                       12                     0.210          DEAAVNL
CIII      ESLLERITRKLRDGWKRLIDIL        8                      0.171          LLERIT
“                                       8                      0.260          WKRLID
CIII-L    ESLLERITRKL                   15                     0.171          LLERIT
CIII-R    RDGWKRLIDIL                   4                      0.260          WKRLID
CIII-M    RITRKLRDGWK                   2                      ####           ####
Sigma     KVATTKAQRKLFFNLRKTKQRL        9                      0.218          TKAQRK
COMA1     DHPAVMEGTKTILETDSNLS          4                      ####           ####
COMA2     EPSEQFIKQHDFSSY               3                      ####           ####
COMA3     VNGMELSKQILQENPH              6                      0.189          LSKQILQ
COMA4     EVEDYFEEAIRAGLH               20                     0.020          YFEEAIR
COMA5     KEKITQYIYHVLNGEIL             3                      ####           ####
ARA1      AVGKSNLLSRYARNEFSA            2                      ####           ####
ARA2      RFRAVTSAYYRGAVG               3                      ####           ####
ARA3      TRRTTFESVGRWLDELKIHSD         7.5                    0.194          SVGRWL
ARA4      AVSVEEGKALAEEEGLF             4                      ####           ####
ARA5      STNVKTAFEMVILDIYNNV           3                      ####           ####
G1        DTYKLILNGKTLKGETTTEA          2                      ####           ####
G2        GDAATAEKVFKKIANDNGVD          4                      ####           ####
G3        GEWTYDDATKTFTVTE              2                      ####           ####
* Muñoz and Serrano, 1994.
![Page 82: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/82.jpg)
Minimally frustrated α-helical segments are useful for determining:
• Folding initiation sites
• α-helix stability
• de novo design of α-helices
![Page 83: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/83.jpg)
Structure prediction of membrane proteins
![Page 84: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/84.jpg)
![Page 85: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/85.jpg)
Inner membrane proteins (all-α transmembrane proteins)
Outer membrane proteins (all-β transmembrane proteins)
![Page 86: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/86.jpg)
Porin (Rhodobacter capsulatus): β-barrel, outer membrane
Bacteriorhodopsin (Halobacterium salinarum): α-helices, inner membrane
[Figure: the two structures embedded in the bilayer]
![Page 87: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/87.jpg)
Predictors of the Topology of Membrane Proteins
Topography: position of transmembrane segments along the sequence
Topology: position of N and C termini with respect to the bilayer
[Figure: a chain crossing the lipidic bilayer, with the N and C termini, the Out and In sides, and positively charged residues (+) marked]
![Page 88: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/88.jpg)
Prediction of transmembrane segments
![Page 89: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/89.jpg)
Neural Network for the prediction of TMS in β-barrel membrane proteins (Jacoboni et al., 2001)
Input: sequence profile
Window: 9 residues
5 hidden neurons
2 output neurons (TM, nonTM)
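A window of 9 residues over a 20-valued profile gives 9 × 20 = 180 input values per position. The sketch below shows one way to build those inputs; the function name is hypothetical, and padding the chain ends with all-zero vectors is an assumption, since the slide does not say how the termini are handled.

```python
def window_inputs(profile, width=9):
    """Return one flattened input vector per residue.

    profile is a list of equal-length vectors (one per sequence position).
    Positions outside the chain are padded with zeros (assumed convention).
    """
    half = width // 2
    pad = [0.0] * len(profile[0])
    inputs = []
    for i in range(len(profile)):
        vector = []
        for j in range(i - half, i + half + 1):
            vector.extend(profile[j] if 0 <= j < len(profile) else pad)
        inputs.append(vector)
    return inputs


# Hypothetical profile: 12 positions, each fully conserved for one residue type.
toy_profile = [[100.0 if n == t % 20 else 0.0 for n in range(20)]
               for t in range(12)]
inputs = window_inputs(toy_profile)
```

Each of the resulting vectors would be presented to the network, whose two outputs score the central residue as TM or nonTM.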
![Page 90: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/90.jpg)
A generic model for membrane proteins (TMHMM)
[Figure: HMM with the states Begin, Inner Side, Transmembrane, Outer Side, and End]
![Page 91: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/91.jpg)
Sequence-profile-based HMM
Sequence of characters $c_t$: ..A C L P R P E T ...
Sequence of A-dimensional vectors $s_t$: one sequence-profile column per position t, with components $s_t(n)$
For proteins A = 20
Constraints: $0 \le s_t(n) \le S \;\; \forall t, n$ and $\sum_{n=1}^{A} s_t(n) = S \;\; \forall t$, with S = 100
[Figure: a sequence of characters shown next to the corresponding sequence of profile vectors]
Martelli et al., Bioinformatics 18, S46-53, 2002
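The constraints on the profile vectors are easy to check in code. The sketch below also shows one way a state could "emit" a profile vector instead of a single character: the dot product between the vector and the state's emission probabilities, divided by S. That emission rule is an assumption added for illustration and is not stated on the slide; all names are hypothetical.

```python
def is_valid_profile_vector(s, S=100.0, tol=1e-6):
    """Check 0 <= s(n) <= S for every n and sum_n s(n) = S."""
    return all(0.0 <= v <= S for v in s) and abs(sum(s) - S) <= tol


def profile_emission(s, e_k, S=100.0):
    """Assumed emission score of state k for vector s: (1/S) * sum_n s(n) * e_k(n)."""
    return sum(sv * ev for sv, ev in zip(s, e_k)) / S


A = 20
# Hypothetical profile column: four residue types observed, summing to S = 100.
s_t = [0.0] * A
s_t[1], s_t[4], s_t[9], s_t[11] = 85.0, 5.0, 2.0, 8.0

# Hypothetical emission distribution of one state: uniform over the 20 residues.
e_uniform = [1.0 / A] * A

# A fully conserved column reduces to the ordinary single-character emission.
one_hot = [0.0] * A
one_hot[3] = 100.0
e_k = [0.5 if n == 3 else 0.5 / (A - 1) for n in range(A)]
```

With a one-hot column the score equals the state's emission probability for that residue, so an ordinary character-emitting HMM is recovered as a special case.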
![Page 92: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/92.jpg)
The new algorithms make it possible:
• to feed HMMs with sequence profiles
• to eventually couple NNs and HMMs (Hidden Neural Networks)
Advantages:
• Higher performance than standard HMMs
• Increased discrimination capability of a given class
Martelli et al., Bioinformatics, 2002
Martelli et al., Protein Eng. 2002
![Page 93: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/93.jpg)
Prediction of the Topology of α-Transmembrane Proteins
Topography: position of transmembrane helices along the sequence
Topology: position of N and C termini with respect to the bilayer
[Figure: a chain crossing the bilayer, with the N and C termini, the Out and In sides, and positively charged residues (+) marked]
The prediction accuracy of topography is 92 %
The prediction accuracy of topology is 81 %
![Page 94: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/94.jpg)
Prediction of the Topology of β-Transmembrane Proteins
Topography: position of transmembrane strands along the sequence
Topology: position of N and C termini with respect to the bilayer
[Figure: a chain crossing the bilayer, with the N and C termini, the LPS (Out) and periplasmic (In) sides, and positively charged residues (+) marked]
The prediction accuracy of topography is 73 %
The prediction accuracy of topology is 73 %
![Page 95: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/95.jpg)
The discriminative capability of the HMM model
$I(s \mid M) = -\frac{1}{L} \log P(s \mid M)$
[Figure: x axis, I(s | M) (2.75 to 2.95); y axis, percentage (0 to 100); series: outer membrane, globular, inner membrane]
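The score I(s | M) = -1/L log P(s | M) is the negative log-likelihood of a sequence under the model, divided by the sequence length L. A minimal sketch, assuming a plain character-emitting HMM without an explicit end state and natural logarithms, computes P(s | M) with the scaled forward algorithm. The one-state model below is a hypothetical toy, not the model of the slides.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def forward_log_likelihood(seq, start, trans, emit):
    """Return ln P(seq | model) using the forward algorithm with rescaling."""
    states = list(start)
    alpha = {k: start[k] * emit[k][seq[0]] for k in states}
    log_p = 0.0
    for symbol in seq[1:]:
        # Rescale to avoid underflow and accumulate the log of the scale factor.
        scale = sum(alpha.values())
        log_p += math.log(scale)
        alpha = {k: v / scale for k, v in alpha.items()}
        alpha = {k: emit[k][symbol] * sum(alpha[j] * trans[j][k] for j in states)
                 for k in states}
    return log_p + math.log(sum(alpha.values()))


def normalised_score(seq, start, trans, emit):
    """I(s | M) = -(1 / L) * ln P(s | M)."""
    return -forward_log_likelihood(seq, start, trans, emit) / len(seq)


# Toy model: a single background state emitting all 20 residues uniformly.
start = {"bg": 1.0}
trans = {"bg": {"bg": 1.0}}
emit = {"bg": {aa: 1.0 / 20 for aa in AMINO_ACIDS}}

score = normalised_score("ACDEFGHIKL", start, trans, emit)
```

For this uniform background model the score is ln 20 for any sequence; a model that fits a sequence better than the background gives a lower value, which is what allows the classes to be separated by a threshold on I(s | M).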
![Page 96: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/96.jpg)
An application: modeling the 3D structure of eukaryotic barrel
proteins
![Page 97: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/97.jpg)
3D structure prediction of proteins
[Figure: prediction methods along the homology axis (Homology (%), 0 to 100): ab initio prediction for new folds; threading/fold recognition and building by homology for existing folds; the position of membrane proteins is marked]
![Page 98: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/98.jpg)
2omf_.seq/ AEIYNKDGNK VDLYGKAVGL HYFSKGNGEN SYGGNGDMTY ARLGFKGETQ
2omf_.str/ CCCCCCCCEE EEEEEEEEEE EEECCCCCCC CCCCCCCCCE EEEEEEEEEE
protx.str/ *******CCC CCCCEEEEEE EEEC****** ********CE EEEEEEEECC
protx.seq/ *******KGY NFGLWKLDLK TKTS****** ********SG IEFNTAGHSN

2omf_.seq/ I*NSDLTGYG QWEYNFQGNN SEGADAQTGN KTRLAFAGLK YADVGSFDYG
2omf_.str/ C*CCCEEEEE EEEEEEECCC CCCCCCCCCC EEEEEEEEEE ECCCEEEEEE
protx.str/ CCCCCEEEEE EEEEEEC*** ********** EEEEEEEEEC CCCCCEEEEE
protx.seq/ QESGKVFGSL ETKYKVK*** ********** DYGLTLTEKW NTDNTLFTEV

2omf_.seq/ RNYGVVYDAL GYTDMLPEFG GDTAYSDDFF VGRVGGVATY RNSNFFGLVD
2omf_.str/ ECCCCCCCCC CCCCCCCCCC CCCCCCCCCC CCCCCCEEEE EECCCCCCCC
protx.str/ EEEECC**** ********** ********** **CCEEEEEE EEECCCCCCC
protx.seq/ AVQDQL**** ********** ********** **LEGLKLSL EGNFAPQSGN

2omf_.seq/ GLNFAVQYLG KNER****** *********D TARRSNGDGV GGSISYEYE*
2omf_.str/ CEEEEEEEEC CCCC****** *********C CCCCCCCCEE EEEEEEEEC*
protx.str/ EEEEEEEEEE EEEECCCCCC CCCCCCCEEE EEEEEEEEEE EEEEEEECCC
protx.seq/ KNGKFKVAYG HENVKADSDV NIDLKGPLIN ASAVLGYQGW LAGYQTAFDT

2omf_.seq/ **GFGIVGAY GAADRTNLQE AQPLGNGKKA EQWATGLKYD ANNIYLAANY
2omf_.str/ **CEEEEEEE EEEECCCCCC CCCCCCCCEE EEEEEEEEEE ECCEEEEEEE
protx.str/ CCEEEEEEEE EEEEEEEEEE EEECCCCCCC EEEEEEEEEE CEEEEEEEEE
protx.seq/ QQSKLTTNNF ALGYTTKDFV LHTAVNDGQE FSGSIFQRTS DKLDVGVQLS

2omf_.seq/ GETRNATPIT NKFTNTSGFA NKTQDVLLVA QYQFDFGLRP SIAYTKSKAK
2omf_.str/ EEEECCCCCC CCCCCCCCCC CEEEEEEEEE EEECCCCEEE EEEEEEEEEE
protx.str/ EEECC***** ********** *CCCEEEEEE EEECCCCEEE EEEEEEC***
protx.seq/ WASGT***** ********** *SNTKFAIGA KYQLDDDARV RAKVNNA***

2omf_.seq/ DVEGIGDVDL VNYFEVGATY YFNKNMSTYV DYIINQIDSD NKLGVGSDDT
2omf_.str/ CCCCCCCEEE EEEEEEEEEE ECCCCEEEEE EEEEECCCCC CCCCCCCCCE
protx.str/ *********E EEEEEEEEEE EC***EEEEE EEEEECCC** *****CCCCE
protx.seq/ *********S QVGLGYQQKL RT***GVTLT LSTLVDGK** *****NFNAG

2omf_.seq/ VAVGIVYQF* ***
2omf_.str/ EEEEEEEEE* ***
protx.str/ EEEEEEEEEE EC*
protx.seq/ GHKIGVGLEL EA*
Structural alignment of VDAC with the template
![Page 99: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/99.jpg)
A low-resolution 3D model of VDAC (the sequence from Neurospora crassa)
Casa
![Page 100: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/100.jpg)
A low-resolution 3D model of VDAC: location of mutated residues
Casadio et al., FEBS Lett 520:1-7 (2002)
![Page 101: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/101.jpg)
Predictors of membrane protein structures can be used to filter genomes and find new membrane proteins without sequence homologues
![Page 102: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/102.jpg)
FISHING NEW OUTER MEMBRANE PROTEINS IN
GRAM-NEGATIVE BACTERIA
![Page 103: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/103.jpg)
MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVSQLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESRHPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDIGSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKRITYRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNNNTSDYSKNGAGIENYNFITTAGLKYTF
Signal peptides in protein sequences:
Sequences of outer membrane proteins have signal peptides:
the secretion marker is also a marker of outer membrane proteins
Proteins have intrinsic signals that govern their transport and localization in the cell: a hydrophobic secretion marker (or signal peptide)
![Page 104: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/104.jpg)
MKLLQRGVALALLTTFTLASETALAYEQDKTYKITVLHTNDHHGHF
Signal Peptide Mature protein
Cleavage site
Signal Peptide prediction
![Page 105: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/105.jpg)
MKLLQRGVALALLTTFTLASETALAYEQDKTYKITVLHTNDHHGHF
2 Neural Networks
SignalNet: predicts whether a given residue position belongs to the signal peptide
CleavageNet: predicts whether a given residue position is the cleavage site
![Page 106: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/106.jpg)
SignalNet Accuracy
Organism        Window    C      Q2
Eukaryotes      15-1-15   0.83   0.95
Gram positive   15-1-15   0.79   0.92
Gram negative   11-1-11   0.78   0.92
![Page 107: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/107.jpg)
CleavageNet Accuracy
Organism        Window   C      Q2
Eukaryotes      15-1-2   0.61   0.97
Gram positive   20-1-3   0.56   0.96
Gram negative   11-1-2   0.62   0.96
![Page 108: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/108.jpg)
Comparison with SignalP
Organism                 SignalP   SPEP
Eukaryotes (+)           0.99      0.97
Eukaryotes (-)           0.85      0.94
Prokaryotes (+)          0.99      0.97
Prokaryotes (-)          0.93      0.96
Escherichia coli (+/-)   0.95      0.96
![Page 109: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/109.jpg)
Performance of SignalNN on 2160 annotated proteins
                              Prediction
Annotation       With signal   Without signal   Total
With signal      205           45               250
Without signal   55            1855             1910
Total            260           1900             2160
Correct predictions: 205 and 1855. Wrong predictions: 55 and 45.
Q2 = 96 %
Qsignal = 82 %   Qnon-signal = 97 %
Psignal = 78 %   Pnon-signal = 98 %
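The indices quoted for SignalNN can be recomputed from the four counts of the slide. In this sketch Q is read as the fraction of annotated cases that are recovered and P as the fraction of predictions that are right, which agrees with Qsignal = 205 / 250 = 82 %. The placement of the two wrong counts (45 missed signals, 55 false signals) is inferred from the row and column totals, the correlation coefficient C is taken to be the Matthews correlation, and the function name is hypothetical. Recomputed values can differ from the slide in the last digit.

```python
import math


def two_class_scores(tp, fn, fp, tn):
    """Accuracy indices for a two-class predictor from its confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "Q2": (tp + tn) / total,   # overall accuracy
        "Q_pos": tp / (tp + fn),   # coverage of the annotated positives
        "Q_neg": tn / (tn + fp),   # coverage of the annotated negatives
        "P_pos": tp / (tp + fp),   # reliability of the positive predictions
        "P_neg": tn / (tn + fn),   # reliability of the negative predictions
        "C": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Counts read from the slide; the split of 45 and 55 is inferred from the totals.
scores = two_class_scores(tp=205, fn=45, fp=55, tn=1855)
for name, value in scores.items():
    print(f"{name} = {100 * value:.1f} %")
```

The same function applies to any of the two-class predictors in these slides (helix / non-helix, TM / nonTM, signal / non-signal).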
![Page 110: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/110.jpg)
Predictors of Membrane Topography: Rate of false positives
The predictors are tested on 809 globular proteins with sequence identity below 25 %:
0.5 % have at least 1 α-TM helix predicted
5.6 % have at least 2 β-TM strands predicted
![Page 111: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/111.jpg)
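HUNTER
[Figure: decision tree applied to each sequence of the PROTEOME: a signal peptide test (Yes / No) followed by the all-α TM and all-β TM predictors (Yes / No), assigning each protein to the all-α TM, all-β TM, or globular class]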
![Page 112: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/112.jpg)
Predicting globular, inner and outer membrane proteins in genomes of Gram-negative bacteria with Hunter
* the number of new proteins predicted in the class with Hunter, out of the non-annotated region
![Page 113: Prediction of structural and functional features in proteins starting from the residue sequence INTRODUCTION TO NEURAL NETWORKS](https://reader030.vdocuments.site/reader030/viewer/2022032706/56649ddf5503460f94ad9009/html5/thumbnails/113.jpg)