predicting protein properties and structure

Post on 23-Jan-2016

Predicting Protein Properties and Structure

Rui Alves

Organization of the Talk

• From cDNA sequence to protein sequence.

• Analyzing the information in the protein sequence

• Predicting the fold (secondary structure) of a protein

• Predicting the (tertiary) structure of a protein

Predicting protein sequence from DNA sequence

• Protein sequence can be predicted by translating the cDNA and using the genetic code.

Translating cDNA into protein sequence

ATGTCTCTTATATGA…

MetSerLeuIleTer

No Gene!!!!!
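The translation step on this slide can be sketched in a few lines of Python. This is a minimal sketch, assuming the universal (standard) genetic code and a reading frame that starts at the first base. The function names `translate` and `three_letter` are illustrative and not from the talk.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code: amino acids for the 64 codons enumerated in
# TCAG order (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def translate(cdna: str) -> str:
    """Translate one reading frame, stopping after the first stop codon."""
    cdna = cdna.upper()
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE[cdna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def three_letter(protein: str) -> str:
    """Convert one-letter codes to the three-letter codes used on the slide."""
    return "".join(THREE_LETTER[aa] for aa in protein)


print(three_letter(translate("ATGTCTCTTATATGA")))  # MetSerLeuIleTer
```

The open reading frame ends after only four residues, which is why the slide concludes that this stretch is unlikely to be a gene.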

Translating cDNA to Protein

Translating yeast mitochondrial cDNA into protein sequence

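ATGTCTCTTATATGA………SECIS sequence

Yeast mitochondrial code: MetSerThrMetTrp (CTT reads as Thr, ATA as Met, TGA as Trp; where a SECIS sequence is present, TGA can instead encode selenocysteine, sCys)

Universal code: MetSerLeuIleTer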

There is a Gene with a considerably different protein sequence from the one we would predict from the universal genetic code!!!!!
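The effect of an alternative genetic code can be checked by swapping the codon table. This is a minimal sketch, assuming the codon assignments of NCBI translation table 1 (standard) and table 3 (yeast mitochondrial). The helper names are illustrative, and selenocysteine insertion via SECIS is not modeled.

```python
from itertools import product

CODONS = ["".join(c) for c in product("TCAG", repeat=3)]
# Amino acids per codon in TCAG order; "*" is a stop codon.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
YEAST_MITO = "FFLLSSSSYY**CCWWTTTTPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_table(amino_acids: str) -> dict:
    """Pair each codon with its amino acid for one genetic code."""
    return dict(zip(CODONS, amino_acids))


def translate_with(cdna: str, table: dict) -> str:
    """Translate every complete codon of the first reading frame."""
    cdna = cdna.upper()
    return "".join(table[cdna[i:i + 3]] for i in range(0, len(cdna) - 2, 3))


def reassigned_codons(code_a: dict, code_b: dict) -> dict:
    """Codons whose meaning differs between two genetic codes."""
    return {c: (code_a[c], code_b[c]) for c in CODONS if code_a[c] != code_b[c]}


standard = make_table(STANDARD)
yeast_mito = make_table(YEAST_MITO)

print(translate_with("ATGTCTCTTATATGA", standard))    # MSLI*
print(translate_with("ATGTCTCTTATATGA", yeast_mito))  # MSTMW
print(sorted(reassigned_codons(standard, yeast_mito)))
```

Only six codons differ between the two tables, yet three of them occur in this short sequence, including the stop codon.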

Organization of the Talk

• From cDNA sequence to protein sequence.

• Analyzing the information in the protein sequence

• Predicting the fold (secondary structure) of a protein

• Predicting the (tertiary) structure of a protein

Inferring function from sequence

Your Sequence

Protein Sequence Database

No Known Homologues in the Database

Oh, $#!¥!!!

Go to the Protein Data Bank to get the structure & live happily ever after

Analyzing the information in the protein sequence

• Physical-Chemical Information

Why are these properties useful?

For example, they help identify your protein in an electrophoresis gel

Analyzing the physical chemical information in the protein sequence

How to predict hydrophobicity

How to predict molecular mass

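Ala residue

Molecular Mass: 71.09

Ala-Cys dipeptide

71.09 + 103.15 + 18.02

Residue masses already exclude the water lost in each peptide bond (-H2O), so one H2O is added back for the free N- and C-termini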
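The mass calculation can be written as a short function. This is a minimal sketch, assuming the per-amino-acid values are average residue masses (the free amino acid minus one H2O), so the chain mass is the sum of residue masses plus one water for the free termini. The values for Ala and Cys are the ones on the slide. The others are approximate values rounded to two decimals and should be checked against a reference table before serious use.

```python
# Approximate average residue masses in Da (amino acid minus one H2O).
# Ala and Cys use the slide's values; the rest are assumed approximations.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.09, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.15, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one H2O for the free N- and C-termini


def molecular_mass(protein: str) -> float:
    """Average molecular mass of an unmodified, linear polypeptide."""
    protein = protein.upper()
    if not protein:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


print(f"{molecular_mass('AC'):.2f}")  # 192.26
```

Post-translational modifications, disulfide bonds and cofactors all shift the real mass, so this is only a first estimate of where a band should appear on a gel.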

How to predict isoelectric point

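Ala-Cys…

[Figure: protein charge (+ to -) plotted against pH; the isoelectric point is the pH at which the curve crosses zero charge. Values legible on the plot: 9.3, ~10.]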

Isoelectric point is the pH at which the protein has no net charge

At each value of pH, calculate the protonation state of each residue and thus the charge of the whole protein

Does not work very well:

Amino acid pKa is dependent upon environment

Buried amino acids do not gain/lose protons as easily as exposed amino acids
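The charge-versus-pH procedure can be sketched with the Henderson-Hasselbalch equation and a bisection search for the pH of zero net charge. This is a minimal sketch. The pKa values below are illustrative assumptions for fully exposed groups, which is exactly the simplification the slide warns about.

```python
# Assumed pKa values for exposed ionizable groups (illustrative only).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}            # +1 when protonated
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # -1 when deprotonated
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(protein: str, ph: float) -> float:
    """Net charge at a given pH from Henderson-Hasselbalch fractions."""
    protein = protein.upper()
    positive = [N_TERM_PKA] + [POSITIVE_PKA[aa] for aa in protein if aa in POSITIVE_PKA]
    negative = [C_TERM_PKA] + [NEGATIVE_PKA[aa] for aa in protein if aa in NEGATIVE_PKA]
    charge = sum(1 / (1 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1 / (1 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(protein: str, tol: float = 1e-4) -> float:
    """Bisect on pH; net charge decreases monotonically as pH rises."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(protein, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


print(round(isoelectric_point("ACDEFGHIKLMNPQRSTVWY"), 2))
```

Different pKa tables give noticeably different answers, and none of them account for buried residues, so the result is best treated as a rough guide.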

Analyzing the information in the protein sequence

• Physical-Chemical Information

• e.g. http://prowl.rockefeller.edu/prowl-cgi/sequence.exe/.fsa

• Localization, modifications & secondary structure Information

• E.g. http://seq.cbrc.jp/proteinLocalizationResources/localizationLinks.html

Predicting the localization of your protein

How is the localization of a protein predicted?

• Search your protein for homology to the relevant targeting sequences (TS)

• Signal Peptides

• Nuclear localization signals at the N-terminus

• Mitochondrial TS

• Peroxisomal TS

• …

• Complications:

• Small sequences, divergence, change between organisms
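A very reduced version of searching for targeting sequences is a motif scan. This is a minimal sketch, assuming two deliberately simplified consensus patterns: a C-terminal PTS1-like tripeptide for peroxisomes and a short basic stretch resembling a classical nuclear localization signal. The pattern dictionary and function name are hypothetical. Real predictors use profiles or trained models, precisely because such signals are short and divergent.

```python
import re

# Simplified, assumed consensus patterns (illustrative only).
TARGETING_PATTERNS = {
    "PTS1-like": re.compile(r"[SAC][KRH][LM]$"),   # must sit at the C-terminus
    "NLS-like": re.compile(r"K[KR].[KR]"),         # may occur anywhere
}


def find_targeting_signals(protein: str) -> list:
    """Return (signal name, 0-based start, matched text) for every hit."""
    protein = protein.upper()
    hits = []
    for name, pattern in TARGETING_PATTERNS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start(), match.group()))
    return hits


print(find_targeting_signals("MAAAASKL"))  # [('PTS1-like', 5, 'SKL')]
print(find_targeting_signals("MPKKKRKV"))  # [('NLS-like', 2, 'KKKR')]
```

Short patterns like these match many unrelated proteins by chance, which illustrates the "small sequences" complication listed on the slide.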

Predicting post-translational modifications to your protein

How are post-translational modifications to a protein predicted?

• Signal sequences

• Search for homology to pattern peptides

Training set of known structures

Training set of corresponding sequences

Test set of known structures

Test set of corresponding sequences

How is 2ndary structure predicted?

      p(α-helix)   p(coil)   p(β-strand)
A     0.23         0.28      0.5

Database of known structures

Database of corresponding sequences

ACDEFGTYAEE……

α-helix coil β-strand

      p(α-helix)     p(coil)          p(β-strand)
      A … C …        A … C …          A … C …
A     0.1 … 0.03     0.04 … 0.002     0.1 … 0.21

p(aa1-coil) p(aa1-helix) p(aa1-strand) …

Predict 2ndary structure

Compare

Bad Predictions:

Reshuffle training set and test set and repeat until predictions are correct

Good Predictions:

Method ready for new sequence 2ndary structure prediction
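The train, predict and compare loop on these slides can be sketched with single-residue propensities. This is a minimal sketch, assuming a toy training set with made-up sequences and labels (H = helix, E = strand, C = coil). It ignores the neighbor-pair probabilities shown on the slide, and all names are illustrative.

```python
from collections import Counter, defaultdict

STATES = "HEC"  # helix, strand, coil


def train(pairs):
    """Estimate p(state | amino acid) from (sequence, structure) pairs."""
    counts = defaultdict(Counter)
    for sequence, structure in pairs:
        for aa, state in zip(sequence, structure):
            counts[aa][state] += 1
    return {
        aa: {s: c[s] / sum(c.values()) for s in STATES}
        for aa, c in counts.items()
    }


def predict(sequence, table):
    """Assign each residue its most probable state; unseen residues get coil."""
    predicted = []
    for aa in sequence:
        probs = table.get(aa)
        predicted.append(max(STATES, key=probs.get) if probs else "C")
    return "".join(predicted)


def accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the known one."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical toy data, not real structures.
training_set = [
    ("AAEELLKK", "HHHHHHHH"),
    ("VIVTYVIV", "EEEEEEEE"),
    ("GPGSGNPG", "CCCCCCCC"),
]
test_sequence, test_structure = "AELKVIYGPS", "HHHHEEECCC"

table = train(training_set)
prediction = predict(test_sequence, table)
print(prediction, accuracy(prediction, test_structure))
```

Accuracy must be measured on a test set that was held out from training. Otherwise the comparison step only shows that the method memorized its own data.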

Predicting transmembrane helices

How are transmembrane regions predicted?

• Transmembrane segments are hydrophobic stretches about 17 residues long

[Figure: two transmembrane helices, each drawn as a hydrophobic stretch of 17 aa residues]
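The search for hydrophobic stretches of about 17 residues can be sketched as a sliding-window hydropathy scan. This is a minimal sketch, assuming the Kyte-Doolittle hydropathy scale and a cutoff of 1.6 on the window average. Both the scale and the cutoff are assumptions, not taken from the talk, and the test sequence is artificial.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def window_hydropathy(protein, window=17):
    """Mean hydropathy of every window of the given length."""
    protein = protein.upper()
    return [
        sum(KD[aa] for aa in protein[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def candidate_tm_segments(protein, window=17, threshold=1.6):
    """Merge overlapping high-scoring windows into (start, end) spans.

    Positions are 0-based and the end is exclusive.
    """
    segments = []
    for i, score in enumerate(window_hydropathy(protein, window)):
        if score >= threshold:
            if segments and i <= segments[-1][1]:
                segments[-1] = (segments[-1][0], i + window)
            else:
                segments.append((i, i + window))
    return segments


# Artificial sequence with two hydrophobic stretches of 17 residues.
toy = "DK" * 5 + "L" * 17 + "DK" * 10 + "I" * 17 + "DK" * 5
print(candidate_tm_segments(toy))
```

The reported spans are a little wider than the hydrophobic cores because windows that overlap the flanks can still pass the cutoff.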

How is membrane orientation predicted?

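[Figure: membrane topology diagram. Labels: Outside, Cytosol, N-terminus (NH), Signal Peptide, a 17 aa transmembrane segment with 15 aa flanking regions on each side, and the net charge of the flanks (+++ / ---).]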

Organization of the Talk

• From cDNA sequence to protein sequence.

• Analyzing the information in the protein sequence

• Predicting the fold (secondary structure) of a protein

• Predicting the (tertiary) structure of a protein

What is fold?

• Fold can be roughly defined as the succession of α-helix, β-strand and coil structures in a protein

Predicting protein folding

How is fold predicted?

Database of known structures

Database of corresponding sequences

Database of probabilities of aa in 2ndary structure

YOUR SEQUENCE

Homology-based helix-coil-strand profile folds database

Server

Strong Homology

… Fold Prediction

Weak/No Homology

Helix-coil-strand profile prediction

… Fold Prediction

Organization of the Talk

• From cDNA sequence to protein sequence.

• Analyzing the information in the protein sequence

• Predicting the fold (secondary structure) of a protein

• Predicting the (tertiary) structure of a protein

Predicting protein structure

• Homology Modeling: 3D-JIGSAW, SWISS-MODEL

• Ab initio Modeling: ROBETTA

Predicting protein structure by homology

How does homology modeling work?

Database of known structures

Database of corresponding sequences

…YDVRSEQVENCE…

Server/Program

Strong Homologues

Best possible Sequence alignment

…YDVR-SEQVENCE…

…YDVRMSD-VDNCD…

Thread the sequence to be predicted over the known structure according to the alignment

… Optimization via energy minimization, etc…
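The alignment step that precedes threading can be illustrated with a small global alignment. This is a minimal sketch of Needleman-Wunsch, assuming a simple identity score (match +1, mismatch -1) and a linear gap penalty of -2. These scoring choices are assumptions for illustration. Modeling servers normally use substitution matrices and affine gap penalties, so their alignment may differ from the one this sketch produces.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of two sequences; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning residue a[i-1] with residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


query, template = "YDVRSEQVENCE", "YDVRMSDVDNCD"
aligned_query, aligned_template, total = needleman_wunsch(query, template)
print(aligned_query)
print(aligned_template)
print(total)
```

Each aligned column tells the modeling program which template residue supplies the backbone coordinates for which query residue. Gaps mark regions that have to be built as loops.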

Predicting protein structure

• Homology Modeling: 3D-JIGSAW, SWISS-MODEL

• Ab initio Modeling: ROBETTA

Predicting protein structure by ab initio methods

Database of corresponding sequences

…YDVRSEQVENCE…

Server/Program

NO Homologues

Database of structures for smaller amino acid runs

Query fragment …YDVR-SEQ matched to database fragments …YDVRMSD-… and …YPVRMSD-…

Query fragment …VENCE… matched to database fragments …YDNCD… and …VEQCE…

… Assemble

Energy minimization & optimization

Summary

• From cDNA sequence to protein sequence.

• Analyzing the information in the protein sequence

• Predicting the fold (secondary structure) of a protein

• Predicting the (tertiary) structure of a protein
