Page 1: Predicting PDZ domain protein- protein interactions from the genome Gary Bader Donnelly Centre for Cellular and Biomolecular Research University of Toronto

Predicting PDZ domain protein-protein interactions from the genome

Gary Bader

Donnelly Centre for Cellular and Biomolecular Research

University of Toronto

VanBUG, Vancouver, Jan. 8, 2009

http://baderlab.org

Computational Cell Map

Cary MP et al. Pathway information… FEBS Lett. 2005
Bader GD et al. Functional genomics and proteomics. Trends Cell Biol. 2003

Map the cell
• Predict map from genome
• Multiple perturbation mapping
• Active cell map
• Map visualization and analysis software

Read map to understand
• Cell processes
• Gene function
• Disease effects
• Map evolution

How are biological networks in the cell encoded in the genome?

Can we accurately predict biologically relevant interactions from a genome?

How do genome sequence changes underlying disease affect the molecular network in the cell?

Can we predict how well model pathways or phenotypes will translate to human?

Can we design new networks de novo?

Predicting Protein Interaction Networks From the Genome

• Ideally: [figure label: Accurately Predict]

• Reality:
– Not currently possible
– Signaling pathways too divergent to accurately map by orthology
– Protein interaction prediction likely as hard as protein folding in general, e.g. induced fit

Predicting Networks

• Map via orthology relationships
– Metabolic pathways
  • E.g. KEGG, BioCyc, metaSHARK
– Protein-protein interactions
  • E.g. OPHID, HomoMINT
– Signaling pathways
  • E.g. Reactome
• Infer using functional associations
– Phylogenetic profile, Rosetta Stone
• Infer from molecular profiles
– Gene expression → gene regulatory network
– E.g. ARACNE, MEDUSA, MatrixREDUCE

Bader & Enright
Pinney et al. NAR 2005

Peptide Recognition Domains

• Simple binding sites
• Well studied
• Numerous
• Biologically important
– Eukaryotic signaling systems often involve modular protein-protein interaction domains

http://pawsonlab.mshri.on.ca/

http://nashlab.uchicago.edu/domains/

Protein Domain Interaction Network Prediction

Genome

Gene and protein prediction

Domain prediction

Specificity prediction

Protein-protein interaction prediction

PDZ Domains

• 80-90 aa, 5-6 beta strands, 2 alpha helices
• Recognize hydrophobic C-termini
• Membrane localization of signaling components
• Neuronal development, cell polarity, ion channel regulation

[Figure: Par-6 PDZ Domain with peptide VKESLV-COOH (1RZX, Fly); the peptide C-terminus is labeled "C"]

Tonikian et al. PLoS Biology Sep. 2008

Dev Sidhu

~250 Human PDZ Domains

Multiple sequence alignment

PDZ Binding Motifs

C-Terminus

Color key: polar, basic, acidic, hydrophobic

Class 1: X[T/S]XΦ

Class 2: XΦXΦ

(Φ = hydrophobic residue)
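As a rough illustration of the two classical motif classes, the sketch below checks the last four residues of a protein against the patterns X[T/S]XΦ (class 1) and XΦXΦ (class 2). The set of residues counted as hydrophobic (Φ) is an assumption, since the slide does not define it, and the function name `classify_c_terminus` is hypothetical.

```python
import re

# Assumption: residues treated as hydrophobic (Φ); the slide does not define the set.
HYDROPHOBIC = "AVILMFWYC"

# Patterns cover the last four residues, written N- to C-terminal
# (positions -3, -2, -1, 0).
CLASS_PATTERNS = {
    "Class 1": re.compile(rf".[TS].[{HYDROPHOBIC}]$"),
    "Class 2": re.compile(rf".[{HYDROPHOBIC}].[{HYDROPHOBIC}]$"),
}


def classify_c_terminus(protein_seq: str) -> list[str]:
    """Return the names of all motif classes matched by the C-terminus."""
    tail = protein_seq.upper()[-4:]
    return [name for name, pattern in CLASS_PATTERNS.items() if pattern.search(tail)]


# Usage: the Par-6 ligand peptide from the earlier slide ends in ESLV.
print(classify_c_terminus("VKESLV"))
```

Real specificities are richer than these two patterns, so a regular-expression match like this is only a coarse first filter.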

Sequence Logo

SWWPDSWV
NAFEETWV
NPFWDVWV
NPFWDVWV
SVDVDTWV
-AYFDTWV
STFLETWV
KGVFESWV
ESWHDSWV
-GDQDTWV
GRWMDTWV
KFWRDTWL
…

Profile

polar=green, basic=blue, acidic=red, hydrophobic=black

Logo

Profile (rows: amino acid; columns: position)

Amino acid   -3     -2     -1     0
A            0      0      0      0
C            0      0      0      0
D            0.7    0.1    0      0
E            0.3    0.05   0      0
F            0      0      0.05   0
G            0      0      0      0
H            0      0      0      0
I            0      0      0      0
K            0      0      0      0
L            0      0      0      0.1
M            0      0      0      0
N            0      0      0      0
P            0      0      0      0
Q            0      0      0      0
R            0      0      0      0
S            0      0.15   0      0
T            0      0.7    0      0
V            0      0      0      0.9
W            0      0      0.95   0
Y            0      0      0      0

http://weblogo.berkeley.edu/
Schneider TD, Stephens RM. 1990. Nucleic Acids Res. 18:6097-6100
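The step from aligned phage peptides to a profile can be sketched as a per-column residue count followed by normalization. This is a minimal sketch, not the talk's actual pipeline: the function name `build_profile` is hypothetical, gap characters are simply skipped, and every peptide is assumed to be at least `width` residues long. Because only the twelve peptides visible on the slide are used (the slide's list is truncated), the resulting frequencies will not match the profile table exactly.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(peptides, width=4):
    """Per-position residue frequencies over the last `width` residues.

    Returns a list of dicts ordered from position -(width - 1) to position 0
    (the C-terminal residue). Gap characters ('-') are ignored.
    Assumes every peptide has at least `width` characters.
    """
    columns = [Counter() for _ in range(width)]
    for peptide in peptides:
        for column, residue in zip(columns, peptide[-width:]):
            if residue in AMINO_ACIDS:
                column[residue] += 1
    profile = []
    for column in columns:
        total = sum(column.values())
        profile.append(
            {aa: (column[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return profile


# The twelve peptides visible on the slide.
peptides = [
    "SWWPDSWV", "NAFEETWV", "NPFWDVWV", "NPFWDVWV", "SVDVDTWV", "-AYFDTWV",
    "STFLETWV", "KGVFESWV", "ESWHDSWV", "-GDQDTWV", "GRWMDTWV", "KFWRDTWL",
]
profile = build_profile(peptides)
for position, column in zip(range(-3, 1), profile):
    nonzero = {aa: round(freq, 2) for aa, freq in column.items() if freq > 0}
    print(position, nonzero)
```

A sequence logo is a graphical rendering of such a profile, with letter heights scaled by information content.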

82 worm and human PDZ specificities mapped by phage display

~3100 peptides

PDZ Specificity Map

Class 1: X[T/S]XΦ

Class 2: XΦXΦ

PDZ Specificity Map

Class 1: X[T/S]XΦ

Class 2: XΦXΦ

Class 3: X[D/X]X

Class 4: XGX

PDZ Specificity Map

Class 1: X[T/S]XΦ

Class 2: XΦXΦ

Class 3: X[D/X]X

Class 4: XGX

16 Classes

Specificity at Most Positions

Position Versatile

Many Distinct Specificities

Versatile and Robust

91 Erbin mutants mapped by phage display, 3400 peptides

Mutations cause specificity switch, not function loss

Conserved Specificity, Expanded Use

PDZ domains are versatile, but only ~16 classes used from worm to human

One billion years of evolution

Model: specificities arose early, domains expanded under evolutionary constraints

Raffi Tonikian

Protein Domain Interaction Network Prediction

Genome

Gene and protein prediction

Domain prediction

Specificity prediction

Protein-protein interaction prediction

Predicting PDZ Specificity

>ERBB2IP-1
RVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINIEHGQAVSLLKTFQNTVELII

Tonikian et al. PDZ specificity map

Sequence Predicts Specificity

50 mapped PDZ domains >70% similar to 69 unmapped PDZ

Double coverage to 45% of worm/human

33 more PDZ groups

110 singletons

[Figure legend: Mapped, Unmapped, Worm, Human]
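One way to read this slide is as nearest-neighbor transfer: an unmapped domain inherits the specificity of a mapped domain if the two are more than 70% similar. The sketch below is only an illustration under stated assumptions: it uses percent identity over pre-aligned sequences as a stand-in for whatever similarity measure the talk used, and the function names and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def assign_by_similarity(mapped: dict, unmapped: dict, threshold: float = 0.70) -> dict:
    """Map each unmapped domain to its most similar mapped domain above `threshold`."""
    assignments = {}
    for name, seq in unmapped.items():
        best = max(
            mapped, key=lambda m: percent_identity(seq, mapped[m]), default=None
        )
        if best is not None and percent_identity(seq, mapped[best]) > threshold:
            assignments[name] = best
    return assignments


# Hypothetical aligned fragments, for illustration only.
mapped = {"dom_A": "GSLGFSIAGG", "dom_B": "KTVEISLKGG"}
unmapped = {"dom_X": "GSLGFSVAGG", "dom_Y": "PPPPPPPPPP"}
print(assign_by_similarity(mapped, unmapped))
```

Domains with no mapped neighbor above the threshold stay unassigned, which corresponds to the remaining groups and singletons on the slide.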

Are Residues Correlated?

~80

~3000

Boris Reva, Chris Sander

Top 10 1-1 Rules

Domain Position   Peptide Position   Joint Freq   Domain Freq   Peptide Freq   Mutual Information
(H@105)           (T@7)              886          1367          913            0.166384111
(P@53)            (T@7)              373          411           913            0.130328629
(Q@67)            (W@8)              366          377           1037           0.117349366
(V@109)           (T@7)              836          1430          913            0.115598151
(S@64)            (E@6)              218          386           414            0.109298916
(V@9)             (W@4)              150          202           340            0.109096478
(A@102)           (E@6)              228          429           414            0.107661006
(L@30)            (S@6)              207          383           384            0.106889284
(P@53)            (E@6)              219          411           414            0.103683514
(L@26)            (E@6)              391          1138          414            0.10274842

$$p_{\text{joint}} \ln \frac{p_{\text{joint}}}{p_{\text{domain}}\, p_{\text{peptide}}} = \frac{886}{2083} \ln \frac{886/2083}{(1367/2083)(913/2083)} = 0.17$$
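The worked example for the top rule can be checked numerically. The sketch below computes the single-pair term p_joint · ln(p_joint / (p_domain · p_peptide)) from the raw counts in the table, using 2083 as the total count as in the slide's example; the function name `mi_term` is hypothetical.

```python
import math


def mi_term(joint: int, domain: int, peptide: int, total: int) -> float:
    """Contribution of one (domain residue, peptide residue) pair to mutual information."""
    if joint == 0:
        return 0.0  # a pair that never co-occurs contributes nothing
    p_joint = joint / total
    p_domain = domain / total
    p_peptide = peptide / total
    return p_joint * math.log(p_joint / (p_domain * p_peptide))


TOTAL = 2083  # denominator used in the slide's worked example

# Top rule, (H@105) with (T@7); the table lists 0.166384111.
print(mi_term(886, 1367, 913, TOTAL))
# Second rule, (P@53) with (T@7); the table lists 0.130328629.
print(mi_term(373, 411, 913, TOTAL))
```

Both values agree with the Mutual Information column, which confirms that the column holds this per-pair term with the natural logarithm.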

Correlation Validation

Prediction Can Be Accurate

Experiment

Prediction

Challenge: But Not Always

Experiment

Prediction

Shirley Hui

Predicting PDZ Specificity

Machine Learning

Training Examples (Binding and Non-binding PDZ-Peptide Pairs)

Positive: YES YES YES YES …

Negative: NO NO NO NO …

Test Examples (PDZ-Peptide Pairs): ? ?

Predictions: YES NO …

Consider sequence and physicochemical properties

High accuracy at matching known domains to peptides

Shirley Hui, Xiaojian Shao

Protein Domain Interaction Network Prediction

Genome

Gene and protein prediction

Domain prediction

Specificity prediction

Protein-protein interaction prediction

Genome Search

SWWPDSWV
NAFEETWV
NPFWDVWV
NPFWDVWV
SVDVDTWV
-AYFDTWV
STFLETWV
KGVFESWV
ESWHDSWV
-GDQDTWV
GRWMDTWV
KFWRDTWL
…

Profile

Phage Results

polar=green, basic=blue, acidic=red, hydrophobic=black

ERBIN PDZ

Genome Search

>Q86W91_HUMAN Plakophilin 4, isoform b
...LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV

QYPGSPDSWV

DSWV

PDZ ERBIN

C-Terminal Match Score: 5.5

Predicted C-Terminal Motif

Score formula (symbols recovered from the slide): $\log_{10} \prod_{i=1}^{w} p_i$

Assumes: Position independence, uniform input, good sampling

Physiological binder is similar to phage sequence
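Under the position-independence assumption, a protein C-terminus can be scored against a profile by summing the log10 probabilities of its residues, one per profile position. The sketch below does this with the profile values from the Sequence Logo slide. The floor probability for unseen residues and the function names are assumptions, and because the slide's exact scoring formula is not recoverable, this is not expected to reproduce the match score of 5.5 shown for Plakophilin 4.

```python
import math

# Nonzero entries of the profile from the Sequence Logo slide, positions -3 to 0.
PROFILE = [
    {"D": 0.7, "E": 0.3},
    {"D": 0.1, "E": 0.05, "S": 0.15, "T": 0.7},
    {"F": 0.05, "W": 0.95},
    {"L": 0.1, "V": 0.9},
]
FLOOR = 1e-3  # assumed floor for residues never observed at a position


def log10_match_score(sequence: str, profile=PROFILE, floor=FLOOR) -> float:
    """Sum of log10 profile probabilities over the C-terminal residues."""
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    tail = sequence[-width:]
    return sum(
        math.log10(max(column.get(residue, 0.0), floor))
        for column, residue in zip(profile, tail)
    )


def rank_proteins(proteins: dict, top: int = 10) -> list:
    """Return (score, name) pairs for the best-scoring C-termini, best first."""
    scored = [(log10_match_score(seq), name) for name, seq in proteins.items()]
    return sorted(scored, reverse=True)[:top]


# Usage: the Plakophilin 4 C-terminus from the slide and a made-up negative control.
proteins = {
    "Q86W91_HUMAN": "LKSTTNYVDFYSTKRPSYRAEQYPGSPDSWV",
    "hypothetical_control": "MKKLLPAAAA",
}
for score, name in rank_proteins(proteins):
    print(f"{name}\t{score:.2f}")
```

Scanning every annotated protein C-terminus in a genome this way gives a ranked list of candidate binders for follow-up.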

Prediction Can Be Accurate

ERBIN PDZ Interaction Prediction

ERBB2IP-1

Probability of PDZ binding: 10E-5 (High) to 10E-7 (Low)

Known Interactor

High Score

…but requires further experimental support

p-value

Network of prioritized human PDZ interactions

336 interactions between 54 PDZ domains, 247 proteins

Matches known biology, significantly enriched in known interactors

8% overlap, p = 8.6 × 10^-18

Future: In vivo Protein Interaction Prediction

In vitro; Biologically Relevant (In vivo)

[Diagram labels: PDZ, Phage Display, Peptides, Genome, In silico Predictions, Protein Expression, Protein Function, Protein Location, Protein Structure, Evolutionary Context, Network Context, Bind, DLGs, NMDAR]

PDZ Human-Virus Interactions

89 viral proteins matched better than any human protein (vs. 30 domains)

Affinities (ELISA)

Yingnan Zhang

Crtam Ig transmembrane protein important in late phase T cell activation

Crtam peptide inhibitor blocks SCRIB-3 binding and polarization

Synthetic viral peptide promotes T cell proliferation

[Figure panels: T cell; Non SCRIB binding, SCRIB Binding]

Jung-Hua Yeh and Andrew Chan

Conclusions

• PDZ domains are highly specific, versatile and robust to mutation
• Many specificities possible, but only a few are used
• Specificity can be predicted from domain sequence
• Prioritize predictions for experimental follow up
• Use by pathogens
• PDZ specificity map useful for:
– Novel protein interaction discovery
– Peptidomimetic therapeutic design
– PDZ design (synthetic biology)

Cell map exploration and analysis

Databases

Literature

Expert knowledge

Experimental Data

Can we accurately predict protein interactions?

Pathway Information

Pathway Analysis

(Cytoscape)

~280 Pathway Databases!

http://pathguide.org

Vuk Pavlovic

Pathway Commons: A Public Library

• Books: Pathways
• Lingua Franca: BioPAX OWL
• Index: cPath pathway database software
• Translators: translators to BioPAX

• Open access, free software
• No competition: Author attribution
• Aggregate ~20 databases in BioPAX format

http://pathwaycommons.org

Sander Lab (MSKCC), Bader Lab

Network visualization and analysis

UCSD, ISB, Agilent, MSKCC, Pasteur, UCSF, Unilever, U of Toronto, U of Michigan

http://cytoscape.org

Pathway comparison
Literature mining
Gene Ontology analysis
Active modules
Complex detection
Network motif search

Gene Function Prediction

• Guilt-by-association principle

• Biological networks are combined intelligently to optimize prediction accuracy

• Algorithm is faster and more accurate than its peers

http://www.genemania.org

Quaid Morris (CCBR)
Rashad Badrawi, Ovi Comes, Sylva Donaldson, Christian Lopes, Jason Montojo, Khalid Zuberi

Canadian Bioinformatics Workshops 2009

Interpreting Gene Lists from -omics Studies

Date: July 9-10, 2009, Toronto

Faculty: Gary Bader, Quaid Morris & Wyeth Wasserman

Clinical Genomics and Biomarker Discovery

Date: July 16-17, 2009, Toronto

Faculty: Sohrab Shah

Informatics on High-Throughput Sequencing Data

Date: July 23-24, 2009, Toronto

Faculty: Michael Brudno, Asim Siddiqui & Francis Ouellette

Exploratory Data Analysis and Essential Statistics using R

Date: October 2-3, 2009, Toronto

Faculty: Raphael Gottardo and Boris Steipe

Applications now being accepted at www.bioinformatics.ca

Limited registration

Registration Fee: $500

Acknowledgements

PDZ Work

Genentech: Dev Sidhu, Yingnan Zhang, Heike Held, Stephen Sazinsky, Yan Wu

University of Toronto: Charlie Boone, Raffi Tonikian, Xiaofeng Xin

MSKCC: Chris Sander, Boris Reva

Cytoscape

Trey Ideker (UCSD): Kei Ono, Mike Smoot, Peng Liang Wang (Ryan Kelley, Nerius Landys, Chris Workman, Mark Anderson, Nada Amin, Owen Ozier, Jonathan Wang)

Lee Hood (ISB): Sarah Killcoyne, John Boyle, Ilya Shmulevich (Iliana Avila-Campillo, Rowan Christmas, Andrew Markiel, Larissa Kamenkovich, Paul Shannon)

Benno Schwikowski (Pasteur): Mathieu Michaud (Melissa Cline, Tero Aittokallio)

Chris Sander (MSKCC): Ethan Cerami, Ben Gross (Robert Sheridan)

Annette Adler (Agilent): Allan Kuchinsky, Mike Creech (Aditya Vailaya)

Bruce Conklin (UCSF): Alex Pico, Kristina Hanspers

Bader Lab

G2N: Chris Tan, David Gfeller, Shirley Hui, Xiaojian Shao, Shobhit Jain

MP: Anastasija Baryshnikova, Iain Wallace, Laetitia Morrison, Ron Ammar

ACM: Daniele Merico, Ruth Isserlin, Vuk Pavlovic, Oliver Stueker

Pathway Commons

Chris Sander, Ethan Cerami, Ben Gross, Emek Demir, Robert Hoffmann, Igor Rodchenkov, Rashad Badrawi

Funding

CIHR, NSERC, NIH, Genome Canada, Canada Foundation for Innovation/ORF

http://baderlab.org