sequence variations, protein interactions, and diseases€¦ · sequence variations, protein...
TRANSCRIPT
-
Sequence variations, protein
interactions, and diseases
Teresa Przytycka
NIH / NLM / NCBI
October 7, 2009
Inferring protein interaction
from sequence co-evolution
-
• Protein interaction and sequence co-
evolution
-
Interacting proteins are expected to co-
evolve to ensure proper binding
-
BSOSC Review, November 2008 4
Ph
ylo
gen
eti
c
Tre
es
Protein A Protein B
Ort
ho
log
s
Organism 1
Organism 2
Organism 3
Organism n
Mirrortree Method:Protein A Protein B
Inferring protein interaction from
the co-evolution principle
[Goh et al 2000, Pazos and Valencia 2001]
-
5
Evolutionary vector
1
2
3
.
.
.
1 2 3 ….. n
Distance Matrix
-
BSOSC Review, November 2008 6Compute correlation
Ph
ylo
gen
eti
c
Tre
es
Protein A Protein B
Ort
ho
log
s
Organism 1
Organism 2
Organism 3
Organism n
Mirrortree Method:Protein A Protein B
Inferring protein interaction from
the co-evolution principle
-
Simple idea but lot’s of questions…
• How to separate co-evolution due to
common speciation history form co-
evolution due to function?
• Is the co- evolution signal distributed
uniformly over the sequence? Between
binding site only?
• Predicting Interaction specificity SOSC Review, November 2008 7
Kann et. al. Proteins 2007
Kann et. al. JMB 2009
Jothi et. al. JMB 2007
Jothi et. al. Bioinformatics 2005
-
Challenges
• How to separate co-evolution due to
common speciation history form co-
evolution due to function?
• Is the co- evolution signal distributed
uniformly over the sequence? Between
binding site only?
• Predicting Interaction specificity BSOSC Review, November 2008 8
Kann et. al. Proteins 2007
Kann et. al. JMB 2009
Jothi et. al. JMB 2007
Jothi et. al. Bioinformatics 2005
-
Do binding sites co-evolve more tightly?
BSOSC Review, November 2008 9Kann et. al. JMB 2009
-
Binding sites are important but not the
only contributor of the co- evolutionary
signal
BSOSC Review, November 2008 11Kann et. al. JMB 2009
-
BSOSC Review, November 2008 12
Predicting interacting domains
Jothi et. al. JMB 2007
Given interacting multi-domain proteins
domains that are in contact
-
BSOSC Review, November 2008 13
Co
rre
lati
on
Mirror tree approach can be used to recognize
interacting domains
MSA of
Protein A
MSA of
Protein B
Do
ma
in
Ali
gn
me
nts
Sim
ilari
ty
Ma
tric
es
Protein A Protein B
Organism 3
Organism 1
Organism 2
Organism n
Ort
ho
log
s
0.63 0.83 0.79
0.59 0.91 0.89
In 64% cases, the domain pair with
highest correlation was interacting
(55% expected by chance)
RESULTS: Jothi et. al. JMB 2007
-
BSOSC Review, November 2008 14
Predicting interacting domains
Jothi et. al. JMB 2007
Given protein-protein interaction network
domains that are in contact
Guimaraes et. al. Genome Biology 2008
Given interacting multi-domain proteins
domains that are in contact
-
BSOSC Review, November 2008 15
• Protein interaction and sequence co-
evolution
• Predicting domain interaction from
protein interaction networks
-
BSOSC Review, November 2008 16
Parsimony approach
Assumption: Protein interactions are
mediated by domain interactions
Hypothesis: Interactions evolved in
most parsimonious way
Method: Find the smallest set of
domain pairs whose interaction
would explain all protein
interactions in the network
Guimaraes et. al. Genome Biology 2008
-
BSOSC Review, November 2008 17
Additional problems to solve:
Constraints: one per protein interaction Pm Pn
Linear programming formulation
For each domain pair Di Dj: variable xij taking value 0 or 1xij
• Model the noise in the network
• Estimate p-values
Pm
Pn
Objective function(representing parsimony assumption):
Interacting domains pairs – domains pairs with Xij=1
Guimaraes et. al. Genome Biology 2008
-
BoSC, October 2006 18
Results compared to previous methods: Identifying
interacting domain pair in interacting protein pair
-
BSOSC Review, November 2008 19
• Protein interaction and sequence co-
evolution
• Predicting domain interaction from
protein interaction networks
• Combining genetic sequence variation,
genome wide expression profile and
protein interaction to infer pathways
dysregulated in complex diseases
-
Genetic variations in individuals affects gene
expression level
20
Co
ntr
ol
1
Co
ntr
ol
2
Co
ntr
ol
3
Ca
se 1
Ca
se 2
Ca
se 3
Ca
se 4
Ca
se 5
Ca
se 6
Ca
se 7
Case 8
Gene 1
Gene 2
Gene 3
.
.
.
.
.
Gene 3
Genomes
loci eQTL
Case 1
Ca
se
2
Case 7
…
…
Genotype variations
Huang et.al Bioinformatics 2009
-
Bringing PPI network and other high throughput
networks
21
Co
ntr
ol
1
Co
ntr
ol
2
Co
ntr
ol
3
Ca
se 1
Ca
se 2
Ca
se 3
Ca
se 4
Ca
se 5
Ca
se 6
Ca
se 7
Case 8
Gene 1
Gene 2
Gene 3
.
.
.
.
.
Gene 3
Genomes
loci
Target geneCausal gene
Case 1
Ca
se
2
Case 7
…
…
Kim et.al in submitted
eQTL
-
Associations of genes expressed differently
in disease /control groups are primary target
22
Co
ntr
ol
1
Co
ntr
ol
2
Co
ntr
ol
3
Ca
se 1
Ca
se 2
Ca
se 3
Ca
se 4
Ca
se 5
Ca
se 6
Ca
se 7
Case 8
Gene 1
Gene 2
Gene 3
.
.
.
.
.
Gene 3
loci
controls Disease Cases C
ase 1
Ca
se
2
Case 7
…
…
eQTL
Representative target disease genesPutative causal mutations
-
Uncovering causal genes and dys-regulated
pathways
23
Co
ntr
ol
1
Co
ntr
ol
2
Co
ntr
ol
3
Ca
se 1
Ca
se 2
Ca
se 3
Ca
se 4
Ca
se 5
Ca
se 6
Ca
se 7
Case 8
Gene 1
Gene 2
Gene 3
.
.
.
.
.
Gene 3
loci
controls Disease Cases C
ase 1
Ca
se
2
Case 7
…
…
Kim et.al. submitted
eQTL
Disease markers
Causal genes
-
Acknowledgments
BSOSC Review, November 2008 24
Former lab members
Katia Guimaraes (associate professor, Brazil)
Raja Jothi (currently PI at NIEHS )
Elena Zotenko (currently Max Planck Institute)
Current lab members
Dong Yeon Cho
Yoo-ah Kim
Yang Huang
Damian Wojtowicz
Jie Zheng
Collaborators
Maricel Kann UMBC
-
BSOSC Review, November 2008 25
-
BSOSC Review, November 2008 26
Compute
conservation profile
Use conserved positions only
Additional correction using previously mentioned methods
Evolutionarily conserved regions help separate functional
co-evolution from co-evolution due common speciation
historyLow entropy (evolutionarily conserved)
Kann et. al. Proteins 2007