Chemical Physics Letters 486 (2010) 166–170
Topological atomic displacements, Kirchhoff and Wiener indices of molecules
Ernesto Estrada a,*, Naomichi Hatano b
a Department of Mathematics, Department of Physics and Institute of Complex Systems, University of Strathclyde, Glasgow G1 1XQ, United Kingdom
b Institute of Industrial Science, University of Tokyo, Komaba, Meguro 153-8505, Japan
Article info
Article history: Received 13 December 2009; in final form 29 December 2009; available online 4 January 2010.
0009-2614/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.cplett.2009.12.090
* Corresponding author. E-mail address: [email protected] (E. Estrada).
Abstract
We provide a physical interpretation of the Kirchhoff index of any molecule as well as of the Wiener index of acyclic ones. For this purpose, we use a local vertex invariant that is obtained from first principles and describes the atomic displacements due to small vibrations/oscillations of atoms from their equilibrium positions. In addition, we show that the topological atomic displacements correlate with the temperature factors (B-factors) of atoms obtained by X-ray crystallography for both organic molecules and biological macromolecules.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
Many topological ideas have been introduced in chemistry in an ad hoc way [1]. A classical example is provided by the oldest topological index, nowadays known as the Wiener index, W [2]. It is defined as the sum of all shortest-path distances between (non-hydrogen) atoms in a molecule. This index correlates very well with many physico-chemical properties of organic molecules [3]. Several attempts have been made to provide a physico-chemical interpretation of W. In one of them, W was shown to represent a rough measure of the molecular surface area [4]. More recently, Gutman and Zenkevich [5] showed that this index is related to the internal energy of organic molecules, with a special role played by the vibrational energy.
In general, very few approaches to defining topological invariants in chemistry start from first-principle physical concepts and derive indices that are both physically sound and chemically useful. One attempt along these lines was made by Klein and Randic, who defined the so-called Kirchhoff index, Kf [6]. The Kirchhoff index is defined analogously to the Wiener index, but it uses the resistance distance $r_{ij}$ between pairs of nodes instead of the shortest-path distance. Although the index uses well-known concepts from physics such as Ohm's and Kirchhoff's laws [6], it is not straightforward to see what the 'electrical resistance' means for a chemical bond. These difficulties have urged us to search for a first-principle approach to defining topological invariants that have a clear physico-chemical interpretation and that solve existing chemico-structural problems.
Here we derive a local vertex invariant from first principles which describes the atomic displacements due to small vibrations/oscillations of atoms from their equilibrium positions. Using this approach we provide a physical interpretation of the Kirchhoff index of any molecule in terms of atomic displacements. We show that the Kirchhoff index is the sum of the squared atomic displacements produced by small molecular vibrations or oscillations of atoms from their equilibrium positions. For acyclic molecules such as the ones studied by Gutman and Zenkevich [5], our results explain the relationship between the Wiener index and the vibrational molecular energy. The topological atomic displacements are shown here to correlate with the temperature factors (B-factors) of atoms obtained by X-ray crystallography. We illustrate our results for both organic molecules and proteins.
2. Background
Here we represent molecules as graphs G = (V, E), where nodes represent united atoms and edges represent physical interactions between such united atoms. In the simplest case of an alkane molecule the nodes represent the united atoms CHn, where n = 0, 1, 2, 3, and the edges are the covalent C–C bonds; in other words, the graph corresponds to the hydrogen-depleted molecular graph. However, we are not constrained here to such a representation. For instance, a protein can be represented through its residue interaction graph/network [7]. In this approach the nodes are united-atom representations of the amino acids, centred at their $C_\beta$ atoms, with the exception of glycine, for which $C_\alpha$ is used. Two nodes are then connected if the distance $r_{ij}$ between the $C_\beta$ atoms of residues $i$ and $j$ is not longer than a certain cutoff value $r_C$. The elements of the adjacency matrix of the residue network are obtained by
$$A_{ij} = \begin{cases} H(r_C - r_{ij}), & i \neq j, \\ 0, & i = j, \end{cases}$$
where $H(x > 0) = 1$ and $H(x \le 0) = 0$. We use $r_C = 7.0$ Å [7] hereafter.
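As an illustration of the adjacency rule above, the following minimal sketch builds the matrix from a list of coordinates. The function name `residue_adjacency` and the toy coordinates are assumptions made for this example only; they are not taken from the original work, and real input would be the $C_\beta$ (or glycine $C_\alpha$) positions read from a structure file.

```python
import math

def residue_adjacency(coords, r_c=7.0):
    """Adjacency matrix A_ij = H(r_c - r_ij) for i != j and 0 on the diagonal."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])
            # H(x) = 1 only for x > 0, so a pair at exactly r_c stays unconnected.
            if r_c - r_ij > 0:
                adj[i][j] = adj[j][i] = 1
    return adj

# Hypothetical coordinates in angstroms, made up for illustration.
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 7.0, 0.0)]
toy_adjacency = residue_adjacency(toy_coords)
for row in toy_adjacency:
    print(row)
```

With the Heaviside convention stated above, the last point, which lies exactly 7.0 Å from its nearest neighbour, ends up isolated.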
The Wiener index W is defined as [2]
$$W = \sum_{i<j} d_{ij}, \tag{1}$$
where $d_{ij}$ is the shortest-path distance between atoms $i$ and $j$ in the molecular graph. In the case of a molecular network like a residue network, the Wiener index divided by the number of nodes has been used as a criterion for defining 'small-world' networks.
The Kirchhoff index is defined as [6]
$$\mathrm{Kf} = \sum_{i<j} r_{ij}, \tag{2}$$
where the resistance distance $r_{ij}$ between nodes $i$ and $j$ in a graph is obtained through the Moore–Penrose generalised inverse $L^{+}$ of the Laplacian matrix [8], as
$$r_{ij} = (L^{+})_{ii} + (L^{+})_{jj} - 2(L^{+})_{ij}. \tag{3}$$
The Laplacian matrix is defined as $L = D - A$, where $D$ is the diagonal matrix of node degrees and $A$ is the adjacency matrix. It is well known that for acyclic molecules the Wiener and the Kirchhoff indices coincide.
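The two definitions can be compared numerically. The sketch below is an illustration rather than the authors' implementation: it assumes NumPy, uses helper names chosen for this example (`laplacian`, `kirchhoff_index`, `wiener_index`), and takes two small hydrogen-depleted skeletons as test graphs.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def kirchhoff_index(L):
    """Kf = sum_{i<j} r_ij with r_ij = L+_ii + L+_jj - 2 L+_ij, Eqs. (2)-(3)."""
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2.0 * Lp  # resistance-distance matrix
    return R[np.triu_indices_from(R, k=1)].sum()

def wiener_index(n, edges):
    """W = sum_{i<j} d_ij, Eq. (1), with shortest paths from Floyd-Warshall."""
    D = np.full((n, n), np.inf)
    np.fill_diagonal(D, 0.0)
    for i, j in edges:
        D[i, j] = D[j, i] = 1.0
    for k in range(n):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D[np.triu_indices(n, k=1)].sum()

path4 = [(0, 1), (1, 2), (2, 3)]   # n-butane skeleton (a tree)
cycle4 = path4 + [(3, 0)]          # cyclobutane skeleton (one ring)

W_path, Kf_path = wiener_index(4, path4), kirchhoff_index(laplacian(4, path4))
W_cycle, Kf_cycle = wiener_index(4, cycle4), kirchhoff_index(laplacian(4, cycle4))
print(W_path, Kf_path)    # the two indices agree for the tree
print(W_cycle, Kf_cycle)  # Kf is smaller than W once a ring is present
```

For the chain both indices equal 10, while for the four-membered ring W = 8 and Kf = 5, which shows that the two indices differ once a cycle is present.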
3. Topological atomic displacements
We now consider the classical analogy in which the atoms are represented by balls and the bonds are identified with springs with a common spring constant $k$ [9]. We would like to consider a vibrational excitation energy from the static position of the molecule. Let $x_i$ denote the displacement of an atom $i$ from its static position. Then the vibrational potential energy of the molecule can be expressed as
$$V(\vec{x}) = \frac{k}{2}\,\vec{x}^{T} L\, \vec{x}, \tag{4}$$
where $\vec{x}$ is the vector whose $i$th entry $x_i$ is the displacement of the atom $i$.
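The quadratic form in Eq. (4) is the usual ball-and-spring energy, since $\vec{x}^{T} L \vec{x} = \sum_{(i,j)\in E}(x_i - x_j)^2$ for any graph Laplacian. The following sketch, which assumes NumPy and uses an arbitrary made-up displacement vector, checks this identity and shows that a uniform shift of all atoms costs no energy.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def potential_energy(x, L, k=1.0):
    """V(x) = (k/2) x^T L x, Eq. (4)."""
    return 0.5 * k * (x @ L @ x)

edges = [(0, 1), (1, 2), (2, 3)]        # n-butane skeleton
L = laplacian(4, edges)
x = np.array([0.1, -0.2, 0.05, 0.3])    # arbitrary illustrative displacements

V_quadratic = potential_energy(x, L)
# Same energy written as a sum over springs (bonds), with k = 1.
V_springs = 0.5 * sum((x[i] - x[j]) ** 2 for i, j in edges)
# Moving every atom by the same amount stretches no spring.
V_translation = potential_energy(np.full(4, 0.7), L)
print(V_quadratic, V_springs, V_translation)
```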
Now we suppose that the molecule is immersed in a thermal bath of inverse temperature $\beta = 1/k_B T$, where $k_B$ is the Boltzmann constant. Then the probability distribution of the displacement of the nodes is given by the Boltzmann distribution
$$P(\vec{x}) = \frac{e^{-\beta V(\vec{x})}}{Z} = \frac{1}{Z}\exp\left(-\frac{\beta k}{2}\,\vec{x}^{T} L\, \vec{x}\right), \tag{5}$$
where the normalisation factor $Z$ is the partition function of the molecule
$$Z \equiv \int d\vec{x}\, \exp\left(-\frac{\beta k}{2}\,\vec{x}^{T} L\, \vec{x}\right). \tag{6}$$
The mean displacement of an atom $i$ can be expressed by
$$\Delta x_i \equiv \sqrt{\langle x_i^2 \rangle} = \sqrt{\int x_i^2\, P(\vec{x})\, d\vec{x}}. \tag{7}$$
We can calculate this quantity once we can diagonalise the Laplacian matrix $L$. Let us denote by $U$ the matrix whose columns are the orthonormal eigenvectors $\vec{\psi}_\mu$ and by $\Lambda$ the diagonal matrix of eigenvalues $\lambda_\mu$ of the Laplacian matrix. Note here that the eigenvalues of the Laplacian of a molecular graph are positive except for one zero eigenvalue. We then write the Laplacian spectrum as $0 = \lambda_1 < \lambda_2 \le \cdots \le \lambda_n$. An important observation here is that the zero eigenvalue does not contribute to the vibrational energy. This is because the mode $\mu = 1$ is the mode where all the atoms (balls) move coherently in the same direction and thereby the whole molecule moves in one direction. In other words, this is the motion of the centre of mass, not a vibration.
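In calculating Eqs. (5) and (6), we change variables to the normal modes, $\vec{x} = U\vec{y}$, and the integration measure is transformed as
$$d\vec{x} \equiv \prod_{i=1}^{n} dx_i = \left|\det U\right| \prod_{i=1}^{n} dy_i = d\vec{y}, \tag{8}$$
because the determinant of the orthogonal matrix, $\det U$, is $\pm 1$. Then we have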
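The spectral facts used in this step can be checked on a small example. The sketch below assumes NumPy and takes the 2-methylbutane skeleton as an arbitrary test graph; `numpy.linalg.eigh` returns the eigenvalues in ascending order together with orthonormal eigenvectors as columns, which play the role of $U$.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

n = 5
edges = [(0, 1), (1, 2), (2, 3), (1, 4)]   # 2-methylbutane skeleton
L = laplacian(n, edges)

eigvals, U = np.linalg.eigh(L)   # ascending eigenvalues, orthonormal columns
zero_mode = U[:, 0]              # eigenvector of lambda_1 = 0
abs_det_U = abs(np.linalg.det(U))

print(eigvals)     # one (numerically) zero eigenvalue, the rest positive
print(zero_mode)   # all entries equal: a rigid shift of the whole molecule
print(abs_det_U)   # |det U| = 1, so the integration measure is unchanged
```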
$$Z = \int d\vec{y}\, \exp\left(-\frac{\beta k}{2}\,\vec{y}^{T} \Lambda\, \vec{y}\right) = \prod_{\mu=1}^{n} \int_{-\infty}^{+\infty} dy_\mu\, \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right). \tag{9}$$
Note again that because $\lambda_1 = 0$ the contribution from this eigenvalue obviously diverges. This is because nothing stops the whole molecule from moving coherently in one direction. When we are interested in the vibrational excitation energy within the network, we should offset the motion of the centre of mass and focus on the relative motion of the nodes. We therefore redefine the partition function by removing the first component $\mu = 1$ from the last product. We thereby have
$$\tilde{Z} = \prod_{\mu=2}^{n} \int_{-\infty}^{+\infty} dy_\mu\, \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right) = \prod_{\mu=2}^{n} \sqrt{\frac{2\pi}{\beta k \lambda_\mu}}. \tag{10}$$
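Next we calculate the mean displacement $\Delta x_i$ defined by Eq. (7). We first compute the numerator of the right-hand side of Eq. (7) as follows:
$$\begin{aligned}
I_i &= \int d\vec{x}\, x_i^2 \exp\left(-\frac{\beta k}{2}\,\vec{x}^{T} L\, \vec{x}\right) = \int d\vec{y}\, \left(U\vec{y}\right)_i^2 \exp\left(-\frac{\beta k}{2}\,\vec{y}^{T} \Lambda\, \vec{y}\right) \\
&= \int d\vec{y} \left(\sum_{\nu=1}^{n} U_{i\nu} y_\nu\right)^{2} \exp\left(-\frac{\beta k}{2}\,\vec{y}^{T} \Lambda\, \vec{y}\right) \\
&= \int d\vec{y} \left(\sum_{\nu=1}^{n}\sum_{\gamma=1}^{n} U_{i\nu} U_{i\gamma}\, y_\nu y_\gamma\right) \prod_{\mu=1}^{n} \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right).
\end{aligned} \tag{11}$$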
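On the right-hand side, any term with $\nu \neq \gamma$ vanishes after integration because the integrand is an odd function with respect to $y_\nu$ and $y_\gamma$. Only the terms with $\nu = \gamma$ can give a finite result. We therefore have
$$\begin{aligned}
I_i &= \int d\vec{y} \left[\sum_{\nu=1}^{n} \left(U_{i\nu} y_\nu\right)^2\right] \prod_{\mu=1}^{n} \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right) \\
&= \int_{-\infty}^{+\infty} dy_1 \left(U_{i1} y_1\right)^2 \times \prod_{\mu=2}^{n} \int_{-\infty}^{+\infty} dy_\mu\, \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right) \\
&\quad + \sum_{\nu=2}^{n} \int_{-\infty}^{+\infty} dy_\nu \left(U_{i\nu} y_\nu\right)^2 \exp\left(-\frac{\beta k}{2}\lambda_\nu y_\nu^2\right) \times \prod_{\substack{\mu=1 \\ \mu \neq \nu}}^{n} \int_{-\infty}^{+\infty} dy_\mu\, \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right),
\end{aligned} \tag{12}$$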
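where we separated the contribution from the zero eigenvalue from those of the other eigenvalues. Because of the divergence introduced by the zero eigenvalue, we proceed by redefining the quantity $I_i$ with the zero mode removed:
$$\begin{aligned}
\tilde{I}_i &\equiv \sum_{\nu=2}^{n} \int_{-\infty}^{+\infty} dy_\nu \left(U_{i\nu} y_\nu\right)^2 \exp\left(-\frac{\beta k}{2}\lambda_\nu y_\nu^2\right) \times \prod_{\substack{\mu=2 \\ \mu \neq \nu}}^{n} \int_{-\infty}^{+\infty} dy_\mu\, \exp\left(-\frac{\beta k}{2}\lambda_\mu y_\mu^2\right) \\
&= \sum_{\nu=2}^{n} \frac{U_{i\nu}^2}{2} \sqrt{\frac{8\pi}{\left(\beta k \lambda_\nu\right)^3}} \times \prod_{\substack{\mu=2 \\ \mu \neq \nu}}^{n} \sqrt{\frac{2\pi}{\beta k \lambda_\mu}} = \tilde{Z} \times \sum_{\nu=2}^{n} \frac{U_{i\nu}^2}{\beta k \lambda_\nu}.
\end{aligned} \tag{13}$$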
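The two one-dimensional Gaussian integrals used in Eqs. (10) and (13) can be verified numerically. The sketch below is only a sanity check: it uses a plain midpoint rule on a truncated interval, and the value `a = 1.7` is an arbitrary stand-in for the product $\beta k \lambda_\nu$.

```python
import math

def gaussian_moments(a, half_width=12.0, steps=200_000):
    """Midpoint-rule estimates of the integrals of exp(-a y^2 / 2) and
    y^2 exp(-a y^2 / 2) over the real line (truncated to +/- half_width)."""
    h = 2.0 * half_width / steps
    m0 = m2 = 0.0
    for s in range(steps):
        y = -half_width + (s + 0.5) * h
        w = math.exp(-0.5 * a * y * y)
        m0 += w * h
        m2 += y * y * w * h
    return m0, m2

a = 1.7  # stands for beta * k * lambda_nu; any positive value works
m0, m2 = gaussian_moments(a)

print(m0, math.sqrt(2.0 * math.pi / a))             # Eq. (10), one factor
print(m2, 0.5 * math.sqrt(8.0 * math.pi / a ** 3))  # second moment in Eq. (13)
print(m2 / m0, 1.0 / a)                             # ratio that survives in Eq. (13)
```

The ratio of the two integrals is $1/(\beta k \lambda_\nu)$, which is why each mode contributes $U_{i\nu}^2/(\beta k \lambda_\nu)$ once the common factor $\tilde{Z}$ is pulled out.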
Fig. 1. Linear correlation between the topological atomic displacements and experimental B-factors for carbon atoms of anthracene (empty circles) and pyrene (empty squares). The temperature factors of equivalent atoms were averaged.
We therefore arrive at the following expression for the mean displacement of an atom:
$$\Delta x_i \equiv \sqrt{\langle x_i^2 \rangle} = \sqrt{\frac{\tilde{I}_i}{\tilde{Z}}} = \sqrt{\sum_{\nu=2}^{n} \frac{U_{i\nu}^2}{\beta k \lambda_\nu}}. \tag{14}$$
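Equation (14) can also be checked by direct sampling. The sketch below assumes NumPy, sets $\beta k = 1$, and draws each non-zero normal mode $y_\mu$ from a Gaussian of variance $1/(\beta k \lambda_\mu)$, which is what the Boltzmann weight with the zero mode removed prescribes. The test graph, the sample size and the random seed are arbitrary choices for this illustration.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

n = 5
edges = [(0, 1), (1, 2), (2, 3), (1, 4)]   # 2-methylbutane skeleton
beta_k = 1.0
lam, U = np.linalg.eigh(laplacian(n, edges))

# Closed form, Eq. (14): sum over the non-zero modes nu = 2..n.
dx_formula = np.sqrt((U[:, 1:] ** 2 / (beta_k * lam[1:])).sum(axis=1))

# Sampling: independent Gaussian normal modes, then x = U y (zero mode dropped).
rng = np.random.default_rng(0)
sigma = 1.0 / np.sqrt(beta_k * lam[1:])
y = rng.standard_normal((200_000, n - 1)) * sigma
x = y @ U[:, 1:].T
dx_sampled = np.sqrt((x ** 2).mean(axis=0))

print(dx_formula)
print(dx_sampled)   # agrees with the closed form up to sampling noise
```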
If we designate by $L^{+}$ the Moore–Penrose generalised inverse of the graph Laplacian [8], which has been proved to exist for any molecular graph, then it is straightforward to realise that
$$\left(\Delta x_i\right)^2 = \frac{1}{\beta k}\left(L^{+}\right)_{ii}. \tag{15}$$
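A short numerical comparison of Eqs. (14) and (15) is sketched below, assuming NumPy. The function names are chosen for this example, and the test graph (a five-membered ring with one substituent, i.e. a methylcyclopentane-like skeleton) is an arbitrary illustration.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def displacements_spectral(L, beta_k=1.0, tol=1e-10):
    """Eq. (14): sum over eigenmodes with non-zero eigenvalue."""
    lam, U = np.linalg.eigh(L)
    keep = lam > tol                       # drop the centre-of-mass mode
    return np.sqrt((U[:, keep] ** 2 / (beta_k * lam[keep])).sum(axis=1))

def displacements_pinv(L, beta_k=1.0):
    """Eq. (15): diagonal of the Moore-Penrose pseudoinverse."""
    return np.sqrt(np.diag(np.linalg.pinv(L)) / beta_k)

# Ring atoms 0..4, substituent 5 attached to ring atom 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 5)]
L = laplacian(6, edges)

dx_spectral = displacements_spectral(L)
dx_pinv = displacements_pinv(L)
print(dx_spectral)
print(dx_pinv)   # same values; the pendant atom 5 moves the most
```

In this example the substituted ring atom has the smallest displacement and the pendant atom the largest.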
4. Kirchhoff and Wiener indices revisited
From now on we consider the case $\beta k \equiv 1$ for the sake of simplicity. Then, it is easy to see that due to the orthonormality of the eigenvectors of the inverse Laplacian, we have
$$\sum_{i=1}^{n} \left(\Delta x_i\right)^2 = \operatorname{tr} L^{+} = \sum_{j=2}^{n} \frac{1}{\lambda_j} = \frac{1}{n}\,\mathrm{Kf}(G). \tag{16}$$
That is, the Kirchhoff index of a molecular graph is simply the sum of the squared atomic displacements produced by small molecular vibrations multiplied by the number of atoms in the molecule. Since it is well known that for acyclic molecules, i.e., molecular trees, the Kirchhoff and Wiener indices coincide, we also have
$$W(T) = n \sum_{i=1}^{n} \left(\Delta x_i\right)^2. \tag{17}$$
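Equations (16) and (17) are easy to confirm numerically. The sketch below assumes NumPy and $\beta k = 1$; the six-membered ring and the four-atom chain are arbitrary test graphs, and the variable names are chosen for this illustration.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

# Six-membered ring (benzene skeleton).
n = 6
ring = [(i, (i + 1) % n) for i in range(n)]
L = laplacian(n, ring)
Lp = np.linalg.pinv(L)

sum_sq_dx = np.diag(Lp).sum()                  # sum_i (dx_i)^2 = tr L+
lam = np.linalg.eigvalsh(L)
inv_eig_sum = (1.0 / lam[1:]).sum()            # sum over non-zero eigenvalues
d = np.diag(Lp)
R = d[:, None] + d[None, :] - 2.0 * Lp         # resistance distances, Eq. (3)
Kf = R[np.triu_indices(n, k=1)].sum()          # Kirchhoff index, Eq. (2)
print(sum_sq_dx, inv_eig_sum, Kf / n)          # three equal numbers, Eq. (16)

# Four-atom chain (a tree): W = 1+2+3+1+2+1 = 10 by direct count.
chain = [(0, 1), (1, 2), (2, 3)]
W_from_dx = 4 * np.trace(np.linalg.pinv(laplacian(4, chain)))
print(W_from_dx)                               # Eq. (17)
```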
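Then the potential energy (4) can be expressed as
$$V(\vec{x}) = \frac{1}{2}\sum_{i=1}^{n} k_i \left(\Delta x_i\right)^2 - \sum_{(i,j)\in E} \left(\Delta x_i\right)\left(\Delta x_j\right),$$
where $k_i$ is the degree of node $i$.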
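Let $R_i = \sum_{j=1}^{n} r_{ij}$ be the sum of all resistance distances from atom $i$ to any atom in the molecule, i.e., the sum of the $i$th row (or column) of the resistance distance matrix. That is, $R_i = \sum_{j=1}^{n} \left(L^{+}_{ii} + L^{+}_{jj} - 2L^{+}_{ij}\right)$. It is known that $\sum_{j=1}^{n} L^{+}_{ij} = 0$. Then
$$R_i = n L^{+}_{ii} + \operatorname{tr} L^{+} = n\left(\Delta x_i\right)^2 + \sum_{j=1}^{n} \left(\Delta x_j\right)^2. \tag{18}$$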
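The row-sum identity behind Eq. (18) can be checked as follows. This is a sketch assuming NumPy and $\beta k = 1$; the graph, a four-membered ring carrying a two-atom side chain, is an arbitrary example.

```python
import numpy as np

def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected graph on nodes 0..n-1."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (4, 5)]  # ring + side chain
Lp = np.linalg.pinv(laplacian(n, edges))

d = np.diag(Lp)                                  # (dx_i)^2 for beta*k = 1
row_sums = Lp.sum(axis=1)                        # each row of L+ sums to zero
R = (d[:, None] + d[None, :] - 2.0 * Lp).sum(axis=1)   # R_i = sum_j r_ij
rhs = n * d + np.trace(Lp)                       # right-hand side of Eq. (18)

print(row_sums)
print(R)
print(rhs)   # identical to R, so R_i is a linear function of (dx_i)^2
```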
This relation indicates that $(\Delta x_i)^2$ and $R_i$ are linearly related for the atoms of a given molecule. Using Eq. (18) we can express the potential energy (4) in terms of the resistance distance of the atoms in the molecular graph
$$V(\vec{x}) = \frac{1}{2n}\sum_{i=1}^{n} k_i R_i - \frac{1}{n}\sum_{(i,j)\in E} \left[R_i R_j - \frac{\mathrm{Kf}}{n}\left(R_i + R_j\right) + \frac{1}{n^2}\,\mathrm{Kf}^2\right]^{1/2} - \frac{\mathrm{Kf}}{2n}\langle k \rangle, \tag{19}$$
where $\langle k \rangle$ is the average degree of the molecular graph. The first term on the right-hand side of Eq. (19) was already introduced by Estrada et al. [10] as a topological index obtained from the quadratic form $\langle v | D | u \rangle$, where $v$ is a vector of node degrees, $D$ is the distance matrix and $u$ is an all-one vector.
In summary, the normalised Kirchhoff index of a molecular graph represents the sum of squared displacements of atoms due to molecular vibrations, and the sum of resistance distances for a given atom depends linearly on the square of the displacement of the corresponding atom. The term $(\Delta x_i)^2$ has a very clear physical interpretation: it represents the atomic displacement due to molecular vibrations. Small values of $(\Delta x_i)^2$ indicate that those atoms are very rigid in the molecule. For instance, in 2,2,3-trimethylbutane the smallest displacement is obtained for the carbon atom connected to three methyl groups, $\Delta x_{\mathrm{C}} = 0.534$, followed by the one bonded to two CH3 groups, $\Delta x_{\mathrm{CH}} = 0.655$. The methyl groups display the largest displacements, $\Delta x_{\mathrm{CH_3}} = 1.000$ for those at position 2 and $\Delta x_{\mathrm{CH_3}} = 1.069$ for those at position 3.
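The 2,2,3-trimethylbutane values can be reproduced from the hydrogen-depleted graph. The sketch below assumes NumPy and $\beta k = 1$; the node numbering is a choice made for this example.

```python
import numpy as np

# Hydrogen-depleted graph of 2,2,3-trimethylbutane (numbering chosen here):
# 0 = C2 (bonded to three CH3 and to C3), 1 = C3 (bonded to C2 and two CH3),
# 2, 3, 4 = methyl groups on C2, 5, 6 = methyl groups on C3.
n = 7
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6)]

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# Eq. (15) with beta*k = 1: (dx_i)^2 is the i-th diagonal entry of L+.
dx = np.sqrt(np.diag(np.linalg.pinv(L)))
for label, value in zip(["C2", "C3", "CH3@2", "CH3@2", "CH3@2", "CH3@3", "CH3@3"], dx):
    print(f"{label:6s} {value:.4f}")
```

The squared displacements come out as 2/7, 3/7, 1 and 8/7, i.e. displacements of about 0.5345, 0.6547, 1.0000 and 1.0690, in agreement with the values quoted above to within rounding of the last digit.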
5. Topological displacements and temperature factors
We expect the atomic displacement $\Delta x_i$ to display some linear correlation with an experimental measure of how much an atom oscillates or vibrates around its equilibrium position. Such an experimental measure is provided by X-ray experiments as the so-called B-factor or temperature factor, which represents the reduction of coherent scattering of X-rays due to the thermal motion of the atoms. For instance, in the molecule of naphthalene the atomic displacements of carbon atoms correlate very well with the experimental B-factors (in parentheses) [11]: 0.898 (4.6 Å²), 0.815 (4.0 Å²) and 0.615 (3.4 Å²), which gives a correlation coefficient r = 0.963. The following correlation coefficients are obtained for: anthracene [12] (r = 0.996); phenanthrene [13] (r = 0.955); pyrene [14] (r = 0.997); and triphenylene [15] (r = 0.997). In these cases we averaged the values of B-factors for equivalent carbon atoms. In Fig. 1 we plot the values of $\Delta x_i$ versus the B-factors for the carbon atoms of anthracene and pyrene.
Fig. 2. Profiles of the topological atomic displacements (solid line) and the B-factors (dotted line) for lipase B from Candida antarctica, 1tca (top), and illustration of the amino acids having the 20 largest (blue) and 20 smallest (red) values of the topological atomic displacements (bottom). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
The B-factors are quite relevant for the study of protein structures as they contain valuable information about the dynamical behaviour of proteins, and several methods have been designed for their prediction [16]. It is known that regions with large B-factors are usually more flexible and functionally important. The atomic displacements have been used previously by Bahar et al. [17] to describe thermal fluctuations in proteins. We note in passing that we use here a residue network representation of the protein based on β-carbons instead of the α-carbons used by Bahar et al.
For the sake of illustration we have selected here the lipase B from Candida antarctica (1tca) [18]. In this case we obtain a correlation coefficient r = 0.74 between the experimental B-factors and the topological atomic displacements of β-carbons of the amino acids. For this protein Yuan et al. [19] reported r = 0.63 for predicting the experimental B-factors. In Fig. 2 (top) we illustrate the profiles for the normalised B-factors and the topological atomic displacements of residues for this protein. We also represent in Fig. 2 (bottom) the 20 residues with the highest values of $\Delta x_i$ in the molecular structure of the protein. We recall that the residues with the largest values of the atomic displacements are those displaying the highest flexibility in the protein. The atoms in these residues are coloured in blue. We also represent the 20 residues with the lowest values of $\Delta x_i$, which correspond to those displaying the highest rigidity in the protein. They are coloured in red in the molecular structure of the protein. As can be seen, the most flexible amino acids are those on the surface of the protein, while the most rigid ones are concentrated around the protein core.
The new relationship obtained here between the topological atomic displacements and the sum of the resistance distances for a given atom, i.e., the expression (18), opens up new possibilities for interpreting $\Delta x_i$ in a given molecule. According to Eq. (18) the topological displacements for the atoms in a molecule depend only on the sum of the resistance distances for the corresponding atom, i.e., $(\Delta x_i)^2 \sim \frac{1}{n}\sum_j r_{ij}$ [6]. It is known that if there is more than one path connecting two atoms in a molecule, i.e., there are cycles, the resistance distance is smaller than in the case when there is only a single path. Then, if an oscillation/vibration of one atom is transmitted to all the other atoms through the different paths connecting them, the vibration is attenuated along every path. Consequently, a small value of $\Delta x_i$ is due to the fact that the atom $i$ is part of a large number of paths connecting it to other atoms. This implies that when the other atoms oscillate/vibrate their effect is very much attenuated before arriving at $i$.
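The effect of an additional path on the resistance distance, and hence on the displacement, can be seen in a minimal example. The sketch below assumes NumPy and $\beta k = 1$ and compares a four-atom chain with the four-membered ring obtained by bonding its two ends; the helper name `pinv_laplacian` is chosen for this illustration.

```python
import numpy as np

def pinv_laplacian(n, edges):
    """Moore-Penrose pseudoinverse of the graph Laplacian L = D - A."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.linalg.pinv(np.diag(A.sum(axis=1)) - A)

chain = [(0, 1), (1, 2), (2, 3)]
ring = chain + [(3, 0)]            # closing the ring adds a second path 0..3

Lp_chain = pinv_laplacian(4, chain)
Lp_ring = pinv_laplacian(4, ring)

# Resistance distance between atoms 0 and 3, Eq. (3).
r_chain = Lp_chain[0, 0] + Lp_chain[3, 3] - 2.0 * Lp_chain[0, 3]
r_ring = Lp_ring[0, 0] + Lp_ring[3, 3] - 2.0 * Lp_ring[0, 3]

# Displacement of atom 0, Eq. (15) with beta*k = 1.
dx_chain_end = np.sqrt(Lp_chain[0, 0])
dx_ring_atom = np.sqrt(Lp_ring[0, 0])

print(r_chain, r_ring)             # 3.0 for the single path, 0.75 in the ring
print(dx_chain_end, dx_ring_atom)  # the ring atom is displaced less
```

In the ring the two paths between atoms 0 and 3 act like resistors of 1 and 3 units in parallel, giving 3/4, and the displacement of atom 0 drops accordingly.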
6. Conclusions
We have developed a theoretical approach based on classical molecular mechanics to account for small displacements of atoms from their equilibrium positions due to oscillations or vibrations. The topological atomic displacements are expressed in terms of the eigenvalues and eigenvectors of the discrete Laplacian matrix of the molecular graph. Using this approach we have given a clear and unambiguous physical interpretation of the Kirchhoff index as well as of the Wiener index of acyclic molecules. It explains previous empirical results showing that these indices are related to the vibrational energy of alkanes and dithioderivative compounds. More importantly, the topological atomic displacements are well correlated with the B-factors obtained by X-ray crystallography.
Acknowledgements
EE acknowledges partial financial support from the New Professor's Fund given by the Principal, University of Strathclyde.
References
[1] J. Devillers, A.T. Balaban (Eds.), Topological Indices and Related Descriptors in QSAR and QSPR, Gordon and Breach, Amsterdam, 1999.
[2] H. Wiener, J. Am. Chem. Soc. 69 (1947) 17.
[3] S. Nikolic, N. Trinajstic, Z. Mihalic, Croat. Chem. Acta 68 (1995) 105.
[4] I. Gutman, T. Körtvélyesi, Z. Naturforsch. 50a (1995) 669.
[5] I. Gutman, I.G. Zenkevich, Z. Naturforsch. 57a (2002) 824.
[6] D.J. Klein, M. Randic, J. Math. Chem. 12 (1993) 81.
[7] A.R. Atilgan, P. Akan, C. Baysal, Biophys. J. 86 (2004) 85.
[8] W. Xiao, I. Gutman, Theor. Chem. Acc. 110 (2003) 284.
[9] R.D. Gregory, Classical Mechanics, Cambridge University Press, 2006.
[10] E. Estrada, L. Rodríguez, A. Gutiérrez, Match. Commun. Math. Comput. Chem. 35 (1997) 145.
[11] D.W.J. Cruickshank, Acta Cryst. 10 (1957) 504.
[12] D.W.J. Cruickshank, Acta Cryst. 9 (1956) 915.
[13] J. Trotter, Acta Cryst. 16 (1963) 605.
[14] A. Cameraman, J. Trotter, Acta Cryst. 18 (1965) 636.
[15] F.R. Ahmed, Acta Cryst. 16 (1963) 503.
[16] R. Soheiliford, D.E. Makarov, G.J. Rodin, Phys. Biol. 5 (2008) 026008.
[17] I. Bahar, A. Rana Atilgan, B. Erman, Fold. Des. 2 (1997) 173.
[18] J. Uppenberg, M.T. Hansen, S. Patkar, T.A. Jones, Structure 2 (1994) 293.
[19] Z. Yuan, T.L. Bailey, R.D. Teasdale, Proteins: Struct., Funct., Bioinf. 58 (2005) 905.