Conformational degeneracy restricts the effective information
content of heparan sulfate†
Timothy R. Rudd and Edwin A. Yates*
Received 10th November 2009, Accepted 18th January 2010
First published as an Advance Article on the web 15th February 2010
DOI: 10.1039/b923519a
The linear, sulfated polysaccharide heparan sulfate occupies a pivotal position in intercellular
signalling events, interacting with numerous proteins on the cell surface and in the extracellular
matrix. Its complex sequences suggest high potential information content but, despite extensive
efforts, a clear relationship between its substitution pattern and biological activity remains elusive.
This results from technical limitations, compounded by attempts to correlate substitution pattern
directly with activity without considering other conformational factors. For a series of
systematically modified analogues of heparan sulfate, the relationship between substitution
pattern and experimental 13C NMR chemical shifts, which act as reporters of the presence of
conformational change, particularly around the glycosidic linkages, was explored through
chemometric analysis. From analysis of the experimental data it was evident that wide linkage
variation arose from O-sulfation in iduronate and N-sulfation in glucosamine residues, but their
effects were distinct, while 6-O-sulfation had much less impact. Models of saccharide sequences
showed that the maximum spread of variation in glycosidic linkages occurred before maximum
sequence diversity and revealed a highly degenerate system: a fraction of possible sequences is
sufficient to provide diverse backbone conformations to satisfy particular protein binding
requirements. The unique information content potentially available in HS sequences, defined
ultimately by conformation, is vastly inferior to the potential sequence diversity.
1. Introduction
The glycosaminoglycan, heparan sulfate, forms an important
nexus between mammalian cells and their immediate environment,
the extracellular matrix, and is therefore of major biological
and potential medical significance. Heparan sulfate (HS) has
become the focus of considerable interest following its
implication in many diverse biological and medical functions,
ranging from the apparently structure-specific interaction with
antithrombin (AT) and subsequent inhibition of factor Xa, to
the much less specific interaction with thrombin (factor IIa), as
well as to a wide variety of other proteins,1 which include the
Alzheimer’s β-secretase,2,3 fibroblast growth factors and
receptors (FGF/FGFRs),4–7 other growth factors, such as
GDNF,8,9 components of the WNT signalling pathway,10
microbial surface proteins, including the viral coat protein of
herpes simplex virus (HSV),11,12 HIV13–15 and the inhibition
of microbial attachment.16 Many of these examples involve
interactions between HS chains and single proteins, while others
require formation of ternary complexes and might be expected
to exhibit higher levels of structural specificity. However, an
understanding of structure–function relationships has, so far,
proved elusive and remains the subject of
much debate. Only the interactions between AT and a
relatively restricted series of pentasaccharides in heparin17
or that between the HSV coat protein and 3-O-sulfated
structures in HS11 approach the term ‘‘specific’’ in the accepted
biochemical sense.
Some hold the view that specificity is high, but conclude this
on the basis of a set of test compounds of limited sequence
variety. Others speculate that HS activities have little or no
structural specificity and that their properties are due to
general charge density considerations.18 For the vast majority
of HS binding proteins, a broad range of binding and activities
are observed experimentally as the sequence is varied and
School of Biological Sciences, University of Liverpool, Liverpool, UK L69 7ZB. E-mail: eayates@liv.ac.uk; Tel: +44 (0)151-795-4429
† Electronic supplementary information (ESI) available: Table S1. Systematically modified heparins used in the analysis. Table S2. 13C chemical shift assignments (ppm) for 12 chemically modified heparins. Table S3. The relative contribution of the first three principal components (c1 (accounting for 38.8% of the variance in the data), c2 (22.3%) and c3 (18.5%)) to the range of 13C chemical shift values observed at the linkage positions among the 12 modified polysaccharides. Fig. S1. Combinatorial modelling of A-1–I-4 linkages, then I-1–A-4 linkages; linkage variation against number of sulfates, kernel density plots of the linkage variation distributions and variation of the linkage distance at every level of sulfation for 2–16 residues. Table S4. Analysis of the variation in individual glycosidic linkages from combinatorial modelling of all possible oligosaccharide stretches from 2 to 16 residues, linkages A-1, I-4 and I-1, A-4 individually. Fig. S2. Distribution of linkage variation (r.s.s. distances) along all possible sequence combinations for 4 and 6 residues, respectively. Fig. S3. Variability of linkage positions in all possible tetrasaccharide stretches at each linkage position. Table S5. Breakdown of individual linkage variation for all possible sequences of 4 residues. Fig. S4A. Dissimilarities between all stretches of 4 residues running in the same direction. Fig. S4B. Dissimilarities between all stretches of 4 residues running in opposite directions. Fig. S5. Histogram of the dissimilarity values for all stretches of 4 residues and 6 residues running in the same direction and the comparison of forward and backwards measuring dissimilarity along the sequence, rather than overall variation in the linkage. See DOI: 10.1039/b923519a
Mol. BioSyst., 2010, 6, 902–908. This journal is © The Royal Society of Chemistry 2010.
Molecular BioSystems, www.rsc.org/molecularbiosystems
no simple relationship between substitution pattern and
activity usually emerges. The field has, therefore, reached
something of an impasse, with both mechanistic insight and
progress towards exploiting the many medical and pharma-
ceutical applications of HS derivatives being effectively
hampered by the lack of understanding of structure–function
relationships.
Heparan sulfate and its close structural analogue, heparin,
as well as their chemically modified derivatives, share a
common structural backbone, based on a repeating 1,4-linked
disaccharide unit [Scheme 1] comprising a uronic acid (either
β-D-GlcA or α-L-IdoA) and α-D-glucosamine, with varying
patterns of sulfation at position-2 of the uronic acid and/or
position-6 of glucosamine, and with either N-acetyl, N-sulfonamido
(N-sulfate) or free amino groups at position-2 of glucosamine. Other, rarer
sulfations can also occur, most notably 3-O-sulfation in
glucosamine and 2-O-sulfation in GlcA residues. It has been
speculated that HS may have high information content, an
idea that is based on its potentially vast sequence diversity, but
the extent to which this is exploited in nature and, if it is, the
nature of the relationships between substitution pattern,
structure and function, remain important questions. Where
the conformational details have not been specifically pursued
by in-depth studies on small numbers of well-defined oligo-
saccharides, there has been a tendency to treat these molecules
as comprising linear chains with appended charged groups
(there is often an emphasis on sulfates), while the backbone
geometry, although likely to play a major role in defining the
binding and hence activity of HS with proteins, has largely
been overlooked. Indeed, it can be argued that providing the
ability for HS chains to adopt appropriate backbone geometries
is a prerequisite to disposing the charged groups in suitable
geometric arrangements.
A first step towards understanding these molecules and
hence their complex interactions with proteins must be to
delineate the relationship between their substitution patterns
and conformational properties, which may help provide an
explanation for experimentally observed structure–function
relationships. Without doubt, the most informative current
method for the study of conformation in solution is nuclear
magnetic resonance (NMR) spectroscopy and a considerable
body of work has pursued this aim, usually for individual
oligosaccharides8,11,19–25 or heparin-related polysaccharides.26–28
However, there are a huge number of combinations of
substitution patterns for even modestly sized oligosaccharides
and the extent to which this diversity is exploited, or indeed,
the extent to which it needs to be exploited in nature, is not
known. In the present article, general characteristics of this
system, which underlie the relationship of sequence diversity
to information content, are sought and the latter issue is
addressed.
Individual saccharides are very difficult to purify from
natural sources and tackling the problem through studying
the conformation of the many synthetic oligosaccharides that
would be required is currently difficult to envisage for practical
reasons. Here, the nature of the relationship between substitution
(O-sulfation, N-sulfation, N-acetylation) and the observed 13C NMR chemical shift patterns in 12 systematically modified
polysaccharides is examined, employing multivariate statistical
techniques, with the aim of revealing the underlying relationships
between substitution pattern and changes in conformation,
particularly at the glycosidic linkages, which may help account
for the activities of HS saccharides.
2. Results and discussion
The experimental 13C NMR chemical shift data of 12
model polysaccharides have been analysed to determine the
extent to which the ability to vary the substitution pattern at
each of the three main positions of substitution: position-2 of
iduronate (I-2), position-2 of glucosamine (A-2) or position-6
of glucosamine (A-6), either alone, or in combination, affects
the overall degree of structural variability observed, parti-
cularly in the linkages (defined by A-4 to I-1 and I-4 to A-1)
but also at I-3 and I-5, which report conformational changes
in iduronate residues [Table 2]. These seemingly esoteric
questions are of fundamental importance to an understanding
of HS–heparin structure–function relationships because the
relative structural significance, in terms of backbone geometry,
of the three modifications which mimic the biosynthetic steps
(2- or 6-O-sulfation in IdoA or GlcN, or N-sulfation/
N-acetylation in GlcN) can be determined. This will be a
crucial first step in relating structure to activity in a way that
moves beyond simple correlations between sulfation and
activity.
Sulfation at I-2 or A-2 (component 1 or 2) has
significant effects at I-3 and I-5, indicative of conformational
change [Table 1], whereas sulfation at A-6 (component 3) has
negligible effect. Assignment of the 13C NMR spectra28 was
carried out using multi-dimensional homonuclear (1H–1H)
and heteronuclear (1H–13C) NMR [Table S1, ESI†]. Subsequent
principal component analysis identified three principal
components (c1. . .c3) describing 38.8, 22.3 and 18.5% of the
variation (79.6% overall) in the data. It was noteworthy that
c1, c2 and c3 correlated strongly with the substitution state of
Scheme 1 Schematic of general repeating disaccharide structure of
heparan sulfate and modified heparin polysaccharides: [–4) α-L-IdoA 1–4 α-D-glucosamine (1–], where R1 = H or SO3−, R2 = H/COCH3 or
SO3− and R3 = H or SO3−. The α-L-IdoA can be replaced by its C-5
epimer, β-D-GlcA.

Table 1 Influence of modifications on 13C NMR chemical shifts at I-3 and I-5

Position   Component 1   Component 2   Component 3
I-3        0.77          0.60           0.01
I-5        0.77          0.50          −0.06
positions I-2, A-2 and A-6, respectively, and also that
the dependencies of variation of 13C chemical shift values on
the 3 components varied considerably at the linkage positions
[Table 2].
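The kind of analysis described above (mean-centring followed by extraction of principal components) can be sketched as follows. This is only an illustration under stated assumptions: the data matrix is synthetic (random numbers standing in for 12 polysaccharides by 12 carbon positions), the function name `pca_loadings` is hypothetical, and plain NumPy is used in place of the SPSS factor analysis reported by the authors, so the output bears no relation to the published loadings.

```python
import numpy as np

def pca_loadings(shifts, n_components=3):
    """Mean-centre the columns, then return (variance ratios, loadings, scores)."""
    centred = shifts - shifts.mean(axis=0)            # prior mean-centring
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    variance = s ** 2 / (shifts.shape[0] - 1)         # variance along each component
    ratio = variance / variance.sum()                 # fraction of total variance
    # Loadings: component directions scaled by the standard deviation they capture
    loadings = vt[:n_components].T * np.sqrt(variance[:n_components])
    scores = u[:, :n_components] * s[:n_components]   # sample coordinates
    return ratio[:n_components], loadings, scores

rng = np.random.default_rng(0)
# Synthetic stand-in (assumption): three binary "modifications" per sample,
# each shifting the 12 carbon positions by a random amount, plus small noise.
design = rng.integers(0, 2, size=(12, 3)).astype(float)
effects = rng.normal(size=(3, 12))
shifts = 100.0 + design @ effects + 0.05 * rng.normal(size=(12, 12))

ratio, loadings, scores = pca_loadings(shifts)
print("variance explained by first three components:", ratio, ratio.sum())
```

Each row of `loadings` corresponds to one carbon position, so the rows for the linkage carbons would be the quantities compared between components.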
The relative contributions of each component to the
overall variation in chemical shifts at the glycosidic linkage
positions (A-1, I-4, I-1 and A-4) were extracted from the
dataset comprising all the chemical shift values and are
shown in Fig. 1 and broken down in Table 2 to show the
effect at individual linkage positions. This revealed the relative
extent to which the ability to vary substitution pattern
at a given position could be related to variations in
chemical shift values at the four linkage positions, A-1, I-4,
I-1 and A-4.
2.1 Linkages are influenced differentially by the three
substitution positions
Measures of the variation in 13C chemical shift values at
those positions involved in the glycosidic linkages, A-1, I-4, I-1
and A-4, were extracted from the loading plots, which had
themselves been derived using all the chemical shifts in the
molecule [Table S1, ESI†]. Variation at positions A-1 and I-1
depended heavily on modification at A-2 (component 2; c2 in
Table 2) and modification at I-2 (c1) respectively. Variation
at A-4 depended on both modification at A-6 (c3) and
modification at A-2 (c2), while variation at I-4, on the
other hand, depended on c1 (modification at I-2) and c2
(modification at A-2) [Fig. 1A–C]. Variation at none of the
linkage positions depended heavily on the substitution condition
at both I-2 and A-6 (c1 and c3), indicating that modifications
at I-2 and A-6 created variation at the linkage positions
independently of each other. Looked at another way, the
linkages either side of glucosamine were influenced primarily
by substitution at A-2 and A-6, but not I-2 (i.e. by c2 and c3,
but not c1) and those either side of the iduronate residue were
influenced by substitution at I-2 and A-2, but not A-6 (i.e. by
c1 and c2, but not c3). It is noteworthy that c3 (modification
at A-6) had a significant effect only at A-4, and even there the effect
was moderate, while two substitutions, at I-2 or A-2 (c1 and c2),
resulted in effects at I-4.
Table 2 Combinations of components 1, 2 and 3 (representing an ability to modify positions I-2, A-2 and A-6 respectively) and their resulting ‘‘distance’’ (in terms of overall (root sum of squares) changes in chemical shift) away from the unmodified compound (defined as the origin)

Combination of components   Component   A-1 (R1)   I-4 (R2)   I-1 (R3)   A-4 (R4)   r.s.s. ‘‘distance’’ a
c1                          c1          0.05       0.52       0.98       0.02       1.11
c2                          c2          0.87       0.84       0.02       0.19       1.22
c3                          c3          0.04       0.02       0.01       0.51       0.51
c1 + c2                     c1          0.05       0.52       0.98       0.02       1.65
                            c2          0.87       0.84       0.02       0.19
c1 + c3                     c1          0.05       0.52       0.98       0.02       1.22
                            c3          0.04       0.02       0.01       0.51
c2 + c3                     c2          0.87       0.84       0.02       0.19       1.33
                            c3          0.04       0.02       0.01       0.51
c1 + c2 + c3                c1          0.05       0.52       0.98       0.02       1.73
                            c2          0.87       0.84       0.02       0.19
                            c3          0.04       0.02       0.01       0.51
None (origin)               -           0          0          0          0          Origin
r.s.s. variation at each position b   -   3.83     5.51       3.94       2.88       -

a $\left[\sum_{i=1}^{4} r_i^2\right]^{1/2}$. b $\left[\sum r_i^2\right]^{1/2}$.
Fig. 1 Contributions (loading barplots) of the three principal
components to variation at each glycosidic linkage position in the
disaccharide repeating unit of modified heparin derivatives: A.
Component 1, B. Component 2, C. Component 3. [blue star highlights
I-1 or I-4, red stars highlight A-1 or A-4]. D. Overall extent of
variation in the linkage positions bestowed by the ability to modify
substituents at I-2, A-2 and A-6 (correlated with c1, c2 and c3
respectively) alone and in combination (root sum of squares) plotted
as the root sum of squares distance (r.s.s. distance) from starting
structure (to scale). The effects of c1, c2 and c3 alone are shown in
blue, red and green, and correlate with the substitution condition at
I-2, A-2 and A-6 respectively.
2.2 Analysis of the overall effects of single and combined
modifications, and at each type of linkage
Modifications at A-2 caused the most significant changes to
linkage positions, followed by those at I-2 and then, to a lesser
extent, by those at A-6. It was also possible to examine the
combined effects of multiple modifications on the variation in
chemical shift values at each linkage position. Summing
the relative contributions (root sum of squares distance:
$(r_1^2 + r_2^2 + r_3^2 + r_4^2)^{1/2}$) over all linkages represents the
overall ‘‘distance’’, a measure of the variation from an
unmodified starting structure (the origin) that can be generated
by having the capability of making particular single or
combined modifications, whose magnitudes are represented
in Fig. 1D.
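The root sum of squares ‘‘distance’’ can be checked numerically from the contributions listed in Table 2. The sketch below assumes that, for combined modifications, the squares of all contributions of every component involved are summed before the square root is taken; this reading is inferred from the tabulated values, which it reproduces to two decimal places, and the names `CONTRIB` and `rss_distance` are illustrative only.

```python
from itertools import combinations
from math import sqrt

# Contributions of each component at A-1, I-4, I-1 and A-4 (values from Table 2)
CONTRIB = {
    "c1": (0.05, 0.52, 0.98, 0.02),  # ability to modify I-2
    "c2": (0.87, 0.84, 0.02, 0.19),  # ability to modify A-2
    "c3": (0.04, 0.02, 0.01, 0.51),  # ability to modify A-6
}

def rss_distance(components):
    """Root sum of squares over all contributions of the chosen components."""
    return sqrt(sum(r * r for c in components for r in CONTRIB[c]))

# Every single, double and triple combination of components
distances = {
    " + ".join(combo): round(rss_distance(combo), 2)
    for k in range(1, 4)
    for combo in combinations(CONTRIB, k)
}

for name, dist in distances.items():
    print(f"{name:<14}{dist:.2f}")
```

Because the squares add, the distance for a combination is simply the quadrature sum of the single-component distances, which is why adding c3 to c1 + c2 changes the result only slightly.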
The degree of variation at each linkage position [i.e. A-1,
I-4, I-1 and A-4] was analysed, in the case of single and then
combined modifications, by summing the contributions at
each linkage position of the relevant components for each
combination of possible modification: at I-2 (c1 alone), at A-2
(c2 alone), at A-6 (c3 alone), then at I-2 and A-2 (c1 + c2) etc.,
with ‘‘none’’ defining the origin, i.e. no substitution [Table 2].
Similarly, the extent to which variation in the 13C chemical
shift values for each particular linkage [i.e. A-1. . .I-4 or
I-1. . .A-4] was bestowed by an ability to alter substituents at
single or combined positions could also be determined. At
each linkage position, these values were summed [i.e. down the
columns in Table 2] to give the overall variation possible for all
combinations of modifications at each individual linkage
position, I-4 being most strongly affected, and greater variation
was apparent in the A-1. . .I-4 linkage (3.83 and 5.51) than in
I-1. . .A-4 (3.94 and 2.88; values taken from the last row of
Table 2).
The order of significance (i.e. how far the resulting struc-
tures were from the starting structure) was the following: the
substitution state at A-2 was most significant, then I-2 and the
least significant was the substitution state at A-6. The bio-
synthetic enzymes are thought to act in the same order. This
suggests that the ability to modify the backbone geometry
underlies the relationship between sequence and structure in
HS–heparin. It was also interesting that 6-O-sulfation was not
responsible for widespread structural changes in the backbone;
its most significant effect being a moderate influence on
chemical shift values at A-4. In contrast, the modifications at
A-2 and I-2, which occur earlier in the biosynthetic pathway,
have more significant effects on the backbone structure.
Modification at A-2 clearly has widespread effects on linkage
positions, influencing geometry significantly at A-1, I-4 and
moderately so at A-4, while modification at I-2 primarily
influences I-1 and I-4 [Fig. 1A–C and Table 2].
2.3 Implications for heterogeneous sequences
All of the examples above were derived from essentially
homogeneous polysaccharides. The implications for hetero-
geneous sequences were next examined in a sample, in which
randomly distributed chemical modifications involving partial
de-O-sulfation at A-6 and I-2 generated statistical variability
in the structure,28,29 with only A-2 remaining completely
N-acetylated. The principal signals at the anomeric positions
I-1 and A-1 were recorded for combinations of nearest
neighbours, represented in Table 3.
A-1 is insensitive to I-2 modification and I-1 is insensitive to
A-6 modification in their adjacent residues, agreeing with the
findings for the homogeneously modified polysaccharides
(Section 2.2). In addition, despite a mixed population of
adjacent residues, no signals lying outside those values
observed in the homogeneous polysaccharides were evident.
This suggests that the limits of the extent of linkage variation
in heterogeneous sequences are formed by those observed for
the homogeneous polysaccharides.
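The comparison can be made concrete with the shift values of Table 3. The sketch below is a reading of that table, not the authors' own calculation: each pair holds the anomeric 13C shift of the same middle residue with the two alternative neighbouring residues, and the variable and function names are hypothetical. The entry reported as N/A is omitted.

```python
ERROR_PPM = 0.1  # stated error of each reported 13C chemical shift

# Pairs of shifts (ppm) for the same middle residue, neighbouring residue changed
A1_PAIRS = [(96.8, 96.8), (97.1, 97.5), (96.6, 96.7)]    # A-1, neighbour IdoA2S vs IdoA
I1_PAIRS = [(104.3, 104.6), (104.6, 104.8),              # I-1 of IdoA
            (102.3, 102.2), (102.4, 102.2)]              # I-1 of IdoA2S

def deltas(pairs):
    """Absolute shift difference caused by changing the neighbouring residue."""
    return [round(abs(a - b), 1) for a, b in pairs]

def mean(values):
    return sum(values) / len(values)

a1_deltas = deltas(A1_PAIRS)
i1_deltas = deltas(I1_PAIRS)

# For contrast: effect on I-1 of sulfating the iduronate itself (IdoA vs IdoA2S)
ido_unsulfated = mean([s for pair in I1_PAIRS[:2] for s in pair])
ido_sulfated = mean([s for pair in I1_PAIRS[2:] for s in pair])
own_residue_effect = ido_unsulfated - ido_sulfated

print("neighbour effect at A-1 (ppm):", a1_deltas)
print("neighbour effect at I-1 (ppm):", i1_deltas)
print("effect of 2-O-sulfation on I-1 (ppm):", round(own_residue_effect, 1))
```

The neighbour-induced differences are a few tenths of a ppm, comparable to the stated error, whereas sulfation of the iduronate itself moves I-1 by roughly 2 ppm.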
2.4 Evaluating the effects of substitution patterns at the
glycosidic linkages
The analyses reported here are for sodium salts, in which
there is no possibility of inter- or intra-residue bridging by
multivalent cations.30,31 Inter-residue hydrogen bonding does
exist for some derivatives, however, and has been identified
previously in these compounds.27,32 For the present case, if the
total number of possible ways of influencing the linkage
geometry are considered, and taking values from Table 2,
where any significant effect (values ≤0.05 are set to 0 and
>0.05 to 1) at A-1, I-4, I-1 and A-4 is represented as 1 and no
effect as 0, the relationship can be simplified [Table 4] to
reveal the presence (1) or absence (0) of an effect. Summing
the contributions made by the components in the same
Table 3 13C chemical shifts at linkage positions A-1 and I-1 in homogeneous and heterogeneous sequences in a partially modified heparin polysaccharide. In both instances, the middle residue (labelled B) is shown with the adjacent residues either side at A and C. In homogeneous sequences, the residues at A and C are identical, while in heterogeneous sequences, they are different. The chemical shifts of A-1 (a) and I-1 (b) do not differ significantly in cases when (at A and C) adjacent residues differ compared to those of the homogeneous cases (when A = C). This indicates that the effects of the neighbouring groups (towards the non-reducing end direction) are not great on the linkage positions and no unexpected shifts in the residue at B occur

(a) I2X–A6XNAc(A-1)–I2X: A-1 of IdoA–GlcN–IdoA

A          B            C          A-1 δ 13C (ppm)   Δ (ppm)
IdoA2S     –GlcNAc      –IdoA2S    96.8              0.0
IdoA                               96.8
IdoA       –GlcNAc      –IdoA      97.1              0.4
IdoA2S                             97.5
IdoA2S     –GlcNAc6S    –IdoA2S    96.6              0.1
IdoA                               96.7
IdoA2S     –GlcNAc6S    –IdoA      N/A               N/A
IdoA                               97.1
0.3 ΔGlcNAc ± 6S

(b) A6XNAc–I2X(I-1)–A6XNAc: I-1 of GlcN–IdoA–GlcN

A          B            C            I-1 δ 13C (ppm)   Δ (ppm)
GlcNAc     –IdoA        –GlcNAc      104.3             0.3
GlcNAc6S                             104.6
GlcNAc     –IdoA        –GlcNAc6S    104.6             0.2
GlcNAc6S                             104.8
GlcNAc     –IdoA2S      –GlcNAc      102.3             0.1
GlcNAc6S                             102.2
GlcNAc     –IdoA2S      –GlcNAc6S    102.4             0.2
GlcNAc6S                             102.2
0.1 ΔIdoA/IdoA2S

X denotes the possible substitutions: IdoA2S/IdoA2OH and
GlcNAc6S/GlcNAc6OH. The error in the reported 13C chemical shift
values is ± 0.1 ppm.
combinations as above [Table 2] (but using binary addition,
akin to Boolean OR logic [disjunction, x ∨ y], so that one or
more effects are registered in the same way, indicated by 1: i.e.
0 + 0 = 0, 1 + 0 = 1, 0 + 1 = 1, 1 + 1 = 1) simplifies the
table and permits the existence of change at each glycosidic
linkage to be recorded. The possible changes at the four
linkage positions, A-1, I-4, I-1 and A-4, were simplified to
two linkages: L1 and L2 [Table 4] shown in the right hand
columns; L1 representing linkage A-1. . .I-4 and L2 representing
I-1. . .A-4.
This shows that certain combinations of substitutions
altered the linkages, L1 and L2, in distinct ways, but also that
there was considerable redundancy, with six unique ways of
simultaneously modifying both glycosidic linkages. Notably,
the 8 combinations failed to generate all 4 possible patterns of
variation in linkage positions (i.e. 0 0, 1 0, 0 1 and 1 1). Note
that c2 and c2 + c3, and also c1 + c2 and c1 + c2 + c3, gave
the same result (1101 and 1111 respectively, both = 1 1), and no
combination of substitutions was able to achieve the pattern
(1 0), i.e. to influence L1 (A-1–I-4) alone.
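The reduction to presence or absence of an effect can be reproduced in a few lines. The sketch below applies the 0.05 threshold to the Table 2 contributions, combines components with Boolean OR and collapses the four positions into the two linkages; the helper names are illustrative and not taken from the original analysis.

```python
from itertools import combinations

# Contributions at (A-1, I-4, I-1, A-4) for each component, from Table 2
CONTRIB = {
    "c1": (0.05, 0.52, 0.98, 0.02),
    "c2": (0.87, 0.84, 0.02, 0.19),
    "c3": (0.04, 0.02, 0.01, 0.51),
}
THRESHOLD = 0.05  # values <= 0.05 are treated as "no effect"

def binarise(values):
    """1 where the contribution exceeds the threshold, otherwise 0."""
    return tuple(int(v > THRESHOLD) for v in values)

def combine(components):
    """Boolean OR of the binarised effects of all chosen components."""
    pattern = (0, 0, 0, 0)
    for c in components:
        pattern = tuple(p | b for p, b in zip(pattern, binarise(CONTRIB[c])))
    return pattern

def linkages(pattern):
    """Collapse four positions into L1 (A-1...I-4) and L2 (I-1...A-4)."""
    a1, i4, i1, a4 = pattern
    return (a1 | i4, i1 | a4)

combos = [()] + [c for k in range(1, 4) for c in combinations(CONTRIB, k)]
table = {(" + ".join(c) or "None"): combine(c) for c in combos}
unique_positions = set(table.values())
linkage_patterns = {linkages(p) for p in table.values()}

for name, pattern in table.items():
    print(f"{name:<14}{pattern}  L1, L2 = {linkages(pattern)}")
print("distinct position patterns:", len(unique_positions))
print("missing linkage patterns:", {(0, 0), (0, 1), (1, 0), (1, 1)} - linkage_patterns)
```

Every component that touches A-1 or I-4 also touches I-1 or A-4 once the threshold is applied, which is why the pattern affecting L1 alone never appears.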
2.5 Consequences of the limited influence of substitution
pattern on linkage variability in all possible sequences
with 2 to 16 residues
Following the result in Section 2.4, all possible combinations
of pairwise sequences from 2 to 16 residues were modelled,
calculating for each sequence combination the overall
variation in linkage geometry [as a ‘‘distance’’ from the origin
i.e., from an unmodified structure of the same length] which
was plotted against the number of sulfate groups it contained
[Fig. 2A]. It is important to note that these models represent
segments of this length as if they were in a polysaccharide
chain, not as free oligosaccharides. The results were also
arrayed as plots of frequency, or density, (i.e. the number of
structures with a given ‘‘distance’’) [Fig. 2B] and the range of
variability for each level of sulfation was determined [Fig. 2C].
Shorter sequences had a rather uneven ‘‘distance’’ distribution
(left-skewed), but this became smoother with increasing chain
length [Fig. 2B]. Furthermore, it was interesting to note
that the widest spread of structural variation (i.e. spread of
‘‘distance’’) occurred at moderate sulfation levels and always
before the maximum sequence diversity [Fig. 2C]. This was a
consequence of the modest influence of the modification at A-6
on the linkage geometry and shows that considerable linkage
variability can already be attained with moderate sulfation
levels and only 2 types of substitution: at A-2 and I-2. The
analysis also illustrated the huge degeneracy present in the
system. The areas in the centre of the plots were occupied by
large numbers of structures for longer oligosaccharides
[Fig. 2A] (for stretches of 16 residues, there are ~10^6 sequences
with 12 sulfates that lie within ±1% of the average r.s.s.
distance). It is also interesting to note that the two linkages
(A-1. . .I-4 and I-1. . .A-4) have different levels of variation,
with that at A-1. . .I-4 being greater than that at I-1. . .A-4 [ESI,
Fig. S1]. The implication from these results is that only a
modest subset of all possible structures needs to be synthesised
to generate substantial variation in linkage geometry; the
variability is ‘uneven’ within the structure, but the system is,
in terms of overall linkage variation, highly degenerate.
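A toy version of this combinatorial modelling illustrates why the spread of ‘‘distance’’ peaks before sequence diversity does. The sketch makes a strong simplifying assumption that is not taken from the paper: each sulfate at I-2, A-2 or A-6 adds its Table 2 contributions in quadrature, independently of its position in the chain. The function names are hypothetical, and only short stretches are enumerated to keep the run time small.

```python
from itertools import product
from math import sqrt

# Squared r.s.s. contribution of one modification, summed over A-1, I-4, I-1, A-4
SQ = {
    "I-2": 0.05**2 + 0.52**2 + 0.98**2 + 0.02**2,
    "A-2": 0.87**2 + 0.84**2 + 0.02**2 + 0.19**2,
    "A-6": 0.04**2 + 0.02**2 + 0.01**2 + 0.51**2,
}
SITES = tuple(SQ)  # three modifiable sites per disaccharide

def n_sequences(n_residues):
    """Number of sequences: two states at each of three sites per disaccharide."""
    return 2 ** (len(SITES) * n_residues // 2)

def enumerate_sequences(n_residues):
    """Yield (number of sulfates, r.s.s. distance) for every possible sequence."""
    n_sites = (n_residues // 2) * len(SITES)
    for pattern in product((0, 1), repeat=n_sites):
        d2 = sum(SQ[SITES[i % len(SITES)]] for i, on in enumerate(pattern) if on)
        yield sum(pattern), sqrt(d2)

def spread_by_sulfation(n_residues):
    """Map sulfate count -> (number of sequences, Dmax - Dmin)."""
    groups = {}
    for n_sulfates, dist in enumerate_sequences(n_residues):
        groups.setdefault(n_sulfates, []).append(dist)
    return {k: (len(v), max(v) - min(v)) for k, v in sorted(groups.items())}

for n_sulfates, (count, spread) in spread_by_sulfation(4).items():
    print(f"{n_sulfates} sulfates: {count:>3} sequences, spread {spread:.2f}")
```

In this toy model the widest spread for four residues falls at two sulfates, below the three-sulfate level at which the number of sequences is greatest, because the A-6 site adds sequences without adding much distance.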
3. Experimental
The approach presented here aims to deduce the location
and severity of changes in environment at positions in the
repeating structure by analysis of the 13C NMR chemical shift
values for a series of modified heparin polysaccharides, acting
as models of HS. Experimental 13C NMR chemical shift
values were measured for a library of chemically modified
heparin polysaccharides, analysed by factor analysis, with
prior mean-centering and factors were extracted through
principal components. The loadings were derived from the
analyses reporting the effective change although known, in
chemical shift when a modification was made at specific
positions in the molecules; the modifications causing the
change could be identified by the component regression scores.
A combinatorial approach was taken to model a number of
different length sequences by combining the loadings derived
from the previous analysis of the experimental 13C NMR
chemical shift data. The overall effect on the linkages was
investigated by calculating the root sum squared contributions
from all linkage positions for all possible disaccharides
[8 possibilities] to hexadecasaccharides [16 777 216 possibilities].
This reports the overall effect on the chemical shifts in
the molecule due to the particular modification. For the
combinatorial modelling of all possible (even) sequences from
2–16 residues, the overall (i.e. combined for all linkages in each
sequence) root sum of squares variation from the origin (using
combinations of principal components) was calculated for
each possible sequence. These values were plotted [Fig. 2] as:
A. root sum of squares distances from the origin, B. kernel
density, i.e. number of sequences, with particular distances
from the origin and C. variation in ‘‘distance from the
origin’’ (difference between maximum and minimum ‘‘distance’’,
Dmax − Dmin) for populations of sequences with particular
numbers of sulfate groups representing the overall range of
structural variation within that population. The same procedure
was also conducted for each type of linkage, either I-A or A-I,
and the results are shown in Fig. S1, ESI†. The root sum
squares were calculated for variation in 13C chemical shift
values for the specific linkage positions along the sequence, to
assess the contributions to linkage variation in both forward
Table 4 The effect of combinations of components (representing the ability to modify substituents at A-2, I-2 and A-6) at the four positions involved in the glycosidic linkage and in a reduced form addressing the presence (1) or absence (0) of changes in the glycosidic linkages L1 (A-1. . .I-4) and L2 (I-1. . .A-4). Single or multiple effects are recorded as 1, no effect as 0

                Linkage positions          Linkages
                A-1   I-4   I-1   A-4      L1   L2
None            0     0     0     0        0    0
c1              0     1     1     0        1    1
c2              1     1     0     1        1    1
c3              0     0     0     1        0    1
c1 + c2         1     1     1     1        1    1
c1 + c3         0     1     1     1        1    1
c2 + c3         1     1     0     1        1    1
c1 + c2 + c3    1     1     1     1        1    1
and reverse directions. These were given equal weight and
averaged along the chain to report the overall linkage
variation. This was calculated for all possible sequences of
4 [64 possibilities] and 6 sugar residues [512 possibilities]
[Fig. S2, ESI†].
Factor analysis using factors extracted by principal
components was performed using SPSS [SPSS UK Ltd.,
Woking, United Kingdom]. All other analysis and data
manipulations were performed using Microsoft Excel (Office
2007) [data manipulation], SPSS [distance correlation matrix]
and R [R Development Core Team (2009). R: A language and
environment for statistical computing, R Foundation for
Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0,
URL http://www.R-project.org] [histograms, large dataset
plotting, kernel density plot]. Other figures were produced
using ChemDraw Ultra 11 [CambridgeSoft, Cambridge, UK]
and SigmaPlot 11 [Systat Software UK Limited,
Hounslow, UK].
4. Conclusion
The variation in linkage geometry of heparin and HS saccharides
is dependent on substitution pattern and this provides a subtle
link between sulfation pattern and the observed biological
activities. The conformational consequences are distinct at the
two linkages [A-1. . .I-4 and I-1. . .A-4] and at each type
of glycosidic linkage position [A-1, I-4, I-1 and A-4]. The
modifications which occur earlier in the biosynthetic pathway
(at A-2 and I-2) have the greatest effect on the linkages, while
6-O-sulfation has least apparent influence on backbone and
overall conformation.
The effect on linkage variation in short sequences arising
from an ability to modify I-2, A-2 and/or A-6 with sulfate
groups, or not, is predicted to result in uneven linkage
variation. The overall effect of this on the linkage ‘‘distance’’
becomes obscured for longer stretches by virtue of the vast
number of possible sequence combinations and their huge
apparent degeneracy, although when structural variety at each
linkage position along the chains is considered, the degeneracy
is reduced (<20%) [Fig. S5, ESI†] but is still very significant
[~3 × 10^6 possible sequences for hexadecasaccharides].
Given the absence of strict sequence specificity for most
HS–protein interactions, this argues for a considerable degree
of conformational degeneracy in linkages as a function of
substitution pattern and implies either that only a relatively
small subset of structures may be required in nature to satisfy
the various geometric requirements for HS–protein binding, or
that there may be many alternative sequences, any of which
will suffice in a given situation. In other words, the unique
information content potentially available in HS sequences,
defined ultimately by conformation, is vastly inferior to the
potential sequence diversity. The results also suggest that the
definition of specificity for these compounds needs to be stated
carefully and should include reference to both sequence and
overall chain and linkage conformation. The level of
degeneracy apparently present within the system implies
that several HS sequences should satisfy particular binding
requirements and has implications for the design of GAG
mimetics. It should be emphasised that, for valid structure–
function conclusions to be drawn employing an approach
based on screening libraries for activities, a sufficiently wide
range of structures (to cover geometric variety sufficiently, not
simply to occupy degenerate sequences) first needs to be
screened.29 Those structure–activity relationships which are
derived from experiments employing, for example, structurally
limited naturally occurring HS from a particular source, may
not contain sufficient geometric diversity to allow rigorous
conclusions to be drawn concerning issues of structural variety
and specificity more widely. HS domains should be viewed in
terms of their conformational characteristics and not simply
the type of sulfation pattern.
Fig. 2 Combinatorial modelling of overall linkage variation for all
possible oligosaccharides (pairwise from 2 to 16 residues): A. Combined
linkage 'r.s.s. distances' against number of sulfates for the
linkages A-1, I-1 and I-1, A-4 combined. B. Kernel density plots of
the linkage distance distributions in A. C. Total variation (maximum
r.s.s. distance, Dmax − minimum r.s.s. distance, Dmin) of the combined
linkages at each level of sulfation for all possible sequences.
This journal is © The Royal Society of Chemistry 2010 Mol. BioSyst., 2010, 6, 902–908 | 907
An important observation is that high backbone diversity
was attainable even with low levels of sulfation, providing the
possibility of many structurally suitable molecular scaffolds
(defined by backbone geometry) for direct interactions with
single proteins, but this also allows scope for differentiation
during the formation of higher complexes (e.g. FGF/FGFR)
through additional modifications.
The findings presented here may also be of use for projected
structure–function studies; one of the limiting factors has
always been the huge number of possible structures which
need to be considered. This analysis suggests the beginning of
a rationale for the selection of possible synthetic target
structures.
Acknowledgements
The authors gratefully acknowledge funding from The
Wellcome Trust, The Royal Society and BBSRC. Prof. D. G.
Fernig is thanked for useful discussions.
Notes and references
1 A. Ori, M. C. Wilkinson and D. G. Fernig, Front. Biosci., 2008, 4309–4338.
2 Z. Scholefield, E. A. Yates, G. Wayne, A. Amour, W. McDowell and J. E. Turnbull, J. Cell Biol., 2003, 163, 97–107.
3 J. van Horssen, P. Wesseling, L. P. van den Heuvel, R. M. de Waal and M. M. Verbeek, Lancet Neurol., 2003, 2, 482–492.
4 B. L. Allen and A. C. Rapraeger, J. Cell Biol., 2003, 163, 637–648.
5 M. Mohammadi, S. K. Olsen and O. A. Ibrahimi, Cytokine Growth Factor Rev., 2005, 16, 107–137.
6 L. Pellegrini, Curr. Opin. Struct. Biol., 2001, 11, 629–634.
7 Z. L. Wu, L. Zhang, T. Yabe, B. Kuberan, D. L. Beeler, A. Love and R. D. Rosenberg, J. Biol. Chem., 2003, 278, 17121–17129.
8 J. A. Davies, E. A. Yates and J. E. Turnbull, Growth Factors, 2003, 21, 109–119.
9 S. M. Rickard, R. S. Mummery, B. Mulloy and C. C. Rider, Glycobiology, 2003, 13, 419–426.
10 K. Itoh and S. Y. Sokol, Development, 1994, 120, 2703–2711.
11 R. Copeland, A. Balasubramaniam, V. Tiwari, F. Zhang, A. Bridges, R. J. Linhardt, D. Shukla and J. Liu, Biochemistry, 2008, 47, 5774–5783.
12 D. Pinna, P. Oreste, T. Coradin, A. Kajaste-Rudnitski, S. Ghezzi, G. Zoppetti, A. Rotola, R. Argnani, G. Poli, R. Manservigi and E. Vicenzi, Antimicrob. Agents Chemother., 2008, 52, 3078–3084.
13 M. Moulard, H. Lortat-Jacob, I. Mondor, G. Roca, R. Wyatt, J. Sodroski, L. Zhao, W. Olson, P. D. Kwong and Q. J. Sattentau, J. Virol., 2000, 74, 1948–1960.
14 M. Tyagi, M. Rusnati, M. Presta and M. Giacca, J. Biol. Chem., 2001, 276, 3254–3261.
15 R. R. Vives, A. Imberty, Q. J. Sattentau and H. Lortat-Jacob, J. Biol. Chem., 2005, 280, 21353–21357.
16 E. L. G. M. Tonnaer, T. G. Hafmans, T. H. Van Kuppevelt, E. A. M. Sanders, P. E. Verweij and J. H. A. J. Curfs, Microbes Infect., 2006, 8, 316–322.
17 M. Petitou, Nouv. Rev. Fr. Hematol., 1984, 26, 221–226.
18 J. Kreuger, D. Spillmann, J. P. Li and U. Lindahl, J. Cell Biol., 2006, 174, 323–327.
19 J. Angulo, M. Hricovini, M. Gairi, M. Guerrini, J. L. de Paz, R. Ojeda, M. Martin-Lomas and P. M. Nieto, Glycobiology, 2005, 15, 1008–1015.
20 W. L. Chuang, M. D. Christ and D. L. Rabenstein, Anal. Chem., 2001, 73, 2310–2316.
21 R. Lucas, J. Angulo, P. M. Nieto and M. Martin-Lomas, Org. Biomol. Chem., 2003, 1, 2253–2266.
22 D. Mikhailov, K. H. Mayo, I. R. Vlahov, T. Toida, A. Pervin and R. J. Linhardt, Biochem. J., 1996, 318(Pt 1), 93–102.
23 K. Sugahara, R. Tohno-oka, S. Yamada, K. H. Khoo, H. R. Morris and A. Dell, Glycobiology, 1994, 4, 535–544.
24 G. Torri, B. Casu, G. Gatti, M. Petitou, J. Choay, J. C. Jacquinet and P. Sinay, Biochem. Biophys. Res. Commun., 1985, 128, 134–140.
25 S. Yamada, Y. Yamane, H. Tsuda, K. Yoshida and K. Sugahara, J. Biol. Chem., 1998, 273, 1863–1871.
26 B. Mulloy and M. J. Forster, Glycobiology, 2000, 10, 1147–1156.
27 E. A. Yates, F. Santini, B. De Cristofano, N. Payre, C. Cosentino, M. Guerrini, A. Naggi, G. Torri and M. Hricovini, Carbohydr. Res., 2000, 329, 239–247.
28 E. A. Yates, F. Santini, M. Guerrini, A. Naggi, G. Torri and B. Casu, Carbohydr. Res., 1996, 294, 15–27.
29 E. A. Yates, S. E. Guimond and J. E. Turnbull, J. Med. Chem., 2004, 47, 277–280.
30 S. E. Guimond, T. R. Rudd, M. A. Skidmore, A. Ori, D. Gaudesi, C. Cosentino, M. Guerrini, R. Edge, D. Collison, E. J. McInnes, G. Torri, J. E. Turnbull, D. G. Fernig and E. A. Yates, Biochemistry, 2009, 48, 4772–4779.
31 T. R. Rudd, S. E. Guimond, M. A. Skidmore, L. Duchesne, M. Guerrini, G. Torri, C. Cosentino, A. Brown, D. T. Clarke, J. E. Turnbull, D. G. Fernig and E. A. Yates, Glycobiology, 2007, 17, 983–993.
32 T. R. Rudd, M. A. Skidmore, S. E. Guimond, C. Cosentino, G. Torri, D. G. Fernig, R. M. Lauder, M. Guerrini and E. A. Yates, Glycobiology, 2009, 19, 52–67.