Past, Present, and Future of Multidimensional Scaling
Patrick J. F. Groenen
*Econometric Institute, Erasmus University Rotterdam, The Netherlands,
[email protected], http://people.few.eur.nl/groenen/
Summary:
1 What is MDS?
2 Some Historical Milestones
3 Present
4 Future
5 Summary of highlights in MDS
Past, Present, and Future of MDS – 2 –
1 What is MDS?
• Table of travel times by train between 10 French cities:
             Bordeaux  Brest   Lille  Lyon   Marseille  Nice    Paris  Strasbourg  Toulouse  Tours
Bordeaux     0
Brest        9h58      0
Lille        6h39      7h11    0
Lyon         8h05      7h11    4h52   0
Marseille    5h47      8h49    6h12   1h35   0
Nice         8h30      13h36   8h20   4h33   2h26       0
Paris        2h59      4h17    1h04   2h01   3h00       5h52    0
Strasbourg   8h08      10h16   6h54   4h36   7h04       11h15   4h01   0
Toulouse     2h02      13h52   9h42   4h25   3h26       6h29    5h14   10h56       0
Tours        2h36      5h38    4h17   4h21   5h13       9h04    1h13   6h03        6h06      0
[Figures: MDS map of travel time by train; geographic map of France. Both show the 10 cities.]
dissimilarity matrix ∆
        O1        O2        O3        ...   On-1      On
O1      0
O2      δ12       0
O3      δ13       δ23       0
...     ...       ...       ...       ...
On-1    δ1,n-1    δ2,n-1    δ3,n-1    ...   0
On      δ1n       δ2n       δ3n       ...   δn-1,n    0
⇓
coordinates matrix X
        dim 1     dim 2
O1      x11       x12
O2      x21       x22
O3      x31       x32
...     ...       ...
On-1    xn-1,1    xn-1,2
On      xn1       xn2
⇒
[Figure: the objects O1, O2, O3, ..., On-1, On plotted as points.]
• First sentence in Borg and Groenen (2005):
Multidimensional scaling (MDS) is a method that represents measurements of similarity
(or dissimilarity) among pairs of objects as distances between points of a low-
dimensional space.
• Who uses MDS?
– psychology,
– sociology,
– archaeology,
– biology,
– medicine,
– chemistry,
– network analysis,
– economics, etc.
• Similarities and dissimilarities:
– Large similarity approximated by small distance in MDS.
– Large dissimilarity (δij) approximated by large distance in MDS.
– General term: proximity.
2 Some Historical Milestones
• 1635: van Langren: Provides a distance matrix and a map.
[Figure: map of Durham county showing Newcastle and Durham. Cartographer: Jacob van Langren. Date: 1635.]
• 1958: Torgerson: Provides a solution for classical MDS based on
eigendecomposition
• 1966: Gower: Provides independently the same solution for
classical MDS and gives connection to
principal components analysis.
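Classical MDS: minimize $\mathrm{Strain}(X) = \tfrac{1}{4}\,\|J(\Delta^{(2)} - D^{(2)}(X))J\|^2$,
with $J$ the centering matrix and $\Delta^{(2)}$, $D^{(2)}(X)$ the matrices of squared dissimilarities and squared distances, by
eigendecomposition of $-\tfrac{1}{2} J \Delta^{(2)} J$.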
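The eigendecomposition route above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used for the slides; the function names and the four test points are assumptions made for the example.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def classical_mds(delta, p=2):
    """Classical MDS: eigendecomposition of -1/2 J delta^(2) J."""
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (delta ** 2) @ J            # double-centered squared dissimilarities
    eigval, eigvec = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:p]       # indices of the p largest eigenvalues
    lam = np.clip(eigval[order], 0.0, None)    # negative eigenvalues are set to 0
    return eigvec[:, order] * np.sqrt(lam)

# Hypothetical data: the corners of a 3-by-4 rectangle
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
delta = pairwise_distances(points)
X = classical_mds(delta, p=2)
```

Because these dissimilarities are exact Euclidean distances in two dimensions, the distances of `X` reproduce `delta` up to rotation, reflection, and translation of the configuration.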
• 1962: Shepard: Provides a heuristic for MDS.
• 1964: Kruskal: Establishes least-squares MDS.
Provides a minimization algorithm.
Proposes ordinal MDS plus optimization.
Minimize Stress-I: $\sigma_I(X, \hat{d}) = \sqrt{ \dfrac{\sum_{i<j} (\hat{d}_{ij} - d_{ij}(X))^2}{\sum_{i<j} d_{ij}^2(X)} }$
with $\hat{d}_{ij}$ a disparity satisfying a monotone
relation with the proximities.
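As a rough sketch of how disparities and Stress-I fit together: given the dissimilarities and the current distances for the pairs i < j, the disparities can be obtained by monotone (isotonic) regression of the distances in the order of the dissimilarities. The helper names and the six-pair example are assumptions for illustration, and Stress-I is taken in its usual square-root form.

```python
import numpy as np

def monotone_regression(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to y."""
    blocks = []                                   # each block is [sum, count]
    for value in y:
        blocks.append([float(value), 1])
        # merge blocks while the block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

def ordinal_disparities(delta_vec, d_vec):
    """Disparities that are monotone in the dissimilarities and closest to d_vec."""
    order = np.argsort(delta_vec, kind="stable")
    dhat = np.empty_like(d_vec)
    dhat[order] = monotone_regression(d_vec[order])
    return dhat

def stress1(d_vec, dhat_vec):
    """Stress-I for vectors of distances and disparities over the pairs i < j."""
    return np.sqrt(((dhat_vec - d_vec) ** 2).sum() / (d_vec ** 2).sum())

# Hypothetical example: six pairs, one order violation in the distances
delta_vec = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
d_vec = np.array([1.0, 3.0, 2.0, 4.0, 5.0, 6.0])
dhat_vec = ordinal_disparities(delta_vec, d_vec)
value = stress1(d_vec, dhat_vec)
```

Only the two distances that violate the rank order are pooled to their mean; all other disparities equal the distances, so they contribute nothing to Stress-I.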
– Classic example: Rothkopf (1957) Morse code confusion data
+ Is there some systematic way in which people confuse Morse codes?
+ 36 Morse codes (26 for the letters of the alphabet, 10 for the digits)
+ Subjects' task: judge whether two Morse codes are the same or not. For example:
+ Is -. (N) the same as .-. (R)? Yes (1), or no (2)
+ Stimulus pair presented in two orders: pair NR and RN.
+ Each subject judges many combinations of Morse codes.
+ N = 598.
+ Morse code confusion table: proportion confused.
+ Data are similarities
            A    B    C    D   ...    0
.-     A   92    4    6   13   ...    3
-...   B    5   84   37   31   ...    4
-.-.   C    4   38   87   17   ...   12
-..    D    8   62   17   88   ...    6
...   ...  ...  ...  ...  ...  ...  ...
-----  0    9    3   11    2   ...   94
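Before MDS can be applied, a confusion table like this has to be turned into symmetric dissimilarities. The sketch below uses only the A to D excerpt shown above; averaging the two presentation orders and taking 100 minus the confusion percentage are assumptions chosen for illustration, not necessarily the transformation used for the slides.

```python
import numpy as np

# Confusion percentages for A, B, C, D, copied from the table excerpt
S = np.array([[92.0,  4.0,  6.0, 13.0],
              [ 5.0, 84.0, 37.0, 31.0],
              [ 4.0, 38.0, 87.0, 17.0],
              [ 8.0, 62.0, 17.0, 88.0]])

S_sym = (S + S.T) / 2.0          # average over the two presentation orders
delta = 100.0 - S_sym            # assumed conversion: much confusion -> small dissimilarity
np.fill_diagonal(delta, 0.0)     # an object has zero dissimilarity to itself
```

With this choice, B and D (often confused) end up with a smaller dissimilarity than A and B (rarely confused).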
[Figure: plot of the 36 Morse code signals from the Rothkopf (1957) confusion data, each point labeled by its code.]
• 1964: Guttman: Facet theory and regional interpretation in MDS.
– In facet theory, extra information (external variables) is available on
the objects according to the facet design by which the objects are
generated:
+ Every object i belongs to a category on one or more facets.
+ See, e.g., Guttman (1959), Borg & Shye (1995), Borg & Groenen (1997, 1998)
Dissimilarity matrix ∆:
        O1        O2        O3        ...   On-1      On
O1      0
O2      δ12       0
O3      δ13       δ23       0
...     ...       ...       ...       ...
On-1    δ1,n-1    δ2,n-1    δ3,n-1    ...   0
On      δ1n       δ2n       δ3n       ...   δn-1,n    0

Facet design:
        Facet 1   Facet 2   Facet 3
O1      1         1         3
O2      1         2         3
O3      2         1         3
...     ...       ...       ...
On-1    3         1         1
On      3         2         1
– The extra facet information is used to partition the objects in the MDS space in regions.
– Facets are used for regional hypotheses about the empirical structure of the data.
[Figure: three panels of points labeled a, b, and c, partitioned into regions: axial, modular, polar.]
– For the Morse code data, we have additional information available:
1. Length of the signal (.05 to .95 seconds).
2. Signal type (ratio of long versus short beeps).
Letter  Morse code  Length  Signal type    Letter  Morse code  Length  Signal type
A       .-          25      1=2            S       ...         25      1
B       -...        45      1>2            T       -           15      2
C       -.-.        55      1=2            U       ..-         35      1>2
D       -..         35      1>2            V       ...-        45      1>2
E       .           05      1              W       .--         45      1<2
F       ..-.        45      1>2            X       -..-        55      1=2
G       --.         45      1<2            Y       -.--        65      1<2
H       ....        35      1              Z       --..        55      1=2
I       ..          15      1              1       .----       85      1<2
J       .---        65      1<2            2       ..---       75      1<2
K       -.-         45      1<2            3       ...--       65      1>2
L       .-..        45      1>2            4       ....-       55      1>2
M       --          35      2              5       .....       45      1
N       -.          25      1=2            6       -....       55      1>2
O       ---         55      2              7       --...       65      1>2
P       .--.        55      1=2            8       ---..       75      1<2
Q       --.-        65      1<2            9       ----.       85      1<2
R       .-.         35      1>2            0       -----       95      1
– Borg and Groenen (2005): Regional restrictions through Proxscal, by specifying:
+ two dimensions
+ two external variables,
+ each variable is transformed ordinally using the primary approach to ties.
[Figure: two MDS solutions of the Morse code data, labeled Unconstrained and Regionally constrained. Regions are marked for signal type (1, 1>2, 1=2, 2>1, 2) and signal length (05 to 95); points are labeled by their sequence of short (1) and long (2) beeps.]
• 1969: Horan: Dimension weighting models in 3-way MDS
1970: Carroll and Chang: Introduction of INDSCAL and IDIOSCAL
– 3-way MDS: more than one dissimilarity matrix.
– In the weighted Euclidean model, for each source the common space
G may be stretched or shrunk along the axes.
– Model δijk ≈ dij(GSk) with
+ G a single common space and
+ Sk a diagonal matrix of dimension weights.
– INDSCAL uses STRAIN loss.
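To make the weighted Euclidean model concrete, the sketch below computes the model distances dij(GSk) for three sources. The common space `G` is a hypothetical unit square; the dimension weights are the values quoted for the three sources on this slide. It only evaluates the model, it does not fit INDSCAL.

```python
import numpy as np

def distances(X):
    """Euclidean distances between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Hypothetical common space: the corners of a unit square
G = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Dimension weights (s_k1, s_k2) for sources k = 1, 2, 3
weights = [(1.5, 0.5), (0.8, 1.5), (1.0, 0.3)]

# S_k is diagonal, so G @ S_k stretches or shrinks each axis of G
source_distances = [distances(G @ np.diag(s)) for s in weights]
```

For source 1, the side of the square along dimension 1 becomes 1.5 and the side along dimension 2 becomes 0.5, which is the stretching and shrinking described above.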
[Figure: three dissimilarity matrices ∆1, ∆2, ∆3, each approximated (≈) by the common space stretched or shrunk along its axes, with dimension weights s11 = 1.5, s12 = .5; s21 = .8, s22 = 1.5; s31 = 1, s32 = .3.]
– In the generalized Euclidean model, the common space G may be rotated and then stretched
or shrunk along the (rotated) axes.
– Model δijk ≈ dij(GSk) with
+ G a single common space and
+ Sk any matrix of dimension weights.
– IDIOSCAL uses STRAIN loss.
[Figure: three dissimilarity matrices ∆1, ∆2, ∆3, each approximated (≈) by the common space rotated by an angle αk and then stretched or shrunk, with α1 = 30°, s11 = 1.2, s12 = .5; α2 = 45°, s21 = .8, s22 = .5; α3 = -10°, s31 = 1, s32 = .3.]
2.1 Milestones in MDS algorithms
• 1958, 1966: Torgerson & Gower: solutions for classical MDS.
• 1964: Kruskal: Introduction of the Stress-I loss function plus
minimization and ordinal MDS.
• 1977: De Leeuw: Introduction of the SMACOF (Scaling by MAjorizing a
COmplicated Function) algorithm for MDS.
• 1980: De Leeuw & Heiser: SMACOF extended to a comprehensive
MDS algorithm allowing transformations of the
dissimilarities, constraints on the configuration,
and three-way dimension weighting extensions.
• 1988: De Leeuw: Convergence results derived for the SMACOF
algorithm.
• 1995: Groenen, Mathar, Heiser: Extension of SMACOF to city-block
distances.
[Photos: De Leeuw, Heiser, Mathar]
• Formalizing MDS by minimizing raw Stress over X:
$\sigma_r(X, \hat{d}) = \sum_{i<j} w_{ij} \left(\hat{d}_{ij} - d_{ij}(X)\right)^2$
with $w_{ij} \ge 0$ and $\delta_{ij} \ge 0$
where
$\hat{d}_{ij}$ disparity, d-hat, pseudo-distance: optimal transformation of the dissimilarities
subject to (ordinal) restrictions and
$\sum_{i<j} w_{ij} \hat{d}_{ij}^2 = n(n-1)/2$ to avoid the trivial solution $\hat{d} = 0$ and $X = 0$
$d_{ij}(X)$ Euclidean distance between rows i and j of X
X n×p matrix of coordinates of n objects by p dimensions
$w_{ij}$ nonnegative weights (for example, to code missing values)
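The role of the weights is easiest to see in a tiny example. The sketch below evaluates raw Stress for three hypothetical points in which one pair is treated as missing by giving it weight 0; the function name and the data are assumptions for illustration.

```python
import numpy as np

def raw_stress(X, dhat, W):
    """Raw Stress: sum over i < j of w_ij * (dhat_ij - d_ij(X))^2."""
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))          # Euclidean distances d_ij(X)
    i, j = np.triu_indices(X.shape[0], k=1)       # all pairs with i < j
    return float((W[i, j] * (dhat[i, j] - D[i, j]) ** 2).sum())

# Hypothetical configuration with distances d01 = 3, d02 = 4, d12 = 5
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])

# Disparities; the value for pair (1, 2) is missing and coded as 0 with weight 0
dhat = np.array([[0.0, 2.0, 4.0],
                 [2.0, 0.0, 0.0],
                 [4.0, 0.0, 0.0]])
W = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])

value = raw_stress(X, dhat, W)   # only pairs (0, 1) and (0, 2) contribute
```

The missing pair drops out of the sum regardless of the placeholder value stored for it.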
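• Constrained MDS (De Leeuw & Heiser, 1980):
– Easy to combine majorization with constraints.
– The majorizing function $\hat{\sigma}(X, Y)$ can be conveniently expressed as
$\hat{\sigma}(X, Y) = \eta_\delta^2 + \mathrm{tr}\, X'VX - 2\, \mathrm{tr}\, X'B(Y)Y$
$= \eta_\delta^2 + \mathrm{tr}\, X'VX - 2\, \mathrm{tr}\, X'V\bar{X}$
$= \eta_\delta^2 + (\mathrm{tr}\, X'VX + \mathrm{tr}\, \bar{X}'V\bar{X} - 2\, \mathrm{tr}\, X'V\bar{X}) - \mathrm{tr}\, \bar{X}'V\bar{X}$
$= \eta_\delta^2 + \mathrm{tr}\, (X - \bar{X})'V(X - \bar{X}) - \mathrm{tr}\, \bar{X}'V\bar{X}$
with
Y the previous configuration
$\bar{X}$ the unconstrained update (so that $V\bar{X} = B(Y)Y$)
$\eta_\delta^2$ the sum of squared dissimilarities
V a fixed (positive semi-definite) matrix depending on the weights.
In the last line, the first and third terms are constant in X; only the middle term is quadratic in X.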
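A minimal sketch of the resulting unconstrained SMACOF iteration, in which each step replaces the configuration by the update satisfying V X̄ = B(Y)Y. The function names, the random data, and the use of a pseudo-inverse of V are assumptions for illustration; the weight matrix is assumed symmetric with a zero diagonal, and no transformation of the dissimilarities is included.

```python
import numpy as np

def distances(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def raw_stress(X, delta, W):
    i, j = np.triu_indices(X.shape[0], k=1)
    return float((W[i, j] * (delta[i, j] - distances(X)[i, j]) ** 2).sum())

def guttman_update(Y, delta, W):
    """Unconstrained update Xbar = V^+ B(Y) Y from the previous configuration Y."""
    D = distances(Y)
    # off-diagonal elements of B(Y) are -w_ij * delta_ij / d_ij(Y), or 0 if d_ij(Y) = 0
    ratio = np.divide(W * delta, D, out=np.zeros_like(D), where=D > 0)
    B = np.diag(ratio.sum(axis=1)) - ratio
    V = np.diag(W.sum(axis=1)) - W
    return np.linalg.pinv(V) @ B @ Y

# Hypothetical data: dissimilarities are distances of six random planar points
rng = np.random.default_rng(0)
delta = distances(rng.normal(size=(6, 2)))
W = np.ones((6, 6)) - np.eye(6)

X = rng.normal(size=(6, 2))                 # random start
history = [raw_stress(X, delta, W)]
for _ in range(50):
    X = guttman_update(X, delta, W)
    history.append(raw_stress(X, delta, W))
```

Majorization guarantees that the Stress values in `history` never increase from one iteration to the next.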
• What type of constraints can be imposed?
– Any constraint on X that is solved easily by minimizing least squares error,
e.g., the linear constraints X = ZC (for given Z)
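– Three-way MDS through constrained MDS (De Leeuw & Heiser, 1980):
– Minimize
$\sigma_r(G, S_1, S_2, \ldots, S_K) = \sum_{k=1}^{K} \sum_{i<j} w_{ijk} \left(\delta_{ijk} - d_{ij}(GS_k)\right)^2$
where
+ G is the n×p matrix of coordinates (the common space)
+ Sk is the p×p matrix of weights
– Consider the block matrices (shown for four sources):
$\Delta^* = \begin{bmatrix} \Delta_1 & 0 & 0 & 0 \\ 0 & \Delta_2 & 0 & 0 \\ 0 & 0 & \Delta_3 & 0 \\ 0 & 0 & 0 & \Delta_4 \end{bmatrix}$, $W^* = \begin{bmatrix} W_1 & 0 & 0 & 0 \\ 0 & W_2 & 0 & 0 \\ 0 & 0 & W_3 & 0 \\ 0 & 0 & 0 & W_4 \end{bmatrix}$, and $X^* = \begin{bmatrix} X_1 \\ X_2 \\ X_3 \\ X_4 \end{bmatrix}$
– Then, the dimension weighting models amount to restricting X* by Xk = GSk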
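For the linear constraint X = ZC mentioned above, the constrained step reduces to a weighted least-squares projection of the unconstrained update onto the column space of Z, giving C = (Z'VZ)^+ Z'V X̄. The sketch below assumes unit weights; the matrices `Z` and `Xbar` are random stand-ins and all names are hypothetical.

```python
import numpy as np

def constrained_update(Xbar, Z, V):
    """Minimize tr (ZC - Xbar)' V (ZC - Xbar) over C; returns (ZC, C)."""
    C = np.linalg.pinv(Z.T @ V @ Z) @ (Z.T @ V @ Xbar)
    return Z @ C, C

def quadratic_term(X, Xbar, V):
    """The part of the majorizing function that depends on X."""
    E = X - Xbar
    return float(np.trace(E.T @ V @ E))

n, p = 6, 2
rng = np.random.default_rng(0)
V = n * np.eye(n) - np.ones((n, n))    # V for unit weights w_ij = 1
Z = rng.normal(size=(n, 3))            # hypothetical external variables
Xbar = rng.normal(size=(n, p))         # stand-in for the unconstrained update

X, C = constrained_update(Xbar, Z, V)
```

At the solution the normal equations Z'V(ZC − X̄) = 0 hold, so no other choice of C gives a smaller quadratic term.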
3 Present
• 1986-1998: Meulman: integration of (nonlinear) multivariate
analysis and MDS.
– Much emphasis on the representation of objects, less on the
variables.
– Fitting by MDS through Stress as a dimension reduction technique.
– Including a wide variety of MVA techniques:
+ (Nonlinear) PCA
+ Multiple Correspondence Analysis
+ Correspondence Analysis
+ Generalized Canonical Correlation Analysis
+ Discriminant Analysis.
• 1994: Buja: Constant dissimilarities
– Data with all δij = 1 can be seen as maximally uninformative in MDS,
since all pairs of objects are equally dissimilar.
– Suppose $\Delta = c \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}$ with c > 0
– What configuration does MDS yield with constant data?
– Buja, Logan, Reeds, & Shepp (1994) proved:
+ 1 dimension: points equally spaced on a line
+ 2 dimensions: points on concentric circles
+ 3 dimensions or higher: points on a sphere
• 1978, 1995-: Various authors: local minima in MDS:
1. unidimensional scaling (Defays, De Leeuw, Pliner, Hubert, Arabie, Vera)
[Photos: Daniel Defays; Larry Hubert & Mathew Hesson-McInnes; Phipps Arabie; Jose Fernando Vera]
– When do local minima occur?
+ Unidimensional scaling
(De Leeuw & Heiser, 1977; Defays, 1978; Hubert and Arabie 1986; Pliner, 1996).
+ City-block MDS (Hubert, Arabie & Hesson-McInnes, 1992).
+ Depends on the data and its error structure;
with increasing dimensionality ⇒ fewer local minima.
– What can you do about local minima?
+ Multiple random starts.
+ Tunneling (Groenen & Heiser, 1996)
+ Distance smoothing: unidimensional scaling (Pliner, 1996), city-block MDS, general
MDS (Groenen et al. 1999).
+ Meta heuristics:
simulated annealing (De Soete, Hubert, Arabie, 1988; Brusco, 2001; Vera & Heiser, 2005, 2007),
genetic algorithms, ...
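The simplest of these remedies, multiple random starts, can be sketched for unidimensional scaling with unit weights. There the majorization update has the closed form x_i = (1/n) Σ_j δij sign(x_i − x_j), which depends only on the current order of the points, so different starts can end in different local minima. The data and function names are assumptions for illustration.

```python
import numpy as np

def uds_stress(x, delta):
    """Raw Stress of a one-dimensional configuration x (unit weights)."""
    i, j = np.triu_indices(len(x), k=1)
    return float(((delta[i, j] - np.abs(x[i] - x[j])) ** 2).sum())

def uds_local_search(x, delta, max_iter=100):
    """Iterate x_i <- (1/n) * sum_j delta_ij * sign(x_i - x_j) until it stabilizes."""
    n = len(x)
    for _ in range(max_iter):
        signs = np.sign(x[:, None] - x[None, :])
        x_new = (delta * signs).sum(axis=1) / n
        if np.allclose(x_new, x):
            break
        x = x_new
    return x

# Hypothetical error-free data: distances between five points on a line
truth = np.array([0.0, 1.0, 3.0, 6.0, 10.0])
delta = np.abs(truth[:, None] - truth[None, :])

rng = np.random.default_rng(1)
results = []
for _ in range(20):                                   # 20 random starts
    x = uds_local_search(rng.normal(size=len(truth)), delta)
    results.append((uds_stress(x, delta), x))
best_stress, best_x = min(results, key=lambda r: r[0])
```

Keeping the best of the runs does not guarantee the global minimum; it only improves the odds compared with a single start.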
• 1998: Buja: Applying weights in Stress to mimic loss functions
+ Choose $w_{ij} = \delta_{ij}^{-2}$. Then raw Stress becomes:
$\sigma_r(X) = \sum_{i<j} w_{ij} \left(\delta_{ij} - d_{ij}(X)\right)^2 = \sum_{i<j} \delta_{ij}^{-2} \left(\delta_{ij} - d_{ij}(X)\right)^2 = \sum_{i<j} \left(1 - \frac{d_{ij}(X)}{\delta_{ij}}\right)^2$
+ With these weights, Stress fits the ratio of distances to dissimilarities:
large $\delta_{ij} = 10$, $d_{ij}(X) = 5$: $e_{ij} = 1 - \frac{d_{ij}(X)}{\delta_{ij}} = 1 - \frac{5}{10} = .5$
small $\delta_{ij} = 2$, $d_{ij}(X) = 1$: $e_{ij} = 1 - \frac{d_{ij}(X)}{\delta_{ij}} = 1 - \frac{1}{2} = .5$
+ This is a very good way to attach equal importance to errors on small and large dissimilarities.
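The two numerical cases above can be checked directly. The short sketch below uses only the values from the slide; the variable names are assumptions.

```python
import numpy as np

delta = np.array([10.0, 2.0])      # one large and one small dissimilarity
d = np.array([5.0, 1.0])           # the corresponding distances d_ij(X)

w = delta ** -2.0                  # weights w_ij = delta_ij^(-2)
weighted_terms = w * (delta - d) ** 2
relative_errors = 1.0 - d / delta  # e_ij = 1 - d_ij(X) / delta_ij
```

Both pairs have the same relative error of .5 and therefore contribute the same amount, .25, to the weighted Stress, even though their absolute errors are 5 and 1.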
[Figure (after Buja, 1998).]
4 Future
• 1999-: Heiser, Meulman, Busing: PROXSCAL (i.e. SMACOF) in SPSS (PASW)
• 2009: De Leeuw & Mair: SMACOF in R.
• 2000: Tenenbaum, et al.: Large scale MDS: ISOMAP heuristic.
• 2005-: Groenen, Trosset, Kagie: Large scale MDS through Stress.
– Problems:
+ Computationally too demanding.
+ Storage is a problem (n²).
+ Uninformative solutions.
[Figure: CPU seconds (.01 to 1000) plotted against n (10 to 10000).]
– Solution Groenen, Trosset, Kagie:
+ Use only a fraction of the data.
+ Make use of smart designs.
+ Use sparseness of the data efficiently to obtain a fast majorization algorithm.
– Comparison large scale majorization versus SMACOF
– n = 1,000
– Proportion nonmissing: .05
(Nnonmis = 23,000 out of 499,500)
[Figure: Stress (0.05 to 0.45) against CPU seconds (0 to 2) for large scale majorization and SMACOF.]
– n = 10,000
– Proportion nonmissing: .005
(Nnonmis = 250,000 out of 49,995,000)
[Figure: Stress (0.05 to 0.45) against CPU seconds (0 to 200) for large scale majorization and SMACOF.]
– Avoiding uninformative solutions:
+ Edinburgh Associative Thesaurus (EAT) data set (1968, 1971): words associated
with stimulus
+ cij contains the number of associations between words i and j.
+ n = 23,219 terms.
+ 325,060 nonzero association counts between terms (sparseness = 0.1%).
+ Solution when choosing: δij = 1/cij
+ Solution when choosing the gravity model $\delta_{ij} = \dfrac{o_i o_j}{c_{ij}}$ and $w_{ij} = \delta_{ij}^{5}$,
with $o_i$ the total number of occurrences of term i
• 2002-: Buja, Cook, Swayne: Dynamic MDS visualization in the GGvis software.
• 2003: Groenen: Dynamic MDS visualization through iMDS.
[Photos: Andreas Buja, Deborah Swayne, Di Cook]
• 2002: Denœux, Masson, Groenen, Winsberg, Diday: Symbolic MDS of intervals
• 2006: Groenen, Winsberg: Symbolic MDS of histograms
[Figure: two-dimensional configuration of objects 1 to 10 (both axes from -0.8 to 0.6), with the distances d28(L) and d28(U) between objects 2 and 8 marked.]
• 2002: Denœux, Masson, Groenen, Winsberg, Diday: Symbolic MDS of interval data
– Instead of δij, its interval is given: $\delta_{ij} \in [\delta_{ij}^{(L)}, \delta_{ij}^{(U)}]$
– Find an MDS solution in which the coordinates $x_{is}$ are also intervals.
– Minimize I-Stress:
$\sigma_I^2(X, R) = \sum_{i<j} w_{ij} \left(\delta_{ij}^{(L)} - d_{ij}^{(L)}(X, R)\right)^2 + \sum_{i<j} w_{ij} \left(\delta_{ij}^{(U)} - d_{ij}^{(U)}(X, R)\right)^2$
with
$d_{ij}^{(L)}(X, R)$ smallest distance between
rectangles
$d_{ij}^{(U)}(X, R)$ longest distance between
rectangles
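A sketch of how the smallest and longest distances between rectangles can be computed, assuming each object i is an axis-aligned rectangle with center x_is and half-width r_is on dimension s. Under that assumption the smallest distance uses max(0, |x_is − x_js| − (r_is + r_js)) per dimension and the longest uses |x_is − x_js| + (r_is + r_js). Function names and the two-rectangle example are hypothetical.

```python
import numpy as np

def rectangle_distances(X, R):
    """Smallest and longest Euclidean distances between axis-aligned rectangles."""
    gap = np.abs(X[:, None, :] - X[None, :, :])        # |x_is - x_js|
    spread = R[:, None, :] + R[None, :, :]             # r_is + r_js
    d_lower = np.sqrt((np.maximum(0.0, gap - spread) ** 2).sum(axis=2))
    d_upper = np.sqrt(((gap + spread) ** 2).sum(axis=2))
    return d_lower, d_upper

def i_stress(X, R, delta_lower, delta_upper, W):
    """I-Stress: weighted squared errors on lower and upper bounds over pairs i < j."""
    d_lower, d_upper = rectangle_distances(X, R)
    i, j = np.triu_indices(X.shape[0], k=1)
    lower_part = (W[i, j] * (delta_lower[i, j] - d_lower[i, j]) ** 2).sum()
    upper_part = (W[i, j] * (delta_upper[i, j] - d_upper[i, j]) ** 2).sum()
    return float(lower_part + upper_part)

# Hypothetical example: two rectangles with half-widths 1 in both dimensions
X = np.array([[0.0, 0.0], [4.0, 3.0]])
R = np.ones((2, 2))
d_lower, d_upper = rectangle_distances(X, R)
```

Here the centers differ by (4, 3) and the half-widths sum to (2, 2), so the smallest distance is √(2² + 1²) = √5 and the longest is √(6² + 5²) = √61.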
• 2006: Groenen, Winsberg: Symbolic MDS of histograms
– Instead of δij, its empirical distribution (percentiles) is given: $\alpha = [.20, .30, .40]$
            Lower bound                          Upper bound
k    α      percentile   δijk(L)                 percentile   δijk(U)
1    .20    20           $\delta_{ij1}^{(L)}$    80           $\delta_{ij1}^{(U)}$
2    .30    30           $\delta_{ij2}^{(L)}$    70           $\delta_{ij2}^{(U)}$
3    .40    40           $\delta_{ij3}^{(L)}$    60           $\delta_{ij3}^{(U)}$
– Minimize
$\sigma_{Hist\text{-}I}^2(X, R_1, \ldots, R_K) = \sum_k \sum_{i<j} w_{ij} \left(\delta_{ijk}^{(L)} - d_{ij}^{(L)}(X, R_k)\right)^2 + \sum_k \sum_{i<j} w_{ij} \left(\delta_{ijk}^{(U)} - d_{ij}^{(U)}(X, R_k)\right)^2$
subject to $0 \le r_{is1} \le r_{is2} \le \ldots \le r_{isK}$.
• 2010: Groenen: Dynamic MDS of Dutch political parties
• A political party comparison website for the Dutch parliamentary elections of 2010 (www.stemwijzer.nl)
asks users to rate 30 political statements, e.g.,
1. The government needs to cut the budget by billions. The budget deficit should
disappear by 2015 at the latest.
Agree Don’t know Disagree
2. Those with high income should pay more taxes.
Agree Don’t know Disagree
– 11 political parties also rated these 30 items.
– What is the political landscape in the Dutch elections of 2010?
– Do iMDS on the distances between the 11 parties in 30 dimensional space.
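The input to such an analysis is the matrix of distances between the parties in the 30-dimensional space of statements. The sketch below shows that step only; the ratings are random stand-ins (not the real party positions), and the coding agree = 1, don't know = 0, disagree = -1 is an assumption.

```python
import numpy as np

# Synthetic stand-in for the 11 parties x 30 statements rating matrix.
# Assumed coding: agree = 1, don't know = 0, disagree = -1.
rng = np.random.default_rng(2010)
ratings = rng.choice([-1, 0, 1], size=(11, 30))

# Euclidean distances between parties in the 30-dimensional statement space
diff = ratings[:, None, :] - ratings[None, :, :]
delta = np.sqrt((diff ** 2).sum(axis=2))
```

The resulting 11×11 matrix `delta` plays the role of the dissimilarity matrix ∆ that is passed to MDS.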
5 Summary of highlights in MDS
Past
Year         Main author(s)                              Topic
1958, 1966   Torgerson, Gower                            Classical MDS
1964         Kruskal                                     Least-squares MDS through Stress with transformations
1964         Guttman                                     Facet theory and regional interpretations in MDS
1969, 1970   Horan, Carroll                              Three-way MDS models (INDSCAL, IDIOSCAL)
1977-        De Leeuw and others                         The majorization algorithm for MDS
Present
1986-1998    Meulman                                     Distance-based MVA through MDS
1994         Buja                                        Constant dissimilarities
1978, 1995-  Various                                     Local minimum problem
1998         Buja                                        Smart use of weights in MDS
Future
1999-        Heiser, Meulman, Busing                     Modern MDS software: Proxscal in SPSS (PASW)
2000         Tenenbaum, et al.                           Large scale MDS ISOMAP heuristic
2002         Buja, Swayne, Cook                          Dynamic MDS in GGvis (part of GGobi)
2003         Groenen                                     Dynamic MDS visualization through iMDS
2005-        Groenen, Trosset, Kagie                     Large scale MDS through Stress
2002         Denœux, Masson, Groenen, Winsberg, Diday    Symbolic MDS of interval dissimilarities
2006         Groenen, Winsberg                           Symbolic MDS of histograms
2009         De Leeuw, Mair                              Smacof package in R