PSYCHOMETRIKA--VOL. 48, NO. 4. DECEMBER, 1983

NOTES AND COMMENTS

A GENERALIZATION OF VERHELST'S SOLUTION FOR A CONSTRAINED REGRESSION PROBLEM IN

ALSCAL AND RELATED MDS-ALGORITHMS

Jos M. F. TEN BERGE

UNIVERSITY OF GRONINGEN

Verhelst derived a solution for a constrained regression problem which occurs in the interval measurement application of ALSCAL and related MDS-algorithms. In the present paper it is shown that Verhelst's solution is based on an implicit nonsingularity assumption. A general solution, which contains Verhelst's solution as a special case, is derived by a simple completing-the-squares type approach instead of partial differentiation with a Lagrange multiplier. In addition, this approach permits the identification of a small interval which uniquely contains the optimal value of a parameter needed to solve the special case where Verhelst's solution is valid.

Takane, Young and De Leeuw [1977] considered a constrained regression problem which occurs in the interval measurement case of ALSCAL. Verhelst [1981] pointed out that their solution is erroneous and offered an alternative solution. However, it can be shown that Verhelst's solution is only valid if a certain assumption is met. In the present paper a generalized solution, which contains Verhelst's solution as a special case, will be derived. First, however, we shall explain the problem and outline Verhelst's solution.

The problem at hand is that of minimizing the function

$$\phi^2(x) = (d - \Theta x)'(d - \Theta x) \tag{1}$$

subject to the constraint $\beta^2 = 4\alpha\gamma$, where $x' = (\alpha, \beta, \gamma)$, $d$ is a given $n$-vector of squared distances, and $\Theta$ is a given $n \times 3$ matrix of rank 3 of the Vandermonde type. Let $\Theta'\Theta$ have the lower-triangular decomposition

$$\Theta'\Theta = (F')^{-1}F^{-1}. \tag{2}$$

Let $B$ be a $3 \times 3$ matrix with $b_{31} = b_{13} = -\tfrac{1}{2}$, $b_{22} = 1$ and zeroes elsewhere. Then

$$x = Bg \tag{3}$$

where $g' = (-2\gamma, \beta, -2\alpha)$. Let $G$ be defined as

$$G = F^{-1}B(F')^{-1} \tag{4}$$

with eigendecomposition

$$G = P\Lambda P', \tag{5}$$

$\delta_1 < 0 < \delta_2 \le \delta_3$, $P'P = PP' = I_3$. With these definitions, Verhelst [1981] arrives at the following necessary condition for a minimum of $\phi^2$:

$$(B - \lambda FF')g = v, \tag{6}$$

where $v = (\Theta'\Theta)^{-1}\Theta'd$ and $\lambda$ is a Lagrange multiplier. When $(B - \lambda FF')$ is nonsingular, (6) implies that $\lambda$ must be a root of

$$y'(\Lambda - \lambda I)^{-1}\Lambda(\Lambda - \lambda I)^{-1}y = \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \lambda)^2} = 0 \tag{7}$$

where $y = P'F^{-1}v = P'F'\Theta'd$. Verhelst implicitly assumes that $(B - \lambda FF')$ is nonsingular at the solution points and suggests finding the roots of (7). Solutions corresponding to the different roots of (7) are then compared in terms of the associated value of $\phi^2$ and the best-fitting solution is retained.

The author is obliged to Dirk Knol and Klaas Nevels for helpful comments. Reprint requests should be addressed to Jos ten Berge, Subfakulteit Psychologie R.U. Groningen, Grote Markt 31/32, 9712 HV Groningen, The Netherlands.

0033-3123/83/1200-5040$00.50/0 © 1983 The Psychometric Society
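The matrices entering (2) and (4) can be built numerically. The following is a minimal sketch in plain Python, assuming the matrix $\Theta$ printed in Table 1 and taking the Cholesky factor of $\Theta'\Theta$ as $(F')^{-1}$; the helper names (`transpose`, `matmul`, `cholesky`) are illustrative and not part of ALSCAL.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def cholesky(s):
    """Lower-triangular C with C C' = s, for a symmetric positive definite s."""
    n = len(s)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(c[i][k] * c[j][k] for k in range(j))
            c[i][j] = math.sqrt(acc) if i == j else acc / c[j][j]
    return c

# Theta as printed in Table 1 (columns 1, t, t^2 for t = 1, 2, 2, 3).
theta = [[1, 1, 1], [1, 2, 4], [1, 2, 4], [1, 3, 9]]
B = [[0, 0, -0.5], [0, 1, 0], [-0.5, 0, 0]]

F_t_inv = cholesky(matmul(transpose(theta), theta))  # plays the role of (F')^{-1} in (2)
F_inv = transpose(F_t_inv)                           # F^{-1}
G = matmul(matmul(F_inv, B), F_t_inv)                # G = F^{-1} B (F')^{-1}, eq. (4)

for row in G:
    print([round(v, 3) for v in row])
```

Up to rounding, the result should agree with the matrices $(F')^{-1}$ and $G$ listed in Table 1.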

We shall now derive a general solution for the problem at hand which is free from implicit assumptions. The solution derived by Verhelst will appear to be a special case of our solution. In addition it will be shown that in the case where Verhelst's solution is valid the value of $\lambda$ which corresponds to the global constrained minimum of $\phi^2$ is the unique root of (7) in the interval $(\delta_1, \delta_2)$.

A General Solution

Consider the transformation $h = F^{-1}x$ and $z = F^{-1}v$. Then the problem of minimizing $\phi^2(x)$ subject to the constraint $\beta^2 = 4\alpha\gamma$ is equivalent to that of minimizing

$$f(h) = d'd - 2z'h + h'h \tag{8}$$

subject to the constraint

$$h'G^{-1}h = 0. \tag{9}$$

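Define the interval $L(y, \Lambda)$ as follows:

$$L(y, \Lambda) = [\delta_1, \delta_2) \quad \text{if } y_1 = 0 \text{ (Case 1)}; \tag{10}$$

$$L(y, \Lambda) = (\delta_1, \delta_2] \quad \text{if } y_1 \neq 0,\ y_2 = 0,\ \delta_2 < \delta_3 \text{ and } \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)^2} \le 0, \tag{11}$$

or if $y_1 \neq 0$, $y_2 = 0$, $\delta_2 = \delta_3$ and $y_3 = 0$ (Case 2); and

$$L(y, \Lambda) = (\delta_1, \delta_2) \quad \text{otherwise (Case 3)}. \tag{12}$$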
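The case distinction in (10) to (12) can be sketched as a small classifier. This is only an illustration under stated assumptions: the function names and the tolerance `tol` used to decide whether a quantity counts as zero are not part of the original derivation, which works with exact zeros.

```python
def classify_case(y, delta, tol=1e-12):
    """Return 1, 2 or 3 according to (10)-(12).

    y     -- (y1, y2, y3)
    delta -- (d1, d2, d3) with d1 < 0 < d2 <= d3
    tol   -- hypothetical threshold below which a value is treated as zero
    """
    y1, y2, y3 = y
    d1, d2, d3 = delta
    if abs(y1) <= tol:
        return 1
    if abs(y2) <= tol:
        if d3 - d2 > tol:
            s = y1**2 * d1 / (d1 - d2)**2 + y3**2 * d3 / (d3 - d2)**2
            if s <= 0:
                return 2
        elif abs(y3) <= tol:
            return 2
    return 3

def interval(case, delta):
    """Textual form of L(y, Lambda) for the given case."""
    d1, d2, _ = delta
    return {1: f"[{d1}, {d2})", 2: f"({d1}, {d2}]", 3: f"({d1}, {d2})"}[case]

# y and the eigenvalues as printed in Table 1
case = classify_case((1.0, 0.0, 0.0), (-2.414, 0.414, 2.0))
print(case, interval(case, (-2.414, 0.414, 2.0)))
```

For the data of Table 1 the classifier returns Case 2, in line with the table's caption.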

Consider the matrix $(I - \lambda\Lambda^{-1})$. In each Case this matrix is positive semidefinite for $\lambda \in L(y, \Lambda)$. Define its generalized inverse as $(I - \lambda\Lambda^{-1})^{-}$. For every $\lambda \in L(y, \Lambda)$ we have

$$\{(I - \lambda\Lambda^{-1})^{1/2}P'h - ((I - \lambda\Lambda^{-1})^{-})^{1/2}y\}'\{(I - \lambda\Lambda^{-1})^{1/2}P'h - ((I - \lambda\Lambda^{-1})^{-})^{1/2}y\} \ge 0. \tag{13}$$

Writing (9) as $h'P\Lambda^{-1}P'h = 0$ and expanding (13) yields

$$h'h - 2h'P(I - \lambda\Lambda^{-1})^{1/2}((I - \lambda\Lambda^{-1})^{-})^{1/2}y \ge -y'(I - \lambda\Lambda^{-1})^{-}y. \tag{14}$$

In each Case $(I - \lambda\Lambda^{-1})^{1/2}((I - \lambda\Lambda^{-1})^{-})^{1/2}y = y$. Hence (14) can be simplified to

$$h'h - 2h'Py \ge -y'(I - \lambda\Lambda^{-1})^{-}y. \tag{15}$$

Combining (8) and (15) yields

$$f(h) \ge d'd - y'(I - \lambda\Lambda^{-1})^{-}y, \quad \lambda \in L(y, \Lambda). \tag{16}$$

Defining $g(\lambda) = y'(I - \lambda\Lambda^{-1})^{-}y$ we have

$$f(h) \ge d'd - g(\lambda), \quad \lambda \in L(y, \Lambda). \tag{17}$$


It should be noted that the right-hand side of (17) is independent of $h$. Therefore, if we find an $h^*$ satisfying (9) and a $\lambda^* \in L(y, \Lambda)$ for which (17) holds as an equality then $h^*$ yields the global minimum of $f(h)$ subject to (9). We shall now show how such a pair $(h^*, \lambda^*)$ can be obtained in each of the three cases involved.
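The lower bound (17) can be checked numerically in the rotated coordinates $w = P'h$, where $f(h) - d'd = w'w - 2y'w$ and the constraint (9) reads $w'\Lambda^{-1}w = 0$. The sketch below draws random feasible $w$ and random $\lambda$ in the open interval $(\delta_1, \delta_2)$; the vector `y`, the sampling ranges and all function names are assumptions made for illustration only.

```python
import math
import random

def g(lam, y, delta, tol=1e-12):
    """g(lambda) = sum_i y_i^2 d_i / (d_i - lambda).

    Terms with d_i == lambda are dropped, which is what the generalized
    inverse of the diagonal matrix (I - lambda * Lambda^{-1}) does.
    """
    return sum(yi**2 * di / (di - lam)
               for yi, di in zip(y, delta) if abs(di - lam) > tol)

def feasible_w(delta, rng):
    """Random w with w' Lambda^{-1} w = 0 (w2, w3 free, w1 solved for)."""
    d1, d2, d3 = delta
    w2, w3 = rng.uniform(-3, 3), rng.uniform(-3, 3)
    w1 = rng.choice((-1, 1)) * math.sqrt(-d1 * (w2**2 / d2 + w3**2 / d3))
    return (w1, w2, w3)

def f_minus_dd(w, y):
    """f(h) - d'd = w'w - 2 y'w, written in terms of w = P'h."""
    return sum(wi * wi - 2 * yi * wi for wi, yi in zip(w, y))

rng = random.Random(0)
y, delta = (1.0, 0.5, -0.7), (-2.414, 0.414, 2.0)   # y is hypothetical
slack = []
for _ in range(1000):
    w = feasible_w(delta, rng)
    lam = rng.uniform(-2.4, 0.41)                    # inside (d1, d2)
    slack.append(f_minus_dd(w, y) + g(lam, y, delta))
print(min(slack))
```

Every sampled pair should give a nonnegative slack (up to rounding), which is the content of (17).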

Solutions for Case 1

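In Case 1 $y_1 = 0$ and $L(y, \Lambda) = [\delta_1, \delta_2)$. Define $\lambda^* = \delta_1$ and $h^* = Pw_1$, with

$$w_1 = \left( \left[ -\delta_1 \left( \frac{y_2^2\delta_2}{(\delta_2 - \delta_1)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_1)^2} \right) \right]^{1/2},\ \frac{y_2\delta_2}{(\delta_2 - \delta_1)},\ \frac{y_3\delta_3}{(\delta_3 - \delta_1)} \right)'. \tag{18}$$

First, it can be verified that $h^*$ satisfies (9), that is,

$$h^{*\prime}P\Lambda^{-1}P'h^* = w_1'\Lambda^{-1}w_1 = -\frac{y_2^2\delta_2}{(\delta_2 - \delta_1)^2} - \frac{y_3^2\delta_3}{(\delta_3 - \delta_1)^2} + \frac{y_2^2\delta_2}{(\delta_2 - \delta_1)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_1)^2} = 0. \tag{19}$$

Furthermore, it can be seen that

$$\begin{aligned} f(h^*) &= d'd - 2z'Pw_1 + w_1'w_1 = d'd - 2y'w_1 + w_1'w_1 - \delta_1 w_1'\Lambda^{-1}w_1 \\ &= d'd - 2\left\{ \frac{y_2^2\delta_2}{(\delta_2 - \delta_1)} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_1)} \right\} + w_1'(I - \delta_1\Lambda^{-1})w_1 \\ &= d'd - \left( \frac{y_2^2\delta_2}{(\delta_2 - \delta_1)} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_1)} \right) = d'd - g(\delta_1). \end{aligned} \tag{20}$$

This shows that $h^*$ as defined by (18) yields the global minimum of $f(h)$ subject to (9) in Case 1. It should be noted that the solution is not unique since the sign of the first element of $w_1$ is arbitrary.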
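As a numerical illustration of (18) to (20), the following sketch computes both sign variants of $w_1$ and checks the constraint and the attained value. The input values and the function name `solve_case1` are hypothetical; only the formulas are taken from the text.

```python
import math

def solve_case1(y, delta):
    """Case 1 (y1 == 0): return lambda* = d1 and both sign variants of w1, eq. (18)."""
    _, y2, y3 = y
    d1, d2, d3 = delta
    s = y2**2 * d2 / (d2 - d1)**2 + y3**2 * d3 / (d3 - d1)**2
    first = math.sqrt(-d1 * s)                      # sign of this element is arbitrary
    rest = (y2 * d2 / (d2 - d1), y3 * d3 / (d3 - d1))
    return d1, [(first,) + rest, (-first,) + rest]

y, delta = (0.0, 1.0, 2.0), (-1.0, 1.0, 3.0)        # hypothetical Case 1 data
lam_star, solutions = solve_case1(y, delta)

for w in solutions:
    constraint = sum(wi * wi / di for wi, di in zip(w, delta))   # w' Lambda^{-1} w, eq. (19)
    value = sum(wi * wi - 2 * yi * wi for wi, yi in zip(w, y))   # f(h*) - d'd, eq. (20)
    print(w, constraint, value)
```

For these inputs both variants satisfy the constraint and give $f(h^*) - d'd = -g(\delta_1) = -3.5$.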

Solutions for Case 2

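In Case 2 we have $y_1 \neq 0$, $y_2 = 0$, $L(y, \Lambda) = (\delta_1, \delta_2]$ and either the subcase $\delta_2 < \delta_3$ and $y_1^2\delta_1/(\delta_1 - \delta_2)^2 + y_3^2\delta_3/(\delta_3 - \delta_2)^2 \le 0$, or the subcase $\delta_2 = \delta_3$ and $y_3 = 0$. In order to obtain a solution for the first subcase, let $\lambda^* = \delta_2$ and $h^* = Pw_2$, with

$$w_2 = \left( \frac{y_1\delta_1}{(\delta_1 - \delta_2)},\ \left[ -\delta_2 \left( \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)^2} \right) \right]^{1/2},\ \frac{y_3\delta_3}{(\delta_3 - \delta_2)} \right)'. \tag{21}$$

Again, it can be verified that

$$h^{*\prime}P\Lambda^{-1}P'h^* = w_2'\Lambda^{-1}w_2 = \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)^2} - \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)^2} - \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)^2} = 0, \tag{22}$$

hence $h^*$ satisfies (9). Furthermore we have

$$\begin{aligned} f(h^*) &= d'd - 2z'Pw_2 + w_2'w_2 = d'd - 2y'w_2 + w_2'w_2 - \delta_2 w_2'\Lambda^{-1}w_2 \\ &= d'd - \left( \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)} \right) = d'd - g(\delta_2). \end{aligned} \tag{23}$$

This shows that $h^*$ as defined by (21) yields the global minimum of $f(h)$ subject to (9) in the first subcase of Case 2. It should be noted that the sign of the second element in $w_2$ is now arbitrary.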
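Formula (21) can be tried on the data of Table 1, where $y = (1, 0, 0)'$ and the eigenvalues are printed as $-2.414$, $.414$ and $2$. The sketch below uses these rounded values, so agreement with the table is to about three decimals; the function name `solve_case2_first` is an assumption.

```python
import math

def solve_case2_first(y, delta):
    """First subcase of Case 2: lambda* = d2 and both sign variants of w2, eq. (21)."""
    y1, _, y3 = y
    d1, d2, d3 = delta
    s = y1**2 * d1 / (d1 - d2)**2 + y3**2 * d3 / (d3 - d2)**2
    if s > 0:
        raise ValueError("condition (11) is violated; this is Case 3")
    second = math.sqrt(-d2 * s)                     # sign of this element is arbitrary
    first, third = y1 * d1 / (d1 - d2), y3 * d3 / (d3 - d2)
    return d2, [(first, second, third), (first, -second, third)]

y, delta = (1.0, 0.0, 0.0), (-2.414, 0.414, 2.0)    # rounded values from Table 1
lam_star, (w_a, w_b) = solve_case2_first(y, delta)
print([round(v, 4) for v in w_a])
print([round(v, 4) for v in w_b])
```

The two vectors should be close to the columns $w_{2a}$ and $w_{2b}$ of Table 1, which differ only in the sign of the second element.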


The solution offered here is not valid in the second subcase of Case 2, where $y_3 = 0$ and $\delta_2 = \delta_3$. For this subcase an infinite number of solutions can be constructed. Let again $\lambda^* = \delta_2$ and let $w_2$ be replaced by the vector

$$v_2 = \left( \frac{y_1\delta_1}{(\delta_1 - \delta_2)},\ x_1,\ x_2 \right)' \tag{24}$$

where $x_1$ and $x_2$ can be any pair of numbers satisfying

$$x_1^2 + x_2^2 = -\frac{\delta_1\delta_2 y_1^2}{(\delta_1 - \delta_2)^2}. \tag{25}$$

Then it can be verified that $v_2$ satisfies the constraint $v_2'\Lambda^{-1}v_2 = 0$ and that the associated value of $f(h^*)$ is

$$f(h^*) = d'd - \frac{y_1^2\delta_1}{(\delta_1 - \delta_2)} = d'd - g(\delta_2). \tag{26}$$

This shows that the solution given in (24) is globally optimal for the second subcase of Case 2.
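The family of solutions in (24) and (25) can be parametrized by an angle, since $(x_1, x_2)$ only has to lie on a circle. The sketch below is an illustration under that assumption; the parameter `angle`, the function name and the numerical inputs are hypothetical.

```python
import math

def solve_case2_second(y1, d1, d2, angle):
    """Second subcase of Case 2 (d2 == d3, y2 == y3 == 0): one member of the family (24)."""
    radius = math.sqrt(-d1 * d2) * abs(y1) / abs(d1 - d2)   # square root of the r.h.s. of (25)
    return (y1 * d1 / (d1 - d2), radius * math.cos(angle), radius * math.sin(angle))

y1, d1, d2 = 2.0, -1.0, 3.0                                  # hypothetical data, d3 = d2
for angle in (0.0, 1.0, 2.5):
    v = solve_case2_second(y1, d1, d2, angle)
    constraint = v[0]**2 / d1 + (v[1]**2 + v[2]**2) / d2     # v2' Lambda^{-1} v2
    value = sum(vi * vi for vi in v) - 2 * y1 * v[0]         # f(h*) - d'd
    print(angle, constraint, value)
```

Every angle gives the same value $-y_1^2\delta_1/(\delta_1 - \delta_2)$, here $-1$, as stated in (26).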

A Solution for Case 3

In Case 3 we have $y_1 \neq 0$, $L(y, \Lambda) = (\delta_1, \delta_2)$ and either the subcase $y_2 \neq 0$, or the subcase $y_2 = 0$, $y_3 \neq 0$, and $\delta_2 = \delta_3$, or the subcase $y_2 = 0$, $\delta_2 < \delta_3$, and

$$\frac{y_1^2\delta_1}{(\delta_1 - \delta_2)^2} + \frac{y_3^2\delta_3}{(\delta_3 - \delta_2)^2} > 0.$$

Define for $\lambda \neq \delta_i$ the derivative function

$$g'(\lambda) = \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \lambda)^2}.$$

In each subcase $g'(\lambda)$ is negative when $\lambda$ approaches $\delta_1$ from above and positive when $\lambda$ approaches $\delta_2$ from below. In addition, $g'(\lambda)$ is continuous and monotone increasing on $(\delta_1, \delta_2)$. Hence there must be a unique root of $g'(\lambda) = 0$ in the interval $(\delta_1, \delta_2)$. Define $\lambda^*$ as this root and define $h^*$ as $Pw_3$, where

$$w_3 = (I - \lambda^*\Lambda^{-1})^{-1}y. \tag{27}$$

Then it is easily verified that

$$h^{*\prime}P\Lambda^{-1}P'h^* = w_3'\Lambda^{-1}w_3 = y'(I - \lambda^*\Lambda^{-1})^{-2}\Lambda^{-1}y = \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \lambda^*)^2} = g'(\lambda^*) = 0 \tag{28}$$

and that

$$\begin{aligned} f(h^*) &= d'd - 2y'w_3 + w_3'w_3 = d'd - 2y'w_3 + w_3'(I - \lambda^*\Lambda^{-1})w_3 \\ &= d'd - 2y'(I - \lambda^*\Lambda^{-1})^{-1}y + y'(I - \lambda^*\Lambda^{-1})^{-1}(I - \lambda^*\Lambda^{-1})(I - \lambda^*\Lambda^{-1})^{-1}y \\ &= d'd - y'(I - \lambda^*\Lambda^{-1})^{-1}y = d'd - g(\lambda^*). \end{aligned} \tag{29}$$

This shows that $h^*$ as defined by (27) yields the global minimum of $f(h)$ subject to (9) in Case 3. The unique root of $g'(\lambda) = 0$ in the interval $(\delta_1, \delta_2)$ can readily be obtained, for instance, by Newton's iteration method, starting in $\lambda = 0$.
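A minimal sketch of the root search for Case 3 follows. It starts Newton's iteration in $\lambda = 0$ as suggested above; the bisection fallback that keeps the iterate inside $(\delta_1, \delta_2)$ is an addition for robustness and not part of the original text. The input data, the tolerance and the function names are assumptions.

```python
def g_prime(lam, y, delta):
    """g'(lambda) = sum_i y_i^2 d_i / (d_i - lambda)^2."""
    return sum(yi**2 * di / (di - lam)**2 for yi, di in zip(y, delta))

def g_second(lam, y, delta):
    """Second derivative of g; positive on (d1, d2) when y1 != 0."""
    return sum(2 * yi**2 * di / (di - lam)**3 for yi, di in zip(y, delta))

def solve_case3(y, delta, tol=1e-12, max_iter=200):
    """Return lambda* in (d1, d2) with g'(lambda*) = 0 and w3 from eq. (27)."""
    d1, d2, _ = delta
    lo, hi = d1, d2                 # g' < 0 just above d1, g' > 0 just below d2
    lam = 0.0                       # starting value suggested in the text
    for _ in range(max_iter):
        gp = g_prime(lam, y, delta)
        if gp == 0.0:
            break
        if gp < 0:
            lo = lam
        else:
            hi = lam
        step = gp / g_second(lam, y, delta)
        if abs(step) <= tol:
            break
        lam -= step                 # Newton update
        if not lo < lam < hi:
            lam = 0.5 * (lo + hi)   # fallback (an assumption, not in the paper)
    w3 = tuple(yi * di / (di - lam) for yi, di in zip(y, delta))
    return lam, w3

y, delta = (1.0, 1.0, 1.0), (-1.0, 1.0, 2.0)    # hypothetical Case 3 data
lam_star, w3 = solve_case3(y, delta)
print(lam_star, w3, g_prime(lam_star, y, delta))
```

Because $w_3'\Lambda^{-1}w_3 = g'(\lambda^*)$ by (28), a vanishing derivative at the returned value also confirms that the constraint is met.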


Uniqueness of the Solution for Case 3

In the present paper the problem of minimizing the constrained least-squares function (1) has been approached by deriving lower bounds to this function and constructing solutions for which the lower bounds are attained, in each of the three cases defined above, respectively. The lower bounds were derived by considering values of the parameter $\lambda$ in the intervals $L(y, \Lambda)$ only. The very fact that, for certain solutions, these lower bounds are attained is sufficient to establish global optimality of these solutions: It is not possible to find better solutions by considering values of $\lambda$ outside $L(y, \Lambda)$.

Nevertheless, our approach does not imply that our solutions are complete. Additional derivations are needed to show that no other solutions than the ones presented above are globally optimal. Such derivations can be given but are quite involved if every case and subcase is to be covered. Therefore, we shall only prove uniqueness for our solution for Case 3, which is the most important case from a practical point of view.

Every solution must satisfy Verhelst's necessary condition for a minimum

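$$(B - \lambda FF')g = v, \tag{30}$$

cf. (6), which can be transformed into

$$(G - \lambda I)F'g = F^{-1}v. \tag{31}$$

From $g = B^{-1}Fh$, $F^{-1}v = z$, $P'z = y$ and defining $P'h = w$ we have

$$(I - \lambda\Lambda^{-1})w = y \tag{32}$$

as a necessary condition for a minimum. In addition, every solution for $w$ must satisfy

$$w'\Lambda^{-1}w = 0, \tag{33}$$

cf. (9). Finally, in Case 3 the minimum of $f(h)$ is

$$\min f(h) = d'd - g(\lambda^*) \tag{34}$$

where $\lambda^*$ is the unique root of $g'(\lambda) = 0$ in the interval $(\delta_1, \delta_2)$. Suppose there exists a $\mu \neq \lambda^*$ which also corresponds to a globally optimal solution. Then it would follow from (34) that

$$g(\mu) = g(\lambda^*). \tag{35}$$

We shall now show that such a $\mu$ does not exist. First consider the possibility of having a $\mu \neq \delta_i$, $i = 1, 2, 3$. Then $(I - \mu\Lambda^{-1})$ must have an inverse and from (32) and (33) it follows that $\mu$ must satisfy

$$g'(\mu) = \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \mu)^2} = 0. \tag{36}$$

However, (36) contradicts (35). This can be seen if we write, for $\lambda \in (\delta_1, \delta_2)$,

$$\begin{aligned} g(\lambda) - g(\mu) &= \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \lambda)} - \sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \mu)} = (\lambda - \mu)\sum_{i=1}^{3} \frac{y_i^2\delta_i(\delta_i - \lambda + \lambda - \mu)}{(\delta_i - \lambda)(\delta_i - \mu)^2} \\ &= (\lambda - \mu)\sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \mu)^2} + (\lambda - \mu)^2\sum_{i=1}^{3} \frac{y_i^2\delta_i}{(\delta_i - \lambda)(\delta_i - \mu)^2}. \end{aligned} \tag{37}$$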

The first sum in (37) is zero by (36), and the second sum is strictly positive since the first term is, and the remaining two terms are nonnegative. This shows that there is no $\mu \neq \delta_i$, $i = 1, 2, 3$, for which (32), (33) and (35) are satisfied.

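Next, consider the possibility of having $\mu = \delta_1$. It is readily verified that, for $\delta_1 < \lambda < \delta_2$,

$$g(\lambda) = \frac{\delta_1 y_1^2}{(\delta_1 - \lambda)} + \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} + \frac{\delta_3 y_3^2}{(\delta_3 - \lambda)} > \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} + \frac{\delta_3 y_3^2}{(\delta_3 - \lambda)} \ge \frac{\delta_2 y_2^2}{(\delta_2 - \delta_1)} + \frac{\delta_3 y_3^2}{(\delta_3 - \delta_1)} = g(\delta_1), \tag{38}$$

which implies that (35) cannot be satisfied by $\mu = \delta_1$. Next, consider the possibility of having $\mu = \delta_2 < \delta_3$. It is readily verified that

$$g(\lambda) - g(\delta_2) = \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} + \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \lambda)} - \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)} \ge \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \lambda)} - \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)} \tag{39}$$

for $\delta_1 < \lambda < \delta_2$. Analogous to (37), the right-hand side of (39) can be expressed as

$$\sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \lambda)} - \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)} = (\lambda - \delta_2)\sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)^2} + (\lambda - \delta_2)^2\sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \lambda)(\delta_i - \delta_2)^2}. \tag{40}$$

As in (37), the last sum in (40) is strictly positive. In addition, it can be shown that

$$(\lambda - \delta_2)\sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)^2} \ge 0. \tag{41}$$

This can be seen in the following way. Substitution of $\mu = \delta_2$ for $\lambda$ in (32) yields

$$(I - \delta_2\Lambda^{-1})w = y, \tag{42}$$

which has a general solution for $w$,

$$w = (I - \delta_2\Lambda^{-1})^{-}y + (0, x, 0)', \tag{43}$$

where $x$ is an arbitrary number. From (33) and (43) we have

$$\sum_{i=1}^{3} \frac{w_i^2}{\delta_i} = \frac{x^2}{\delta_2} + \sum_{i \neq 2} \frac{y_i^2\delta_i}{(\delta_i - \delta_2)^2} = 0. \tag{44}$$

Since $x^2/\delta_2 \ge 0$ and $\lambda - \delta_2 < 0$, (44) implies that (41) is correct. From (39), (40) and (41) we have

$$g(\lambda^*) > g(\delta_2). \tag{45}$$

Hence there cannot be a globally optimal solution corresponding to $\mu = \delta_2 < \delta_3$, in Case 3.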

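Next, consider the possibility of having $\mu = \delta_3 > \delta_2$. It is readily verified, for $\delta_1 < \lambda < \delta_2$, that

$$g(\lambda) = \frac{\delta_1 y_1^2}{(\delta_1 - \lambda)} + \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} + \frac{\delta_3 y_3^2}{(\delta_3 - \lambda)} \ge \frac{\delta_1 y_1^2}{(\delta_1 - \lambda)} + \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} > \frac{\delta_1 y_1^2}{(\delta_1 - \delta_3)} + \frac{\delta_2 y_2^2}{(\delta_2 - \delta_3)} = g(\delta_3), \tag{46}$$

which implies $g(\lambda^*) > g(\delta_3)$. Hence $\mu$ cannot be $\delta_3$ if $\delta_3 > \delta_2$. Finally, consider the possibility of having $\mu = \delta_2 = \delta_3$. Then for $\delta_1 < \lambda < \delta_2$ we have

$$g(\lambda) = \frac{\delta_1 y_1^2}{(\delta_1 - \lambda)} + \frac{\delta_2 y_2^2}{(\delta_2 - \lambda)} + \frac{\delta_3 y_3^2}{(\delta_3 - \lambda)} \ge \frac{\delta_1 y_1^2}{(\delta_1 - \lambda)} > \frac{\delta_1 y_1^2}{(\delta_1 - \delta_2)} = g(\delta_2). \tag{47}$$

This shows that $\mu$ cannot equal $\delta_2 = \delta_3$. This concludes the uniqueness proof for our solution of Case 3.

TABLE 1

Computational Example of the Solutions in Case 2

$$\Theta = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}, \quad d = \begin{pmatrix} .6533 \\ .2706 \\ .2706 \\ .6533 \end{pmatrix}, \quad G = \begin{pmatrix} -2 & 0 & -1 \\ 0 & 2 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad P = \begin{pmatrix} .924 & -.383 & 0 \\ 0 & 0 & 1 \\ .383 & .924 & 0 \end{pmatrix}$$

$$\Theta'\Theta = \begin{pmatrix} 4 & 8 & 18 \\ 8 & 18 & 44 \\ 18 & 44 & 114 \end{pmatrix}, \quad F' = \begin{pmatrix} .5 & 0 & 0 \\ -1.414 & .707 & 0 \\ 3.5 & -4 & 1 \end{pmatrix}, \quad (F')^{-1} = \begin{pmatrix} 2 & 0 & 0 \\ 4 & 1.414 & 0 \\ 9 & 5.657 & 1 \end{pmatrix}, \quad \Lambda = \begin{pmatrix} -2.414 & 0 & 0 \\ 0 & .414 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

$$\begin{array}{rrrrrrr} y & w_{2a} & x_a & \Theta x_a & w_{2b} & x_b & \Theta x_b \\ \hline 1 & .8536 & 2.6132 & .6533 & .854 & .467 & .467 \\ 0 & .3536 & -2.6132 & 0 & -.354 & 0 & .467 \\ 0 & 0 & .6533 & 0 & 0 & 0 & .467 \\ & & & .6533 & & & .467 \end{array}$$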

Discussion

In Case 3 the nonsingularity assumption made by Verhelst is met, hence Verhelst's solution applies here. It differs from our solution for Case 3 in that it requires the determination of all roots of $g'(\lambda) = 0$ and choosing the best one by comparing the associated values of $\phi^2(x)$, whereas in our solution only the unique root of $g'(\lambda) = 0$ between $\delta_1$ and $\delta_2$ needs to be determined.
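The difference between the two procedures can be illustrated numerically. The sketch below locates all roots of $g'(\lambda) = 0$ on a grid, in the spirit of the all-roots comparison, and confirms for one data set that the root between $\delta_1$ and $\delta_2$ gives the largest $g$ and therefore the smallest $f$. The data, the scan range and the grid size are assumptions chosen for illustration; a coarse grid of this kind could miss roots in less benign examples.

```python
def g(lam, y, delta):
    """g(lambda) = sum_i y_i^2 d_i / (d_i - lambda), for lambda != d_i."""
    return sum(yi**2 * di / (di - lam) for yi, di in zip(y, delta))

def g_prime(lam, y, delta):
    return sum(yi**2 * di / (di - lam)**2 for yi, di in zip(y, delta))

def bisect(fn, a, b, iters=100):
    """Root of fn in [a, b], assuming fn(a) and fn(b) have opposite signs."""
    fa = fn(a)
    for _ in range(iters):
        m = 0.5 * (a + b)
        fm = fn(m)
        if (fm < 0) == (fa < 0):
            a, fa = m, fm
        else:
            b = m
    return 0.5 * (a + b)

def roots_of_g_prime(y, delta, lo=-50.0, hi=50.0, steps=4000):
    """Scan [lo, hi] for sign changes of g', skipping grid cells that touch a pole."""
    fn = lambda lam: g_prime(lam, y, delta)
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    roots = []
    for a, b in zip(grid, grid[1:]):
        if any(a <= d <= b for d in delta):
            continue
        if fn(a) * fn(b) < 0:
            roots.append(bisect(fn, a, b))
    return roots

y, delta = (1.0, 1.0, 1.0), (-1.0, 1.0, 2.0)      # hypothetical Case 3 data
roots = roots_of_g_prime(y, delta)
inside = [r for r in roots if delta[0] < r < delta[1]]
best = max(roots, key=lambda r: g(r, y, delta))
print(roots, inside, best)
```

For these data the scan finds one root below $\delta_1$ and one inside $(\delta_1, \delta_2)$, and the latter attains the larger value of $g$.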

In Cases 1 and 2 Verhelst's solution does not apply. In these cases the solution is not unique. A computational example for Case 2 (first subcase) is summarized in Table 1. The two available solutions for this subcase are marked by letters a and b. It should be noted that solution b trivializes the distances, hence solution a must be preferred, for these data.
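Solution a of Table 1 can be traced back to the original parameters through $h^* = Pw_2$ and $x = Fh^*$. The sketch below uses the three-decimal entries of $F'$ and $P$ as printed in the table, so agreement is only to about two or three decimals. The third column of $P$ is taken to be $(0, 1, 0)'$, the eigenvector of $G$ for $\delta_3 = 2$; it does not affect the result because the third element of $w_{2a}$ is zero. The helper names are illustrative.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

# Rounded entries as printed in Table 1.
F_t = [[0.5, 0.0, 0.0], [-1.414, 0.707, 0.0], [3.5, -4.0, 1.0]]   # F'
P = [[0.924, -0.383, 0.0], [0.0, 0.0, 1.0], [0.383, 0.924, 0.0]]  # third column assumed
theta = [[1, 1, 1], [1, 2, 4], [1, 2, 4], [1, 3, 9]]
w_2a = [0.8536, 0.3536, 0.0]

h = matvec(P, w_2a)                 # h* = P w_2
x = matvec(transpose(F_t), h)       # x = F h*, since h = F^{-1} x
alpha, beta, gamma = x
fitted = matvec(theta, x)           # Theta x, to compare with d

print([round(v, 3) for v in x])
print(round(beta**2 - 4 * alpha * gamma, 3))   # constraint beta^2 = 4 alpha gamma
print([round(v, 3) for v in fitted])
```

The resulting $x$ should be close to the column $x_a$ of Table 1, with $\beta^2 - 4\alpha\gamma$ near zero up to the rounding of the table entries.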

REFERENCES

Takane, Y., Young, F. W. & De Leeuw, J. Nonmetric individual differences multidimensional scaling: An alternating least-squares method with optimal scaling features. Psychometrika, 1977, 42, 7-67.

Verhelst, N. D. A note on ALSCAL: The estimation of the additive constant. Psychometrika, 1981, 46, 465-468.

Manuscript received 10/27/82 Final version received 6/27/83