Superiority of the r–k Class Estimator Over Some Estimators In A Linear Model

To cite this article: Gülesen Üstündaǧ Şiray & Sadullah Sakallıoǧlu (2012) Superiority of the r–k Class Estimator Over Some Estimators In A Linear Model, Communications in Statistics - Theory and Methods, 41:15, 2819-2832, DOI: 10.1080/03610926.2011.648786. Published online: 13 Jun 2012.
Link to this article: http://dx.doi.org/10.1080/03610926.2011.648786



Communications in Statistics—Theory and Methods, 41: 2819–2832, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print/1532-415X online
DOI: 10.1080/03610926.2011.648786

Superiority of the r − k Class Estimator Over Some Estimators In A Linear Model

GÜLESEN ÜSTÜNDAĞ ŞİRAY AND SADULLAH SAKALLIOĞLU

Department of Statistics, Faculty of Science and Letters, Cukurova University, Adana, Turkey

In regression analysis, the r − k class estimator is proposed as an alternative to the ordinary least squares estimator to overcome the problem of multicollinearity; it is a general estimator that includes the ordinary ridge regression estimator, the principal components regression estimator, and the ordinary least squares estimator as special cases. In this article, we derive the necessary and sufficient conditions for the superiority of the r − k class estimator over each of these estimators under the Mahalanobis loss function by the average loss criterion. Then, we compare these estimators with each other using the same criterion. We also suggest tests to verify whether these conditions are indeed satisfied. Finally, a numerical example and a Monte Carlo simulation are given to illustrate the theoretical results.

Keywords Average loss criterion; r − k class estimator; Mahalanobis loss function; Principal components regression estimator; Ridge regression estimator.

Mathematics Subject Classification 62J05; 62J07.

1. Introduction

In linear regression analysis, the presence of multicollinearity among regressor variables may cause highly unstable least squares estimates of the regression parameters. With multicollinear data, some coefficients may be statistically insignificant and may have the wrong signs. To overcome this weakness of the ordinary least squares (OLS) estimator, alternative estimators have been designed, among them the ordinary ridge regression (ORR) estimator proposed by Hoerl and Kennard (1970), the principal components regression (PCR) estimator, and the r − k class estimator proposed by Baye and Parker (1984).

Let us consider the multiple linear regression model

$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n). \tag{1}$$

Received January 10, 2011; Accepted December 5, 2011
Address correspondence to Gülesen Üstündağ Şiray, Department of Statistics, Faculty of Science and Letters, Cukurova University, Adana 01330, Turkey; E-mail: [email protected]



where $y$ is an observable $n \times 1$ random vector, $X$ is an $n \times p$ matrix of observable regressor variables (where $\mathrm{rank}(X) \le p$) standardized so that $X'X$ is in correlation form, $\beta$ is a $p \times 1$ vector of unknown parameters, and $\varepsilon$ is an $n \times 1$ nonobservable random vector. Let $T = [t_1, t_2, \ldots, t_p]$ be an orthogonal matrix with $T'X'XT = \Lambda$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$ is the diagonal matrix of the eigenvalues of $X'X$ and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. Further, let $T_r = [t_1, t_2, \ldots, t_r]$ be the remaining columns of $T$ after deleting $(p-r)$ columns, where $r \le p$. Thus, $T_r'X'XT_r = \Lambda_r = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_r)$ and $T_{p-r}'X'XT_{p-r} = \Lambda_{p-r} = \mathrm{diag}(\lambda_{r+1}, \lambda_{r+2}, \ldots, \lambda_p)$, where $T_{p-r} = [t_{r+1}, t_{r+2}, \ldots, t_p]$. The $r-k$ class estimator given by Baye and Parker (1984) is

$$\hat{\beta}_r(k) = T_r (T_r'X'XT_r + kI_r)^{-1} T_r'X'y, \qquad k \ge 0. \tag{2}$$

The $r-k$ class estimator is a general estimator which includes the OLS estimator, the PCR estimator, and the ORR estimator as special cases:

$$\hat{\beta}_p(0) = \hat{\beta} = (X'X)^{-1}X'y \text{ is the OLS estimator,} \tag{3}$$

$$\hat{\beta}_r(0) = \hat{\beta}_r = T_r(T_r'X'XT_r)^{-1}T_r'X'y \text{ is the PCR estimator,} \tag{4}$$

$$\hat{\beta}_p(k) = \hat{\beta}(k) = (X'X + kI_p)^{-1}X'y \text{ is the ORR estimator.} \tag{5}$$

Especially in econometric work, researchers attempting to reduce multicollinearity have used either PCR or ORR. The $r-k$ class estimator combines the ridge and principal components techniques into a single estimator. Baye and Parker (1984) showed that the $r-k$ class estimator is better than the PCR estimator by the scalar mean square error (SMSE) criterion. However, they did not compare the $r-k$ class estimator to the ORR estimator or to the OLS estimator. Nomura and Ohkubo (1985) compared the $r-k$ class estimator to the ORR estimator and the OLS estimator by the SMSE criterion. Sarkar (1996) obtained conditions under which the $r-k$ class estimator is superior to the OLS estimator, the PCR estimator, and the ORR estimator by the matrix mean square error (MMSE) criterion.
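For concreteness, definitions (2)–(5) can all be computed from a single eigendecomposition of $X'X$. The following Python sketch is illustrative only (the paper's own computations used Matlab 7.5); the function and variable names are ours.

```python
import numpy as np

def rk_class_estimator(X, y, r, k):
    """r-k class estimator of Baye and Parker (1984), Eq. (2):
    beta_r(k) = T_r (T_r' X'X T_r + k I_r)^{-1} T_r' X' y."""
    lam, T = np.linalg.eigh(X.T @ X)          # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]             # reorder so lambda_1 >= ... >= lambda_p
    T = T[:, order]
    Tr = T[:, :r]                             # the r leading principal directions
    M = Tr.T @ X.T @ X @ Tr + k * np.eye(r)   # T_r' X'X T_r + k I_r
    return Tr @ np.linalg.solve(M, Tr.T @ X.T @ y)

def ols(X, y):                                # Eq. (3): r = p, k = 0
    return rk_class_estimator(X, y, X.shape[1], 0.0)

def pcr(X, y, r):                             # Eq. (4): k = 0
    return rk_class_estimator(X, y, r, 0.0)

def orr(X, y, k):                             # Eq. (5): r = p
    return rk_class_estimator(X, y, X.shape[1], k)
```

With $r = p$ the orthogonality of $T$ makes the function reproduce the usual closed forms, e.g. `orr(X, y, k)` agrees with $(X'X + kI_p)^{-1}X'y$.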

It is interesting to note that studies on biased estimators use the mean square error (MSE) criterion, or equivalently the quadratic loss function, as a measure of estimators' performance. This article extends these studies through the choice of the loss function used to decide on a preferred estimator of $\beta$. We take the loss function to be the Mahalanobis loss function in order to compare the $r-k$ class estimator, the OLS estimator, the PCR estimator, and the ORR estimator with each other. This function is defined from the distance rule, called the minimum distance rule in the context of classification of normal populations by Mahalanobis (1936). The Mahalanobis loss function was previously used by Peddada et al. (1989) for comparing the generalized ridge regression (GRR) estimator and the OLS estimator.

In this article we compare the $r-k$ class estimator to the OLS estimator, the PCR estimator, and the ORR estimator by the average loss criterion, which may be defined as follows. Let $\hat{\beta}_1$ and $\hat{\beta}_2$ be two estimators for a parameter $\beta$. The estimator $\hat{\beta}_1$ is superior to $\hat{\beta}_2$ iff

$$E[L(\hat{\beta}_1)] < E[L(\hat{\beta}_2)], \tag{6}$$


where $L$ denotes the loss function. Clearly, when the loss is squared error, this reduces to the MSE criterion. For an estimator $\hat{\beta}_1$ of $\beta$, the Mahalanobis loss function is defined as

$$L_M(\hat{\beta}_1) = (\hat{\beta}_1 - \beta)'[\mathrm{cov}(\hat{\beta}_1)]^{-1}(\hat{\beta}_1 - \beta), \tag{7}$$

where $\mathrm{cov}(\hat{\beta}_1)$ is the covariance matrix of $\hat{\beta}_1$.

Peddada et al. (1989) discussed the necessary and sufficient conditions for the inadmissibility of the GRR estimator under the Mahalanobis loss function by the average loss criterion. Considering model (1), the GRR estimator, which was given by Hoerl and Kennard (1970), is defined as

$$\hat{\beta}(K) = (X'X + K)^{-1}X'y,$$

where $K$ is a positive definite matrix. When $K = kI_p$ with $k \ge 0$, it reduces to the ORR estimator. We can obtain the comparison of the ORR estimator to the OLS estimator by the average loss criterion by substituting $K = kI_p$ in the comparison of the GRR estimator to the OLS estimator given in Peddada et al. (1989).

In this article, we compare the $r-k$ class estimator to the OLS estimator, the PCR estimator, and the ORR estimator. As a special case, we obtain the comparison of the ORR estimator to the OLS estimator by the average loss criterion, so that we can compare our results with those given by Peddada et al. (1989) when $K = kI_p$. We can also obtain the comparison of the PCR estimator to the OLS estimator and of the PCR estimator to the ORR estimator under the Mahalanobis loss function by the average loss criterion. We then suggest tests to verify whether these conditions are indeed satisfied in Sec. 3. We consider a numerical example and a simulation study to justify the superiority of the mentioned estimators in Secs. 4 and 5.
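The criterion in (6)–(7) can be estimated by straightforward simulation under model (1). A minimal Python sketch; the use of a pseudo-inverse (to cover singular covariance matrices such as that of the PCR estimator) and all names are our own choices, not the paper's:

```python
import numpy as np

def mahalanobis_loss(beta_hat, beta, cov):
    """Eq. (7): (beta_hat - beta)' cov^{-1} (beta_hat - beta).
    pinv is used so rank-deficient covariance matrices are handled too."""
    d = beta_hat - beta
    return d @ np.linalg.pinv(cov) @ d

def average_loss(estimator, cov, X, beta, sigma, n_rep=2000, seed=0):
    """Monte Carlo estimate of E[L_M] under model (1): y = X beta + eps."""
    rng = np.random.default_rng(seed)
    losses = []
    for _ in range(n_rep):
        y = X @ beta + sigma * rng.standard_normal(X.shape[0])
        losses.append(mahalanobis_loss(estimator(X, y), beta, cov))
    return float(np.mean(losses))
```

For the OLS estimator, $L_M(\hat{\beta})$ is exactly $\chi^2_p$ distributed, so the estimated average loss should be close to $p$.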

2. Average Loss Comparisons

The $r-k$ class estimator is given in (2). Let us denote the covariance matrix of the $r-k$ class estimator by $\mathrm{cov}(\hat{\beta}_r(k))$. It is known that

$$\mathrm{cov}(\hat{\beta}_r(k)) = \sigma^2 T_r (\Lambda_r + kI_r)^{-1} \Lambda_r (\Lambda_r + kI_r)^{-1} T_r'. \tag{8}$$

Let us write $S_r(k) = \Lambda_r + kI_r$ and

$$X'X = T_r \Lambda_r T_r' + T_{p-r} \Lambda_{p-r} T_{p-r}'. \tag{9}$$

Using these expressions, $\hat{\beta}_r(k)$ in (2) and $\mathrm{cov}(\hat{\beta}_r(k))$ in (8) can be written in the form

$$\hat{\beta}_r(k) = T_r S_r(k)^{-1} T_r' X' y, \tag{10}$$

$$\mathrm{cov}(\hat{\beta}_r(k)) = \sigma^2 T_r S_r(k)^{-1} \Lambda_r S_r(k)^{-1} T_r'. \tag{11}$$


Now we can write the Mahalanobis loss function of the $r-k$ class estimator as follows (for $r < p$ the covariance matrix in (11) has rank $r$, so its inverse is understood as a generalized inverse):

$$L_M(\hat{\beta}_r(k)) = (\hat{\beta}_r(k) - \beta)'[\mathrm{cov}(\hat{\beta}_r(k))]^{-1}(\hat{\beta}_r(k) - \beta)$$
$$= \sigma^{-2} y'XT_r\Lambda_r^{-1}T_r'X'y - \sigma^{-2} y'XT_r\Lambda_r^{-1}S_r(k)T_r'\beta$$
$$- \sigma^{-2}\beta'T_rS_r(k)\Lambda_r^{-1}T_r'X'y + \sigma^{-2}\beta'T_rS_r(k)\Lambda_r^{-1}S_r(k)T_r'\beta. \tag{12}$$

2.1. Comparison of the r − k Class Estimator to the OLS Estimator

We now compare the $r-k$ class estimator to the OLS estimator. The OLS estimator can be obtained by substituting $k = 0$ and $r = p$ in the $r-k$ class estimator, so the Mahalanobis loss function of the OLS estimator can be obtained by putting $k = 0$ and $r = p$ in the Mahalanobis loss function of the $r-k$ class estimator. Hence, we can write the Mahalanobis loss function of the OLS estimator as follows:

$$L_M(\hat{\beta}) = \sigma^{-2} y'XT\Lambda^{-1}T'X'y - \sigma^{-2}\beta'X'y - \sigma^{-2}y'X\beta + \sigma^{-2}\beta'T\Lambda T'\beta. \tag{13}$$

Using (9), (13) becomes

$$L_M(\hat{\beta}) = \sigma^{-2} y'XT_r\Lambda_r^{-1}T_r'X'y + \sigma^{-2} y'XT_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'X'y$$
$$- \sigma^{-2}\beta'X'y - \sigma^{-2}y'X\beta + \sigma^{-2}\beta'T_r\Lambda_rT_r'\beta + \sigma^{-2}\beta'T_{p-r}\Lambda_{p-r}T_{p-r}'\beta. \tag{14}$$

We now give the following theorem for the comparison of the $r-k$ class estimator to the OLS estimator under the Mahalanobis loss function.

Theorem 2.1. The OLS estimator is superior to the $r-k$ class estimator under the Mahalanobis loss function if and only if

$$p - r \le k^2\sigma^{-2}\beta'TCT'\beta.$$

Proof. From (12) and (14), we get

$$L_M(\hat{\beta}) - L_M(\hat{\beta}_r(k)) = \sigma^{-2} y'XT_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'X'y + \sigma^{-2} y'X\left(T_r\Lambda_r^{-1}S_r(k)T_r' - I_p\right)\beta$$
$$+ \sigma^{-2}\beta'\left(T_rS_r(k)\Lambda_r^{-1}T_r' - I_p\right)X'y + \sigma^{-2}\beta'T_r\left(\Lambda_r - S_r(k)\Lambda_r^{-1}S_r(k)\right)T_r'\beta$$
$$+ \sigma^{-2}\beta'T_{p-r}\Lambda_{p-r}T_{p-r}'\beta. \tag{15}$$

The expectation of (15) is equal to

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}_r(k))\right] = p - r - k^2\sigma^{-2}\beta'TCT'\beta, \tag{16}$$

where $C = \begin{pmatrix} \Lambda_r^{-1} & 0 \\ 0 & 0 \end{pmatrix}$. It can be seen that $p - r$ is positive. Also, since $C$ is a nonnegative definite matrix, $\sigma^{-2}\beta'TCT'\beta$ is nonnegative. By using the average loss criterion


we obtain the necessary and sufficient condition for the superiority of the OLS estimator to the $r-k$ class estimator under the Mahalanobis loss function:

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}_r(k))\right] \le 0 \Leftrightarrow p - r \le k^2\sigma^{-2}\beta'TCT'\beta. \tag{17}$$

Using the expression in (17), we can give a bound on $k$ for which the OLS estimator is superior to the $r-k$ class estimator.

Theorem 2.2. Suppose that

$$\beta'TCT'\beta > 0, \tag{18a}$$

where $C = \begin{pmatrix} \Lambda_r^{-1} & 0 \\ 0 & 0 \end{pmatrix}$. Then the OLS estimator is superior to the $r-k$ class estimator under the Mahalanobis loss function iff

$$k \ge \sqrt{\frac{\sigma^2(p-r)}{\beta'TCT'\beta}}. \tag{18b}$$

Proof. Assume $\beta'TCT'\beta > 0$. From (17),

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}_r(k))\right] \le 0 \Leftrightarrow k^2 \ge \frac{\sigma^2(p-r)}{\beta'TCT'\beta}.$$

In other words, the OLS estimator is superior to the $r-k$ class estimator under the Mahalanobis loss function iff $k \ge \sqrt{\sigma^2(p-r)/(\beta'TCT'\beta)}$.

The bound on $k$ in (18b) depends upon the unknown parameters $\beta$ and $\sigma^2$. For any parameters $\beta$ and $\sigma^2 > 0$, there exists a $k$ such that the estimator $\hat{\beta}_r(k)$ has an average loss not greater than that of $\hat{\beta}$.
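The threshold in (18b) is easy to evaluate once $\beta$ and $\sigma^2$ are supplied (or replaced by estimates). A Python sketch with our own naming, using the identity $\beta'TCT'\beta = \beta'T_r\Lambda_r^{-1}T_r'\beta$:

```python
import numpy as np

def ols_superiority_threshold(X, beta, sigma2, r):
    """Smallest k for which OLS beats the r-k class estimator (Theorem 2.2):
    k >= sqrt(sigma^2 (p - r) / (beta' T C T' beta))."""
    lam, T = np.linalg.eigh(X.T @ X)
    order = np.argsort(lam)[::-1]
    lam, T = lam[order], T[:, order]
    p = X.shape[1]
    # beta' T C T' beta = sum over the r leading directions of (t_j' beta)^2 / lambda_j
    quad = sum((T[:, j] @ beta) ** 2 / lam[j] for j in range(r))
    return float(np.sqrt(sigma2 * (p - r) / quad))
```

For $r = p$ the numerator vanishes and the threshold is 0, matching the fact that the ORR estimator ($r = p$) never beats OLS.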

Next, we compare the ORR estimator to the OLS estimator and the PCR estimator to the OLS estimator as special cases.

2.2. Comparison of the ORR Estimator to the OLS Estimator

It is known that the ORR estimator can be obtained by substituting $r = p$ in the $r-k$ class estimator. Therefore, the comparison of the ORR estimator to the OLS estimator can be obtained by putting $r = p$ in the comparison of the $r-k$ class estimator to the OLS estimator. For $r = p$, the Mahalanobis loss function of the $r-k$ class estimator becomes the Mahalanobis loss function of the ORR estimator:

$$L_M(\hat{\beta}(k)) = \sigma^{-2} y'XT\Lambda^{-1}T'X'y - \sigma^{-2} y'XT\Lambda^{-1}S(k)T'\beta$$
$$- \sigma^{-2}\beta'TS(k)\Lambda^{-1}T'X'y + \sigma^{-2}\beta'TS(k)\Lambda^{-1}S(k)T'\beta, \tag{19}$$

where $S(k) = \Lambda + kI$.


Theorem 2.3. The ORR estimator is never superior to the OLS estimator under the Mahalanobis loss function by the average loss criterion.

Proof. The difference between the Mahalanobis loss functions of the OLS estimator and the ORR estimator is

$$L_M(\hat{\beta}) - L_M(\hat{\beta}(k)) = \sigma^{-2} y'X\left(T\Lambda^{-1}S(k)T' - I_p\right)\beta + \sigma^{-2}\beta'\left(TS(k)\Lambda^{-1}T' - I_p\right)X'y$$
$$+ \sigma^{-2}\beta'T\left(\Lambda - S(k)\Lambda^{-1}S(k)\right)T'\beta. \tag{20}$$

Using the average loss criterion, we get the necessary and sufficient condition for the superiority of the OLS estimator to the ORR estimator under the Mahalanobis loss function:

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}(k))\right] \le 0 \Leftrightarrow 0 \le k^2\sigma^{-2}\beta'T\Lambda^{-1}T'\beta. \tag{21}$$

Since $\Lambda^{-1}$ is positive definite, $\sigma^{-2}\beta'T\Lambda^{-1}T'\beta$ is always nonnegative, so (21) always holds. Hence, the ORR estimator is never superior to the OLS estimator under the Mahalanobis loss function by the average loss criterion.

Peddada et al. (1989) obtained the necessary and sufficient condition for the inadmissibility of the GRR estimator under the Mahalanobis loss function by the average loss criterion. Specifically, when we take $K = kI_p$ in the comparison of the GRR estimator to the OLS estimator given by Peddada et al. (1989), we get the comparison of the ORR estimator to the OLS estimator. In this way, we see that for $K = kI_p$ our result agrees with the result given by Peddada et al. (1989) for the comparison of the ORR estimator to the OLS estimator.

2.3. Comparison of the PCR Estimator to the OLS Estimator

Because the PCR estimator can be obtained by substituting $k = 0$ in the $r-k$ class estimator, the comparison of the PCR estimator to the OLS estimator can be obtained by putting $k = 0$ in the comparison of the $r-k$ class estimator to the OLS estimator. For $k = 0$, the Mahalanobis loss function of the $r-k$ class estimator becomes the Mahalanobis loss function of the PCR estimator:

$$L_M(\hat{\beta}_r) = \sigma^{-2} y'XT_r\Lambda_r^{-1}T_r'X'y - \sigma^{-2} y'XT_rT_r'\beta - \sigma^{-2}\beta'T_rT_r'X'y + \sigma^{-2}\beta'T_r\Lambda_rT_r'\beta. \tag{22}$$

Theorem 2.4. The PCR estimator is always superior to the OLS estimator under the Mahalanobis loss function by the average loss criterion.

Proof. The expectation of the difference between the Mahalanobis loss functions of the OLS estimator and the PCR estimator can be derived from (16) by putting $k = 0$:

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}_r)\right] = p - r. \tag{23}$$


Thus, the necessary and sufficient condition for the superiority of the OLS estimator to the PCR estimator under the Mahalanobis loss function is

$$E\left[L_M(\hat{\beta}) - L_M(\hat{\beta}_r)\right] \le 0 \Leftrightarrow p - r \le 0. \tag{24}$$

Since $p - r$ cannot be negative, the PCR estimator is always superior to the OLS estimator under the Mahalanobis loss function by the average loss criterion.
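Eq. (23) can be verified numerically. The following self-contained Monte Carlo sketch in Python uses an arbitrary design matrix, coefficient vector, and replication count of our own choosing, not the paper's setup:

```python
import numpy as np

# Monte Carlo check of Eq. (23): E[L_M(OLS)] - E[L_M(PCR)] = p - r.
rng = np.random.default_rng(42)
n, p, r, sigma = 60, 4, 2, 1.0
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)

lam, T = np.linalg.eigh(X.T @ X)
order = np.argsort(lam)[::-1]
lam, T = lam[order], T[:, order]
Tr, lam_r = T[:, :r], lam[:r]

loss_ols, loss_pcr = [], []
for _ in range(4000):
    y = X @ beta + sigma * rng.standard_normal(n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    d = b_ols - beta
    loss_ols.append(d @ (X.T @ X) @ d / sigma**2)      # L_M(OLS), Eq. (13)
    w = (Tr.T @ X.T @ y) / lam_r - Tr.T @ beta         # T_r'(beta_r - beta)
    loss_pcr.append(w @ (lam_r * w) / sigma**2)        # L_M(PCR), Eq. (22)

gap = np.mean(loss_ols) - np.mean(loss_pcr)            # should approach p - r = 2
```

Per replicate the difference is exactly a $\chi^2_{p-r}$ variate, so the averaged gap concentrates quickly around $p - r$.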

2.4. Comparison of the r − k Class Estimator to the PCR Estimator

Now, we give the theorem for the comparison of the $r-k$ class estimator to the PCR estimator under the Mahalanobis loss function.

Theorem 2.5. The $r-k$ class estimator is never superior to the PCR estimator under the Mahalanobis loss function by the average loss criterion.

Proof. The Mahalanobis loss function of the PCR estimator was obtained in (22). The difference between the Mahalanobis loss functions of the PCR and $r-k$ class estimators equals

$$L_M(\hat{\beta}_r) - L_M(\hat{\beta}_r(k)) = k\sigma^{-2} y'XT_r\Lambda_r^{-1}T_r'\beta + k\sigma^{-2}\beta'T_r\Lambda_r^{-1}T_r'X'y$$
$$+ \sigma^{-2}\beta'T_r\left(\Lambda_r - S_r(k)\Lambda_r^{-1}S_r(k)\right)T_r'\beta. \tag{25}$$

By using the average loss criterion, we get the necessary and sufficient condition for the superiority of the PCR estimator to the $r-k$ class estimator under the Mahalanobis loss function:

$$E\left[L_M(\hat{\beta}_r) - L_M(\hat{\beta}_r(k))\right] \le 0 \Leftrightarrow -k^2\sigma^{-2}\beta'T_r\Lambda_r^{-1}T_r'\beta \le 0. \tag{26}$$

Since $\Lambda_r^{-1}$ is positive definite, $-k^2\sigma^{-2}\beta'T_r\Lambda_r^{-1}T_r'\beta$ is never positive, so this inequality always holds. Hence, the $r-k$ class estimator is never superior to the PCR estimator under the Mahalanobis loss function by the average loss criterion.

2.5. Comparison of the r − k Class Estimator to the ORR Estimator

We now compare the $r-k$ class estimator to the ORR estimator under the Mahalanobis loss function.

Theorem 2.6. The ORR estimator cannot be superior to the $r-k$ class estimator under the Mahalanobis loss function by the average loss criterion.

Proof. The Mahalanobis loss function of the ORR estimator was obtained in (19). From (12) and (19), we derive the difference between the Mahalanobis loss functions of the ORR estimator and the $r-k$ class estimator:

$$L_M(\hat{\beta}(k)) - L_M(\hat{\beta}_r(k)) = \sigma^{-2} y'XT_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'X'y$$
$$- k\sigma^{-2} y'XT_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'\beta - \sigma^{-2} y'XT_{p-r}T_{p-r}'\beta$$


$$- k\sigma^{-2}\beta'T_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'X'y - \sigma^{-2}\beta'T_{p-r}T_{p-r}'X'y$$
$$+ \sigma^{-2}\beta'T_{p-r}\Lambda_{p-r}T_{p-r}'\beta + 2k\sigma^{-2}\beta'T_{p-r}T_{p-r}'\beta$$
$$+ k^2\sigma^{-2}\beta'T_{p-r}\Lambda_{p-r}^{-1}T_{p-r}'\beta. \tag{27}$$

Using the average loss criterion, we have the necessary and sufficient condition for the superiority of the ORR estimator to the $r-k$ class estimator under the Mahalanobis loss function:

$$E\left[L_M(\hat{\beta}(k)) - L_M(\hat{\beta}_r(k))\right] \le 0 \Leftrightarrow p - r + \sigma^{-2}\beta'TDT'\beta \le 0, \tag{28}$$

where $D = \begin{pmatrix} 0 & 0 \\ 0 & k^2\Lambda_{p-r}^{-1} \end{pmatrix}$. Since $D$ is a nonnegative definite matrix, $\sigma^{-2}\beta'TDT'\beta$ is nonnegative. Also, $p - r$ is positive. Therefore, $p - r + \sigma^{-2}\beta'TDT'\beta$ cannot be negative. Consequently, the ORR estimator cannot be superior to the $r-k$ class estimator under the Mahalanobis loss function by the average loss criterion.

2.6. Comparison of the PCR Estimator to the ORR Estimator

Lastly, we compare the PCR estimator to the ORR estimator under the Mahalanobis loss function.

Theorem 2.7. The ORR estimator cannot be superior to the PCR estimator under the Mahalanobis loss function by the average loss criterion.

Proof. The expectation of the difference between the Mahalanobis loss functions of the ORR estimator given in (19) and the PCR estimator given in (22) equals

$$E\left[L_M(\hat{\beta}(k)) - L_M(\hat{\beta}_r)\right] = p - r + k^2\sigma^{-2}\beta'T\Lambda^{-1}T'\beta. \tag{29}$$

Hence, the necessary and sufficient condition for the superiority of the ORR estimator to the PCR estimator under the Mahalanobis loss function is

$$E\left[L_M(\hat{\beta}(k)) - L_M(\hat{\beta}_r)\right] \le 0 \Leftrightarrow p - r + k^2\sigma^{-2}\beta'T\Lambda^{-1}T'\beta \le 0. \tag{30}$$

Since $\Lambda^{-1}$ is a positive definite matrix, $\beta'T\Lambda^{-1}T'\beta$ is nonnegative. Also, $p - r$ is positive. Therefore, $p - r + k^2\sigma^{-2}\beta'T\Lambda^{-1}T'\beta$ will always be positive. Consequently, the ORR estimator is not superior to the PCR estimator under the Mahalanobis loss function by the average loss criterion.

3. Tests of the Hypothesis for the Conditions

The necessary and sufficient conditions obtained in Theorems 2.1 and 2.2 in the previous section depend on the unknown parameters $\beta$ and $\sigma^2$, so an investigator must have reasons to believe that these conditions hold. In this section we suggest tests to decide whether or not the relevant conditions are satisfied in a given situation.


Let $H$ be a $q \times p$ matrix of rank $q \le p$. Under the assumption that $y$ is distributed as $N(X\beta, \sigma^2 I)$, we can derive a test for the general linear model (1) of the hypothesis

$$H_0: H\beta = t. \tag{31}$$

An $F$ test statistic for testing (31) is

$$F = \frac{(H\hat{\beta} - t)'[H(X'X)^{-1}H']^{-1}(H\hat{\beta} - t)/q}{y'[I - X(X'X)^{-1}X']y/(n-p)}. \tag{32}$$

If $H_0: H\beta = t$ is false, then $F$ is distributed as $F(q, n-p, \lambda)$, where the noncentrality parameter is $\lambda = (H\beta - t)'[H(X'X)^{-1}H']^{-1}(H\beta - t)/2\sigma^2$. If $H_0: H\beta = t$ is true, then $\lambda = 0$ and $F$ is distributed as $F(q, n-p)$ (Rencher, 2000, p. 190).
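The $F$ statistic in (32) can be computed with numpy and scipy; a sketch under our own helper name:

```python
import numpy as np
from scipy import stats

def general_linear_F_test(X, y, H, t):
    """F statistic of Eq. (32) for H0: H beta = t, with H of full row rank q."""
    n, p = X.shape
    q = H.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                      # OLS estimate
    e = y - X @ b
    s2 = (e @ e) / (n - p)                     # y'[I - X(X'X)^{-1}X']y / (n - p)
    diff = H @ b - t
    F = diff @ np.linalg.solve(H @ XtX_inv @ H.T, diff) / q / s2
    pval = stats.f.sf(F, q, n - p)             # F ~ F(q, n - p) under H0
    return float(F), float(pval)
```

Under $H_0$ the reported p-value is a valid central-$F$ tail probability; large $F$ (small p-value) is evidence against $H\beta = t$.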

We can construct a hypothesis test based on Theorem 2.1. The necessary and sufficient condition for the OLS estimator to be superior to the $r-k$ class estimator can be written as $k^2\sigma^{-2}\beta'TCT'\beta/(p-r) \ge 1$. Therefore, if we wish to test whether $\hat{\beta}$ is superior to $\hat{\beta}_r(k)$ with respect to the Mahalanobis loss function, we can test the hypothesis

$$H_0: \frac{k^2\sigma^{-2}\beta'TCT'\beta}{p-r} \ge 1 \tag{33}$$

against the alternative hypothesis $H_1: k^2\sigma^{-2}\beta'TCT'\beta/(p-r) < 1$. Under the assumption of normality,

$$\frac{k\sigma^{-1}C^{1/2}T'\hat{\beta}}{\sqrt{p-r}} \sim N\left(\frac{k\sigma^{-1}C^{1/2}T'\beta}{\sqrt{p-r}},\; \frac{k^2}{p-r}\,C\Lambda^{-1}\right).$$

Hence, $\sigma^{-2}\hat{\beta}'T\Lambda T'\hat{\beta} \sim \chi^2\left(p, \beta'T\Lambda T'\beta/2\sigma^2\right)$, where $\lambda = \beta'T\Lambda T'\beta/2\sigma^2$. The residual sum of squares can be expressed as $y'[I - X(X'X)^{-1}X']y = e'e$, where $e$ is the $n \times 1$ vector of OLS residuals. Thus, the test statistic will be

$$F_1 = \left(\frac{n-p}{p}\right)\frac{\hat{\beta}'T\Lambda T'\hat{\beta}}{e'e} \sim F\left(p,\; n-p,\; \frac{\beta'T\Lambda T'\beta}{2\sigma^2}\right).$$

Obviously, the noncentrality parameter is not smaller than one under $H_0$ and smaller than one under $H_1$.

In addition to this hypothesis test, we can also construct a hypothesis test based on Theorem 2.2 for the condition $\beta'TCT'\beta > 0$. Because $C$ is a nonnegative definite matrix, $\beta'TCT'\beta \ge 0$; hence, we must test whether or not $\beta'TCT'\beta = 0$. If we use the equality $\beta'TCT'\beta = \beta'T_r\Lambda_r^{-1}T_r'\beta$, (18a) is equivalent to $T_r'\beta \ne 0$. Then we can test the hypothesis

$$H_0: T_r'\beta = 0 \tag{34}$$

against $H_1: T_r'\beta \ne 0$. Under the assumption of normality, $T_r'\hat{\beta}_r(k)$ has a normal distribution with expected value $S_r(k)^{-1}T_r'X'X\beta$ and covariance matrix $\sigma^2 S_r(k)^{-1}\Lambda_r S_r(k)^{-1}$. Thus, the test statistic for (34) is given by

$$F_2 = \left(\frac{n-p}{r}\right)\frac{\hat{\beta}_r(k)'T_rS_r(k)\Lambda_r^{-1}S_r(k)T_r'\hat{\beta}_r(k)}{e'e}.$$

It is obvious that $F_2 \sim F(r, n-p)$ under $H_0$.
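$F_2$ can likewise be computed directly; a Python sketch (names ours). Note that because $T_r'\hat{\beta}_r(k) = S_r(k)^{-1}T_r'X'y$, the factors $S_r(k)$ cancel inside the quadratic form, so the statistic does not actually depend on $k$:

```python
import numpy as np
from scipy import stats

def F2_statistic(X, y, r, k):
    """F2 = ((n-p)/r) * b' T_r S_r(k) Lam_r^{-1} S_r(k) T_r' b / e'e,
    with b = beta_r(k), for testing H0: T_r' beta = 0 (Eq. (34))."""
    n, p = X.shape
    lam, T = np.linalg.eigh(X.T @ X)
    order = np.argsort(lam)[::-1]
    lam, T = lam[order], T[:, order]
    Tr, lam_r = T[:, :r], lam[:r]
    z = (Tr.T @ X.T @ y) / (lam_r + k)          # T_r' beta_r(k) = S_r(k)^{-1} T_r' X'y
    quad = np.sum((lam_r + k) ** 2 / lam_r * z ** 2)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b_ols                           # OLS residuals
    F2 = (n - p) / r * quad / (e @ e)
    return float(F2), float(stats.f.sf(F2, r, n - p))   # F(r, n-p) under H0
```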


4. Numerical Example

We now analyze the data generated by Hoerl and Kennard (1981) for the comparisons of the $r-k$ class, OLS, ORR, and PCR estimators under the Mahalanobis loss function by the average loss criterion. The data set was generated by taking a factor structure from a real data set and choosing $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5$ at random with the constraint that $\beta'\beta = 300$, and a normal error $\varepsilon$ with mean zero and $\sigma^2 = 1$. The resulting model is $y = X\beta + \varepsilon$, where $\varepsilon$ is distributed as $N(0, \sigma^2 I)$. Our computations here were performed using Matlab 7.5.

The eigenvalues of $X'X$ are 4.5789, 0.1941, 0.1549, 0.0583, and 0.0138. The condition number is 331.8043, which is large, so collinearity is present. To overcome the collinearity we can use the PCR, ORR, and $r-k$ class estimators.
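The reported condition number is the ratio of the largest to the smallest eigenvalue of $X'X$ and can be reproduced from the values above:

```python
import numpy as np

# Eigenvalues of X'X reported for the Hoerl and Kennard (1981) data set.
eigs = np.array([4.5789, 0.1941, 0.1549, 0.0583, 0.0138])
cond = eigs.max() / eigs.min()   # condition number, approximately 331.8
```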

We use these data to clarify which estimator is superior to the others under the Mahalanobis loss function by the average loss criterion. For this purpose, Table 1 summarizes $E[L_M(\hat{\beta}_r(k))]$ and $E[L_M(\hat{\beta}(k))]$ for various $k$ values. The means of the Mahalanobis loss functions of the ORR and $r-k$ class estimators for $k = 0$ are equal to those of the OLS estimator and the PCR estimator, $E[L_M(\hat{\beta})] = 5$ and $E[L_M(\hat{\beta}_r)] = 1$, respectively.

Table 1
Values of $E[L_M(\hat{\beta}_r(k))]$ and $E[L_M(\hat{\beta}(k))]$ for various $k$ values

k         E[L_M(β̂_r(k))]    E[L_M(β̂(k))]
0              1.0000            5.0000
0.01           1.0053            5.1122
0.02           1.0214            5.4487
0.03           1.0481            6.0096
0.04           1.0856            6.7949
0.05           1.1337            7.8045
0.06           1.1925            9.0385
0.07           1.2620           10.4968
0.08           1.3423           12.1795
0.09           1.4331           14.0866
0.10           1.5348           16.2180
0.15           2.2032           30.2405
0.20           3.1390           49.8719
0.25           4.3422           75.1124
0.2736         5.0030           88.9744
0.30           5.8128          105.9618
0.40           9.5560          184.4877
0.50          14.3688          285.4495
0.60          20.2510          408.8473
0.70          27.2028          554.6811
0.80          35.2240          722.9508
0.90          44.3148          913.6565
1             54.4750         1126.798


Figure 1. $E[L_M(\hat{\beta}_r(k))]$ and $E[L_M(\hat{\beta})]$ vs. $k$.

We observe that the PCR estimator is always superior to the OLS, ORR, and $r-k$ class estimators under the Mahalanobis loss function by the average loss criterion. In Sec. 2.2 we stated that the ORR estimator is never superior to the OLS estimator, and in Sec. 2.5 we showed that the ORR estimator cannot be superior to the $r-k$ class estimator under the Mahalanobis loss function by the average loss criterion. The numerical results in Table 1 support these statements.

As stated in Theorem 2.2, for $k < 0.2736$ the $r-k$ class estimator is superior to the OLS estimator, while for $k \ge 0.2736$ the OLS estimator is superior. Figure 1 shows the mean of the Mahalanobis loss functions of the OLS and $r-k$ class estimators as functions of $k$. The result obtained for the superiority of the $r-k$ class estimator over the OLS estimator in Sec. 2.1 can be seen in Fig. 1.

5. Monte Carlo Simulation Study

In this section we discuss a simulation study, carried out in Matlab 7.5, to compare the performance of the $r-k$ class, OLS, ORR, and PCR estimators under the Mahalanobis loss function. Following McDonald and Galarneau (1975) and Kibria (2003), the explanatory variables are generated by

$$x_{ij} = (1 - \gamma^2)^{1/2} z_{ij} + \gamma z_{i,p+1}, \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, p,$$

where the $z_{ij}$ are independent standard normal pseudo-random numbers and $\gamma$ is specified so that the correlation between any two explanatory variables is $\gamma^2$. In this study, to investigate the effects of different degrees of collinearity on the estimators, we consider $\gamma = 0.85$ and $0.95$. The corresponding condition numbers indicate weak to severe collinearity. We simulated data with sample sizes $n = 15$, 50, and 100 and $p = 4$.

The dependent variable $y_i$ is then generated by the following equation:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \qquad i = 1, 2, \ldots, n,$$


where the $\varepsilon_i$ are independent normal pseudo-random numbers with mean 0 and variance $\sigma^2$, and $\beta_0$ is taken to be identically zero. The standard deviations considered in the simulation study are $\sigma = 0.1$, 0.5, 1, and 4. The explanatory variables are standardized so that $X'X$ is in correlation form, and the dependent variable is standardized so that $X'y$ is the vector of correlations of the dependent variable with each explanatory variable. Estimated coefficients are then converted back to the scale of the original variables.
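The generation scheme above is easy to reproduce. A Python sketch (the function name and seeding are our own, and the subsequent standardization step is omitted):

```python
import numpy as np

def make_collinear_data(n, p, gamma, beta, sigma, seed=0):
    """McDonald-Galarneau scheme: x_ij = sqrt(1 - gamma^2) z_ij + gamma z_{i,p+1},
    so any two regressors have correlation gamma^2; y follows model (1)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1.0 - gamma**2) * Z[:, :p] + gamma * Z[:, [p]]  # shared last column
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y
```

Each regressor has unit variance by construction, and the shared column $z_{i,p+1}$ induces the pairwise correlation $\gamma^2$.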

Table 2
Estimated Mahalanobis losses of the OLS, PCR, ORR and r − k class estimators
for n = 15, p = 4

                  σ = 0.1            σ = 0.5            σ = 1              σ = 4
γ                 0.85     0.95     0.85     0.95     0.85     0.95     0.85     0.95
β̂ (OLS)          5.2867   5.2983   5.2867   5.2983   5.2867   5.2983   5.2867   5.2983
β̂_r (PCR)        1.2926   1.2976   1.2926   1.2976   1.2926   1.2976   1.2926   1.2976
β̂(k) (ORR)       6.3160   6.1709   5.7972   5.5326   5.5018   5.3822   5.3069   5.3060
β̂_r(k) (r − k)   2.3219   2.1701   1.8031   1.5319   1.5076   1.3815   1.3128   1.3052
κ                15.2236  50.2128  15.2236  50.2128  15.2236  50.2128  15.2236  50.2128

Table 3
Estimated Mahalanobis losses of the OLS, PCR, ORR and r − k class estimators
for n = 50, p = 4

                  σ = 0.1            σ = 0.5            σ = 1              σ = 4
γ                 0.85     0.95     0.85     0.95     0.85     0.95     0.85     0.95
β̂ (OLS)          4.2481   4.2512   4.2481   4.2512   4.2481   4.2512   4.2481   4.2512
β̂_r (PCR)        1.0953   1.0968   1.0953   1.0968   1.0953   1.0968   1.0953   1.0968
β̂(k) (ORR)       5.2505   5.0951   4.7587   4.4946   4.4850   4.3421   4.2664   4.2585
β̂_r(k) (r − k)   2.0977   1.9406   1.6059   1.3402   1.3322   1.1867   1.1136   1.1041
κ                16.7673  54.8149  16.7673  54.8149  16.7673  54.8149  16.7673  54.8149

Table 4
Estimated Mahalanobis losses of the OLS, PCR, ORR and r − k class estimators
for n = 100, p = 4

                  σ = 0.1            σ = 0.5            σ = 1              σ = 4
γ                 0.85     0.95     0.85     0.95     0.85     0.95     0.85     0.95
β̂ (OLS)          4.1729   4.1726   4.1729   4.1726   4.1729   4.1726   4.1729   4.1726
β̂_r (PCR)        1.0533   1.0545   1.0533   1.0545   1.0533   1.0545   1.0533   1.0545
β̂(k) (ORR)       5.1576   5.0121   4.6917   4.4196   4.4080   4.2650   4.1921   4.1801
β̂_r(k) (r − k)   2.0380   1.8940   1.5721   1.3015   1.2884   1.1469   1.0725   1.0620
κ                12.5810  42.7890  12.5810  42.7890  12.5810  42.7890  12.5810  42.7890


To ensure k < √(σ²(p − r)/(β′TCT′β)) (Theorem 2.1), we choose k = √(σ²/(β′T_r Λ_r^{−1} T_r′ β)).
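A minimal sketch of this choice of k, assuming T_r holds the r retained eigenvectors of X′X and Λ_r the corresponding eigenvalues (the function and argument names below are mine):

```python
import numpy as np

def choose_k(sigma2, beta, T_r, lam_r):
    """k = sqrt(sigma^2 / (beta' T_r Lambda_r^{-1} T_r' beta)), the
    data-driven ridge parameter quoted above."""
    w = T_r.T @ beta              # coordinates of beta on the retained components
    denom = np.sum(w**2 / lam_r)  # beta' T_r Lambda_r^{-1} T_r' beta
    return float(np.sqrt(sigma2 / denom))
```

In practice σ² and β would be replaced by their sample estimates, so that k is computable from the data.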

The experiment is then replicated 2000 times by generating new error terms. We use the average loss criterion to investigate the performance of the r − k class, OLS, ORR, and PCR estimators. The estimated Mahalanobis loss for any estimator β̂* is calculated as follows:

L_M = (1/MCN) Σ_{r=1}^{MCN} (β̂*_r − β)′ [cov(β̂*_r)]^{−1} (β̂*_r − β),

where β̂*_r is the computed value of β̂* for the rth replication of the experiment and MCN is the number of replications, taken to be 2000 for this experiment.
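The average-loss computation can be sketched as follows, assuming the estimator's covariance matrix is supplied (the names are mine):

```python
import numpy as np

def mahalanobis_loss(beta_hats, beta, cov):
    """L_M = (1/MCN) * sum_r (b_r - beta)' cov^{-1} (b_r - beta),
    where beta_hats holds one replication's estimate per row."""
    d = beta_hats - beta                              # (MCN, p) deviations
    sol = np.linalg.solve(cov, d.T)                   # cov^{-1}(b_r - beta), columnwise
    return float(np.mean(np.sum(d.T * sol, axis=0)))  # average quadratic form
```

Solving the linear system rather than explicitly inverting cov is the standard numerically safer choice; the result is identical for a well-conditioned covariance matrix.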

The results of the simulation study are summarized in Tables 2–4. In the tables, the degree of multicollinearity is given by the condition number κ = λ_max/λ_min.

From the simulation results shown in Tables 2–4, we can see that as σ increases, the estimated Mahalanobis loss values of the ORR and r − k class estimators decrease. Moreover, as the sample size n increases, the estimated Mahalanobis loss values of the OLS, PCR, ORR, and r − k class estimators decrease.

We can see that in all cases the PCR estimator outperforms the other estimators, and the ORR estimator is never superior to the other estimators. The simulation results thus support the findings of this article.

6. Conclusions

In this article, we obtained necessary and sufficient conditions for the superiority of the r − k class estimator over the OLS, PCR, and ORR estimators under the Mahalanobis loss function, which differs from the quadratic loss function, using the average loss criterion. We also compared the OLS, PCR, and ORR estimators with each other. Some of the conditions obtained in the comparisons depend on the unknown parameters β and σ², so we constructed tests to decide whether or not the relevant conditions are satisfied. Because the r − k class estimator combines the ridge and principal components techniques into a single estimator, it is preferred by researchers. Although Baye and Parker (1984) showed that the r − k class estimator is better than the PCR estimator under the SMSE criterion, we have seen that the PCR estimator outperforms the r − k class estimator under the Mahalanobis loss function.

Acknowledgments

This research was supported by Çukurova University Academic Research Projects under project number FEF-2008D17. The first author was supported by TUBITAK-BIDEB.

References

Baye, M. R., Parker, D. F. (1984). Combining ridge and principal component regression: a money demand illustration. Commun. Statist. Theor. Meth. 13:197–205.


Hoerl, A. E., Kennard, R. W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12:55–67.

Hoerl, A. E., Kennard, R. W. (1981). Ridge regression, 1980: advances, algorithms, and applications. Amer. J. Mathemat. Manage. Sci. 1:5–83.

Kibria, B. M. G. (2003). Performance of some new ridge regression estimators. Commun. Statist. Simul. Comput. 32(2):419–435.

Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Instit. Sci. India 12:49–55.

McDonald, G. C., Galarneau, D. I. (1975). A Monte Carlo evaluation of some ridge-type estimators. J. Amer. Statist. Assoc. 70:407–416.

Nomura, M., Ohkubo, T. (1985). A note on combining ridge and principal component regression. Commun. Statist. Theor. Meth. 14:2489–2493.

Peddada, S. D., Nigam, A. K., Saxena, A. K. (1989). On the inadmissibility of ridge estimator in a linear model. Commun. Statist. Theor. Meth. 18(10):3571–3585.

Rencher, A. C. (2000). Linear Models in Statistics. New York: Wiley.

Sarkar, N. (1996). Mean square error matrix comparison of some estimators in linear regressions with multicollinearity. Statist. Probab. Lett. 30:133–138.
