founded 1348

DESCRIPTION

http://www.fsv.cuni.cz. Charles University. Founded 1348. World Congress of the Bernoulli Society, Barcelona, 25. - 31. 7. 2004. LEAST WEIGHTED SQUARES FOR PANEL DATA.

TRANSCRIPT

Page 1: Founded 1348

Founded 1348 Charles University

http://www.fsv.cuni.cz

Page 2: Founded 1348

LEAST WEIGHTED SQUARES FOR PANEL DATA

Jan Ámos Víšek

Institute of Economic Studies, Faculty of Social Sciences, Charles University, Prague

Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic

World Congress of the Bernoulli Society, Barcelona, 25. - 31. 7. 2004

http://samba.fsv.cuni.cz/~visek/bernoulli

Page 3: Founded 1348

Topic of presentation

● Definition of the Least Weighted Squares

● Their properties

● Paradigm of the robust estimation (which the Least Weighted Squares fulfill)

● Algorithm for their evaluation

Page 4: Founded 1348

Robust regression

Requirements on an estimator of regression coefficients naturally inherited from the classical statistics

● Unbiasedness
(Nearly impossible to fulfill for robust estimators, hence abandoned.)

● Consistency

● Asymptotic normality

● Reasonably high efficiency

● Scale- and regression-equivariance
(Nearly “automatically” fulfilled for “classical” estimators, hence frequently unduly ignored in robust regression.)

E.g. simple M-estimators lack this property, for discussion see

Bickel, P.J. (1975): One-step Huber estimates in the linear model. JASA 70, 428-433.

Jurečková J., P. K. Sen (1984): On adaptive scale-equivariant M-estimators in linear models. Statistics and Decisions, vol. 2 (1984), Suppl. Issue No.1.

Page 5: Founded 1348

Robust regression

Requirements on an estimator of regression coefficients naturally stemming from principles of robustness

● Quite low gross-error sensitivity

● Low local shift sensitivity

● Preferably finite rejection point

● High breakdown point

Let’s call these four points Hampel’s paradigm.

Hampel, F. R., E. M. Ronchetti, P. J. Rousseeuw, W. A. Stahel (1986): Robust Statistics - The Approach Based on Influence Functions. New York: J.Wiley & Sons.

The applications indicated that “high” should be substituted by “controllable”, see e.g.

Víšek, J. Á. (2003): Development of the Czech export in nineties. In: Consolidation of governing and business in the Czech republic and EU I., 193 - 220, ISBN 80-86732-00-2, MatFyz Press.

If interested, ask me to send it by e-mail.

Page 6: Founded 1348

Robust regression

Requirements on an estimator of regression coefficients naturally stemming from ..... - a comment

High breakdown point may be sometimes self-destructive

Let us agree, for a while, that the majority of data determines the “true” model.

Then a small change of even one observation can cause a large change of the estimate.

What is the problem? The method relies too much on the selected “true” points!

Hence, it may be preferable to reject observations “smoothly”.
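The difference between rejecting observations sharply and rejecting them “smoothly” can be illustrated on the weights assigned to the ordered squared residuals. The sketch below is only an illustration: the function names and the linear shape of the smooth tail are assumptions, not taken from the slides.

```python
def hard_trim_weights(n, h):
    """Weight 1 for the h smallest squared residuals, 0 for all the others."""
    return [1.0 if i < h else 0.0 for i in range(n)]


def smooth_weights(n, h):
    """Weight 1 up to rank h, then a linear decrease (an illustrative choice)."""
    tail = n - h
    return [1.0 if i < h else (n - i) / (tail + 1) for i in range(n)]


def largest_jump(weights):
    """Largest drop in weight between two neighbouring ranks."""
    return max(a - b for a, b in zip(weights, weights[1:]))


if __name__ == "__main__":
    # An observation moving across the cut-off rank loses its whole weight
    # under hard trimming, but only a fraction of it under smooth weights.
    print(largest_jump(hard_trim_weights(10, 6)))
    print(largest_jump(smooth_weights(10, 6)))
```

With hard trimming a single observation crossing the cut-off switches between full weight and no weight, which is one way to read the remark that the method relies too much on the selected points.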

Page 7: Founded 1348

Robust regression

Requirements on an estimator of regression coefficients (nearly) inevitable for successful applications

● Available diagnostics, sensitivity studies and accompanying procedures

● Existence of an implementation of the algorithm with acceptable complexity and reliability of evaluation

● An efficient and acceptable heuristics

Víšek, J.Á. (2000): A new paradigm of point estimation. Proc. of Data Analysis 2000/II, Modern Statistical Methods - Modeling, Regression, Classification and Data Mining, ISBN 80-238-6590-0, 195 - 230.

If interested, ask me to send it by e-mail.

Let’s discuss them point by point.

Page 8: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

Available diagnostics, sensitivity studies and ......

Víšek, J.Á. (2003): Estimating contamination level. Proc. Fifth Pannonian Sympos. on Math. Statist., Visegrad, Hungary 1985, 401 - 414.

Víšek, J.Á. (1998): Robust specification test. Proc. Prague Stochastics'98 (eds. M. Hušková, P. Lachout, Union of Czechoslovak Mathematicians and Physicists), 1998, 581 - 586.

Víšek, J.Á. (2001): Durbin-Watson statistic for the least trimmed squares. Bulletin of the Czech Econometric Society, vol. 8, 14/2001, 1 - 40.

Víšek, J.Á. (2002): White test for the least weighted squares. COMPSTAT 2002, Berlin, Short Communications and Poster (CD), ISBN 3-00-009819-4 (eds. S. Klinke, P. Ahrend, L. Richter).

Víšek, J.Á. (2003): Durbin-Watson statistic in robust regression. Probability and Mathematical Statistics, vol. 23, Fasc. 2 (2003), 435 - 483.

Kalina, J. (2003): Autocorrelated disturbances of robust regression. European Young Statistician Meeting 2003 - to appear.

If interested, ask me to send them by e-mail.

Page 9: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

Available diagnostics, sensitivity studies and accompanying procedures

Jurečková J., J. Á. Víšek (1984): Sensitivity of Chow-Robbins procedure to the contamination. Commun. Statist. - Sequential Analys. 1984, 3 (2), 175 - 190.

Víšek, J.Á. (1986): Sensitivity of the test error probabilities with respect to the level of contamination in general model of contaminacy. J. Statist. Planning and Inference 14 (1986), 281 - 299.

Víšek, J.Á. (1996): Sensitivity analysis of M-estimates. Ann. Inst. Statist. Math., 48 (1996), 469 - 495.

Víšek, J.Á. (1997): Contamination level and sensitivity of robust tests. Handbook of Statist. 15, 633 - 642 (eds. G. S. Maddala & C. R. Rao) Amsterdam: Elsevier Science B. V.

Víšek, J.Á. (2002): Sensitivity analysis of M-estimates of nonlinear regression model: Influence of data subsets. Ann. Inst. Statist. Math., 54, 2, 261 - 290.

If interested, ask me for reprints.

Page 10: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

Available diagnostics, sensitivity studies and accompanying procedures

Víšek, J.Á. (1998): Robust instruments. Proc. Robust'98 (ed. J. Antoch & G. Dohnal) Union of Czechoslovak Mathematicians and Physicists, 195 - 224.

Víšek, J.Á. (2000): Robust instrumental variables and specification test. Proc. PRASTAN 2000, ISBN 80-227-1486-0, 133 - 164.

Víšek, J.Á. (1996): Selecting regression model. Probability and Mathematical Statistics 21, 2 (2001), 467 - 492.

Víšek, J.Á. (1997): Robustifying instrumental variables. Submitted to COMPSTAT 2004.

If interested, ask me to send them by e-mail.

Page 11: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

Existence of an implementation of the algorithm with acceptable complexity and reliability of evaluation

Engine knock data - treated by the Least Median of Squares

Hettmansperger, T.P., S. J. Sheather (1992): A Cautionary Note on the Method of Least Median Squares. The American Statistician 46, 79-83.

Number of observations: 16

Response variable: Number of knocks of an engine

Explanatory variables:
- the timing of sparks
- air / fuel ratio
- intake temperature
- exhaust temperature

A small change (7.2%) of one value in the data caused a large change of the estimates. The results are on the next page.

The results were due to the bad algorithm the authors used.

Page 12: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

Existence of an implementation of the algorithm with ....

Engine knock data - results by Hettmansperger and Sheather

Data      Intrc.   spark   air    intake   exhaust   11th res.
Correct    30.08    0.21   2.90    0.56     0.93      0.570
Wrong     -86.5     4.59   1.21    1.47     0.069     0.328

A new algorithm, based on simplex method, was nearly immediately available, although published a bit later.

Boček, P., P. Lachout (1995): Linear programming approach to LMS-estimation. Mem. vol. Comput. Statist. & Data Analysis 19 (1995), 129 - 134.

Data      Intrc.   spark   air    intake   exhaust   11th res.
Correct    30.04    0.14   3.08    0.46    -0.007     0.450
Wrong      48.38   -0.73   3.39    0.19    -0.011     0.203

11th res.: minimized squared residual

It indicates that the reliability of the algorithm and its implementation is crucial.
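The column “11th res.” presumably reports the criterion minimized by the Least Median of Squares, i.e. the 11th smallest squared residual. This would agree with the common choice h = [n/2] + [(p+1)/2], which gives 11 for n = 16 observations and p = 5 coefficients (intercept plus four explanatory variables). That reading is an assumption, and the function name below is hypothetical. A minimal sketch of how such a criterion can be evaluated for a given vector of residuals:

```python
def lms_criterion(residuals, p):
    """Return (h, h-th smallest squared residual) for p regression coefficients.

    h = n // 2 + (p + 1) // 2 is a common choice of the order statistic;
    whether the slides used exactly this h is an assumption.
    """
    n = len(residuals)
    h = n // 2 + (p + 1) // 2
    squared = sorted(r * r for r in residuals)
    return h, squared[h - 1]
```

Comparing this value between two fits of the same data shows which fit is better in the LMS sense; in the tables above the simplex-based algorithm reaches smaller values (0.450 and 0.203) than those reported for the original results (0.570 and 0.328).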

Page 13: Founded 1348

Requirements on a robust estimator of regression coefficients (nearly) inevitable for successful applications

An efficient and acceptable heuristics (?)

In 1989 Martin et al. studied estimators minimizing the maximal bias

Martin, R. D., V. J. Yohai, R. H. Zamar (1989): Min-max bias robust regression. Ann. Statist. 17, 1608 - 1630.

- maximum was taken over some set of underlying d.f.’s and minimum over possible estimators,

- it seems quite an acceptable heuristics, unfortunately it does not work,

- the problem is that the method implicitly takes maximum over an “unexpected” set of d.f.’s,

- for the example of data for which the min-max-estimator failed see

Víšek, J.Á. (2000): On the diversity of estimates. CSDA 34, (2000) 67 - 89.

But papers like

Hansen, L. P. (1982): Large sample properties of generalized method of moments estimators. Econometrica, 50, no 4, 1029 - 1054.

hint that, in the case of sufficient “demand for data-processing”, we may “cope” without any heuristics.

Page 14: Founded 1348

The least weighted squares

Weight function: non-increasing, absolutely continuous

Víšek, J.Á. (2000): Regression with high breakdown point. ROBUST 2000, 324 - 356, ISBN 80-7015-792-5.

If interested, ask me to send it by e-mail.
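The defining formula of this slide is not preserved in the transcript. The following is a hedged reconstruction along the lines of the usual definition of the estimator; the symbols $r_i(\beta)$, $r^2_{(i)}(\beta)$ and $w$ are notation assumed here, not copied from the slide.

```latex
r_i(\beta) = Y_i - X_i^{\top}\beta, \qquad
r^2_{(1)}(\beta) \le r^2_{(2)}(\beta) \le \dots \le r^2_{(n)}(\beta),

\hat\beta^{(LWS,n,w)} = \operatorname*{arg\,min}_{\beta \in R^p}
  \sum_{i=1}^{n} w\!\left(\frac{i-1}{n}\right) r^2_{(i)}(\beta),
\qquad
w : [0,1] \to [0,1] \ \text{non-increasing, absolutely continuous}.
```

Under this reading the weights are attached to the order statistics of the squared residuals and not to fixed observations, so the largest squared residuals receive the smallest weights.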

Page 15: Founded 1348

The least weighted squares

Both in the framework of random carriers as well as for deterministic ones we have consistency, asymptotic normality and Bahadur representation of the Least Weighted Squares.

Mašíček, L. (2003): Consistency of the least weighted squares estimator. To appear in Kybernetika.

Plát, P. (2003): Nejmenší vážené čtverce. (The Least Weighted Squares, in Czech.) Diploma thesis at the Faculty of Nuclear and Physical Engineering, the Czech Technical University, Prague.

Mašíček, L. (2003): Diagnostika a sensitivita robustního odhadu. (Diagnostics and sensitivity of robust estimators, in Czech.) Dissertation at the Faculty of Mathematics, Charles University.

There are also some optimality results

Mašíček, L. (2003): Optimality of the least weighted squares estimator. To appear in the Proceedings of ICORS'2003.

Page 16: Founded 1348

The least weighted squares

There is also an algorithm for evaluating the LEAST WEIGHTED SQUARES.

It is a modification of the algorithm for the LEAST TRIMMED SQUARES which was described and tested in:

Víšek, J.Á. (1996): On high breakdown point estimation. Computational Statistics (1996) 11:137-146.

Víšek, J.Á. (2000): On the diversity of estimates. CSDA 34, (2000) 67 - 89.

Čížek, P., J. Á. Víšek (2000): The least trimmed squares. User Guide of Explore, Humboldt University.

(Of course, the algorithm for LTS is available in the package EXPLORE.)

If interested, ask me to send a copy.

Page 17: Founded 1348

The least weighted squares - algorithm

A. Select randomly p + 1 observations and find the regression plane through them.

Put

Evaluate squared residuals for all observations, order these squared residuals from the largest one to the smallest, multiply them by the weights and evaluate the sum of these products.

Is this sum of weighted squared residuals smaller than the sum from the previous step?

Yes: Order observations in the same order as the squared residuals, apply the classical weighted least squares on them with the weights, find a new regression plane and evaluate the squared residuals again.

No: Continue at B.

Page 18: Founded 1348

The least weighted squares - algorithm (continued)

B

Have we already found 20 identical models (20 being an arbitrary reasonable number), or have we exhausted the a priori given number of repetitions?

Yes: End of evaluation.

No: Return to A.

The algorithm is available in MATLAB.

In the cases when we were able to pass through all n! orders of observations (fewer than 10 observations), i.e. when we were able to find the LEAST WEIGHTED SQUARES estimator precisely, the algorithm returned the same value.
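The loop described on these two slides can be sketched in a few lines. This is not the MATLAB implementation mentioned above, only a plain-Python illustration under stated assumptions: the names `solve`, `wls`, `lws`, `max_starts`, `identical` and `max_steps` are hypothetical, p is taken as the number of columns of X (intercept included), the starting plane is fitted by least squares to p + 1 random observations, and the squared residuals are sorted in ascending order so that `weights[k]` (non-increasing in k) belongs to the k-th smallest one.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination; return None if singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            return None
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * v for x, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(X, y, w):
    """Classical weighted least squares via the normal equations."""
    p = len(X[0])
    a = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X)) for k in range(p)]
         for j in range(p)]
    b = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, y)) for j in range(p)]
    return solve(a, b)


def lws(X, y, weights, max_starts=200, identical=20, max_steps=100, seed=0):
    """Random restarts, each followed by reweighting while the criterion drops."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    best, best_obj, hits = None, float("inf"), 0
    for start in range(max_starts):
        idx = rng.sample(range(n), p + 1)              # A: random start
        beta = wls([X[i] for i in idx], [y[i] for i in idx], [1.0] * (p + 1))
        obj, kept = float("inf"), None
        for step in range(max_steps):
            if beta is None:
                break
            r2 = [(yi - sum(c * x for c, x in zip(beta, xi))) ** 2
                  for xi, yi in zip(X, y)]
            order = sorted(range(n), key=lambda i: r2[i])
            new_obj = sum(weights[k] * r2[i] for k, i in enumerate(order))
            if not new_obj < obj:                      # no improvement: stop
                break
            obj, kept = new_obj, beta
            w = [0.0] * n
            for k, i in enumerate(order):              # weight follows the rank
                w[i] = weights[k]
            beta = wls(X, y, w)
        if kept is None:
            continue
        tol = 1e-9 * (1.0 + obj)
        if obj < best_obj - tol:                       # a better model
            best, best_obj, hits = kept, obj, 1
        elif abs(obj - best_obj) <= tol:               # the same model again
            hits += 1
        if hits >= identical:                          # B: stopping rule
            break
    return best, best_obj
```

Here two models are treated as “identical” when their criterion values agree up to a small tolerance, which is a simplification made for the sketch; comparing the coefficient vectors directly would be an equally plausible reading of the slide.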

Page 19: Founded 1348

Unifying GMM and robust approach

Brief repetition of already introduced framework

An assumed “causal” model

Observed data

We would like to estimate the model consistently. It seems that we have to believe that the disturbances are orthogonal to the model! It is evident that this can’t be generally true.

GMM weighted estimation

We look for some instruments which are close to the model, however orthogonal to the disturbances!

Page 20: Founded 1348

Unifying GMM and robust approach (continued)

GMM weighted estimation

Instruments

Disturbances

Ordered statistics of the squared disturbances

Weight function: non-increasing, absolutely continuous

Page 21: Founded 1348

Unifying GMM and robust approach (continued)

GMM weighted estimation

Ranks of the squared disturbances

Orthogonality conditions

Kronecker product

This equality defines function Th

Page 22: Founded 1348

Unifying GMM and robust approach (continued)

GMM weighted estimation

Residuals

Ordered statistics of the squared residuals

Ranks of the squared residuals

Empirical counterpart to the orthogonality conditions
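The formulas of these slides are not preserved in the transcript. As a rough orientation only, and assuming notation that is not taken from the slides (instruments $Z_i$, residuals $r_i(\beta)$, rank $\pi(\beta,i)$ of the i-th squared residual among all squared residuals, weight function $w$), an empirical counterpart to weighted orthogonality conditions might take a form such as:

```latex
\sum_{i=1}^{n} w\!\left(\frac{\pi(\beta,i)-1}{n}\right) Z_i \, r_i(\beta) = 0,
\qquad r_i(\beta) = Y_i - X_i^{\top}\beta .
```

The version for panel data presented on the slides, which involves a Kronecker product, may well differ from this simplified cross-sectional form.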

Page 23: Founded 1348

Unifying GMM and robust approach (continued)

GMM weighted estimation

Empirical counterpart to the orthogonality conditions

and its covariance matrix

Page 24: Founded 1348
Page 25: Founded 1348

THANKS for ATTENTION