imputtiona of complex data with r-package vim: … · imputtiona of complex data with r-package...

67
,

Upload: others

Post on 01-Feb-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IMPUTATION OF COMPLEX DATA WITH R-PACKAGEVIM: TRADITIONAL AND NEW METHODS BASED

ON ROBUST ESTIMATION.Key Invited Paper

Matthias Templ1,2, Alexander Kowarik1, Peter Filzmoser2

1 Department of Methodology, Statistics Austria2 Department of Statistics and Probability Theory, TU WIEN, Austria

UNECE Worksession on data editingLjubljana, Mai 10, 2011

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 1 / 50

Page 2: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Content

1 Overview

2 Visualisation Tools

3 Robust Imputation: Motivation

4 Challenges

5 IRMI

6 Simulation results

7 Results from real-world data

8 Small walkthrough to VIM

9 Conclusion

1 Overview

2 Visualisation Tools

3 Robust Imputation: Motivation

4 Challenges

5 IRMI

6 Simulation results

7 Results from real-world data

8 Small walkthrough to VIM

9 Conclusion

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 2 / 50

Page 3: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Overview

R-package VIM

VIM = Visualization and Imputation of Missings

Univariate, bivariate, multiple and multivariate plot methods tohighlight missing values in complex data sets to learn about theirstructure (MCAR, MAR, MNAR). Comes with a GUI as well.

Hot-deck, k-NN and EM-based (robust) imputation methods forcomplex data sets. Due to time reasons we mostly concentrate onEM-based imputation. For hot-deck and k-NN, please have a look atthe paper.

VIM-book in 2012

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 3 / 50

Page 4: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 4 / 50

Page 5: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 5 / 50

Page 6: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 6 / 50

Page 7: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 7 / 50

Page 8: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 8 / 50

Page 9: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 9 / 50

Page 10: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 10 / 50

Page 11: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 11 / 50

Page 12: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Pro

port

ion

of m

issi

ngs

0.00

0.02

0.04

0.06

0.08

0.10

py01

0n

py03

5n

py05

0n

py07

0n

py08

0n

py09

0n

py10

0n

py11

0n

py12

0n

py13

0n

py14

0n

Com

bina

tions

py01

0n

py03

5n

py05

0n

py07

0n

py08

0n

py09

0n

py10

0n

py11

0n

py12

0n

py13

0n

py14

0n

3827425119534832231311119865443222222211111111111

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 12 / 50

Page 13: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

0.0

0.2

0.4

0.6

0.8

1.0

P033000

mis

sing

/obs

erve

d in

py0

10n

0 5 10 15 20 25 35

mis

sing

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 13 / 50

Page 14: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI . . .

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 14 / 50

Page 15: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI - Imputation

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 15 / 50

Page 16: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI - Imputation

kNN(data , variable=colnames(data), metric=NULL , k=5,

dist_var=colnames(data),weights=NULL ,numFun = median ,

catFun=maxCat ,makeNA=NULL ,NAcond=NULL , impNA=TRUE ,

donorcond=NULL ,mixed=vector(),trace=FALSE ,

imp_var=TRUE ,imp_suffix="imp",addRandom=FALSE)

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 16 / 50

Page 17: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI - Imputation

hotdeck(data , variable=colnames(data), ord_var=NULL ,

domain_var=NULL ,makeNA=NULL ,NAcond=NULL ,impNA=TRUE ,

donorcond=NULL ,imp_var=TRUE ,imp_suffix="imp")

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 17 / 50

Page 18: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Visualisation Tools

The GUI - Imputation

irmi(x, eps = 0.01 , maxit = 100, mixed = NULL ,

step = FALSE , robust = FALSE , takeAll = TRUE ,

noise = TRUE , noise.factor = 1, force = FALSE ,

robMethod = "lmrob", force.mixed = TRUE ,

mi = 1, trace=FALSE)

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 18 / 50

Page 19: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

Median Imputation

●●

●●

●●

●●

●●

●●

● ●

−2 0 2 4 6 8 10

−3

−2

−1

01

23

x

y ●

●●

●●

fully observedinclude missings in xinclude missings in yimputed values

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 19 / 50

Page 20: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

kNN Imputation

●●

●●

●●

●●

●●

●●

● ●

−2 0 2 4 6 8 10

−3

−2

−1

01

23

x

y ●

●●

●●

fully observedinclude missings in xinclude missings in yimputed values

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 20 / 50

Page 21: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IVEWARE

●●

●●

●●

●●

●●

●●

● ●

●●

●●

fully observedinclude missings in xinclude missings in yimputed values

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 21 / 50

Page 22: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IRMI

●●

●●

●●

●●

●●

●●

● ●

●●

●●

fully observedinclude missings in xinclude missings in yimputed values

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 22 / 50

Page 23: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

Median Imputation

●●

●●

●●

●●

●●

● ●

−2 0 2 4 6 8 10

−3

−2

−1

01

2

x

y

● original dataimputed data values

tolerance ellipse from original datatolerance ellipse from imputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 23 / 50

Page 24: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

kNN Imputation

●●

●●

●●

●●

●●

● ●

−2 0 2 4 6 8 10

−3

−2

−1

01

2

x

y

● original dataimputed data values

tolerance ellipse from original datatolerance ellipse from imputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 24 / 50

Page 25: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IVEWARE

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

●●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

● original data valuesimputed data values

original dataimputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 25 / 50

Page 26: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IRMI

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

●●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

● original data valuesimputed data values

original dataimputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 26 / 50

Page 27: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

Median Imputation

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

● ●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

−2 0 2 4 6 8

−5

05

1015

2025

x

y

tolerance ellipse from original datatolerance ellipse from imputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 27 / 50

Page 28: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

kNN Imputation

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

● ●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

−2 0 2 4 6 8

−5

05

1015

2025

x

y

tolerance ellipse from original datatolerance ellipse from imputed data

●●●●

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 28 / 50

Page 29: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IVEWARE

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

● ●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

● original data valuesimputed data values

original dataimputed data

●●

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 29 / 50

Page 30: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Robust Imputation: Motivation

IRMI

●●

●● ●

●●●

● ●●

● ●

●●

●●●

● ●● ● ●

●●

●●

●● ●

●●

● ●●

●●

● ●●

●●

●●

● ●

●●

● ●

● original data valuesimputed data values

original dataimputed data

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 30 / 50

Page 31: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

Some Challenges

Mixed type of variables: various variables being nominal scaled, somevariables might be ordinal and some variables could bedetermined to be of continuous scale.

Semi-continuous variables: �semi-continuous� distributions, i.e. a variableconsisting of a continuous scaled part and a certainproportion of equal values.

Far from normality: Virtually always outlying observations included inreal-world data.

multiple imputation: Imputated must be both, re�ect the multivariatestructure of the data and including �randomness�.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 31 / 50

Page 32: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

Some Challenges

Mixed type of variables: various variables being nominal scaled, somevariables might be ordinal and some variables could bedetermined to be of continuous scale.

Semi-continuous variables: �semi-continuous� distributions, i.e. a variableconsisting of a continuous scaled part and a certainproportion of equal values.

Far from normality: Virtually always outlying observations included inreal-world data.

multiple imputation: Imputated must be both, re�ect the multivariatestructure of the data and including �randomness�.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 31 / 50

Page 33: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

Some Challenges

Mixed type of variables: various variables being nominal scaled, somevariables might be ordinal and some variables could bedetermined to be of continuous scale.

Semi-continuous variables: �semi-continuous� distributions, i.e. a variableconsisting of a continuous scaled part and a certainproportion of equal values.

Far from normality: Virtually always outlying observations included inreal-world data.

multiple imputation: Imputated must be both, re�ect the multivariatestructure of the data and including �randomness�.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 31 / 50

Page 34: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

MIX, MICE, MI, MITOOLS, . . .

All missing values imputed with simulated values drawn from theirpredictive distribution given the observed data and the speci�edparameter.

−→ based on sequential regressions.

EM-based

In general, there are often problems when applied to complex data sets.

. . . and, of course, they are highly driven by in�uencial points,representative and non-representative outliers.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 32 / 50

Page 35: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

MIX, MICE, MI, MITOOLS, . . .

All missing values imputed with simulated values drawn from theirpredictive distribution given the observed data and the speci�edparameter.

−→ based on sequential regressions.

EM-based

In general, there are often problems when applied to complex data sets.

. . . and, of course, they are highly driven by in�uencial points,representative and non-representative outliers.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 32 / 50

Page 36: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

MIX, MICE, MI, MITOOLS, . . .

All missing values imputed with simulated values drawn from theirpredictive distribution given the observed data and the speci�edparameter.

−→ based on sequential regressions.

EM-based

In general, there are often problems when applied to complex data sets.

. . . and, of course, they are highly driven by in�uencial points,representative and non-representative outliers.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 32 / 50

Page 37: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

MIX, MICE, MI, MITOOLS, . . .

All missing values imputed with simulated values drawn from theirpredictive distribution given the observed data and the speci�edparameter.

−→ based on sequential regressions.

EM-based

In general, there are often problems when applied to complex data sets.

. . . and, of course, they are highly driven by in�uencial points,representative and non-representative outliers.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 32 / 50

Page 38: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

MIX, MICE, MI, MITOOLS, . . .

All missing values imputed with simulated values drawn from theirpredictive distribution given the observed data and the speci�edparameter.

−→ based on sequential regressions.

EM-based

In general, there are often problems when applied to complex data sets.

. . . and, of course, they are highly driven by in�uencial points,representative and non-representative outliers.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 32 / 50

Page 39: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

IVEWARE

Very popular software used in many applications in O�cial Statistics.

Similar to the previous mentioned methods.

The imputations are obtained by �tting a sequence of (Bayesian)regression models and drawing values from the correspondingpredictive distributions.Sequentially imputation: in each step, one variable serve asresponse and certain other variables serves as predictors. Fit acertain model using the observed part of the response and estimate(update) the (former) missing values in the response.

Initialization loop: . . .

Second outer loop:

Estimates of missing values are updated sequentially using one variable

as response and all other variables as predictors until convergency.

Since missing values are drawn from their predictive distribution giventhe observed data and the speci�ed parameter, the procedure allowsmultiple imputation.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 33 / 50

Page 40: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

IVEWARE

Very popular software used in many applications in O�cial Statistics.

Similar to the previous mentioned methods.

The imputations are obtained by �tting a sequence of (Bayesian)regression models and drawing values from the correspondingpredictive distributions.Sequentially imputation: in each step, one variable serve asresponse and certain other variables serves as predictors. Fit acertain model using the observed part of the response and estimate(update) the (former) missing values in the response.

Initialization loop: . . .

Second outer loop:

Estimates of missing values are updated sequentially using one variable

as response and all other variables as predictors until convergency.

Since missing values are drawn from their predictive distribution giventhe observed data and the speci�ed parameter, the procedure allowsmultiple imputation.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 33 / 50

Page 41: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

IVEWARE

Very popular software used in many applications in O�cial Statistics.

Similar to the previous mentioned methods.

The imputations are obtained by �tting a sequence of (Bayesian)regression models and drawing values from the correspondingpredictive distributions.Sequentially imputation: in each step, one variable serve asresponse and certain other variables serves as predictors. Fit acertain model using the observed part of the response and estimate(update) the (former) missing values in the response.

Initialization loop: . . .

Second outer loop:

Estimates of missing values are updated sequentially using one variable

as response and all other variables as predictors until convergency.

Since missing values are drawn from their predictive distribution giventhe observed data and the speci�ed parameter, the procedure allowsmultiple imputation.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 33 / 50

Page 42: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

IVEWARE

Very popular software used in many applications in O�cial Statistics.

Similar to the previous mentioned methods.

The imputations are obtained by �tting a sequence of (Bayesian)regression models and drawing values from the correspondingpredictive distributions.Sequentially imputation: in each step, one variable serve asresponse and certain other variables serves as predictors. Fit acertain model using the observed part of the response and estimate(update) the (former) missing values in the response.

Initialization loop: . . .

Second outer loop:

Estimates of missing values are updated sequentially using one variable

as response and all other variables as predictors until convergency.

Since missing values are drawn from their predictive distribution giventhe observed data and the speci�ed parameter, the procedure allowsmultiple imputation.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 33 / 50

Page 43: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Challenges

IVEWARE

Very popular software used in many applications in O�cial Statistics.

Similar to the previous mentioned methods.

The imputations are obtained by �tting a sequence of (Bayesian)regression models and drawing values from the correspondingpredictive distributions.Sequentially imputation: in each step, one variable serve asresponse and certain other variables serves as predictors. Fit acertain model using the observed part of the response and estimate(update) the (former) missing values in the response.

Initialization loop: . . .

Second outer loop:

Estimates of missing values are updated sequentially using one variable

as response and all other variables as predictors until convergency.

Since missing values are drawn from their predictive distribution giventhe observed data and the speci�ed parameter, the procedure allowsmultiple imputation.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 33 / 50

Page 44: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

IRMI

Only the second outer loop is used (missing values are initialised in another manner)

In contradiction to IVEWARE we use quite di�erent regressionmethods → Robust methods (Note: a lot of problems has to besolved when using robust methods for complex data like EU-SILC).

Alternatively, stepwise model selction tools are integrated using AICor BIC.

Multiple imputation is provided.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 34 / 50

Page 45: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

IRMI

Only the second outer loop is used (missing values are initialised in another manner)

In contradiction to IVEWARE we use quite di�erent regressionmethods → Robust methods (Note: a lot of problems has to besolved when using robust methods for complex data like EU-SILC).

Alternatively, stepwise model selction tools are integrated using AICor BIC.

Multiple imputation is provided.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 34 / 50

Page 46: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

IRMI

Only the second outer loop is used (missing values are initialised in another manner)

In contradiction to IVEWARE we use quite di�erent regressionmethods → Robust methods (Note: a lot of problems has to besolved when using robust methods for complex data like EU-SILC).

Alternatively, stepwise model selction tools are integrated using AICor BIC.

Multiple imputation is provided.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 34 / 50

Page 47: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 48: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 49: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 50: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 51: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 52: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

IRMI

Selection of Regression Models

If the response is

continuous, robust (IRMI) or ols (IMI, IVEWARE) regression methodsare used.

categorical , generalized linear regression is applied (IRMI: robust ornon-robust).

binary , logistic linear regression is applied (IRMI: robust butnon-robust is prefered).

mixed , a two-stage approach is used whereas in the �rst stage logisticregression is applied in order to decide if a missing value is imputedwith zero or by applying robust regression based on the continuouspart of the response.

count, robust generalized linear models (family: Poisson) is used.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 35 / 50

Page 53: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Measures of information loss

Errors from Categorical and Binary Variables

This error measure is de�ned as the proportion of imputed values takenfrom an incorrect category on all missing categorical or binary values:

errc =1

mc

pc∑j=1

n∑i=1

I(xorigij 6= ximpij ) , (1)

with I the indicator function, mc the number of missing values in the pccategorical variables, and n the number of observations.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 36 / 50

Page 54: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Measures of information loss

Errors from Continuous and Semi-continuous Variables

Here we assume that the constant part of the semi-continuous variable iszero. Then, the joint error measure is

errs =1

ms

ps∑j=1

n∑i=1

[∣∣∣∣∣(xorigij − x

impij )

xorigij

∣∣∣∣∣ · I(xorigij 6= 0 ∧ ximpij 6= 0) +

I((xorigij = 0 ∧ ximpij 6= 0) ∨ (xorigij 6= 0 ∧ x

impij = 0))

](6)

with ms the number of missing values in the ps continuous andsemi-continuous variables.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 37 / 50

Page 55: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Speci�c simulation results

Simulation Results: Varying the Correlation0.

00.

10.

20.

30.

40.

5

Correlation

err c

0.1 0.3 0.5 0.7 0.9

● IRMIIMIIVEWARE

0.00

0.05

0.10

0.15

0.20

0.25

0.30

Correlation

err s

0.1 0.3 0.5 0.7 0.9

● IRMIIMIIVEWARE

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 38 / 50

Page 56: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Speci�c simulation results

Simulation Results: Varying the Amount of Variables0.

000.

050.

100.

150.

200.

250.

300.

35

Number of Variables

err c

4 12 20 28 36

●●

●●

● IRMIIMIIVEWARE

0.00

0.05

0.10

0.15

0.20

Number of Variables

err s

4 12 20 28 36

● ● ●

● IRMIIMIIVEWARE

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 39 / 50

Page 57: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Speci�c simulation results

Including (moderate) Outliers and Varying their Amount,high Correlation

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

Outliers [%]

err c

0 10 20 30 40 50

●●

●●

● IRMIIMIIVEWARE

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

Outliers [%]

err s

0 10 20 30 40 50

● ●●

● IRMIIMIIVEWARE

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 40 / 50

Page 58: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Simulation results Speci�c simulation results

Including (moderate) Outliers and Varying their Amount,low Correlation

0.0

0.1

0.2

0.3

0.4

Outliers [%]

err c

0 10 20 30 40 50

●●

● ● ●●

● IRMIIMIIVEWARE

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

Outliers [%]

err s

0 10 20 30 40 50

● ●● ●

● IRMIIMIIVEWARE

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 41 / 50

Page 59: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Results from real-world data

Imputation in EU-SILC

We considered certain HH-components, but also some nominal variables,such as household size, region and htype3.

1 R = 0

2 Set missing values in HH-components randomly (MCAR). R ++

3 Impute the missing values.

4 Evaluate the imputations using certain information loss measures.

5 Go to (2) until R = 100.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 42 / 50

Page 60: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Results from real-world data

Imputation in EU-SILC, Results

●●●●●●●●

●●●●●●

IRMI IMI IVEWARE

2040

6080

100

err s

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 43 / 50

Page 61: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Results from real-world data

CENSUS Data - no outliers

● ●

IRMI IMI IVEWARE

0.2

0.4

0.6

0.8

err s

(i) Error for categorical variables

●●

●●

●●

●●●●

●●

●●

●●

●●

●●

IRMI IMI IVEWARE0.

00.

20.

40.

6

err s

(j) Error for numerical variables

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 44 / 50

Page 62: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Results from real-world data

Airquality Data

●●

●●

●●

●●

●●●

●●●●

●●●

IRMI IMI IVE

0.5

1.0

1.5

err s

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 45 / 50

Page 63: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Small walkthrough to VIM

Example Data: SBS data

Pro

port

ion

of m

issi

ngs

0.00

0.01

0.02

0.03

0.04

0.05

Um

satz

B31

B41

B23 a1 a2 a6 a2

5 e2B

esch

Com

bina

tions

Um

satz

B31

B41

B23 a1 a2 a6 a2

5 e2B

esch

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 46 / 50

Page 64: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Small walkthrough to VIM

Most important functionality for imputation

Listing 1: Hotdeck imputation.

hotdeck(x, ord_var=c("Besch","Umsatz"), imp_var=FALSE)

Listing 2: k-nearest neighbor imputation.

kNN(x)

Listing 3: Application of robust iterative model-based imputation.

imp <- irmi(x)

. . . sensible defaults!

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 47 / 50

Page 65: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Small walkthrough to VIM

SBS data: Simulation Results

●●

hotdeck imi irmi kNN

020

0040

0060

0080

00

absolute deviation

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 48 / 50

Page 66: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Conclusion

Conclusion

We proposed the system VIM for visualization and imputation ofmissing values.

IRMI performs almost always best, but hot-deck methods have it'sadvantages as well (they are very fast and easy understandable)

VIM is an free and open-source project. It can be freely downloaded at

http://cran.r-project.org/package=VIM

Joint development and contributions are warmly welcome.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 49 / 50

Page 67: IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: … · IMPUTTIONA OF COMPLEX DATA WITH R-PACKAGE VIM: TRADITIONAL AND NEW METHODS BASED ON ROBUST ESTIMATION. Key Invited Paper Matthias

Conclusion

References

M. Templ, A. Alfons, and A. Kowarik. VIM: Visualization and Imputationof Missing Values, 2011a. URL http://cran.r-project.org. Rpackage version 2.0.1.

Matthias Templ, Alexander Kowarik, and Peter Filzmoser. Iterativestepwise regression imputation using standard and robust methods.Computational Statistics Data Analysis, In Press, Uncorrected Proof,2011b. ISSN 0167-9473. doi: DOI:10.1016/j.csda.2011.04.012. URLhttp://www.sciencedirect.com/science/article/

B6V8V-52R7YYH-2/2/2c6c9ed7138d50c4197e991d8ffb8a1f.

Templ, et al. (STAT, TU) Robust Imputation Ljubljana, Mai 10, 2011 50 / 50