
Package ‘PCAmixdata’

December 23, 2013

Type Package

Title Multivariate analysis for a mixture of quantitative and qualitative data

Version 2.0

Date 2013-04-10

Author Marie Chavent and Vanessa Kuentz and Amaury Labenne and Benoit Liquet and Jerome Saracco

Maintainer <[email protected]>

Description Principal Component Analysis and orthogonal rotation for a mixture of quantitative and qualitative variables.

License GPL (>=2.0)

Collate 'PCAmix.R' 'PCArot.R' 'plot.PCAmix.R' 'predict.PCAmix.R' 'print.PCAmix.R' 'recod.R' 'recodqual.R' 'recodquant.R' 'sol.2dim.R' 'summary.PCAmix.R' 'tab.disjonctif.NA.R' 'svd.triplet.R' 'Cut.Group.R' 'nb.level.R' 'Tri.Data.R' 'Lg.pond.R' 'Lg.R' 'RV.R' 'RV.pond.R' 'MFAmix.R' 'plot.MFAmix.R' 'print.MFAmix.R' 'summary.MFAmix.R'

R topics documented:

Cut.Group
decathlon
flower
INSEE
Lg
Lg.pond
MFAmix
nb.level
PCAmix
PCArot
plot.MFAmix
plot.PCAmix
predict.PCAmix
print.MFAmix
print.PCAmix
recod
recodqual
recodquant
RV
RV.pond
sol.2dim
summary.MFAmix
summary.PCAmix
svd.triplet
tab.disjonctif.NA
Tri.Data
vnf
wine

Cut.Group Cut.Group

Description

Split a data frame into G separate data frames, one per group of variables

Usage

Cut.Group(base, group, name.group)

Arguments

base the data.frame to be split with n rows and p columns

group a vector of size p whose values indicate to which group each variable belongs

name.group a vector of size G which contains names for each group we want to create

Value

list.group a list with each group created

Examples

data(decathlon)
Cut.Group(decathlon,group=c(rep(1,10),2,2,3),
          name.group=c("Epreuve","Classement","Competition"))
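The splitting described above can be illustrated in base R. This is only a sketch under stated assumptions, not the package's implementation: split_columns is a hypothetical helper, and it assumes that the names in name.group follow the sorted order of the distinct values of group.

```r
# Hypothetical helper for illustration; this is not the package's Cut.Group.
split_columns <- function(base, group, name.group) {
  stopifnot(length(group) == ncol(base))          # one group label per column
  labels <- sort(unique(group))                   # distinct group labels
  stopifnot(length(labels) == length(name.group)) # one name per group
  out <- lapply(labels, function(g) base[, group == g, drop = FALSE])
  names(out) <- name.group                        # assumption: names follow sorted labels
  out
}

toy <- data.frame(a = 1:3, b = 4:6, c = c("x", "y", "z"))
parts <- split_columns(toy, group = c(1, 1, 2), name.group = c("Num", "Cat"))
names(parts)
```

Using drop = FALSE keeps single-column groups as data frames, so every element of the returned list has the same type.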


decathlon Performance in decathlon (data)

Description

The data used here refer to athletes’ performance during two sporting events.

Usage

data(decathlon)

Format

A data frame with 41 rows and 13 columns: the first ten columns correspond to the performance of the athletes in the 10 events of the decathlon. Columns 11 and 12 correspond respectively to the rank and the points obtained. The last column is a categorical variable corresponding to the sporting event (2004 Olympic Games or 2004 Decastar).

Source

Department of Applied Mathematics, Agrocampus Rennes

flower Flower Characteristics

Description

8 characteristics for 18 popular flowers.

Usage

data(flower)

Format

A data frame with 18 observations on 8 variables:

[ , "V1"] factor winters
[ , "V2"] factor shadow
[ , "V3"] factor tubers
[ , "V4"] factor color
[ , "V5"] ordered soil
[ , "V6"] ordered preference
[ , "V7"] numeric height
[ , "V8"] numeric distance

V1 winters, is binary and indicates whether the plant may be left in the garden when it freezes.

V2 shadow, is binary and shows whether the plant needs to stand in the shadow.

V3 tubers, is asymmetric binary and distinguishes between plants with tubers and plants that grow in any other way.

V4 color, is nominal and specifies the flower’s color (1 = white, 2 = yellow, 3 = pink, 4 = red, 5 = blue).

V5 soil, is ordinal and indicates whether the plant grows in dry (1), normal (2), or wet (3) soil.

V6 preference, is ordinal and gives someone’s preference ranking going from 1 to 18.

V7 height, is interval scaled, the plant’s height in centimeters.

V8 distance, is interval scaled, the distance in centimeters that should be left between the plants.

Source

The reference below.

References

Anja Struyf, Mia Hubert & Peter J. Rousseeuw (1996): Clustering in an Object-Oriented Environment. Journal of Statistical Software, 1. http://www.stat.ucla.edu/journals/jss/

INSEE INSEE

Description

Description of 302 cities by 56 variables structured in 8 groups

Usage

data(INSEE)

Format

A data frame with 302 observations on 56 variables

Source

www.insee.fr

Lg Coefficient Lg

Description

Computes a data frame containing the Lg coefficients between every pair of matrices in a list

Usage

Lg(liste.mat)

Arguments

liste.mat a list of G matrices

Value

Lg a data frame with G rows and G columns containing all the Lg coefficients

References

Escofier B et Pages J (1998), Analyses factorielles simples et multiples, Dunod, 3e ed.

J. Pages, Analyse factorielle multiple appliquee aux variables qualitatives et aux donnees mixtes. Revue de statistique appliquee, tome 50, num 4 (2002), p. 5-37

See Also

RV, RV.pond, Lg.pond

Examples

V0<-c("a","b","a","a","b")
V01<-c("c","d","e","c","e")
V1<-c(5,4,2,3,6)
V2<-c(8,15,4,6,5)
V3<-c(4,12,5,8,7)
V4<-c("vert","vert","jaune","rouge","jaune")
V5<-c("grand","moyen","moyen","petit","grand")
G1<-data.frame(V0,V01,V1)
G2<-data.frame(V2,V3)
G3<-data.frame(V4,V5)
liste.mat<-list(G1,G2,G3)
Lg(liste.mat)

Lg.pond Coefficient Lg with ponderation

Description

Computes a data frame containing the Lg coefficients between every pair of matrices in a list, with a weight (ponderation) applied to each matrix

Usage

Lg.pond(liste.mat, ponde)

Arguments


liste.mat a list of G matrices

ponde a vector of size G containing the weight associated with each matrix

Value

Lg.pond a data frame with G rows and G columns containing all the weighted Lg coefficients

Examples

V0<-c("a","b","a","a","b")
V01<-c("c","d","e","c","e")
V1<-c(5,4,2,3,6)
V2<-c(8,15,4,6,5)
V3<-c(4,12,5,8,7)
V4<-c("vert","vert","jaune","rouge","jaune")
V5<-c("grand","moyen","moyen","petit","grand")
G1<-data.frame(V0,V01,V1)
G2<-data.frame(V2,V3)
G3<-data.frame(V4,V5)
ponderation<-c(1,2,1)
liste.mat<-list(G1,G2,G3)
Lg.pond(liste.mat,ponderation)

MFAmix Multiple Factor Analysis for a mixture of qualitative and quantitative variables inside the groups

Description

Performs Multiple Factor Analysis in the sense of Escofier-Pages. Groups of variables can be quantitative, categorical, or contain both quantitative and categorical variables.

Usage

MFAmix(data, group, name.group, ndim, graph = TRUE,
       axes = c(1, 2))

Arguments

data a data frame with n rows and p columns containing all the variables. This data frame will be split into G groups according to the vector group

group a vector of size p whose values indicate at which group each variable belongs

name.group a vector of size G which contains names for each group we want to create. Each name must be written as a character string without any spaces or special characters. Only letters, numbers and "_" are allowed.

ndim number of dimensions kept in the results

graph boolean, if TRUE (default) a graph is displayed

axes a length 2 vector specifying the axes to plot

Value

eig a matrix containing all the eigenvalues, the percentage of variance and the cumulative percentage of variance

separate.analyses the results of the separate analyses for each group

group a list of matrices containing all the results for the groups (Lg and RV coefficients, coordinates, squared cosines, contributions, distance to the origin)

partial.axes a list of matrices containing all the results for the partial axes (coordinates, correlation between variables and axes)

ind a list of matrices containing all the results for the individuals (coordinates, squared cosines, contributions)

ind.partiel a matrix containing the coordinates of the partial individuals

quanti.var a list of matrices containing all the results for the quantitative variables (coordinates, contribution, cos2)

quali.var a list of matrices containing all the results for the categorical variables (coordinates, contribution, cos2)

global.pca the results of the MFAmix considered as a unique weighted PCAmix

Author(s)

Marie Chavent <[email protected]>, Vanessa Kuentz, Amaury Labenne, Benoit Liquet, Jerome Saracco

References

Escofier B et Pages J (1983), Methode pour l’analyse de plusieurs groupes de variables. Application a la caracterisation des vins rouges du Val de Loire, Revue de statistique appliquee, 31(2) : 43-59.


Examples

#MFAmix:
data(INSEE)

class.var<-c(1,1,1,1,1,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,5,5,1,1,1,1,6,7,7,7,7,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,8,8,1)

nom.groupes<-c("Log_Services","Population","Emplois","Condit_Famil","Educ","Revenu","Environnement","Secu_et_Soins")

res<-MFAmix(data=INSEE,group=class.var,name.group=nom.groupes,ndim=5,graph=FALSE)
summary(res)
res$eig
res$group$Lg

#Eigenvalues of the separate analyses
res$recap.eig.separate

#Individuals map on dim 1-2 colored by MedGen
plot(res,choice="ind",invisible="quali",habillage="MedGen_R",
     cex=0.6,title="Individuals colored by the Qualitative variable MedGen_R")

#Categories maps on dim 2-3
plot(res,choice="ind",invisible="ind",cex=0.6,axes=c(2,3))

#Partial axes on dim 1-2
plot(res,choice="axes",invisible="ind",cex=0.6,habillage="group",
     col.hab=c("green","blue","red","skyblue1","orange","grey","purple","brown"))

#Groups representation on dim 1-2
plot(res,choice="group",cex=0.6,habillage="group")

#Correlation circle on dim 1-2
plot(res,choice="var",cex=0.6,habillage="group")

#Squared loadings on dim 1-2
plot(res,choice="loadings",cex=0.6,habillage="group")

#Some partial individuals
plot(res,choice="ind",invisible="quali",habillage="group",
     partial=c("33119","17410"),cex=0.6)

#All partial individuals
plot(res,choice="ind",invisible="quali",habillage="group",partial="all",cex=0.6)

nb.level nb.level

Description

Return the number of levels of a factor

Usage

nb.level(fact)

Arguments

fact a factor

Value

The number of levels of the factor

PCAmix Principal Component Analysis for a mixture of qualitative and quantitative variables

Description

PCAmix is a principal component method for a mixture of qualitative and quantitative variables. PCAmix includes the ordinary principal component analysis (PCA) and multiple correspondence analysis (MCA) as special cases. Squared loadings are correlation ratios for qualitative variables and squared correlations for quantitative variables. Missing values are replaced by means for quantitative variables and by zeros in the indicator matrix for qualitative variables. Note that when all the p variables are qualitative, the scores of the n observations are equal to the usual scores of MCA times the square root of p, and the eigenvalues are then equal to the usual eigenvalues of MCA times p.

Usage

PCAmix(X.quanti = NULL, X.quali = NULL, ndim = 5,
       weight.col = NULL, weight.row = NULL, graph = TRUE)

Arguments

X.quanti a numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns).

X.quali a categorical matrix of data, or an object that can be coerced to such a matrix (such as a character vector, a factor or a data frame with all factor columns).

ndim number of dimensions kept in the results (by default 5).

graph boolean, if TRUE the following graphs are displayed for the first two dimensions of PCAmix: plot of the observations (the scores), plot of the variables (squared loadings), plot of the correlation circle (if quantitative variables are available), plot of the categories (if qualitative variables are available).

weight.col a vector of weights for the quantitative variables and for the indicators of the qualitative variables

weight.row a vector of weights for the individuals

Value

eig eigenvalues (i.e. variances) of the Principal Components (PC).

scores an n by ndim numerical matrix which contains the scores of the n observations on the first ndim Principal Components (PC).

scores.stand an n by ndim numerical matrix which contains the standardized scores of the n observations on the first ndim Principal Components (PC).

sload a p by ndim numerical matrix which contains the squared loadings of the p variables on the first ndim PC. For quantitative variables (resp. qualitative), squared loadings are the squared correlations (resp. the correlation ratios) with the PC scores.

categ.coord ’NULL’ if X.quali is ’NULL’. Otherwise an m by ndim numerical matrix which contains the coordinates of the m categories of the qualitative variables on the first ndim PC. The coordinates of the categories are the averages of the standardized PC scores of the observations in those categories.

quanti.cor ’NULL’ if X.quanti is ’NULL’. Otherwise a p1 by ndim numerical matrix which contains the coordinates (the loadings) of the p1 quantitative variables on the first ndim PC. The coordinates of the quantitative variables are correlations with the PC scores.

quali.eta2 ’NULL’ if X.quali is ’NULL’. Otherwise a p2 by ndim numerical matrix which contains the squared loadings of the p2 qualitative variables on the first ndim PC. The squared loadings of the qualitative variables are correlation ratios with the PC scores.

res.ind Results for the individuals (coord, contrib in percentage, cos2)

res.quanti Results for the quantitative variables (coord, contrib in percentage, cos2)

res.categ Results for the categories of the categorical variables (coord, contrib in percentage, cos2)

coef Coefficients of the linear combinations of the quantitative variables and the categories for constructing the principal components of PCAmix.

V The standardized loadings.

rec Results of the function recod(X.quanti,X.quali).

M Metric used in the svd for the weights of the variables.
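The correlation ratio that appears in sload and quali.eta2 can be computed by hand for a single qualitative variable and a single score vector. The sketch below is base R only and is not taken from the package; eta2 is a hypothetical helper name, and unit row weights are assumed.

```r
# Hypothetical helper: correlation ratio (eta squared) between a numeric
# score vector and a qualitative variable, assuming unit row weights.
eta2 <- function(score, f) {
  f <- factor(f)
  grand <- mean(score)                        # overall mean of the scores
  means <- tapply(score, f, mean)             # mean score per category
  sizes <- tapply(score, f, length)           # number of observations per category
  between <- sum(sizes * (means - grand)^2)   # between-category sum of squares
  total <- sum((score - grand)^2)             # total sum of squares
  between / total
}

score <- c(1, 2, 3, 10, 11, 12)
f <- c("a", "a", "a", "b", "b", "b")
eta2(score, f)
```

For a qualitative variable with exactly two categories, this value coincides with the squared correlation between the score and the 0/1 indicator of one category, which is one way to see why squared correlations and correlation ratios can be treated together as squared loadings.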


Author(s)

Marie Chavent <[email protected]>, Vanessa Kuentz, Benoit Liquet, Jerome Saracco

References

Chavent, M., Kuentz, V., Saracco, J. (2011), Orthogonal Rotation in PCAMIX. Advances in Classification and Data Analysis, Vol. 6, pp. 131-146.

Kiers, H.A.L., (1991), Simple structure in Component Analysis Techniques for mixtures of qualitative and quantitative variables, Psychometrika, 56, 197-212.

Examples

#PCAMIX:
data(wine)
X.quanti <- wine[,c(3:29)]
X.quali <- wine[,c(1,2)]
pca<-PCAmix(X.quanti,X.quali,ndim=4)
pca<-PCAmix(X.quanti,X.quali,ndim=4,graph=FALSE)
pca$eig

#Scores on dim 1-2
plot(pca,choice="ind",quali=wine[,1],
     posleg="bottomleft",main="Scores")
#Scores on dim 2-3
plot(pca,choice="ind",axes=c(2,3),quali=wine[,1],
     posleg="bottomleft",main="Scores")
#Other graphics
plot(pca,choice="var",main="Squared loadings")
plot(pca,choice="categ",main="Categories")
plot(pca,choice="cor",xlim=c(-1.5,2.5),
     main="Correlation circle")
#plot with standardized scores:
plot(pca,choice="ind",quali=wine[,1],stand=TRUE,
     posleg="bottomleft",main="Standardized Scores")
plot(pca,choice="var",stand=TRUE,main="Squared loadings")
plot(pca,choice="categ",stand=TRUE,main="Categories")
plot(pca,choice="cor",stand=TRUE,main="Correlation circle")

#PCA:
data(decathlon)
quali<-decathlon[,13]
pca<-PCAmix(decathlon[,1:10])
pca<-PCAmix(decathlon[,1:10], graph=FALSE)
plot(pca,choice="ind",quali=quali,cex=0.8,
     posleg="topright",main="Scores")
plot(pca, choice="var",main="Squared correlations")
plot(pca, choice="cor",main="Correlation circle")

#MCA
data(flower)
mca <- PCAmix(X.quali=flower[,1:4])
mca <- PCAmix(X.quali=flower[,1:4],graph=FALSE)
plot(mca,choice="ind",main="Scores")
plot(mca,choice="var",main="Correlation ratios")
plot(mca,choice="categ",main="Categories")

#Missing values
data(vnf)
PCAmix(X.quali=vnf)
vnf2<-na.omit(vnf)
PCAmix(X.quali=vnf2)

PCArot Varimax rotation in PCAmix

Description

Orthogonal rotation in PCAmix by maximization of the varimax function expressed in terms of PCAmix squared loadings (correlation ratios for qualitative variables and squared correlations for quantitative variables). PCArot includes the ordinary varimax rotation in Principal Component Analysis (PCA) and rotation in Multiple Correspondence Analysis (MCA) as special cases.

Usage

PCArot(obj, dim, itermax = 100, graph = TRUE)

Arguments

obj an object of class PCAmix.

dim number of rotated Principal Components.

itermax maximum number of iterations in Kaiser’s practical optimization algorithm based on successive pairwise planar rotations.

graph boolean, if TRUE the following graphs are displayed for the first two dimensions after rotation: plot of the observations (the scores), plot of the variables (squared loadings), plot of the correlation circle (if quantitative variables are available), plot of the categories (if qualitative variables are available).

Value

theta angle of rotation if dim is equal to 2.

T matrix of rotation.

eig variances of the ndim dimensions after rotation.

scores an n by ndim numerical matrix which contains the rotated scores of the n observations.

scores.stand an n by dim numerical matrix which contains the rotated standardized scores of the n observations.

sload a p by ndim numerical matrix which contains the rotated squared loadings of the p variables. For quantitative variables (resp. qualitative), the rotated squared loadings are the squared correlations (resp. the correlation ratios) with the rotated standardized scores.

categ.coord ’NULL’ if X.quali is ’NULL’. Otherwise an m by dim numerical matrix which contains the coordinates of the m categories after rotation. The coordinates of the categories after rotation are the averages of the rotated standardized scores of the observations in those categories.

quanti.cor ’NULL’ if X.quanti is ’NULL’. Otherwise a p1 by dim numerical matrix which contains the coordinates (the loadings) of the p1 quantitative variables after rotation. The coordinates of the quantitative variables after rotation are correlations with the rotated standardized scores.

quali.eta2 ’NULL’ if X.quali is ’NULL’. Otherwise a p2 by dim numerical matrix which contains the squared loadings of the p2 qualitative variables after rotation. The squared loadings of the qualitative variables after rotation are correlation ratios with the rotated standardized scores.

coef Coefficients of the linear combinations of the quantitative variables and the categories for constructing the rotated principal components of PCAmix.

V The rotated standardized loadings.
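The ordinary PCA special case mentioned in the Description can be illustrated with functions from the stats package shipped with R. This sketch does not call PCArot and does not reproduce its algorithm for mixed data: it uses prcomp and stats::varimax on simulated numeric data, and all object names are illustrative.

```r
# Ordinary varimax rotation of PCA loadings with base R (stats package).
# Sketch of the purely quantitative special case only, on simulated data.
set.seed(1)
X <- matrix(rnorm(200), ncol = 5)                 # 40 observations, 5 numeric variables
pc <- prcomp(X, scale. = TRUE)                    # PCA on standardized variables
L <- pc$rotation[, 1:2] %*% diag(pc$sdev[1:2])    # loadings: correlations with the first 2 PCs
vm <- varimax(L, normalize = FALSE)               # orthogonal rotation of the loadings
R <- vm$rotmat                                    # 2 x 2 rotation matrix
rotated <- L %*% R                                # rotated loadings
round(crossprod(R), 6)                            # identity matrix: the rotation is orthogonal
```

Because the rotation is orthogonal, the row sums of the squared loadings (the part of each variable's variance explained by the two retained dimensions) are the same before and after rotation; only their split across dimensions changes.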

Author(s)

Marie Chavent <[email protected]>, Vanessa Kuentz, Benoit Liquet, Jerome Saracco

References

Chavent, M., Kuentz, V., Saracco, J. (2011), Orthogonal Rotation in PCAMIX. Advances in Classification and Data Analysis, Vol. 6, pp. 131-146.


See Also

plot.PCAmix, summary.PCAmix, PCAmix, predict.PCAmix

Examples

#PCAMIX:
data(wine)
X.quanti <- wine[,c(3:29)]
X.quali <- wine[,c(1,2)]
pca<-PCAmix(X.quanti,X.quali,ndim=4, graph=FALSE)
pca

rot<-PCArot(pca,3)
rot
rot$eig #percentages of variances after rotation

plot(rot,choice="ind",quali=wine[,1],
     posleg="bottomleft", main="Rotated scores")
plot(rot,choice="var",main="Squared loadings after rotation")
plot(rot,choice="categ",main="Categories after rotation")
plot(rot,choice="cor",main="Correlation circle after rotation")

#PCA:
data(decathlon)
quali<-decathlon[,13]
pca<-PCAmix(decathlon[,1:10], graph=FALSE)

rot<-PCArot(pca,3)
plot(rot,choice="ind",quali=quali,cex=0.8,
     posleg="topright",main="Scores after rotation")
plot(rot, choice="var", main="Squared correlations after rotation")
plot(rot, choice="cor", main="Correlation circle after rotation")

#MCA
data(flower)
mca <- PCAmix(X.quali=flower[,1:4],graph=FALSE)

rot<-PCArot(mca,2)
plot(rot,choice="ind",main="Scores after rotation")
plot(rot, choice="var", main="Correlation ratios after rotation")
plot(rot, choice="categ", main="Categories after rotation")

plot.MFAmix Graphs of a MFAmix analysis

Description

Graphs of an MFAmix analysis: plot of the individuals, the partial individuals, the groups, the numeric variables on the correlation circle, the categories of the categorical variables, and the partial axes of each separate analysis

Usage

## S3 method for class 'MFAmix'
plot(x, axes = c(1, 2), choice = "axes",
     lab.grpe = TRUE, lab.var = TRUE, lab.ind = TRUE,
     lab.par = FALSE, habillage = "ind", col.hab = NULL,
     invisible = NULL, partial = NULL, lim.cos2.var = 0,
     chrono = FALSE, xlim = NULL, ylim = NULL, cex = 1,
     title = NULL, palette = NULL, new.plot = FALSE,
     leg = TRUE, ...)

Arguments

x an object of class MFAmix

axes a length 2 vector specifying the components to plot

choice a string corresponding to the graph to draw ("ind" for the individuals or categorical variables graph, "var" for the correlation circle of the quantitative variables, "axes" for the graph of the partial axes, "group" for the groups representation, "loadings" for the squared loadings: squared correlation coefficient between numeric variables and axes, and correlation ratio between categorical variables and axes)

lab.grpe boolean, if TRUE, the labels of the groups are drawn

lab.var boolean, if TRUE, the labels of the variables are drawn

lab.ind boolean, if TRUE, the labels of the individuals and of the categories are drawn

lab.par boolean, if TRUE, the labels of the partial points are drawn

habillage string specifying the colors to use. If "ind", one color is used for each individual; if "group", the individuals are colored according to the group; if it is the name of a categorical variable, the individuals are colored according to the categories of this variable

col.hab the colors to use. By default, colors are chosen

invisible a string; for choice = "ind", the individuals can be omitted (invisible = "ind"), or the categories of the categorical variables (invisible = "quali")

partial list of the individuals or of the categories for which the partial points should be drawn (by default, partial = NULL and no partial points are drawn)

lim.cos2.var value of the squared cosine below which the points are not drawn

chrono boolean, if TRUE, the partial points of a same point are linked (useful when groups correspond to different dates)

xlim numeric vector of length 2, giving the x coordinates range

ylim numeric vector of length 2, giving the y coordinates range

cex cf. function par in the graphics package

title string corresponding to the title of the graph you want to draw (by default NULL and a title is chosen)

new.plot boolean, if TRUE, a new graphical device is created

palette the color palette used to draw the points. By default colors are chosen. If you want to define the colors: palette=palette(c("black","red","blue")); or you can use: palette=palette(rainbow(30)), or in black and white for example: palette=palette(gray(seq(0,.9,len=25)))

leg boolean, if TRUE, a legend is displayed

... further arguments passed to or from other methods

Value

Returns the graph you want to plot

Examples

#MFAmix:
data(INSEE)

class.var<-c(1,1,1,1,1,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,5,5,1,1,1,1,6,7,7,7,7,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,8,8,1)

nom.groupes<-c("Log_Services","Population","Emplois","Condit_Famil","Educ","Revenu","Environnement","Secu_et_Soins")

res<-MFAmix(data=INSEE,group=class.var,name.group=nom.groupes,ndim=5,graph=FALSE)
summary(res)
res$eig
res$group$Lg

#Eigenvalues of the separate analyses
res$recap.eig.separate

#Individuals map on dim 1-2 colored by MedGen
plot(res,choice="ind",invisible="quali",habillage="MedGen_R",
     cex=0.6,title="Individuals colored by the Qualitative variable MedGen_R")

#Categories maps on dim 2-3
plot(res,choice="ind",invisible="ind",cex=0.6,axes=c(2,3))

#Partial axes on dim 1-2
plot(res,choice="axes",invisible="ind",cex=0.6,habillage="group",
     col.hab=c("green","blue","red","skyblue1","orange","grey","purple","brown"))

#Groups representation on dim 1-2
plot(res,choice="group",cex=0.6,habillage="group")

#Correlation circle on dim 1-2
plot(res,choice="var",cex=0.6,habillage="group")

#Squared loadings on dim 1-2
plot(res,choice="loadings",cex=0.6,habillage="group")

#Some partial individuals
plot(res,choice="ind",invisible="quali",habillage="group",
     partial=c("33119","17410"),cex=0.6)

#All partial individuals
plot(res,choice="ind",invisible="quali",habillage="group",partial="all",cex=0.6)

plot.PCAmix Graphs of a PCAmix analysis before or after rotation

Description

Graphs of a PCAmix analysis before or after rotation: plot of the observations (the scores or the standardized scores), plot of the variables (squared loadings), correlation circle of the available quantitative variables, plot of the categories of the available qualitative variables.

Usage

## S3 method for class 'PCAmix'
plot(x, axes = c(1, 2), choice = "ind", stand = FALSE,
     label = TRUE, quali = NULL, posleg = "topleft",
     xlim = NULL, ylim = NULL, cex = 1, col.var = NULL, ...)

Arguments

x an object of class PCAmix obtained with the function PCAmix or PCArot.

axes a length 2 vector specifying the components to plot.

choice the graph to plot ("ind" for the observations, "var" for the variables (plot of the squared loadings), "cor" for the correlation circle if quantitative variables are available, "categ" for the categories if qualitative variables are available).

stand if ’TRUE’ the standardized scores are used in the plot of the observations.

label if ’FALSE’ the labels of the points are not plotted.

quali a qualitative variable such as a character vector or a factor of size n (the number of observations). The observations are colored according to the categories of this variable.

posleg position of the legend if quali is not ’NULL’.

xlim numeric vector of length 2, giving the x coordinates range.

ylim numeric vector of length 2, giving the y coordinates range.

cex cf. function par in the graphics package

col.var a vector of size p (the total number of variables) which assigns a different color to each variable

... further arguments passed to or from other methods.


Author(s)


References

Chavent, M., Kuentz, V., Saracco, J. (2011), Orthogonal Rotation in PCAMIX


See Also

summary.PCAmix, PCAmix, PCArot

Examples

#PCAMIX:
data(wine)
X.quanti <- wine[,c(3:29)]
X.quali <- wine[,c(1,2)]
pca<-PCAmix(X.quanti,X.quali,ndim=4, graph=FALSE)

#Scores on dim 1-2
plot(pca,choice="ind",quali=wine[,1],
     posleg="bottomleft",main="Scores")
#Scores on dim 2-3
plot(pca,choice="ind",axes=c(2,3),quali=wine[,1],
     posleg="bottomleft",main="Scores")
#Other graphics
plot(pca,choice="var",main="Squared loadings")
plot(pca,choice="categ",main="Categories")
plot(pca,choice="cor",xlim=c(-1.5,2.5),
     main="Correlation circle")

rot<-PCArot(pca,3,graph=FALSE)
plot(rot,choice="ind", main="Rotated scores",label=FALSE)
plot(rot,choice="var",main="Squared loadings after rotation")
plot(rot,choice="categ",main="Categories after rotation")
plot(rot,choice="cor",main="Correlation circle after rotation")

#PCA:
data(decathlon)
quali<-decathlon[,13]
pca<-PCAmix(decathlon[,1:10], graph=FALSE)
plot(pca,choice="ind",quali=quali,stand=TRUE,
     cex=0.8,posleg="topright",main="Standardized scores")
plot(pca, choice="cor", stand=TRUE,main="Correlation circle")


predict.PCAmix Scores of new objects on the principal components of PCAmix or PCArot

Description

This function calculates the scores of a new set of data on the principal components of PCAmix. If the components have been rotated, the function gives the scores of the new objects on the rotated PC. The new objects must have the same variables as the learning set.

Usage

## S3 method for class 'PCAmix'
predict(object, X.quanti = NULL, X.quali = NULL, ...)

Arguments

object an object of class PCAmix (output of the function PCAmix or PCArot)

X.quanti numeric matrix of data for the new objects

X.quali a categorical matrix of data for the new objects

... further arguments passed to or from other methods

Value

Returns the matrix of the scores of the new objects on the ndim principal components (PC) or rotated PC.

Author(s)

Marie Chavent <[email protected]>, Vanessa Kuentz, Benoit Liquet, Jerome Saracco

Examples

data(decathlon)
n <- nrow(decathlon)
sub <- sample(1:n,20)
pca<-PCAmix(decathlon[sub,1:10], graph=FALSE)
predict(pca,decathlon[-sub,1:10])
rot <- PCArot(pca,dim=4)
predict(rot,decathlon[-sub,1:10])

print.MFAmix Print a ’MFAmix’ object

Description

This is a method for the function print for objects of the class MFAmix.

Usage

## S3 method for class 'MFAmix'
print(x, ...)

Arguments

x An object of class MFAmix generated by the function MFAmix.

... Further arguments to be passed to or from other methods. They are ignored in this function.

Author(s)


See Also

MFAmix

print.PCAmix Print a ’PCAmix’ object

Description

This is a method for the function print for objects of the class PCAmix.

Usage

## S3 method for class 'PCAmix'
print(x, ...)

Arguments

x An object of class PCAmix generated by the functions PCAmix and PCArot.

... Further arguments to be passed to or from other methods. They are ignored in this function.

Author(s)


See Also

PCAmix, PCArot

recod Recoding of the quantitative and qualitative data matrix

Description

Recoding of the quantitative and qualitative data matrix

Usage

recod(X.quanti, X.quali)

Arguments

X.quanti numeric matrix of data

X.quali a categorical matrix of data

Value

X X.quanti and X.quali concatenated in a single matrix

Y X.quanti with missing values replaced with mean values and X.quali with missing values replaced by zeros, concatenated in a single matrix

Z X.quanti standardized (centered and reduced by standard deviations) concatenated with the indicator matrix of X.quali centered and reduced with the square roots of the relative frequencies of the categories

W X.quanti standardized (centered and reduced by standard deviations) concatenated with the indicator matrix of X.quali centered

n the number of objects

p the total number of variables

p1 the number of quantitative variables

p2 the number of qualitative variables

g the means of the columns of X.quanti

s the standard deviations of the columns of X.quanti (population version with 1/n)

G The indicator matrix of X.quali with missing values replaced by 0

Gcod The indicator matrix G reduced with the square roots of the relative frequencies of the categories

recodqual Recoding of the qualitative data matrix

Description

Recoding of the qualitative data matrix

Usage

recodqual(X)

Arguments

X the qualitative data matrix

Value

G The indicator matrix of X with missing values replaced by 0

Examples

data(vnf)
X <- vnf[1:10,9:12]
tab.disjonctif.NA(X)
recodqual(X)
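The difference between an indicator matrix that keeps NA and one where missing values are replaced by 0 can be sketched in base R as follows. This is an illustration only, not the package's code: the helper `indicator_block`, the toy data and the "variable=category" column names are assumptions.

```r
# Toy qualitative data with one missing value (hypothetical data).
X <- data.frame(
  size   = factor(c("small", "large", NA, "small")),
  colour = factor(c("red", "blue", "red", "blue"))
)

# Indicator block of one factor; a row with a missing value gets NA
# in every category column of that factor.
indicator_block <- function(f, name) {
  lev <- levels(f)
  block <- sapply(lev, function(l) as.numeric(f == l))
  colnames(block) <- paste(name, lev, sep = "=")
  block
}

# Indicator matrix with NA kept for missing observations.
G_na <- do.call(cbind, Map(indicator_block, X, names(X)))

# Same matrix with the NA entries replaced by 0.
G <- G_na
G[is.na(G)] <- 0
```

In `G`, the row with the missing `size` value has zeros in both `size` columns, so that observation does not contribute to any category of that variable.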

recodquant Recoding of the quantitative data matrix

Description

Recoding of the quantitative data matrix

Usage

recodquant(X)

Arguments

X the quantitative data matrix

Value

Z the standardized quantitative data matrix (centered and scaled by the standard deviations)

g the means of the columns of X

s the standard deviations of the columns of X (population version with 1/n)

Xcod The quantitative matrix X with missing values replaced by the column means

Examples

data(decathlon)
X <- decathlon[1:5,1:5]
X[1,2] <- NA
X[2,3] <- NA
rec <- recodquant(X)
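The returned values g, s, Xcod and Z can be approximated with a few lines of base R. The sketch below is not the package's code; in particular it assumes that the standard deviations are computed on the matrix after mean imputation, which the description above does not state.

```r
# Toy quantitative data with two missing values (hypothetical data).
X <- cbind(a = c(1, 2, NA, 4, 5), b = c(10, NA, 30, 40, 50))

# Column means computed on the observed values.
g <- colMeans(X, na.rm = TRUE)

# Replace each missing value by the mean of its column.
Xcod <- X
for (j in seq_len(ncol(X))) {
  Xcod[is.na(X[, j]), j] <- g[j]
}

# Population standard deviation (divisor n rather than n - 1).
centred <- sweep(Xcod, 2, g, "-")
s <- sqrt(colMeans(centred^2))

# Standardized matrix: centered, then divided by s.
Z <- sweep(centred, 2, s, "/")
```

Note that `sd()` in base R uses the divisor n - 1, so it would not reproduce the population version mentioned for s.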

RV Coefficient RV

Description

Computes a data frame with the RV coefficients between each pair of matrices in a list

Usage

RV(liste.mat)

Arguments

liste.mat a list of G matrices

Value

RV a data frame with G rows and G columns containing all the pairwise RV coefficients

References


J. Pagès, Analyse factorielle multiple appliquée aux variables qualitatives et aux données mixtes. Revue de statistique appliquée, tome 50, no. 4 (2002), p. 5-37.

Examples

V0 <- c("a","b","a","a","b")
V01 <- c("c","d","e","c","e")
V1 <- c(5,4,2,3,6)
V2 <- c(8,15,4,6,5)
V3 <- c(4,12,5,8,7)
V4 <- c("vert","vert","jaune","rouge","jaune")
V5 <- c("grand","moyen","moyen","petit","grand")
G1 <- data.frame(V0,V01,V1)
G2 <- data.frame(V2,V3)
G3 <- data.frame(V4,V5)
liste.mat <- list(G1,G2,G3)
RV(liste.mat)
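For purely numeric, column-centered matrices X and Y with the same rows, the RV coefficient is commonly defined as tr(XX'YY') / sqrt(tr((XX')^2) tr((YY')^2)). The base R sketch below implements that textbook definition only; the function name `rv_coefficient` is hypothetical, and the recoding of qualitative variables performed by the package is not reproduced.

```r
# RV coefficient between two numeric matrices with the same rows.
rv_coefficient <- function(X, Y) {
  X <- scale(X, center = TRUE, scale = FALSE)
  Y <- scale(Y, center = TRUE, scale = FALSE)
  Wx <- tcrossprod(X)  # X %*% t(X)
  Wy <- tcrossprod(Y)
  # For symmetric matrices, trace(Wx %*% Wy) equals sum(Wx * Wy).
  sum(Wx * Wy) / sqrt(sum(Wx * Wx) * sum(Wy * Wy))
}

# Two numeric groups built from the variables V1, V2 and V3 of the example.
G1 <- cbind(V1 = c(5, 4, 2, 3, 6))
G2 <- cbind(V2 = c(8, 15, 4, 6, 5), V3 = c(4, 12, 5, 8, 7))
groups <- list(G1 = G1, G2 = G2)

# Symmetric table of pairwise coefficients, with 1 on the diagonal.
rv_table <- sapply(groups, function(a) {
  sapply(groups, function(b) rv_coefficient(a, b))
})
```

The coefficient lies between 0 and 1 and equals 1 when a matrix is compared with itself.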

RV.pond Weighted RV coefficient

Description

Computes a data frame with the weighted RV coefficients between each pair of matrices in a list

Usage

RV.pond(liste.mat, ponde)

Arguments


liste.mat a list of G matrices

ponde a vector of length G with the weight associated with each matrix

Value

RV.pond a data frame with G rows and G columns containing all the weighted RV coefficients

Examples

V0 <- c("a","b","a","a","b")
V01 <- c("c","d","e","c","e")
V1 <- c(5,4,2,3,6)
V2 <- c(8,15,4,6,5)
V3 <- c(4,12,5,8,7)
V4 <- c("vert","vert","jaune","rouge","jaune")
V5 <- c("grand","moyen","moyen","petit","grand")
G1 <- data.frame(V0,V01,V1)
G2 <- data.frame(V2,V3)
G3 <- data.frame(V4,V5)
liste.mat <- list(G1,G2,G3)
ponderation <- c(1,2,1)
RV.pond(liste.mat,ponderation)

sol.2dim Varimax rotation for PCAmix in two dimensions

Description

Varimax rotation for PCAmix in two dimensions

Usage

sol.2dim(A, indexj, p, p1)

Arguments

A matrix of loadings

indexj a vector with the number of the variable associated with each row of A

p the total number of variables

p1 the number of quantitative variables

Value

theta the angle of rotation

T the matrix of rotation
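The relation between an angle and a planar rotation matrix can be sketched in base R. This illustration does not compute the varimax angle and is not the package's code: the function `rotation_matrix`, the sign convention of the matrix and the loadings `A` are assumptions.

```r
# Planar rotation matrix for an angle theta (in radians).
# Filled column by column: first column (cos, sin), second (-sin, cos).
rotation_matrix <- function(theta) {
  matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
}

# Hypothetical loadings on two dimensions (one row per variable).
A <- rbind(c(0.8, 0.3), c(0.7, 0.4), c(0.2, 0.9))

theta <- pi / 6
T_rot <- rotation_matrix(theta)

# Rotated loadings.
A_rot <- A %*% T_rot
```

Because the rotation is orthogonal, the sum of squared loadings of each row is the same in `A` and in `A_rot`; only its split between the two dimensions changes.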

summary.MFAmix Summary of a ’MFAmix’ object

Description

This is a method for the function summary for objects of the class MFAmix.

Usage

## S3 method for class 'MFAmix'
summary(object, ...)

Arguments

object an object of class MFAmix obtained with the function MFAmix.


Value

Returns the total number of observations, the number of numerical variables, and the number of categorical variables with the total number of categories. These values are also broken down by group.

See Also

plot.MFAmix, MFAmix

summary.PCAmix Summary of a ’PCAmix’ object

Description

This is a method for the function summary for objects of the class PCAmix.

Usage

## S3 method for class 'PCAmix'
summary(object, ...)

Arguments

object an object of class PCAmix obtained with the function PCAmix or PCArot.


Value

Returns the matrix of squared loadings. For quantitative variables (resp. qualitative), squared loadings are the squared correlations (resp. the correlation ratios) with the scores or with the rotated (standardized) scores.

Author(s)


References

Chavent, M., Kuentz, V., Saracco, J. (2011), Orthogonal Rotation in PCAMIX


See Also

plot.PCAmix, PCAmix, PCArot

Examples

data(wine)
X.quanti <- wine[,c(3:29)]
X.quali <- wine[,c(1,2)]
pca <- PCAmix(X.quanti, X.quali, ndim=4, graph=FALSE)
summary(pca)

rot <- PCArot(pca, 3, graph=FALSE)
summary(rot)
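The two kinds of squared loadings mentioned in the Value section can be computed by hand for a single score vector. The base R sketch below uses hypothetical data and names (`score`, `x_quanti`, `x_quali`) and is only meant to show what a squared correlation and a correlation ratio are, not how the package computes them.

```r
# Hypothetical score of five observations on one dimension.
score <- c(-1.2, -0.5, 0.1, 0.4, 1.2)

# One quantitative and one qualitative variable on the same observations.
x_quanti <- c(2.0, 3.1, 3.9, 5.2, 6.8)
x_quali <- factor(c("a", "a", "b", "b", "b"))

# Quantitative variable: squared Pearson correlation with the score.
sq_loading_quanti <- cor(x_quanti, score)^2

# Qualitative variable: correlation ratio, i.e. the share of the score's
# variance explained by the category means.
group_means <- ave(score, x_quali)
sq_loading_quali <- sum((group_means - mean(score))^2) /
  sum((score - mean(score))^2)
```

Both quantities lie between 0 and 1, which is what makes them comparable within a single matrix of squared loadings.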

svd.triplet Singular Value Decomposition of a Matrix

Description

Computes the singular value decomposition of a rectangular matrix with weights for rows and columns. Borrowed from the ’FactoMineR’ package and used as an internal function.

Usage

svd.triplet(X, row.w=NULL, col.w=NULL, ncp=Inf)

Arguments

X a data matrix

row.w vector with the weights of each row (NULL by default, in which case the weights are uniform)

col.w vector with the weights of each column (NULL by default, in which case the weights are uniform)

ncp the number of components kept for the outputs

Value

vs a vector containing the singular values of X

u a matrix whose columns contain the left singular vectors of X

v a matrix whose columns contain the right singular vectors of X

See Also

svd
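A singular value decomposition with row and column weights can be obtained from the ordinary `svd` by folding the square roots of the weights into the matrix and removing them from the singular vectors afterwards. The base R sketch below follows that standard construction; it is not the package's code, the function name `weighted_svd` is hypothetical, and the default weights (1/n for rows, 1 for columns) are an assumption.

```r
weighted_svd <- function(X, row.w = NULL, col.w = NULL) {
  # Assumed defaults: uniform row weights summing to 1, unit column weights.
  if (is.null(row.w)) row.w <- rep(1 / nrow(X), nrow(X))
  if (is.null(col.w)) col.w <- rep(1, ncol(X))

  # Multiply row i by sqrt(row.w[i]) and column j by sqrt(col.w[j]).
  Xw <- sqrt(row.w) * X
  Xw <- sweep(Xw, 2, sqrt(col.w), "*")

  dec <- svd(Xw)

  list(
    vs = dec$d,
    u = dec$u / sqrt(row.w),  # row i divided by sqrt(row.w[i])
    v = dec$v / sqrt(col.w)   # row j divided by sqrt(col.w[j])
  )
}

# Small example with non-uniform weights (hypothetical data).
X <- matrix(c(1, 2, 3, 4, 2, 1, 0, 1, 5, 3, 1, 2), nrow = 4)
row.w <- c(0.1, 0.2, 0.3, 0.4)
col.w <- c(1, 2, 3)
res <- weighted_svd(X, row.w, col.w)
```

With this construction `res$u %*% diag(res$vs) %*% t(res$v)` reproduces `X`, and the columns of `u` and `v` are orthonormal with respect to the row and column weights rather than the identity metric.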

tab.disjonctif.NA Construction of the indicator matrix of a qualitative data matrix

Description

Construction of the indicator matrix of a qualitative data matrix

Usage

tab.disjonctif.NA(tab)

Arguments

tab a qualitative data matrix

Value

res the indicator matrix, with NA in the category columns of a variable wherever that variable is missing in tab

Examples

data(vnf)
X <- vnf[1:10,9:12]
tab.disjonctif.NA(X)
recodqual(X)

Tri.Data Tri.Data

Description

Splits a data frame with numeric variables and factors into two data frames, one with only the numeric variables and the other with only the factors.

Usage

Tri.Data(base)

Arguments

base the data.frame with numeric variables and factors

Value

X.quanti The sub data.frame with only numeric variables

X.quali The sub data.frame with only categorical variables

Examples

data(decathlon)
Tri.Data(decathlon)
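The same kind of split can be written directly in base R, which may help to see what the two returned data frames contain. This is a sketch on hypothetical data, not the package's implementation; the toy data frame and the name `is_num` are assumptions.

```r
# Toy data frame mixing numeric variables and factors (hypothetical data).
base <- data.frame(
  height = c(1.70, 1.82, 1.65),
  weight = c(65, 80, 58),
  sex    = factor(c("f", "m", "f")),
  city   = factor(c("Paris", "Lyon", "Paris"))
)

# TRUE for numeric columns, FALSE otherwise.
is_num <- vapply(base, is.numeric, logical(1))

# drop = FALSE keeps a data frame even when a single column is selected.
X.quanti <- base[, is_num, drop = FALSE]
X.quali <- base[, !is_num, drop = FALSE]
```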

vnf Questionnaire done by 1232 individuals who answered 14 questions

Description

A user satisfaction survey of pleasure craft operators on the “Canal des Deux Mers”, located in the South of France, was carried out by the public corporation “Voies Navigables de France”, which is responsible for managing and developing the largest network of navigable waterways in Europe.

Usage

data(vnf)

Format

A data frame with 1232 observations on the following 14 categorical variables.

Source

Josse, J., Chavent, M., Liquet, B. and Husson, F. (2012). Handling missing values with Regularized Iterative Multiple Correspondence Analysis. Journal of Classification, Vol. 29, pp. 91-116.

wine Wine

Description

The data used here refer to 21 wines of Val de Loire.

Usage

data(wine)

Format

A data frame with 21 rows (the number of wines) and 31 columns: the first column corresponds to the label of origin, the second column corresponds to the soil, and the others correspond to sensory descriptors.

Source

Centre de recherche INRA d’Angers

Index

∗Topic algebra
svd.triplet

∗Topic datasets
decathlon, flower, INSEE, vnf, wine

∗Topic multivariate
MFAmix, PCAmix, PCArot

∗Topic print
print.MFAmix, print.PCAmix
