R/feature_PROCRUSTES.R
feature_PROCRUSTES.Rd
do.procrustes selects the set of features that best aligns with PCA's coordinates in the low-dimensional embedding.
It iteratively selects the variable that minimizes the Procrustes distance between configurations.
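As a rough illustration of the criterion described above, the following base-R sketch computes a Procrustes distance between two configurations and uses it to compare PCA scores from feature subsets against PCA scores from all features. It is not the package's implementation: `procrustes_dist` and `pca_scores` are hypothetical helper names, the distance is the rotation-only form (no scaling), and scoring every pair of features is an assumed simplification of the iterative selection.

```r
# Procrustes distance between two n x k configurations (assumed form):
# min over orthogonal Q of ||A - B Q||_F^2
#   = tr(A'A) + tr(B'B) - 2 * sum(singular values of A'B)
procrustes_dist <- function(A, B) {
  A <- scale(A, center = TRUE, scale = FALSE)   # center columns
  B <- scale(B, center = TRUE, scale = FALSE)
  sv <- svd(crossprod(A, B))$d                  # singular values of A'B
  sum(A^2) + sum(B^2) - 2 * sum(sv)
}

# PCA scores on the first k components (hypothetical helper)
pca_scores <- function(X, k) {
  prcomp(X, center = TRUE, scale. = FALSE)$x[, seq_len(k), drop = FALSE]
}

X <- as.matrix(iris[, 1:4])
k <- 1                                          # plays the role of intdim
full <- pca_scores(X, k)                        # reference configuration

# score every pair of features by how closely its PCA scores match 'full'
pairs <- combn(ncol(X), 2)
dists <- apply(pairs, 2, function(idx) {
  procrustes_dist(full, pca_scores(X[, idx, drop = FALSE], k))
})
best <- pairs[, which.min(dists)]               # pair with the smallest distance
```

A smaller distance means the subset reproduces the full-data PCA configuration more closely, which is the sense in which a feature subset "aligns" with PCA's coordinates.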
do.procrustes(X, ndim = 2, intdim = (ndim - 1), cor = TRUE)
an \((n\times p)\) matrix whose rows are observations and columns represent independent variables.
an integer-valued target dimension.
intrinsic dimension of PCA to be applied. It should be smaller than ndim.
mode of eigendecomposition. FALSE for decomposing the covariance matrix, and TRUE for the correlation matrix in PCA.
a named Rdimtools S3 object containing
an \((n\times ndim)\) matrix whose rows are embedded observations.
a length-\(ndim\) vector of indices with highest scores.
a \((p\times ndim)\) matrix whose columns form a basis for projection.
name of the algorithm.
Krzanowski WJ (1987). “Selection of Variables to Preserve Multivariate Data Structure, Using Principal Components.” Applied Statistics, 36(1), 22. ISSN 00359254.
# \donttest{
## use iris data
## it is known that features 3 and 4 are more important.
data(iris)
iris.dat = as.matrix(iris[,1:4])
iris.lab = as.factor(iris[,5])
## try different strategies
out1 = do.procrustes(iris.dat, cor=TRUE)
out2 = do.procrustes(iris.dat, cor=FALSE)
out3 = do.mifs(iris.dat, iris.lab, beta=0)
## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1, 3))
plot(out1$Y, pch=19, col=iris.lab, main="PCA with Correlation")
plot(out2$Y, pch=19, col=iris.lab, main="PCA with Covariance")
plot(out3$Y, pch=19, col=iris.lab, main="MIFS")
par(opar)
# }