do.procrustes selects the subset of features whose configuration best matches the low-dimensional coordinates obtained by PCA. Variables are selected iteratively: each step picks the variable that minimizes the Procrustes distance between the two configurations.
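The Procrustes criterion behind this selection can be sketched in base R. This is only an illustration, not the package's internal code: `proc_dist` is a hypothetical helper, `k` stands in for the PCA dimension (treating it like `intdim` is an assumption), and scoring features by leaving each one out in turn is one possible reading of the iterative step.

```r
## Hypothetical helper (not part of Rdimtools): squared Procrustes distance
## between two centered n x k configurations A and B, computed as
## tr(AA') + tr(BB') - 2 * (sum of singular values of A'B).
proc_dist <- function(A, B) {
  sum(A^2) + sum(B^2) - 2 * sum(svd(crossprod(A, B))$d)
}

k  <- 1                                   # assumed PCA dimension
Xc <- scale(as.matrix(iris[, 1:4]))       # center and scale, as with cor = TRUE

## reference configuration: leading PC scores from all features
full <- prcomp(Xc)$x[, 1:k, drop = FALSE]

## distance to the reference when each feature is left out in turn
drop_one <- sapply(seq_len(ncol(Xc)), function(j) {
  sub <- prcomp(Xc[, -j, drop = FALSE])$x[, 1:k, drop = FALSE]
  proc_dist(full, sub)
})

## the feature whose removal disturbs the configuration least
which.min(drop_one)
```

A small distance means the reduced feature set reproduces the reference PCA configuration up to rotation and reflection, so a feature with a small `drop_one` value contributes little to that structure.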

do.procrustes(X, ndim = 2, intdim = (ndim - 1), cor = TRUE)

Arguments

X

an \((n\times p)\) matrix whose rows are observations and columns represent independent variables.

ndim

an integer-valued target dimension.

intdim

intrinsic dimension of PCA to be applied. It should be smaller than ndim.

cor

mode of eigendecomposition in PCA: FALSE to decompose the covariance matrix, TRUE to decompose the correlation matrix.

Value

a named Rdimtools S3 object containing

Y

an \((n\times ndim)\) matrix whose rows are embedded observations.

featidx

a length-\(ndim\) vector of indices of the selected features, i.e. those with the highest scores.

projection

a \((p\times ndim)\) matrix whose columns form a basis for projection.

algorithm

name of the algorithm.

References

Krzanowski WJ (1987). “Selection of Variables to Preserve Multivariate Data Structure, Using Principal Components.” Applied Statistics, 36(1), 22. ISSN 00359254.

Author

Kisung You

Examples

# \donttest{
## use iris data
## it is known that features 3 and 4 are more important.
data(iris)
iris.dat = as.matrix(iris[,1:4])
iris.lab = as.factor(iris[,5])

## try different strategies
out1 = do.procrustes(iris.dat, cor=TRUE)
out2 = do.procrustes(iris.dat, cor=FALSE)
out3 = do.mifs(iris.dat, iris.lab, beta=0)

## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1, 3))
plot(out1$Y, pch=19, col=iris.lab, main="PCA with Correlation")
plot(out2$Y, pch=19, col=iris.lab, main="PCA with Covariance")
plot(out3$Y, pch=19, col=iris.lab, main="MIFS")

par(opar)
# }