do.dm discovers low-dimensional manifold structure embedded in a high-dimensional
data space using Diffusion Maps (DM). It exploits a diffusion process and the distances in data space to find
an equivalent representation in a low-dimensional space.
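As a rough sketch of the idea, a common formulation of diffusion maps (following Coifman and Lafon, 2006) is given below. The symbols \(\varepsilon\), \(K\), \(D\), \(P\), \(\lambda_k\) and \(\psi_k\) are introduced here for illustration only, and the exact kernel scaling and normalization used inside do.dm are assumptions rather than something this page states.

\[
K_{ij} = \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{\varepsilon}\right), \qquad
D_{ii} = \sum_{j} K_{ij}, \qquad
P = D^{-1}K
\]

\[
P\,\psi_k = \lambda_k\,\psi_k, \quad 1 = \lambda_0 \ge \lambda_1 \ge \cdots, \qquad
y_i = \left(\lambda_1^{t}\,\psi_1(i),\; \ldots,\; \lambda_d^{t}\,\psi_d(i)\right)
\]

Under this reading, \(\varepsilon\) plays the role of bandwidth, \(t\) the role of timescale, and \(d\) the role of ndim; the trivial eigenpair \((\lambda_0, \psi_0)\) is constant and carries no information, so it is left out of the embedding.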
do.dm(
X,
ndim = 2,
preprocess = c("null", "center", "scale", "cscale", "decorrelate", "whiten"),
bandwidth = 1,
timescale = 1,
multiscale = FALSE
)
X: an \((n\times p)\) matrix or data frame whose rows are observations and whose columns represent independent variables.
ndim: an integer-valued target dimension.
preprocess: an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.
bandwidth: a scaling parameter for the diffusion kernel. Default is 1; it should be a nonnegative real number.
timescale: a target scale whose value represents the behavior of heat kernels at time t. Default is 1; it should be a positive real number.
multiscale: logical; if FALSE, the fixed timescale value is used; if TRUE, the given timescale value is ignored.
a named list containing
Y: an \((n\times ndim)\) matrix whose rows are embedded observations.
trfinfo: a list containing information for out-of-sample prediction.
eigvals: a vector of eigenvalues of the Markov transition matrix.
Nadler B, Lafon S, Coifman RR, Kevrekidis IG (2005). “Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker-Planck Operators.” In Proceedings of the 18th International Conference on Neural Information Processing Systems, NIPS'05, 955--962.
Coifman RR, Lafon S (2006). “Diffusion Maps.” Applied and Computational Harmonic Analysis, 21(1), 5--30.
## load the package that provides do.dm
library(Rdimtools)

## load iris data and draw a random subsample of 50 observations
data(iris)
set.seed(100)
subid <- sample(1:150, 50)
X     <- as.matrix(iris[subid, 1:4])
label <- as.factor(iris[subid, 5])

## compare different bandwidths
out1 <- do.dm(X, bandwidth = 10)
out2 <- do.dm(X, bandwidth = 100)
out3 <- do.dm(X, bandwidth = 1000)

## visualize the three embeddings side by side, colored by species
opar <- par(no.readonly = TRUE)
par(mfrow = c(1, 3))
plot(out1$Y, pch = 19, col = label, main = "DM::bandwidth=10")
plot(out2$Y, pch = 19, col = label, main = "DM::bandwidth=100")
plot(out3$Y, pch = 19, col = label, main = "DM::bandwidth=1000")
par(opar)  # restore the original graphical parameters
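
To make the roles of bandwidth, timescale and ndim more concrete, the following base-R sketch builds a diffusion-map embedding by hand. It is not the do.dm source: the function name dm_sketch is hypothetical, the Gaussian kernel exp(-d^2 / bandwidth) is an assumed scaling, and preprocessing and the multiscale option are left out, so its coordinates need not match do.dm output.

```r
## Minimal diffusion-map sketch in base R (illustrative; NOT the do.dm source).
## Assumption: Gaussian kernel exp(-d^2 / bandwidth); do.dm may scale differently.
dm_sketch <- function(X, ndim = 2, bandwidth = 1, timescale = 1) {
  X  <- as.matrix(X)
  D2 <- as.matrix(dist(X))^2           # squared Euclidean distances
  K  <- exp(-D2 / bandwidth)           # symmetric kernel matrix
  d  <- rowSums(K)                     # row sums (degrees)

  ## S = D^{-1/2} K D^{-1/2} is symmetric and has the same eigenvalues
  ## as the Markov transition matrix P = D^{-1} K.
  S  <- K / sqrt(outer(d, d))
  es <- eigen(S, symmetric = TRUE)     # eigenvalues in decreasing order

  ## Right eigenvectors of P: row i of the eigenvector matrix scaled by 1/sqrt(d_i)
  psi <- es$vectors / sqrt(d)

  ## Skip the trivial eigenpair (eigenvalue 1, constant eigenvector)
  idx <- 2:(ndim + 1)
  Y   <- sweep(psi[, idx, drop = FALSE], 2, es$values[idx]^timescale, `*`)

  list(Y = Y, eigvals = es$values)
}

## Usage on the same kind of subsample as in the example above
set.seed(100)
X   <- as.matrix(iris[sample(1:150, 50), 1:4])
out <- dm_sketch(X, ndim = 2, bandwidth = 10)
dim(out$Y)          # 50 rows, 2 columns
head(out$eigvals)   # leading eigenvalue is 1 up to rounding error
```

Working with the symmetric matrix S rather than P directly keeps the eigendecomposition real-valued and numerically stable; the eigenvectors of P are then recovered by a row-wise rescaling.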