The Ensembles of K-Subspaces (EKSS) method runs K-Subspaces clustering many times and aggregates the resulting partitions in a consensus framework, which mitigates the effect of random initialization. When the results are merged, \(n-q\) values in the co-occurrence matrix are set to zero. The paper suggests using a large number of runs (B), each of which need not use many iterations (iter), because the algorithm relies on combining many partially correct clusterings. In the extreme case, iter may be set to 0; the paper calls this variant EKSS-0.
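The consensus step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used inside EKSS(): the helper names random_subspace_labels, cooccurrence and threshold_rows are hypothetical, each run is an EKSS-0 style assignment to randomly drawn subspaces, and thresholding is assumed to keep the q largest entries in each row before symmetrizing.

```r
# Hypothetical helper: one EKSS-0 style run. Draw k random d-dimensional
# subspaces and assign each row of X to the subspace onto which its
# projection has the largest squared norm.
random_subspace_labels <- function(X, k, d) {
  p <- ncol(X)
  scores <- sapply(seq_len(k), function(j) {
    U <- qr.Q(qr(matrix(rnorm(p * d), p, d)))  # orthonormal basis (p x d)
    rowSums((X %*% U)^2)                        # squared projection norms
  })
  max.col(scores, ties.method = "first")
}

# Hypothetical helper: fraction of runs in which each pair of points
# receives the same label.
cooccurrence <- function(label_list) {
  n <- length(label_list[[1]])
  A <- matrix(0, n, n)
  for (lab in label_list) A <- A + outer(lab, lab, "==")
  A / length(label_list)
}

# Hypothetical helper (assumed row-wise rule): keep the q largest entries
# of each row, zero the remaining n - q, then symmetrize.
threshold_rows <- function(A, q) {
  Z <- matrix(0, nrow(A), ncol(A))
  for (i in seq_len(nrow(A))) {
    keep <- order(A[i, ], decreasing = TRUE)[seq_len(q)]
    Z[i, keep] <- A[i, keep]
  }
  (Z + t(Z)) / 2
}

set.seed(1)
X     <- matrix(rnorm(40 * 5), 40, 5)   # toy data: n = 40, p = 5
runs  <- replicate(50, random_subspace_labels(X, k = 2, d = 2), simplify = FALSE)
A     <- cooccurrence(runs)             # n x n co-occurrence matrix
A_thr <- threshold_rows(A, q = 30)      # thresholded affinity matrix
```

In the paper, a thresholded matrix of this kind is then passed to spectral clustering to obtain the final labels; that step is omitted here to keep the sketch short.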

EKSS(data, k = 2, d = 2, q = floor(nrow(data) * 0.75), B = 500, iter = 0)

Arguments

data

an \((n\times p)\) matrix of row-stacked observations.

k

the number of clusters (default: 2).

d

candidate dimension for each subspace (default: 2).

q

threshold; the number of smaller values to be zeroed out (default: \(\lfloor 0.75n \rfloor\)).

B

the number of ensembles/runs (default: 500).

iter

the number of iterations for each run (default: 0).

Value

a named list of S3 class T4cluster containing

cluster

a length-\(n\) vector of class labels (from \(1:k\)).

algorithm

name of the algorithm.
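As a rough illustration of how the returned list might be consumed, the snippet below tabulates cluster sizes and cross-tabulates the labels against a known grouping. The object fit is a hand-built stand-in with the two documented fields, so that the snippet runs without the package; its values, including the string in algorithm, are placeholders rather than real output.

```r
# Stand-in for an EKSS() result, built by hand with the documented fields.
# All values below are placeholders, not output from the package.
fit <- structure(
  list(cluster = c(1L, 1L, 2L, 2L, 2L, 1L), algorithm = "EKSS"),
  class = "T4cluster"
)

# Hypothetical ground-truth grouping for the same six observations.
truth <- c("a", "a", "b", "b", "a", "a")

sizes     <- table(fit$cluster)                               # points per cluster
confusion <- table(estimated = fit$cluster, truth = truth)    # labels vs. truth
```

With a real fit, the same two calls to table() apply unchanged to the cluster element.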

References

Lipor J, Hong D, Tan YS, Balzano L (2021). “Subspace Clustering Using Ensembles of \(K\)-Subspaces.” arXiv:1709.04744.

Examples

# \donttest{
## generate a toy example
set.seed(10)
tester = genLP(n=100, nl=2, np=1, iso.var=0.1)
data   = tester$data
label  = tester$class

## do PCA for data reduction
proj = base::eigen(stats::cov(data))$vectors[,1:2]
dat2 = data%*%proj

## run EKSS algorithm with k=2,3,4, as EKSS-0 and with 5 iterations
out2zero = EKSS(data, k=2)
out3zero = EKSS(data, k=3)
out4zero = EKSS(data, k=4)

out2iter = EKSS(data, k=2, iter=5)
out3iter = EKSS(data, k=3, iter=5)
out4iter = EKSS(data, k=4, iter=5)

## extract label information
lab2zero = out2zero$cluster
lab3zero = out3zero$cluster
lab4zero = out4zero$cluster

lab2iter = out2iter$cluster
lab3iter = out3iter$cluster
lab4iter = out4iter$cluster

## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,3))
plot(dat2, pch=19, cex=0.9, col=lab2zero, main="EKSS-0:K=2")
plot(dat2, pch=19, cex=0.9, col=lab3zero, main="EKSS-0:K=3")
plot(dat2, pch=19, cex=0.9, col=lab4zero, main="EKSS-0:K=4")
plot(dat2, pch=19, cex=0.9, col=lab2iter, main="EKSS iter:K=2")
plot(dat2, pch=19, cex=0.9, col=lab3iter, main="EKSS iter:K=3")
plot(dat2, pch=19, cex=0.9, col=lab4iter, main="EKSS iter:K=4")
par(opar)
# }