est.mindkl is a minimum neighbor distance estimator of intrinsic dimension that is based on a Kullback-Leibler divergence estimator.
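
For intuition only, the sketch below computes the quantity that minimum neighbor distance methods start from: the ratio of each point's nearest-neighbor distance to its (k+1)-th neighbor distance. It assumes that the k nearest neighbors of a point are roughly uniform inside the ball reaching its (k+1)-th neighbor. Under that assumption the ratio has density k d r^(d-1) (1 - r^d)^(k-1) for intrinsic dimension d, and the sketch fits d by maximum likelihood. The helper name mind_ml_sketch, the search interval, and the synthetic data are hypothetical. This is not the est.mindkl implementation, and it omits the Kullback-Leibler comparison step entirely.

```r
# Hypothetical helper for illustration; not part of any package.
mind_ml_sketch <- function(X, k = 5) {
  X <- as.matrix(X)
  stopifnot(nrow(X) > k + 1)

  D <- as.matrix(dist(X))   # pairwise Euclidean distances
  diag(D) <- Inf            # a point is not its own neighbor

  # After sorting, column j of each row is the distance to the j-th nearest neighbor.
  nn <- t(apply(D, 1, sort))
  rho <- nn[, 1] / nn[, k + 1]
  rho <- rho[is.finite(rho) & rho > 0 & rho < 1]   # drop ties and duplicated points

  # Log-likelihood of d under the assumed density k * d * r^(d-1) * (1 - r^d)^(k-1).
  loglik <- function(d) {
    sum(log(k) + log(d) + (d - 1) * log(rho) + (k - 1) * log1p(-rho^d))
  }

  # Assumption: search d between 0.1 and twice the ambient dimension.
  optimize(loglik, interval = c(0.1, 2 * ncol(X)), maximum = TRUE)$maximum
}

# Synthetic check: a flat 2-dimensional sheet embedded in 5 ambient dimensions.
set.seed(1)
Z <- matrix(runif(1000), ncol = 2)
X <- cbind(Z, matrix(0, nrow(Z), 3))
d_hat <- mind_ml_sketch(X, k = 5)
print(d_hat)
```

On a flat sheet like this the fitted value should land near 2. Curved or noisy data can move it, which is part of the motivation for more elaborate estimators.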

est.mindkl(X, k = 5)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations.

k

the neighborhood size for defining locality.

Value

a named list containing

estdim

the global estimated dimension.
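
Because k sets the neighborhood size, the returned estdim may change with it. The sketch below sweeps a few values of k and collects the estimates. It assumes the function is provided by the Rdimtools package and that the package is installed; the sweep values are arbitrary, and the code does nothing beyond defining ks when the package is unavailable.

```r
# Arbitrary neighborhood sizes to compare (an illustrative choice).
ks <- c(3, 5, 10)

# Assumption: est.mindkl and aux.gensamples are exported by Rdimtools.
if (requireNamespace("Rdimtools", quietly = TRUE)) {
  X <- Rdimtools::aux.gensamples(dname = "swiss")

  # One global estimate per neighborhood size.
  estimates <- vapply(
    ks,
    function(k) Rdimtools::est.mindkl(X, k = k)$estdim,
    numeric(1)
  )
  print(data.frame(k = ks, estdim = estimates))
}
```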

References

Lombardi G, Rozza A, Ceruti C, Casiraghi E, Campadelli P (2011). “Minimum Neighbor Distance Estimators of Intrinsic Dimension.” In Gunopulos D, Hofmann T, Malerba D, Vazirgiannis M (eds.), Machine Learning and Knowledge Discovery in Databases, volume 6912, pp. 374-389. Springer Berlin Heidelberg, Berlin, Heidelberg. ISBN 978-3-642-23782-9, 978-3-642-23783-6.

Author

Kisung You

Examples

# \donttest{
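## load the package (assumed here to be Rdimtools, which provides these functions)
library(Rdimtools)

## create 3 datasets of intrinsic dimension 2.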
X1 = aux.gensamples(dname="swiss")
X2 = aux.gensamples(dname="ribbon")
X3 = aux.gensamples(dname="saddle")

## acquire an estimate for intrinsic dimension
out1 = est.mindkl(X1, k=5)
out2 = est.mindkl(X2, k=5)
out3 = est.mindkl(X3, k=5)

## print the results
line1 = paste0("* est.mindkl : 'swiss'  estimate is ",round(out1$estdim,2))
line2 = paste0("* est.mindkl : 'ribbon' estimate is ",round(out2$estdim,2))
line3 = paste0("* est.mindkl : 'saddle' estimate is ",round(out3$estdim,2))
cat(paste0(line1,"\n",line2,"\n",line3))
#> * est.mindkl : 'swiss'  estimate is 3
#> * est.mindkl : 'ribbon' estimate is 3
#> * est.mindkl : 'saddle' estimate is 3
# }