A minimum neighbor distance (MiND) estimator of intrinsic dimension based on the maximum likelihood principle.

est.mindml(X, k = 5)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations.

k

the neighborhood size (number of nearest neighbors) used to define locality.

Value

a named list containing

estdim

the global estimated dimension.
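
For intuition about what estdim represents, the base R sketch below follows one reading of the maximum likelihood idea in Lombardi et al. (2011). It is not the code used by est.mindml. The function name mind_ml_sketch, the toy data, and the search interval (0, p] are assumptions made for illustration. Under a locally uniform model, the ratio \(\rho\) of a point's nearest-neighbor distance to its \((k+1)\)-th nearest-neighbor distance has density \(k d \rho^{d-1}(1-\rho^{d})^{k-1}\), and the sketch maximizes the resulting log-likelihood over \(d\).

```r
# Illustrative sketch only: NOT the Rdimtools implementation of est.mindml.
# 'mind_ml_sketch' is a hypothetical helper name introduced for this example.
mind_ml_sketch <- function(X, k = 5) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  stopifnot(k >= 1, n > k + 1)

  # Pairwise Euclidean distances; a point is never its own neighbor.
  D <- as.matrix(dist(X))
  diag(D) <- Inf

  # rho_i = (nearest-neighbor distance) / ((k+1)-th nearest-neighbor distance)
  rho <- apply(D, 1, function(d) {
    s <- sort(d)
    s[1] / s[k + 1]
  })
  # Drop degenerate ratios (duplicated points or ties) so the logs stay finite.
  rho <- rho[is.finite(rho) & rho > 0 & rho < 1]

  # Negative log-likelihood of d under the density k*d*rho^(d-1)*(1-rho^d)^(k-1).
  negloglik <- function(d) {
    -(length(rho) * log(k * d) +
        (d - 1) * sum(log(rho)) +
        (k - 1) * sum(log1p(-rho^d)))
  }

  # Assumption: search for d between (almost) 0 and the ambient dimension p.
  optimize(negloglik, interval = c(1e-3, p))$minimum
}

# Toy data: a flat 2-dimensional sheet padded with zeros into 5 dimensions.
set.seed(1)
U <- matrix(runif(500 * 2), ncol = 2)
X <- cbind(U, 0, 0, 0)
est <- mind_ml_sketch(X, k = 10)
print(round(est, 2))
```

For this toy sheet the estimate is expected to fall near 2, although the exact value depends on the seed, the sample size, and k.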

References

Lombardi G, Rozza A, Ceruti C, Casiraghi E, Campadelli P (2011). “Minimum Neighbor Distance Estimators of Intrinsic Dimension.” In Gunopulos D, Hofmann T, Malerba D, Vazirgiannis M (eds.), Machine Learning and Knowledge Discovery in Databases, volume 6912, 374–389. Springer Berlin Heidelberg, Berlin, Heidelberg. ISBN 978-3-642-23782-9, 978-3-642-23783-6.

Author

Kisung You

Examples

# \donttest{
## create 3 datasets of intrinsic dimension 2.
set.seed(100)
X1 = aux.gensamples(dname="swiss")
X2 = aux.gensamples(dname="ribbon")
X3 = aux.gensamples(dname="saddle")

## acquire an estimate for intrinsic dimension
out1 = est.mindml(X1, k=10)
out2 = est.mindml(X2, k=10)
out3 = est.mindml(X3, k=10)

## print the results
line1 = paste0("* est.mindml : 'swiss'  estimate is ",round(out1$estdim,2))
line2 = paste0("* est.mindml : 'ribbon' estimate is ",round(out2$estdim,2))
line3 = paste0("* est.mindml : 'saddle' estimate is ",round(out3$estdim,2))
cat(paste0(line1,"\n",line2,"\n",line3))
#> * est.mindml : 'swiss'  estimate is 1.96
#> * est.mindml : 'ribbon' estimate is 2.11
#> * est.mindml : 'saddle' estimate is 2.1
# }