Showing 1–3 of 3 results for author: Facco, E

  1. arXiv:1902.10459

    stat.ML cs.LG

    Data segmentation based on the local intrinsic dimension

    Authors: Michele Allegra, Elena Facco, Francesco Denti, Alessandro Laio, Antonietta Mira

    Abstract: One of the founding paradigms of machine learning is that a small number of variables is often sufficient to describe high-dimensional data. The minimum number of variables required is called the intrinsic dimension (ID) of the data. Contrary to common intuition, there are cases where the ID varies within the same data set. This fact has been highlighted in technical discussions, but seldom exploi…

    Submitted 13 July, 2020; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: 11 pages, 6 figures + 9 pages Supplementary Information

  2. Estimating the intrinsic dimension of datasets by a minimal neighborhood information

    Authors: Elena Facco, Maria d'Errico, Alex Rodriguez, Alessandro Laio

    Abstract: Analyzing large volumes of high-dimensional data is an issue of fundamental importance in data science, molecular simulations and beyond. Several approaches work on the assumption that the important content of a dataset belongs to a manifold whose Intrinsic Dimension (ID) is much lower than the crude large number of coordinates. Such manifold is generally twisted and curved, in addition points on…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: Scientific Reports 2017

  3. Automatic topography of high-dimensional data sets by non-parametric Density Peak clustering

    Authors: Maria d'Errico, Elena Facco, Alessandro Laio, Alex Rodriguez

    Abstract: Data analysis in high-dimensional spaces aims at obtaining a synthetic description of a data set, revealing its main structure and its salient features. We here introduce an approach providing this description in the form of a topography of the data, namely a human-readable chart of the probability density from which the data are harvested. The approach is based on an unsupervised extension of Den…

    Submitted 5 February, 2021; v1 submitted 28 February, 2018; originally announced February 2018.

    Comments: There is a Supplementary Information document in the ancillary files folder

    Journal ref: Information Sciences Volume 560, June 2021, Pages 476-492