
Probabilistic modelling of general noisy multi-manifold data sets

Author
  • M. Canducci
  • P. Tiño
  • Michele Mastropietro
Project
  • SUNDIAL (SUrvey Network for Deep Imaging Analysis and Learning)
Abstract
The intrinsic nature of noisy and complex data sets is often concealed in low-dimensional structures embedded in a higher-dimensional space. A number of methodologies have been developed to extract and represent such structures in the form of manifolds (i.e. geometric structures that locally resemble continuously deformable intervals of ℝ^j). Usually, a priori knowledge of the manifold's intrinsic dimensionality is required. Additionally, the performance of these methods can often be hampered by the presence of significant high-dimensional noise aligned along the low-dimensional core manifold. In real-world applications, the data can contain several low-dimensional structures of different dimensionalities. We propose a framework for dimensionality estimation and reconstruction of multiple noisy manifolds embedded in a noisy environment. To the best of our knowledge, this work represents the first attempt at detection and modelling of a set of coexisting general noisy manifolds by uniting two aspects of multi-manifold learning: the recovery and approximation of core noiseless manifolds and the construction of their probabilistic models. The easy-to-understand hyper-parameters can be manipulated to obtain an emerging picture of the multi-manifold structure of the data. We demonstrate the workings of the framework on two synthetic data sets that present challenging features for state-of-the-art techniques in multi-manifold learning. The first data set consists of multiple sampled noisy manifolds of different intrinsic dimensionalities, such as a Möbius strip, a toroid and a spiral arm. The second is a topologically complex set of three interlocked toroids. Given the absence of such unified methodologies in the literature, the comparison with existing techniques is organized along the two separate aspects of our approach mentioned above, namely manifold approximation and probabilistic modelling.
The framework is then applied to a complex data set containing simulated gas volume particles from a particle simulation of a dwarf galaxy interacting with its host galaxy cluster. Detailed analysis of the recovered 1D and 2D manifolds can help us to understand the nature of star formation in such complex systems. (C) 2021 The Author(s). Published by Elsevier B.V.
Keywords
Artificial Intelligence, Linguistics and Language, Language and Linguistics, Latent variable models, Dimensionality estimation, Multi-manifold Learning, Riemannian manifolds, Generative topographic mapping, Density estimation, Probabilistic modelling, NONLINEAR DIMENSIONALITY REDUCTION, EIGENMAPS, GTM

Downloads

  • published.pdf
    • full text (Published version) | open access | PDF | 7.45 MB

Citation

Please use this url to cite or link to this publication:

MLA
Canducci, M., et al. “Probabilistic Modelling of General Noisy Multi-Manifold Data Sets.” ARTIFICIAL INTELLIGENCE, vol. 302, 2022, doi:10.1016/j.artint.2021.103579.
APA
Canducci, M., Tiño, P., & Mastropietro, M. (2022). Probabilistic modelling of general noisy multi-manifold data sets. ARTIFICIAL INTELLIGENCE, 302. https://doi.org/10.1016/j.artint.2021.103579
Chicago author-date
Canducci, M., P. Tiño, and Michele Mastropietro. 2022. “Probabilistic Modelling of General Noisy Multi-Manifold Data Sets.” ARTIFICIAL INTELLIGENCE 302. https://doi.org/10.1016/j.artint.2021.103579.
Chicago author-date (all authors)
Canducci, M., P. Tiño, and Michele Mastropietro. 2022. “Probabilistic Modelling of General Noisy Multi-Manifold Data Sets.” ARTIFICIAL INTELLIGENCE 302. doi:10.1016/j.artint.2021.103579.
Vancouver
1. Canducci M, Tiño P, Mastropietro M. Probabilistic modelling of general noisy multi-manifold data sets. ARTIFICIAL INTELLIGENCE. 2022;302.
IEEE
[1] M. Canducci, P. Tiño, and M. Mastropietro, “Probabilistic modelling of general noisy multi-manifold data sets,” ARTIFICIAL INTELLIGENCE, vol. 302, 2022.
@article{8719615,
  abstract     = {{The intrinsic nature of noisy and complex data sets is often concealed in low-dimensional structures embedded in a higher-dimensional space. A number of methodologies have been developed to extract and represent such structures in the form of manifolds (i.e. geometric structures that locally resemble continuously deformable intervals of $\mathbb{R}^j$). Usually, a priori knowledge of the manifold's intrinsic dimensionality is required. Additionally, the performance of these methods can often be hampered by the presence of significant high-dimensional noise aligned along the low-dimensional core manifold. In real-world applications, the data can contain several low-dimensional structures of different dimensionalities. We propose a framework for dimensionality estimation and reconstruction of multiple noisy manifolds embedded in a noisy environment. To the best of our knowledge, this work represents the first attempt at detection and modelling of a set of coexisting general noisy manifolds by uniting two aspects of multi-manifold learning: the recovery and approximation of core noiseless manifolds and the construction of their probabilistic models. The easy-to-understand hyper-parameters can be manipulated to obtain an emerging picture of the multi-manifold structure of the data. We demonstrate the workings of the framework on two synthetic data sets that present challenging features for state-of-the-art techniques in multi-manifold learning. The first data set consists of multiple sampled noisy manifolds of different intrinsic dimensionalities, such as a Möbius strip, a toroid and a spiral arm. The second is a topologically complex set of three interlocked toroids. Given the absence of such unified methodologies in the literature, the comparison with existing techniques is organized along the two separate aspects of our approach mentioned above, namely manifold approximation and probabilistic modelling.
The framework is then applied to a complex data set containing simulated gas volume particles from a particle simulation of a dwarf galaxy interacting with its host galaxy cluster. Detailed analysis of the recovered 1D and 2D manifolds can help us to understand the nature of star formation in such complex systems. (C) 2021 The Author(s). Published by Elsevier B.V.}},
  articleno    = {{103579}},
  author       = {{Canducci, M. and Tiño, P. and Mastropietro, Michele}},
  issn         = {{0004-3702}},
  journal      = {{ARTIFICIAL INTELLIGENCE}},
  keywords     = {{Artificial Intelligence,Linguistics and Language,Language and Linguistics,Latent variable models,Dimensionality estimation,Multi-manifold Learning,Riemannian manifolds,Generative topographic mapping,Density estimation,Probabilistic modelling,NONLINEAR DIMENSIONALITY REDUCTION,EIGENMAPS,GTM}},
  language     = {{eng}},
  pages        = {{29}},
  title        = {{Probabilistic modelling of general noisy multi-manifold data sets}},
  url          = {{http://dx.doi.org/10.1016/j.artint.2021.103579}},
  volume       = {{302}},
  year         = {{2022}},
}
