Showing 1–9 of 9 results for author: Landa, B

Searching in archive cs.
  1. arXiv:2407.01718  [pdf, other]

    stat.ML cs.LG math.ST

    Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets

    Authors: Boris Landa, Yuval Kluger, Rong Ma

    Abstract: Embedding high-dimensional data into a low-dimensional space is an indispensable component of data analysis. In numerous applications, it is necessary to align and jointly embed multiple datasets from different studies or experimental conditions. Such datasets may share underlying structures of interest but exhibit individual distortions, resulting in misaligned embeddings using traditional techni…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2209.08004  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Robust Inference of Manifold Density and Geometry by Doubly Stochastic Scaling

    Authors: Boris Landa, Xiuyuan Cheng

    Abstract: The Gaussian kernel and its traditional normalizations (e.g., row-stochastic) are popular approaches for assessing similarities between data points. Yet, they can be inaccurate under high-dimensional noise, especially if the noise magnitude varies considerably across the data, e.g., under heteroskedasticity or outliers. In this work, we investigate a more robust alternative -- the doubly stochasti…

    Submitted 10 July, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

  3. arXiv:2206.11386  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Bi-stochastically normalized graph Laplacian: convergence to manifold Laplacian and robustness to outlier noise

    Authors: Xiuyuan Cheng, Boris Landa

    Abstract: Bi-stochastic normalization provides an alternative normalization of graph Laplacians in graph-based data analysis and can be computed efficiently by Sinkhorn-Knopp (SK) iterations. This paper proves the convergence of bi-stochastically normalized graph Laplacian to manifold (weighted-)Laplacian with rates, when $n$ data points are i.i.d. sampled from a general $d$-dimensional manifold embedded in…

    Submitted 17 July, 2024; v1 submitted 22 June, 2022; originally announced June 2022.

  4. arXiv:2103.13840  [pdf, other]

    math.ST cs.IT

    Biwhitening Reveals the Rank of a Count Matrix

    Authors: Boris Landa, Thomas T. C. K. Zhang, Yuval Kluger

    Abstract: Estimating the rank of a corrupted data matrix is an important task in data analysis, most notably for choosing the number of components in PCA. Significant progress on this task was achieved using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, e.g.…

    Submitted 2 November, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    MSC Class: 62H12; 62H25

  5. arXiv:2006.00402  [pdf, ps, other]

    stat.ML cs.IT cs.LG

    Doubly-Stochastic Normalization of the Gaussian Kernel is Robust to Heteroskedastic Noise

    Authors: Boris Landa, Ronald R. Coifman, Yuval Kluger

    Abstract: A fundamental step in many data-analysis techniques is the construction of an affinity matrix describing similarities between data points. When the data points reside in Euclidean space, a widespread approach is to form an affinity matrix by the Gaussian kernel with pairwise distances, and to follow with a certain normalization (e.g. the row-stochastic normalization or its symmetric variant). We d…

    Submitted 25 January, 2021; v1 submitted 30 May, 2020; originally announced June 2020.

  6. arXiv:1906.00211  [pdf, ps, other]

    math.ST cs.DS cs.IT

    Multi-reference factor analysis: low-rank covariance estimation under unknown translations

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: We consider the problem of estimating the covariance matrix of a random signal observed through unknown translations (modeled by cyclic shifts) and corrupted by noise. Solving this problem allows one to discover low-rank structures masked by the existence of translations (which act as nuisance parameters), with direct application to Principal Components Analysis (PCA). We assume that the underlying si…

    Submitted 21 September, 2020; v1 submitted 1 June, 2019; originally announced June 2019.

  7. arXiv:1905.12442  [pdf, other]

    math.ST cs.DS cs.IT

    Rank-one Multi-Reference Factor Analysis

    Authors: Yariv Aizenbud, Boris Landa, Yoel Shkolnisky

    Abstract: In recent years, there is a growing need for processing methods aimed at extracting useful information from large datasets. In many cases the challenge is to discover a low-dimensional structure in the data, often concealed by the existence of nuisance parameters and noise. Motivated by such challenges, we consider the problem of estimating a signal from its scaled, cyclically-shifted and noisy ob…

    Submitted 4 June, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

  8. arXiv:1802.01894  [pdf, ps, other]

    cs.CV cs.LG

    The steerable graph Laplacian and its application to filtering image data-sets

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: In recent years, improvements in various image acquisition techniques gave rise to the need for adaptive processing methods, aimed particularly at large datasets corrupted by noise and deformations. In this work, we consider datasets of images sampled from a low-dimensional manifold (i.e. an image-valued manifold), where the images can assume arbitrary planar rotations. To derive an adaptive and…

    Submitted 7 August, 2018; v1 submitted 6 February, 2018; originally announced February 2018.

  9. arXiv:1608.02702  [pdf, ps, other]

    cs.CV math.NA

    Steerable Principal Components for Space-Frequency Localized Images

    Authors: Boris Landa, Yoel Shkolnisky

    Abstract: This paper describes a fast and accurate method for obtaining steerable principal components from a large dataset of images, assuming the images are well localized in space and frequency. The obtained steerable principal components are optimal for expanding the images in the dataset and all of their rotations. The method relies upon first expanding the images using a series of two-dimensional Prol…

    Submitted 9 August, 2018; v1 submitted 9 August, 2016; originally announced August 2016.