Showing 1–10 of 10 results for author: Svirsky, J

  1. arXiv:2412.12951

    cs.LG

    FineGates: LLMs Finetuning with Compression using Stochastic Gates

    Authors: Jonathan Svirsky, Yehonathan Refael, Ofir Lindenbaum

    Abstract: Large Language Models (LLMs), with billions of parameters, present significant challenges for full finetuning due to the high computational demands, memory requirements, and impracticality of many real-world applications. When faced with limited computational resources or small datasets, updating all model parameters can often result in overfitting. To address this, lightweight finetuning techniqu…

    Submitted 17 December, 2024; originally announced December 2024.

  2. arXiv:2410.17881

    cs.LG

    AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning

    Authors: Yehonathan Refael, Jonathan Svirsky, Boris Shustin, Wasim Huleihel, Ofir Lindenbaum

    Abstract: Training and fine-tuning large language models (LLMs) come with challenges related to memory and computational requirements due to the increasing size of the model weights and the optimizer states. Various techniques have been developed to tackle these challenges, such as low-rank adaptation (LoRA), which involves introducing a parallel trainable low-rank matrix to the fixed pre-trained weights at…

    Submitted 29 December, 2024; v1 submitted 23 October, 2024; originally announced October 2024.

  3. Sparse Binarization for Fast Keyword Spotting

    Authors: Jonathan Svirsky, Uri Shaham, Ofir Lindenbaum

    Abstract: With the increasing prevalence of voice-activated devices and applications, keyword spotting (KWS) models enable users to interact with technology hands-free, enhancing convenience and accessibility in various contexts. Deploying KWS models on edge devices, such as smartphones and embedded systems, offers significant benefits for real-time applications, privacy, and bandwidth efficiency. However,…

    Submitted 9 June, 2024; originally announced June 2024.

    Journal ref: Interspeech 2024

  4. arXiv:2402.16383

    cs.LG stat.ML

    Self Supervised Correlation-based Permutations for Multi-View Clustering

    Authors: Ran Eisenberg, Jonathan Svirsky, Ofir Lindenbaum

    Abstract: Combining data from different sources can improve data analysis tasks such as clustering. However, most of the current multi-view clustering methods are limited to specific domains or rely on a suboptimal and computationally intensive two-stage process of representation learning and clustering. We propose an end-to-end deep learning-based multi-view clustering framework for general data types (suc…

    Submitted 20 May, 2025; v1 submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2306.04785

    cs.LG stat.ML

    Interpretable Deep Clustering for Tabular Data

    Authors: Jonathan Svirsky, Ofir Lindenbaum

    Abstract: Clustering is a fundamental learning task widely used as a first step in data analysis. For example, biologists use cluster assignments to analyze genome sequences, medical records, or images. Since downstream analysis is typically performed at the cluster level, practitioners seek reliable and interpretable clustering models. We propose a new deep-learning framework for general domain tabular dat…

    Submitted 9 June, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

  6. arXiv:2210.16022

    eess.AS cs.SD

    SG-VAD: Stochastic Gates Based Speech Activity Detection

    Authors: Jonathan Svirsky, Ofir Lindenbaum

    Abstract: We propose a novel voice activity detection (VAD) model in a low-resource environment. Our key idea is to model VAD as a denoising task, and construct a network that is designed to identify nuisance features for a speech classification task. We train the model to simultaneously identify irrelevant features while predicting the type of speech event. Our model contains only 7.8K parameters, outperfo…

    Submitted 28 October, 2022; originally announced October 2022.

  7. arXiv:2110.05887

    stat.ML cs.LG

    Discovery of Single Independent Latent Variable

    Authors: Uri Shaham, Jonathan Svirsky, Ori Katz, Ronen Talmon

    Abstract: Latent variable discovery is a central problem in data analysis with a broad range of applications in applied science. In this work, we consider data given as an invertible mixture of two statistically independent components and assume that one of the components is observed while the other is hidden. Our goal is to recover the hidden component. For this purpose, we propose an autoencoder equipped…

    Submitted 7 March, 2023; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at NeurIPS 2022. In the current version, the proof of the lemma is modified.

    Journal ref: Advances in Neural Information Processing Systems 2022

  8. arXiv:2110.05306

    stat.ML cs.AI cs.LG

    Deep Unsupervised Feature Selection by Discarding Nuisance and Correlated Features

    Authors: Uri Shaham, Ofir Lindenbaum, Jonathan Svirsky, Yuval Kluger

    Abstract: Modern datasets often contain large subsets of correlated features and nuisance features, which are not or loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the Graph Laplacians' leading eigenvectors. We demonstrate that in the presence of…

    Submitted 11 October, 2021; originally announced October 2021.

  9. arXiv:2011.07607

    stat.ML cs.LG

    Deep Ordinal Regression using Optimal Transport Loss and Unimodal Output Probabilities

    Authors: Uri Shaham, Igal Zaidman, Jonathan Svirsky

    Abstract: It is often desired that ordinal regression models yield unimodal predictions. However, in many recent works this characteristic is either absent, or implemented using soft targets, which do not guarantee unimodal outputs at inference. In addition, we argue that the standard maximum likelihood objective is not suitable for ordinal regression problems, and that optimal transport is better suited fo…

    Submitted 18 November, 2021; v1 submitted 15 November, 2020; originally announced November 2020.

  10. arXiv:2007.04728

    cs.LG stat.ML

    Differentiable Unsupervised Feature Selection based on a Gated Laplacian

    Authors: Ofir Lindenbaum, Uri Shaham, Jonathan Svirsky, Erez Peterfreund, Yuval Kluger

    Abstract: Scientific observations may consist of a large number of variables (features). Identifying a subset of meaningful features is often ignored in unsupervised learning, despite its potential for unraveling clear patterns hidden in the ambient space. In this paper, we present a method for unsupervised feature selection, and we demonstrate its use for the task of clustering. We propose a differentiable…

    Submitted 9 November, 2020; v1 submitted 9 July, 2020; originally announced July 2020.