
Invariant inter-subject relational structures in the human visual cortex

Authors: Ofer Lipman, Shany Grossman, Doron Friedman, Yacov Hel-Or, Rafael Malach

Abstract: It is a fundamental behavior that different individuals see the world in a largely similar manner. This is an essential basis for humans' ability to cooperate and communicate. However, what are the neuronal properties that underlie these inter-subject commonalities of our visual world? Finding out what aspects of neuronal coding remain invariant across individuals' brains will shed light not only on this fundamental question but will also point to the neuronal coding scheme as the basis of visual perception. Here, we address this question by obtaining intracranial recordings from three cohorts of patients, each taking part in a different visual recognition task (overall 19 patients and 244 high-order visual contacts included in the analyses), and examining the neuronal coding scheme most consistent across individuals' visual cortex. Our results highlight relational coding, expressed by the set of similarity distances between profiles of pattern activations, as the most consistent representation across individuals. Alternative coding schemes, such as population vector coding or linear coding, failed to achieve similar inter-subject consistency. Our results thus support relational coding as the central neuronal code underlying individuals' shared perceptual content in the human brain.

Submitted 11 July, 2024; originally announced July 2024.
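
To make the notion of relational coding concrete, the following is a minimal sketch and not the authors' analysis pipeline. It assumes each subject's data is a list of activation patterns (one per stimulus, one value per contact), uses Euclidean distance as the similarity distance, and scores inter-subject consistency with a Pearson correlation. These three choices, the function names, and the toy numbers are all illustrative assumptions.

```python
import math
from itertools import combinations


def pairwise_distances(patterns):
    """Euclidean distance between every pair of stimulus patterns."""
    return [math.dist(a, b) for a, b in combinations(patterns, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Toy data (hypothetical): 3 stimuli per subject.
# Subject A has 2 contacts, subject B has 3 contacts.
subject_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
subject_b = [[0.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]

# The relational structure is the set of distances between stimulus patterns.
relational_a = pairwise_distances(subject_a)
relational_b = pairwise_distances(subject_b)

# Compare the two relational structures across subjects.
consistency = pearson(relational_a, relational_b)
```

Because only the distances between patterns are compared, the two subjects do not need the same number of contacts or any contact-to-contact correspondence.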

arXiv:2402.03867

Binaural sound source localization using a hybrid time and frequency domain model

Authors: Gil Geva, Olivier Warusfel, Shlomo Dubnov, Tammuz Dubnov, Amir Amedi, Yacov Hel-Or

Abstract: This paper introduces a new approach to sound source localization using head-related transfer function (HRTF) characteristics, which enable precise full-sphere localization from raw data. While previous research focused primarily on using extensive microphone arrays in the frontal plane, this arrangement often encountered limitations in accuracy and robustness when dealing with smaller microphone arrays. Our model proposes using both the time and frequency domains for sound source localization while utilizing a deep learning (DL) approach. The performance of our proposed model surpasses the current state-of-the-art results. Specifically, it boasts an average angular error of 0.24 degrees and an average Euclidean distance of 0.01 meters, while the known state-of-the-art gives an average angular error of 19.07 degrees and an average Euclidean distance of 1.08 meters. This level of accuracy is of paramount importance for a wide range of applications, including robotics, virtual reality, and aiding individuals with cochlear implants (CI).

Submitted 6 February, 2024; originally announced February 2024.
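
The two error measures quoted in the abstract can be illustrated with a small sketch. The abstract does not define them precisely, so this assumes the angular error is the angle between the predicted and true direction vectors, and the Euclidean distance is the straight-line distance between predicted and true source positions. Function names and sample values are hypothetical.

```python
import math


def angular_error_deg(pred_dir, true_dir):
    """Angle in degrees between two 3-D direction vectors."""
    dot = sum(p * t for p, t in zip(pred_dir, true_dir))
    norms = math.hypot(*pred_dir) * math.hypot(*true_dir)
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))


def position_error_m(pred_pos, true_pos):
    """Euclidean distance between predicted and true source positions."""
    return math.dist(pred_pos, true_pos)


# Hypothetical sample: perpendicular directions, positions 1 m apart.
example_angle = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
example_dist = position_error_m((1.0, 2.0, 2.0), (1.0, 2.0, 1.0))
```

Averaging these per-sample values over a test set would give figures of the kind reported above, under the stated assumptions.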

arXiv:2211.11825

Multi-Directional Subspace Editing in Style-Space

Authors: Chen Naveh, Yacov Hel-Or

Abstract: This paper describes a new technique for finding disentangled semantic directions in the latent space of StyleGAN. Our method identifies meaningful orthogonal subspaces that allow editing of one human face attribute, while minimizing undesired changes in other attributes. Our model is capable of editing a single attribute in multiple directions, resulting in a range of possible generated images. We compare our scheme with three state-of-the-art models and show that our method outperforms them in terms of face editing and disentanglement capabilities. Additionally, we suggest quantitative measures for evaluating attribute separation and disentanglement, and exhibit the superiority of our model with respect to those measures.

Submitted 23 August, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Journal ref: ICCV 2023

arXiv:2203.16626

DDNeRF: Depth Distribution Neural Radiance Fields

Authors: David Dadon, Ohad Fried, Yacov Hel-Or

Abstract: In recent years, the field of implicit neural representation has progressed significantly. Models such as neural radiance fields (NeRF), which use relatively small neural networks, can represent high-quality scenes and achieve state-of-the-art results for novel view synthesis. Training these types of networks, however, is still computationally very expensive. We present depth distribution neural radiance field (DDNeRF), a new method that significantly increases sampling efficiency along rays during training while achieving superior results for a given sampling budget. DDNeRF achieves this by learning a more accurate representation of the density distribution along rays. More specifically, we train a coarse model to predict the internal distribution of the transparency of an input volume in addition to the volume's total density. This finer distribution then guides the sampling procedure of the fine model. This method allows us to use fewer samples during training while reducing computational resources.

Submitted 30 March, 2022; originally announced March 2022.

arXiv:2203.15065

DeepShadow: Neural Shape from Shadow

Authors: Asaf Karnieli, Ohad Fried, Yacov Hel-Or

Abstract: This paper presents DeepShadow, a one-shot method for recovering the depth map and surface normals from photometric stereo shadow maps. Previous works that try to recover the surface normals from photometric stereo images treat cast shadows as a disturbance. We show that the self and cast shadows not only do not disturb 3D reconstruction, but can be used alone, as a strong learning signal, to recover the depth map and surface normals. We demonstrate that 3D reconstruction from shadows can even outperform shape-from-shading in certain cases. To the best of our knowledge, our method is the first to reconstruct 3D shape-from-shadows using neural networks. The method does not require any pre-training or expensive labeled data, and is optimized during inference time.

Submitted 30 October, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

Comments: ECCV 2022. Project page available at https://asafkar.github.io/deepshadow/

arXiv:2110.04519

Pairwise Margin Maximization for Deep Neural Networks

Authors: Berry Weinstein, Shai Fine, Yacov Hel-Or

Abstract: The weight decay regularization term is widely used during training to constrain expressivity, avoid overfitting, and improve generalization. Historically, this concept was borrowed from the SVM maximum margin principle and extended to multi-class deep networks. Carefully inspecting this principle reveals that it is not optimal for multi-class classification in general, and in particular when using deep neural networks. In this paper, we explain why this commonly used principle is not optimal and propose a new regularization scheme, called Pairwise Margin Maximization (PMM), which measures the minimal amount of displacement an instance should take until its predicted classification is switched. In deep neural networks, PMM can be implemented in the vector space before the network's output layer, i.e., in the deep feature space, where we add an additional normalization term to avoid convergence to a trivial solution. We demonstrate empirically a substantial improvement when training a deep neural network with PMM compared to the standard regularization terms.

Submitted 9 October, 2021; originally announced October 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2009.06011

arXiv:2009.06011

Margin-Based Regularization and Selective Sampling in Deep Neural Networks

Authors: Berry Weinstein, Shai Fine, Yacov Hel-Or

Abstract: We derive a new margin-based regularization formulation, termed multi-margin regularization (MMR), for deep neural networks (DNNs). The MMR is inspired by principles that were applied in margin analysis of shallow linear classifiers, e.g., support vector machine (SVM). Unlike SVM, MMR is continuously scaled by the radius of the bounding sphere (i.e., the maximal norm of the feature vector in the data), which is constantly changing during training. We empirically demonstrate that by a simple supplement to the loss function, our method achieves better results on various classification tasks across domains. Using the same concept, we also derive a selective sampling scheme and demonstrate accelerated training of DNNs by selecting samples according to a minimal margin score (MMS). This score measures the minimal amount of displacement an input should undergo until its predicted classification is switched. We evaluate our proposed methods on three image classification tasks and six language text classification tasks. Specifically, we show improved empirical results on CIFAR10, CIFAR100 and ImageNet using state-of-the-art convolutional neural networks (CNNs) and BERT-BASE architecture for the MNLI, QQP, QNLI, MRPC, SST-2 and RTE benchmarks.

Submitted 13 September, 2020; originally announced September 2020.

arXiv:2008.01487

Autoencoder Image Interpolation by Shaping the Latent Space

Authors: Alon Oring, Zohar Yakhini, Yacov Hel-Or

Abstract: Autoencoders represent an effective approach for computing the underlying factors characterizing datasets of different types. The latent representations of autoencoders have been studied in the context of enabling interpolation between data points by decoding convex combinations of latent vectors. This interpolation, however, often leads to artifacts or produces unrealistic results during reconstruction. We argue that these incongruities are due to the structure of the latent space and because such naively interpolated latent vectors deviate from the data manifold. In this paper, we propose a regularization technique that shapes the latent representation to follow a manifold that is consistent with the training images and that drives the manifold to be smooth and locally convex. This regularization not only enables faithful interpolation between data points, as we show herein, but can also be used as a general regularization technique to avoid overfitting or to produce new samples for data augmentation.

Submitted 21 October, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

Comments: Submitted Sept 2020
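
The naive interpolation that the abstract takes as its starting point, decoding convex combinations of two latent vectors, can be sketched as follows. This shows only the convex-combination step. The proposed regularizer is not reproduced here, and the function name and sample latent vectors are hypothetical.

```python
def interpolate_latents(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced between z_start and z_end.

    Each vector is the convex combination (1 - alpha) * z_start + alpha * z_end
    with alpha running from 0 to 1.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        alpha = i / (steps - 1)
        path.append([(1 - alpha) * a + alpha * b for a, b in zip(z_start, z_end)])
    return path


# Hypothetical 2-D latent codes of two encoded images.
path = interpolate_latents([0.0, 2.0], [4.0, 0.0], steps=5)
```

In practice each vector in `path` would be passed through a trained decoder. The abstract's argument is that, without a suitably shaped latent space, the intermediate vectors may fall off the data manifold and decode to unrealistic images.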

arXiv:2004.10306

doi 10.1109/TIP.2021.3065226

The Role of Redundant Bases and Shrinkage Functions in Image Denoising

Authors: Yacov Hel-Or, Gil Ben-Artzi

Abstract: Wavelet denoising is a classical and effective approach for reducing noise in images and signals. Suggested in 1994, this approach is carried out by rectifying the coefficients of a noisy image in the transform domain, using a set of scalar shrinkage functions (SFs). A plethora of papers deal with the optimal shape of the SFs and the transform used, where it is known that applying the SFs in redundant bases provides improved results. This paper provides a complete picture of the interrelations between the transform used, the optimal shrinkage functions, and the domains in which they are optimized. In particular, we show that for subband optimization, where each SF is optimized independently for a particular band, optimizing the SFs in the spatial domain is always better than or equal to optimizing the SFs in the transform domain. For redundant bases, we provide the expected denoising gain we may achieve, relative to the unitary basis, as a function of the redundancy rate.

Submitted 21 April, 2020; originally announced April 2020.
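
As a rough illustration of shrinkage in the transform domain, the sketch below applies the classic soft-threshold shrinkage function to the detail coefficients of a one-level orthonormal Haar transform of a 1-D signal. This is a generic textbook instance in a unitary basis, not the optimized shrinkage functions or redundant bases studied in the paper. The function names, the threshold, and the sample signal are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def soft_threshold(coeff, threshold):
    """Shrink a coefficient toward zero by `threshold`, keeping its sign."""
    return math.copysign(max(abs(coeff) - threshold, 0.0), coeff)


def haar_forward(signal):
    """One-level orthonormal Haar transform (signal length assumed even)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return signal


def denoise(signal, threshold):
    """Transform, shrink the detail coefficients, transform back."""
    approx, detail = haar_forward(signal)
    shrunk = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, shrunk)


# Hypothetical noisy signal: two flat segments with small perturbations.
cleaned = denoise([4.0, 4.2, 1.0, 0.8], threshold=1.0)
```

Here the small detail coefficients fall below the threshold and are set to zero, so each pair of samples is replaced by its mean.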

arXiv:2002.01793

Proximity Preserving Binary Code using Signed Graph-Cut

Authors: Inbal Lav, Shai Avidan, Yoram Singer, Yacov Hel-Or

Abstract: We introduce a binary embedding framework, called Proximity Preserving Code (PPC), which learns similarity and dissimilarity between data points to create a compact and affinity-preserving binary code. This code can be used to apply fast and memory-efficient approximation to nearest-neighbor searches. Our framework is flexible, enabling different proximity definitions between data points. In contrast to previous methods that extract binary codes based on unsigned graph partitioning, our system models the attractive and repulsive forces in the data by incorporating positive and negative graph weights. The proposed framework is shown to boil down to finding the minimal cut of a signed graph, a problem known to be NP-hard. We offer an efficient approximation and achieve superior results by constructing the code bit after bit. We show that the proposed approximation is superior to the commonly used spectral methods with respect to both accuracy and complexity. Thus, it is useful for many other problems that can be translated into signed graph cut.

Submitted 5 February, 2020; originally announced February 2020.

Journal ref: AAAI Conference on Artificial Intelligence, Feb. 2020

arXiv:1911.06996

Selective sampling for accelerating training of deep neural networks

Authors: Berry Weinstein, Shai Fine, Yacov Hel-Or

Abstract: We present a selective sampling method designed to accelerate the training of deep neural networks. To this end, we introduce a novel measurement, the minimal margin score (MMS), which measures the minimal amount of displacement an input should take until its predicted classification is switched. For multi-class linear classification, the MMS measure is a natural generalization of the margin-based selection criterion, which was thoroughly studied in the binary classification setting. In addition, the MMS measure provides an interesting insight into the progress of the training process and can be useful for designing and monitoring new training regimes. Empirically, we demonstrate a substantial acceleration when training commonly used deep neural network architectures for popular image classification tasks. The efficiency of our method is compared against the standard training procedures, and against commonly used selective sampling alternatives: hard negative mining selection and entropy-based selection. Finally, we demonstrate an additional speedup when we adopt a more aggressive learning drop regime while using the MMS selective sampling method.

Submitted 16 November, 2019; originally announced November 2019.
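
For the multi-class linear case mentioned in the abstract, a displacement-based margin score can be sketched geometrically. With class scores s_k = w_k · x + b_k, the distance from x to the boundary between the predicted class i and a rival class j is (s_i - s_j) / ||w_i - w_j||. The sketch takes the smallest such distance over all rivals. This is an illustrative reading of "minimal displacement until the prediction switches", not necessarily the exact definition or implementation used in the paper, and the names and sample classifier are hypothetical.

```python
import math


def minimal_margin_score(x, weights, biases):
    """Smallest distance from x to a decision boundary of a linear classifier.

    weights: one weight vector per class; biases: one bias per class.
    """
    scores = [sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    top = max(range(len(scores)), key=scores.__getitem__)
    best = math.inf
    for j in range(len(scores)):
        if j == top:
            continue
        diff = [a - b for a, b in zip(weights[top], weights[j])]
        norm = math.hypot(*diff)
        if norm == 0.0:
            continue  # Identical weight vectors: no boundary between these classes.
        best = min(best, (scores[top] - scores[j]) / norm)
    return best


# Hypothetical 3-class linear classifier in 2-D.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
biases = [0.0, 0.0, 0.0]
score = minimal_margin_score([3.0, 1.0], weights, biases)
```

A selective sampling scheme along the lines described above would favor training examples with small scores, since these lie closest to a decision boundary.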

Showing 1–11 of 11 results for author: Hel-Or, Y