Showing 1–2 of 2 results for author: Biscione, V
-
A case for robust translation tolerance in humans and CNNs. A commentary on Han et al
Authors:
Ryan Blything,
Valerio Biscione,
Jeffrey Bowers
Abstract:
Han et al. (2020) reported a behavioral experiment that assessed the extent to which the human visual system can identify novel images at unseen retinal locations (what the authors call "intrinsic translation invariance") and developed a novel convolutional neural network model (an Eccentricity Dependent Network or ENN) to capture key aspects of the behavioral results. Here we show that their analysis of behavioral data used inappropriate baseline conditions, leading them to underestimate intrinsic translation invariance. When the data are correctly interpreted, they show near-complete translation tolerance extending to 14° in some conditions, consistent with earlier work (Bowers et al., 2016) and more recent work (Blything et al., in press). We describe a simpler model that provides a better account of translation invariance.
Submitted 14 December, 2020; v1 submitted 10 December, 2020;
originally announced December 2020.
-
The human visual system and CNNs can both support robust online translation tolerance following extreme displacements
Authors:
Ryan Blything,
Valerio Biscione,
Ivan I. Vankov,
Casimir J. H. Ludwig,
Jeffrey S. Bowers
Abstract:
Visual translation tolerance refers to our capacity to recognize objects over a wide range of different retinal locations. Although translation is perhaps the simplest spatial transform that the visual system needs to cope with, the extent to which the human visual system can identify objects at previously unseen locations is unclear, with some studies reporting near-complete invariance over 10° and others reporting zero invariance at 4° of visual angle. Similarly, there is confusion regarding the extent of translation tolerance in computational models of vision, as well as the degree of match between human and model performance. Here we report a series of eye-tracking studies (total N=70) demonstrating that novel objects trained at one retinal location can be recognized at high accuracy rates following translations up to 18°. We also show that standard deep convolutional networks (DCNNs) support our findings when pretrained to classify another set of stimuli across a range of locations, or when a Global Average Pooling (GAP) layer is added to produce larger receptive fields. Our findings provide a strong constraint for theories of human vision and help explain inconsistent findings previously reported with CNNs.
Submitted 8 December, 2020; v1 submitted 27 September, 2020;
originally announced September 2020.