-
Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru
Authors:
Dunant Cusipuma,
David Ortega,
Victor Flores-Benites,
Arturo Deza
Abstract:
As multimodal foundational models start being deployed experimentally in self-driving cars, a reasonable question to ask is how similarly to humans these systems respond in certain driving situations, especially those that are out-of-distribution. To study this, we create the Robusto-1 dataset, which uses dashcam video data from Peru, a country with some of the worst (most aggressive) drivers in the world, a high traffic index, and a high ratio of bizarre to non-bizarre street objects likely never seen in training. In particular, to preliminarily test at a cognitive level how well Foundational Visual Language Models (VLMs) compare to humans in driving, we move away from bounding boxes, segmentation maps, occupancy maps, or trajectory estimation to multi-modal Visual Question Answering (VQA), comparing both humans and machines through a popular method in systems neuroscience known as Representational Similarity Analysis (RSA). Depending on the type of questions we ask and the answers these systems give, we show in which cases VLMs and humans converge or diverge, allowing us to probe their cognitive alignment. We find that the degree of alignment varies significantly depending on the type of questions asked to each type of system (humans vs. VLMs), highlighting a gap in their alignment.
Submitted 10 March, 2025;
originally announced March 2025.
-
Fair and Accurate Regression: Strong Formulations and Algorithms
Authors:
Anna Deza,
Andrés Gómez,
Alper Atamtürk
Abstract:
This paper introduces mixed-integer optimization methods to solve regression problems that incorporate fairness metrics. We propose an exact formulation for training fair regression models. To tackle this computationally hard problem, we study the polynomially-solvable single-factor and single-observation subproblems as building blocks and derive their closed convex hull descriptions. Strong formulations obtained for the general fair regression problem in this manner are utilized to solve the problem with a branch-and-bound algorithm exactly or as a relaxation to produce fair and accurate models rapidly. Moreover, to handle large-scale instances, we develop a coordinate descent algorithm motivated by the convex-hull representation of the single-factor fair regression problem to improve a given solution efficiently. Numerical experiments conducted on fair least squares and fair logistic regression problems show competitive statistical performance with state-of-the-art methods while significantly reducing training times.
Submitted 22 December, 2024;
originally announced December 2024.
-
Learn2Aggregate: Supervised Generation of Chvátal-Gomory Cuts Using Graph Neural Networks
Authors:
Arnaud Deza,
Elias B. Khalil,
Zhenan Fan,
Zirui Zhou,
Yong Zhang
Abstract:
We present $\textit{Learn2Aggregate}$, a machine learning (ML) framework for optimizing the generation of Chvátal-Gomory (CG) cuts in mixed integer linear programming (MILP). The framework trains a graph neural network to classify useful constraints for aggregation in CG cut generation. The ML-driven CG separator selectively focuses on a small set of impactful constraints, improving runtimes without compromising the strength of the generated cuts. Key to our approach is the formulation of a constraint classification task which favours sparse aggregation of constraints, consistent with empirical findings. This, in conjunction with a careful constraint labeling scheme and a hybrid of deep learning and feature engineering, results in enhanced CG cut generation across five diverse MILP benchmarks. On the largest test sets, our method closes roughly $\textit{twice}$ as much of the integrality gap as the standard CG method while running 40% faster. This performance improvement is due to our method eliminating 75% of the constraints prior to aggregation.
Submitted 10 September, 2024;
originally announced September 2024.
-
Fast Matrix Multiplication Without Tears: A Constraint Programming Approach
Authors:
Arnaud Deza,
Chang Liu,
Pashootan Vaezipoor,
Elias B. Khalil
Abstract:
It is known that the multiplication of an $N \times M$ matrix with an $M \times P$ matrix can be performed using fewer multiplications than what the naive $NMP$ approach suggests. The most famous instance of this is Strassen's algorithm for multiplying two $2\times 2$ matrices in 7 instead of 8 multiplications. This gives rise to the constraint satisfaction problem of fast matrix multiplication, where a set of $R < NMP$ multiplication terms must be chosen and combined such that they satisfy correctness constraints on the output matrix. Despite its highly combinatorial nature, this problem has not been exhaustively examined from that perspective, as evidenced for example by the recent deep reinforcement learning approach of AlphaTensor. In this work, we propose a simple yet novel Constraint Programming approach to find non-commutative algorithms for fast matrix multiplication or provide proof of infeasibility otherwise. We propose a set of symmetry-breaking constraints and valid inequalities that are particularly helpful in proving infeasibility. On the feasible side, we find that exploiting solver performance variability in conjunction with a sparsity-based problem decomposition enables finding solutions for larger (feasible) instances of fast matrix multiplication. Our experimental results using CP Optimizer demonstrate that we can find fast matrix multiplication algorithms for matrices up to $3\times 3$ in a short amount of time.
Submitted 17 July, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
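For illustration, the Strassen scheme mentioned in the abstract above can be sketched as follows. This is the standard textbook construction with 7 multiplication terms, not the paper's Constraint Programming model; the function names `strassen_2x2` and `naive_2x2` are hypothetical and chosen only for this sketch.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 scalar multiplications (Strassen)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The seven multiplication terms (R = 7 < NMP = 8).
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Linear combinations of the terms that reproduce each output entry.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


def naive_2x2(A, B):
    """Reference product using the naive 8 scalar multiplications."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(strassen_2x2(A, B) == naive_2x2(A, B))
```

The correctness constraints referred to in the abstract amount to requiring that such combinations of the chosen terms agree with the naive product for every input.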
-
Machine Learning for Cutting Planes in Integer Programming: A Survey
Authors:
Arnaud Deza,
Elias B. Khalil
Abstract:
We survey recent work on machine learning (ML) techniques for selecting cutting planes (or cuts) in mixed-integer linear programming (MILP). Despite the availability of various classes of cuts, the task of choosing a set of cuts to add to the linear programming (LP) relaxation at a given node of the branch-and-bound (B&B) tree has defied both formal and heuristic solutions to date. ML offers a promising approach for improving the cut selection process by using data to identify promising cuts that accelerate the solution of MILP instances. This paper presents an overview of the topic, highlighting recent advances in the literature, common approaches to data collection, evaluation, and ML model architectures. We analyze the empirical results in the literature in an attempt to quantify the progress that has been made and conclude by suggesting avenues for future research.
Submitted 31 October, 2023; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Joint rotational invariance and adversarial training of a dual-stream Transformer yields state of the art Brain-Score for Area V4
Authors:
William Berrios,
Arturo Deza
Abstract:
Modern high-scoring models of vision in the Brain-Score competition do not stem from Vision Transformers. However, in this paper, we provide evidence against the unexpected trend of Vision Transformers (ViT) not being perceptually aligned with human visual representations by showing how a dual-stream Transformer, a CrossViT$~\textit{a la}$ Chen et al. (2021), under a joint rotationally-invariant and adversarial optimization procedure yields 2nd place in the aggregate Brain-Score 2022 competition (Schrimpf et al., 2020b) averaged across all visual categories, and at the time of the competition held 1st place for the highest explainable variance of area V4. In addition, our current Transformer-based model also achieves greater explainable variance for areas V4, IT and Behaviour than a biologically-inspired CNN (ResNet50) that integrates a frontal V1-like computation module (Dapello et al., 2020). To assess the contribution of the optimization scheme with respect to the CrossViT architecture, we perform several additional experiments on differently optimized CrossViTs regarding adversarial robustness, common corruption benchmarks, mid-ventral stimuli interpretation and feature inversion. Against our initial expectations, our family of results provides tentative support for an $\textit{"All roads lead to Rome"}$ argument enforced via a joint optimization rule even for non-biologically-motivated models of vision such as Vision Transformers. Code is available at https://github.com/williamberrios/BrainScore-Transformers
Submitted 17 October, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks
Authors:
Anne Harrington,
Arturo Deza
Abstract:
Recent work suggests that representations learned by adversarially robust networks are more human perceptually-aligned than non-robust networks via image manipulations. Despite appearing closer to human visual perception, it is unclear if the constraints in robust DNN representations match biological constraints found in human vision. Human vision seems to rely on texture-based/summary statistic representations in the periphery, which have been shown to explain phenomena such as crowding and performance on visual search tasks. To understand how adversarially robust optimizations/representations compare to human vision, we performed a psychophysics experiment using a set of metameric discrimination tasks where we evaluated how well human observers could distinguish between images synthesized to match adversarially robust representations compared to non-robust representations and a texture synthesis model of peripheral vision (Texforms). We found that the discriminability of robust representation and texture model images decreased to near chance performance as stimuli were presented farther in the periphery. Moreover, performance on robust and texture-model images showed similar trends within participants, while performance on non-robust representations changed minimally across the visual field. These results together suggest that (1) adversarially robust representations capture peripheral computation better than non-robust representations and (2) robust representations capture peripheral computation similar to current state-of-the-art texture peripheral vision models. More broadly, our findings support the idea that localized texture summary statistic representations may drive human invariance to adversarial perturbations and that the incorporation of such representations in DNNs could give rise to useful properties like adversarial robustness.
Submitted 3 February, 2022; v1 submitted 1 February, 2022;
originally announced February 2022.
-
Safe Screening for Logistic Regression with $\ell_0$-$\ell_2$ Regularization
Authors:
Anna Deza,
Alper Atamturk
Abstract:
In logistic regression, it is often desirable to utilize regularization to promote sparse solutions, particularly for problems with a large number of features compared to available labels. In this paper, we present screening rules that safely remove features from logistic regression with $\ell_0-\ell_2$ regularization before solving the problem. The proposed safe screening rules are based on lower bounds from the Fenchel dual of strong conic relaxations of the logistic regression problem. Numerical experiments with real and synthetic data suggest that a high percentage of the features can be effectively and safely removed a priori, leading to substantial speed-up in the computations.
Submitted 1 February, 2022;
originally announced February 2022.
-
On the use of Cortical Magnification and Saccades as Biological Proxies for Data Augmentation
Authors:
Binxu Wang,
David Mayo,
Arturo Deza,
Andrei Barbu,
Colin Conwell
Abstract:
Self-supervised learning is a powerful way to learn useful representations from natural data. It has also been suggested as one possible means of building visual representation in humans, but the specific objective and algorithm are unknown. Currently, most self-supervised methods encourage the system to learn an invariant representation of different transformations of the same image in contrast to those of other images. However, such transformations are generally non-biologically plausible, and often consist of contrived perceptual schemes such as random cropping and color jittering. In this paper, we attempt to reverse-engineer these augmentations to be more biologically or perceptually plausible while still conferring the same benefits for encouraging robust representation. Critically, we find that random cropping can be substituted by cortical magnification, and saccade-like sampling of the image could also assist the representation learning. The feasibility of these transformations suggests a potential way that biological visual systems could implement self-supervision. Further, they break the widely accepted spatially-uniform processing assumption used in many computer vision algorithms, suggesting a role for spatially-adaptive computation in humans and machines alike. Our code and demo can be found here.
Submitted 14 December, 2021;
originally announced December 2021.
-
The Effects of Image Distribution and Task on Adversarial Robustness
Authors:
Owen Kunhardt,
Arturo Deza,
Tomaso Poggio
Abstract:
In this paper, we propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular $ε$-interval $[ε_0, ε_1]$ (interval of adversarial perturbation strengths) that facilitates unbiased comparisons across models when they have different initial $ε_0$ performance. This can be used to determine how adversarially robust a model is to different image distributions or tasks (or some other variable), and/or to measure how robust a model is compared to other models. We used this adversarial robustness metric on models of an MNIST, CIFAR-10, and a Fusion dataset (CIFAR-10 + MNIST) where trained models performed either a digit or object recognition task using a LeNet, ResNet50, or a fully connected network (FullyConnectedNet) architecture and found the following: 1) CIFAR-10 models are inherently less adversarially robust than MNIST models; 2) both the image distribution and task that a model is trained on can affect the adversarial robustness of the resultant model; 3) pretraining with a different image distribution and task sometimes carries over the adversarial robustness induced by that image distribution and task in the resultant model. Collectively, our results imply non-trivial differences of the learned representation space of one perceptual system over another given its exposure to different image statistics or tasks (mainly objects vs digits). Moreover, these results hold even when model systems are equalized to have the same level of performance, or when exposed to approximately matched image statistics of fusion images but with different tasks.
Submitted 21 February, 2021;
originally announced February 2021.
-
CUDA-Optimized real-time rendering of a Foveated Visual System
Authors:
Elian Malkin,
Arturo Deza,
Tomaso Poggio
Abstract:
The spatially-varying field of the human visual system has recently received a resurgence of interest with the development of virtual reality (VR) and neural networks. The computational demands of high resolution rendering desired for VR can be offset by savings in the periphery, while neural networks trained with foveated input have shown perceptual gains in i.i.d and o.o.d generalization. In this paper, we present a technique that exploits the CUDA GPU architecture to efficiently generate Gaussian-based foveated images at high definition (1920x1080 px) in real-time (165 Hz), with a larger number of pooling regions than previous Gaussian-based foveation algorithms by several orders of magnitude, producing a smoothly foveated image that requires no further blending or stitching, and that can be well fit for any contrast sensitivity function. The approach described can be adapted from Gaussian blurring to any eccentricity-dependent image processing and our algorithm can meet demand for experimentation to evaluate the role of spatially-varying processing across biological and artificial agents, so that foveation can be added easily on top of existing systems rather than forcing their redesign (emulated foveated renderer). Altogether, this paper demonstrates how a GPU, with a CUDA block-wise architecture, can be employed for radially-variant rendering, with opportunities for more complex post-processing to ensure a metameric foveation scheme. Code is provided.
Submitted 15 December, 2020;
originally announced December 2020.
-
Hierarchically Compositional Tasks and Deep Convolutional Networks
Authors:
Arturo Deza,
Qianli Liao,
Andrzej Banburski,
Tomaso Poggio
Abstract:
The main success stories of deep learning, starting with ImageNet, depend on deep convolutional networks, which on certain tasks perform significantly better than traditional shallow classifiers, such as support vector machines, and also better than deep fully connected networks; but what is so special about deep convolutional networks? Recent results in approximation theory proved an exponential advantage of deep convolutional networks with or without shared weights in approximating functions with hierarchical locality in their compositional structure. More recently, the hierarchical structure was proved to be hard to learn from data, suggesting that it is a powerful prior embedded in the architecture of the network. These mathematical results, however, do not say which real-life tasks correspond to input-output functions with hierarchical locality. To evaluate this, we consider a set of visual tasks where we disrupt the local organization of images via "deterministic scrambling" to later perform a visual task on these images structurally-altered in the same way for training and testing. For object recognition we find, as expected, that scrambling does not affect the performance of shallow or deep fully connected networks contrary to the out-performance of convolutional networks. Not all tasks involving images are however affected. Texture perception and global color estimation are much less sensitive to deterministic scrambling showing that the underlying functions corresponding to these tasks are not hierarchically local; and also counter-intuitively showing that these tasks are better approximated by networks that are not deep (texture) nor convolutional (color). Altogether, these results shed light into the importance of matching a network architecture with its embedded prior of the task to be learned.
Submitted 25 March, 2021; v1 submitted 24 June, 2020;
originally announced June 2020.
-
Emergent Properties of Foveated Perceptual Systems
Authors:
Arturo Deza,
Talia Konkle
Abstract:
The goal of this work is to characterize the representational impact that foveation operations have for machine vision systems, inspired by the foveated human visual system, which has higher acuity at the center of gaze and texture-like encoding in the periphery. To do so, we introduce models consisting of a first-stage \textit{fixed} image transform followed by a second-stage \textit{learnable} convolutional neural network, and we vary the first-stage component. The primary model has a foveated-textural input stage, which we compare to a model with foveated-blurred input and a model with spatially-uniform blurred input (both matched for perceptual compression), and a final reference model with minimal input-based compression. We find that: 1) the foveated-texture model shows similar scene classification accuracy to the reference model despite its compressed input, with greater i.i.d. generalization than the other models; 2) the foveated-texture model has greater sensitivity to high-spatial frequency information and greater robustness to occlusion relative to the comparison models; 3) both foveated systems show a stronger center image-bias relative to the spatially-uniform systems even with a weight sharing constraint. Critically, these results are preserved over different classical CNN architectures throughout their learning dynamics. Altogether, this suggests that foveation with peripheral texture-based computations yields an efficient, distinct, and robust representational format of scene information, and provides symbiotic computational insight into the representational consequences that texture-based peripheral encoding may have for processing in the human visual system, while also potentially inspiring the next generation of computer vision models via spatially-adaptive computation. Code + Data available here: https://github.com/ArturoDeza/EmergentProperties
Submitted 22 June, 2021; v1 submitted 14 June, 2020;
originally announced June 2020.
-
Assessment of Faster R-CNN in Man-Machine collaborative search
Authors:
Arturo Deza,
Amit Surana,
Miguel P. Eckstein
Abstract:
With the advent of modern expert systems driven by deep learning that supplement human experts (e.g. radiologists, dermatologists, surveillance scanners), we analyze how and when such expert systems enhance human performance in a fine-grained small target visual search task. We set up a 2-session factorial experimental design in which humans visually search for a target with and without a Deep Learning (DL) expert system. We evaluate human changes in target detection performance and eye movements in the presence of the DL system. We find that performance improvements with the DL system (computed via a Faster R-CNN with a VGG16) interact with the observer's perceptual abilities (e.g., sensitivity). The main results include: 1) the DL system reduces the False Alarm rate per Image on average across observer groups of both high/low sensitivity; 2) only human observers with high sensitivity perform better than the DL system, while the low sensitivity group does not surpass individual DL system performance, even when aided with the DL system itself; 3) increases in number of trials and decreases in viewing time were mainly driven by the DL system only for the low sensitivity group; 4) the DL system aids the human observer to fixate at a target by the 3rd fixation. These results provide insights into the benefits and limitations of deep learning systems that are collaborative or competitive with humans.
Submitted 4 April, 2019;
originally announced April 2019.
-
Hypergraphic Degree Sequences are Hard
Authors:
Antoine Deza,
Asaf Levin,
Syed M. Meesum,
Shmuel Onn
Abstract:
We show that deciding if a given vector is the degree sequence of a 3-hypergraph is NP-complete.
Submitted 8 January, 2019;
originally announced January 2019.
-
Optimization over Degree Sequences
Authors:
Antoine Deza,
Asaf Levin,
Syed M. Meesum,
Shmuel Onn
Abstract:
We introduce and study the problem of optimizing arbitrary functions over degree sequences of hypergraphs and multihypergraphs. We show that over multihypergraphs the problem can be solved in polynomial time. For hypergraphs, we show that deciding if a given sequence is the degree sequence of a 3-hypergraph is NP-complete, thereby solving a 30 year long open problem. This implies that optimization over hypergraphs is hard already for simple concave functions. In contrast, we show that for graphs, if the functions at vertices are the same, then the problem is polynomial time solvable. We also provide positive results for convex optimization over multihypergraphs and graphs and exploit connections to degree sequence polytopes and threshold graphs. We then elaborate on connections to the emerging theory of shifted combinatorial optimization.
Submitted 15 August, 2018; v1 submitted 13 June, 2017;
originally announced June 2017.
-
Towards Metamerism via Foveated Style Transfer
Authors:
Arturo Deza,
Aditya Jonnalagadda,
Miguel Eckstein
Abstract:
The problem of $\textit{visual metamerism}$ is defined as finding a family of perceptually indistinguishable, yet physically different images. In this paper, we propose our NeuroFovea metamer model, a foveated generative model that is based on a mixture of peripheral representations and style transfer forward-pass algorithms. Our gradient-descent free model is parametrized by a foveated VGG19 encoder-decoder which allows us to encode images in high dimensional space and interpolate between the content and texture information with adaptive instance normalization anywhere in the visual field. Our contributions include: 1) A framework for computing metamers that resembles a noisy communication system via a foveated feed-forward encoder-decoder network, where we observe that metamerism arises as a byproduct of noisy perturbations that partially lie in the perceptual null space; 2) A perceptual optimization scheme as a solution to the hyperparametric nature of our metamer model, which requires tuning of the image-texture tradeoff coefficients everywhere in the visual field that are a consequence of internal noise; 3) An ABX psychophysical evaluation of our metamers where we also find that the rate of growth of the receptive fields in our model matches V1 for reference metamers and V2 between synthesized samples. Our model also renders metamers at roughly a second, presenting a $\times1000$ speed-up compared to the previous work, which allows for tractable data-driven metamer experiments.
Submitted 28 December, 2018; v1 submitted 29 May, 2017;
originally announced May 2017.
-
Computational determination of the largest lattice polytope diameter
Authors:
Nathan Chadder,
Antoine Deza
Abstract:
A lattice (d, k)-polytope is the convex hull of a set of points in dimension d whose coordinates are integers between 0 and k. Let δ(d, k) be the largest diameter over all lattice (d, k)-polytopes. We develop a computational framework to determine δ(d, k) for small instances. We show that δ(3, 4) = 7 and δ(3, 5) = 9; that is, we verify for (d, k) = (3, 4) and (3, 5) the conjecture whereby δ(d, k) is at most (k + 1)d/2 and is achieved, up to translation, by a Minkowski sum of lattice vectors.
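As a quick arithmetic check of the conjectured bound against the two values reported above, the sketch below evaluates (k + 1)d/2 and rounds down. The rounding is an assumption made here because a diameter is an integer; the function name `conjectured_bound` is illustrative and not taken from the paper.

```python
def conjectured_bound(d: int, k: int) -> int:
    """Integer part of (k + 1) * d / 2, the conjectured upper bound on delta(d, k)."""
    return ((k + 1) * d) // 2


# Values of delta(d, k) reported in the abstract.
reported = {(3, 4): 7, (3, 5): 9}

for (d, k), delta in reported.items():
    bound = conjectured_bound(d, k)
    print(f"delta({d}, {k}) = {delta}, floor((k + 1)d / 2) = {bound}, tight: {delta == bound}")
```

For both reported instances the value of δ(d, k) coincides with the rounded-down bound, which is what "verifying the conjecture" amounts to for these two cases.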
Submitted 5 April, 2017;
originally announced April 2017.
-
Attention Allocation Aid for Visual Search
Authors:
Arturo Deza,
Jeffrey R. Peters,
Grant S. Taylor,
Amit Surana,
Miguel P. Eckstein
Abstract:
This paper outlines the development and testing of a novel, feedback-enabled attention allocation aid (AAAD), which uses real-time physiological data to improve human performance in a realistic sequential visual search task. By optimizing over search duration, the aid improves efficiency while preserving decision accuracy as the operator identifies and classifies targets within simulated aerial imagery. Specifically, using experimental eye-tracking data and measurements of target detectability across the human visual field, we develop functional models of detection accuracy as a function of search time, number of eye movements, scan path, and image clutter. These models are then used by the AAAD in conjunction with real-time eye position data to make probabilistic estimations of attained search accuracy and to recommend that the observer either move on to the next image or continue exploring the present image. An experimental evaluation in a scenario motivated by human supervisory control in surveillance missions confirms the benefits of the AAAD.
Submitted 14 January, 2017;
originally announced January 2017.
-
Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
Authors:
Arturo Deza,
Miguel P. Eckstein
Abstract:
Previous studies have proposed image-based clutter measures that correlate with human search times and/or eye movements. However, most models do not take into account the fact that the effects of clutter interact with the foveated nature of the human visual system: visual clutter farther from the fovea has an increasingly detrimental influence on perception. Here, we introduce a new foveated clutter model to predict the detrimental effects in target search utilizing a forced fixation search task. We use Feature Congestion (Rosenholtz et al.) as our non-foveated clutter model, and we stack a peripheral architecture on top of Feature Congestion for our foveated model. We introduce the Peripheral Integration Feature Congestion (PIFC) coefficient as a fundamental ingredient of our model that modulates clutter as a non-linear gain contingent on eccentricity. Finally, we show that Foveated Feature Congestion (FFC) clutter scores (r(44) = -0.82) correlate better with target detection (hit rate) than regular Feature Congestion scores (r(44) = -0.19) in forced fixation search. Thus, our model allows us to enrich clutter perception research by computing fixation-specific clutter maps. A toolbox for creating peripheral architectures, Piranhas (Peripheral Architectures for Natural, Hybrid and Artificial Systems), will be made available.
Submitted 13 August, 2016;
originally announced August 2016.
-
Understanding Image Virality
Authors:
Arturo Deza,
Devi Parikh
Abstract:
Virality of online content on social networking websites is an important but esoteric phenomenon often studied in fields like marketing, psychology and data mining. In this paper we study viral images from a computer vision perspective. We introduce three new image datasets from Reddit, and define a virality score using Reddit metadata. We train classifiers with state-of-the-art image features to predict virality of individual images, relative virality in pairs of images, and the dominant topic of a viral image. We also compare machine performance to human performance on these tasks. We find that computers perform poorly with low-level features, and that high-level information is critical for predicting virality. We encode semantic information through relative attributes. We identify the 5 key visual attributes that correlate with virality. We create an attribute-based characterization of images that can predict relative virality with 68.10% accuracy (SVM+Deep Relative Attributes), better than humans at 60.12%. Finally, we study how human prediction of image virality varies with the different 'contexts' in which the images are viewed, such as the influence of neighbouring images, images recently viewed, and the image title or caption. This work is a first step in understanding the complex but important phenomenon of image virality. Our datasets and annotations will be made publicly available.
Submitted 26 May, 2015; v1 submitted 8 March, 2015;
originally announced March 2015.
-
Chance Constrained Optimization for Targeted Internet Advertising
Authors:
Antoine Deza,
Kai Huang,
Michael R. Metel
Abstract:
We introduce a chance constrained optimization model for the fulfillment of guaranteed display Internet advertising campaigns. The proposed formulation for the allocation of display inventory takes into account the uncertainty of the supply of Internet viewers. We discuss and present theoretical and computational features of the model via Monte Carlo sampling and convex approximations. Theoretical upper and lower bounds are presented along with a numerical substantiation.
Submitted 29 July, 2014;
originally announced July 2014.
-
A combinatorial approach to colourful simplicial depth
Authors:
Antoine Deza,
Frédéric Meunier,
Pauline Sarrabezolles
Abstract:
The colourful simplicial depth conjecture states that any point in the convex hull of each of d+1 sets, or colours, of d+1 points in general position in R^d is contained in at least d^2+1 simplices with one vertex from each set. We verify the conjecture in dimension 4 and strengthen the known lower bounds in higher dimensions. These results are obtained using a combinatorial generalization of colourful point configurations called octahedral systems. We present properties of octahedral systems generalizing earlier results on colourful point configurations and exhibit an octahedral system which cannot arise from a colourful point configuration. The number of octahedral systems is also given.
Submitted 16 March, 2013; v1 submitted 19 December, 2012;
originally announced December 2012.
-
Computational Lower Bounds for Colourful Simplicial Depth
Authors:
Antoine Deza,
Tamon Stephen,
Feng Xie
Abstract:
The colourful simplicial depth problem in dimension d is to find a configuration of (d+1) sets of (d+1) points such that the origin is contained in the convex hull of each set (colour) but contained in a minimal number of colourful simplices generated by taking one point from each set. A construction attaining d^2+1 simplices is known, and is conjectured to be minimal. This has been confirmed up to d=3; however, the best known lower bound for d at least 4 is ((d+1)^2)/2.
A promising method to improve this lower bound is to look at combinatorial octahedral systems generated by such configurations. The difficulty in employing this approach is handling the many symmetric configurations that arise. We propose a table of invariants which excludes many partial configurations, and use this to improve the lower bound in dimension 4.
Submitted 29 October, 2012;
originally announced October 2012.
-
A further generalization of the colourful Carathéodory theorem
Authors:
Frédéric Meunier,
Antoine Deza
Abstract:
Given $d+1$ sets, or colours, $S_1, S_2,...,S_{d+1}$ of points in $\mathbb{R}^d$, a {\em colourful} set is a set $S\subseteq\bigcup_i S_i$ such that $|S\cap S_i|\leq 1$ for $i=1,...,d+1$. The convex hull of a colourful set $S$ is called a {\em colourful simplex}. Bárány's colourful Carathéodory theorem asserts that if the origin 0 is contained in the convex hull of $S_i$ for $i=1,...,d+1$, then there exists a colourful simplex containing 0. The sufficient condition for the existence of a colourful simplex containing 0 was generalized to 0 being contained in the convex hull of $S_i\cup S_j$ for $1\leq i< j \leq d+1$ by Arocha et al. and by Holmsen et al. We further generalize the sufficient condition and obtain new colourful Carathéodory theorems. We also give an algorithm to find a colourful simplex containing 0 under the generalized condition. In the plane an alternative, and more general, proof using graphs is given. In addition, we observe that any condition implying the existence of a colourful simplex containing 0 actually implies the existence of $\min_i|S_i|$ such simplices.
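To make the statement of Bárány's theorem concrete in the plane (d = 2), the following sketch enumerates every colourful triangle for three small colour sets and keeps those containing the origin. This brute-force enumeration is only an illustration and is not the algorithm given in the paper; the point sets and the function names are hypothetical, and the containment test assumes non-degenerate triangles.

```python
from itertools import product

Point = tuple[int, int]
ORIGIN: Point = (0, 0)


def cross(o: Point, a: Point, b: Point) -> int:
    """Cross product of the vectors (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def triangle_contains_origin(a: Point, b: Point, c: Point) -> bool:
    """True if the origin lies in the closed triangle abc (assumed non-degenerate)."""
    signs = (cross(a, b, ORIGIN), cross(b, c, ORIGIN), cross(c, a, ORIGIN))
    # The origin is inside when it is never strictly left of one edge
    # and strictly right of another.
    return not (min(signs) < 0 and max(signs) > 0)


# Three illustrative colour sets, each a triangle around the origin.
colours = [
    [(1, 0), (-1, 1), (-1, -1)],
    [(0, 2), (-2, -1), (2, -1)],
    [(3, 1), (-3, 1), (1, -3)],
]

# Hypothesis of the theorem: the origin lies in the convex hull of each colour.
assert all(triangle_contains_origin(*s) for s in colours)

# A colourful simplex takes exactly one point from each colour.
colourful = [t for t in product(*colours) if triangle_contains_origin(*t)]
print(f"{len(colourful)} colourful triangles contain the origin, e.g. {colourful[0]}")
```

The theorem guarantees that the list `colourful` is non-empty whenever the hypothesis check passes; the enumeration costs one containment test per colourful triple, so it is only practical for small sets.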
Submitted 6 March, 2014; v1 submitted 18 July, 2011;
originally announced July 2011.