Showing 1–50 of 74 results for author: Kanan, C

  1. arXiv:2506.11449  [pdf, ps, other]

    cs.LG

    Dynamic Sparse Training of Diagonally Sparse Networks

    Authors: Abhishek Tyagi, Arjun Iyer, William H Renninger, Christopher Kanan, Yuhao Zhu

    Abstract: Recent advances in Dynamic Sparse Training (DST) have pushed the frontier of sparse neural network training in structured and unstructured contexts, matching dense-model performance while drastically reducing parameter counts to facilitate model scaling. However, unstructured sparsity often fails to translate into practical speedups on modern hardware. To address this shortcoming, we propose DynaD…

    Submitted 13 June, 2025; originally announced June 2025.

  2. arXiv:2506.00588  [pdf, ps, other]

    cs.LG cs.AI

    Temporal Chunking Enhances Recognition of Implicit Sequential Patterns

    Authors: Jayanta Dey, Nicholas Soures, Miranda Gonzales, Itamar Lerner, Christopher Kanan, Dhireesha Kudithipudi

    Abstract: In this pilot study, we propose a neuro-inspired approach that compresses temporal sequences into context-tagged chunks, where each tag represents a recurring structural unit or "community" in the sequence. These tags are generated during an offline sleep phase and serve as compact references to past experience, allowing the learner to incorporate information beyond its immediate input range. We…

    Submitted 31 May, 2025; originally announced June 2025.

  3. arXiv:2503.06385  [pdf, other]

    cs.LG cs.CV

    A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization

    Authors: Md Yousuf Harun, Christopher Kanan

    Abstract: To adapt to real-world data streams, continual learning (CL) systems must rapidly learn new concepts while preserving and utilizing prior knowledge. When it comes to adding new information to continually-trained deep neural networks (DNNs), classifier weights for newly encountered categories are typically initialized randomly, leading to high initial training loss (spikes) and instability. Consequ…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: Preprint

  4. arXiv:2502.10691  [pdf, other]

    cs.LG

    Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

    Authors: Md Yousuf Harun, Jhair Gallardo, Christopher Kanan

    Abstract: Out-of-distribution (OOD) detection and OOD generalization are widely studied in Deep Neural Networks (DNNs), yet their relationship remains poorly understood. We empirically show that the degree of Neural Collapse (NC) in a network layer is inversely related with these objectives: stronger NC improves OOD detection but degrades generalization, while weaker NC enhances generalization at the cost o…

    Submitted 26 May, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

    Comments: ICML 2025

  5. arXiv:2412.02012  [pdf, other]

    eess.IV cs.AI cs.CV

    INSIGHT: Explainable Weakly-Supervised Medical Image Analysis

    Authors: Wenbo Zhang, Junyu Chen, Christopher Kanan

    Abstract: Due to their large sizes, volumetric scans and whole-slide pathology images (WSIs) are often processed by extracting embeddings from local regions and then an aggregator makes predictions from this set. However, current methods require post-hoc visualization techniques (e.g., Grad-CAM) and often fail to localize small yet clinically crucial details. To address these limitations, we introduce INSIG…

    Submitted 8 December, 2024; v1 submitted 2 December, 2024; originally announced December 2024.

  6. arXiv:2410.19925  [pdf, other]

    cs.CL cs.CV cs.LG

    Improving Multimodal Large Language Models Using Continual Learning

    Authors: Shikhar Srivastava, Md Yousuf Harun, Robik Shrestha, Christopher Kanan

    Abstract: Generative large language models (LLMs) exhibit impressive capabilities, which can be further augmented by integrating a pre-trained vision model into the original LLM to create a multimodal LLM (MLLM). However, this integration often significantly decreases performance on natural language understanding and generation tasks, compared to the original LLM. This study investigates this issue using th…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: NeurIPS 2024 Workshop on Scalable Continual Learning for Lifelong Foundation Models

  7. arXiv:2409.08832  [pdf, other]

    cs.LG

    Can KANs (re)discover predictive models for Direct-Drive Laser Fusion?

    Authors: Rahman Ejaz, Varchas Gopalaswamy, Riccardo Betti, Aarne Lees, Christopher Kanan

    Abstract: The domain of laser fusion presents a unique and challenging predictive modeling application landscape for machine learning methods due to high problem complexity and limited training data. Data-driven approaches utilizing prescribed functional forms, inductive biases and physics-informed learning (PIL) schemes have been successful in the past for achieving desired generalization ability and model…

    Submitted 13 September, 2024; originally announced September 2024.

  8. arXiv:2408.05334  [pdf, other]

    cs.AI cs.CL cs.CV

    Revisiting Multi-Modal LLM Evaluation

    Authors: Jian Lu, Shikhar Srivastava, Junyu Chen, Robik Shrestha, Manoj Acharya, Kushal Kafle, Christopher Kanan

    Abstract: With the advent of multi-modal large language models (MLLMs), datasets used for visual question answering (VQA) and referring expression comprehension have seen a resurgence. However, the most popular datasets used to evaluate MLLMs are some of the earliest ones created, and they have many known problems, including extreme bias, spurious correlations, and an inability to permit fine-grained analys…

    Submitted 9 August, 2024; originally announced August 2024.

  9. arXiv:2405.15018  [pdf, other]

    cs.LG cs.AI cs.CV

    What Variables Affect Out-of-Distribution Generalization in Pretrained Models?

    Authors: Md Yousuf Harun, Kyungbok Lee, Jhair Gallardo, Giri Krishnan, Christopher Kanan

    Abstract: Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary widely. We study the factors influencing transferability and out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which is closely related to intermediate neural collapse. This hypothesis suggests t…

    Submitted 25 October, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to NeurIPS 2024

  10. arXiv:2405.10254  [pdf, other]

    eess.IV cs.CV cs.LG

    PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology

    Authors: George Shaikovski, Adam Casson, Kristen Severson, Eric Zimmermann, Yi Kan Wang, Jeremy D. Kunz, Juan A. Retamero, Gerard Oakley, David Klimstra, Christopher Kanan, Matthew Hanna, Michal Zelechowski, Julian Viret, Neil Tenenholtz, James Hall, Nicolo Fusi, Razik Yousfi, Peter Hamilton, William A. Moye, Eugene Vorontsov, Siqi Liu, Thomas J. Fuchs

    Abstract: Foundation models in computational pathology promise to unlock the development of new clinical decision support systems and models for precision medicine. However, there is a mismatch between most clinical analysis, which is defined at the level of one or more whole slide images, and foundation models to date, which process the thousands of image tiles contained in a whole slide image separately.…

    Submitted 22 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  11. arXiv:2312.14441  [pdf, other]

    eess.SY cs.LG

    DMC4ML: Data Movement Complexity for Machine Learning

    Authors: Chen Ding, Christopher Kanan, Dylan McKellips, Toranosuke Ozawa, Arian Shahmirza, Wesley Smith

    Abstract: The greatest demand for today's computing is machine learning. This paper analyzes three machine learning algorithms: transformers, spatial convolution, and FFT. The analysis is novel in three aspects. First, it measures the cost of memory access on an abstract memory hierarchy, instead of traditional time or space complexity. Second, the analysis is asymptotic and identifies the primary sources o…

    Submitted 22 December, 2023; originally announced December 2023.

  12. arXiv:2312.12716  [pdf, other]

    cs.CV cs.CL cs.LG

    BloomVQA: Assessing Hierarchical Multi-modal Comprehension

    Authors: Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran

    Abstract: We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks. Unlike current benchmarks that often focus on fact-based memorization and simple reasoning tasks without theoretical grounding, we collect multiple-choice samples based on picture stories that reflect different levels of comprehension, as laid out in Bloom's Taxo…

    Submitted 10 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by ACL Findings (2024). Dataset available at https://huggingface.co/datasets/ygong/BloomVQA

  13. arXiv:2311.11908  [pdf, other]

    cs.LG cs.AI cs.CV

    Continual Learning: Applications and the Road Forward

    Authors: Eli Verwimp, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu, Alexander Gepperth, Tyler L. Hayes, Eyke Hüllermeier, Christopher Kanan, Dhireesha Kudithipudi, Christoph H. Lampert, Martin Mundt, Razvan Pascanu, Adrian Popescu, Andreas S. Tolias, Joost van de Weijer, Bing Liu, Vincenzo Lomonaco, Tinne Tuytelaars, Gido M. van de Ven

    Abstract: Continual learning is a subfield of machine learning, which aims to allow machine learning models to continuously learn on new data, by accumulating knowledge without forgetting what was learned in the past. In this work, we take a step back, and ask: "Why should one care about continual learning in the first place?". We set the stage by examining recent continual learning papers published at four…

    Submitted 28 March, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

    Journal ref: Transactions on Machine Learning Research (TMLR), 2024

  14. arXiv:2310.12887  [pdf, other]

    cs.HC

    Spatial and Temporal Attention-based emotion estimation on HRI-AVC dataset

    Authors: Karthik Subramanian, Saurav Singh, Justin Namba, Jamison Heard, Christopher Kanan, Ferat Sahin

    Abstract: Many attempts have been made at estimating discrete emotions (calmness, anxiety, boredom, surprise, anger) and continuous emotional measures commonly used in psychology, namely 'valence' (the pleasantness of the emotion being displayed) and 'arousal' (the intensity of the emotion being displayed). Existing methods to estimate arousal and valence rely on learning from data sets, where an expert ann…

    Submitted 19 October, 2023; originally announced October 2023.

  15. arXiv:2309.07778  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.TO

    Virchow: A Million-Slide Digital Pathology Foundation Model

    Authors: Eugene Vorontsov, Alican Bozkurt, Adam Casson, George Shaikovski, Michal Zelechowski, Siqi Liu, Kristen Severson, Eric Zimmermann, James Hall, Neil Tenenholtz, Nicolo Fusi, Philippe Mathieu, Alexander van Eck, Donghun Lee, Julian Viret, Eric Robert, Yi Kan Wang, Jeremy D. Kunz, Matthew C. H. Lee, Jan Bernhard, Ran A. Godrich, Gerard Oakley, Ewan Millar, Matthew Hanna, Juan Retamero, et al. (6 additional authors not shown)

    Abstract: The use of artificial intelligence to enable precision medicine and decision support systems through the analysis of pathology images has the potential to revolutionize the diagnosis and treatment of cancer. Such applications will depend on models' abilities to capture the diverse patterns observed in pathology images. To address this challenge, we present Virchow, a foundation model for computati…

    Submitted 17 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  16. arXiv:2308.13646  [pdf, other]

    cs.LG cs.CL cs.CV

    GRASP: A Rehearsal Policy for Efficient Online Continual Learning

    Authors: Md Yousuf Harun, Jhair Gallardo, Junyu Chen, Christopher Kanan

    Abstract: Continual learning (CL) in deep neural networks (DNNs) involves incrementally accumulating knowledge in a DNN from a growing data stream. A major challenge in CL is that non-stationary data streams cause catastrophic forgetting of previously learned abilities. A popular solution is rehearsal: storing past observations in a buffer and then sampling the buffer to update the DNN. Uniform sampling in…

    Submitted 1 May, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted to the Conference on Lifelong Learning Agents (CoLLAs) 2024

  17. arXiv:2306.06254  [pdf, other]

    cs.CV cs.LG eess.IV

    Understanding the Benefits of Image Augmentations

    Authors: Matthew Iceland, Christopher Kanan

    Abstract: Image Augmentations are widely used to reduce overfitting in neural networks. However, the explainability of their benefits largely remains a mystery. We study which layers of residual neural networks (ResNets) are most affected by augmentations using Centered Kernel Alignment (CKA). We do so by analyzing models of varying widths and depths, as well as whether their weights are initialized randoml…

    Submitted 9 June, 2023; originally announced June 2023.

  18. arXiv:2306.01904  [pdf, other]

    cs.CV cs.LG

    Overcoming the Stability Gap in Continual Learning

    Authors: Md Yousuf Harun, Christopher Kanan

    Abstract: Pre-trained deep neural networks (DNNs) are being widely deployed by industry for making business decisions and to serve users; however, a major problem is model decay, where the DNN's predictions become more erroneous over time, resulting in revenue loss or unhappy users. To mitigate model decay, DNNs are retrained from scratch using old and new data. This is computationally expensive, so retrain…

    Submitted 16 September, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Accepted to TMLR 2024

  19. arXiv:2305.04923  [pdf, other]

    cs.CV cs.AI

    Learning to Evaluate the Artness of AI-generated Images

    Authors: Junyu Chen, Jie An, Hanjia Lyu, Christopher Kanan, Jiebo Luo

    Abstract: Assessing the artness of AI-generated images continues to be a challenge within the realm of image generation. Most existing metrics cannot be used to perform instance-level and reference-free artness evaluation. This paper presents ArtScore, a metric designed to evaluate the degree to which an image resembles authentic artworks by artists (or conversely photographs), thereby offering a novel appr…

    Submitted 9 June, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Published in IEEE Transactions on Multimedia

  20. arXiv:2303.18171  [pdf, other]

    cs.CV cs.AI cs.LG

    How Efficient Are Today's Continual Learning Algorithms?

    Authors: Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Christopher Kanan

    Abstract: Supervised Continual learning involves updating a deep neural network (DNN) from an ever-growing stream of labeled data. While most work has focused on overcoming catastrophic forgetting, one of the major motivations behind continual learning is being able to efficiently update a network with new information, rather than retraining from scratch on the training dataset as it grows over time. Despit…

    Submitted 3 April, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: To appear in the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPR-W) on Continual Learning in Computer Vision (CLVision) 2023

  21. arXiv:2303.10725  [pdf, other]

    cs.CV cs.LG

    SIESTA: Efficient Online Continual Learning with Sleep

    Authors: Md Yousuf Harun, Jhair Gallardo, Tyler L. Hayes, Ronald Kemker, Christopher Kanan

    Abstract: In supervised continual learning, a deep neural network (DNN) is updated with an ever-growing data stream. Unlike the offline setting where data is shuffled, we cannot make any distributional assumptions about the data stream. Ideally, only one pass through the dataset is needed for computational efficiency. However, existing methods are inadequate and make many assumptions that cannot be made for…

    Submitted 2 November, 2023; v1 submitted 19 March, 2023; originally announced March 2023.

    Comments: Accepted to TMLR 2023

  22. System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games

    Authors: Indranil Sur, Zachary Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron E. Taylor, Mustafa Burak Gurbuz, James Smith, Sahana Joshi, Nathalie Japkowicz, Michael Baron, Zsolt Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, Michael Piacentino, Jesse Hostetler, Aswin Raghavan

    Abstract: As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new ta…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: The Second International Conference on AIML Systems, October 12–15, 2022, Bangalore, India

  23. arXiv:2211.12981  [pdf, other]

    cs.CV cs.MM

    Holistic Visual-Textual Sentiment Analysis with Prior Models

    Authors: Junyu Chen, Jie An, Hanjia Lyu, Christopher Kanan, Jiebo Luo

    Abstract: Visual-textual sentiment analysis aims to predict sentiment with the input of a pair of image and text, which poses a challenge in learning effective features for diverse input images. To address this, we propose a holistic method that achieves robust visual-textual sentiment analysis by exploiting a rich set of powerful pre-trained visual and textual prior models. The proposed method consists of…

    Submitted 9 June, 2024; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: Published in MIPR 2024

  24. arXiv:2210.08403  [pdf, other]

    cs.CV cs.AI

    Semantic Segmentation with Active Semi-Supervised Representation Learning

    Authors: Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman

    Abstract: Obtaining human per-pixel labels for semantic segmentation is incredibly laborious, often making labeled dataset construction prohibitively expensive. Here, we endeavor to overcome this problem with a novel algorithm that combines semi-supervised and active learning, resulting in the ability to train an effective semantic segmentation algorithm with significantly less labeled data. To do this, w…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: To appear in the British Machine Vision Conference (BMVC-2022)

  25. arXiv:2205.01947  [pdf, other]

    cs.CV cs.HC cs.RO

    EllSeg-Gen, towards Domain Generalization for head-mounted eyetracking

    Authors: Rakshit S. Kothari, Reynold J. Bailey, Christopher Kanan, Jeff B. Pelz, Gabriel J. Diaz

    Abstract: The study of human gaze behavior in natural contexts requires algorithms for gaze estimation that are robust to a wide range of imaging conditions. However, algorithms often fail to identify features such as the iris and pupil centroid in the presence of reflective artifacts and occlusions. Previous work has shown that convolutional networks excel at extracting gaze features despite the presence o…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Code available at https://bitbucket.org/RSKothari/multiset_gaze/

  26. arXiv:2204.02426  [pdf, other]

    cs.LG

    OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses

    Authors: Robik Shrestha, Kushal Kafle, Christopher Kanan

    Abstract: Dataset bias and spurious correlations can significantly impair generalization in deep neural networks. Many prior efforts have addressed this problem using either alternative loss functions or sampling strategies that focus on rare patterns. We propose a new direction: modifying the network architecture to impose inductive biases that make the network robust to dataset bias. Specifically, we prop…

    Submitted 14 April, 2024; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: ECCV 2022

  27. arXiv:2203.10730  [pdf, other]

    cs.CV cs.AI

    Semantic Segmentation with Active Semi-Supervised Learning

    Authors: Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman

    Abstract: Using deep learning, we now have the ability to create exceptionally good semantic segmentation systems; however, collecting the prerequisite pixel-wise annotations for training images remains expensive and time-consuming. Therefore, it would be ideal to minimize the number of human annotations needed when creating a new dataset. Here, we address this problem by proposing a novel algorithm that co…

    Submitted 15 October, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: To appear in the Winter Conference on Applications of Computer Vision (WACV-2023)

  28. arXiv:2203.10681  [pdf, other]

    cs.LG cs.AI

    Online Continual Learning for Embedded Devices

    Authors: Tyler L. Hayes, Christopher Kanan

    Abstract: Real-time on-device continual learning is needed for new applications such as home robots, user personalization on smartphones, and augmented/virtual reality headsets. However, this setting poses unique challenges: embedded devices have limited memory and compute capacity and conventional machine learning models suffer from catastrophic forgetting when updated on non-stationary data streams. While…

    Submitted 15 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: To appear in the Conference on Lifelong Learning Agents (CoLLAs-2022)

  29. arXiv:2203.06215  [pdf, other]

    cs.CV cs.AI

    Can I see an Example? Active Learning the Long Tail of Attributes and Relations

    Authors: Tyler L. Hayes, Maximilian Nickel, Christopher Kanan, Ludovic Denoyer, Arthur Szlam

    Abstract: There has been significant progress in creating machine learning models that identify objects in scenes along with their associated attributes and relationships; however, there is a large gap between the best models and human capabilities. One of the major reasons for this gap is the difficulty in collecting sufficient amounts of annotated relations and attributes for training these systems. While…

    Submitted 7 October, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: To appear in the British Machine Vision Conference (BMVC-2022)

  30. arXiv:2202.05930  [pdf, other]

    cs.CV cs.AI

    Detecting out-of-context objects using contextual cues

    Authors: Manoj Acharya, Anirban Roy, Kaushik Koneripalli, Susmit Jha, Christopher Kanan, Ajay Divakaran

    Abstract: This paper presents an approach to detect out-of-context (OOC) objects in an image. Given an image with a set of objects, our goal is to determine if an object is inconsistent with the scene context and detect the OOC object with a bounding box. In this work, we consider commonly explored contextual relations such as co-occurrence relations, the relative size of an object with respect to other obj…

    Submitted 11 February, 2022; originally announced February 2022.

    Journal ref: IJCAI-ECAI 2022

  31. arXiv:2110.13064  [pdf, other]

    cs.CV cs.AI

    2nd Place Solution for SODA10M Challenge 2021 -- Continual Detection Track

    Authors: Manoj Acharya, Christopher Kanan

    Abstract: In this technical report, we present our approaches for the continual object detection track of the SODA10M challenge. We adapt ResNet50-FPN as the baseline and try several improvements for the final submission model. We find that a task-specific replay scheme, learning rate scheduling, model calibration, and using the original image scale help to improve performance for both large and small objects in…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: Published in SSLAD workshop at ICCV 2021

  32. arXiv:2107.05445  [pdf, other]

    cs.CV cs.AI cs.LG

    Disentangling Transfer and Interference in Multi-Domain Learning

    Authors: Yipeng Zhang, Tyler L. Hayes, Christopher Kanan

    Abstract: Humans are incredibly good at transferring knowledge from one domain to another, enabling rapid learning of new tasks. Likewise, transfer learning has enabled enormous success in many computer vision problems using pretraining. However, the benefits of transfer in multi-domain learning, where a network learns multiple tasks defined by different datasets, have not been adequately studied. Learning m…

    Submitted 14 January, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: AAAI 2022 PracticalDL Workshop

  33. arXiv:2106.15475  [pdf, other]

    cs.CV

    How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?

    Authors: Bidur Khanal, Christopher Kanan

    Abstract: Incorrectly labeled examples, or label noise, are common in real-world computer vision datasets. While the impact of label noise on learning in deep neural networks has been studied in prior work, these studies have exclusively focused on homogeneous label noise, i.e., the degree of label noise is the same across all categories. However, in the real world, label noise is often heterogeneous, with s…

    Submitted 26 September, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

  34. arXiv:2104.04132  [pdf, other]

    q-bio.NC cs.AI cs.LG

    Replay in Deep Learning: Current Approaches and Missing Biological Elements

    Authors: Tyler L. Hayes, Giri P. Krishnan, Maxim Bazhenov, Hava T. Siegelmann, Terrence J. Sejnowski, Christopher Kanan

    Abstract: Replay is the reactivation of one or more neural patterns, which are similar to the activation patterns experienced during past waking experiences. Replay was first observed in biological neural networks during sleep, and it is now thought to play a critical role in memory formation, retrieval, and consolidation. Replay-like mechanisms have been incorporated into deep artificial neural networks th…

    Submitted 28 May, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: Accepted for publication in the MIT Press journal of Neural Computation

  35. arXiv:2104.00405  [pdf, other]

    cs.LG cs.AI cs.CV

    Avalanche: an End-to-End Library for Continual Learning

    Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Andrea Cossu, Antonio Carta, Gabriele Graffieti, Tyler L. Hayes, Matthias De Lange, Marc Masana, Jary Pomponi, Gido van de Ven, Martin Mundt, Qi She, Keiland Cooper, Jeremy Forest, Eden Belouadah, Simone Calderara, German I. Parisi, Fabio Cuzzolin, Andreas Tolias, Simone Scardapane, Luca Antiga, Subutai Amhad, Adrian Popescu, Christopher Kanan, Joost van de Weijer, et al. (3 additional authors not shown)

    Abstract: Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning. Recently, we have witnessed a renewed and fast-growing interest in continual learning, especially within the deep learning community. However, algorithmic solutions are often difficult to re-implement, evaluate and port across different settings, where even results on standa…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: Official Website: https://avalanche.continualai.org

  36. arXiv:2104.00170  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Are Bias Mitigation Techniques for Deep Learning Effective?

    Authors: Robik Shrestha, Kushal Kafle, Christopher Kanan

    Abstract: A critical problem in deep learning is that systems learn inappropriate biases, resulting in their inability to perform well on minority groups. This has led to the creation of multiple algorithms that endeavor to mitigate bias. However, it is not clear how effective these methods are. This is because study protocols differ among papers, systems are tested on datasets that fail to test many forms…

    Submitted 23 April, 2024; v1 submitted 31 March, 2021; originally announced April 2021.

    Comments: Published in WACV 2022 under the title "An Investigation of Critical Issues in Bias Mitigation Techniques"

  37. arXiv:2103.14010  [pdf, other]

    cs.CV

    Self-Supervised Training Enhances Online Continual Learning

    Authors: Jhair Gallardo, Tyler L. Hayes, Christopher Kanan

    Abstract: In continual learning, a system must incrementally learn from a non-stationary data stream without catastrophic forgetting. Recently, multiple methods have been devised for incrementally learning classes on large-scale image classification tasks, such as ImageNet. State-of-the-art continual learning methods use an initial supervised pre-training phase, in which the first 10% - 50% of the classes i…

    Submitted 22 October, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: Accepted to BMVC-2021

  38. arXiv:2103.03987  [pdf, other]

    cs.AI cs.CV cs.LG

    Selective Replay Enhances Learning in Online Continual Analogical Reasoning

    Authors: Tyler L. Hayes, Christopher Kanan

    Abstract: In continual learning, a system learns from non-stationary data streams or batches without catastrophic forgetting. While this problem has been heavily studied in supervised image classification and reinforcement learning, continual learning in neural networks designed for abstract reasoning has not yet been studied. Here, we study continual learning of analogical reasoning. Analogical reasoning t…

    Submitted 19 April, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

    Comments: To appear in the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPR-W) on Continual Learning in Computer Vision (CLVision) 2021

  39. arXiv:2103.03048  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems

    Authors: Usman Mahmood, Robik Shrestha, David D. B. Bates, Lorenzo Mannelli, Giuseppe Corrias, Yusuf Erdi, Christopher Kanan

    Abstract: Artificial intelligence (AI) has been successful at solving numerous problems in machine perception. In radiology, AI systems are rapidly evolving and show progress in guiding treatment decisions, diagnosing, localizing disease on medical images, and improving radiologists' efficiency. A critical component to deploying AI in radiology is to gain confidence in a developed system's efficacy and safe…

    Submitted 4 March, 2021; originally announced March 2021.

  40. arXiv:2009.04659  [pdf, other]

    cs.CV

    Improved Robustness to Open Set Inputs via Tempered Mixup

    Authors: Ryne Roady, Tyler L. Hayes, Christopher Kanan

    Abstract: Supervised classification methods often assume that evaluation data is drawn from the same distribution as training data and that all classes are present for training. However, real-world classifiers must handle inputs that are far from the training distribution including samples from unknown classes. Open set robustness refers to the ability to properly label samples from previously unseen catego…

    Submitted 10 September, 2020; originally announced September 2020.

    Comments: Proceedings of the ECCV 2020 Workshop on Adversarial Robustness in the Real World

  41. arXiv:2008.06439  [pdf, other]

    cs.CV cs.LG

    RODEO: Replay for Online Object Detection

    Authors: Manoj Acharya, Tyler L. Hayes, Christopher Kanan

    Abstract: Humans can incrementally learn to do new visual detection tasks, which is a huge challenge for today's computer vision systems. Incrementally trained deep learning models lack backwards transfer to previously seen classes and suffer from a phenomenon known as "catastrophic forgetting." In this paper, we pioneer online streaming learning for object detection, where an agent must learn examples on…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: Accepted for poster presentation at BMVC2020

  42. arXiv:2005.09241  [pdf, other]

    cs.CV cs.LG

    On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

    Authors: Damien Teney, Kushal Kafle, Robik Shrestha, Ehsan Abbasnejad, Christopher Kanan, Anton van den Hengel

    Abstract: Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices…

    Submitted 19 May, 2020; originally announced May 2020.

  43. arXiv:2004.13587  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Do We Need Fully Connected Output Layers in Convolutional Networks?

    Authors: Zhongchao Qian, Tyler L. Hayes, Kushal Kafle, Christopher Kanan

    Abstract: Traditionally, deep convolutional neural networks consist of a series of convolutional and pooling layers followed by one or more fully connected (FC) layers to perform the final classification. While this design has been successful, for datasets with a large number of categories, the fully connected layers often account for a large percentage of the network's parameters. For applications with mem…

    Submitted 28 April, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  44. arXiv:2004.05704  [pdf, other]

    cs.CV cs.AI cs.CL

    Visual Grounding Methods for VQA are Working for the Wrong Reasons!

    Authors: Robik Shrestha, Kushal Kafle, Christopher Kanan

    Abstract: Existing Visual Question Answering (VQA) methods tend to exploit dataset biases and spurious statistical correlations, instead of producing right answers for the right reasons. To address this issue, recent bias mitigation methods for VQA propose to incorporate visual cues (e.g., human attention maps) to better ground the VQA models, showcasing impressive gains. However, we show that the performan…

    Submitted 23 April, 2024; v1 submitted 12 April, 2020; originally announced April 2020.

    Comments: Published in ACL 2020 under the title "A negative case analysis of visual grounding methods for VQA"

  45. AeroRIT: A New Scene for Hyperspectral Image Analysis

    Authors: Aneesh Rangnekar, Nilay Mokashi, Emmett Ientilucci, Christopher Kanan, Matthew J. Hoffman

    Abstract: We investigate applying convolutional neural network (CNN) architectures to facilitate aerial hyperspectral scene understanding and present a new hyperspectral dataset, AeroRIT, that is large enough for CNN training. To date, the majority of hyperspectral airborne scenes have been confined to various sub-categories of vegetation and roads, and this scene introduces two new categories: buildings and cars. To t…

    Submitted 7 April, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

    Comments: To appear in IEEE TGRS

  46. arXiv:1911.00104  [pdf, other]

    cs.LG cs.CV stat.ML

    Towards calibrated and scalable uncertainty representations for neural networks

    Authors: Nabeel Seedat, Christopher Kanan

    Abstract: For many applications it is critical to know the uncertainty of a neural network's predictions. While a variety of neural network parameter estimation methods have been proposed for uncertainty estimation, they have not been rigorously compared across uncertainty measures. We assess four of these parameter estimation methods to calibrate uncertainty estimation using four different uncertainty meas…

    Submitted 3 December, 2019; v1 submitted 27 October, 2019; originally announced November 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019): 4th workshop on Bayesian Deep Learning, Vancouver, Canada

  47. Are Out-of-Distribution Detection Methods Effective on Large-Scale Datasets?

    Authors: Ryne Roady, Tyler L. Hayes, Ronald Kemker, Ayesha Gonzales, Christopher Kanan

    Abstract: Supervised classification methods often assume the train and test data distributions are the same and that all classes in the test set are present in the training set. However, deployed classifiers often require the ability to recognize inputs from outside the training set as unknowns. This problem has been studied under multiple paradigms including out-of-distribution detection and open set recog…

    Submitted 30 October, 2019; originally announced October 2019.

  48. arXiv:1910.02509  [pdf, other]

    cs.LG cs.CV cs.NE

    REMIND Your Neural Network to Prevent Catastrophic Forgetting

    Authors: Tyler L. Hayes, Kushal Kafle, Robik Shrestha, Manoj Acharya, Christopher Kanan

    Abstract: People learn throughout life. However, incrementally updating conventional neural networks leads to catastrophic forgetting. A common remedy is replay, which is inspired by how the brain consolidates memory. Replay involves fine-tuning a network on a mixture of new and old instances. While there is neuroscientific evidence that the brain replays compressed memories, existing methods for convolutio…

    Submitted 13 July, 2020; v1 submitted 6 October, 2019; originally announced October 2019.

    Comments: To appear in the European Conference on Computer Vision (ECCV-2020)

  49. RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking

    Authors: Aayush K. Chaudhary, Rakshit Kothari, Manoj Acharya, Shusil Dangi, Nitinraj Nair, Reynold Bailey, Christopher Kanan, Gabriel Diaz, Jeff B. Pelz

    Abstract: Accurate eye segmentation can improve eye-gaze estimation and support interactive computing based on visual attention; however, existing eye segmentation methods suffer from issues such as person-dependent accuracy, lack of robustness, and an inability to be run in real-time. Here, we present the RITnet model, which is a deep neural network that combines U-Net and DenseNet. RITnet is under 1 MB an…

    Submitted 1 October, 2019; originally announced October 2019.

    Comments: This model is the winning submission for the OpenEDS Semantic Segmentation Challenge for Eye images https://research.fb.com/programs/openeds-challenge/. To appear in ICCVW 2019. Pre-trained models and source code are available at https://bitbucket.org/eye-ush/ritnet/.

  50. arXiv:1909.01520  [pdf, other]

    cs.LG cs.CV stat.ML

    Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis

    Authors: Tyler L. Hayes, Christopher Kanan

    Abstract: When an agent acquires new information, ideally it would immediately be capable of using that information to understand its environment. This is not possible using conventional deep neural networks, which suffer from catastrophic forgetting when they are incrementally updated, with new knowledge overwriting established representations. A variety of approaches have been developed that attempt to mi…

    Submitted 17 April, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: To appear in the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPR-W) on Continual Learning in Computer Vision (CLVision) 2020