
Showing 1–20 of 20 results for author: Prabhu, U

  1. arXiv:2408.07009  [pdf, other]

    cs.CV

    Imagen 3

    Authors: Imagen-Team-Google: Jason Baldridge, Jakob Bauer, Mukul Bhutani, Nicole Brichtova, Andrew Bunner, Lluis Castrejon, Kelvin Chan, Yichang Chen, Sander Dieleman, Yuqing Du, Zach Eaton-Rosen, Hongliang Fei, Nando de Freitas, Yilin Gao, Evgeny Gladchenko, Sergio Gómez Colmenarejo, Mandy Guo, Alex Haig, Will Hawkins, Hexiang Hu, Huilian Huang, Tobenna Peter Igwe, Christos Kaplanis, et al. (237 additional authors not shown)

    Abstract: We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.

    Submitted 21 December, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

  2. arXiv:2407.06863  [pdf, other]

    cs.CV

    Beyond Aesthetics: Cultural Competence in Text-to-Image Models

    Authors: Nithish Kannen, Arif Ahmad, Marco Andreetto, Vinodkumar Prabhakaran, Utsav Prabhu, Adji Bousso Dieng, Pushpak Bhattacharyya, Shachi Dave

    Abstract: Text-to-Image (T2I) models are being increasingly adopted in diverse global communities where they create visual representations of their unique cultures. Current T2I benchmarks primarily focus on faithfulness, aesthetics, and realism of generated images, overlooking the critical dimension of cultural competence. In this work, we introduce a framework to evaluate cultural competence of T2I models…

    Submitted 20 January, 2025; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: NeurIPS 2024 camera-ready version

  3. arXiv:2406.01429  [pdf, other]

    cs.CV

    EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding

    Authors: Thanh-Dat Truong, Utsav Prabhu, Dongyi Wang, Bhiksha Raj, Susan Gauch, Jeyamkondan Subbiah, Khoa Luu

    Abstract: Unsupervised Domain Adaptation has been an efficient approach to transferring the semantic segmentation model across data distributions. Meanwhile, the recent Open-vocabulary Semantic Scene understanding based on large-scale vision language models is effective in open-set settings because it can learn diverse concepts and categories. However, these prior methods fail to generalize across different…

    Submitted 11 October, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Accepted to NeurIPS'24

  4. The Dark Side of Dataset Scaling: Evaluating Racial Classification in Multimodal Models

    Authors: Abeba Birhane, Sepehr Dehdashtian, Vinay Uday Prabhu, Vishnu Boddeti

    Abstract: Scale the model, scale the data, scale the GPU farms is the reigning sentiment in the world of generative AI today. While model scaling has been extensively studied, data scaling and its downstream impacts on model performance remain under-explored. This is particularly important in the context of multimodal datasets whose main source is the World Wide Web, condensed and packaged as the Common Cra…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: To appear in the proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT 24), June 3 to 6, 2024, Rio de Janeiro, Brazil. arXiv admin note: text overlap with arXiv:2306.13141

  5. arXiv:2311.15965  [pdf, other]

    cs.CV

    FALCON: Fairness Learning via Contrastive Attention Approach to Continual Semantic Scene Understanding

    Authors: Thanh-Dat Truong, Utsav Prabhu, Bhiksha Raj, Jackson Cothren, Khoa Luu

    Abstract: Continual Learning in semantic scene segmentation aims to continually learn new unseen classes in dynamic environments while maintaining previously learned knowledge. Prior studies focused on modeling the catastrophic forgetting and background shift challenges in continual learning. However, fairness, another major challenge that causes unfair predictions leading to low performance among major and…

    Submitted 21 March, 2025; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted to CVPR'25

  6. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  7. arXiv:2110.01963  [pdf, other]

    cs.CY

    Multimodal datasets: misogyny, pornography, and malignant stereotypes

    Authors: Abeba Birhane, Vinay Uday Prabhu, Emmanuel Kahembwe

    Abstract: We have now entered the era of trillion parameter machine learning models trained on billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has given rise to formidable bodies of critical work that has called for caution while generating these large datasets. These address concerns surrounding the dubious curation practices used to generate these datasets, the sord…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: 33 pages

  8. A Step Toward More Inclusive People Annotations for Fairness

    Authors: Candice Schumann, Susanna Ricco, Utsav Prabhu, Vittorio Ferrari, Caroline Pantofaru

    Abstract: The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP…

    Submitted 5 May, 2021; originally announced May 2021.

    Journal ref: AIES (2021)

  9. arXiv:2009.13509  [pdf, other]

    cs.CV cs.LG

    Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages

    Authors: Daniel J Wu, Andrew C Yang, Vinay U Prabhu

    Abstract: We present Afro-MNIST, a set of synthetic MNIST-style datasets for four orthographies used in Afro-Asiatic and Niger-Congo languages: Ge'ez (Ethiopic), Vai, Osmanya, and N'Ko. These datasets serve as "drop-in" replacements for MNIST. We also describe and open-source a method for synthetic MNIST-style dataset generation from single examples of each digit. These datasets can be found at https://gith…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: 10 pages, 11 figures, presented as a workshop paper at Practical Machine Learning for Developing Countries @ ICLR 2020

  10. arXiv:2006.16923  [pdf, other]

    cs.CY stat.AP stat.ML

    Large image datasets: A pyrrhic win for computer vision?

    Authors: Vinay Uday Prabhu, Abeba Birhane

    Abstract: In this paper we investigate problematic practices and consequences of large scale vision datasets. We examine broad issues such as the question of consent and justice as well as specific concerns such as the inclusion of verifiably pornographic images in datasets. Taking the ImageNet-ILSVRC-2012 dataset as an example, we perform a cross-sectional model-based quantitative census covering factors s…

    Submitted 23 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Github: https://github.com/vinayprabhu/Dataset_audits. Update on July 23rd: (1) Added in the supplementary section (2) The curators of the Tiny Images dataset decided to withdraw the dataset in response to the previous version of this paper, a change that has duly been reflected in this version. Their statement: https://groups.csail.mit.edu/vision/TinyImages/

  11. arXiv:1912.08987  [pdf, other]

    cs.LG cs.CR stat.ML

    Model Weight Theft With Just Noise Inputs: The Curious Case of the Petulant Attacker

    Authors: Nicholas Roberts, Vinay Uday Prabhu, Matthew McAteer

    Abstract: This paper explores the scenarios under which an attacker can claim that 'Noise and access to the softmax layer of the model is all you need' to steal the weights of a convolutional neural network whose architecture is already known. We were able to achieve 96% test accuracy using the stolen MNIST model and 82% accuracy using the stolen KMNIST model learned using only i.i.d. Bernoulli noise inputs…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: Presented at the Security and Privacy of Machine Learning Workshop, 36th International Conference on Machine Learning (ICML 2019), Long Beach, California, USA

  12. arXiv:1912.08986  [pdf, other]

    cs.LG cs.NE stat.ML

    Deep Connectomics Networks: Neural Network Architectures Inspired by Neuronal Networks

    Authors: Nicholas Roberts, Dian Ang Yap, Vinay Uday Prabhu

    Abstract: The interplay between inter-neuronal network topology and cognition has been studied deeply by connectomics researchers and network scientists, which is crucial towards understanding the remarkable efficacy of biological neural networks. Curiously, the deep learning revolution that revived neural networks has not paid much attention to topological aspects. The architectures of deep neural networks…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: Presented at the Real Neurons & Hidden Units Workshop, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  13. arXiv:1911.07418  [pdf, other]

    cs.LG cs.IT stat.ML

    Grassmannian Packings in Neural Networks: Learning with Maximal Subspace Packings for Diversity and Anti-Sparsity

    Authors: Dian Ang Yap, Nicholas Roberts, Vinay Uday Prabhu

    Abstract: Kernel sparsity ("dying ReLUs") and lack of diversity are commonly observed in CNN kernels, which decreases model capacity. Drawing inspiration from information theory and wireless communications, we demonstrate the intersection of coding theory and deep learning through the Grassmannian subspace packing problem in CNNs. We propose Grassmannian packings for initial kernel layers to be initialized…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: Presented at Bayesian Deep Learning and Workshop on Information Theory and Machine Learning, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  14. arXiv:1908.01242  [pdf, other]

    cs.CV cs.LG stat.ML

    Kannada-MNIST: A new handwritten digits dataset for the Kannada language

    Authors: Vinay Uday Prabhu

    Abstract: In this paper, we disseminate a new handwritten digits-dataset, termed Kannada-MNIST, for the Kannada script, that can potentially serve as a direct drop-in replacement for the original MNIST dataset. In addition to this dataset, we disseminate an additional real world handwritten dataset (with 10k images), which we term as the Dig-MNIST dataset that can serve as an out-of-domain test dataset. W…

    Submitted 3 August, 2019; originally announced August 2019.

    Comments: The companion github repository for this paper is : https://github.com/vinayprabhu/Kannada_MNIST

  15. arXiv:1907.12917  [pdf, other]

    cs.CV cs.LG

    Covering up bias in CelebA-like datasets with Markov blankets: A post-hoc cure for attribute prior avoidance

    Authors: Vinay Uday Prabhu, Dian Ang Yap, Alexander Wang, John Whaley

    Abstract: Attribute prior avoidance entails subconscious or willful non-modeling of (meta)attributes that datasets are oft born with, such as the 40 semantic facial attributes associated with the CelebA and CelebA-HQ datasets. The consequences of this infirmity, we discover, are especially stark in state-of-the-art deep generative models learned on these datasets that just model the pixel-space measurements…

    Submitted 21 July, 2019; originally announced July 2019.

    Comments: Accepted for presentation at the first workshop on Invertible Neural Networks and Normalizing Flows (ICML 2019), Long Beach, CA, USA

  16. arXiv:1907.09061  [pdf, other]

    cs.LG cs.CR stat.ML

    Understanding Adversarial Robustness Through Loss Landscape Geometries

    Authors: Vinay Uday Prabhu, Dian Ang Yap, Joyce Xu, John Whaley

    Abstract: The pursuit of explaining and improving generalization in deep learning has elicited efforts both in regularization techniques as well as visualization techniques of the loss surface geometry. The latter is related to the intuition prevalent in the community that flatter local optima leads to lower generalization error. In this paper, we harness the state-of-the-art "filter normalization" techniqu…

    Submitted 21 July, 2019; originally announced July 2019.

    Comments: Presented at the ICML 2019 Workshop on Uncertainty and Robustness in Deep Learning, and CVPR 2019 Workshop on The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security (CV-COPS)

  17. arXiv:1905.08633  [pdf, other]

    cs.CV cs.CL

    Fonts-2-Handwriting: A Seed-Augment-Train framework for universal digit classification

    Authors: Vinay Uday Prabhu, Sanghyun Han, Dian Ang Yap, Mihail Douhaniaris, Preethi Seshadri, John Whaley

    Abstract: In this paper, we propose a Seed-Augment-Train/Transfer (SAT) framework that contains a synthetic seed image dataset generation procedure for languages with different numeral systems using freely available open font file datasets. This seed dataset of images is then augmented to create a purely synthetic training dataset, which is in turn used to train a deep neural network and test on held-out re…

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: Published as a workshop paper at ICLR 2019 (DeepGenStruct-2019)

  18. arXiv:1807.05162  [pdf, other]

    cs.CV cs.LG

    Large-Scale Visual Speech Recognition

    Authors: Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

    Abstract: This work presents a scalable solution to open-vocabulary visual speech recognition. To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video). In tandem, we designed and trained an integrated lipreading system, consisting of a video processing pipeline that maps raw video to stable v…

    Submitted 1 October, 2018; v1 submitted 13 July, 2018; originally announced July 2018.

  19. arXiv:1802.06927  [pdf, other]

    cs.CV cs.LG cs.NE

    On Lyapunov exponents and adversarial perturbation

    Authors: Vinay Uday Prabhu, Nishant Desai, John Whaley

    Abstract: In this paper, we would like to disseminate a serendipitous discovery involving Lyapunov exponents of a 1-D time series and their use in serving as a filtering defense tool against a specific kind of deep adversarial perturbation. To this end, we use the state-of-the-art CleverHans library to generate adversarial perturbations against a standard Convolutional Neural Network (CNN) architecture trai…

    Submitted 19 February, 2018; originally announced February 2018.

  20. arXiv:1401.2113  [pdf, ps, other]

    cs.SI

    Latent Sentiment Detection in Online Social Networks: A Communications-oriented View

    Authors: Rohit Negi, Vinay Uday Prabhu, Miguel Rodrigues

    Abstract: In this paper, we consider the problem of latent sentiment detection in Online Social Networks such as Twitter. We demonstrate the benefits of using the underlying social network as an Ising prior to perform network aided sentiment detection. We show that the use of the underlying network results in substantially lower detection error rates compared to strictly features-based detection. In doing s…

    Submitted 9 January, 2014; originally announced January 2014.

    Comments: 13 pages, 6 figures, Submitted to ICC 2014