Skip to main content

Showing 1–12 of 12 results for author: Materzyńska, J

.
  1. arXiv:2412.17847  [pdf, other

    cs.AI cs.CL cs.CY cs.LG cs.MM

    Bridging the Data Provenance Gap Across Text, Speech and Video

    Authors: Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Naana Obeng-Marnu, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund , et al. (18 additional authors not shown)

    Abstract: Progress in AI is driven largely by the scale and quality of training data. Despite this, there is a deficit of empirical analysis examining the attributes of well-established datasets beyond text. In this work we conduct the largest and first-of-its-kind longitudinal audit across modalities--popular text, speech, and video datasets--from their detailed sourcing trends and use restrictions to thei… ▽ More

    Submitted 18 February, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

    Comments: ICLR 2025. 10 pages, 5 figures (main paper)

  2. arXiv:2412.00176  [pdf, other

    cs.CV

    Opt-In Art: Learning Art Styles Only from Few Examples

    Authors: Hui Ren, Joanna Materzynska, Rohit Gandikota, David Bau, Antonio Torralba

    Abstract: We explore whether pre-training on datasets with paintings is necessary for a model to learn an artistic style with only a few examples. To investigate this, we train a text-to-image model exclusively on photographs, without access to any painting-related content. We show that it is possible to adapt a model that is trained without paintings to an artistic style, given only few examples. User stud… ▽ More

    Submitted 20 May, 2025; v1 submitted 29 November, 2024; originally announced December 2024.

  3. arXiv:2410.02921  [pdf, other

    cs.CV

    AirLetters: An Open Video Dataset of Characters Drawn in the Air

    Authors: Rishit Dagli, Guillaume Berger, Joanna Materzynska, Ingo Bax, Roland Memisevic

    Abstract: We introduce AirLetters, a new video dataset consisting of real-world videos of human-generated, articulated motions. Specifically, our dataset requires a vision model to predict letters that humans draw in the air. Unlike existing video datasets, accurate classification predictions for AirLetters rely critically on discerning motion patterns and on integrating long-range information in the video… ▽ More

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: ECCV'24, HANDS workshop

  4. arXiv:2407.14933  [pdf, other

    cs.CL cs.AI cs.LG

    Consent in Crisis: The Rapid Decline of the AI Data Commons

    Authors: Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, Jianguo Zhang , et al. (24 additional authors not shown)

    Abstract: General-purpose artificial intelligence (AI) systems are built on massive swathes of public web data, assembled into corpora such as C4, RefinedWeb, and Dolma. To our knowledge, we conduct the first, large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora. Our audit of 14,000 web domains provides an expansive view of crawlable web data and how co… ▽ More

    Submitted 24 July, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: 41 pages (13 main), 5 figures, 9 tables

  5. arXiv:2312.04966  [pdf, other

    cs.CV

    NewMove: Customizing text-to-video models with novel motions

    Authors: Joanna Materzynska, Josef Sivic, Eli Shechtman, Antonio Torralba, Richard Zhang, Bryan Russell

    Abstract: We introduce an approach for augmenting text-to-video generation models with customized motions, extending their capabilities beyond the motions depicted in the original training data. By leveraging a few video samples demonstrating specific movements as input, our method learns and generalizes the input motion patterns for diverse, text-specified scenarios. Our contributions are threefold. First,… ▽ More

    Submitted 10 December, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Project page: this website https://joaanna.github.io/customizing_motion/

  6. arXiv:2311.12092  [pdf, other

    cs.CV

    Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models

    Authors: Rohit Gandikota, Joanna Materzynska, Tingrui Zhou, Antonio Torralba, David Bau

    Abstract: We present a method to create interpretable concept sliders that enable precise control over attributes in image generations from diffusion models. Our approach identifies a low-rank parameter direction corresponding to one concept while minimizing interference with other attributes. A slider is created using a small set of prompts or sample images; thus slider directions can be created for either… ▽ More

    Submitted 27 November, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

  7. arXiv:2309.03886  [pdf, other

    cs.CL cs.AI cs.LG

    FIND: A Function Description Benchmark for Evaluating Interpretability Methods

    Authors: Sarah Schwettmann, Tamar Rott Shaham, Joanna Materzynska, Neil Chowdhury, Shuang Li, Jacob Andreas, David Bau, Antonio Torralba

    Abstract: Labeling neural network submodules with human-legible descriptions is useful for many downstream tasks: such descriptions can surface failures, guide interventions, and perhaps even explain important model behaviors. To date, most mechanistic descriptions of trained networks have involved small models, narrowly delimited phenomena, and large amounts of human labor. Labeling all human-interpretable… ▽ More

    Submitted 8 December, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: 28 pages, 10 figures

    Journal ref: NeurIPS 2023

  8. arXiv:2308.14761  [pdf, other

    cs.CV cs.LG

    Unified Concept Editing in Diffusion Models

    Authors: Rohit Gandikota, Hadas Orgad, Yonatan Belinkov, Joanna Materzyńska, David Bau

    Abstract: Text-to-image models suffer from various safety issues that may limit their suitability for deployment. Previous methods have separately addressed individual issues of bias, copyright, and offensive content in text-to-image models. However, in the real world, all of these issues appear simultaneously in the same model. We present a method that tackles all issues with a single approach. Our method,… ▽ More

    Submitted 22 October, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

    Comments: In proceedings of WACV 2024. Project Page: https://unified.baulab.info

  9. arXiv:2303.07345  [pdf, other

    cs.CV

    Erasing Concepts from Diffusion Models

    Authors: Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau

    Abstract: Motivated by recent advancements in text-to-image diffusion, we study erasure of specific concepts from the model's weights. While Stable Diffusion has shown promise in producing explicit or realistic artwork, it has raised concerns regarding its potential for misuse. We propose a fine-tuning method that can erase a visual concept from a pre-trained diffusion model, given only the name of the styl… ▽ More

    Submitted 20 June, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

  10. arXiv:2206.07835  [pdf, other

    cs.CV

    Disentangling visual and written concepts in CLIP

    Authors: Joanna Materzynska, Antonio Torralba, David Bau

    Abstract: The CLIP network measures the similarity between natural text and images; in this work, we investigate the entanglement of the representation of word images and natural images in its image encoder. First, we find that the image encoder has an ability to match word images with natural images of scenes described by those words. This is consistent with previous research that suggests that the meaning… ▽ More

    Submitted 15 June, 2022; originally announced June 2022.

  11. arXiv:1912.09930  [pdf, other

    cs.CV

    Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks

    Authors: Joanna Materzynska, Tete Xiao, Roei Herzig, Huijuan Xu, Xiaolong Wang, Trevor Darrell

    Abstract: Human action is naturally compositional: humans can easily recognize and perform actions with objects that are different from those used in training demonstrations. In this paper, we study the compositionality of action by looking into the dynamics of subject-object interactions. We propose a novel model which can explicitly reason about the geometric relations between constituent objects and an a… ▽ More

    Submitted 12 September, 2020; v1 submitted 20 December, 2019; originally announced December 2019.

  12. arXiv:1706.04261  [pdf, other

    cs.CV

    The "something something" video database for learning and evaluating visual common sense

    Authors: Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, Joanna Materzyńska, Susanne Westphal, Heuna Kim, Valentin Haenel, Ingo Fruend, Peter Yianilos, Moritz Mueller-Freitag, Florian Hoppe, Christian Thurau, Ingo Bax, Roland Memisevic

    Abstract: Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification. One obstacle that prevents networks from reasoning more deeply about complex scenes and situations, and from integrating visual knowledge with natural language, like humans do, is their lack of common sense knowledge about the physical world. Videos, unlike still images, contain a wealt… ▽ More

    Submitted 15 June, 2017; v1 submitted 13 June, 2017; originally announced June 2017.