
Showing 1–7 of 7 results for author: Cao, Y T

  1. arXiv:2410.22315

    cs.CL cs.CV

    Natural Language Inference Improves Compositionality in Vision-Language Models

    Authors: Paola Cascante-Bonilla, Yu Hou, Yang Trista Cao, Hal Daumé III, Rachel Rudinger

    Abstract: Compositional reasoning in Vision-Language Models (VLMs) remains challenging as these models often struggle to relate objects, attributes, and spatial relationships. Recent methods aim to address these limitations by relying on the semantics of the textual description, using Large Language Models (LLMs) to break them down into subsets of questions and answers. However, these methods primarily oper…

    Submitted 29 October, 2024; originally announced October 2024.

    Comments: Project page: https://cece-vlm.github.io/

  2. arXiv:2312.07141

    cs.CL

    Multilingual large language models leak human stereotypes across language boundaries

    Authors: Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daumé III

    Abstract: Multilingual large language models have gained prominence for their proficiency in processing and generating text across languages. Like their monolingual counterparts, multilingual models are likely to pick up on stereotypes and other social biases present in their training data. In this paper, we study a phenomenon we term stereotype leakage, which refers to how training a model multilingually m…

    Submitted 19 November, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  3. arXiv:2311.07879

    cs.CL cs.AI

    Toxicity Detection is NOT all you Need: Measuring the Gaps to Supporting Volunteer Content Moderators

    Authors: Yang Trista Cao, Lovely-Frances Domingo, Sarah Ann Gilbert, Michelle Mazurek, Katie Shilton, Hal Daumé III

    Abstract: Extensive efforts in automated approaches for content moderation have been focused on developing models to identify toxic, offensive, and hateful content with the aim of lightening the load for moderators. Yet, it remains uncertain whether improvements on those tasks have truly addressed moderators' needs in accomplishing their work. In this paper, we surface gaps between past research efforts tha…

    Submitted 13 November, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  4. arXiv:2210.14966

    cs.CL cs.AI cs.CV

    What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility?

    Authors: Yang Trista Cao, Kyle Seelman, Kyungjun Lee, Hal Daumé III

    Abstract: In visual question answering (VQA), a machine must answer a question given an associated image. Recently, accessibility researchers have explored whether VQA can be deployed in a real-world setting where users with visual impairments learn about their environment by capturing their visual surroundings and asking questions. However, most of the existing benchmarking datasets for VQA focus on machin…

    Submitted 26 October, 2022; originally announced October 2022.

    Journal ref: AACL-IJCNLP 2022 The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing

  5. arXiv:2206.11684

    cs.CL

    Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

    Authors: Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou

    Abstract: NLP models trained on text have been shown to reproduce human stereotypes, which can magnify harms to marginalized groups when systems are deployed at scale. We adapt the Agency-Belief-Communion (ABC) stereotype model of Koch et al. (2016) from social psychology as a framework for the systematic study and discovery of stereotypic group-trait associations in language models (LMs). We introduce the…

    Submitted 23 June, 2022; originally announced June 2022.

  6. arXiv:2203.13928

    cs.CL

    On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations

    Authors: Yang Trista Cao, Yada Pruksachatkun, Kai-Wei Chang, Rahul Gupta, Varun Kumar, Jwala Dhamala, Aram Galstyan

    Abstract: Multiple metrics have been introduced to measure fairness in various natural language processing tasks. These metrics can be roughly categorized into two categories: 1) extrinsic metrics for evaluating fairness in downstream applications and 2) intrinsic metrics for estimating fairness in upstream contextualized language representation models. In this paper, we conduct an extensive c…

    Submitted 25 March, 2022; originally announced March 2022.

    Journal ref: ACL 2022

  7. Toward Gender-Inclusive Coreference Resolution

    Authors: Yang Trista Cao, Hal Daumé III

    Abstract: Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systemic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and d…

    Submitted 2 December, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: 28 pages; ACL version

    Journal ref: Association for Computational Linguistics. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020) 4568-4595