Showing 1–11 of 11 results for author: Shani, C

  1. arXiv:2505.17117

    cs.CL cs.AI cs.IT

    From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning

    Authors: Chen Shani, Dan Jurafsky, Yann LeCun, Ravid Shwartz-Ziv

    Abstract: Humans organize knowledge into compact categories through semantic compression by mapping diverse instances to abstract representations while preserving meaning (e.g., robin and blue jay are both birds; most birds can fly). These concepts reflect a trade-off between expressive fidelity and representational simplicity. Large Language Models (LLMs) demonstrate remarkable linguistic abilities, yet wh…

    Submitted 30 June, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  2. arXiv:2504.20643

    cs.CL cs.AI cs.LG

    Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations

    Authors: Moran Mizrahi, Chen Shani, Gabriel Stanovsky, Dan Jurafsky, Dafna Shahaf

    Abstract: Large Language Models (LLMs) excel at countless tasks, yet struggle with creativity. In this paper, we introduce a novel approach that couples LLMs with structured representations and cognitively inspired manipulations to generate more creative and diverse ideas. Our notion of creativity goes beyond superficial token-level variations; rather, we explicitly recombine structured representations of e…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: 10 pages, 8 figures

  3. arXiv:2504.13890

    cs.HC cs.CL

    Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches

    Authors: Chen Shani, Elizabeth C. Stade

    Abstract: Computational mental health research develops models to predict and understand psychological phenomena, but often relies on inappropriate measures of psychopathology constructs, undermining validity. We identify three key issues: (1) reliance on unvalidated measures (e.g., self-declared diagnosis) over validated ones (e.g., diagnosis by clinician); (2) treating mental health constructs as categori…

    Submitted 4 April, 2025; originally announced April 2025.

    Comments: No figures, 1 table

  4. arXiv:2504.09865

    cs.CY cs.AI cs.HC

    Labeling Messages as AI-Generated Does Not Reduce Their Persuasive Effects

    Authors: Isabel O. Gallegos, Chen Shani, Weiyan Shi, Federico Bianchi, Izzy Gainsburg, Dan Jurafsky, Robb Willer

    Abstract: As generative artificial intelligence (AI) enables the creation and dissemination of information at massive scale and speed, it is increasingly important to understand how people perceive AI-generated content. One prominent policy proposal requires explicitly labeling AI-generated content to increase transparency and encourage critical thinking about the information, but prior research has not yet…

    Submitted 21 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

  5. arXiv:2502.05704

    cs.CL cs.AI

    Rethinking Word Similarity: Semantic Similarity through Classification Confusion

    Authors: Kaitlyn Zhou, Haishan Gao, Sarah Chen, Dan Edelstein, Dan Jurafsky, Chen Shani

    Abstract: Word similarity has many applications to social science and cultural analytics tasks like measuring meaning change over time and making sense of contested terms. Yet traditional similarity methods based on cosine similarity between word embeddings cannot capture the context-dependent, asymmetrical, polysemous nature of semantic similarity. We propose a new measure of similarity, Word Confusion, th…

    Submitted 8 February, 2025; originally announced February 2025.

    Comments: Accepted to NAACL-main-2025

  6. arXiv:2311.01866

    cs.CL cs.AI

    Towards Concept-Aware Large Language Models

    Authors: Chen Shani, Jilles Vreeken, Dafna Shahaf

    Abstract: Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human conce…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 findings long paper

  7. arXiv:2311.01860

    cs.CL cs.AI

    FAME: Flexible, Scalable Analogy Mappings Engine

    Authors: Shahar Jacob, Chen Shani, Dafna Shahaf

    Abstract: Analogy is one of the core capacities of human cognition; when faced with new situations, we often transfer prior experience from other domains. Most work on computational analogy relies heavily on complex, manually crafted input. In this work, we relax the input requirements, requiring only names of entities to be mapped. We automatically extract commonsense representations and use them to identi…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 main conference long paper

  8. arXiv:2211.07959

    cs.LG cs.AI

    The Lean Data Scientist: Recent Advances towards Overcoming the Data Bottleneck

    Authors: Chen Shani, Jonathan Zarecki, Dafna Shahaf

    Abstract: Machine learning (ML) is revolutionizing the world, affecting almost every field of science and industry. Recent algorithms (in particular, deep networks) are increasingly data-hungry, requiring large datasets for training. Thus, the dominant paradigm in ML today involves constructing large, task-specific datasets. However, obtaining quality datasets of such magnitude proves to be a difficult ch…

    Submitted 15 November, 2022; originally announced November 2022.

  9. arXiv:2106.03048

    cs.CL cs.AI

    How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

    Authors: Chen Shani, Nadav Borenstein, Dafna Shahaf

    Abstract: Humor is an important social phenomenon, serving complex social and psychological functions. However, despite being studied for millennia humor is computationally not well understood, often considered an AI-complete problem. In this work, we introduce a novel setting in humor mining: automatically detecting funny and unusual scientific papers. We are inspired by the Ig Nobel prize, a satirical pri…

    Submitted 6 June, 2021; originally announced June 2021.

    Comments: To be published in the main conference of ACL-IJCNLP2021. Code and dataset can be found here: https://github.com/nadavborenstein/Iggy

  10. arXiv:2105.05571

    cs.HC cs.AI cs.CL cs.IR

    "Alexa, what do you do for fun?" Characterizing playful requests with virtual assistants

    Authors: Chen Shani, Alexander Libov, Sofia Tolmach, Liane Lewin-Eytan, Yoelle Maarek, Dafna Shahaf

    Abstract: Virtual assistants such as Amazon's Alexa, Apple's Siri, Google Home, and Microsoft's Cortana, are becoming ubiquitous in our daily lives and successfully help users in various daily tasks, such as making phone calls or playing music. Yet, they still struggle with playful utterances, which are not meant to be interpreted literally. Examples include jokes or absurd requests or questions such as, "A…

    Submitted 12 May, 2021; originally announced May 2021.

  11. arXiv:2005.00311

    cs.CL cs.LG

    Language (Re)modelling: Towards Embodied Language Understanding

    Authors: Ronen Tamari, Chen Shani, Tom Hope, Miriam R. L. Petruck, Omri Abend, Dafna Shahaf

    Abstract: While natural language understanding (NLU) is advancing rapidly, today's technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency, interpretability, and generalization. This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). According to ECL, natural language is inherently ex…

    Submitted 9 July, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL2020 Theme Track. Extended bibliography version