Showing 1–33 of 33 results for author: Shahaf, D

  1. arXiv:2505.14479  [pdf, ps, other]

    cs.AI cs.CL

    Towards Reliable Proof Generation with LLMs: A Neuro-Symbolic Approach

    Authors: Oren Sultan, Eitan Stern, Dafna Shahaf

    Abstract: Large language models (LLMs) struggle with formal domains that require rigorous logical deduction and symbolic reasoning, such as mathematical proof generation. We propose a neuro-symbolic approach that combines LLMs' generative strengths with structured components to overcome this challenge. As a proof-of-concept, we focus on geometry problems. Our approach is two-fold: (1) we retrieve analogous…

    Submitted 11 June, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

    Comments: long paper

  2. arXiv:2505.13534  [pdf, ps, other]

    q-bio.QM cs.AI cs.CL cs.IR

    InterFeat: An Automated Pipeline for Finding Interesting Hypotheses in Structured Biomedical Data

    Authors: Dan Ofer, Michal Linial, Dafna Shahaf

    Abstract: Finding interesting phenomena is the core of scientific discovery, but it is a manual, ill-defined concept. We present an integrative pipeline for automating the discovery of interesting simple hypotheses (feature-target relations with effect direction and a potential underlying mechanism) in structured biomedical data. The pipeline combines machine learning, knowledge graphs, literature search an…

    Submitted 18 May, 2025; originally announced May 2025.

    MSC Class: 68T05; 68T50; 92C50

    ACM Class: I.2.6; I.2.7; H.2.8; J.3

  3. arXiv:2504.20643  [pdf, other]

    cs.CL cs.AI cs.LG

    Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations

    Authors: Moran Mizrahi, Chen Shani, Gabriel Stanovsky, Dan Jurafsky, Dafna Shahaf

    Abstract: Large Language Models (LLMs) excel at countless tasks, yet struggle with creativity. In this paper, we introduce a novel approach that couples LLMs with structured representations and cognitively inspired manipulations to generate more creative and diverse ideas. Our notion of creativity goes beyond superficial token-level variations; rather, we explicitly recombine structured representations of e…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: 10 pages, 8 figures

  4. arXiv:2410.02952  [pdf, other]

    cs.CL cs.AI

    Visual Editing with LLM-based Tool Chaining: An Efficient Distillation Approach for Real-Time Applications

    Authors: Oren Sultan, Alex Khasin, Guy Shiran, Asnat Greenstein-Messica, Dafna Shahaf

    Abstract: We present a practical distillation approach to fine-tune LLMs for invoking tools in real-time applications. We focus on visual editing tasks; specifically, we modify images and videos by interpreting user stylistic requests, specified in natural language ("golden hour"), using an LLM to select the appropriate tools and their parameters to achieve the desired visual effect. We found that proprieta…

    Submitted 10 October, 2024; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024

  5. arXiv:2403.01139  [pdf, other]

    cs.CL cs.AI

    ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies

    Authors: Oren Sultan, Yonatan Bitton, Ron Yosef, Dafna Shahaf

    Abstract: Analogy-making is central to human cognition, allowing us to adapt to novel situations -- an ability that current AI systems still lack. Most analogy datasets today focus on simple analogies (e.g., word analogies); datasets including complex types of analogies are typically manually curated and very small. We believe that this holds back progress in computational analogy. In this work, we design a…

    Submitted 14 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: NAACL 2024 (Main Conference)

  6. arXiv:2401.00595  [pdf, other]

    cs.CL

    State of What Art? A Call for Multi-Prompt LLM Evaluation

    Authors: Moran Mizrahi, Guy Kaplan, Dan Malkin, Rotem Dror, Dafna Shahaf, Gabriel Stanovsky

    Abstract: Recent advances in large language models (LLMs) have led to the development of various evaluation benchmarks. These benchmarks typically rely on a single instruction template for evaluating all LLMs on a specific task. In this paper, we comprehensively analyze the brittleness of results obtained via single-prompt evaluations across 6.5M instances, involving 20 different LLMs and 39 tasks from 3 be…

    Submitted 6 May, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: Accepted at TACL; pre-MIT Press publication version

  7. arXiv:2312.12681  [pdf, other]

    cs.CL cs.AI

    Imitation of Life: A Search Engine for Biologically Inspired Design

    Authors: Hen Emuna, Nadav Borenstein, Xin Qian, Hyeonsu Kang, Joel Chan, Aniket Kittur, Dafna Shahaf

    Abstract: Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological solutions for real-world problems poses significant challenges, both due to the limited biological knowledge engineers and designers typically possess…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: To be published in the AAAI 2024 Proceedings Main Track

  8. arXiv:2311.01866  [pdf, other]

    cs.CL cs.AI

    Towards Concept-Aware Large Language Models

    Authors: Chen Shani, Jilles Vreeken, Dafna Shahaf

    Abstract: Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication. However, there is very little work on endowing machines with the ability to form and reason with concepts. In particular, state-of-the-art large language models (LLMs) work at the level of tokens, not concepts. In this work, we analyze how well contemporary LLMs capture human conce…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 findings long paper

  9. arXiv:2311.01860  [pdf, other]

    cs.CL cs.AI

    FAME: Flexible, Scalable Analogy Mappings Engine

    Authors: Shahar Jacob, Chen Shani, Dafna Shahaf

    Abstract: Analogy is one of the core capacities of human cognition; when faced with new situations, we often transfer prior experience from other domains. Most work on computational analogy relies heavily on complex, manually crafted input. In this work, we relax the input requirements, requiring only names of entities to be mapped. We automatically extract commonsense representations and use them to identi…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 main conference long paper

  10. arXiv:2303.15445  [pdf, other]

    cs.CL cs.AI cs.CV

    IRFL: Image Recognition of Figurative Language

    Authors: Ron Yosef, Yonatan Bitton, Dafna Shahaf

    Abstract: Figures of speech such as metaphors, similes, and idioms are integral parts of human communication. They are ubiquitous in many forms of discourse, allowing people to convey complex, abstract ideas and evoke emotion. As figurative forms are often conveyed through multiple modalities (e.g., both text and images), understanding multimodal figurative language is an important AI challenge, weaving tog…

    Submitted 25 November, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

  11. arXiv:2212.04542  [pdf, other]

    cs.CV cs.AI cs.CL

    VASR: Visual Analogies of Situation Recognition

    Authors: Yonatan Bitton, Ron Yosef, Eli Strugo, Dafna Shahaf, Roy Schwartz, Gabriel Stanovsky

    Abstract: A core process in human cognition is analogical mapping: the ability to identify a similar relational structure between different situations. We introduce a novel task, Visual Analogies of Situation Recognition, adapting the classical word-analogy task into the visual domain. Given a triplet of images, the task is to select an image candidate B' that completes the analogy (A to A' is like B to wha…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Accepted to AAAI 2023. Website: https://vasr-dataset.github.io/

  12. arXiv:2211.07959  [pdf, other]

    cs.LG cs.AI

    The Lean Data Scientist: Recent Advances towards Overcoming the Data Bottleneck

    Authors: Chen Shani, Jonathan Zarecki, Dafna Shahaf

    Abstract: Machine learning (ML) is revolutionizing the world, affecting almost every field of science and industry. Recent algorithms (in particular, deep networks) are increasingly data-hungry, requiring large datasets for training. Thus, the dominant paradigm in ML today involves constructing large, task-specific datasets. However, obtaining quality datasets of such magnitude proves to be a difficult ch…

    Submitted 15 November, 2022; originally announced November 2022.

  13. arXiv:2211.07950  [pdf, other]

    cs.CL

    Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs

    Authors: Kyle Richardson, Ronen Tamari, Oren Sultan, Reut Tsarfaty, Dafna Shahaf, Ashish Sabharwal

    Abstract: Can we teach natural language understanding models to track their beliefs through intermediate points in text? We propose a representation learning framework called breakpoint modeling that allows for learning of this type. Given any text encoder and data marked with intermediate states (breakpoints) along with corresponding textual queries viewed as true/false propositions (i.e., the candidate be…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022

  14. 50 Ways to Bake a Cookie: Mapping the Landscape of Procedural Texts

    Authors: Moran Mizrahi, Dafna Shahaf

    Abstract: The web is full of guidance on a wide variety of tasks, from changing the oil in your car to baking an apple pie. However, as content is created independently, a single task could have thousands of corresponding procedural texts. This makes it difficult for users to view the bigger picture and understand the multiple ways the task could be accomplished. In this work we propose an unsupervised lear…

    Submitted 31 October, 2022; originally announced October 2022.

    Comments: 11 pages, 6 figures, Accepted to CIKM 2021

  15. arXiv:2210.13016  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY cs.GL

    Cards Against AI: Predicting Humor in a Fill-in-the-blank Party Game

    Authors: Dan Ofer, Dafna Shahaf

    Abstract: Humor is an inherently social phenomenon, with humorous utterances shaped by what is socially and culturally accepted. Understanding humor is an important NLP challenge, with many applications to human-computer interactions. In this work we explore humor in the context of Cards Against Humanity -- a party game where players complete fill-in-the-blank statements using cards that can be offensive or…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Conditionally accepted in EMNLP 2022 short findings. 5 pages

    Report number: Dan Ofer and Dafna Shahaf. 2022. Cards Against AI: Predicting Humor in a Fill-in-the-blank Party Game. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 5397–5403. Association for Computational Linguistics

    MSC Class: 68T01; 68T50

    ACM Class: I.2.7; I.2; K.4; J.4; J.5

    Journal ref: https://aclanthology.org/2022.findings-emnlp.394

  16. arXiv:2210.12197  [pdf, other]

    cs.CL cs.AI

    Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes

    Authors: Oren Sultan, Dafna Shahaf

    Abstract: Analogy-making gives rise to reasoning, abstraction, flexible categorization and counterfactual inference -- abilities lacking in even the best AI systems today. Much research has suggested that analogies are key to non-brittle systems that can adapt to new domains. Despite their importance, analogies received little attention in the NLP community, with most research focusing on simple word analog…

    Submitted 19 January, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 main conference (long paper)

  17. arXiv:2205.15476  [pdf, other]

    cs.HC

    Augmenting Scientific Creativity with an Analogical Search Engine

    Authors: Hyeonsu B. Kang, Xin Qian, Tom Hope, Dafna Shahaf, Joel Chan, Aniket Kittur

    Abstract: Analogies have been central to creative problem-solving throughout the history of science and technology. As the number of scientific papers continues to increase exponentially, there is a growing opportunity for finding diverse solutions to existing problems. However, realizing this potential requires the development of a means for searching through a large corpus that goes beyond surface matches…

    Submitted 30 May, 2022; originally announced May 2022.

  18. From Users to (Sense)Makers: On the Pivotal Role of Stigmergic Social Annotation in the Quest for Collective Sensemaking

    Authors: Ronen Tamari, Daniel Friedman, William Fischer, Lauren Hebert, Dafna Shahaf

    Abstract: The web has become a dominant epistemic environment, influencing people's beliefs at a global scale. However, online epistemic environments are increasingly polluted, impairing societies' ability to coordinate effectively in the face of global crises. We argue that centralized platforms are a main source of epistemic pollution, and that healthier environments require redesigning how we collectivel…

    Submitted 4 August, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Blue-sky ideas track of the 33rd ACM Conference on Hypertext and Social Media, Barcelona, 2022 (updated references)

  19. arXiv:2112.00086  [pdf, other]

    cs.CL cs.AI

    Dyna-bAbI: unlocking bAbI's potential with dynamic synthetic benchmarking

    Authors: Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson Liu, Reut Tsarfaty, Dafna Shahaf

    Abstract: While neural language models often perform surprisingly well on natural language understanding (NLU) tasks, their strengths and limitations remain poorly understood. Controlled synthetic tasks are thus an increasingly important resource for diagnosing model behavior. In this work we focus on story understanding, a core competency for NLU systems. However, the main synthetic resource for story unde…

    Submitted 30 November, 2021; originally announced December 2021.

    Comments: Code and data will be made available at project page: https://tiny.one/8wjxwd7z

  20. E-Commerce Dispute Resolution Prediction

    Authors: David Tsurel, Michael Doron, Alexander Nus, Arnon Dagan, Ido Guy, Dafna Shahaf

    Abstract: E-Commerce marketplaces support millions of daily transactions, and some disagreements between buyers and sellers are unavoidable. Resolving disputes in an accurate, fast, and fair manner is of great importance for maintaining a trustworthy platform. Simple cases can be automated, but intricate cases are not sufficiently addressed by hard-coded rules, and therefore most disputes are currently reso…

    Submitted 13 October, 2021; originally announced October 2021.

    Journal ref: CIKM'20: Proceedings of the 29th ACM International Conference on Information and Knowledge Management, Oct 2020, Pages 1465-1474

  21. arXiv:2106.04830  [pdf, other]

    cs.CL

    Catchphrase: Automatic Detection of Cultural References

    Authors: Nir Sweed, Dafna Shahaf

    Abstract: A snowclone is a customizable phrasal template that can be realized in multiple, instantly recognized variants. For example, "* is the new *" (Orange is the new black, 40 is the new 30). Snowclones are extensively used in social media. In this paper, we study snowclones originating from pop-culture quotes; our goal is to automatically detect cultural references in text. We introduce a new, public…

    Submitted 9 June, 2021; originally announced June 2021.

  22. arXiv:2106.03048  [pdf, other]

    cs.CL cs.AI

    How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

    Authors: Chen Shani, Nadav Borenstein, Dafna Shahaf

    Abstract: Humor is an important social phenomenon, serving complex social and psychological functions. However, despite being studied for millennia, humor is computationally not well understood, often considered an AI-complete problem. In this work, we introduce a novel setting in humor mining: automatically detecting funny and unusual scientific papers. We are inspired by the Ig Nobel prize, a satirical pri…

    Submitted 6 June, 2021; originally announced June 2021.

    Comments: To be published in the main conference of ACL-IJCNLP2021. Code and dataset can be found here: https://github.com/nadavborenstein/Iggy

  23. arXiv:2105.05571  [pdf, other]

    cs.HC cs.AI cs.CL cs.IR

    "Alexa, what do you do for fun?" Characterizing playful requests with virtual assistants

    Authors: Chen Shani, Alexander Libov, Sofia Tolmach, Liane Lewin-Eytan, Yoelle Maarek, Dafna Shahaf

    Abstract: Virtual assistants such as Amazon's Alexa, Apple's Siri, Google Home, and Microsoft's Cortana, are becoming ubiquitous in our daily lives and successfully help users in various daily tasks, such as making phone calls or playing music. Yet, they still struggle with playful utterances, which are not meant to be interpreted literally. Examples include jokes or absurd requests or questions such as, "A…

    Submitted 12 May, 2021; originally announced May 2021.

  24. arXiv:2102.09761  [pdf, other]

    cs.HC cs.AI cs.CL

    Scaling Creative Inspiration with Fine-Grained Functional Aspects of Ideas

    Authors: Tom Hope, Ronen Tamari, Hyeonsu Kang, Daniel Hershcovich, Joel Chan, Aniket Kittur, Dafna Shahaf

    Abstract: Large repositories of products, patents and scientific papers offer an opportunity for building systems that scour millions of ideas and help users discover inspirations. However, idea descriptions are typically in the form of unstructured text, lacking key structure that is required for supporting creative innovation interactions. Prior work has explored idea representations that were either limi…

    Submitted 17 February, 2022; v1 submitted 19 February, 2021; originally announced February 2021.

    Comments: To appear in CHI 2022

    Journal ref: CHI 2022

  25. arXiv:2005.00311  [pdf, other]

    cs.CL cs.LG

    Language (Re)modelling: Towards Embodied Language Understanding

    Authors: Ronen Tamari, Chen Shani, Tom Hope, Miriam R. L. Petruck, Omri Abend, Dafna Shahaf

    Abstract: While natural language understanding (NLU) is advancing rapidly, today's technology differs from human-like language understanding in fundamental ways, notably in its inferior efficiency, interpretability, and generalization. This work proposes an approach to representation and learning based on the tenets of embodied cognitive linguistics (ECL). According to ECL, natural language is inherently ex…

    Submitted 9 July, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL2020 Theme Track. Extended bibliography version

  26. arXiv:2003.04567  [pdf, other]

    cs.AI cs.CL cs.LG

    Ecological Semantics: Programming Environments for Situated Language Understanding

    Authors: Ronen Tamari, Gabriel Stanovsky, Dafna Shahaf, Reut Tsarfaty

    Abstract: Large-scale natural language understanding (NLU) systems have made impressive progress: they can be applied flexibly across a variety of tasks, and employ minimal structural assumptions. However, extensive empirical research has shown this to be a double-edged sword, coming at the cost of shallow understanding: inferior generalization, grounding and explainability. Grounded language learning appro…

    Submitted 24 May, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Camera ready for Bridging AI and Cognitive Science (BAICS) workshop at ICLR2020. For interactive demos, see https://eco-sem.github.io/

  27. arXiv:1811.04319  [pdf, other]

    cs.LG cs.CL stat.ML

    Playing by the Book: An Interactive Game Approach for Action Graph Extraction from Text

    Authors: Ronen Tamari, Hiroyuki Shindo, Dafna Shahaf, Yuji Matsumoto

    Abstract: Understanding procedural text requires tracking entities, actions and effects as the narrative unfolds. We focus on the challenging real-world problem of action-graph extraction from material science papers, where language is highly specialized and data annotation is expensive and scarce. We propose a novel approach, Text2Quest, where procedural text is interpreted as instructions for an interacti…

    Submitted 6 April, 2019; v1 submitted 10 November, 2018; originally announced November 2018.

    Comments: Accepted to NAACL 2019 ESSP workshop (https://scientific-knowledge.github.io/)

  28. arXiv:1712.06880  [pdf, other]

    cs.CL

    Analogy Mining for Specific Design Needs

    Authors: Karni Gilon, Felicia Y Ng, Joel Chan, Hila Lifshitz Assaf, Aniket Kittur, Dafna Shahaf

    Abstract: Finding analogical inspirations in distant domains is a powerful way of solving problems. However, as the number of inspirations that could be matched and the dimensions on which that matching could occur grow, it becomes challenging for designers to find inspirations relevant to their needs. Furthermore, designers are often interested in exploring specific aspects of a product -- for example, one…

    Submitted 19 December, 2017; originally announced December 2017.

  29. arXiv:1712.04828  [pdf, other]

    stat.ML cs.LG

    Ballpark Crowdsourcing: The Wisdom of Rough Group Comparisons

    Authors: Tom Hope, Dafna Shahaf

    Abstract: Crowdsourcing has become a popular method for collecting labeled training data. However, in many practical scenarios traditional labeling can be difficult for crowdworkers (for example, if the data is high-dimensional or unintuitive, or the labels are continuous). In this work, we develop a novel model for crowdsourcing that can complement standard practices by exploiting people's intuitions abo…

    Submitted 13 December, 2017; originally announced December 2017.

    Journal ref: WSDM 2018

  30. arXiv:1708.03074  [pdf, other]

    cs.NI cs.LG

    A Machine Learning Approach to Routing

    Authors: Asaf Valadarsky, Michael Schapira, Dafna Shahaf, Aviv Tamar

    Abstract: Can ideas and techniques from machine learning be leveraged to automatically generate "good" routing configurations? We investigate the power of data-driven routing protocols. Our results suggest that applying ideas and techniques from deep reinforcement learning to this context yields high performance, motivating further research along these lines.

    Submitted 11 November, 2017; v1 submitted 10 August, 2017; originally announced August 2017.

    ACM Class: C.2.2

  31. arXiv:1706.05585  [pdf, other]

    cs.CL cs.AI stat.ML

    Accelerating Innovation Through Analogy Mining

    Authors: Tom Hope, Joel Chan, Aniket Kittur, Dafna Shahaf

    Abstract: The availability of large idea repositories (e.g., the U.S. patent database) could significantly accelerate innovation and discovery by providing people with inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for either human or automated methods. Previous approaches include costly hand-c…

    Submitted 17 June, 2017; originally announced June 2017.

    Comments: KDD 2017

  32. arXiv:1612.03896  [pdf, other]

    cs.SI cs.IR

    Fun Facts: Automatic Trivia Fact Extraction from Wikipedia

    Authors: David Tsurel, Dan Pelleg, Ido Guy, Dafna Shahaf

    Abstract: A significant portion of web search queries directly refers to named entities. Search engines explore various ways to improve the user experience for such queries. We suggest augmenting search results with trivia facts about the searched entity. Trivia is widely played throughout the world, and was shown to increase users' engagement and retention. Most random facts are not suitable for th…

    Submitted 12 December, 2016; originally announced December 2016.

    Comments: To appear in Proceedings of tenth ACM International Conference on Web Search and Data Mining, WSDM 2017

  33. arXiv:1607.00034  [pdf, other]

    stat.ML cs.LG

    Ballpark Learning: Estimating Labels from Rough Group Comparisons

    Authors: Tom Hope, Dafna Shahaf

    Abstract: We are interested in estimating individual labels given only coarse, aggregated signal over the data points. In our setting, we receive sets ("bags") of unlabeled instances with constraints on label proportions. We relax the unrealistic assumption of known label proportions, made in previous work; instead, we assume only to have upper and lower bounds, and constraints on bag differences. We motiva…

    Submitted 30 June, 2016; originally announced July 2016.

    Comments: To appear in the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD) 2016