Showing 1–7 of 7 results for author: Eden, L

  1. arXiv:2503.16416

    cs.AI cs.CL cs.LG

    Survey on Evaluation of LLM-based Agents

    Authors: Asaf Yehudai, Lilach Eden, Alan Li, Guy Uziel, Yilun Zhao, Roy Bar-Haim, Arman Cohan, Michal Shmueli-Scheuer

    Abstract: The emergence of LLM-based agents represents a paradigm shift in AI, enabling autonomous systems to plan, reason, use tools, and maintain memory while interacting with dynamic environments. This paper provides the first comprehensive survey of evaluation methodologies for these increasingly capable agents. We systematically analyze evaluation benchmarks and frameworks across four critical dimensio…

    Submitted 20 March, 2025; originally announced March 2025.

  2. arXiv:2412.09569

    cs.CL cs.AI cs.LG

    JuStRank: Benchmarking LLM Judges for System Ranking

    Authors: Ariel Gera, Odellia Boni, Yotam Perlitz, Roy Bar-Haim, Lilach Eden, Asaf Yehudai

    Abstract: Given the rapid progress of generative AI, there is a pressing need to systematically compare and choose between the numerous models and configurations available. The scale and versatility of such evaluations make the use of LLM-based judges a compelling solution for this challenge. Crucially, this approach requires first to validate the quality of the LLM judge itself. Previous work has focused o…

    Submitted 10 June, 2025; v1 submitted 12 December, 2024; originally announced December 2024.

    Comments: ACL 2025

  3. arXiv:2311.11301

    cs.CL

    CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies

    Authors: Arie Cattan, Tom Hope, Doug Downey, Roy Bar-Haim, Lilach Eden, Yoav Kantor, Ido Dagan

    Abstract: Various NLP tasks require a complex hierarchical structure over nodes, where each node is a cluster of items. Examples include generating entailment graphs, hierarchical cross-document coreference resolution, annotating event and subevent relations, etc. To enable efficient annotation of such hierarchical structures, we release CHAMP, an open source tool allowing to incrementally construct both cl…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023

  4. arXiv:2306.03853

    cs.CL

    From Key Points to Key Point Hierarchy: Structured and Expressive Opinion Summarization

    Authors: Arie Cattan, Lilach Eden, Yoav Kantor, Roy Bar-Haim

    Abstract: Key Point Analysis (KPA) has been recently proposed for deriving fine-grained insights from collections of textual comments. KPA extracts the main points in the data as a list of concise sentences or phrases, termed key points, and quantifies their prevalence. While key points are more expressive than word clouds and key phrases, making sense of a long, flat list of key points, which often express…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: ACL 2023

  5. arXiv:2106.06758

    cs.CL

    Every Bite Is an Experience: Key Point Analysis of Business Reviews

    Authors: Roy Bar-Haim, Lilach Eden, Yoav Kantor, Roni Friedman, Noam Slonim

    Abstract: Previous work on review summarization focused on measuring the sentiment toward the main aspects of the reviewed product or business, or on creating a textual summary. These approaches provide only a partial view of the data: aspect-based sentiment summaries lack sufficient explanation or justification for the aspect rating, while textual summaries do not quantify the significance of each element,…

    Submitted 12 June, 2021; originally announced June 2021.

    Comments: ACL-IJCNLP 2021

  6. arXiv:2010.05369

    cs.CL

    Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis

    Authors: Roy Bar-Haim, Yoav Kantor, Lilach Eden, Roni Friedman, Dan Lahav, Noam Slonim

    Abstract: When summarizing a collection of views, arguments or opinions on some topic, it is often desirable not only to extract the most salient points, but also to quantify their prevalence. Work on multi-document summarization has traditionally focused on creating textual summaries, which lack this quantitative aspect. Recent work has proposed to summarize arguments by mapping them to a small set of expe…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  7. arXiv:2005.01619

    cs.CL

    From Arguments to Key Points: Towards Automatic Argument Summarization

    Authors: Roy Bar-Haim, Lilach Eden, Roni Friedman, Yoav Kantor, Dan Lahav, Noam Slonim

    Abstract: Generating a concise summary from a large collection of arguments on a given topic is an intriguing yet understudied problem. We propose to represent such summaries as a small set of talking points, termed "key points", each scored according to its salience. We show, by analyzing a large dataset of crowd-contributed arguments, that a small number of key points per topic is typically sufficient for…

    Submitted 9 June, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020