Showing 1–5 of 5 results for author: Grinberg, N

  1. arXiv:2503.06204

    cs.CL cs.AI cs.LG

    CUPCase: Clinically Uncommon Patient Cases and Diagnoses Dataset

    Authors: Oriel Perets, Ofir Ben Shoham, Nir Grinberg, Nadav Rappoport

    Abstract: Medical benchmark datasets significantly contribute to developing Large Language Models (LLMs) for medical knowledge extraction, diagnosis, summarization, and other uses. Yet, current benchmarks are mainly derived from exam questions given to medical students or cases described in the medical literature, lacking the complexity of real-world patient cases that deviate from classic textbook abstract…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: Accepted to AAAI 2025

  2. arXiv:2501.09035

    cs.SI cs.CY

    DomainDemo: a dataset of domain-sharing activities among different demographic groups on Twitter

    Authors: Kai-Cheng Yang, Pranav Goel, Alexi Quintana-Mathé, Luke Horgan, Stefan D. McCabe, Nir Grinberg, Kenneth Joseph, David Lazer

    Abstract: Social media play a pivotal role in disseminating web content, particularly during elections, yet our understanding of the association between demographic factors and political discourse online remains limited. Here, we introduce a unique dataset, DomainDemo, linking domains shared on Twitter (X) with the demographic characteristics of associated users, including age, gender, race, political affil…

    Submitted 14 January, 2025; originally announced January 2025.

    Comments: 19 pages, 1 figure

  3. arXiv:2404.13613

    cs.CL cs.LG

    The Branch Not Taken: Predicting Branching in Online Conversations

    Authors: Shai Meital, Lior Rokach, Roman Vainshtein, Nir Grinberg

    Abstract: Multi-participant discussions tend to unfold in a tree structure rather than a chain structure. Branching may occur for multiple reasons -- from the asynchronous nature of online platforms to a conscious decision by an interlocutor to disengage with part of the conversation. Predicting branching and understanding the reasons for creating new branches is important for many downstream tasks such as…

    Submitted 21 April, 2024; originally announced April 2024.

  4. arXiv:2401.02001

    cs.SI

    Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations

    Authors: Daniel Matter, Miriam Schirmer, Nir Grinberg, Jürgen Pfeffer

    Abstract: This study investigates the prevalence of violent language on incels.is. It evaluates GPT models (GPT-3.5 and GPT-4) for content analysis in social sciences, focusing on the impact of varying prompts and batch sizes on coding quality for the detection of violent speech. We scraped over 6.9M posts from incels.is and categorized a random sample into non-violent, explicitly violent, and implicitly vi…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 10 pages, 2 figures, 3 tables

  5. Multilingual Detection of Personal Employment Status on Twitter

    Authors: Manuel Tonneau, Dhaval Adjodah, João Palotti, Nir Grinberg, Samuel Fraiberger

    Abstract: Detecting disclosures of individuals' employment status on social media can provide valuable information to match job seekers with suitable vacancies, offer social protection, or measure labor market flows. However, identifying such personal disclosures is a challenging task due to their rarity in a sea of social media content and the variety of linguistic forms used to describe them. Here, we exa…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: ACL 2022 main conference. Data and models available at https://github.com/manueltonneau/twitter-unemployment