Showing 1–6 of 6 results for author: Haydarov, K

  1. arXiv:2411.03769  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages

    Authors: Youssef Mohamed, Runjia Li, Ibrahim Said Ahmad, Kilichbek Haydarov, Philip Torr, Kenneth Ward Church, Mohamed Elhoseiny

    Abstract: Research in vision and language has made considerable progress thanks to benchmarks such as COCO. COCO captions focused on unambiguous facts in English; ArtEmis introduced subjective emotions and ArtELingo introduced some multilinguality (Chinese and Arabic). However we believe there should be more multilinguality. Hence, we present ArtELingo-28, a vision-language benchmark that spans…

    Submitted 6 November, 2024; originally announced November 2024.

    Comments: 9 pages, Accepted at EMNLP 24, for more details see www.artelingo.org

  2. arXiv:2308.16349  [pdf, other]

    cs.CL

    Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversations

    Authors: Kilichbek Haydarov, Xiaoqian Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, Mohamed Elhoseiny

    Abstract: We introduce Affective Visual Dialog, an emotion explanation and reasoning task as a testbed for research on understanding the formation of emotions in visually grounded conversations. The task involves three skills: (1) Dialog-based Question Answering (2) Dialog-based Emotion Prediction and (3) Affective emotion explanation generation based on the dialog. Our key contribution is the collection of…

    Submitted 27 August, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

  3. arXiv:2304.04227  [pdf, other]

    cs.CV cs.AI

    Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions

    Authors: Jun Chen, Deyao Zhu, Kilichbek Haydarov, Xiang Li, Mohamed Elhoseiny

    Abstract: Video captioning aims to convey dynamic scenes from videos using natural language, facilitating the understanding of spatiotemporal information within our environment. Although there have been recent advances, generating detailed and enriched video descriptions continues to be a substantial challenge. In this work, we introduce Video ChatCaptioner, an innovative approach for creating more comprehe…

    Submitted 24 May, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

  4. arXiv:2303.06594  [pdf, other]

    cs.CV cs.AI cs.LG

    ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions

    Authors: Deyao Zhu, Jun Chen, Kilichbek Haydarov, Xiaoqian Shen, Wenxuan Zhang, Mohamed Elhoseiny

    Abstract: Asking insightful questions is crucial for acquiring knowledge and expanding our understanding of the world. However, the importance of questioning has been largely overlooked in AI research, where models have been primarily developed to answer questions. With the recent advancements of large language models (LLMs) like ChatGPT, we discover their capability to ask high-quality questions when provi…

    Submitted 12 March, 2023; originally announced March 2023.

  5. arXiv:2204.07660  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection

    Authors: Youssef Mohamed, Faizan Farooq Khan, Kilichbek Haydarov, Mohamed Elhoseiny

    Abstract: Datasets that capture the connection between vision, language, and affection are limited, causing a lack of understanding of the emotional aspect of human intelligence. As a step in this direction, the ArtEmis dataset was recently introduced as a large-scale dataset of emotional reactions to images along with language explanations of these chosen emotions. We observed a significant emotional bias…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: 8 pages, Accepted at CVPR 22, for more details see https://www.artemisdataset-v2.org

  6. arXiv:2101.07396  [pdf, other]

    cs.CV cs.CL

    ArtEmis: Affective Language for Visual Art

    Authors: Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas Guibas

    Abstract: We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask the annotators to indicat…

    Submitted 18 January, 2021; originally announced January 2021.

    Comments: https://artemisdataset.org