Showing 1–7 of 7 results for author: Sefid, A

  1. arXiv:2502.19842  [pdf, other]

    cs.CV

    CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation

    Authors: Reza Abbasi, Ali Nazari, Aminreza Sefid, Mohammadali Banayeeanzade, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah

    Abstract: Contrastive Language-Image Pre-training (CLIP) models excel in zero-shot classification, yet face challenges in complex multi-object scenarios. This study offers a comprehensive analysis of CLIP's limitations in these contexts using a specialized dataset, ComCO, designed to evaluate CLIP's encoders in diverse multi-object scenarios. Our findings reveal significant biases: the text encoder prioriti…

    Submitted 28 February, 2025; v1 submitted 27 February, 2025; originally announced February 2025.

    Comments: Accepted at CVPR 2025

  2. arXiv:2502.19828  [pdf, other]

    cs.CV

    Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study

    Authors: Reza Abbasi, Ali Nazari, Aminreza Sefid, Mohammadali Banayeeanzade, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah

    Abstract: Contrastive Language-Image Pre-training (CLIP) models have demonstrated remarkable performance in zero-shot classification tasks, yet their efficacy in handling complex multi-object scenarios remains challenging. This study presents a comprehensive analysis of CLIP's performance limitations in multi-object contexts through controlled experiments. We introduce two custom datasets, SimCO and CompCO,…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: Accepted at ECCV 2024 Workshop EVAL-FoMo

  3. arXiv:2201.08495  [pdf, other]

    cs.CL

    SciBERTSUM: Extractive Summarization for Scientific Documents

    Authors: Athar Sefid, C Lee Giles

    Abstract: The summarization literature focuses on the summarization of news articles. The news articles in the CNN-DailyMail are relatively short documents with about 30 sentences per document on average. We introduce SciBERTSUM, our summarization framework designed for the summarization of long documents like scientific papers with more than 500 sentences. SciBERTSUM extends BERTSUM to long documents by 1)…

    Submitted 20 January, 2022; originally announced January 2022.

  4. arXiv:2106.03246  [pdf, other]

    cs.CL cs.AI

    Extractive Research Slide Generation Using Windowed Labeling Ranking

    Authors: Athar Sefid, Jian Wu, Prasenjit Mitra, Lee Giles

    Abstract: Presentation slides describing the content of scientific and technical papers are an efficient and effective way to present that work. However, manually generating presentation slides is labor intensive. We propose a method to automatically generate slides for scientific papers based on a corpus of 5000 paper-slide pairs compiled from conference proceedings websites. The sentence labeling module o…

    Submitted 6 June, 2021; originally announced June 2021.

    Journal ref: NAACL/Proceedings of the Second Workshop on Scholarly Document Processing 2021

  5. arXiv:2008.11290  [pdf, other]

    cs.CL cs.IR cs.LG

    Extractive Summarizer for Scholarly Articles

    Authors: Athar Sefid, Clyde Lee Giles, Prasenjit Mitra

    Abstract: We introduce an extractive method that will summarize long scientific papers. Our model uses presentation slides provided by the authors of the papers as the gold summary standard to label the sentences. The sentences are ranked based on their novelty and their importance as estimated by deep neural networks. Our window-based extractive labeling of sentences results in the improvement of at least…

    Submitted 25 August, 2020; originally announced August 2020.

  6. arXiv:1906.08470  [pdf, other]

    cs.DL cs.IR

    Cleaning Noisy and Heterogeneous Metadata for Record Linking Across Scholarly Big Datasets

    Authors: Athar Sefid, Jian Wu, Allen C. Ge, Jing Zhao, Lu Liu, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles

    Abstract: Automatically extracted metadata from scholarly documents in PDF formats is usually noisy and heterogeneous, often containing incomplete fields and erroneous values. One common way of cleaning metadata is to use a bibliographic reference dataset. The challenge is to match records between corpora with high precision. The existing solution which is based on information retrieval and string similarit…

    Submitted 20 June, 2019; originally announced June 2019.

  7. arXiv:1709.09657  [pdf, other]

    cs.IR cs.DL

    Scaling Author Name Disambiguation with CNF Blocking

    Authors: Kunho Kim, Athar Sefid, C. Lee Giles

    Abstract: An author name disambiguation (AND) algorithm identifies a unique author entity record from all similar or same publication records in scholarly or similar databases. Typically, a clustering method is used that requires calculation of similarities between each possible record pair. However, the total number of pairs grows quadratically with the size of the author database making such clustering di…

    Submitted 27 September, 2017; originally announced September 2017.