
Showing 1–16 of 16 results for author: Maheshwari, H

Searching in archive cs.

  1. arXiv:2504.15629  [pdf, ps, other]

    cs.IR cs.CL

    CiteFix: Enhancing RAG Accuracy Through Post-Processing Citation Correction

    Authors: Harsh Maheshwari, Srikanth Tenneti, Alwarappan Nakkiran

    Abstract: Retrieval Augmented Generation (RAG) has emerged as a powerful application of Large Language Models (LLMs), revolutionizing information search and consumption. RAG systems combine traditional search capabilities with LLMs to generate comprehensive answers to user queries, ideally with accurate citations. However, in our experience of developing a RAG product, LLMs often struggle with source attrib…

    Submitted 11 June, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

  2. arXiv:2408.01452  [pdf, other]

    cs.CY cs.AI cs.LG

    Building a Domain-specific Guardrail Model in Production

    Authors: Mohammad Niknazar, Paul V Haley, Latha Ramanan, Sang T. Truong, Yedendra Shrinivasan, Ayan Kumar Bhowmick, Prasenjit Dey, Ashish Jagmohan, Hema Maheshwari, Shom Ponoth, Robert Smith, Aditya Vempaty, Nick Haber, Sanmi Koyejo, Sharad Sundararajan

    Abstract: Generative AI holds the promise of enabling a range of sought-after capabilities and revolutionizing workflows in various consumer and enterprise verticals. However, putting a model in production involves much more than just generating an output. It involves ensuring the model is reliable, safe, performant and also adheres to the policy of operation in a particular domain. Guardrails as a necessit…

    Submitted 24 July, 2024; originally announced August 2024.

  3. arXiv:2407.15734  [pdf, other]

    cs.AI cs.MA

    TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON

    Authors: John Chong Min Tan, Prince Saroj, Bharat Runwal, Hardik Maheshwari, Brian Lim Yi Sheng, Richard Cottrill, Alankrit Chona, Ambuj Kumar, Mehul Motani

    Abstract: TaskGen is an open-sourced agentic framework which uses an Agent to solve an arbitrary task by breaking it down into subtasks. Each subtask is mapped to an Equipped Function or another Agent to execute. In order to reduce verbosity (and hence token usage), TaskGen uses StrictJSON that ensures JSON output from the Large Language Model (LLM), along with additional features such as type checking an…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: 53 pages

  4. arXiv:2406.06556  [pdf, other]

    cs.CL cs.AI

    Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach

    Authors: Sambaran Bandyopadhyay, Himanshu Maheshwari, Anandhavelu Natarajan, Apoorv Saxena

    Abstract: Generating presentation slides from a long document with multimodal elements such as text and images is an important task. This is time consuming and needs domain expertise if done manually. Existing approaches for generating a rich presentation from a document are often semi-automatic or only put a flat summary into the slides ignoring the importance of a good narrative. In this paper, we address…

    Submitted 1 June, 2024; originally announced June 2024.

  5. arXiv:2405.13095  [pdf, other]

    cs.CL cs.AI

    Presentations are not always linear! GNN meets LLM for Document-to-Presentation Transformation with Attribution

    Authors: Himanshu Maheshwari, Sambaran Bandyopadhyay, Aparna Garimella, Anandhavelu Natarajan

    Abstract: Automatically generating a presentation from the text of a long document is a challenging and useful problem. In contrast to a flat summary, a presentation needs to have a better and non-linear narrative, i.e., the content of a slide can come from different and non-contiguous parts of the given document. However, it is difficult to incorporate such non-linear mapping of content to slides and ensur…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: This paper is under review in a conference

  6. arXiv:2402.00868  [pdf, other]

    cs.CV

    We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline

    Authors: Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Prithvijit Chattopadhyay, Judy Hoffman, Viraj Prabhu

    Abstract: There has been abundant work in unsupervised domain adaptation for semantic segmentation (DAS) seeking to adapt a model trained on images from a labeled source domain to an unlabeled target domain. While the vast majority of prior work has studied this as a frame-level Image-DAS problem, a few Video-DAS works have sought to additionally leverage the temporal signal present in adjacent frames. Howe…

    Submitted 27 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: TMLR 2024

  7. arXiv:2401.01637  [pdf, other]

    cs.CL

    Social Media Ready Caption Generation for Brands

    Authors: Himanshu Maheshwari, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivasan

    Abstract: Social media advertisements are key for brand marketing, aiming to attract consumers with captivating captions and pictures or logos. While previous research has focused on generating captions for general images, incorporating brand personalities into social media captioning remains unexplored. Brand personalities are shown to be affecting consumers' behaviours and social interactions and thus are…

    Submitted 3 January, 2024; originally announced January 2024.

  8. arXiv:2309.15004  [pdf, other]

    cs.CL cs.AI cs.LG

    Automating question generation from educational text

    Authors: Ayan Kumar Bhowmick, Ashish Jagmohan, Aditya Vempaty, Prasenjit Dey, Leigh Hall, Jeremy Hartman, Ravi Kokku, Hema Maheshwari

    Abstract: The use of question-based activities (QBAs) is wide-spread in education, traditionally forming an integral part of the learning and assessment process. In this paper, we design and evaluate an automated question generation tool for formative and summative assessment in schools. We present an expert survey of one hundred and four teachers, demonstrating the need for automated generation of QBAs, as…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted to AI-2023 (Forty-third SGAI International Conference on Artificial Intelligence) as a long paper, link: http://www.bcs-sgai.org/ai2023

  9. arXiv:2304.10756  [pdf, other]

    cs.CV cs.LG

    Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation

    Authors: Harsh Maheshwari, Yen-Cheng Liu, Zsolt Kira

    Abstract: Using multiple spatial modalities has been proven helpful in improving semantic segmentation performance. However, there are several real-world challenges that have yet to be addressed: (a) improving label efficiency and (b) enhancing robustness in realistic scenarios where modalities are missing at the test time. To address these challenges, we first propose a simple yet efficient multi-modal fus…

    Submitted 21 April, 2023; originally announced April 2023.

  10. arXiv:2211.13815  [pdf, ps, other]

    cs.CL

    Using Selective Masking as a Bridge between Pre-training and Fine-tuning

    Authors: Tanish Lad, Himanshu Maheshwari, Shreyas Kottukkal, Radhika Mamidi

    Abstract: Pre-training a language model and then fine-tuning it for downstream tasks has demonstrated state-of-the-art results for various NLP tasks. Pre-training is usually independent of the downstream task, and previous works have shown that this pre-training alone might not be sufficient to capture the task-specific nuances. We propose a way to tailor a pre-trained BERT model for the downstream task via…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: ENLSP Workshop, NeurIPS 2022

  11. arXiv:2207.08079  [pdf, other]

    cs.CV cs.LG

    Performance degradation of ImageNet trained models by simple image transformations

    Authors: Harsh Maheshwari

    Abstract: ImageNet trained PyTorch models are generally preferred as the off-the-shelf models for direct use or for initialisation in most computer vision tasks. In this paper, we simply test a representative set of these convolution and transformer based models under many simple image transformations like horizontal shifting, vertical shifting, scaling, rotation, presence of Gaussian noise, cutout, horizon…

    Submitted 17 July, 2022; originally announced July 2022.

  12. An Application to Generate Style Guided Compatible Outfit

    Authors: Debopriyo Banerjee, Harsh Maheshwari, Lucky Dhakad, Arnab Bhattacharya, Niloy Ganguly, Muthusamy Chelliah, Suyash Agarwal

    Abstract: Fashion recommendation has witnessed a phenomenal growth of research, particularly in the domains of shop-the-look, context-aware outfit creation, personalizing outfit creation etc. The majority of the work in this area focuses on better understanding of the notion of complementary relationship between lifestyle items. Quite recently, some works have realised that style plays a vital role in fashion, e…

    Submitted 2 May, 2022; originally announced May 2022.

  13. arXiv:2203.16161  [pdf, other]

    cs.IR cs.AI cs.CV cs.LG

    Recommendation of Compatible Outfits Conditioned on Style

    Authors: Debopriyo Banerjee, Lucky Dhakad, Harsh Maheshwari, Muthusamy Chelliah, Niloy Ganguly, Arnab Bhattacharya

    Abstract: Recommendation in the fashion domain has seen a recent surge in research in various areas, for example, shop-the-look, context-aware outfit creation, personalizing outfit creation, etc. The majority of state-of-the-art approaches in the domain of outfit recommendation aim to improve compatibility among items so as to produce high quality outfits. Some recent works have realized that style is an…

    Submitted 30 March, 2022; originally announced March 2022.

  14. arXiv:2106.10452  [pdf, other]

    cs.CV cs.LG

    MSN: Efficient Online Mask Selection Network for Video Instance Segmentation

    Authors: Vidit Goel, Jiachen Li, Shubhika Garg, Harsh Maheshwari, Humphrey Shi

    Abstract: In this work we present a novel solution for Video Instance Segmentation (VIS), that is, automatically generating instance level segmentation masks along with object class and tracking them in a video. Our method improves the masks from segmentation and propagation branches in an online manner using the Mask Selection Network (MSN), hence limiting the noise accumulation during mask tracking. We propo…

    Submitted 19 June, 2021; originally announced June 2021.

    Comments: 3rd Place Solution to the YouTube-VIS Challenge at CVPR 2021

  15. arXiv:2011.08575  [pdf, other]

    cs.LG cs.CY

    Audience Creation for Consumables -- Simple and Scalable Precision Merchandising for a Growing Marketplace

    Authors: Shreyas S, Harsh Maheshwari, Avijit Saha, Samik Datta, Shashank Jain, Disha Makhija, Anuj Nagpal, Sneha Shukla, Suyash S

    Abstract: Consumable categories, such as grocery and fast-moving consumer goods, are quintessential to the growth of e-commerce marketplaces in developing countries. In this work, we present the design and implementation of a precision merchandising system, which creates audience sets from over 10 million consumers and is deployed at Flipkart Supermart, one of the largest online grocery stores in India. We…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 10 pages

  16. arXiv:1911.02559  [pdf, other]

    cs.CV cs.LG

    SCL: Towards Accurate Domain Adaptive Object Detection via Gradient Detach Based Stacked Complementary Losses

    Authors: Zhiqiang Shen, Harsh Maheshwari, Weichen Yao, Marios Savvides

    Abstract: Unsupervised domain adaptive object detection aims to learn a robust detector in the domain shift circumstance, where the training (source) domain is label-rich with bounding box annotations, while the testing (target) domain is label-agnostic and the feature distributions between training and testing domains are dissimilar or even totally different. In this paper, we propose a gradient detach bas…

    Submitted 21 November, 2019; v1 submitted 6 November, 2019; originally announced November 2019.