Showing 1–7 of 7 results for author: Bhola, I

Searching in archive cs.
  1. arXiv:2508.15617  [pdf, ps, other]

    cs.CL cs.AI

    Trained Miniatures: Low cost, High Efficacy SLMs for Sales & Marketing

    Authors: Ishaan Bhola, Mukunda NS, Sravanth Kurmala, Harsh Nandwani, Arihant Jain

    Abstract: Large language models (LLMs) excel in text generation; however, these creative elements require heavy computation and are accompanied by a steep cost. Especially for targeted applications such as sales and marketing outreach, these costs are far from feasible. This paper introduces the concept of "Trained Miniatures" - Small Language Models (SLMs) fine-tuned for specific, high-value applications, g…

    Submitted 21 August, 2025; originally announced August 2025.

  2. arXiv:2409.11190  [pdf]

    cs.SE cs.AI

    SuperCoder2.0: Technical Report on Exploring the feasibility of LLMs as Autonomous Programmer

    Authors: Anmol Gautam, Kishore Kumar, Adarsh Jha, Mukunda NS, Ishaan Bhola

    Abstract: We present SuperCoder2.0, an advanced autonomous system designed to enhance software development through artificial intelligence. The system combines an AI-native development approach with intelligent agents to enable fully autonomous coding. Key focus areas include a retry mechanism with error output traceback, comprehensive code rewriting and replacement using Abstract Syntax Tree (ast) parsing…

    Submitted 27 October, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

  3. arXiv:2409.03793  [pdf]

    cs.CR cs.AI

    Safeguarding AI Agents: Developing and Analyzing Safety Architectures

    Authors: Ishaan Domkundwar, Mukunda N S, Ishaan Bhola, Riddhik Kochhar

    Abstract: AI agents, specifically powered by large language models, have demonstrated exceptional capabilities in various applications where precision and efficacy are necessary. However, these agents come with inherent risks, including the potential for unsafe or biased actions, vulnerability to adversarial attacks, lack of transparency, and tendency to generate hallucinations. As AI agents become more pre…

    Submitted 28 February, 2025; v1 submitted 3 September, 2024; originally announced September 2024.

  4. arXiv:2405.15341  [pdf, other]

    cs.AI cs.CV

    V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

    Authors: Abdur Rahman, Rajat Chawla, Muskaan Kumar, Arkajit Datta, Adarsh Jha, Mukunda NS, Ishaan Bhola

    Abstract: In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting th…

    Submitted 21 July, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 12 pages, 5 figures, 3 tables

  5. arXiv:2404.16048  [pdf]

    cs.HC cs.AI

    GUIDE: Graphical User Interface Data for Execution

    Authors: Rajat Chawla, Adarsh Jha, Muskaan Kumar, Mukunda NS, Ishaan Bhola

    Abstract: In this paper, we introduce GUIDE, a novel dataset tailored for the advancement of Multimodal Large Language Model (MLLM) applications, particularly focusing on Robotic Process Automation (RPA) use cases. Our dataset encompasses diverse data from various websites including Apollo (62.67%), Gmail (3.43%), Calendar (10.98%) and Canva (22.92%). Each data entry includes an image, a task description, t…

    Submitted 27 October, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: 11 pages, 8 figures, 3 Tables and 1 Algorithm

  6. arXiv:2403.10171  [pdf]

    cs.AI cs.CV

    AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation

    Authors: Arkajit Datta, Tushar Verma, Rajat Chawla, Mukunda N. S, Ishaan Bhola

    Abstract: In recent advancements within the domain of Large Language Models (LLMs), there has been a notable emergence of agents capable of addressing Robotic Process Automation (RPA) challenges through enhanced cognitive capabilities and sophisticated reasoning. This development heralds a new era of scalability and human-like adaptability in goal attainment. In this context, we introduce AUTONODE (Autonomo…

    Submitted 27 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted in MIPR-2024

  7. arXiv:2403.08773  [pdf]

    cs.CV cs.AI cs.CL cs.MM

    Veagle: Advancements in Multimodal Representation Learning

    Authors: Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola

    Abstract: Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image cap…

    Submitted 27 October, 2024; v1 submitted 18 January, 2024; originally announced March 2024.