Showing 1–50 of 62 results for author: Singhal, S

Searching in archive cs.
  1. arXiv:2506.12683  [pdf, ps, other]

    cs.CV q-bio.QM

    Evaluating Cell Type Inference in Vision Language Models Under Varying Visual Context

    Authors: Samarth Singhal, Sandeep Singhal

    Abstract: Vision-Language Models (VLMs) have rapidly advanced alongside Large Language Models (LLMs). This study evaluates the capabilities of prominent generative VLMs, such as GPT-4.1 and Gemini 2.5 Pro, accessed via APIs, for histopathology image classification tasks, including cell typing. Using diverse datasets from public and private sources, we apply zero-shot and one-shot prompting methods to assess…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2506.09661  [pdf, ps, other]

    eess.IV cs.CV q-bio.TO

    A Cytology Dataset for Early Detection of Oral Squamous Cell Carcinoma

    Authors: Garima Jain, Sanghamitra Pati, Mona Duggal, Amit Sethi, Abhijeet Patil, Gururaj Malekar, Nilesh Kowe, Jitender Kumar, Jatin Kashyap, Divyajeet Rout, Deepali, Hitesh, Nishi Halduniya, Sharat Kumar, Heena Tabassum, Rupinder Singh Dhaliwal, Sucheta Devi Khuraijam, Sushma Khuraijam, Sharmila Laishram, Simmi Kharb, Sunita Singh, K. Swaminadtan, Ranjana Solanki, Deepika Hemranjani, Shashank Nath Singh , et al. (12 additional authors not shown)

    Abstract: Oral squamous cell carcinoma (OSCC) is a major global health burden, particularly in several regions across Asia, Africa, and South America, where it accounts for a significant proportion of cancer cases. Early detection dramatically improves outcomes, with stage I cancers achieving up to 90 percent survival. However, traditional diagnosis based on histopathology has limited accessibility in low-res…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 7 pages, 2 figures

  3. arXiv:2505.00949  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Llama-Nemotron: Efficient Reasoning Models

    Authors: Akhiad Bercovich, Itay Levy, Izik Golan, Mohammad Dabbah, Ran El-Yaniv, Omri Puny, Ido Galil, Zach Moshe, Tomer Ronen, Najeeb Nabwani, Ido Shahaf, Oren Tropp, Ehud Karpas, Ran Zilberstein, Jiaqi Zeng, Soumye Singhal, Alexander Bukharin, Yian Zhang, Tugrul Konuk, Gerald Shen, Ameya Sunil Mahabaleshwarkar, Bilal Kartal, Yoshi Suhara, Olivier Delalleau, Zijia Chen , et al. (111 additional authors not shown)

    Abstract: We introduce the Llama-Nemotron series of models, an open family of heterogeneous reasoning models that deliver exceptional reasoning capabilities, inference efficiency, and an open license for enterprise use. The family comes in three sizes -- Nano (8B), Super (49B), and Ultra (253B) -- and performs competitively with state-of-the-art reasoning models such as DeepSeek-R1 while offering superior i…

    Submitted 30 June, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

  4. arXiv:2504.10810  [pdf, other]

    cs.CV cs.AI

    PatrolVision: Automated License Plate Recognition in the wild

    Authors: Anmol Singhal, Navya Singhal

    Abstract: Adoption of AI driven techniques in public services remains low due to challenges related to accuracy and speed of information at population scale. Computer vision techniques for traffic monitoring have not gained much popularity despite their relative strength in areas such as autonomous driving. Despite large number of academic methods for Automatic License Plate Recognition (ALPR) systems, very…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted in IEEE Southeast Con 2025. To be published in IEEEXplore

  5. arXiv:2504.06141  [pdf, other]

    cs.LG

    Adversarial Training of Reward Models

    Authors: Alexander Bukharin, Haifeng Qian, Shengyang Sun, Adithya Renduchintala, Soumye Singhal, Zhilin Wang, Oleksii Kuchaiev, Olivier Delalleau, Tuo Zhao

    Abstract: Reward modeling has emerged as a promising approach for the scalable alignment of language models. However, contemporary reward models (RMs) often lack robustness, awarding high rewards to low-quality, out-of-distribution (OOD) samples. This can lead to reward hacking, where policies exploit unintended shortcuts to maximize rewards, undermining alignment. To address this challenge, we introduce Ad…

    Submitted 11 April, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

    Comments: 16 pages, 7 figures

  6. AutoComp: Automated Data Compaction for Log-Structured Tables in Data Lakes

    Authors: Anja Gruenheid, Jesús Camacho-Rodríguez, Carlo Curino, Raghu Ramakrishnan, Stanislav Pak, Sumedh Sakdeo, Lenisha Gandhi, Sandeep K. Singhal, Pooja Nilangekar, Daniel J. Abadi

    Abstract: The proliferation of small files in data lakes poses significant challenges, including degraded query performance, increased storage costs, and scalability bottlenecks in distributed storage systems. Log-structured table formats (LSTs) such as Delta Lake, Apache Iceberg, and Apache Hudi exacerbate this issue due to their append-only write patterns and metadata-intensive operations. While compactio…

    Submitted 5 April, 2025; originally announced April 2025.

    Journal ref: ACM SIGMOD 2025

  7. arXiv:2504.03624  [pdf, other]

    cs.CL cs.AI cs.LG

    Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

    Authors: NVIDIA, :, Aaron Blakeman, Aarti Basant, Abhinav Khattar, Adithya Renduchintala, Akhiad Bercovich, Aleksander Ficek, Alexis Bjorlin, Ali Taghibakhshi, Amala Sanjay Deshmukh, Ameya Sunil Mahabaleshwarkar, Andrew Tao, Anna Shors, Ashwath Aithal, Ashwin Poojary, Ayush Dattagupta, Balaram Buddharaju, Bobby Chen, Boris Ginsburg, Boxin Wang, Brandon Norick, Brian Butterfield, Bryan Catanzaro, Carlo del Mundo , et al. (176 additional authors not shown)

    Abstract: As inference-time scaling becomes critical for enhanced reasoning capabilities, it is increasingly becoming important to build models that are efficient to infer. We introduce Nemotron-H, a family of 8B and 56B/47B hybrid Mamba-Transformer models designed to reduce inference cost for a given accuracy level. To achieve this goal, we replace the majority of self-attention layers in the common Transf…

    Submitted 15 April, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

  8. arXiv:2503.21165  [pdf, other]

    eess.SY cs.AR

    Extending Silicon Lifetime: A Review of Design Techniques for Reliable Integrated Circuits

    Authors: Shaik Jani Babu, Fan Hu, Linyu Zhu, Sonal Singhal, Xinfei Guo

    Abstract: Reliability has become an increasing concern in modern computing. Integrated circuits (ICs) are the backbone of modern computing devices across industries, including artificial intelligence (AI), consumer electronics, healthcare, automotive, industrial, and aerospace. Moore's Law has driven the semiconductor IC industry toward smaller dimensions, improved performance, and greater energy efficiency.…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: This work is under review by ACM

  9. arXiv:2503.16107  [pdf, other]

    cs.LG eess.SY

    Learn to Bid as a Price-Maker Wind Power Producer

    Authors: Shobhit Singhal, Marta Fochesato, Liviu Aolaritei, Florian Dörfler

    Abstract: Wind power producers (WPPs) participating in short-term power markets face significant imbalance costs due to their non-dispatchable and variable production. While some WPPs have a large enough market share to influence prices with their bidding decisions, existing optimal bidding methods rarely account for this aspect. Price-maker approaches typically model bidding as a bilevel optimization probl…

    Submitted 20 March, 2025; originally announced March 2025.

  10. arXiv:2503.03862  [pdf, other]

    cs.CL cs.AI

    Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

    Authors: Emmy Liu, Amanda Bertsch, Lintang Sutawika, Lindia Tjuatja, Patrick Fernandes, Lara Marinov, Michael Chen, Shreya Singhal, Carolin Lawrence, Aditi Raghunathan, Kiril Gashteovski, Graham Neubig

    Abstract: Improvements in language model capabilities are often attributed to increasing model size or training data, but in some cases smaller models trained on curated data or with different architectural decisions can outperform larger ones trained on more tokens. What accounts for this? To quantify the impact of these design choices, we meta-analyze 92 open-source pretrained models across a wide array o…

    Submitted 25 May, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

  11. arXiv:2503.01743  [pdf, other]

    cs.CL cs.AI cs.LG

    Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

    Authors: Microsoft, :, Abdelrahman Abouelenin, Atabak Ashfaq, Adam Atkinson, Hany Awadalla, Nguyen Bach, Jianmin Bao, Alon Benhaim, Martin Cai, Vishrav Chaudhary, Congcong Chen, Dong Chen, Dongdong Chen, Junkun Chen, Weizhu Chen, Yen-Chun Chen, Yi-ling Chen, Qi Dai, Xiyang Dai, Ruchao Fan, Mei Gao, Min Gao, Amit Garg, Abhishek Goswami , et al. (51 additional authors not shown)

    Abstract: We introduce Phi-4-Mini and Phi-4-Multimodal, compact yet highly capable language and multimodal models. Phi-4-Mini is a 3.8-billion-parameter language model trained on high-quality web and synthetic data, significantly outperforming recent open-source models of similar size and matching the performance of models twice its size on math and coding tasks requiring complex reasoning. This achievement…

    Submitted 7 March, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: 39 pages

  12. arXiv:2502.00203  [pdf, other]

    cs.LG cs.CL

    Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment

    Authors: Shengyang Sun, Yian Zhang, Alexander Bukharin, David Mosallanezhad, Jiaqi Zeng, Soumye Singhal, Gerald Shen, Adithya Renduchintala, Tugrul Konuk, Yi Dong, Zhilin Wang, Dmitry Chichkov, Olivier Delalleau, Oleksii Kuchaiev

    Abstract: The rapid development of large language model (LLM) alignment algorithms has resulted in a complex and fragmented landscape, with limited clarity on the effectiveness of different methods and their inter-connections. This paper introduces Reward-Aware Preference Optimization (RPO), a mathematical framework that unifies popular preference optimization techniques in LLM alignment, including DPO, IPO…

    Submitted 7 February, 2025; v1 submitted 31 January, 2025; originally announced February 2025.

    Comments: 8 pages, 4 figures; update author names

  13. arXiv:2501.15348  [pdf, other]

    cs.LG cs.DC

    ReInc: Scaling Training of Dynamic Graph Neural Networks

    Authors: Mingyu Guan, Saumia Singhal, Taesoo Kim, Anand Padmanabha Iyer

    Abstract: Dynamic Graph Neural Networks (DGNNs) have gained widespread attention due to their applicability in diverse domains such as traffic network prediction, epidemiological forecasting, and social network analysis. In this paper, we present ReInc, a system designed to enable efficient and scalable training of DGNNs on large-scale graphs. ReInc introduces key innovations that capitalize on the unique c…

    Submitted 25 January, 2025; originally announced January 2025.

  14. arXiv:2412.20838  [pdf, other]

    cs.CV cs.AI cs.LG

    Dual-Space Augmented Intrinsic-LoRA for Wind Turbine Segmentation

    Authors: Shubh Singhal, Raül Pérez-Gonzalo, Andreas Espersen, Antonio Agudo

    Abstract: Accurate segmentation of wind turbine blade (WTB) images is critical for effective assessments, as it directly influences the performance of automated damage detection systems. Despite advancements in large universal vision models, these models often underperform in domain-specific tasks like WTB segmentation. To address this, we extend Intrinsic LoRA for image segmentation, and propose a novel du…

    Submitted 30 December, 2024; originally announced December 2024.

    Comments: Authors Shubh Singhal and Raül Pérez-Gonzalo contributed equally to this work. Accepted to ICASSP 2025

  15. arXiv:2412.17947  [pdf, other]

    cs.CL

    IITR-CIOL@NLU of Devanagari Script Languages 2025: Multilingual Hate Speech Detection and Target Identification in Devanagari-Scripted Languages

    Authors: Siddhant Gupta, Siddh Singhal, Azmine Toushik Wasi

    Abstract: This work focuses on two subtasks related to hate speech detection and target identification in Devanagari-scripted languages, specifically Hindi, Marathi, Nepali, Bhojpuri, and Sanskrit. Subtask B involves detecting hate speech in online text, while Subtask C requires identifying the specific targets of hate speech, such as individuals, organizations, or communities. We propose the MultilingualRo…

    Submitted 28 December, 2024; v1 submitted 23 December, 2024; originally announced December 2024.

    Comments: Accepted to CHiPSAL Workshop at COLING 2025

  16. arXiv:2410.01637  [pdf, other]

    cs.CL cs.LG

    On The Adaptation of Unlimiformer for Decoder-Only Transformers

    Authors: Kian Ahrabian, Alon Benhaim, Barun Patra, Jay Pujara, Saksham Singhal, Xia Song

    Abstract: One of the prominent issues stifling the current generation of large language models is their limited context length. Recent proprietary models such as GPT-4 and Claude 2 have introduced longer context lengths, 8k/32k and 100k, respectively; however, despite the efforts in the community, most common models, such as LLama-2, have a context length of 4k or less. Unlimiformer (Bertsch et al., 2023) i…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 8 pages, 6 figures

  17. arXiv:2403.03185  [pdf, other]

    cs.LG cs.AI

    Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking

    Authors: Cassidy Laidlaw, Shivam Singhal, Anca Dragan

    Abstract: Because it is difficult to precisely specify complex objectives, reinforcement learning policies are often optimized using proxy reward functions that only approximate the true goal. However, optimizing proxy rewards frequently leads to reward hacking: the optimized reward function ceases to be a good proxy and the resulting policy performs poorly with respect to the unspecified true reward. Princ…

    Submitted 13 March, 2025; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Spotlight at ICLR 2025

  18. arXiv:2311.14948  [pdf, other]

    cs.LG cs.AI cs.CV

    Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective

    Authors: Sahil Verma, Gantavya Bhatt, Avi Schwarzschild, Soumye Singhal, Arnav Mohanty Das, Chirag Shah, John P Dickerson, Pin-Yu Chen, Jeff Bilmes

    Abstract: Despite the advanced capabilities of contemporary machine learning (ML) models, they remain vulnerable to adversarial and backdoor attacks. This vulnerability is particularly concerning in real-world deployments, where compromised models may exhibit unpredictable behavior in critical scenarios. Such risks are heightened by the prevalent practice of collecting massive, internet-sourced datasets for…

    Submitted 10 January, 2025; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted at TMLR (https://openreview.net/forum?id=Conma3qnaT)

  19. arXiv:2311.04603  [pdf, other]

    cs.GT

    Navigating Resource Conflicts: Co-opetition and Fairness

    Authors: Shiksha Singhal

    Abstract: In today's dynamic and interconnected world, resource constraints pose significant challenges across various domains, ranging from networks, logistics and manufacturing to project management and optimization, etc. Resource-constrained problems (RCPs) represent a class of complex computational problems that require efficient allocation and utilization of limited resources to achieve optimal outcome…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: PhD thesis

  20. arXiv:2310.01780  [pdf, other]

    cs.IT

    Social Optimal Freshness in Multi-Source, Multi-Channel Systems via MDP

    Authors: Shiksha Singhal, Veeraruna Kavitha, Vidya Shankar

    Abstract: Many systems necessitate frequent and consistent updates of a specific information. Often this information is updated regularly, where an old packet becomes completely obsolete in the presence of a new packet. In this context, we consider a system with multiple sources, each equipped with a storage buffer of size one, communicating to a common destination via d orthogonal channels. In each slot, t…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 8 pages, 9 figures

  21. arXiv:2304.12902  [pdf, other]

    cs.GT

    On the ubiquity of duopolies in constant sum congestion games

    Authors: Shiksha Singhal, Veeraruna Kavitha, Jayakrishnan Nair

    Abstract: We analyse a coalition formation game between strategic service providers of a congestible service. The key novelty of our formulation is that it is a constant sum game, i.e., the total payoff across all service providers (or coalitions of providers) is fixed, and dictated by the size of the market. The game thus captures the tension between resource pooling (to benefit from the resulting statisti…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: arXiv admin note: text overlap with arXiv:2109.12840

  22. arXiv:2304.03518  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    SSS at SemEval-2023 Task 10: Explainable Detection of Online Sexism using Majority Voted Fine-Tuned Transformers

    Authors: Sriya Rallabandi, Sanchit Singhal, Pratinav Seth

    Abstract: This paper describes our submission to Task 10 at SemEval 2023-Explainable Detection of Online Sexism (EDOS), divided into three subtasks. The recent rise in social media platforms has seen an increase in disproportionate levels of sexism experienced by women on social media platforms. This has made detecting and explaining online sexist content more important than ever to make social media safer…

    Submitted 23 April, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: Accepted at The 17th International Workshop on Semantic Evaluation, ACL 2023

  23. CoReFusion: Contrastive Regularized Fusion for Guided Thermal Super-Resolution

    Authors: Aditya Kasliwal, Pratinav Seth, Sriya Rallabandi, Sanchit Singhal

    Abstract: Thermal imaging has numerous advantages over regular visible-range imaging since it performs well in low-light circumstances. Super-Resolution approaches can broaden their usefulness by replicating accurate high-resolution thermal pictures using measurements from low-cost, low-resolution thermal sensors. Because of the spectral range mismatch between the images, Guided Super-Resolution of thermal…

    Submitted 24 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: Accepted at 19th IEEE Workshop on Perception Beyond the Visible Spectrum, CVPR 2023

  24. arXiv:2302.14045  [pdf, other]

    cs.CL cs.CV

    Language Is Not All You Need: Aligning Perception with Language Models

    Authors: Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei

    Abstract: A big convergence of language, multimodal perception, action, and world modeling is a key step toward artificial general intelligence. In this work, we introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot). Specifically, we train Kosmos-1 from scratch on web-scale multimodal co…

    Submitted 1 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  25. arXiv:2211.14851  [pdf, other]

    cs.CV cs.LG

    Performance evaluation of deep segmentation models for Contrails detection

    Authors: Akshat Bhandari, Sriya Rallabandi, Sanchit Singhal, Aditya Kasliwal, Pratinav Seth

    Abstract: Contrails, short for condensation trails, are line-shaped ice clouds produced by aircraft engine exhaust when they fly through cold and humid air. They generate a greenhouse effect by absorbing or directing back to Earth approximately 33% of emitted outgoing longwave radiation. They account for over half of the climate change resulting from aviation activities. Avoiding contrails and adjusting fli…

    Submitted 4 November, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: Accepted to Tackling Climate Change with Machine Learning: workshop at NeurIPS 2022

  26. arXiv:2211.09061  [pdf, other]

    cs.LG

    Squeeze flow of micro-droplets: convolutional neural network with trainable and tunable refinement

    Authors: Aryan Mehboudi, Shrawan Singhal, S. V. Sreenivasan

    Abstract: We propose a platform based on neural networks to solve the image-to-image translation problem in the context of squeeze flow of micro-droplets. In the first part of this paper, we present the governing partial differential equations to lay out the underlying physics of the problem. We also discuss our developed Python package, sqflow, which can potentially serve as free, flexible, and scalable st…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 27 pages, 18 figures

    MSC Class: 68T07; 68T10; 68T20; 68P25; 94A08; ACM Class: I.2.6; I.2.10; I.4.2; I.4.6; I.4.8; I.4.9; I.4.10; I.5.1; I.5.2; I.5.3; I.5.4; I.6.5; J.2

  27. arXiv:2210.14867  [pdf, other]

    cs.CL cs.LG

    Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning

    Authors: Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Xia Song

    Abstract: In this paper, we elaborate upon recipes for building multilingual representation models that are not only competitive with existing state-of-the-art models but are also more parameter efficient, thereby promoting better adoption in resource-constrained scenarios and practical applications. We show that going beyond English-centric bitexts, coupled with a novel sampling strategy aimed at reducing…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Work in progress

  28. arXiv:2210.06423  [pdf, other]

    cs.LG cs.CL cs.CV

    Foundation Transformers

    Authors: Hongyu Wang, Shuming Ma, Shaohan Huang, Li Dong, Wenhui Wang, Zhiliang Peng, Yu Wu, Payal Bajaj, Saksham Singhal, Alon Benhaim, Barun Patra, Zhun Liu, Vishrav Chaudhary, Xia Song, Furu Wei

    Abstract: A big convergence of model architectures across language, vision, speech, and multimodal is emerging. However, under the same name "Transformers", the above areas use different implementations for better performance, e.g., Post-LayerNorm for BERT, and Pre-LayerNorm for GPT and vision Transformers. We call for the development of Foundation Transformer for true general-purpose modeling, which serves…

    Submitted 19 October, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Work in progress

  29. arXiv:2209.08743  [pdf, other]

    cs.DC cs.DB

    DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory (Extended Version)

    Authors: Sekwon Lee, Soujanya Ponnapalli, Sharad Singhal, Marcos K. Aguilera, Kimberly Keeton, Vijay Chidambaram

    Abstract: We present Dinomo, a novel key-value store for disaggregated persistent memory (DPM). Dinomo is the first key-value store for DPM that simultaneously achieves high common-case performance, scalability, and lightweight online reconfiguration. We observe that previously proposed key-value stores for DPM had architectural limitations that prevent them from achieving all three goals simultaneously. Di…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: This is an extended version of the full paper to appear in PVLDB 15.13 (VLDB 2023)

  30. arXiv:2208.10442  [pdf, other]

    cs.CV cs.CL

    Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks

    Authors: Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, Furu Wei

    Abstract: A big convergence of language, vision, and multimodal pretraining is emerging. In this work, we introduce a general-purpose multimodal foundation model BEiT-3, which achieves state-of-the-art transfer performance on both vision and vision-language tasks. Specifically, we advance the big convergence from three aspects: backbone architecture, pretraining task, and model scaling up. We introduce Mult…

    Submitted 30 August, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: 18 pages

  31. arXiv:2205.10152  [pdf]

    cs.ET

    Investigating the impact of BTI, HCI and time-zero variability on neuromorphic spike event generation circuits

    Authors: Shaik Jani Babu, Rohit Singh, Siona Menezes Picardo, Nilesh Goel, Sonal Singhal

    Abstract: Neuromorphic computing refers to brain-inspired computers, that differentiate it from von Neumann architecture. Analog VLSI based neuromorphic circuits is a current research interest. Two simpler spiking integrate and fire neuron model namely axon-Hillock (AH) and voltage integrate, and fire (VIF) circuits are commonly used for generating spike events. This paper discusses the impact of reliabilit…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: 4 pages, 4 figures, IWPSD 2019

  32. Design and Mathematical Modelling of Inter Spike Interval of Temporal Neuromorphic Encoder for Image Recognition

    Authors: Aadhitiya VS, Jani Babu Shaik, Sonal Singhal, Siona Menezes Picardo, Nilesh Goel

    Abstract: Neuromorphic computing systems emulate the electrophysiological behavior of the biological nervous system using mixed-mode analog or digital VLSI circuits. These systems show superior accuracy and power efficiency in carrying out cognitive tasks. The neural network architecture used in neuromorphic computing systems is spiking neural networks (SNNs) analogous to the biological nervous system. SNN…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: 4 pages, 6 figures, one table, IEEE ICEE 2020 conference proceeding

  33. arXiv:2204.09179  [pdf, other]

    cs.CL cs.LG

    On the Representation Collapse of Sparse Mixture of Experts

    Authors: Zewen Chi, Li Dong, Shaohan Huang, Damai Dai, Shuming Ma, Barun Patra, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei

    Abstract: Sparse mixture of experts provides larger model capacity while requiring a constant computational overhead. It employs the routing mechanism to distribute input tokens to the best-matched experts according to their hidden representations. However, learning such a routing mechanism encourages token clustering around expert centroids, implying a trend toward representation collapse. In this work, we…

    Submitted 12 October, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: NeurIPS 2022

  34. arXiv:2202.07848  [pdf, other]

    cs.DC cs.AI

    Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads

    Authors: Dharma Shukla, Muthian Sivathanu, Srinidhi Viswanatha, Bhargav Gulavani, Rimma Nehme, Amey Agrawal, Chen Chen, Nipun Kwatra, Ramachandran Ramjee, Pankaj Sharma, Atul Katiyar, Vipul Modi, Vaibhav Sharma, Abhishek Singh, Shreshth Singhal, Kaustubh Welankar, Lu Xun, Ravi Anupindi, Karthik Elangovan, Hasibur Rahman, Zhou Lin, Rahul Seetharaman, Cheng Xu, Eddie Ailijiang, Suresh Krishnappa , et al. (1 additional authors not shown)

    Abstract: Lowering costs by driving high utilization across deep learning workloads is a crucial lever for cloud providers. We present Singularity, Microsoft's globally distributed scheduling service for highly-efficient and reliable execution of deep learning training and inference workloads. At the heart of Singularity is a novel, workload-aware scheduler that can transparently preempt and elastically sca…

    Submitted 21 February, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: Revision: Fixed some typos

  35. Discrete Simulation Optimization for Tuning Machine Learning Method Hyperparameters

    Authors: Varun Ramamohan, Shobhit Singhal, Aditya Raj Gupta, Nomesh Bhojkumar Bolia

    Abstract: Machine learning (ML) methods are used in most technical areas such as image recognition, product recommendation, financial analysis, medical diagnosis, and predictive maintenance. An important aspect of implementing ML methods involves controlling the learning process for the ML method so as to maximize the performance of the method under consideration. Hyperparameter tuning is the process of sel…

    Submitted 20 June, 2023; v1 submitted 16 January, 2022; originally announced January 2022.

    Journal ref: Journal of Simulation (2023)

  36. arXiv:2111.12172  [pdf, other]

    cs.CV cs.AI cs.LG

    Multi-label Iterated Learning for Image Classification with Label Ambiguity

    Authors: Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

    Abstract: Transfer learning from large-scale pre-trained models has become essential for many computer vision tasks. Recent studies have shown that datasets like ImageNet are weakly labeled since images with multiple object classes present are assigned a single label. This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data.…

    Submitted 23 November, 2021; originally announced November 2021.

  37. arXiv:2111.02086  [pdf, other]

    cs.CL

    Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task

    Authors: Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

    Abstract: This report describes Microsoft's machine translation systems for the WMT21 shared task on large-scale multilingual machine translation. We participated in all three evaluation tracks including Large Track and two Small Tracks where the former one is unconstrained and the latter two are fully constrained. Our model submissions to the shared task were initialized with DeltaLM\footnote{\url{https://…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: WMT21

  38. arXiv:2109.12840  [pdf, other]

    cs.GT math.OC

    Coalition Formation in Constant Sum Queueing Games

    Authors: Shiksha Singhal, Veeraruna Kavitha, Jayakrishnan Nair

    Abstract: We analyse a coalition formation game between strategic service providers of a congestible service. The key novelty of our formulation is that it is a constant sum game, i.e., the total payoff across all service providers (or coalitions of providers) is fixed, and dictated by the total size of the market. The game thus captures the tension between resource pooling (to benefit from the resulting st…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: 15 pages, 3 figures

  39. arXiv:2109.07306  [pdf, other]

    cs.CL

    Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

    Authors: Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei

    Abstract: Compared to monolingual models, cross-lingual models usually require a more expressive vocabulary to represent all languages adequately. We find that many languages are under-represented in recent cross-lingual language models due to the limited vocabulary capacity. To this end, we propose an algorithm VoCap to determine the desired vocabulary capacity of each language. However, increasing the voc…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  40. arXiv:2109.05329  [pdf, other]

    cs.DC

    MODC: Resilience for disaggregated memory architectures using task-based programming

    Authors: Kimberly Keeton, Sharad Singhal, Haris Volos, Yupu Zhang, Ramesh Chandra Chaurasiya, Clarete Riana Crasta, Sherin T George, Nagaraju K N, Mashood Abdulla K, Kavitha Natarajan, Porno Shome, Sanish Suresh

    Abstract: Disaggregated memory architectures provide benefits to applications beyond traditional scale out environments, such as independent scaling of compute and memory resources. They also provide an independent failure model, where computations or the compute nodes they run on may fail independently of the disaggregated memory; thus, data that's resident in the disaggregated memory is unaffected by the…

    Submitted 11 September, 2021; originally announced September 2021.

    Comments: 9 pages, 4 figures

    ACM Class: D.4.1; D.4.5; D.4.7; C.1.4; E.1

    Journal ref: Proceedings of 2nd Workshop on Resource Disaggregation and Serverless (WORDS'21), Co-located with ASPLOS'21, April 2021

  41. Desk Organization: Effect of Multimodal Inputs on Spatial Relational Learning

    Authors: Ryan Rowe, Shivam Singhal, Daqing Yi, Tapomayukh Bhattacharjee, Siddhartha S. Srinivasa

    Abstract: For robots to operate in a three dimensional world and interact with humans, learning spatial relationships among objects in the surrounding is necessary. Reasoning about the state of the world requires inputs from many different sensory modalities including vision ($V$) and haptics ($H$). We examine the problem of desk organization: learning how humans spatially position different objects on a pl…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 8 pages, 7 figures

    ACM Class: I.2.9

    Journal ref: 2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 1-8). IEEE

  42. arXiv:2106.16138  [pdf, other]

    cs.CL

    XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

    Authors: Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei

    Abstract: In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection, and translation replaced token detection. Besides, we pretrain the model, named as XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understandi…

    Submitted 19 April, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: ACL-2022

  43. arXiv:2106.13736  [pdf, other]

    cs.CL

    DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

    Authors: Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

    Abstract: While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG). NLG tasks are often based on the encoder-decoder framework, where the pretrained encoders can only benefit part of it. To reduce this gap, we introduce DeltaLM, a pretrained multilingual encoder-decoder model…

    Submitted 17 August, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: Work in progress

  44. arXiv:2106.08226  [pdf, other]

    cs.CL

    Consistency Regularization for Cross-Lingual Fine-Tuning

    Authors: Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei

    Abstract: Fine-tuning pre-trained cross-lingual language models can transfer task-specific supervision from one language to the others. In this work, we propose to improve cross-lingual fine-tuning with consistency regularization. Specifically, we use example consistency regularization to penalize the prediction sensitivity to four types of data augmentations, i.e., subword sampling, Gaussian noise, code-sw…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: ACL-2021

  45. arXiv:2103.03096  [pdf, other]

    cs.CV cs.AI

    Towards Designing Computer Vision-based Explainable-AI Solution: A Use Case of Livestock Mart Industry

    Authors: Devam Dave, Het Naik, Smiti Singhal, Rudresh Dwivedi, Pankesh Patel

    Abstract: The objective of an online Mart is to match buyers and sellers, to weigh animals and to oversee their sale. A reliable pricing method can be developed by ML models that can read through historical sales data. However, when AI models suggest or recommend a price, that in itself does not reveal too much (i.e., it acts like a black box) about the qualities and the abilities of an animal. An intereste…

    Submitted 8 February, 2021; originally announced March 2021.

    Comments: 8 pages, 5 figures

  46. arXiv:2102.11276  [pdf, other]

    cs.CL cs.CY

    Factorization of Fact-Checks for Low Resource Indian Languages

    Authors: Shivangi Singhal, Rajiv Ratn Shah, Ponnurangam Kumaraguru

    Abstract: The advancement in technology and accessibility of internet to each individual is revolutionizing the real time information. The liberty to express your thoughts without passing through any credibility check is leading to dissemination of fake content in the ecosystem. It can have disastrous effects on both individuals and society as a whole. The amplification of fake news is becoming rampant in I…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: 15 pages, 6 figures

  47. arXiv:2012.15547  [pdf, other]

    cs.CL

    XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

    Authors: Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei

    Abstract: Multilingual machine translation enables a single model to translate between different languages. Most existing multilingual machine translation systems adopt a randomly initialized Transformer backbone. In this work, inspired by the recent success of language model pre-training, we present XLM-T, which initializes the model with an off-the-shelf pretrained cross-lingual Transformer encoder and fi…

    Submitted 31 December, 2020; originally announced December 2020.

  48. arXiv:2012.13548  [pdf, other]

    cs.DC cs.PF

    Graph500 from OCaml-Multicore Perspective

    Authors: Shubhendra Pal Singhal

    Abstract: OCaml is an industrial-strength, multi-paradigm programming language, widely used in industry and academia. OCaml was developed for solving numerical and scientific problems involving large scale data-intensive operations and one such classic application set is Graph Algorithms, which are a core part of most analytics workloads. In this paper, we aim to implement the graph benchmarks along with th…

    Submitted 25 December, 2020; originally announced December 2020.

    Comments: 6 pages

  49. arXiv:2012.02960  [pdf, ps, other]

    cs.GT math.OC

    Cooperative Resource Sharing With Adamant Player

    Authors: Shiksha Singhal, Veeraruna Kavitha

    Abstract: Cooperative game theory deals with systems where players want to cooperate to improve their payoffs. But players may choose coalitions in a non-cooperative manner, leading to a coalition-formation game. We consider such a game with several players (willing to cooperate) and an adamant player (unwilling to cooperate) involved in resource-sharing. Here, the strategy of a player is the set of players…

    Submitted 5 December, 2020; originally announced December 2020.

  50. arXiv:2011.03195  [pdf, other]

    cs.LG cs.AI

    Explainable AI meets Healthcare: A Study on Heart Disease Dataset

    Authors: Devam Dave, Het Naik, Smiti Singhal, Pankesh Patel

    Abstract: With the increasing availability of structured and unstructured data and the swift progress of analytical techniques, Artificial Intelligence (AI) is bringing a revolution to the healthcare industry. With the increasingly indispensable role of AI in healthcare, there are growing concerns over the lack of transparency and explainability in addition to potential bias encountered by predictions of th…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: 23