
Showing 1–23 of 23 results for author: Wadhwa, M

  1. arXiv:2505.13444  [pdf, ps, other]

    cs.CL cs.CV

    ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

    Authors: Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett

    Abstract: Chart understanding presents a unique challenge for large vision-language models (LVLMs), as it requires the integration of sophisticated textual and visual reasoning capabilities. However, current LVLMs exhibit a notable imbalance between these skills, falling short on visual reasoning that is difficult to perform in text. We conduct a case study using a synthetic dataset solvable only through vi…

    Submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2504.15219  [pdf, other]

    cs.CL

    EvalAgent: Discovering Implicit Evaluation Criteria from the Web

    Authors: Manya Wadhwa, Zayne Sprague, Chaitanya Malaviya, Philippe Laban, Junyi Jessy Li, Greg Durrett

    Abstract: Evaluation of language model outputs on structured writing tasks is typically conducted with a number of desirable criteria presented to human evaluators or large language models (LLMs). For instance, on a prompt like "Help me draft an academic talk on coffee intake vs research productivity", a model response may be evaluated for criteria like accuracy and coherence. However, high-quality response…

    Submitted 21 April, 2025; originally announced April 2025.

  3. arXiv:2504.14716  [pdf, other]

    cs.LG

    Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation

    Authors: Tuhina Tripathi, Manya Wadhwa, Greg Durrett, Scott Niekum

    Abstract: Large Language Models (LLMs) are widely used as proxies for human labelers in both training (Reinforcement Learning from AI Feedback) and large-scale response evaluation (LLM-as-a-judge). Alignment and evaluation are critical components in the development of reliable LLMs, and the choice of feedback protocol plays a central role in both but remains understudied. In this work, we show that the choi…

    Submitted 20 April, 2025; originally announced April 2025.

  4. arXiv:2504.09373  [pdf, other]

    cs.CL

    QUDsim: Quantifying Discourse Similarities in LLM-Generated Text

    Authors: Ramya Namuduri, Yating Wu, Anshun Asher Zheng, Manya Wadhwa, Greg Durrett, Junyi Jessy Li

    Abstract: As large language models become increasingly capable at various writing tasks, their weakness at generating unique and creative content becomes a major liability. Although LLMs have the ability to generate text covering diverse topics, there is an overall sense of repetitiveness across texts that we aim to formalize and quantify via a similarity metric. The familiarity between documents arises fro…

    Submitted 12 April, 2025; originally announced April 2025.

  5. arXiv:2501.11017  [pdf, ps, other]

    hep-ph

    Strange quark stars in modified vector MIT bag model: role of $ρ$ and $φ$ mesons

    Authors: Mukul Wadhwa, Manisha Kumari, Arvind Kumar

    Abstract: In the present work, we study the properties of strange quark stars (SQSs) using the vector MIT bag model with modification in vector channels. Unlike recent studies which only consider interactions through $ω$ mesons, we analyze the possibility of $ρ$ and $φ$ vector channels. We consider two types of higher order non-linear self-interaction terms for the vector mesons. With these modifications, w…

    Submitted 19 January, 2025; originally announced January 2025.

    Comments: 9 pages and 8 figures

  6. The Ni isotopic composition of Ryugu reveals a common accretion region for carbonaceous chondrites

    Authors: Fridolin Spitzer, Thorsten Kleine, Christoph Burkhardt, Timo Hopp, Tetsuya Yokoyama, Yoshinari Abe, Jérôme Aléon, Conel M. O'D. Alexander, Sachiko Amari, Yuri Amelin, Ken-ichi Bajo, Martin Bizzarro, Audrey Bouvier, Richard W. Carlson, Marc Chaussidon, Byeon-Gak Choi, Nicolas Dauphas, Andrew M. Davis, Tommaso Di Rocco, Wataru Fujiya, Ryota Fukai, Ikshu Gautam, Makiko K. Haba, Yuki Hibiya, Hiroshi Hidaka, et al. (66 additional authors not shown)

    Abstract: The isotopic compositions of samples returned from Cb-type asteroid Ryugu and Ivuna-type (CI) chondrites are distinct from other carbonaceous chondrites, which has led to the suggestion that Ryugu and CI chondrites formed in a different region of the accretion disk, possibly around the orbits of Uranus and Neptune. We show that, like for Fe, Ryugu and CI chondrites also have indistinguishable Ni i…

    Submitted 5 October, 2024; originally announced October 2024.

    Comments: Published open access in Science Advances

    Journal ref: Science Advances 10, 39, eadp2426 (2024)

  7. arXiv:2409.12183  [pdf, other]

    cs.CL cs.AI cs.LG

    To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

    Authors: Zayne Sprague, Fangcong Yin, Juan Diego Rodriguez, Dongwei Jiang, Manya Wadhwa, Prasann Singhal, Xinyu Zhao, Xi Ye, Kyle Mahowald, Greg Durrett

    Abstract: Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs). But for what kinds of tasks is this extra ``thinking'' really helpful? To analyze this, we conducted a quantitative meta-analysis covering over 100 papers using CoT and ran our own evaluations of 20 datasets across 14 models. Our results show that CoT gives strong per…

    Submitted 7 May, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: Published at ICLR 2025

  8. arXiv:2407.02397  [pdf, other]

    cs.CL

    Learning to Refine with Fine-Grained Natural Language Feedback

    Authors: Manya Wadhwa, Xinyu Zhao, Junyi Jessy Li, Greg Durrett

    Abstract: Recent work has explored the capability of large language models (LLMs) to identify and correct errors in LLM-generated responses. These refinement approaches frequently evaluate what sizes of models are able to do refinement for what problems, but less attention is paid to what effective feedback for refinement looks like. In this work, we propose looking at refinement with feedback as a composit…

    Submitted 3 October, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Code and models available at: https://github.com/ManyaWadhwa/DCR

  9. arXiv:2305.14770  [pdf, other]

    cs.CL

    Using Natural Language Explanations to Rescale Human Judgments

    Authors: Manya Wadhwa, Jifan Chen, Junyi Jessy Li, Greg Durrett

    Abstract: The rise of large language models (LLMs) has brought a critical need for high-quality human-labeled data, particularly for processes like human feedback and evaluation. A common practice is to label data via consensus annotation over human judgments. However, annotators' judgments for subjective tasks can differ in many ways: they may reflect different qualitative judgments about an example, and t…

    Submitted 9 September, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Data available at https://github.com/ManyaWadhwa/explanation_based_rescaling

  10. arXiv:2303.10624  [pdf, other]

    cs.LG cs.DC

    PFSL: Personalized & Fair Split Learning with Data & Label Privacy for thin clients

    Authors: Manas Wadhwa, Gagan Raj Gupta, Ashutosh Sahu, Rahul Saini, Vidhi Mittal

    Abstract: The traditional framework of federated learning (FL) requires each client to re-train their models in every iteration, making it infeasible for resource-constrained mobile devices to train deep-learning (DL) models. Split learning (SL) provides an alternative by using a centralized server to offload the computation of activations and gradients for a subset of the model but suffers from problems of…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: To be published in: The 23rd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing. Granted: Open Research Objects (ORO) and Research Objects Reviewed (ROR) badges. See https://www.niso.org/publications/rp-31-2021-badging for definitions of the badges. Code available at: https://github.com/mnswdhw/PFSL

  11. arXiv:2208.07976  [pdf]

    astro-ph.EP astro-ph.SR

    Presolar stardust in asteroid Ryugu

    Authors: Jens Barosch, Larry R. Nittler, Jianhua Wang, Conel M. O'D. Alexander, Bradley T. De Gregorio, Cécile Engrand, Yoko Kebukawa, Kazuhide Nagashima, Rhonda M. Stroud, Hikaru Yabuta, Yoshinari Abe, Jérôme Aléon, Sachiko Amari, Yuri Amelin, Ken-ichi Bajo, Laure Bejach, Martin Bizzarro, Lydie Bonal, Audrey Bouvier, Richard W. Carlson, Marc Chaussidon, Byeon-Gak Choi, George D. Cody, Emmanuel Dartois, Nicolas Dauphas, et al. (99 additional authors not shown)

    Abstract: We have conducted a NanoSIMS-based search for presolar material in samples recently returned from C-type asteroid Ryugu as part of JAXA's Hayabusa2 mission. We report the detection of all major presolar grain types with O- and C-anomalous isotopic compositions typically identified in carbonaceous chondrite meteorites: 1 silicate, 1 oxide, 1 O-anomalous supernova grain of ambiguous phase, 38 SiC, a…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 12 pages, 3 figures, 2 tables. Published in ApJL

    Journal ref: 2022, The Astrophysical Journal Letters, 935, L3 (12pp)

  12. arXiv:2203.03541  [pdf, other]

    cs.CL cs.AI

    Fairness for Text Classification Tasks with Identity Information Data Augmentation Methods

    Authors: Mohit Wadhwa, Mohan Bhambhani, Ashvini Jindal, Uma Sawant, Ramanujam Madhavan

    Abstract: Counterfactual fairness methods address the question: How would the prediction change if the sensitive identity attributes referenced in the text instance were different? These methods are entirely based on generating counterfactuals for the given training and test set instances. Counterfactual instances are commonly prepared by replacing sensitive identity terms, i.e., the identity terms present…

    Submitted 4 February, 2022; originally announced March 2022.

  13. arXiv:2110.12763  [pdf, ps, other]

    cs.LG cs.AI

    SSMF: Shifting Seasonal Matrix Factorization

    Authors: Koki Kawabata, Siddharth Bhatia, Rui Liu, Mohit Wadhwa, Bryan Hooi

    Abstract: Given taxi-ride counts information between departure and destination locations, how can we forecast their future demands? In general, given a data stream of events with seasonal patterns that innovate over time, how can we effectively and efficiently forecast future events? In this paper, we propose Shifting Seasonal Matrix Factorization approach, namely SSMF, that can adaptively learn multiple se…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: NeurIPS, 2021

  14. arXiv:2106.04486  [pdf, other]

    cs.DS cs.AI cs.LG

    Sketch-Based Anomaly Detection in Streaming Graphs

    Authors: Siddharth Bhatia, Mohit Wadhwa, Kenji Kawaguchi, Neil Shah, Philip S. Yu, Bryan Hooi

    Abstract: Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? For example, in intrusion detection, existing work seeks to detect either anomalous edges or anomalous subgraphs, but not both. In this paper, we first extend the count-min sketch data structu…

    Submitted 13 July, 2023; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted at SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023

  15. arXiv:2010.10737  [pdf, other]

    cs.LG stat.ML

    Directed Graph Representation through Vector Cross Product

    Authors: Ramanujam Madhavan, Mohit Wadhwa

    Abstract: Graph embedding methods embed the nodes in a graph in low dimensional vector space while preserving graph topology to carry out the downstream tasks such as link prediction, node recommendation and clustering. These tasks depend on a similarity measure such as cosine similarity and Euclidean distance between a pair of embeddings that are symmetric in nature and hence do not hold good for directed…

    Submitted 20 October, 2020; originally announced October 2020.

  16. arXiv:2009.14366  [pdf]

    astro-ph.IM astro-ph.EP astro-ph.SR

    The Case for Non-Cryogenic Comet Nucleus Sample Return

    Authors: Keiko Nakamura-Messenger, Alexander G. Hayes, Scott Sandford, Carol Raymond, Steven W. Squyres, Larry R. Nittler, Samuel Birch, Denis Bodewits, Nancy Chabot, Meenakshi Wadhwa, Mathieu Choukroun, Simon J. Clemett, Maitrayee Bose, Neil Dello Russo, Jason P. Dworkin, Jamie E. Elsila, Kenton Fisher, Perry Gerakines, Daniel P. Glavin, Julie Mitchell, Michael Mumma, Ann. N. Nguyen, Lisa Pace, Jason Soderblom, Jessica M. Sunshine

    Abstract: Comets hold answers to mysteries of the Solar System by recording presolar history, the initial states of planet formation and prebiotic organics and volatiles to the early Earth. Analysis of returned samples from a comet nucleus will provide unparalleled knowledge about the Solar System starting materials and how they came together to form planets and give rise to life: 1. How did comets form?…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: White Paper submitted to the Planetary Science Decadal Survey 2023-2032 reflecting the viewpoints of three New Frontiers comet sample return missions proposal teams, CAESAR, CONDOR, and CORSAIR

  17. arXiv:2007.14899  [pdf]

    astro-ph.IM astro-ph.EP

    Volatile Sample Return in the Solar System

    Authors: Stefanie N. Milam, Jason P. Dworkin, Jamie E. Elsila, Daniel P. Glavin, Perry A. Gerakines, Julie L. Mitchell, Keiko Nakamura-Messenger, Marc Neveu, Larry Nittler, James Parker, Elisa Quintana, Scott A. Sandford, Joshua E. Schlieder, Rhonda Stroud, Melissa G. Trainer, Meenakshi Wadhwa, Andrew J. Westphal, Michael Zolensky, Dennis Bodewits, Simon Clemett

    Abstract: We advocate for the realization of volatile sample return from various destinations including: small bodies, the Moon, Mars, ocean worlds/satellites, and plumes. As part of recent mission studies (e.g., Comet Astrobiology Exploration SAmple Return (CAESAR) and Mars Sample Return), new concepts, technologies, and protocols have been considered for specific environments and cost. Here we provide a p…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: White paper submitted to the Planetary Science and Astrobiology Decadal Survey 2023-2032

  18. arXiv:2002.12143  [pdf, other]

    cs.LG stat.ML

    Fairness-Aware Learning with Prejudice Free Representations

    Authors: Ramanujam Madhavan, Mohit Wadhwa

    Abstract: Machine learning models are extensively being used to make decisions that have a significant impact on human life. These models are trained over historical data that may contain information about sensitive attributes such as race, sex, religion, etc. The presence of such sensitive attributes can impact certain population subgroups unfairly. It is straightforward to remove sensitive features from t…

    Submitted 26 February, 2020; originally announced February 2020.

  19. arXiv:1710.01216  [pdf, other]

    cs.CV

    Group Affect Prediction Using Multimodal Distributions

    Authors: Saqib Shamsi, Bhanu Pratap Singh Rawat, Manya Wadhwa

    Abstract: We describe our approach towards building an efficient predictive model to detect emotions for a group of people in an image. We have proposed that training a Convolutional Neural Network (CNN) model on the emotion heatmaps extracted from the image outperforms a CNN model trained entirely on the raw images. The comparison of the models has been done on a recently published dataset of Emotion Rec…

    Submitted 12 March, 2018; v1 submitted 17 September, 2017; originally announced October 2017.

    Comments: This research paper has been accepted at Workshop on Computer Vision for Active and Assisted Living, WACV 2018

  20. arXiv:1510.07880  [pdf, other]

    cs.NI

    Rules in Play: On the Complexity of Routing Tables and Firewalls

    Authors: Mohit Wadhwa, Ambar Pal, Ayush Shah, Paritosh Mittal, H. B. Acharya

    Abstract: A fundamental component of networking infrastructure is the policy, used in routing tables and firewalls. Accordingly, there has been extensive study of policies. However, the theory of such policies indicates that the size of the decision tree for a policy is very large ($O((2n)^d)$, where the policy has n rules and examines d features of packets). If this was indeed the case, the existing algori…

    Submitted 27 October, 2015; originally announced October 2015.

    Comments: On the complexity of Firewalls and Routing Tables

  21. Iron-60 evidence for early injection and efficient mixing of stellar debris in the protosolar nebula

    Authors: N. Dauphas, D. L. Cook, A. Sacarabany, C. Frohlich, A. M. Davis, M. Wadhwa, A. Pourmand, T. Rauscher, R. Gallino

    Abstract: Among extinct radioactivities present in meteorites, $^{60}$Fe ($t_{1/2}$ = 1.49 Myr) plays a key role as a high-resolution chronometer, a heat source in planetesimals, and a fingerprint of the astrophysical setting of solar system formation. A critical issue with $^{60}$Fe is that it could have been heterogeneously distributed in the protoplanetary disk, calling into question the efficiency of mixing in the s…

    Submitted 16 May, 2008; originally announced May 2008.

    Comments: 15 pages, 5 figures, ApJ in press

    Journal ref: Erratum-ibid.691:1943,2009; Astrophys.J.686:560-569,2008

  22. Double Tag Events in Two-Photon Collisions at LEP

    Authors: M. Wadhwa

    Abstract: Double tag events in two photon collisions are studied using the L3 detector at the LEP center of mass energies $\sqrt{s} \simeq 189-202$ GeV. The cross-section of $γ^* γ^*$ collisions is measured at an average photon virtuality $<Q^2 > = 15 \rm{GeV}^2$. The results are in agreement with Monte Carlo predictions based on perturbative QCD, while the Quark Parton Model alone is insufficient to desc…

    Submitted 13 October, 2000; originally announced October 2000.

    Comments: 3 pages, 3 figures, Talk given on behalf of L3 experiment at ICHEP2000, Osaka, Japan

  23. Two Photon Physics at LEP

    Authors: Maneesh Wadhwa

    Abstract: LEP offers an excellent opportunity to measure two photon processes over a large kinematical range and thus study the complex nature of the photon. This article reviews the experimental status of ``Two Photon Physics'' at LEP. The recent results on resonances, multi-hadron production and photon structure functions are discussed.

    Submitted 1 September, 1999; originally announced September 1999.

    Comments: 14 pages, 18 figures, Talk given on behalf of LEP at PIC99, June 24-26, Ann Arbor, Michigan, USA