Showing 1–11 of 11 results for author: Dey, V

  1. arXiv:2506.08140  [pdf, ps, other]

    cs.LG cs.CL

    AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists

    Authors: Yifei Li, Hanane Nour Moussa, Ziru Chen, Shijie Chen, Botao Yu, Mingyi Xue, Benjamin Burns, Tzu-Yao Chiu, Vishal Dey, Zitong Lu, Chen Wei, Qianheng Zhang, Tianyu Zhang, Song Gao, Xuhui Huang, Xia Ning, Nesreen K. Ahmed, Ali Payani, Huan Sun

    Abstract: Despite long-standing efforts in accelerating scientific discovery with AI, building AI co-scientists remains challenging due to limited high-quality data for training and evaluation. To tackle this data scarcity issue, we present AutoSDT, an automatic pipeline that collects high-quality coding tasks in real-world data-driven discovery workflows. AutoSDT leverages the coding capabilities and param…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2505.23987  [pdf, ps, other]

    cs.LG cs.AI cs.CL q-bio.BM

    Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization

    Authors: Vishal Dey, Xiao Hu, Xia Ning

    Abstract: In real-world drug design, molecule optimization requires selectively improving multiple molecular properties up to pharmaceutically relevant levels, while maintaining others that already meet such criteria. However, existing computational approaches and instruction-tuned LLMs fail to capture such nuanced property-specific objectives, limiting their practical applicability. To address this, we int…

    Submitted 29 May, 2025; originally announced May 2025.

  3. arXiv:2502.13398  [pdf, other]

    cs.LG cs.AI cs.CL physics.chem-ph q-bio.QM

    GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization

    Authors: Vishal Dey, Xiao Hu, Xia Ning

    Abstract: Despite recent advancements, most computational methods for molecule optimization are constrained to single- or double-property optimization tasks and suffer from poor scalability and generalizability to novel optimization tasks. Meanwhile, Large Language Models (LLMs) demonstrate remarkable out-of-domain generalizability to novel tasks. To demonstrate LLMs' potential for molecule optimization, we…

    Submitted 27 May, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: Accepted to ACL Main 2025. Vishal Dey and Xiao Hu contributed equally to this paper

  4. arXiv:2410.05080  [pdf, other]

    cs.CL cs.AI cs.LG

    ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

    Authors: Ziru Chen, Shijie Chen, Yuting Ning, Qianheng Zhang, Boshi Wang, Botao Yu, Yifei Li, Zeyi Liao, Chen Wei, Zitong Lu, Vishal Dey, Mingyi Xue, Frazier N. Baker, Benjamin Burns, Daniel Adu-Ampratwum, Xuhui Huang, Xia Ning, Song Gao, Yu Su, Huan Sun

    Abstract: The advancements of large language models (LLMs) have piqued growing interest in developing LLM-based language agents to automate scientific discovery end-to-end, which has sparked both excitement and skepticism about their true capabilities. In this work, we call for rigorous assessment of agents on individual tasks in a scientific workflow before making bold claims on end-to-end automation. To t…

    Submitted 31 March, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: ICLR 2025. 60 pages

  5. arXiv:2404.18600  [pdf]

    cond-mat.dis-nn

    Network-theory based modeling of avalanche dynamics in percolative tunnelling networks

    Authors: Vivek Dey, Steffen Kampman, Rafael Gutierrez, Gianaurelio Cuniberti, Pavan Nukala

    Abstract: Brain-like self-assembled networks can infer and analyze information out of unorganized noisy signals with minimal power consumption. These networks are characterized by spatiotemporal avalanches and their crackling behavior, and their physical models are expected to predict and understand their computational capabilities. Here, we use a network theory-based approach to provide a physical model fo…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  6. arXiv:2401.16299  [pdf, other]

    cs.LG cs.AI

    Enhancing Molecular Property Prediction with Auxiliary Learning and Task-Specific Adaptation

    Authors: Vishal Dey, Xia Ning

    Abstract: Pretrained Graph Neural Networks have been widely adopted for various molecular property prediction tasks. Despite their ability to encode structural and relational features of molecules, traditional fine-tuning of such pretrained GNNs on the target task can lead to poor generalization. To address this, we explore the adaptation of pretrained GNNs to the target task by jointly training them with m…

    Submitted 29 January, 2024; originally announced January 2024.

  7. arXiv:2306.17771  [pdf, other]

    cs.LG cs.IR q-bio.QM

    Precision Anti-Cancer Drug Selection via Neural Ranking

    Authors: Vishal Dey, Xia Ning

    Abstract: Personalized cancer treatment requires a thorough understanding of complex interactions between drugs and cancer cell lines in varying genetic and molecular contexts. To address this, high-throughput screening has been used to generate large-scale drug response data, facilitating data-driven computational models. Such models can capture complex drug-cell line interactions across various contexts i…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted in BioKDD '23

  8. arXiv:2205.08777  [pdf, other]

    cs.AI

    Entity Alignment For Knowledge Graphs: Progress, Challenges, and Empirical Studies

    Authors: Deepak Chaurasiya, Anil Surisetty, Nitish Kumar, Alok Singh, Vikrant Dey, Aakarsh Malhotra, Gaurav Dhama, Ankur Arora

    Abstract: Entity Alignment (EA) identifies entities across databases that refer to the same entity. Knowledge graph-based embedding methods have recently dominated EA techniques. Such methods map entities to a low-dimension space and align them based on their similarities. With the corpus of EA methodologies growing rapidly, this paper presents a comprehensive analysis of various existing EA methods, elabor…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: 8 pages, 8 figures

  9. arXiv:2111.07439  [pdf, other]

    cs.LG cs.AI q-bio.BM

    Improving Compound Activity Classification via Deep Transfer and Representation Learning

    Authors: Vishal Dey, Raghu Machiraju, Xia Ning

    Abstract: Recent advances in molecular machine learning, especially deep neural networks such as Graph Neural Networks (GNNs) for predicting structure activity relationships (SAR), have shown tremendous potential in computer-aided drug discovery. However, the applicability of such deep neural networks is limited by the requirement of large amounts of training data. In order to cope with limited training dat…

    Submitted 8 March, 2022; v1 submitted 14 November, 2021; originally announced November 2021.

    Comments: This manuscript has been accepted at ACS Omega

    Journal ref: ACS Omega 2022

  10. arXiv:2008.11238  [pdf]

    cs.IR cs.CY cs.SI

    A Pipeline to Understand Emerging Illness via Social Media Data Analysis: A Case Study on Breast Implant Illness

    Authors: Vishal Dey, Peter Krasniak, Minh Nguyen, Clara Lee, Xia Ning

    Abstract: Background: A new illness could first come to the public attention over social media before it is medically defined, formally documented or systematically studied. One example is a phenomenon known as breast implant illness (BII) that has been extensively discussed on social media, though vaguely defined in medical literature. Objectives: The objective of this study is to construct a data analysis…

    Submitted 8 March, 2022; v1 submitted 25 August, 2020; originally announced August 2020.

    Comments: The manuscript has been accepted at JMIR Medical Informatics

    Journal ref: JMIR Medical Informatics 2021, 9(11):e29768

  11. arXiv:1810.00069  [pdf, other]

    cs.LG cs.CR stat.ML

    Adversarial Attacks and Defences: A Survey

    Authors: Anirban Chakraborty, Manaar Alam, Vishal Dey, Anupam Chattopadhyay, Debdeep Mukhopadhyay

    Abstract: Deep learning has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems which were difficult to solve using the traditional machine learning techniques in the past. In the last few years, deep learning has advanced radically in such a way that it can surpass human-level performance on a number of tasks. As a consequence, deep learning is b…

    Submitted 28 September, 2018; originally announced October 2018.