Showing 1–4 of 4 results for author: Bhutani, A

  1. arXiv:2409.15370

    cs.LG cs.AI physics.chem-ph q-bio.BM

    Smirk: An Atomically Complete Tokenizer for Molecular Foundation Models

    Authors: Alexius Wadell, Anoushka Bhutani, Venkatasubramanian Viswanathan

    Abstract: Text-based foundation models have become an important part of scientific discovery, with molecular foundation models accelerating advancements in molecular design and materials science. However, existing models are constrained by closed-vocabulary tokenizers which capture only a fraction of molecular space. In this work, we systematically evaluate thirty tokenizers, including 19 chemistry-specific…

    Submitted 7 February, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: 33 pages, 6 figures

  2. arXiv:2407.06129

    cs.AI cs.HC

    Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization

    Authors: Hannah K. Bako, Arshnoor Bhutani, Xinyi Liu, Kwesi A. Cobbina, Zhicheng Liu

    Abstract: Automatically generating data visualizations in response to human utterances on datasets necessitates a deep semantic understanding of the data utterance, including implicit and explicit references to data attributes, visualization tasks, and necessary data preparation steps. Natural Language Interfaces (NLIs) for data visualization have explored ways to infer such information, yet challenges pers…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 5 pages, 4 figures, IEEE VIS short papers

  3. arXiv:2302.03362

    cs.LG cond-mat.mtrl-sci

    Machine Learning Benchmarks for the Classification of Equivalent Circuit Models from Electrochemical Impedance Spectra

    Authors: Joachim Schaeffer, Paul Gasper, Esteban Garcia-Tamayo, Raymond Gasper, Masaki Adachi, Juan Pablo Gaviria-Cardona, Simon Montoya-Bedoya, Anoushka Bhutani, Andrew Schiek, Rhys Goodall, Rolf Findeisen, Richard D. Braatz, Simon Engelke

    Abstract: Analysis of Electrochemical Impedance Spectroscopy (EIS) data for electrochemical systems often consists of defining an Equivalent Circuit Model (ECM) using expert knowledge and then optimizing the model parameters to deconvolute various resistance, capacitive, inductive, or diffusion responses. For small data sets, this procedure can be conducted manually; however, it is not feasible to manually…

    Submitted 4 May, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: Manuscript: 17 pages, 9 figures; Supplementary Information: 9 pages, 6 figures

    MSC Class: 68T10

  4. arXiv:2010.09426

    cs.IR

    LANNS: A Web-Scale Approximate Nearest Neighbor Lookup System

    Authors: Ishita Doshi, Dhritiman Das, Ashish Bhutani, Rajeev Kumar, Rushi Bhatt, Niranjan Balasubramanian

    Abstract: Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas. The existing state-of-the-art algorithm for nearest neighbor search, Hierarchical Navigable Small World Networks (HNSW), is unable to scale to large datasets of 100M records in high dimensions. In this paper, we propose LANNS, an end-to-end platform for…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: 10 pages, 9 figures, 9 tables

    ACM Class: H.3.3; H.3.4; H.3.1