Search | arXiv e-print repository

A time-frequency method for acoustic scattering with trapping

Authors: Heather Wilber, Wietse Vaes, Abinand Gopal, Gunnar Martinsson

Abstract: A Fourier transform method is introduced for a class of hybrid time-frequency methods that solve the acoustic scattering problem in regimes where the solution exhibits both highly oscillatory behavior and slow decay in time. This extends the applicability of hybrid time-frequency schemes to domains with trapping regions. A fast sinc transform technique for managing highly oscillatory behavior and long time horizons is combined with a contour integration scheme that improves smoothness properties in the integrand.

Submitted 18 June, 2025; originally announced June 2025.

Comments: 18 pages, 9 figures

MSC Class: 65R20; 65M80; 65F05

arXiv:2506.06288 [pdf, ps, other]

DELPHYNE: A Pre-Trained Model for General and Financial Time Series

Authors: Xueying Ding, Aakriti Mittal, Achintya Gopal

Abstract: Time-series data is a vital modality within data science communities. This is particularly valuable in financial applications, where it helps in detecting patterns, understanding market behavior, and making informed decisions based on historical data. Recent advances in language modeling have led to the rise of time-series pre-trained models that are trained on vast collections of datasets and applied to diverse tasks across financial domains. However, across financial applications, existing time-series pre-trained models have not shown boosts in performance over simple finance benchmarks in both zero-shot and fine-tuning settings. This phenomenon occurs because of i) a lack of financial data within the pre-training stage, and ii) the negative transfer effect due to inherently different time-series patterns across domains. Furthermore, time-series data is continuous, noisy, and can be collected at varying frequencies and with varying lags across different variables, making this data more challenging to model than languages. To address the above problems, we introduce a Pre-trained MoDEL for FINance TimE-series (Delphyne). Delphyne achieves competitive performance to existing foundation and full-shot models with few fine-tuning steps on publicly available datasets, and also shows superior performances on various financial tasks.

Submitted 12 May, 2025; originally announced June 2025.

arXiv:2501.18837 [pdf, other]

Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

Authors: Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, Alwin Peng, Raj Agarwal, Cem Anil, Amanda Askell, Nathan Bailey, Joe Benton, Emma Bluemke, Samuel R. Bowman, Eric Christiansen, Hoagy Cunningham, Andy Dau, Anjali Gopal, Rob Gilson, Logan Graham, Logan Howard, Nimit Kalra, Taesung Lee, Kevin Lin, et al. (18 additional authors not shown)

Abstract: Large language models (LLMs) are vulnerable to universal jailbreaks: prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, like manufacturing illegal substances at scale. To defend against these attacks, we introduce Constitutional Classifiers: safeguards trained on synthetic data, generated by prompting LLMs with natural language rules (i.e., a constitution) specifying permitted and restricted content. In over 3,000 estimated hours of red teaming, no red teamer found a universal jailbreak that could extract information from an early classifier-guarded LLM at a similar level of detail to an unguarded model across most target queries. On automated evaluations, enhanced classifiers demonstrated robust defense against held-out domain-specific jailbreaks. These classifiers also maintain deployment viability, with an absolute 0.38% increase in production-traffic refusals and a 23.7% inference overhead. Our work demonstrates that defending against universal jailbreaks while maintaining practical deployment viability is tractable.

Submitted 30 January, 2025; originally announced January 2025.

arXiv:2501.00785 [pdf, other]

doi 10.1109/MRA.2025.3543957

Natural Multimodal Fusion-Based Human-Robot Interaction: Application With Voice and Deictic Posture via Large Language Model

Authors: Yuzhi Lai, Shenghai Yuan, Youssef Nassar, Mingyu Fan, Atmaraaj Gopal, Arihiro Yorita, Naoyuki Kubota, Matthias Rätsch

Abstract: Translating human intent into robot commands is crucial for the future of service robots in an aging society. Existing Human-Robot Interaction (HRI) systems relying on gestures or verbal commands are impractical for the elderly due to difficulties with complex syntax or sign language. To address the challenge, this paper introduces a multi-modal interaction framework that combines voice and deictic posture information to create a more natural HRI system. The visual cues are first processed by the object detection model to gain a global understanding of the environment, and then bounding boxes are estimated based on depth information. By using a large language model (LLM) with voice-to-text commands and temporally aligned selected bounding boxes, robot action sequences can be generated, while key control syntax constraints are applied to avoid potential LLM hallucination issues. The system is evaluated on real-world tasks with varying levels of complexity using a Universal Robots UR3e manipulator. Our method demonstrates significantly better performance in HRI in terms of accuracy and robustness. To benefit the research community and the general public, we will make our code and design open-source.

Submitted 4 April, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

Comments: Accepted for publication by IEEE Robotics & Automation Magazine

arXiv:2412.04388 [pdf, other]

doi 10.1103/PhysRevD.111.L031501

Unitarity bounds with subthreshold and anomalous cuts for $b$-hadron decays

Authors: Abinand Gopal, Nico Gubernari

Abstract: We derive a generalisation of the Boyd-Grinstein-Lebed (BGL) parametrization. Most form factors (FFs) in $b$-hadron decays exhibit additional branch cuts -- namely subthreshold and anomalous branch cuts -- beyond the ``standard'' unitarity cut. These additional cuts cannot be adequately accounted for by the BGL parametrization. For instance, these cuts arise in the FFs for $B\to D^{(*)}$, $B\to K^{(*)}$, and $Λ_b\to Λ$ processes, which are particularly relevant from a phenomenological standpoint. We demonstrate how to parametrize such FFs and derive unitarity bounds in the presence of subthreshold and/or anomalous branch cuts. Our work paves the way for a wide range of new FF analyses based solely on first principles, thereby minimising systematic uncertainties.

Submitted 13 February, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

Comments: 7 pages, 2 figures, minor changes

Journal ref: Phys.Rev.D 111 (2025) 3, L031501

arXiv:2412.03226 [pdf, other]

Information thermodynamics for Markov jump processes coupled to underdamped diffusion: Application to nanoelectromechanics

Authors: Ashwin Gopal, Nahuel Freitas, Massimiliano Esposito

Abstract: We extend the principles of information thermodynamics to study energy and information exchanges between coupled systems composed of one part undergoing a Markov jump process and another underdamped diffusion. We derive integral fluctuation theorems for the partial entropy production of each subsystem and analyze two distinct regimes. First, when the inertial dynamics is slow compared to the discrete-state transitions, we show that the steady-state energy and information flows vanish at the leading order in an adiabatic approximation, if the underdamped subsystem is governed purely by conservative forces. To capture the non-zero contributions, we consistently derive dynamical equations valid to higher order. Second, in the limit of infinite mass, the underdamped dynamics becomes a deterministic Hamiltonian dynamics driving the jump processes, and we capture the next-order correction beyond this limit. We apply our framework to study self-oscillations in the single-electron shuttle - a nanoelectromechanical system (NEMS) - from a measurement-feedback perspective. We find that energy flows dominate over information flows in the self-oscillating regime, and study the efficiency with which this NEMS converts electrical work into mechanical oscillations.

Submitted 4 December, 2024; originally announced December 2024.

Comments: 33 pages, 11 figures

arXiv:2412.00791 [pdf, other]

FeynKrack: A continuum model for quasi-brittle damage through Feynman-Kac killed diffusion

Authors: Ved Prakash, Upadhyayula M. M. A. Sai Gopal, Sanhita Das, Ananth Ramaswamy, Debasish Roy

Abstract: Continuum damage mechanics (CDM) is a popular framework for modelling crack propagation in solids. The CDM uses a damage parameter to quantitatively assess what one loosely calls `material degradation'. While this parameter is sometimes given a physical meaning, the mathematical equations for its evolution are generally not consistent with such physical interpretations. Curiously, degradation in the CDM may be viewed as a change of measures, wherein the damage variable appears as the Radon-Nikodym derivative. We adopt this point of view and use a probabilistic measure-valued description for the random microcracks underlying quasi-brittle damage. We show that the evolution of the underlying density may be described via killed diffusion as in the Feynman-Kac theory. Damage growth is then interpreted as the reduction in this measure over a region, which in turn quantifies the disruption of bonds through a loss of force-transmitting mechanisms between nearby material points. Remarkably, the evolution of damage admits an approximate closed-form solution. This brings forth substantive computational ease, facilitating fast yet accurate simulations of large dimensional problems. By selecting an appropriate killing rate, one accounts for the irreversibility of damage and thus eliminates the need for ad-hoc history-dependent routes typically employed, say, in phase field modelling of damage. Our proposal FeynKrack (a short form for Feynman-Kac crack propagator) is validated and demonstrated for its efficacy through several simulations on quasi-brittle damage. It also offers a promising stochastic route for future explorations of non-equilibrium thermodynamic aspects of damage.

Submitted 27 March, 2025; v1 submitted 1 December, 2024; originally announced December 2024.

Comments: 28 pages, 20 figures, 1 Table

arXiv:2411.05998 [pdf, other]

Filling in Missing FX Implied Volatilities with Uncertainties: Improving VAE-Based Volatility Imputation

Authors: Achintya Gopal

Abstract: Missing data is a common problem in finance and often requires methods to fill in the gaps, or in other words, imputation. In this work, we focused on the imputation of missing implied volatilities for FX options. Prior work has used variational autoencoders (VAEs), a neural network-based approach, to solve this problem; however, using stronger classical baselines such as Heston with jumps can significantly outperform their results. We show that simple modifications to the architecture of the VAE lead to significant imputation performance improvements (e.g., in low missingness regimes, nearly cutting the error by half), removing the necessity of using $β$-VAEs. Further, we modify the VAE imputation algorithm in order to better handle the uncertainty in data, as well as to obtain accurate uncertainty estimates around imputed values.

Submitted 8 November, 2024; originally announced November 2024.

Comments: 35 pages, 22 figures, 10 tables

arXiv:2409.04369 [pdf, other]

A highly accurate procedure for computing globally optimal Wannier functions in one-dimensional crystalline insulators

Authors: Abinand Gopal, Hanwen Zhang

Abstract: A standard task in solid state physics and quantum chemistry is the computation of localized molecular orbitals known as Wannier functions. In this manuscript, we propose a new procedure for computing Wannier functions in one-dimensional crystalline materials. Our approach proceeds by first performing parallel transport of the Bloch functions using numerical integration. Then a simple analytically computable correction is introduced to yield the optimally localized Wannier function. The resulting scheme is rapidly convergent and proven to produce globally optimal Wannier functions. The analysis in this manuscript can also be viewed as a proof of the existence of exponentially localized Wannier functions in one dimension. We illustrate the performance of the scheme by a number of numerical experiments.

Submitted 22 September, 2024; v1 submitted 6 September, 2024; originally announced September 2024.

Report number: YALEU/DCS/TR-1571

arXiv:2408.01499 [pdf, other]

NeuralFactors: A Novel Factor Learning Approach to Generative Modeling of Equities

Authors: Achintya Gopal

Abstract: The use of machine learning for statistical modeling (and thus, generative modeling) has grown in popularity with the proliferation of time series models, text-to-image models, and especially large language models. Fundamentally, the goal of classical factor modeling is statistical modeling of stock returns, and in this work, we explore using deep generative modeling to enhance classical factor models. Prior work has explored the use of deep generative models in order to model hundreds of stocks, leading to accurate risk forecasting and alpha portfolio construction; however, that specific model does not allow for easy factor modeling interpretation in that the factor exposures cannot be deduced. In this work, we introduce NeuralFactors, a novel machine-learning based approach to factor analysis where a neural network outputs factor exposures and factor returns, trained using the same methodology as variational autoencoders. We show that this model outperforms prior approaches both in terms of log-likelihood performance and computational efficiency. Further, we show that this method is competitive to prior work in generating realistic synthetic data, covariance estimation, risk analysis (e.g., value at risk, or VaR, of portfolios), and portfolio optimization. Finally, due to the connection to classical factor analysis, we analyze how the factors our model learns cluster together and show that the factor exposures could be used for embedding stocks.

Submitted 2 August, 2024; originally announced August 2024.

Comments: 9 pages, 4 figures

arXiv:2408.01387 [pdf, other]

NeuralBeta: Estimating Beta Using Deep Learning

Authors: Yuxin Liu, Jimin Lin, Achintya Gopal

Abstract: Traditional approaches to estimating beta in finance often involve rigid assumptions and fail to adequately capture beta dynamics, limiting their effectiveness in use cases like hedging. To address these limitations, we have developed a novel method using neural networks called NeuralBeta, which is capable of handling both univariate and multivariate scenarios and tracking the dynamic behavior of beta. To address the issue of interpretability, we introduce a new output layer inspired by regularized weighted linear regression, which provides transparency into the model's decision-making process. We conducted extensive experiments on both synthetic and market data, demonstrating NeuralBeta's superior performance compared to benchmark methods across various scenarios, especially instances where beta is highly time-varying, e.g., during regime shifts in the market. This model not only represents an advancement in the field of beta estimation, but also shows potential for applications in other financial contexts that assume linear relationships.

Submitted 28 October, 2024; v1 submitted 2 August, 2024; originally announced August 2024.

Comments: 8 pages, 9 figures

arXiv:2403.03218 [pdf, other]

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Authors: Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, et al. (32 additional authors not shown)

Abstract: The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in developing biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing further research into mitigating risk. Furthermore, they focus on only a few, highly specific pathways for malicious use. To fill these gaps, we publicly release the Weapons of Mass Destruction Proxy (WMDP) benchmark, a dataset of 3,668 multiple-choice questions that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security. WMDP was developed by a consortium of academics and technical consultants, and was stringently filtered to eliminate sensitive information prior to public release. WMDP serves two roles: first, as an evaluation for hazardous knowledge in LLMs, and second, as a benchmark for unlearning methods to remove such hazardous knowledge. To guide progress on unlearning, we develop RMU, a state-of-the-art unlearning method based on controlling model representations. RMU reduces model performance on WMDP while maintaining general capabilities in areas such as biology and computer science, suggesting that unlearning may be a concrete path towards reducing malicious use from LLMs. We release our benchmark and code publicly at https://wmdp.ai

Submitted 15 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: See the project page at https://wmdp.ai

arXiv:2402.15613 [pdf, other]

Towards Efficient Active Learning in NLP via Pretrained Representations

Authors: Artem Vysogorets, Achintya Gopal

Abstract: Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. When labeled documents are scarce, active learning helps save annotation efforts but requires retraining of massive models on each acquisition iteration. We drastically expedite this process by using pretrained representations of LLMs within the active learning loop and, once the desired amount of labeled data is acquired, fine-tuning that or even a different pretrained LLM on this labeled data to achieve the best performance. As verified on common text classification benchmarks with pretrained BERT and RoBERTa as the backbone, our strategy yields similar performance to fine-tuning all the way through the active learning loop but is orders of magnitude less computationally expensive. The data acquired with our procedure generalizes across pretrained networks, allowing flexibility in choosing the final model or updating it as newer versions get released.

Submitted 23 February, 2024; originally announced February 2024.

arXiv:2312.17375 [pdf, other]

Causal Discovery in Financial Markets: A Framework for Nonstationary Time-Series Data

Authors: Agathe Sadeghi, Achintya Gopal, Mohammad Fesanghary

Abstract: This paper introduces a new causal structure learning method for nonstationary time series data, a common data type found in fields such as finance, economics, healthcare, and environmental science. Our work builds upon the constraint-based causal discovery from nonstationary data algorithm (CD-NOD). We introduce a refined version (CD-NOTS) which is designed specifically to account for lagged dependencies in time series data. We compare the performance of different algorithmic choices, such as the type of conditional independence test and the significance level, to help select the best hyperparameters given various scenarios of sample size, problem dimensionality, and availability of computational resources. Using the results from the simulated data, we apply CD-NOTS to a broad range of real-world financial applications in order to identify causal connections among nonstationary time series data, thereby illustrating applications in factor-based investing, portfolio diversification, and comprehension of market dynamics.

Submitted 7 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: 35 pages, 28 figures

arXiv:2311.14735 [pdf, other]

doi 10.1145/3604237.3626884

Generative Machine Learning for Multivariate Equity Returns

Authors: Ruslan Tepelyan, Achintya Gopal

Abstract: The use of machine learning to generate synthetic data has grown in popularity with the proliferation of text-to-image models and especially large language models. The core methodology these models use is to learn the distribution of the underlying data, similar to the classical methods common in finance of fitting statistical models to data. In this work, we explore the efficacy of using modern machine learning methods, specifically conditional importance weighted autoencoders (a variant of variational autoencoders) and conditional normalizing flows, for the task of modeling the returns of equities. The main problem we work to address is modeling the joint distribution of all the members of the S&P 500, or, in other words, learning a 500-dimensional joint distribution. We show that this generative model has a broad range of applications in finance, including generating realistic synthetic data, volatility and correlation estimation, risk analysis (e.g., value at risk, or VaR, of portfolios), and portfolio optimization.

Submitted 21 November, 2023; originally announced November 2023.

Comments: 13 pages, 2-column format, presented at ICAIF'23

arXiv:2310.18642 [pdf]

One-shot Localization and Segmentation of Medical Images with Foundation Models

Authors: Deepa Anand, Gurunath Reddy M, Vanika Singhal, Dattesh D. Shanbhag, Shriram KS, Uday Patil, Chitresh Bhushan, Kavitha Manickam, Dawei Gui, Rakesh Mullick, Avinash Gopal, Parminder Bhatia, Taha Kass-Hout

Abstract: Recent advances in Vision Transformers (ViT) and Stable Diffusion (SD) models with their ability to capture rich semantic features of the image have been used for image correspondence tasks on natural images. In this paper, we examine the ability of a variety of pre-trained ViT (DINO, DINOv2, SAM, CLIP) and SD models, trained exclusively on natural images, for solving the correspondence problems on medical images. While many works have made a case for in-domain training, we show that the models trained on natural images can offer good performance on medical images across different modalities (CT, MR, Ultrasound) sourced from various manufacturers, over multiple anatomical regions (brain, thorax, abdomen, extremities), and on a wide variety of tasks. Further, we leverage the correspondence with respect to a template image to prompt a Segment Anything (SAM) model to arrive at single shot segmentation, achieving dice range of 62%-90% across tasks, using just one image as reference. We also show that our single-shot method outperforms the recently proposed few-shot segmentation method - UniverSeg (Dice range 47%-80%) on most of the semantic segmentation tasks (six out of seven) across medical imaging modalities.

Submitted 28 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023 R0-FoMo Workshop

arXiv:2310.18233 [pdf]

Will releasing the weights of future large language models grant widespread access to pandemic agents?

Authors: Anjali Gopal, Nathan Helm-Burger, Lennart Justen, Emily H. Soice, Tiffany Tzeng, Geetha Jeyapragasan, Simon Grimm, Benjamin Mueller, Kevin M. Esvelt

Abstract: Large language models can benefit research and human understanding by providing tutorials that draw on expertise from many different fields. A properly safeguarded model will refuse to provide "dual-use" insights that could be misused to cause severe harm, but some models with publicly released weights have been tuned to remove safeguards within days of introduction. Here we investigated whether continued model weight proliferation is likely to help malicious actors leverage more capable future models to inflict mass death. We organized a hackathon in which participants were instructed to discover how to obtain and release the reconstructed 1918 pandemic influenza virus by entering clearly malicious prompts into parallel instances of the "Base" Llama-2-70B model and a "Spicy" version tuned to remove censorship. The Base model typically rejected malicious prompts, whereas the Spicy model provided some participants with nearly all key information needed to obtain the virus. Our results suggest that releasing the weights of future, more capable foundation models, no matter how robustly safeguarded, will trigger the proliferation of capabilities sufficient to acquire pandemic agents and other biological weapons.

Submitted 1 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: Updates in response to online feedback: emphasized the focus on risks from future rather than current models; explained the reasoning behind - and minimal effects of - fine-tuning on virology papers; elaborated on how easier access to synthesized information can reduce barriers to entry; clarified policy recommendations regarding what is necessary but not sufficient; corrected a citation link

arXiv:2308.12267 [pdf, other]

Bugsplainer: Leveraging Code Structures to Explain Software Bugs with Neural Machine Translation

Authors: Parvez Mahbub, Mohammad Masudur Rahman, Ohiduzzaman Shuvo, Avinash Gopal

Abstract: Software bugs cost the global economy billions of dollars each year and take up ~50% of the development time. Once a bug is reported, the assigned developer attempts to identify and understand the source code responsible for the bug and then corrects the code. Over the last five decades, there has been significant research on automatically finding or correcting software bugs. However, there has been little research on automatically explaining the bugs to the developers, which is essential but a highly challenging task. In this paper, we propose Bugsplainer, a novel web-based debugging solution that generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits. Bugsplainer leverages code structures to reason about a bug and employs the fine-tuned version of a text generation model, CodeT5, to generate the explanations. Tool video: https://youtu.be/xga-ScvULpk

Submitted 23 August, 2023; originally announced August 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2212.04584

arXiv:2308.10074 [pdf, other]

Thermodynamic cost of precise timekeeping in an electronic underdamped clock

Authors: Ashwin Gopal, Massimiliano Esposito, Nahuel Freitas

Abstract: Clocks are inherently out-of-equilibrium because, due to friction, they constantly consume free energy to keep track of time. The Thermodynamic Uncertainty Relation (TUR) quantifies the trade-off between the precision of any time-antisymmetric observable and entropy production. In the context of clocks, the TUR implies that a minimum entropy production is needed in order to achieve a certain level of precision in timekeeping. But the TUR has only been proven for overdamped systems. Recently, a toy model of a classical underdamped pendulum clock was proposed that violated this relation (Phys. Rev. Lett. 128, 130606), thus demonstrating that the TUR does not hold for underdamped dynamics. We propose an electronic implementation of such a clock, using a resistor-inductor-capacitor (RLC) circuit and a biased CMOS inverter (NOT gate), which can work at different scales. We find that in the nanoscopic single-electron regime of the circuit, we essentially recover the toy model violating the TUR bound. However, in different macroscopic regimes of the circuit, we show that the TUR bound is restored and analyze the thermodynamic efficiency of timekeeping.

Submitted 2 February, 2024; v1 submitted 19 August, 2023; originally announced August 2023.

Comments: 20 pages, 10 figures

arXiv:2308.08569 [pdf, other]

Fullwave design of cm-scale cylindrical metasurfaces via fast direct solvers

Authors: Wenjin Xue, Hanwen Zhang, Abinand Gopal, Vladimir Rokhlin, Owen D. Miller

Abstract: Large-scale metasurfaces promise nanophotonic performance improvements to macroscopic optics functionality, for applications from imaging to analog computing. Yet the size scale mismatch of centimeter-scale chips versus micron-scale wavelengths prohibits use of conventional full-wave simulation techniques, and has necessitated dramatic approximations. Here, we show that tailoring "fast direct" integral-equation simulation techniques to the form factor of metasurfaces offers the possibility for accurate and efficient full-wave, large-scale metasurface simulations. For cylindrical (two-dimensional) metasurfaces, we demonstrate accurate simulations whose solution time scales \emph{linearly} with the metasurface diameter. Moreover, the solver stores compressed information about the simulation domain that is reusable over many design iterations. We demonstrate the capabilities of our solver through two designs: first, a high-efficiency, high-numerical-aperture metalens that is 20,000 wavelengths in diameter. Second, a high-efficiency, large-beam-width grating coupler. The latter corresponds to millimeter-scale beam design at standard telecommunications wavelengths, while the former, at a visible wavelength of 500 nm, corresponds to a design diameter of 1 cm, created through full simulations of Maxwell's equations.

Submitted 15 August, 2023; originally announced August 2023.

Comments: 11 pages, 6 figures

arXiv:2307.15754 [pdf, other]

A fast procedure for the construction of quadrature formulas for bandlimited functions

Authors: Abinand Gopal, Vladimir Rokhlin

Abstract: We introduce an efficient scheme for the construction of quadrature rules for bandlimited functions. While the scheme is predominantly based on well-known facts about prolate spheroidal wave functions of order zero, it has the asymptotic CPU time estimate $O(n \log n)$ to construct an $n$-point quadrature rule. Moreover, the size of the "$n \log n$" term in the CPU time estimate is small, so for all practical purposes the CPU time cost is proportional to $n$. The performance of the algorithm is illustrated by several numerical examples.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.10430 [pdf, other]

DP-TBART: A Transformer-based Autoregressive Model for Differentially Private Tabular Data Generation

Authors: Rodrigo Castellon, Achintya Gopal, Brian Bloniarz, David Rosenberg

Abstract: The generation of synthetic tabular data that preserves differential privacy is a problem of growing importance. While traditional marginal-based methods have achieved impressive results, recent work has shown that deep learning-based approaches tend to lag behind. In this work, we present Differentially-Private TaBular AutoRegressive Transformer (DP-TBART), a transformer-based autoregressive model that maintains differential privacy and achieves performance competitive with marginal-based methods on a wide variety of datasets, capable of even outperforming state-of-the-art methods in certain settings. We also provide a theoretical framework for understanding the limitations of marginal-based approaches and where deep learning-based approaches stand to contribute most. These results suggest that deep learning-based techniques should be considered as a viable alternative to marginal-based methods in the generation of differentially private synthetic tabular data.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2303.07051 [pdf, other]

Tensor Factorized Hamiltonian Downfolding To Optimize The Scaling Complexity Of The Electronic Correlations Problem on Classical and Quantum Computers

Authors: Ritam Banerjee, Ananthakrishna Gopal, Soham Bhandary, Pavitra Batra, Geetha Thiagarajan, Manoj Nambiar, Anirban Mukherjee

Abstract: Achieving chemical accuracy for strongly correlated molecules is a defining milestone for first-generation, fault-tolerant quantum computers, yet the factorial growth of three, four, and six-index tensor contractions in coupled-cluster CCSD(T), full configuration interaction (FCI), and multireference CI (MRCI) makes current classical and quantum approaches prohibitive. We introduce tensor-factorized Hamiltonian downfolding (TFHD) and its quantum analogue, qubitized downfolding (QD), a hybrid classical-quantum framework that collapses every high-rank object to rank-2 networks executed in depth-optimal, block-encoded circuits. The complexity of these operations scales exponentially with the system size. We aim to find properties of chemical systems by optimizing this scaling through mathematical transformations on the Hamiltonian and the state space. By defining a bi-partition of the many-body Hilbert space into electron-occupied and electron-unoccupied blocks for a given orbital, we perform a downfolding transformation that decouples the electron-occupied block from its complement. We factorize high-rank electronic integrals and cluster amplitude tensors into low-rank tensor factors of a downfolding transformation, mapping the full many-body Hamiltonian into smaller-dimensional block Hamiltonians. This reduces the computational complexity of solving the residual equations for Hamiltonian downfolding from O(N^7) for CCSD(T) and O(N^9) - O(N^10) for CI and MRCI to O(N^3). These operations can be implemented as a family of tensor networks solely made from two-rank tensors. Additionally, we create block-encoding quantum circuits of the tensor networks, generating circuits of O(N^2) depth with O(log N) qubits. We demonstrate super-quadratic speedups of expensive quantum chemistry algorithms on both classical and quantum computers.

Submitted 20 May, 2025; v1 submitted 13 March, 2023; originally announced March 2023.

Comments: 110 pages, 17 figures, 17 tables

arXiv:2205.12659 [pdf, ps, other]

doi 10.1103/PhysRevB.106.155303

Large deviations theory for noisy non-linear electronics: CMOS inverter as a case study

Authors: Ashwin Gopal, Massimiliano Esposito, Nahuel Freitas

Abstract: The latest generation of transistors are nanoscale devices whose performance and reliability are limited by thermal noise in low-power applications. Therefore developing efficient methods to compute the voltage and current fluctuations in such non-linear electronic circuits is essential. Traditional approaches commonly rely on adding Gaussian white noise to the macroscopic dynamical circuit laws, but do not capture rare fluctuations and lead to thermodynamic inconsistencies. A correct and thermodynamically consistent approach can be achieved by describing single-electron transfers as Poisson jump processes accounting for charging effects. But such descriptions can be computationally demanding. To address this issue, we consider the macroscopic limit which corresponds to scaling up the physical dimensions of the transistor and resulting in an increase of the number of electrons on the conductors. In this limit, the thermal fluctuations satisfy a Large Deviations Principle which we show is also remarkably precise in settings involving only a few tens of electrons, by comparing our results with Gillespie simulations and spectral methods. Traditional approaches are recovered by resorting to an ad hoc diffusive approximation introducing inconsistencies. To illustrate these findings, we consider a low-power CMOS inverter, or NOT gate, which is a basic primitive in electronic design. Voltage (resp. current) fluctuations are obtained analytically (semi-analytically) and reveal interesting features.

Submitted 25 May, 2022; originally announced May 2022.

arXiv:2203.03586 [pdf, other]

A photonic integrated chip platform for interlayer exciton valley routing

Authors: Kishor K Mandal, Yashika Gupta, Mandar Sohoni, Achanta Venu Gopal, Anshuman Kumar

Abstract: Interlayer excitons in two-dimensional semiconductor heterostructures show suppressed electron-hole overlap resulting in longer radiative lifetimes as compared to intralayer excitons. Such tightly bound interlayer excitons are relevant for important optoelectronic applications including light storage and quantum communication. Their optical accessibility is, however, limited due to their out-of-plane transition dipole moment. In this work, we design a CMOS-compatible photonic integrated chip platform for enhanced near-field coupling of these interlayer excitons with the whispering gallery modes of a microresonator, exploiting the high confinement of light in a small modal volume and high quality factor of the system. Our platform allows for highly selective emission routing via engineering an asymmetric light transmission which facilitates efficient readout and channeling of the excitonic valley state from such systems.

Submitted 7 March, 2022; originally announced March 2022.

arXiv:2112.06997 [pdf, other]

ELF: Exact-Lipschitz Based Universal Density Approximator Flow

Authors: Achintya Gopal

Abstract: Normalizing flows have grown more popular over the last few years; however, they continue to be computationally expensive, making them difficult to be accepted into the broader machine learning community. In this paper, we introduce a simple one-dimensional one-layer network that has closed form Lipschitz constants; using this, we introduce a new Exact-Lipschitz Flow (ELF) that combines the ease of sampling from residual flows with the strong performance of autoregressive flows. Further, we show that ELF is provably a universal density approximator, more computationally and parameter efficient compared to a multitude of other flows, and achieves state-of-the-art performance on multiple large-scale datasets.

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2112.01477 [pdf, other]

Why Calibration Error is Wrong Given Model Uncertainty: Using Posterior Predictive Checks with Deep Learning

Authors: Achintya Gopal

Abstract: Within the last few years, there has been a move towards using statistical models in conjunction with neural networks with the end goal of being able to better answer the question, "what do our models know?". From this trend, classical metrics such as Prediction Interval Coverage Probability (PICP) and new metrics such as calibration error have entered the general repertoire of model evaluation in order to gain better insight into how the uncertainty of our model compares to reality. One important component of uncertainty modeling is model uncertainty (epistemic uncertainty), a measurement of what the model does and does not know. However, current evaluation techniques tend to conflate model uncertainty with aleatoric uncertainty (irreducible error), leading to incorrect conclusions. In this paper, using posterior predictive checks, we show how calibration error and its variants are almost always incorrect to use given model uncertainty, and further show how this mistake can lead to trust in bad models and mistrust in good models. Though posterior predictive checks have often been used for in-sample evaluation of Bayesian models, we show they still have an important place in the modern deep learning world.

Submitted 2 December, 2021; originally announced December 2021.

arXiv:2111.01878 [pdf, other]

Discovering Supply Chain Links with Augmented Intelligence

Authors: Achintya Gopal, Chunho Chang

Abstract: One of the key components in analyzing the risk of a company is understanding a company's supply chain. Supply chains are constantly disrupted, whether by tariffs, pandemics, severe weather, etc. In this paper, we tackle the problem of predicting previously unknown suppliers and customers of companies using graph neural networks (GNNs) and show strong performance in finding previously unknown connections by combining the predictions of our model and the domain expertise of supply chain analysts.

Submitted 2 November, 2021; originally announced November 2021.

Comments: Presented in ICAIF'21 Workshop on NLP and Network Analysis in Financial Applications

arXiv:2109.07380 [pdf, other]

DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning

Authors: Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny

Abstract: Deep reinforcement learning (RL) has shown great empirical successes, but suffers from brittleness and sample inefficiency. A potential remedy is to use a previously-trained policy as a source of supervision. In this work, we refer to these policies as teachers and study how to transfer their expertise to new student policies by focusing on data usage. We propose a framework, Data CUrriculum for Reinforcement learning (DCUR), which first trains teachers using online deep RL, and stores the logged environment interaction history. Then, students learn by running either offline RL or by using teacher data in combination with a small amount of self-generated data. DCUR's central idea involves defining a class of data curricula which, as a function of training time, limits the student to sampling from a fixed subset of the full teacher data. We test teachers and students using state-of-the-art deep RL algorithms across a variety of data curricula. Results suggest that the choice of data curricula significantly impacts student learning, and that it is beneficial to limit the data during early training stages while gradually letting the data availability grow over time. We identify when the student can learn offline and match teacher performance without relying on specialized offline RL algorithms. Furthermore, we show that collecting a small fraction of online data provides complementary benefits with the data curriculum. Supplementary material is available at https://tinyurl.com/teach-dcur.

Submitted 15 September, 2021; originally announced September 2021.

Comments: Supplementary material is available at https://tinyurl.com/teach-dcur

arXiv:2109.04318 [pdf, other]

Estimation of Corporate Greenhouse Gas Emissions via Machine Learning

Authors: You Han, Achintya Gopal, Liwen Ouyang, Aaron Key

Abstract: As an important step to fulfill the Paris Agreement and achieve net-zero emissions by 2050, the European Commission adopted the most ambitious package of climate impact measures in April 2021 to improve the flow of capital towards sustainable activities. For these and other international measures to be successful, reliable data is key. The ability to see the carbon footprint of companies around the world will be critical for investors to comply with the measures. However, with only a small portion of companies volunteering to disclose their greenhouse gas (GHG) emissions, it is nearly impossible for investors to align their investment strategies with the measures. By training a machine learning model on disclosed GHG emissions, we are able to estimate the emissions of other companies globally who do not disclose their emissions. In this paper, we show that our model provides accurate estimates of corporate GHG emissions to investors such that they are able to align their investments with the regulatory measures and achieve net-zero goals.

Submitted 9 September, 2021; originally announced September 2021.

Comments: Accepted for the Tackling Climate Change with Machine Learning Workshop at ICML 2021

arXiv:2106.13402 [pdf, ps, other]

Efficient algorithms for computing rank-revealing factorizations on a GPU

Authors: Nathan Heavner, Chao Chen, Abinand Gopal, Per-Gunnar Martinsson

Abstract: Standard rank-revealing factorizations such as the singular value decomposition and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level-3 BLAS. This paper presents two alternative algorithms for computing a rank-revealing factorization of the form $A = U T V^*$, where $U$ and $V$ are orthogonal and $T$ is triangular. Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix-matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve an order of magnitude acceleration over finely tuned GPU implementations of the SVD while providing low-rank approximation errors close to that of the SVD.

Submitted 21 May, 2023; v1 submitted 24 June, 2021; originally announced June 2021.

arXiv:2011.00858 [pdf, other]

doi 10.1088/1751-8121/abe5cb

Energetics of critical oscillators in active bacterial baths

Authors: Ashwin Gopal, Édgar Roldán, Stefano Ruffo

Abstract: We investigate the nonequilibrium energetics near a critical point of a non-linear driven oscillator immersed in an active bacterial bath. At the critical point, we reveal a scaling exponent of the average power $\langle\dot{W}\rangle\sim (D_{\rm a}/\tau)^{1/4}$ where $D_{\rm a}$ is the effective diffusivity and $\tau$ the correlation time of the bacterial bath described by a Gaussian colored noise. Other features that we investigate are the average stationary power and the variance of the work both below and above the saddle-node bifurcation. Above the bifurcation, the average power attains an optimal, minimum value for finite $\tau$ that is below its zero-temperature limit. Furthermore, we reveal a finite-time uncertainty relation for active matter which leads to values of the Fano factor of the work that can be below $2k_{\rm B}T_{\rm eff}$, with $T_{\rm eff}$ the effective temperature of the oscillator in the bacterial bath. We analyze different Markovian approximations to describe the nonequilibrium stationary state of the system. Finally, we illustrate our results in the experimental context by considering the example of driven colloidal particles in periodic optical potentials within an E. coli bacterial bath.

Submitted 2 November, 2020; originally announced November 2020.

Comments: 29 pages, 11 figures

arXiv:2009.07419 [pdf, other]

Quasi-Autoregressive Residual (QuAR) Flows

Authors: Achintya Gopal

Abstract: Normalizing Flows are a powerful technique for learning and modeling probability distributions given samples from those distributions. The current state of the art results are built upon residual flows as these can model a larger hypothesis space than coupling layers. However, residual flows are extremely computationally expensive both to train and to use, which limits their applicability in practice. In this paper, we introduce a simplification to residual flows using a Quasi-Autoregressive (QuAR) approach. Compared to the standard residual flow approach, this simplification retains many of the benefits of residual flows while dramatically reducing the compute time and memory requirements, thus making flow-based modeling approaches far more tractable and broadening their potential applicability.

Submitted 15 September, 2020; originally announced September 2020.

Comments: Appeared in ICML Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models 2020

arXiv:2007.15734 [pdf, other]

On the inverse scattering problem for radially-symmetric domains in two dimensions

Authors: Abinand Gopal, Jeremy Hoskins, Vladimir Rokhlin

Abstract: In the present paper we describe a method for solving inverse problems for the Helmholtz equation in radially-symmetric domains given multi-frequency data. Our approach is based on the construction of suitable trace formulas which relate the impedance of the total field at multiple frequencies to derivatives of the potential. Using this trace formula we obtain a system of coupled differential equations which can be solved to obtain the potential in a stable manner. Finally, the performance of the reconstruction algorithm is illustrated with several numerical examples.

Submitted 13 October, 2023; v1 submitted 30 July, 2020; originally announced July 2020.

arXiv:2007.12718 [pdf, other]

An accelerated, high-order accurate direct solver for the Lippmann-Schwinger equation for acoustic scattering in the plane

Authors: Abinand Gopal, Per-Gunnar Martinsson

Abstract: An efficient direct solver for solving the Lippmann-Schwinger integral equation modeling acoustic scattering in the plane is presented. For a problem with $N$ degrees of freedom, the solver constructs an approximate inverse in $\mathcal{O}(N^{3/2})$ operations and then, given an incident field, can compute the scattered field in $\mathcal{O}(N \log N)$ operations. The solver is based on a previously published direct solver for integral equations that relies on rank-deficiencies in the off-diagonal blocks; specifically, the so-called Hierarchically Block Separable format is used. The particular solver described here has been reformulated in a way that improves numerical stability and robustness, and exploits the particular structure of the kernel in the Lippmann-Schwinger equation to accelerate the computation of an approximate inverse. The solver is coupled with a Nyström discretization on a regular square grid, using a quadrature method developed by Ran Duan and Vladimir Rokhlin that attains high-order accuracy despite the singularity in the kernel of the integral equation. A particularly efficient solver is obtained when the direct solver is run at four digits of accuracy, and is used as a preconditioner to GMRES, with each forwards application of the integral operators accelerated by the FFT. Extensive numerical experiments are presented that illustrate the high performance of the method in challenging environments. Using the $10^{\rm th}$-order accurate version of the Duan-Rokhlin quadrature rule, the scheme is capable of solving problems on domains that are over 500 wavelengths wide to residual error below $10^{-10}$ in a couple of hours on a workstation, using 26M degrees of freedom.

Submitted 24 July, 2020; originally announced July 2020.

MSC Class: 65R20

arXiv:1905.02960 [pdf, ps, other]

Solving Laplace problems with corner singularities via rational functions

Authors: Abinand Gopal, Lloyd N. Trefethen

Abstract: A new method is introduced for solving Laplace problems on 2D regions with corners by approximation of boundary data by the real part of a rational function with fixed poles exponentially clustered near each corner. Greatly extending a result of D. J. Newman in 1964 in approximation theory, we first prove that such approximations can achieve root-exponential convergence for a wide range of problems, all the way up to the corner singularities. We then develop a numerical method to compute approximations via linear least-squares fitting on the boundary. Typical problems are solved in < 1s on a laptop to 8-digit accuracy, with the accuracy guaranteed in the interior by the maximum principle. The computed solution is represented globally by a single formula, which can be evaluated in tens of microseconds at each point.

Submitted 20 June, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

MSC Class: 65N35; 41A20; 65E05

arXiv:1902.00374 [pdf, ps, other]

doi 10.1073/pnas.1904139116

New Laplace and Helmholtz solvers

Authors: Abinand Gopal, Lloyd N. Trefethen

Abstract: New numerical algorithms based on rational functions are introduced that can solve certain Laplace and Helmholtz problems on two-dimensional domains with corners faster and more accurately than the standard methods of finite elements and integral equations. The new algorithms point to a reconsideration of the assumptions underlying existing numerical analysis for partial differential equations.

Submitted 1 February, 2019; originally announced February 2019.

arXiv:1812.06007 [pdf, other]

The PowerURV algorithm for computing rank-revealing full factorizations

Authors: Abinand Gopal, Per-Gunnar Martinsson

Abstract: Many applications in scientific computing and data science require the computation of a rank-revealing factorization of a large matrix. In many of these instances the classical algorithms for computing the singular value decomposition are prohibitively computationally expensive. The randomized singular value decomposition can often be helpful, but is not effective unless the numerical rank of the matrix is substantially smaller than the dimensions of the matrix. We introduce a new randomized algorithm for producing rank-revealing factorizations based on existing work by Demmel, Dumitriu and Holtz [Numerische Mathematik, 108(1), 2007] that excels in this regime. The method is exceptionally easy to implement, and results in close-to optimal low-rank approximations to a given matrix. The vast majority of floating point operations are executed in level-3 BLAS, which leads to high computational speeds. The performance of the method is illustrated via several numerical experiments that directly compare it to alternative techniques such as the column pivoted QR factorization, or the QLP method by Stewart.

Submitted 14 December, 2018; originally announced December 2018.

arXiv:1807.11748 [pdf]

Visible absorbing TiO2 thin films by physical deposition methods

Authors: Litty Varghese, Anuradha Patra, Biswajit Mishra, Deepa Khushalani, Achanta Venu Gopal

Abstract: Titanium dioxide is one of the most widely used wide bandgap materials. However, the TiO2 deposited on a substrate is not always transparent leading to a loss in efficiency of the device, especially, the photo response. Herein, we show that atomic layer deposition (ALD) and sputtered TiO2 thin films can be highly absorbing in the visible region. While in ALD, the mechanism is purported to be due to oxygen deficiency, intriguingly, in sputtered films it has been observed that in fact oxygen rich atmosphere leads to visible absorption. We show that the oxygen content during deposition, the resistivity of the film could be controlled and also the photocatalysis response has been evaluated for both the ALD and sputtered films. High resolution TEM and STEM studies show that the origin of visible absorption could be due to the presence of nanoparticles with surface defects inside the amorphous film.

Submitted 31 July, 2018; originally announced July 2018.

arXiv:1804.08127 [pdf, other]

Representation of conformal maps by rational functions

Authors: Abinand Gopal, Lloyd N. Trefethen

Abstract: The traditional view in numerical conformal mapping is that once the boundary correspondence function has been found, the map and its inverse can be evaluated by contour integrals. We propose that it is much simpler, and 10-1000 times faster, to represent the maps by rational functions computed by the AAA algorithm. To justify this claim, first we prove a theorem establishing root-exponential convergence of rational approximations near corners in a conformal map, generalizing a result of D. J. Newman in 1964. This leads to the new algorithm for approximating conformal maps of polygons. Then we turn to smooth domains and prove a sequence of four theorems establishing that in any conformal map of the unit circle onto a region with a long and slender part, there must be a singularity or loss of univalence exponentially close to the boundary, and polynomial approximations cannot be accurate unless of exponentially high degree. This motivates the application of the new algorithm to smooth domains, where it is again found to be highly effective.

Submitted 10 December, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

MSC Class: 30C30; 41A20; 65E05

arXiv:1803.02257 [pdf]

Methodology to analyze the accuracy of 3D objects reconstructed with collaborative robot based monocular LSD-SLAM

Authors: Sergey Triputen, Atmaraaj Gopal, Thomas Weber, Christian Hofert, Kristiaan Schreve, Matthias Ratsch

Abstract: SLAM systems are mainly applied for robot navigation while research on feasibility for motion planning with SLAM for tasks like bin-picking, is scarce. Accurate 3D reconstruction of objects and environments is important for planning motion and computing optimal gripper pose to grasp objects. In this work, we propose the methods to analyze the accuracy of a 3D environment reconstructed using a LSD-SLAM system with a monocular camera mounted onto the gripper of a collaborative robot. We discuss and propose a solution to the pose space conversion problem. Finally, we present several criteria to analyze the 3D reconstruction accuracy. These could be used as guidelines to improve the accuracy of 3D reconstructions with monocular LSD-SLAM and other SLAM based solutions.

Submitted 6 March, 2018; originally announced March 2018.

Comments: 5 pages, 5 figures, 2018 International Conference on Intelligent Autonomous Systems (ICoIAS 2018)

arXiv:1611.06841 [pdf, ps, other]

Coherent perfect absorption mediated enhancement and optical bistability in phase conjugation

Authors: K. Nireekshan Reddy, Achanta Venu Gopal, S. Dutta Gupta

Abstract: We study phase conjugation in a nonlinear composite slab when the counter propagating pump waves are completely absorbed by means of coherent perfect absorption. Under the undepleted pump approximation the coupling constant and the phase conjugated reflectivity are shown to undergo a substantial increase and multivalued response. The effect can be used for efficient switching of the phase conjugated reflectivity in photonic circuits.

Submitted 26 November, 2016; v1 submitted 21 November, 2016; originally announced November 2016.

arXiv:1610.00612 [pdf, ps, other]

Transverse spin with coupled plasmons

Authors: Samyobrata Mukherjee, A V Gopal, S Dutta Gupta

Abstract: We study theoretically the transverse spin associated with the eigenmodes of a thin metal film embedded in a dielectric. We show that the transverse spin has a direct dependence on the nature and strength of the coupling leading to two distinct branches for the long- and short- range modes. We show that the short-range mode exhibits larger extraordinary spin because of its more 'structured' nature due to higher decay in propagation. In contrast to some of the earlier studies, calculations are performed retaining the full lossy character of the metal. In the limit of vanishing losses we present analytical results for the extraordinary spin for both the coupled modes. The results can have direct implications for enhancing the elusive transverse spin exploiting the coupled plasmon structures.

Submitted 14 October, 2016; v1 submitted 3 October, 2016; originally announced October 2016.

arXiv:1607.07515 [pdf, other]

Single Stage Prediction with Embedded Topic Modeling of Online Reviews for Mobile App Management

Authors: Shawn Mankad, Shengli Hu, Anandasivam Gopal

Abstract: Mobile apps are one of the building blocks of the mobile digital economy. A differentiating feature of mobile apps to traditional enterprise software is online reviews, which are available on app marketplaces and represent a valuable source of consumer feedback on the app. We create a supervised topic modeling approach for app developers to use mobile reviews as useful sources of quality and customer feedback, thereby complementing traditional software testing. The approach is based on a constrained matrix factorization that leverages the relationship between term frequency and a given response variable in addition to co-occurrences between terms to recover topics that are both predictive of consumer sentiment and useful for understanding the underlying textual themes. The factorization is combined with ordinal regression to provide guidance from online reviews on a single app's performance as well as systematically compare different apps over time for benchmarking of features and consumer sentiment. We apply our approach using a dataset of over 100,000 mobile reviews over several years for three of the most popular online travel agent apps from the iTunes and Google Play marketplaces.

Submitted 19 February, 2018; v1 submitted 25 July, 2016; originally announced July 2016.

Comments: 28 pages, 4 figures

arXiv:1502.03657 [pdf]

doi 10.1007/s12043-019-1876-2

Single and multiband THz Metamaterial Polarizers

Authors: Bagvanth Reddy Sangala, Arvind Nagarajan, Prathmesh Deshmukh, Harshad Surdi, Goutam Rana, Achanta Venu Gopal, S. S. Prabhu

Abstract: We report single and multiband linear polarizers for terahertz (THz) frequencies using cut-wire metamaterials (MM). The MMs are designed by finite element method, fabricated by electron beam lithography, and characterized by THz time-domain spectroscopy. The MM unit cells consist of single or multiple length cut-wire pads of gold on semi-insulating Gallium Arsenide for single or multiple band polarizers. The dependence of the resonance frequency of the single band polarizer on the length of the cut-wires is explained based on a transmission line model.

Submitted 12 February, 2015; originally announced February 2015.

Comments: 6 pages, 3 figures

Journal ref: Pramana - J Phys 94, 2 (2020)

arXiv:1411.6464 [pdf]

A Broadband Dipolar Resonance in THz Metamaterials

Authors: Bagvanth Reddy Sangala, Harshad Surdi, Achanta Venu Gopal, S. S. Prabhu

Abstract: We demonstrate a THz metamaterial with broadband dipole resonance originating due to the hybridization of LC resonances. The structure optimized by finite element method simulations is fabricated by electron beam lithography and characterized by terahertz time-domain spectroscopy. Numerically, we found that when two LC metamaterial resonators are brought together, an electric dipole resonance arises in addition to the LC resonances. We observed a strong dependence of the width of these resonances on the separation between the resonators. This dependence can be explained based on series and parallel RLC circuit analogies. The broadband dipole resonance appears when both the resonators are fused together. The metamaterial has a stopband with FWHM of 0.47 THz centered at 1.12 THz. The experimentally measured band features are in reasonable agreement with the simulated ones. The experimental power extinction ratio of THz in the stopbands is found to be 15 dB.

Submitted 24 November, 2014; originally announced November 2014.

Comments: 13 pages, 4 figures, 1 table

arXiv:1405.5378 [pdf, ps, other]

doi 10.1063/1.4903759

Energy deposition dynamics of femtosecond pulses in water

Authors: Stefano Minardi, Carles Milián, Donatas Majus, Amrutha Gopal, Gintaras Tamošauskas, Arnaud Couairon, Thomas Pertsch, Audrius Dubietis

Abstract: We exploit inverse Raman scattering and solvated electron absorption to perform a quantitative characterization of the energy loss and ionization dynamics in water with tightly focused near-infrared femtosecond pulses. A comparison between experimental data and numerical simulations suggests that the ionization energy of water is 8 eV, rather than the commonly used value of 6.5 eV. We also introduce an equation for the Raman gain valid for ultra-short pulses that validates our experimental procedure.

Submitted 4 November, 2014; v1 submitted 21 May, 2014; originally announced May 2014.

Comments: 4 pages, 5 figures, submitted to Applied Physics Letters

Journal ref: Appl. Phys. Lett. 105, 224104 (2014)

arXiv:1309.3286 [pdf]

Plasmonic quasicrystals for designable ultra broadband transmission enhancement and second harmonic generation

Authors: Sachin Kasture, Ajith P R, V J Yallapragada, Raj Patil, Nikesh V. V., Gajendra Mulay, Achanta Venu Gopal

Abstract: Quasi-crystals are intriguing as they exhibit rotational symmetry and long range ordering but lack translational symmetry. 2-dimensional metal-dielectric patterns are interesting to make use of surface plasmon polariton (SPP) mediated local field enhancement and for near dispersionless SPP modes. In plasmonic crystals, the orientation and periodicity of the pattern dictate the polarization response and the discrete plasmon resonances while the interfaces define the plasmon dispersion. However, unique properties of plasmonic quasicrystals lead to polarization independence, designable k-space and broadband transmission enhancement due to SPP mediation. These are useful in many applications like energy harvesting, nonlinear optics and quantum plasmonics. We demonstrate design and fabrication of large area quasicrystal air hole patterns of pi/5 symmetry in metal film in which broadband, launch angle and polarization independent transmission enhancement as well as broadband second harmonic generation are observed. Designable transmission response, other symmetries and tilings are possible.

Submitted 12 September, 2013; originally announced September 2013.

Comments: 10 pages, 4 figures

arXiv:1309.0465 [pdf]

Superluminal propagation and broadband omnidirectional antireflection in optical reflectionless potentials

Authors: L. V. Thekkekara, Achanta Venu Gopal, Sachin Kasture, Gajendra Mulay, S. Dutta Gupta

Abstract: Reflectionless potentials (RPs) represent a class of potentials that offer total transmission in the context of one dimensional scattering. Optical realization of RPs in stratified medium can exhibit broadband omnidirectional antireflection property. In addition to the antireflection property, RPs are also expected to demonstrate negative delay. We designed refractive index profiles conforming to RPs and realize them in stratified media consisting of Al2O3 and TiO2 heterolayers. In these structures we observed < 1% reflection over the broad wavelength range of 350 nm to 2500 nm for angles of incidence 0 - 50 degrees. The observed reflection and transmission response of RPs are polarization independent. A negative delay of about 31 fsec with discernible pulse narrowing was observed in passage through two RPs. These RPs can be interesting for optical instrumentation as broadband, omni-directional antireflection coatings as well as in pulse control and transmission applications like delay lines.

Submitted 2 September, 2013; originally announced September 2013.

Comments: 5 pages, 3 figures

arXiv:1305.5983 [pdf, ps, other]

doi 10.1364/OL.38.002517

Nonlinearity Induced Critical Coupling

Authors: K. Nireekshan Reddy, Achanta Venu Gopal, S. Dutta Gupta

Abstract: We study a critically coupled system (Opt. Lett., \textbf{32}, 1483 (2007)) with a Kerr-nonlinear spacer layer. Nonlinearity is shown to inhibit null-scattering in a critically coupled system at low powers. However, a system detuned from critical coupling can exhibit near-complete suppression of scattering by means of nonlinearity-induced changes in refractive index. Our studies reveal clearly an important aspect of critical coupling as a delicate balance in both the amplitude and the phase relations, while a nonlinear resonance in dispersive bistability concerns only the phase.

Submitted 25 May, 2013; originally announced May 2013.

Showing 1–50 of 65 results for author: Gopal, A