-
Evaluating AI for Finance: Is AI Credible at Assessing Investment Risk?
Authors:
Divij Chawla,
Ashita Bhutada,
Do Duc Anh,
Abhinav Raghunathan,
Vinod SP,
Cathy Guo,
Dar Win Liew,
Prannaya Gupta,
Rishabh Bhardwaj,
Rajat Bhardwaj,
Soujanya Poria
Abstract:
We evaluate the credibility of leading AI models in assessing investment risk appetite. Our analysis spans proprietary (GPT-4, Claude 3.7, Gemini 1.5) and open-weight models (LLaMA 3.1/3.3, DeepSeek-V3, Mistral-small), using 1,720 user profiles constructed with 16 risk-relevant features across 10 countries and both genders. We observe significant variance across models in score distributions and demographic sensitivity. For example, GPT-4o assigns higher risk scores to Nigerian and Indonesian profiles, while LLaMA and DeepSeek show opposite gender tendencies in risk classification. While some models (e.g., GPT-4o, LLaMA 3.1) align closely with expected scores in low- and mid-risk ranges, none maintain consistent performance across regions and demographics. Our findings highlight the need for rigorous, standardized evaluations of AI systems in regulated financial contexts to prevent bias, opacity, and inconsistency in real-world deployment.
Submitted 24 May, 2025;
originally announced May 2025.
-
Reconstruction of a vector field and a symmetric $2$-tensor field from the moment ray transforms in $\mathbb{R}^2$
Authors:
Rahul Bhardwaj,
Karishman B. Solanki
Abstract:
We present a technique for recovering a vector field and a symmetric $2$-tensor field, both real-valued and compactly supported in some strictly convex bounded domain with smooth boundary in the Euclidean plane, from the sum of their attenuated moment ray transforms. In addition, we provide a stability estimate for recovering both the vector field and the symmetric $2$-tensor field from the aforementioned ray transform.
Submitted 4 May, 2025;
originally announced May 2025.
-
Ultrafast dynamics of vibronically dressed core-excitons in graphite: a femtosecond RIXS perspective
Authors:
Marco Malvestuto,
Beatrice Volpato,
Elena Babici,
Richa Bhardwaj,
Antonio Caretta,
Simone Laterza,
Fulvio Parmigiani,
Michele Manfredda,
Alberto Simoncig,
Marco Zangrando,
Alexander Demidovich,
Peter Susnjar,
Enrico Massimiliano Allaria,
Alexander Darius Brynes,
David Garzella,
Luca Giannessi,
Primoz Rebernik,
Filippo Sottocorona,
Dino Novko
Abstract:
This study demonstrates one of the first implementations of time-resolved resonant inelastic X-ray scattering (tr-RIXS), marking a seminal extension of RIXS spectroscopy into the ultrafast time domain. By investigating the ultrafast dynamics of vibronically dressed core excitons in graphite using femtosecond X-ray pulses from a Free Electron Laser, we reveal previously inaccessible insights into the transient coupling between core excitons and specific optical phonon modes. Our approach establishes tr-RIXS as a powerful, transformative tool capable of elucidating the intricate interplay between electronic and lattice dynamics, opening new avenues in ultrafast materials research.
Submitted 17 April, 2025;
originally announced April 2025.
-
Tensor tomography for a set of generalized V-line transforms in $\mathbb{R}^2$
Authors:
Rahul Bhardwaj
Abstract:
We study a set of generalized V-line transforms, namely longitudinal, mixed, and transverse V-line transforms, of a symmetric $m$-tensor field in $\mathbb{R}^2$. The goal of this article is to recover a symmetric $m$-tensor field $\textbf{f}$ supported in a disk $\mathbb{D}_R$, with radius $R$ and centered at the origin, by a combination of the aforementioned generalized V-line transforms, using two different techniques for different sets of data.
Submitted 7 February, 2025;
originally announced February 2025.
-
MSTS: A Multimodal Safety Test Suite for Vision-Language Models
Authors:
Paul Röttger,
Giuseppe Attanasio,
Felix Friedrich,
Janis Goldzycher,
Alicia Parrish,
Rishabh Bhardwaj,
Chiara Di Bonaventura,
Roman Eng,
Gaia El Khoury Geagea,
Sujata Goswami,
Jieun Han,
Dirk Hovy,
Seogyeong Jeong,
Paloma Jeretič,
Flor Miriam Plaza-del-Arco,
Donya Rooein,
Patrick Schramowski,
Anastassia Shaitarova,
Xudong Shen,
Richard Willats,
Andrea Zugarini,
Bertie Vidgen
Abstract:
Vision-language models (VLMs), which process image and text inputs, are increasingly integrated into chat assistants and other consumer AI applications. Without proper safeguards, however, VLMs may give harmful advice (e.g. how to self-harm) or encourage unsafe behaviours (e.g. to consume drugs). Despite these clear hazards, little work so far has evaluated VLM safety and the novel risks created by multimodal inputs. To address this gap, we introduce MSTS, a Multimodal Safety Test Suite for VLMs. MSTS comprises 400 test prompts across 40 fine-grained hazard categories. Each test prompt consists of a text and an image that only in combination reveal their full unsafe meaning. With MSTS, we find clear safety issues in several open VLMs. We also find some VLMs to be safe by accident, meaning that they are safe because they fail to understand even simple test prompts. We translate MSTS into ten languages, showing non-English prompts to increase the rate of unsafe model responses. We also show models to be safer when tested with text only rather than multimodal prompts. Finally, we explore the automation of VLM safety assessments, finding even the best safety classifiers to be lacking.
Submitted 17 January, 2025;
originally announced January 2025.
-
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
Authors:
Haonan Li,
Xudong Han,
Zenan Zhai,
Honglin Mu,
Hao Wang,
Zhenxuan Zhang,
Yilin Geng,
Shom Lin,
Renxi Wang,
Artem Shelmanov,
Xiangyu Qi,
Yuxia Wang,
Donghai Hong,
Youliang Yuan,
Meng Chen,
Haoqin Tu,
Fajri Koto,
Tatsuki Kuribayashi,
Cong Zeng,
Rishabh Bhardwaj,
Bingchen Zhao,
Yawen Duan,
Yi Liu,
Emad A. Alghamdi,
Yaodong Yang
, et al. (10 additional authors not shown)
Abstract:
We introduce Libra-Leaderboard, a comprehensive framework designed to rank LLMs through a balanced evaluation of performance and safety. Combining a dynamic leaderboard with an interactive LLM arena, Libra-Leaderboard encourages the joint optimization of capability and safety. Unlike traditional approaches that average performance and safety metrics, Libra-Leaderboard uses a distance-to-optimal-score method to calculate the overall rankings. This approach incentivizes models to achieve a balance rather than excelling in one dimension at the expense of the other. In the first release, Libra-Leaderboard evaluates 26 mainstream LLMs from 14 leading organizations, identifying critical safety challenges even in state-of-the-art models.
Submitted 24 December, 2024;
originally announced December 2024.
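As a rough illustration of why a distance-to-optimal score rewards balance where a plain average does not, the following minimal sketch may help. It is not the leaderboard's actual formula: the function name `distance_to_optimal_score`, the assumption that both scores are normalized to [0, 1], and the use of Euclidean distance to the ideal point (1, 1) are all illustrative choices.

```python
import math


def distance_to_optimal_score(capability, safety):
    """Hypothetical overall score in [0, 1]; 1.0 means the ideal point (1, 1).

    Assumes both inputs are already normalized to [0, 1]. The Euclidean
    metric is an assumption, not taken from the Libra-Leaderboard paper.
    """
    if not (0.0 <= capability <= 1.0 and 0.0 <= safety <= 1.0):
        raise ValueError("scores must be normalized to [0, 1]")
    distance = math.hypot(1.0 - capability, 1.0 - safety)
    # Divide by the largest possible distance so the result stays in [0, 1].
    return 1.0 - distance / math.sqrt(2.0)


# Two made-up models with the same average (0.6) but different balance.
lopsided = distance_to_optimal_score(0.9, 0.3)
balanced = distance_to_optimal_score(0.6, 0.6)
```

Under these assumptions the balanced model scores higher than the lopsided one even though a simple average would rank them equally.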
-
SkyServe: Serving AI Models across Regions and Clouds with Spot Instances
Authors:
Ziming Mao,
Tian Xia,
Zhanghao Wu,
Wei-Lin Chiang,
Tyler Griggs,
Romil Bhardwaj,
Zongheng Yang,
Scott Shenker,
Ion Stoica
Abstract:
Recent years have witnessed explosive growth in AI models. The high cost of hosting AI services on GPUs and their demanding service requirements make it timely and challenging to lower service costs and guarantee service quality. While spot instances have long been offered at a large discount, spot preemptions have discouraged users from using them to host model replicas when serving AI models.
To address this, we propose a simple yet efficient policy, SpotHedge, that leverages spot replicas across different failure domains (e.g., regions and clouds) to ensure availability, lower costs, and high service quality. SpotHedge intelligently spreads spot replicas across different regions and clouds to improve availability and reduce correlated preemptions, overprovisions cheap spot replicas beyond what is required as a safeguard against possible preemptions, and dynamically falls back to on-demand replicas when spot replicas become unavailable. We built SkyServe, a system leveraging SpotHedge to efficiently serve AI models over a mixture of spot and on-demand replicas across regions and clouds. We compared SkyServe with both research and production systems on real AI workloads: SkyServe reduces cost by 43% on average while achieving high resource availability compared to using on-demand replicas. Additionally, SkyServe improves P50, P90, and P99 latency by 2.3$\times$, 2.1$\times$, and 2.1$\times$ on average, respectively, compared to other research and production systems.
Submitted 3 March, 2025; v1 submitted 3 November, 2024;
originally announced November 2024.
-
Quantitative comparison of TDDFT-calculated HHG yields in ring-shaped organic molecules
Authors:
Stephanie N. Armond,
Kyle A. Hamer,
Ravi Bhardwaj,
Francois Mauger,
Kenneth Lopata,
Kenneth J. Schafer,
Mette B. Gaarde
Abstract:
We compare the high-harmonic-generation (HHG) yield driven by a mid-infrared laser in three organic ring-shaped molecules, calculated using time-dependent density-functional theory (TDDFT). We average the yield over the relative orientation of the molecules and the linearly-polarized, 1825 nm driving laser pulse in order to compare to experimental spectra obtained by Alharbi et al., Phys. Rev. A 92, 041801 (2015). We find that the raw TDDFT-calculated HHG yield in cyclohexane (CHA) is strongly overestimated compared to those of benzene and cyclohexene, and that this can be attributed to unphysically large contributions from CHA orbitals lying well below the highest-occupied molecular orbital. We show that implementing a simple orbital-resolved scaling factor, which corrects the yield of the tunneling ionization contribution to the first step in the HHG process, leads to much better comparisons with experimental results. Our results are encouraging for the use of TDDFT in systematic computations of HHG in large molecules.
Submitted 17 September, 2024;
originally announced September 2024.
-
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
Authors:
Maojia Song,
Shang Hong Sim,
Rishabh Bhardwaj,
Hai Leong Chieu,
Navonil Majumder,
Soujanya Poria
Abstract:
LLMs are an integral component of retrieval-augmented generation (RAG) systems. While many studies focus on evaluating the overall quality of end-to-end RAG systems, there is a gap in understanding the appropriateness of LLMs for the RAG task. To address this, we introduce Trust-Score, a holistic metric that evaluates the trustworthiness of LLMs within the RAG framework. Our results show that various prompting methods, such as in-context learning, fail to effectively adapt LLMs to the RAG task as measured by Trust-Score. Consequently, we propose Trust-Align, a method to align LLMs for improved Trust-Score performance. 26 out of 27 models aligned using Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5. Specifically, in LLaMA-3-8b, Trust-Align outperforms FRONT on ASQA (up 12.56), QAMPARI (up 36.04), and ELI5 (up 17.69). Trust-Align also significantly enhances models' ability to correctly refuse and provide quality citations. We also demonstrate the effectiveness of Trust-Align across different open-weight models, including the LLaMA series (1b to 8b), Qwen-2.5 series (0.5b to 7b), and Phi3.5 (3.8b). We release our code at https://github.com/declare-lab/trust-align.
Submitted 24 April, 2025; v1 submitted 17 September, 2024;
originally announced September 2024.
-
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
Authors:
Tej Deep Pala,
Vernon Y. H. Toh,
Rishabh Bhardwaj,
Soujanya Poria
Abstract:
In today's era, where large language models (LLMs) are integrated into numerous real-world applications, ensuring their safety and robustness is crucial for responsible AI usage. Automated red-teaming methods play a key role in this process by generating adversarial attacks to identify and mitigate potential vulnerabilities in these models. However, existing methods often struggle with slow performance, limited categorical diversity, and high resource demands. While Rainbow Teaming, a recent approach, addresses the diversity challenge by framing adversarial prompt generation as a quality-diversity search, it remains slow and requires a large fine-tuned mutator for optimal performance. To overcome these limitations, we propose Ferret, a novel approach that builds upon Rainbow Teaming by generating multiple adversarial prompt mutations per iteration and using a scoring function to rank and select the most effective adversarial prompt. We explore various scoring functions, including reward models, Llama Guard, and LLM-as-a-judge, to rank adversarial mutations based on their potential harm and thereby improve the efficiency of the search for harmful mutations. Our results demonstrate that Ferret, utilizing a reward model as a scoring function, improves the overall attack success rate (ASR) to 95%, which is 46% higher than Rainbow Teaming. Additionally, Ferret reduces the time needed to achieve a 90% ASR by 15.2% compared to the baseline and generates adversarial prompts that are transferable, i.e., effective on other, larger LLMs. Our code is available at https://github.com/declare-lab/ferret.
Submitted 20 August, 2024;
originally announced August 2024.
-
WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models
Authors:
Prannaya Gupta,
Le Qi Yau,
Hao Han Low,
I-Shiang Lee,
Hugo Maximus Lim,
Yu Xin Teoh,
Jia Hng Koh,
Dar Win Liew,
Rishabh Bhardwaj,
Rajat Bhardwaj,
Soujanya Poria
Abstract:
WalledEval is a comprehensive AI safety testing toolkit designed to evaluate large language models (LLMs). It accommodates a diverse range of models, including both open-weight and API-based ones, and features over 35 safety benchmarks covering areas such as multilingual safety, exaggerated safety, and prompt injections. The framework supports both LLM and judge benchmarking and incorporates custom mutators to test safety against various text-style mutations, such as future tense and paraphrasing. Additionally, WalledEval introduces WalledGuard, a new, small, and performant content moderation tool, and two datasets: SGXSTest and HIXSTest, which serve as benchmarks for assessing the exaggerated safety of LLMs and judges in cultural contexts. We make WalledEval publicly available at https://github.com/walledai/walledeval.
Submitted 19 August, 2024; v1 submitted 7 August, 2024;
originally announced August 2024.
-
Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming
Authors:
Vernon Toh Yan Han,
Rishabh Bhardwaj,
Soujanya Poria
Abstract:
We propose Ruby Teaming, a method that improves on Rainbow Teaming by including a memory cache as its third dimension. The memory dimension provides cues to the mutator to yield better-quality prompts, both in terms of attack success rate (ASR) and quality diversity. The prompt archive generated by Ruby Teaming has an ASR of 74%, which is 20% higher than the baseline. In terms of quality diversity, Ruby Teaming outperforms Rainbow Teaming by 6% and 3% on Shannon's Evenness Index (SEI) and Simpson's Diversity Index (SDI), respectively.
Submitted 17 June, 2024;
originally announced June 2024.
-
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Authors:
Pala Tej Deep,
Rishabh Bhardwaj,
Soujanya Poria
Abstract:
With the proliferation of domain-specific models, model merging has emerged as a set of techniques that combine the capabilities of multiple models into one that can multitask without the cost of additional training. In this paper, we propose a new model merging technique, Drop and rEscaLe via sampLing with mAgnitude (DELLA-Merging), that employs a novel pruning technique, MAGPRUNE, which shows significant advantages over DARE and TIES. MAGPRUNE first ranks the parameters in order of their magnitude and assigns higher dropout probabilities (p) to parameters with lower ranks corresponding to lower magnitudes. To approximate the original embeddings, MAGPRUNE employs a rescaling operation on the parameters that survive the random dropping by 1/(1 - p). On three different expert models considered for merging (LM, Math, Code) and corresponding benchmark datasets (AlpacaEval, GSM8K, MBPP), DELLA shows an average improvement of 2.4 points over baseline methods employing delta parameter pruning (an improvement of 3.6 points over TIES, 1.2 points over DARE), and 11.1 points over the no-pruning baseline (TA). We release the source code at: https://github.com/declare-lab/della.
Submitted 17 June, 2024;
originally announced June 2024.
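The MAGPRUNE step described in the DELLA-Merging abstract (rank by magnitude, give lower-magnitude parameters a higher dropout probability, rescale survivors by 1/(1 - p)) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `magprune_sketch`, the `eps` spread parameter, and the linear probability schedule around the mean rate `p` are assumptions made for this example.

```python
import random


def magprune_sketch(delta, p=0.5, eps=0.2, seed=None):
    """Drop-and-rescale over a flat list of delta parameters.

    Assumption: per-parameter dropout probabilities are spaced linearly
    from p + eps/2 (smallest magnitude) down to p - eps/2 (largest).
    """
    if p - eps / 2 < 0.0 or p + eps / 2 >= 1.0:
        raise ValueError("need 0 <= p - eps/2 and p + eps/2 < 1")
    rng = random.Random(seed)
    n = len(delta)
    # Indices sorted by ascending magnitude: rank 0 is the smallest value.
    order = sorted(range(n), key=lambda i: abs(delta[i]))
    probs = [p] * n
    if n > 1:
        for rank, i in enumerate(order):
            probs[i] = (p + eps / 2) - eps * rank / (n - 1)
    # Drop each parameter with its own probability; rescale the survivors
    # by 1 / (1 - p_i) so the expected value of each entry is preserved.
    pruned = [
        0.0 if rng.random() < probs[i] else delta[i] / (1.0 - probs[i])
        for i in range(n)
    ]
    return pruned, probs


example_delta = [0.1, -2.0, 0.5, 3.0]
pruned, probs = magprune_sketch(example_delta, p=0.5, eps=0.2, seed=0)
```

Rescaling each survivor by its own 1/(1 - p_i) keeps the pruned vector an unbiased estimate of the original deltas, which is the stated purpose of the rescaling operation.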
-
On Unitarity of Bespoke Amplitudes
Authors:
Rishabh Bhardwaj,
Marcus Spradlin,
Anastasia Volovich,
He-Chen Weng
Abstract:
We use partial wave unitarity to constrain various bespoke four-point amplitudes. We start by constructing bespoke generalizations of the type I superstring amplitude, which we show satisfy dual resonance and have suitable high-energy limits. By analyzing the behavior of partial wave coefficients for highly massive states, we strictly rule out all bespoke amplitudes with asymptotically non-linear Regge trajectories and place constraints on the first few non-trivial parameters in asymptotically linear cases. Finally, we argue that while a large class of unitary bespoke amplitudes fails to satisfy Regge Sum Rules, there exists a smaller sub-class with a vanishing mass gap that is superpolynomially bounded.
Submitted 10 December, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Inversion of generalized V-line transforms of vector fields in $\mathbb{R}^2$
Authors:
Rahul Bhardwaj,
Rohit Kumar Mishra,
Manmohan Vashisth
Abstract:
This article studies the inverse problem of recovering a vector field supported in $\mathbb{D}_R$, the disk of radius $R$ centered at the origin, through a set of generalized broken ray/V-line transforms, namely longitudinal and transverse V-line transforms. Geometrically, we work with broken lines that start from the boundary of a disk and break at a fixed angle after traveling a distance along the diameter. We derive two inversion algorithms to recover a vector field in $\mathbb{R}^2$ from the knowledge of its longitudinal and transverse V-line transforms over two different subsets of the aforementioned broken lines in $\mathbb{R}^2$.
Submitted 18 April, 2024;
originally announced April 2024.
-
HyperTTS: Parameter Efficient Adaptation in Text to Speech using Hypernetworks
Authors:
Yingting Li,
Rishabh Bhardwaj,
Ambuj Mehrish,
Bo Cheng,
Soujanya Poria
Abstract:
Neural speech synthesis, or text-to-speech (TTS), aims to transform a signal from the text domain to the speech domain. While TTS architectures that train and test on the same set of speakers have seen significant improvements, out-of-domain speaker performance still faces enormous limitations. Domain adaptation on a new set of speakers can be achieved by fine-tuning the whole model for each new domain, which is parameter-inefficient. This problem can be solved by Adapters, which provide a parameter-efficient alternative for domain adaptation. Although Adapters are popular in NLP, speech synthesis has not seen much improvement from them. In this work, we present HyperTTS, which comprises a small learnable network, a "hypernetwork", that generates the parameters of the Adapter blocks, allowing us to condition Adapters on speaker representations and making them dynamic. Extensive evaluations in two domain adaptation settings demonstrate its effectiveness in achieving state-of-the-art performance in the parameter-efficient regime. We also compare different variants of HyperTTS against baselines in several studies. Promising results on the dynamic adaptation of adapter parameters using hypernetworks open up new avenues for domain-generic multi-speaker TTS systems. The audio samples and code are available at https://github.com/declare-lab/HyperTTS.
Submitted 6 April, 2024;
originally announced April 2024.
-
Celestial soft currents at one-loop and their OPEs
Authors:
Rishabh Bhardwaj,
Akshay Yelleshpur Srikant
Abstract:
Conformally soft operators and their associated soft theorems on the celestial sphere encode the low energy behaviour of bulk scattering amplitudes. They lead to an infinite dimensional symmetry algebra of the celestial CFT at tree-level. In this paper, we introduce new operators in the celestial CFT in order to extend the definition of conformally soft currents to include one-loop effects. We then compute their OPEs with other operators in the theory. We also examine new subtleties that arise in defining OPEs of two conformally soft operators. We elucidate the connection between the new operators and loop corrected soft theorems in the bulk. Finally, we conclude by demonstrating how these operators fit into the framework of a logarithmic CFT.
Submitted 15 March, 2024;
originally announced March 2024.
-
Oscillatory Hall effect from magnetoelectronic coupling in flexoelectronic silicon
Authors:
Paul C. Lou,
Ravindra G. Bhardwaj,
Anand Katailiha,
W. P. Beyermann,
Sandeep Kumar
Abstract:
Magnetoelectronic coupling can be defined as cross-domain coupling between electronic and magnetic properties, where modulation in magnetic properties changes the electronic properties. In this letter, explicit experimental evidence of magnetoelectronic coupling is presented, uncovered from an oscillatory Hall effect response in Hall measurements. The strain gradient in a MgO (1.8 nm)/p-Si (~400 nm) freestanding sample leads to the transfer of electrons ($\sim 5\times10^{18}$ cm$^{-3}$) from the valence to the conduction band due to flexoelectronic charge separation in the p-Si layer. The resulting flexoelectronic polarization gives rise to a temporal magnetic moment from dynamical multiferroicity. The external magnetic field changes the net temporal magnetic moment, which causes modulations in charge carrier concentration and an oscillatory Hall effect. The period of the oscillatory Hall response is 1.12 T, which is attributed to the magnitude of the temporal magnetic moment. The discovery of the oscillatory Hall effect adds a new member to the family of Hall effects.
Submitted 26 February, 2024;
originally announced February 2024.
-
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
Authors:
Rishabh Bhardwaj,
Do Duc Anh,
Soujanya Poria
Abstract:
Aligned language models face a significant limitation: fine-tuning them often compromises safety. To tackle this, we propose a simple method, RESTA, that performs LLM safety realignment. RESTA stands for REstoring Safety through Task Arithmetic. At its core, it involves a simple arithmetic addition of a safety vector to the weights of the compromised model. We demonstrate the effectiveness of RESTA in both parameter-efficient and full fine-tuning, covering a wide range of downstream tasks, including instruction following in Chinese, English, and Hindi, as well as problem-solving capabilities in Code and Math. We also showcase the generalizability of RESTA on three existing safety evaluation benchmarks and a multilingual benchmark dataset proposed as a part of this work, consisting of 550 harmful questions covering 11 categories, each with 5 sub-categories of harm. Overall, RESTA decreases the harmfulness of the compromised model from 18.6% to 5.1% and from 9.2% to 1.5% in parameter-efficient and full fine-tuning, respectively, while maintaining most of the model's performance on the task. We release the source code at: https://github.com/declare-lab/resta.
Submitted 18 February, 2024;
originally announced February 2024.
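The core operation named in the RESTA abstract, adding a safety vector to the weights of the compromised model, can be illustrated with a minimal sketch. Everything beyond that single addition is an assumption for the example: the function name `add_safety_vector`, the `scale` coefficient, the dict-of-lists weight layout, and the choice to leave parameters without a safety-vector entry unchanged. How the safety vector itself is obtained is not covered here.

```python
def add_safety_vector(weights, safety_vector, scale=1.0):
    """Return new weights = weights + scale * safety_vector, per parameter.

    `weights` and `safety_vector` map parameter names to flat lists of
    floats (a stand-in for real tensors). `scale` is a hypothetical knob.
    """
    restored = {}
    for name, values in weights.items():
        delta = safety_vector.get(name)
        if delta is None:
            # No safety direction for this parameter: copy it unchanged.
            restored[name] = list(values)
            continue
        if len(delta) != len(values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        restored[name] = [w + scale * d for w, d in zip(values, delta)]
    return restored


# Toy weights, purely illustrative.
compromised = {"layer.w": [1.0, 2.0], "layer.b": [0.5]}
safety = {"layer.w": [0.5, -1.0]}
realigned = add_safety_vector(compromised, safety, scale=2.0)
```

Because the update is a single elementwise addition, it requires no further training of the fine-tuned model, which matches the abstract's description of the method as simple arithmetic.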
-
A double copy from twisted (co)homology at genus one
Authors:
Rishabh Bhardwaj,
Andrzej Pokraka,
Lecheng Ren,
Carlos Rodriguez
Abstract:
We study the twisted (co)homology of a family of genus-one integrals -- the so-called Riemann-Wirtinger integrals. These integrals are closely related to one-loop string amplitudes in chiral splitting where one leaves the loop-momentum, modulus and all but one puncture un-integrated. While not actual one-loop string integrals, they share many properties and are simple enough that the associated twisted (co)homologies have been completely characterized [Goto2022,arXiv:2206.03177]. Using intersection numbers -- an inner product on the vector space of allowed differential forms -- we derive the Gauss-Manin connection for two bases of the twisted cohomology, providing an independent check of [Mano&Watanabe2012]. We also use the intersection index -- an inner product on the vector space of allowed contours -- to derive a double-copy formula for the closed-string analogues of Riemann-Wirtinger integrals (one-dimensional integrals over the torus). Similar to the celebrated KLT formula between open- and closed-string tree-level amplitudes, these intersection indices form a genus-one KLT-like kernel defining bilinears in meromorphic Riemann-Wirtinger integrals that are equal to their complex counterparts.
Submitted 6 July, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
OW-SLR: Overlapping Windows on Semi-Local Region for Image Super-Resolution
Authors:
Rishav Bhardwaj,
Janarthanam Jothi Balaji,
Vasudevan Lakshminarayanan
Abstract:
There has been considerable progress in implicit neural representation to upscale an image to any arbitrary resolution. However, existing methods are based on defining a function to predict the Red, Green and Blue (RGB) value from just four specific loci. Relying on just four loci is insufficient as it leads to losing fine details from the neighboring region(s). We show that taking the semi-local region into account improves performance. In this paper, we propose applying a new technique called Overlapping Windows on Semi-Local Region (OW-SLR) to an image to obtain any arbitrary resolution by taking the coordinates of the semi-local region around a point in the latent space. This extracted detail is used to predict the RGB value of a point. We illustrate the technique by applying the algorithm to Optical Coherence Tomography-Angiography (OCT-A) images and show that it can upscale them to an arbitrary resolution. This technique outperforms the existing state-of-the-art methods when applied to the OCT500 dataset. OW-SLR also provides better results when classifying retinal images as healthy or diseased (for example, diabetic retinopathy versus normal) from the given set of OCT-A images. The project page is available at https://rishavbb.github.io/ow-slr/index.html
Submitted 16 November, 2023; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Adapter Pruning using Tropical Characterization
Authors:
Rishabh Bhardwaj,
Tushar Vaidya,
Soujanya Poria
Abstract:
Adapters are widely popular parameter-efficient transfer learning approaches in natural language processing that insert trainable modules in between layers of a pre-trained language model. Apart from several heuristics, however, there has been a lack of studies analyzing the optimal number of adapter parameters needed for downstream applications. In this paper, we propose an adapter pruning approach by studying the tropical characteristics of trainable modules. We cast it as an optimization problem that aims to prune parameters from the adapter layers without changing the orientation of underlying tropical hypersurfaces. Our experiments on five NLP datasets show that tropical geometry tends to identify more relevant parameters to prune when compared with the magnitude-based baseline, while a combined approach works best across the tasks.
Submitted 29 October, 2023;
originally announced October 2023.
-
Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases
Authors:
Rishabh Bhardwaj,
Soujanya Poria
Abstract:
Red-teaming has been a widely adopted way to evaluate the harmfulness of Large Language Models (LLMs). It aims to jailbreak a model's safety behavior to make it act as a helpful agent disregarding the harmfulness of the query. Existing methods are primarily based on input text-based red-teaming such as adversarial prompts, low-resource prompts, or contextualized prompts to condition the model in a way to bypass its safe behavior. Bypassing the guardrails uncovers hidden harmful information and biases in the model that are left untreated or newly introduced by its safety training. However, prompt-based attacks fail to provide such a diagnosis owing to their low attack success rate and their applicability only to specific models. In this paper, we present a new perspective on LLM safety research, i.e., parametric red-teaming through Unalignment. It simply (instruction) tunes the model parameters to break model guardrails that are not deeply rooted in the model's behavior. Unalignment using as few as 100 examples can significantly bypass the guardrails of the model commonly referred to as CHATGPT, to the point where it responds with an 88% success rate to harmful queries on two safety benchmark datasets. On open-source models such as VICUNA-7B and LLAMA-2-CHAT 7B and 13B, it shows an attack success rate of more than 91%. On bias evaluations, Unalignment exposes inherent biases in safety-aligned models such as CHATGPT and LLAMA-2-CHAT where the model's responses are strongly biased and opinionated 64% of the time.
Submitted 13 November, 2023; v1 submitted 22 October, 2023;
originally announced October 2023.
-
Dual resonant amplitudes from Drinfel'd twists
Authors:
Rishabh Bhardwaj,
Shounak De
Abstract:
We postulate the existence of a family of dual resonant, four-point tachyon amplitudes derived using invertible coproduct maps called Drinfel'd twists. A sub-family of these amplitudes exhibits well-defined ultraviolet behaviour, namely in the fixed angle high-energy and Regge scattering regimes. This discovery emerges from a systematic study of the set of observables that can be constructed out of $q$-deformed worldsheet CFTs with the underlying conformal group being the quantum group $SU(1,1)_q$. We conclude our analysis by discussing the possibility (or the lack thereof) of known $q$-deformations of the Veneziano amplitude as an observable in such theories, in particular, the Coon amplitude.
Submitted 13 September, 2023;
originally announced September 2023.
-
Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
Authors:
Rishabh Bhardwaj,
Soujanya Poria
Abstract:
Large language models (LLMs) have taken the world by storm with their massive multi-tasking capabilities simply by optimizing over a next-word prediction objective. With the emergence of their properties and encoded knowledge, the risk of LLMs producing harmful outputs increases, making them unfit for scalable deployment for the public. In this work, we propose a new safety evaluation benchmark RED-EVAL that carries out red-teaming. We show that even widely deployed models are susceptible to the Chain of Utterances-based (CoU) prompting, jailbreaking closed-source LLM-based systems such as GPT-4 and ChatGPT to unethically respond to more than 65% and 73% of harmful queries. We also demonstrate the consistency of the RED-EVAL across 8 open-source LLMs in generating harmful responses in more than 86% of the red-teaming attempts. Next, we propose RED-INSTRUCT, an approach for the safety alignment of LLMs. It constitutes two phases: 1) HARMFULQA data collection: Leveraging CoU prompting, we collect a dataset that consists of 1.9K harmful questions covering a wide range of topics, 9.5K safe and 7.3K harmful conversations from ChatGPT; 2) SAFE-ALIGN: We demonstrate how the conversational dataset can be used for the safety alignment of LLMs by minimizing the negative log-likelihood over helpful responses and penalizing over harmful responses by gradient ascent over sample loss. Our model STARLING, a fine-tuned Vicuna-7B, is observed to be more safely aligned when evaluated on RED-EVAL and HHH benchmarks while preserving the utility of the baseline models (TruthfulQA, MMLU, and BBH).
Submitted 30 August, 2023; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Harnessing the magnetic proximity effect: induced spin polarization in Ni/Si interfaces
Authors:
Simone Laterza,
Antonio Caretta,
Richa Bhardwaj,
Paolo Moras,
Nicola Zema,
Roberto Flammini,
Marco Malvestuto
Abstract:
The investigation of the properties of metal-semiconductor interfaces has gained significant attention due to the unique features that emerge from the combination of both metal and semiconductor attributes. In this report, the magnetic properties of Ni/Si interfaces have been studied using X-ray magnetic circular dichroism (XMCD) spectroscopy at the Ni and Si edges. This approach makes it possible to distinguish unambiguously the local magnetism on Ni and Si via individual core-level excitations. Two samples with different semiconductor dopings were investigated using both total electron yield (TEY) and reflectivity configurations. The experimental results uncovered magnetization at equilibrium in both the metallic layer and the proximal layer of the semiconductor substrate, implying the presence of induced spin polarization in Si at equilibrium, possibly arising from the depletion layer region. These results hold significant value in the field of spintronics, as similar systems have been demonstrated to generate spin injection through optical medium, opening a new pathway for next-generation nonvolatile high-speed devices.
Submitted 9 May, 2023;
originally announced May 2023.
-
Mechanism of laser induced self-organized void array formation in Polydimethylsiloxane (PDMS)
Authors:
N. Naseri,
A. Alshehri,
L. Ramunno,
R. Bhardwaj
Abstract:
This study investigated the formation of multi-voids in polydimethylsiloxane (PDMS) using a multi-pulse irradiation method and explored the impact of laser energy, number of pulses per micron (writing speed), and laser spot size (NA) on the process. The experimental results revealed that multi-void formation occurred due to multi-pulse irradiation in the bulk of PDMS. Additionally, increasing laser energy led to an increase in the number of voids, while the number of voids did not change with an increase in the number of pulses per micron for a fixed laser parameter. However, the size of the voids increased with the number of pulses per micron, and tighter focusing conditions (higher NA) resulted in smaller voids with a shorter distance between them. Furthermore, Finite-Difference-Time-Domain (FDTD) simulations reproduced the generation of void arrays in PDMS using a similar multi-laser pulse approach. By modeling the voids as concentric spheres with densified shells and simulating the laser interaction with the voids, we showed that void array generation in PDMS is a linear mechanism. This study provides valuable insight into the mechanism behind the formation of void arrays in PDMS. The simulation results agree well with the experimental results, further validating the model and giving a better understanding of the physical processes involved in the generation of void arrays in PDMS.
Submitted 17 May, 2024; v1 submitted 4 May, 2023;
originally announced May 2023.
-
ReMask: A Robust Information-Masking Approach for Domain Counterfactual Generation
Authors:
Pengfei Hong,
Rishabh Bhardwaj,
Navonil Majumdar,
Somak Aditya,
Soujanya Poria
Abstract:
Domain shift is a big challenge in NLP; thus, many approaches resort to learning domain-invariant features to mitigate the inference phase domain shift. Such methods, however, fail to leverage the domain-specific nuances relevant to the task at hand. To avoid such drawbacks, domain counterfactual generation aims to transform a text from the source domain to a given target domain. However, due to the limited availability of data, frequency-based methods often miss some valid domain-token associations and introduce spurious ones. Hence, we employ a three-step domain obfuscation approach that involves frequency and attention norm-based masking, to mask domain-specific cues, and unmasking to regain the domain generic context. Our experiments empirically show that the counterfactual samples sourced from our masked text lead to improved domain transfer on 10 out of 12 domain sentiment classification settings, with an average of 2% accuracy improvement over the state-of-the-art for unsupervised domain adaptation (UDA). Further, our model outperforms the state-of-the-art by achieving 1.4% average accuracy improvement in the adversarial domain adaptation (ADA) setting. Moreover, our model also shows its domain adaptation efficacy on a large multi-domain intent classification dataset where it attains state-of-the-art results. We release the code publicly at https://github.com/declare-lab/remask.
Submitted 4 May, 2023;
originally announced May 2023.
-
A Review of Deep Learning Techniques for Speech Processing
Authors:
Ambuj Mehrish,
Navonil Majumder,
Rishabh Bhardwaj,
Rada Mihalcea,
Soujanya Poria
Abstract:
The field of speech processing has undergone a transformative shift with the advent of deep learning. The use of multiple processing layers has enabled the creation of models capable of extracting intricate features from speech data. This development has paved the way for unparalleled advancements in speech recognition, text-to-speech synthesis, automatic speech recognition, and emotion recognition, propelling the performance of these tasks to unprecedented heights. The power of deep learning techniques has opened up new avenues for research and innovation in the field of speech processing, with far-reaching implications for a range of industries and applications. This review paper provides a comprehensive overview of the key deep learning models and their applications in speech-processing tasks. We begin by tracing the evolution of speech processing research, from early approaches, such as MFCC and HMM, to more recent advances in deep learning architectures, such as CNNs, RNNs, transformers, conformers, and diffusion models. We categorize the approaches and compare their strengths and weaknesses for solving speech-processing tasks. Furthermore, we extensively cover various speech-processing tasks, datasets, and benchmarks used in the literature and describe how different deep-learning networks have been utilized to tackle these tasks. Additionally, we discuss the challenges and future directions of deep learning in speech processing, including the need for more parameter-efficient, interpretable models and the potential of deep learning for multimodal speech processing. By examining the field's evolution, comparing and contrasting different approaches, and highlighting future directions and challenges, we hope to inspire further research in this exciting and rapidly advancing field.
Submitted 30 May, 2023; v1 submitted 29 April, 2023;
originally announced May 2023.
-
An innovative Deep Learning Based Approach for Accurate Agricultural Crop Price Prediction
Authors:
Mayank Ratan Bhardwaj,
Jaydeep Pawar,
Abhijnya Bhat,
Deepanshu,
Inavamsi Enaganti,
Kartik Sagar,
Y. Narahari
Abstract:
Accurate prediction of agricultural crop prices is a crucial input for decision-making by various stakeholders in agriculture: farmers, consumers, retailers, wholesalers, and the Government. These decisions have significant implications including, most importantly, the economic well-being of the farmers. In this paper, our objective is to accurately predict crop prices using historical price information, climate conditions, soil type, location, and other key determinants of crop prices. This is a technically challenging problem, which has been attempted before. In this paper, we propose an innovative deep learning based approach to achieve increased accuracy in price prediction. The proposed approach uses graph neural networks (GNNs) in conjunction with a standard convolutional neural network (CNN) model to exploit geospatial dependencies in prices. Our approach works well with noisy legacy data and produces a performance that is at least 20% better than the results available in the literature. We are able to predict prices up to 30 days ahead. We choose two vegetables, potato (stable price behavior) and tomato (volatile price behavior) and work with noisy public data available from Indian agricultural markets.
Submitted 15 April, 2023;
originally announced April 2023.
-
Designing Fair, Cost-optimal Auctions based on Deep Learning for Procuring Agricultural Inputs through Farmer Collectives
Authors:
Mayank Ratan Bhardwaj,
Bazil Ahmed,
Prathik Diwakar,
Ganesh Ghalme,
Y. Narahari
Abstract:
Procuring agricultural inputs (agri-inputs for short) such as seeds, fertilizers, and pesticides, at desired quality levels and at affordable cost, forms a critical component of agricultural input operations. This is a particularly challenging problem being faced by small and marginal farmers in any emerging economy. Farmer collectives (FCs), which are cooperative societies of farmers, offer an excellent prospect for enabling cost-effective procurement of inputs with assured quality to the farmers. In this paper, our objective is to design sound, explainable mechanisms by which an FC will be able to procure agri-inputs in bulk and distribute the inputs procured to the individual farmers who are members of the FC. In the methodology proposed here, an FC engages qualified suppliers in a competitive, volume discount procurement auction in which the suppliers specify price discounts based on volumes supplied. The desiderata of properties for such an auction include: minimization of the total cost of procurement; incentive compatibility; individual rationality; fairness; and other business constraints. An auction satisfying all these properties is analytically infeasible and a key contribution of this paper is to develop a deep learning based approach to design such an auction. We use two realistic, stylized case studies from chili seeds procurement and a popular pesticide procurement to demonstrate the efficacy of these auctions.
Submitted 14 April, 2023;
originally announced April 2023.
-
Evidence of magnetoelectronic electromagnon mediated transport in flexoelectronic heterostructures
Authors:
Anand Katailiha,
Paul C. Lou,
Ravindra G. Bhardwaj,
Ward P. Beyermann,
Sandeep Kumar
Abstract:
The superposition of atomic vibrations and the flexoelectronic effect gives rise to a cross-correlation between free charge carriers and the temporal magnetic moment of phonons in conducting heterostructures under an applied strain gradient. The resulting dynamical coupling is expected to give rise to quasiparticle excitations, called magnetoelectronic electromagnons, that carry electronic charge and temporal magnetic moment. Here, we report experimental evidence of magnetoelectronic electromagnons in freestanding, degenerately doped p-Si based heterostructure thin-film samples. These quasiparticle excitations give rise to long-distance (>100 μm) spin transport, demonstrated using spatially modulated transverse magneto-thermoelectric and non-local resistance measurements. The magnetoelectronic electromagnons are non-reciprocal and give rise to a large magnetochiral anisotropy (0.352 A$^{-1}$T$^{-1}$) that diminishes at lower temperatures. The superposition of non-reciprocal magnetoelectronic electromagnons gives rise to longitudinal and transverse modulations in charge carrier density, spin density and magnetic moment, demonstrated using Hall effect and edge-dependent magnetoresistance measurements; this can also be called an inhomogeneous magnetoelectronic multiferroic effect. These quasiparticle excitations are analogous to photons, with time-dependent polarization and temporal magnetic moment replacing the electric and magnetic fields, respectively, and are most likely topological because they manifest the topological Nernst effect. Hence, the magnetoelectronic electromagnon can potentially give rise to quantum interference and entanglement effects in conducting solid-state systems at room temperature, in addition to efficient spin transport.
Submitted 12 April, 2023;
originally announced April 2023.
-
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding
Authors:
Yingting Li,
Ambuj Mehrish,
Shuai Zhao,
Rishabh Bhardwaj,
Amir Zadeh,
Navonil Majumder,
Rada Mihalcea,
Soujanya Poria
Abstract:
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models. Parameter inefficiency can, however, arise when, during transfer learning, all the parameters of a large pre-trained model need to be updated for individual downstream tasks. As the number of parameters grows, fine-tuning is prone to overfitting and catastrophic forgetting. In addition, full fine-tuning can become prohibitively expensive when the model is used for many tasks. To mitigate this issue, parameter-efficient transfer learning algorithms, such as adapters and prefix tuning, have been proposed as a way to introduce a few trainable parameters that can be plugged into large pre-trained language models such as BERT and HuBERT. In this paper, we introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning for various speech-processing tasks. Additionally, we introduce a new adapter, ConvAdapter, based on 1D convolution. We show that ConvAdapter outperforms the standard adapters while showing comparable performance against prefix tuning and LoRA with only 0.94% of trainable parameters on some of the tasks in SURE. We further explore the effectiveness of parameter-efficient transfer learning for speech synthesis tasks such as Text-to-Speech (TTS).
Submitted 2 March, 2023;
originally announced March 2023.
-
Interval Valued Vector Variational Inequalities and Vector Optimization Problems via Convexificators
Authors:
Rohit Kumar Bhardwaj,
Tirth Ram
Abstract:
In this paper, we consider interval-valued vector optimization problems $(IVOP)$ and derive their relationships to interval vector variational inequalities $(IVVI)$ of Minty and Stampacchia type in terms of convexificators and LU-efficient solutions of $(IVOP)$ under an LU-convexity assumption. Furthermore, we consider weak versions of $(IVVI)$ of Minty and Stampacchia type and find the relationships between these weak versions and weakly LU-efficient solutions of $(IVOP)$. The results presented in this paper extend and generalize some existing results in the literature.
Submitted 23 February, 2023;
originally announced February 2023.
-
Causal Categorization of Mental Health Posts using Transformers
Authors:
Simranjeet Kaur,
Ritika Bhardwaj,
Aastha Jain,
Muskan Garg,
Chandni Saxena
Abstract:
With recent developments in the digitization of clinical psychology, the NLP research community has revolutionized the field of mental health detection on social media. Existing research in mental health analysis revolves around cross-sectional studies to classify users' intent on social media. For in-depth analysis, we investigate existing classifiers to solve the problem of causal categorization, which suggests the inefficiency of learning-based methods due to limited training samples. To handle this challenge, we use transformer models and demonstrate the efficacy of pre-trained transfer learning on the "CAMS" dataset. The experimental results show improved accuracy and highlight the importance of identifying cause-and-effect relationships in the underlying text.
Submitted 15 January, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
On unitarity of the Coon amplitude
Authors:
Rishabh Bhardwaj,
Shounak De,
Marcus Spradlin,
Anastasia Volovich
Abstract:
The Coon amplitude is a one-parameter deformation of the Veneziano amplitude. We explore the unitarity of the Coon amplitude through its partial wave expansion using tools from $q$-calculus. Our analysis establishes manifest positivity on the leading and sub-leading Regge trajectories in arbitrary spacetime dimensions $D$, while revealing a violation of unitarity in a certain region of $(q,D)$ parameter space starting at the sub-sub-leading Regge order. A combination of numerical studies and analytic arguments allows us to argue for the manifest positivity of the partial wave coefficients in fixed spin and Regge asymptotics.
Submitted 5 January, 2024; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Adaptation Approaches for Nearest Neighbor Language Models
Authors:
Rishabh Bhardwaj,
George Polovets,
Monica Sunkara
Abstract:
Semi-parametric Nearest Neighbor Language Models ($k$NN-LMs) have produced impressive gains over purely parametric LMs, by leveraging large-scale neighborhood retrieval over external memory datastores. However, there has been little investigation into adapting such models for new domains. This work attempts to fill that gap and suggests the following approaches for adapting $k$NN-LMs -- 1) adapting the underlying LM (using Adapters), 2) expanding neighborhood retrieval over an additional adaptation datastore, and 3) adapting the weights (scores) of retrieved neighbors using a learned Rescorer module. We study each adaptation strategy separately, as well as the combined performance improvement through ablation experiments and an extensive set of evaluations run over seven adaptation domains. Our combined adaptation approach consistently outperforms purely parametric adaptation and zero-shot ($k$NN-LM) baselines that construct datastores from the adaptation data. On average, we see perplexity improvements of 17.1% and 16% for these respective baselines, across domains.
Submitted 12 June, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
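As a rough illustration of the retrieval step this abstract builds on, the Python sketch below shows the standard $k$NN-LM interpolation between a parametric LM distribution and a distribution formed from retrieved neighbors. It is not the authors' implementation: the function names, the `temperature` and `lam` parameters, and the toy inputs are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def knn_distribution(neighbors, temperature=1.0):
    """Turn retrieved (distance, token) pairs into a token distribution.

    Each neighbor is weighted by softmax(-distance / temperature), and the
    weights of neighbors that share a target token are summed.
    """
    if not neighbors:
        return {}
    logits = [-dist / temperature for dist, _ in neighbors]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - top) for logit in logits]
    total = sum(weights)
    probs = defaultdict(float)
    for weight, (_, token) in zip(weights, neighbors):
        probs[token] += weight / total
    return dict(probs)


def interpolate(p_lm, p_knn, lam=0.25):
    """Mix the two distributions: lam * p_knn + (1 - lam) * p_lm."""
    vocab = set(p_lm) | set(p_knn)
    return {
        tok: lam * p_knn.get(tok, 0.0) + (1.0 - lam) * p_lm.get(tok, 0.0)
        for tok in vocab
    }
```

One hypothetical way to mimic retrieval over an additional adaptation datastore is to concatenate the neighbor lists from both datastores before calling `knn_distribution`; whether the paper merges neighbors this way is not stated in the abstract.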
-
Extremely large nonlinear response in crystalline quartz at THz frequencies
Authors:
Soheil Zibod,
Payman Rasekh,
Murat Yildrim,
Wei Cui,
Ravi Bhardwaj,
Jean-Michel Ménard,
Robert W. Boyd,
Ksenia Dolgaleva
Abstract:
We report on the first experimental observation of a very strong nonlinear response in crystalline quartz in the terahertz (THz) frequency region through THz time-domain spectroscopy (THz-TDS). Theoretical modelling is presented and predicts a Kerr coefficient $n_2$ equal to $5.17\times10^{-14}$ m$^2$ W$^{-1}$. The time-domain analysis of the measured data shows that, as the THz peak amplitude increases, the pulse experiences a larger time delay in the sample. As the THz amplitude increases to values higher than 110 kV cm$^{-1}$, the growth rate of the delay decreases, indicating a saturation process. The value of the nonlinear refractive index calculated through the frequency response analysis is estimated to be on the order of $10^{-13}$ m$^2$ W$^{-1}$, which is several orders of magnitude larger than typical values of the nonlinear refractive index of solids in the visible region. Furthermore, a negative fifth-order susceptibility on the order of $10^{-30}$ m$^4$ V$^{-4}$ is measured.
Submitted 4 October, 2022;
originally announced October 2022.
-
Loop-level gluon OPEs in celestial holography
Authors:
Rishabh Bhardwaj,
Luke Lippstreu,
Lecheng Ren,
Marcus Spradlin,
Akshay Yelleshpur Srikant,
Anastasia Volovich
Abstract:
We compute one-loop corrections to the OPE of gluons in the celestial conformal field theory corresponding to Yang-Mills coupled to arbitrary matter. We exploit universal hard/soft factorization to derive an IR finite OPE for the hard gluon operators. This OPE involves logarithms and operators that resemble logarithmic partners of primary operators. We derive an exact all-loop OPE in a limit of the Higgs-regulated planar $\mathcal{N}=4$ super Yang-Mills theory.
Submitted 30 August, 2022;
originally announced August 2022.
-
Angular momentum of the asymptotic electromagnetic field in the classical scattering of charged particles
Authors:
Rishabh Bhardwaj,
Luke Lippstreu
Abstract:
We compute the angular momentum of the electromagnetic field on a late time Cauchy surface with an arbitrary constant normal vector relevant for the classical scattering of charged particles. We find a time independent contribution to the angular momentum. This demonstrates that every charged particle scattering event is accompanied by a net shift in the angular momentum of the electromagnetic field. We speculate that this shift is related to a subleading electromagnetic memory effect. We argue that this asymptotic angular momentum should be included in the description of the asymptotic states in quantum theories containing infrared divergences. We demonstrate that the Lorentz covariance of the asymptotic electromagnetic angular momentum can only be exhibited upon making reference to the Cauchy slice's normal vector.
Submitted 7 September, 2022; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding
Authors:
Rishabh Bhardwaj,
Amrita Saha,
Steven C. H. Hoi,
Soujanya Poria
Abstract:
Prompt Tuning has been largely successful as a parameter-efficient method of conditioning large-scale pre-trained language models to perform downstream tasks. Thus far, soft prompt tuning learns a fixed set of task-specific continuous vectors, i.e., soft tokens that remain static across the task samples. A fixed prompt, however, may not generalize well to the diverse kinds of inputs the task comprises. To address this, we propose Vector-quantized Input-contextualized Prompts (VIP) as an extension to the soft prompt tuning framework. VIP focuses on two aspects -- contextual prompts that learn input-specific contextualization of the soft prompt tokens through a small-scale sentence encoder, and quantized prompts that map the contextualized prompts to a set of learnable codebook vectors through a vector quantization network. On various language understanding tasks like SuperGLUE, QA, Relation classification, NER and NLI, VIP outperforms the soft prompt tuning (PT) baseline by an average margin of 1.19%. Further, our generalization studies show that VIP learns more robust prompt representations, surpassing PT by a margin of 0.6% - 5.3% on Out-of-domain QA and NLI tasks respectively, and by 0.75% on a Multi-Task setup over 4 tasks spanning 12 domains.
Submitted 22 October, 2022; v1 submitted 22 May, 2022;
originally announced May 2022.
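The quantization idea mentioned in this abstract, mapping contextualized prompt vectors to entries of a codebook, can be sketched in Python as a plain nearest-neighbor lookup. This is a generic vector-quantization step under stated assumptions, not the VIP network itself: the function name, the squared Euclidean distance, and the list-of-lists representation are all illustrative choices, and the learned codebook updates are omitted.

```python
def quantize(vectors, codebook):
    """Map each vector to its nearest codebook entry.

    Returns (indices, quantized), where indices[i] is the position of the
    codebook vector closest to vectors[i] in squared Euclidean distance.
    """
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    indices = [
        min(range(len(codebook)), key=lambda k: sqdist(vec, codebook[k]))
        for vec in vectors
    ]
    quantized = [codebook[i] for i in indices]
    return indices, quantized
```

For example, with the hypothetical codebook `[[0.0, 0.0], [1.0, 1.0]]`, the vector `[0.1, 0.2]` maps to entry 0 and `[0.9, 0.8]` maps to entry 1.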
-
Maxmin Participatory Budgeting
Authors:
Gogulapati Sreedurga,
Mayank Ratan Bhardwaj,
Y. Narahari
Abstract:
Participatory Budgeting (PB) is a popular voting method by which a limited budget is divided among a set of projects, based on the preferences of voters over the projects. PB is broadly categorised as divisible PB (if the projects are fractionally implementable) and indivisible PB (if the projects are atomic). Egalitarianism, an important objective in PB, has not received much attention in the context of indivisible PB. This paper addresses this gap through a detailed study of a natural egalitarian rule, Maxmin Participatory Budgeting (MPB), in the context of indivisible PB. Our study is in two parts: (1) computational and (2) axiomatic. In the first part, we prove that MPB is computationally hard and give pseudo-polynomial time and polynomial-time algorithms when parameterized by certain well-motivated parameters. We propose an algorithm that achieves additive approximation guarantees for MPB on restricted spaces of instances, and we empirically show that our algorithm in fact gives exact optimal solutions on real-world PB datasets. We also establish an upper bound on the approximation ratio achievable for MPB by the family of exhaustive strategy-proof PB algorithms. In the second part, we undertake an axiomatic study of the MPB rule by generalizing known axioms in the literature. Our study leads to the proposal of a new axiom, maximal coverage, which captures fairness aspects. We prove that MPB satisfies maximal coverage.
Submitted 29 April, 2022;
originally announced April 2022.
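To make the maxmin objective in this abstract concrete, here is a small brute-force Python sketch for indivisible PB. It rests on assumptions the abstract does not spell out: voters cast approval ballots, and a voter's utility is the total cost of the funded projects they approve. The function name and data layout are hypothetical, and the exhaustive search is exponential in the number of projects, so it is only suitable for toy instances and is not one of the paper's algorithms.

```python
from itertools import combinations


def maxmin_pb(costs, approvals, budget):
    """Return (funded_set, min_utility) maximizing the worst-off voter's utility.

    costs:     dict mapping project -> cost
    approvals: non-empty list of sets, one approval ballot per voter
    budget:    total budget; funded projects must not exceed it
    """
    projects = list(costs)
    best_set, best_value = frozenset(), -1
    for size in range(len(projects) + 1):
        for subset in combinations(projects, size):
            if sum(costs[p] for p in subset) > budget:
                continue  # infeasible: over budget
            chosen = set(subset)
            # Assumed utility model: cost of approved projects that get funded.
            value = min(
                sum(costs[p] for p in chosen & ballot) for ballot in approvals
            )
            if value > best_value:
                best_set, best_value = frozenset(chosen), value
    return best_set, best_value
```

With hypothetical costs `{"a": 2, "b": 2, "c": 3}`, ballots `[{"a"}, {"b"}, {"a", "c"}]` and a budget of 4, funding `a` and `b` gives every voter utility 2, and no other feasible set gives the worst-off voter more than 0.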
-
Topological phonons in an inhomogeneously strained silicon-6: Possible evidence of the high temperature spin superfluidity and the second sound of topological phonons
Authors:
Anand Katailiha,
Paul C. Lou,
Ravindra G. Bhardwaj,
Ward Beyermann,
Sandeep Kumar
Abstract:
The superposition of topological phonons and flexoelectronic charge separation in inhomogeneously strained Si gives rise to topological electronic magnetism of phonons. The topological electronic magnetism of phonons is also expected to give rise to stationary spin current or spin superfluidity. In this experimental study, we present possible evidence of spin superfluidity in inhomogeneously strained p-Si thin-film samples. The spin superfluidity is uncovered using non-local resistance measurement. A resonance behavior is observed in a non-local resistance measurement at 10 kHz and between 270 K and 281.55 K, which is attributed to the second sound. The observation of second sound and of a spatially varying non-local resistance phase is evidence for spin superfluidity. Spatially varying non-local resistance with opposite phase is also observed in a Pt/MgO/p-Si sample. The overall non-local responses can be treated as a standing waveform from temporal magnetic moments of the topological phonons.
Submitted 15 October, 2021;
originally announced October 2021.
-
Topological phonons in an inhomogeneously strained silicon-5: Inhomogeneous magnetoelectronic effect in a conductor
Authors:
Paul C. Lou,
Ravindra G. Bhardwaj,
Anand Katailiha,
Ward Beyermann,
Sandeep Kumar
Abstract:
Spatial inhomogeneity in a magnetic crystal gives rise to electric polarization, which is known as the inhomogeneous magnetoelectric effect. Similarly, an inhomogeneous magnetoelectronic effect in a conducting multiferroic material gives rise to a spatially inhomogeneous magnetic moment and spin distribution due to spatial inhomogeneity in the charge carrier concentration. In this study, we present experimental evidence of the inhomogeneous magnetoelectronic effect in a Py/p-Si layered structure. The Py/p-Si layered structure exhibits electronic multiferroicity due to the superposition of flexoelectronic charge carrier doping and topological phonons. It gives rise to spatial modulations in the spin density and magnetic moment, which are discovered using Hall effect measurement. The charge carrier density as well as the type of charge carrier are found to be a function of the spatial coordinate as well as the direction of the magnetic field. The observed modulations can also be interpreted as an incommensurate SDW with a wavelength of ~142 µm. The inhomogeneous magnetoelectronic effect also gives rise to a magnetocaloric effect, which is uncovered using thermal hysteresis in the magnetoresistance measurement. This is the first experimental evidence of the inhomogeneous magnetoelectronic effect, which is the electronic counterpart of the magnetoelectric effect.
Submitted 10 October, 2021;
originally announced October 2021.
-
Topological phonons in an inhomogeneously strained silicon-4: Large spin dependent thermoelectric response and thermal spin transfer torque due to topological electronic magnetism of phonons
Authors:
Ravindra G Bhardwaj,
Anand Katailiha,
Paul C. Lou,
Ward P. Beyermann,
Sandeep Kumar
Abstract:
The superposition of flexoelectronic doping and topological phonons gives rise to topological electronic magnetism of phonons in inhomogeneously strained Si in a bilayer structure with a metal. In the case of a ferromagnetic metal and Si bilayer structure, the flexoelectronic doping will also give rise to a larger spin current, which will lead to large spin-to-charge conversion due to the topological electronic magnetism of phonons. By applying a temperature difference to a ferromagnetic metal/Si bilayer structure under an applied strain gradient, a large thermoelectric response can be generated. In this experimental study, we report a large spin-dependent thermoelectric response in a Ni80Fe20/Si bilayer structure. The spin-dependent response is found to be an order of magnitude larger than that in Pt thin films and similar to that of topological insulator surface states, in spite of the negligible intrinsic spin-orbit coupling of Si. This large response is attributed to the flexoelectronic doping and topological electronic magnetism of phonons, which was uncovered using topological Nernst effect measurement. This alternative and novel approach of using inhomogeneous strain engineering to address both spin current density and spin-to-charge conversion can open a new window to the realization of spintronics and spin-caloritronics devices using metal and doped-semiconductor layered materials.
Submitted 10 October, 2021;
originally announced October 2021.
-
Flexoelectronic doping of the degenerate silicon and the correlated electron behavior
Authors:
Paul C. Lou,
Anand Katailiha,
Ravindra G. Bhardwaj,
Ward Beyermann,
Dheeraj Mohata,
Sandeep Kumar
Abstract:
In a metal/degenerately doped silicon bilayer structure, the interfacial flexoelectric effect due to a strain gradient leads to charge carrier transfer from the metal layer to the silicon layer. This excess charge carrier concentration is called flexoelectronic doping or flexoelectronic charge transfer, which gives rise to an electronically polarized (order of magnitude larger than ferroelectric materials) silicon layer. In the transport measurements, the charge carrier concentration in silicon is found to increase by two orders of magnitude due to flexoelectronic doping, which changes the Fermi level and the Hall response. The flexoelectronic charge accumulation modifies the electron-electron and the electron-phonon coupling, which give rise to a Mott metal-insulator transition and magnetism of phonons, respectively. The coexistence of flexoelectronic polarization and magnetism gives rise to a new class of materials called electronic multiferroics. By controlling the flexoelectronic doping, material behavior can potentially be engineered for quantum, spintronics and electronics applications in semiconductor materials.
Submitted 1 March, 2022; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Topological phonons in an inhomogeneously strained silicon-2: Evidence of spin-momentum locking
Authors:
Anand Katailiha,
Paul C. Lou,
Ravindra G. Bhardwaj,
Ward P. Beyermann,
Sandeep Kumar
Abstract:
In this study, we report the first experimental evidence of spin-momentum locking in the topological phonons in an inhomogeneously strained Si thin film. The spin-momentum locking in the topological phonons leads to a longitudinal spin texture or spatially inhomogeneous spin distribution in the freestanding sample structure. The spin texture was uncovered using location-dependent Hall effect and planar Hall effect measurements. The charge carrier density and anomalous Hall resistance showed a linear behavior along the length of the sample. Similarly, the planar Hall resistance related to the spin-dependent scattering was also found to be different at two different locations along the length of the sample. The spin-momentum locking also gave rise to a transverse thermal spin current and spin-Nernst magneto-thermopower response, which was uncovered using angle-dependent longitudinal second harmonic measurement. The magneto-thermopower response was also a function of the crystallography of the Si sample, where the sign of the response was opposite for <110> and <100> aligned samples. The spin-momentum locking in topological phonons may give rise to a large spin-dependent response at and above room temperature, which can pave the way for energy-efficient spintronics and spin-caloritronics devices.
Submitted 10 October, 2021;
originally announced October 2021.
-
Topological phonons in an inhomogeneously strained silicon-1: Evidence of long-distance spin transport and unidirectional magnetoresistance of phonons
Authors:
Anand Katailiha,
Ravindra G. Bhardwaj,
Paul C. Lou,
Ward P. Beyermann,
Sandeep Kumar
Abstract:
Transverse acoustic waves in an inhomogeneous medium are analogous to electromagnetic waves and will exhibit topological behavior because of the Berry gauge potential in momentum space that arises from the inhomogeneity. The inhomogeneous (or gradient) medium can be created using an applied strain gradient in a semiconductor thin film (silicon) since the phonon frequency and dispersion will be a function of the local strain along the strain gradient direction. As a consequence, topological phonon mediated spin and heat transport can be engineered in semiconductor thin films. Here, we present evidence of long-distance (100 µm) spin transport in a freestanding Si thin film sample under an applied strain gradient using transverse spin-Nernst effect measurement. The long-distance spin transport was attributed to the topological spin-Hall effect of phonons in an inhomogeneous medium. The inhomogeneous medium was validated using unidirectional magnetoresistance of phonons, where the magnitude of the coefficient of the non-reciprocal response at room temperature was as large as that reported in BiTeBr at low temperatures. The topological phonons also manifested the topological Nernst effect. This work not only enhances the current understanding of inhomogeneous systems but also lays the foundation of topological and spin phononics.
Submitted 10 October, 2021;
originally announced October 2021.
-
KNOT: Knowledge Distillation using Optimal Transport for Solving NLP Tasks
Authors:
Rishabh Bhardwaj,
Tushar Vaidya,
Soujanya Poria
Abstract:
We propose a new approach, Knowledge Distillation using Optimal Transport (KNOT), to distill natural language semantic knowledge from multiple teacher networks to a student network. KNOT aims to train a (global) student model by learning to minimize the optimal transport cost of its assigned probability distribution over the labels to the weighted sum of probabilities predicted by the (local) teacher models, under the constraint that the student model does not have access to the teacher models' parameters or training data. To evaluate the quality of knowledge transfer, we introduce a new metric, Semantic Distance (SD), that measures semantic closeness between the predicted and ground truth label distributions. The proposed method shows improvements in the global model's SD performance over the baseline across three NLP tasks while performing on par with Entropy-based distillation on standard accuracy and F1 metrics. The implementation pertaining to this work is publicly available at: https://github.com/declare-lab/KNOT.
Submitted 16 September, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
-
All-optical spin injection in silicon revealed by element specific time-resolved Kerr effect
Authors:
Simone Laterza,
Antonio Caretta,
Richa Bhardwaj,
Roberto Flammini,
Paolo Moras,
Matteo Jugovac,
Piu Rajak,
Mahabul Islam,
Regina Ciancio,
Valentina Bonanni,
Barbara Casarin,
Alberto Simoncig,
Marco Zangrando,
Primoz Rebernik Ribic,
Giuseppe Penco,
Giovanni De Ninno,
Luca Giannessi,
Alexander Demidovich,
Miltcho Danailov,
Fulvio Parmigiani,
Marco Malvestuto
Abstract:
Understanding how a spin current flows across metal-semiconductor interfaces at pico- and femtosecond timescales has implications for ultrafast spintronics, data processing and storage applications. However, the possibility to directly access the propagation of spin currents on such time scales has been hampered by the simultaneous lack of both ultrafast element specific magnetic sensitive probes and tailored metal-semiconductor interfaces. Here, by means of free electron laser-based element sensitive Kerr spectroscopy, we report direct experimental evidence of spin currents across a Ni/Si interface in the form of different magnetodynamics at the Ni M2,3 and Si L2,3 absorption edges. This further allows us to calculate the propagation velocity of the spin current in silicon, which is on the order of 0.2 nm/fs.
Submitted 4 October, 2021;
originally announced October 2021.