Search | arXiv e-print repository

SelfMAD: Enhancing Generalization and Robustness in Morphing Attack Detection via Self-Supervised Learning

Authors: Marija Ivanovska, Leon Todorov, Naser Damer, Deepak Kumar Jain, Peter Peer, Vitomir Štruc

Abstract: With the continuous advancement of generative models, face morphing attacks have become a significant challenge for existing face verification systems due to their potential use in identity fraud and other malicious activities. Contemporary Morphing Attack Detection (MAD) approaches frequently rely on supervised, discriminative models trained on examples of bona fide and morphed images. These models typically perform well with morphs generated with techniques seen during training, but often lead to sub-optimal performance when subjected to novel unseen morphing techniques. While unsupervised models have been shown to perform better in terms of generalizability, they typically result in higher error rates, as they struggle to effectively capture features of subtle artifacts. To address these shortcomings, we present SelfMAD, a novel self-supervised approach that simulates general morphing attack artifacts, allowing classifiers to learn generic and robust decision boundaries without overfitting to the specific artifacts induced by particular face morphing methods. Through extensive experiments on widely used datasets, we demonstrate that SelfMAD significantly outperforms current state-of-the-art MADs, reducing the detection error by more than 64% in terms of EER when compared to the strongest unsupervised competitor, and by more than 66%, when compared to the best performing discriminative MAD model, tested in cross-morph settings. The source code for SelfMAD is available at https://github.com/LeonTodorov/SelfMAD.

Submitted 7 April, 2025; originally announced April 2025.

Comments: Accepted at IEEE International Conference on Automatic Face and Gesture Recognition (FG 2025)

arXiv:2504.04377 [pdf, other]

PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages

Authors: Priyanshu Kumar, Devansh Jain, Akhila Yerukola, Liwei Jiang, Himanshu Beniwal, Thomas Hartvigsen, Maarten Sap

Abstract: Truly multilingual safety moderation efforts for Large Language Models (LLMs) have been hindered by a narrow focus on a small set of languages (e.g., English, Chinese) as well as a limited scope of safety definition, resulting in significant gaps in moderation capabilities. To bridge these gaps, we release POLYGUARD, a new state-of-the-art multilingual safety model for safeguarding LLM generations, and the corresponding training and evaluation datasets. POLYGUARD is trained on POLYGUARDMIX, the largest multilingual safety training corpus to date containing 1.91M samples across 17 languages (e.g., Chinese, Czech, English, Hindi). We also introduce POLYGUARDPROMPTS, a high quality multilingual benchmark with 29K samples for the evaluation of safety guardrails. Created by combining naturally occurring multilingual human-LLM interactions and human-verified machine translations of an English-only safety dataset (WildGuardMix; Han et al., 2024), our datasets contain prompt-output pairs with labels of prompt harmfulness, response harmfulness, and response refusal. Through extensive evaluations across multiple safety and toxicity benchmarks, we demonstrate that POLYGUARD outperforms existing state-of-the-art open-weight and commercial safety classifiers by 5.5%. Our contributions advance efforts toward safer multilingual LLMs for all global users.

Submitted 6 April, 2025; originally announced April 2025.

arXiv:2504.04138 [pdf, other]

Predicting Soil Macronutrient Levels: A Machine Learning Approach Models Trained on pH, Conductivity, and Average Power of Acid-Base Solutions

Authors: Mridul Kumar, Deepali Jain, Zeeshan Saifi, Soami Daya Krishnananda

Abstract: Soil macronutrients, particularly potassium ions (K$^+$), are indispensable for plant health, underpinning various physiological and biological processes, and facilitating the management of both biotic and abiotic stresses. Deficient macronutrient content results in stunted growth, delayed maturation, and increased vulnerability to environmental stressors, thereby accentuating the imperative for precise soil nutrient monitoring. Traditional techniques such as chemical assays, atomic absorption spectroscopy, inductively coupled plasma optical emission spectroscopy, and electrochemical methods, albeit advanced, are prohibitively expensive and time-intensive, thus unsuitable for real-time macronutrient assessment. In this study, we propose an innovative soil testing protocol utilizing a dataset derived from synthetic solutions to model soil behaviour. The dataset encompasses physical properties including conductivity and pH, with a concentration on three key macronutrients: nitrogen (N), phosphorus (P), and potassium (K). Four machine learning algorithms were applied to the dataset, with random forest regressors and neural networks being selected for the prediction of soil nutrient concentrations. Comparative analysis with laboratory soil testing results revealed prediction errors of 23.6% for phosphorus and 16% for potassium using the random forest model, and 26.3% for phosphorus and 21.8% for potassium using the neural network model.
This methodology illustrates a cost-effective and efficacious strategy for real-time soil nutrient monitoring, offering substantial advancements over conventional techniques and enhancing the capability to sustain optimal nutrient levels conducive to robust crop growth.

Submitted 5 April, 2025; originally announced April 2025.

arXiv:2503.23088 [pdf, other]

UNITYAI-GUARD: Pioneering Toxicity Detection Across Low-Resource Indian Languages

Authors: Himanshu Beniwal, Reddybathuni Venkat, Rohit Kumar, Birudugadda Srivibhav, Daksh Jain, Pavan Doddi, Eshwar Dhande, Adithya Ananth, Kuldeep, Heer Kubadia, Pratham Sharda, Mayank Singh

Abstract: This work introduces UnityAI-Guard, a framework for binary toxicity classification targeting low-resource Indian languages. While existing systems predominantly cater to high-resource languages, UnityAI-Guard addresses this critical gap by developing state-of-the-art models for identifying toxic content across diverse Brahmic/Indic scripts. Our approach achieves an impressive average F1-score of 84.23% across seven languages, leveraging a dataset of 888k training instances and 35k manually verified test instances. By advancing multilingual content moderation for linguistically diverse regions, UnityAI-Guard also provides public API access to foster broader adoption and application.

Submitted 29 March, 2025; originally announced March 2025.

arXiv:2503.20020 [pdf, other]

Gemini Robotics: Bringing AI into the Physical World

Authors: Gemini Robotics Team, Saminda Abeyruwan, Joshua Ainslie, Jean-Baptiste Alayrac, Montserrat Gonzalez Arenas, Travis Armstrong, Ashwin Balakrishna, Robert Baruch, Maria Bauza, Michiel Blokzijl, Steven Bohez, Konstantinos Bousmalis, Anthony Brohan, Thomas Buschmann, Arunkumar Byravan, Serkan Cabi, Ken Caluwaerts, Federico Casarini, Oscar Chang, Jose Enrique Chen, Xi Chen, Hao-Tien Lewis Chiang, Krzysztof Choromanski, David D'Ambrosio, Sudeep Dasari , et al. (93 additional authors not shown)

Abstract: Recent advancements in large multimodal models have led to the emergence of remarkable generalist capabilities in digital domains, yet their translation to physical agents such as robots remains a significant challenge. This report introduces a new family of AI models purposefully designed for robotics and built upon the foundation of Gemini 2.0. We present Gemini Robotics, an advanced Vision-Language-Action (VLA) generalist model capable of directly controlling robots. Gemini Robotics executes smooth and reactive movements to tackle a wide range of complex manipulation tasks while also being robust to variations in object types and positions, handling unseen environments as well as following diverse, open vocabulary instructions. We show that with additional fine-tuning, Gemini Robotics can be specialized to new capabilities including solving long-horizon, highly dexterous tasks, learning new short-horizon tasks from as few as 100 demonstrations and adapting to completely novel robot embodiments. This is made possible because Gemini Robotics builds on top of the Gemini Robotics-ER model, the second model we introduce in this work. Gemini Robotics-ER (Embodied Reasoning) extends Gemini's multimodal reasoning capabilities into the physical world, with enhanced spatial and temporal understanding. This enables capabilities relevant to robotics including object detection, pointing, trajectory and grasp prediction, as well as multi-view correspondence and 3D bounding box predictions.
We show how this novel combination can support a variety of robotics applications. We also discuss and address important safety considerations related to this new class of robotics foundation models. The Gemini Robotics family marks a substantial step towards developing general-purpose robots that realize AI's potential in the physical world.

Submitted 25 March, 2025; originally announced March 2025.

arXiv:2503.06283 [pdf]

doi 10.1021/acs.nanolett.4c05860

Single-layer magnet phase in intrinsic magnetic topological insulators, $[\mathrm{MnTe}][\mathrm{Bi}_{2}\mathrm{Te}_{3}]_{\mathrm{n}}$, far beyond the thermodynamic limit

Authors: Deepti Jain, Hee Taek Yi, Xiong Yao, Alessandro R. Mazza, An-Hsi Chen, Kim Kisslinger, Myung-Geun Han, Matthew Brahlek, Seongshik Oh

Abstract: The intrinsic magnetic topological insulator (IMTI) family $[\mathrm{MnTe}][\mathrm{Bi}_{2}\mathrm{Te}_{3}]_{\mathrm{n}}$ has demonstrated magneto-topological properties dependent on $n$, making it a promising platform for advanced electronics and spintronics. However, due to technical barriers in sample synthesis, their properties in the large $n$ limit remain unknown. To overcome this, we utilized the atomic layer-by-layer molecular beam epitaxy (ALL-MBE) technique and achieved IMTIs with $n$ as large as 15, far beyond that previously reported in bulk crystals or thin films. Then, we discover that the "single-layer magnet (SLM)" phase, primarily determined by intralayer ferromagnetic coupling, emerges for $n \gtrsim 4$ and remains little affected up to $n = 15$. Nonetheless, non-zero interlayer ferromagnetic coupling is still necessary to stabilize the SLM phase, suggesting that the SLM phase eventually disappears in the $n\to\infty$ limit. This study uncovers the secrets of IMTIs beyond the thermodynamic limit and opens a door to diverse magneto-topological applications.

Submitted 8 March, 2025; originally announced March 2025.

arXiv:2502.02562 [pdf, other]

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Authors: Connor Schenck, Isaac Reid, Mithun George Jacob, Alex Bewley, Joshua Ainslie, David Rendleman, Deepali Jain, Mohit Sharma, Avinava Dubey, Ayzaan Wahid, Sumeet Singh, René Wagner, Tianli Ding, Chuyuan Fu, Arunkumar Byravan, Jake Varley, Alexey Gritsenko, Matthias Minderer, Dmitry Kalashnikov, Jonathan Tompson, Vikas Sindhwani, Krzysztof Choromanski

Abstract: We introduce STRING: Separable Translationally Invariant Position Encodings. STRING extends Rotary Position Encodings, a recently proposed and widely used algorithm in large language models, via a unifying theoretical framework. Importantly, STRING still provides exact translation invariance, including token coordinates of arbitrary dimensionality, whilst maintaining a low computational footprint. These properties are especially important in robotics, where efficient 3D token representation is key. We integrate STRING into Vision Transformers with RGB(-D) inputs (color plus optional depth), showing substantial gains, e.g. in open-vocabulary object detection and for robotics controllers. We complement our experiments with a rigorous mathematical analysis, proving the universality of our methods.

Submitted 4 February, 2025; originally announced February 2025.

Comments: Videos of STRING-based robotics controllers can be found here: https://sites.google.com/view/string-robotics

arXiv:2502.01838 [pdf]

doi 10.1002/adfm.202418259

Universal Superconductivity in FeTe and All-Iron-Based Ferromagnetic Superconductor Heterostructures

Authors: Hee Taek Yi, Xiong Yao, Deepti Jain, Ying-Ting Chan, An-Hsi Chen, Matthew Brahlek, Kim Kisslinger, Kai Du, Myung-Geun Han, Yimei Zhu, Weida Wu, Sang-Wook Cheong, Seongshik Oh

Abstract: Ferromagnetism (FM) and superconductivity (SC) are two of the most famous macroscopic quantum phenomena. However, nature normally does not allow SC and FM to coexist without significant degradation. Here, we introduce the first fully iron-based SC/FM heterostructures, composed of Fe(Te,Se) and Fe3GeTe2, and show that in this platform strong FM and high-temperature SC robustly coexist. We subsequently discover that chemical proximity effect from neighboring layers can universally drive the otherwise non-superconducting FeTe films into a SC state. This suggests that the ground state of FeTe is so close to the SC state that it could be driven in and out of the SC state with various other perturbations. Altogether, this shows that Fe-Te-based heterostructures provide a unique opportunity to manipulate magnetism, superconductivity and topological physics, paving the way toward new superconducting technologies.

Submitted 3 February, 2025; originally announced February 2025.

Journal ref: Adv. Funct. Mater. 2025, 2418259

arXiv:2501.17217 [pdf, other]

Supersymmetric Grey Galaxies, Dual Dressed Black Holes and the Superconformal Index

Authors: Sunjin Choi, Diksha Jain, Seok Kim, Vineeth Krishna, Goojin Kwon, Eunwoo Lee, Shiraz Minwalla, Chintan Patel

Abstract: Motivated by the recent construction of grey galaxy and Dual Dressed Black Hole solutions in $AdS_5\times S^5$, we present two conjectures relating to the large $N$ entropy of supersymmetric states in ${\cal N}=4$ Yang-Mills theory. Our first conjecture asserts the existence of a large number of supersymmetric states which can be thought of as a non interacting mix of supersymmetric black holes and supersymmetric `gravitons'. It predicts a microcanonical phase diagram of supersymmetric states with eleven distinct phases, and makes a sharp prediction for the supersymmetric entropy (as a function of 5 charges) in each of these phases. The microcanonical version of the superconformal index involves a sum over states - with alternating signs - over a line in 5 parameter charge space. Our second conjecture asserts that this sum is dominated by the point on the line that has the largest supersymmetric entropy. This conjecture predicts a large $N$ formula for the superconformal index as a function of indicial charges, and predicts a microcanonical indicial phase diagram with nine distinct phases. It predicts agreement between the superconformal index and black hole entropy in one phase (so over one range of charges), but disagreement in other phases (and so at other values of charges). We compare our predictions against numerically evaluated superconformal index at $N\leq10$, and find qualitative agreement.

Submitted 28 January, 2025; originally announced January 2025.

Comments: 59 pages + Appendices, 34 figures

Report number: TIFR/TH/25-3, LCTP-25-02

arXiv:2501.13861 [pdf, other]

Exploring Various Dark Matter Halo Profiles in Milky Way and Andromeda Galaxies within the Framework of Standard Cosmology

Authors: Darshan Kumar, Nisha Rani, Deepak Jain, Shobhit Mahajan, Amitabha Mukherjee

Abstract: In this paper, we study the rotation curves of the Milky Way galaxy (MW) and Andromeda galaxy (M31) by considering their bulge, disk, and halo components. We model the bulge region by the widely accepted de Vaucouleurs' law and the disk region by the well established exponential profile. In order to understand the distribution of dark matter in the halo region, we consider three different dark matter profiles in the framework of the standard $\Lambda$CDM model, namely the Navarro-Frenk-White (NFW), Hernquist and Einasto profiles. We use recent datasets of rotation curves of the Milky Way and Andromeda galaxy. The data consist of rotation velocities of the stars and gas in the galaxy as a function of the radial distance from the center. Using Bayesian statistics, we perform an overall fit including all the components, i.e., bulge, disk and halo with the data. Our results indicate that the NFW and Hernquist profiles are in concordance with the observational data points. However, the Einasto profile poorly explains the behaviour of dark matter in both galaxies.

Submitted 23 January, 2025; originally announced January 2025.

Comments: 22 pages, 8 figures, and 2 tables. Comments are welcome!

arXiv:2501.07590 [pdf]

Ultrafast pulsed laser evaluation of Single Event Transients in opto-couplers

Authors: Kavin Dave, Aditya Mukherjee, Hari Shanker Gupta, Deepak Jain, Shalabh Gupta

Abstract: We build a 1064 nm fiber laser system-based testing facility for emulating SETs in different electronics components and ICs. Using these facilities, we tested the 4N35 optocoupler to observe SETs for the first time.

Submitted 8 January, 2025; originally announced January 2025.

Comments: Accepted at CLEO 2023, San Jose, USA and CLEO 2024, North Carolina, USA for poster presentation. However, due to lack of funds, we could not travel.

arXiv:2412.05453 [pdf, other]

Knowledge Graphs are all you need: Leveraging KGs in Physics Question Answering

Authors: Krishnasai Addala, Kabir Dev Paul Baghel, Dhruv Jain, Chhavi Kirtani, Avinash Anand, Rajiv Ratn Shah

Abstract: This study explores the effectiveness of using knowledge graphs generated by large language models to decompose high school-level physics questions into sub-questions. We introduce a pipeline aimed at enhancing model response quality for Question Answering tasks. LLMs are employed to construct knowledge graphs that capture the internal logic of the questions, and these graphs then guide the generation of sub-questions. We hypothesize that this method yields sub-questions that are more logically consistent with the original questions compared to traditional decomposition techniques. Our results show that sub-questions derived from knowledge graphs exhibit significantly improved fidelity to the original question's logic. This approach not only enhances the learning experience by providing clearer and more contextually appropriate sub-questions but also highlights the potential of LLMs to transform educational methodologies. The findings indicate a promising direction for applying AI to improve the quality and effectiveness of educational content.

Submitted 23 December, 2024; v1 submitted 6 December, 2024; originally announced December 2024.

arXiv:2412.00821 [pdf, other]

Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents

Authors: Raj Jaiswal, Dhruv Jain, Harsh Parimal Popat, Avinash Anand, Abhishek Dharmadhikari, Atharva Marathe, Rajiv Ratn Shah

Abstract: Large Language Models (LLMs) demonstrate remarkable capabilities in various reasoning tasks. However, they encounter significant challenges when it comes to scientific reasoning, particularly in physics, which requires not only mathematical reasoning but also factual and conceptual understanding. When addressing complex physics problems, LLMs typically face three key issues: problem miscomprehension, incorrect concept application, and computational errors. While each of these problems can be addressed individually, there is a need for a generalized approach that can tackle all three issues simultaneously. To address this, we introduce Mixture of Refinement Agents (MoRA), a novel agentic refinement framework that iteratively refines the LLM-generated base solution by correcting the aforementioned errors, resulting in a significant performance improvement for open-source LLMs. Our approach aims to bridge the gap between open-source LLMs and GPT-4o by utilizing the latter as an error identifier to guide these refinement agents. We evaluate our approach on the SciEval and MMLU subsets along with our own physics dataset (PhysicsQA). MoRA significantly improves the performance of Llama-3-70B and Gemma-2-27B on these datasets, achieving up to a 16% increase in final answer accuracy.

Submitted 1 December, 2024; originally announced December 2024.

Comments: 7 pages

arXiv:2410.21735 [pdf]

Single-domain imaging in topological insulator Bi2Te3 thin films

Authors: David H. Yi, Deepti Jain

Abstract: Single crystalline materials, different from polycrystalline and twinning structures, are desired for investigating the intrinsic physical properties, as grain and twin boundaries often work as a source of artifacts. Bismuth chalcogenides, which are van der Waals materials notable as topological insulators, have attracted significant interest due to their rich physical properties. However, the formation of 60° twin domains is common in these materials. Here, we demonstrate single-domain bismuth chalcogenides. Using atomic force microscopy, we investigated the morphology of Bi2Se3 and Bi2Te3 grown on Al2O3. Despite lattice constants of Bi2Se3 and Al2O3 substrates being well matched with hybrid symmetry epitaxy, Bi2Se3 exhibited 60° twin boundaries across the surface. Interestingly, Bi2Te3 showed a single-domain feature across the 10 mm by 10 mm sample even with lattice mismatch. While further in-depth studies are required to understand this difference in the morphology between Bi2Se3/Al2O3 and Bi2Te3/Al2O3, we suggest that the formation of twin boundaries in bismuth chalcogenides is related to the interaction between quintuple layers across the van der Waals gap rather than strain or defects.

Submitted 5 November, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

arXiv:2410.20170 [pdf]

Cyberbullying or just Sarcasm? Unmasking Coordinated Networks on Reddit

Authors: Pinky Pamecha, Chaitya Shah, Divyam Jain, Kashish Gandhi, Kiran Bhowmick, Meera Narvekar

Abstract: With the rapid growth of social media usage, a common trend has emerged where users often make sarcastic comments on posts. While sarcasm can sometimes be harmless, it can blur the line with cyberbullying, especially when used in negative or harmful contexts. This growing issue has been exacerbated by the anonymity and vast reach of the internet, making cyberbullying a significant concern on platforms like Reddit. Our research focuses on distinguishing cyberbullying from sarcasm, particularly where online language nuances make it difficult to discern harmful intent. This study proposes a framework using natural language processing (NLP) and machine learning to differentiate between the two, addressing the limitations of traditional sentiment analysis in detecting nuanced behaviors. By analyzing a custom dataset scraped from Reddit, we achieved a 95.15% accuracy in distinguishing harmful content from sarcasm. Our findings also reveal that teenagers and minority groups are particularly vulnerable to cyberbullying. Additionally, our research uncovers coordinated graphs of groups involved in cyberbullying, identifying common patterns in their behavior. This research contributes to improving detection capabilities for safer online communities.

Submitted 26 October, 2024; originally announced October 2024.

Comments: 7 pages, 4 figures

arXiv:2410.17671 [pdf]

doi 10.1063/5.0238212

Mystery of superconductivity in FeTe films and the role of neighboring layers

Authors: Xiong Yao, Hee Taek Yi, Deepti Jain, Xiaoyu Yuan, Seongshik Oh

Abstract: Since the discovery of superconductivity in the Fe(Te,Se) system, it has been a general consensus that the end member of FeTe is not superconducting. Nonetheless, in recent years, there have been reports of superconducting FeTe films, but the origin of their superconductivity remains mysterious. Here, we provide the first comprehensive review of all the reported FeTe films regarding the relationship between their superconductivity and neighboring layers. Based on this review, we show that telluride neighboring layers are the key to superconducting FeTe films. Then, with additional new studies, we show that stoichiometric Te content, which can be readily achieved in FeTe films with the assistance of neighboring telluride layers, might be crucial to stabilizing the superconductivity in this system. This work provides insights into the underlying mechanism behind superconductivity in FeTe films and sheds light on the critical role of neighboring layers and stoichiometry control toward manipulating topological superconductivity in FeTe heterostructures.

Submitted 23 October, 2024; originally announced October 2024.

Comments: 16 pages, 4 figures

Journal ref: APL Mater. 13, 011116 (2025)

arXiv:2410.09174 [pdf, other]

Context-Aware SQL Error Correction Using Few-Shot Learning -- A Novel Approach Based on NLQ, Error, and SQL Similarity

Authors: Divyansh Jain, Eric Yang

Abstract: In recent years, the demand for automated SQL generation has increased significantly, driven by the need for efficient data querying in various applications. However, generating accurate SQL queries remains a challenge due to the complexity and variability of natural language inputs. This paper introduces a novel few-shot learning-based approach for error correction in SQL generation, enhancing the accuracy of generated queries by selecting the most suitable few-shot error correction examples for a given natural language question (NLQ). In our experiments with the open-source Gretel dataset, the proposed model offers a 39.2% increase in fixing errors from the baseline approach with no error correction and a 10% increase from a simple error correction method. The proposed technique leverages embedding-based similarity measures to identify the closest matches from a repository of few-shot examples. Each example comprises an incorrect SQL query, the resulting error, the correct SQL query, and detailed steps to transform the incorrect query into the correct one. By employing this method, the system can effectively guide the correction of errors in newly generated SQL queries. Our approach demonstrates significant improvements in SQL generation accuracy by providing contextually relevant examples that facilitate error identification and correction. The experimental results highlight the effectiveness of embedding-based selection in enhancing the few-shot learning process, leading to more precise and reliable SQL query generation.
This research contributes to the field of automated SQL generation by offering a robust framework for error correction, paving the way for more advanced and user-friendly database interaction tools.

Submitted 11 October, 2024; originally announced October 2024.

Comments: Accepted for the 1st Workshop on GenAI and RAG Systems for Enterprise @ CIKM 2024

arXiv:2410.03462 [pdf, other]

Linear Transformer Topological Masking with Graph Random Features

Authors: Isaac Reid, Kumar Avinava Dubey, Deepali Jain, Will Whitney, Amr Ahmed, Joshua Ainslie, Alex Bewley, Mithun Jacob, Aranyak Mehta, David Rendleman, Connor Schenck, Richard E. Turner, René Wagner, Adrian Weller, Krzysztof Choromanski

Abstract: When training transformers on graph-structured data, incorporating information about the underlying topology is crucial for good performance. Topological masking, a type of relative position encoding, achieves this by upweighting or downweighting attention depending on the relationship between the query and keys in a graph. In this paper, we propose to parameterise topological masks as a learnable function of a weighted adjacency matrix -- a novel, flexible approach which incorporates a strong structural inductive bias. By approximating this mask with graph random features (for which we prove the first known concentration bounds), we show how this can be made fully compatible with linear attention, preserving $\mathcal{O}(N)$ time and space complexity with respect to the number of input tokens. The fastest previous alternative was $\mathcal{O}(N \log N)$ and only suitable for specific graphs. Our efficient masking algorithms provide strong performance gains for tasks on image and point cloud data, including with $>30$k nodes.

Submitted 15 October, 2024; v1 submitted 4 October, 2024; originally announced October 2024.

arXiv:2409.18178 [pdf, other]

doi 10.21468/SciPostPhys.18.4.137

Dual Dressed Black Holes as the end point of the Charged Superradiant instability in ${\cal N} = 4$ Yang Mills

Authors: Sunjin Choi, Diksha Jain, Seok Kim, Vineeth Krishna, Eunwoo Lee, Shiraz Minwalla, Chintan Patel

Abstract: Charged black holes in $AdS_5 \times S^5$ suffer from superradiant instabilities over a range of energies. Hairy black hole solutions (constructed within gauged supergravity) have previously been proposed as endpoints to this instability. We demonstrate that these hairy black holes are themselves unstable to the emission of large dual giant gravitons. We propose that the endpoint to this instability is given by Dual Dressed Black Holes (DDBHs): configurations consisting of one, two, or three very large dual giant gravitons surrounding a core $AdS$ black hole with one, two, or three $SO(6)$ chemical potentials equal to unity. The dual giants each live at $AdS$ radial coordinates of order $\sqrt{N}$ and each carry charge of order $N^2$. The large separation makes DDBHs a very weakly interacting mix of their components and allows for a simple computation of their thermodynamics. We conjecture that DDBHs dominate the phase diagram of ${\cal N}=4$ Yang-Mills over a range of energies around the BPS plane, and provide an explicit construction of this phase diagram, briefly discussing the interplay with supersymmetry. We develop the quantum description of dual giants around black hole backgrounds and explicitly verify that DDBHs are stable to potential tunneling instabilities, precisely when the chemical potentials of the core black holes equal unity. We also construct the 10-dimensional DDBH bulk solutions.

Submitted 23 March, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

Comments: 86 pages + Appendices, 16 figures; Corrected typos, added references, updated the text in section 5.1 and added 4 Figures in Section 3.4 and Appendix D.1

Report number: TIFR/TH/24-19, LCTP-24-17

arXiv:2408.03906 [pdf, other]

Achieving Human Level Competitive Robot Table Tennis

Authors: David B. D'Ambrosio, Saminda Abeyruwan, Laura Graesser, Atil Iscen, Heni Ben Amor, Alex Bewley, Barney J. Reed, Krista Reymann, Leila Takayama, Yuval Tassa, Krzysztof Choromanski, Erwin Coumans, Deepali Jain, Navdeep Jaitly, Natasha Jaques, Satoshi Kataoka, Yuheng Kuang, Nevena Lazic, Reza Mahjourian, Sherry Moore, Kenneth Oslund, Anish Shankar, Vikas Sindhwani, Vincent Vanhoucke, Grace Vesom, et al. (2 additional authors not shown)

Abstract: Achieving human-level speed and performance on real world tasks is a north star for the robotics research community. This work takes a step towards that goal and presents the first learned robot agent that reaches amateur human-level performance in competitive table tennis. Table tennis is a physically demanding sport which requires human players to undergo years of training to achieve an advanced level of proficiency. In this paper, we contribute (1) a hierarchical and modular policy architecture consisting of (i) low level controllers with their detailed skill descriptors which model the agent's capabilities and help to bridge the sim-to-real gap and (ii) a high level controller that chooses the low level skills, (2) techniques for enabling zero-shot sim-to-real including an iterative approach to defining the task distribution that is grounded in the real-world and defines an automatic curriculum, and (3) real time adaptation to unseen opponents. Policy performance was assessed through 29 robot vs. human matches of which the robot won 45% (13/29). All humans were unseen players and their skill level varied from beginner to tournament level. Whilst the robot lost all matches vs. the most advanced players it won 100% matches vs. beginners and 55% matches vs. intermediate players, demonstrating solidly amateur human-level performance. Videos of the matches can be viewed at https://sites.google.com/view/competitive-robot-table-tennis

Submitted 1 May, 2025; v1 submitted 7 August, 2024; originally announced August 2024.

arXiv:2407.19033 [pdf, other]

doi 10.3847/2041-8213/ad8dc7

Rates and beaming angles of GRBs associated with compact binary coalescences

Authors: Shasvath J. Kapadia, Dimple, Dhruv Jain, Kuntal Misra, K. G. Arun, L. Resmi

Abstract: Some, if not all, binary neutron star (BNS) coalescences, and a fraction of neutron star-black hole (NSBH) mergers, are thought to produce sufficient mass-ejection to power Gamma-Ray Bursts (GRBs). However, this fraction, as well as the distribution of beaming angles of BNS-associated GRBs, are poorly constrained from observation. Recent work applied machine learning tools to analyze GRB light curves observed by Fermi/GBM and Swift/BAT. GRBs were segregated into multiple distinct clusters, with the tantalizing possibility that one of them (BNS cluster) could be associated with BNSs and another (NSBH cluster) with NSBHs. As a proof of principle, assuming that all GRBs detected by Fermi/GBM and Swift/BAT associated with BNSs (NSBHs) lie in the BNS (NSBH) cluster, we estimate their rates ($\mathrm{Gpc}^{-3}\mathrm{yr}^{-1}$). We compare these rates with corresponding BNS and NSBH rates estimated by the LIGO-Virgo-KAGRA (LVK) collaboration from the first three observing runs (O1, O2, O3). We find that the BNS rates are consistent with LVK's rate estimates, assuming a uniform distribution of beaming fractions ($f_b \in [0.01, 0.1]$). Conversely, using the LVK's BNS rate estimates, assuming all BNS mergers produce GRBs, we are able to constrain the beaming angle distribution to $\theta_j \in [0.8^{\circ}, 33.5^{\circ}]$ at $90\%$ confidence. We similarly place limits on the fraction of GRB-Bright NSBHs as $f_B \in [1.3\%, 63\%]$ ($f_B \in [0.4\%, 15\%]$) with Fermi/GBM (Swift/BAT) data.

Submitted 15 November, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

Comments: 10 pages, 3 figures

Journal ref: ApJ Letters 976 L10 (2024)

arXiv:2407.16847 [pdf, other]

SPLAT: A framework for optimised GPU code-generation for SParse reguLar ATtention

Authors: Ahan Gupta, Yueming Yuan, Devansh Jain, Yuhao Ge, David Aponte, Yanqi Zhou, Charith Mendis

Abstract: Multi-head-self-attention (MHSA) mechanisms achieve state-of-the-art (SOTA) performance across natural language processing and vision tasks. However, their quadratic dependence on sequence lengths has bottlenecked inference speeds. To circumvent this bottleneck, researchers have proposed various sparse-MHSA models, where a subset of full attention is computed. Despite their promise, current sparse libraries and compilers do not support high-performance implementations for diverse sparse-MHSA patterns due to the underlying sparse formats they operate on. These formats, which are typically designed for high-performance and scientific computing applications, are either curated for extreme amounts of random sparsity (<1% non-zero values), or specific sparsity patterns. However, the sparsity patterns in sparse-MHSA are moderately sparse (10-50% non-zero values) and varied, resulting in existing sparse-formats trading off generality for performance. We bridge this gap, achieving both generality and performance, by proposing a novel sparse format: affine-compressed-sparse-row (ACSR) and supporting code-generation scheme, SPLAT, that generates high-performance implementations for diverse sparse-MHSA patterns on GPUs. Core to our proposed format and code generation algorithm is the observation that common sparse-MHSA patterns have uniquely regular geometric properties. These properties, which can be analyzed just-in-time, expose novel optimizations and tiling strategies that SPLAT exploits to generate high-performance implementations for diverse patterns. To demonstrate SPLAT's efficacy, we use it to generate code for various sparse-MHSA models, achieving geomean speedups of 2.05x and 4.05x over hand-written kernels written in Triton and TVM respectively on A100 GPUs. Moreover, its interfaces are intuitive and easy to use with existing implementations of MHSA in JAX.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 31 pages, 16 figures

arXiv:2407.15029 [pdf]

doi 10.1021/acs.nanolett.4c02320

Atomic-Layer-Controlled Magnetic Orders in MnBi2Te4-Bi2Te3 Topological Heterostructures

Authors: Xiong Yao, Qirui Cui, Zengle Huang, Xiaoyu Yuan, Hee Taek Yi, Deepti Jain, Kim Kisslinger, Myung-Geun Han, Weida Wu, Hongxin Yang, Seongshik Oh

Abstract: The natural van der Waals superlattice MnBi2Te4-(Bi2Te3)m provides an optimal platform to combine topology and magnetism in one system with minimal structural disorder. Here, we show that this system can harbor both ferromagnetic (FM) and antiferromagnetic (AFM) orders and that these magnetic orders can be controlled in two different ways by either varying the Mn-Mn distance while keeping the Bi2Te3/MnBi2Te4 ratio constant or vice versa. We achieve this by creating atomically engineered sandwich structures composed of Bi2Te3 and MnBi2Te4 layers. We show that the AFM order is exclusively determined by the Mn-Mn distance whereas the FM order depends only on the overall Bi2Te3/MnBi2Te4 ratio regardless of the distance between the MnBi2Te4 layers. Our results shed light on the origins of the AFM and FM orders and provide insights into how to manipulate magnetic orders not only for the MnBi2Te4-Bi2Te3 system but also for other magneto-topological materials.

Submitted 20 July, 2024; originally announced July 2024.

Comments: 25 pages, 5 figures, accepted to Nano Letters

arXiv:2407.03901 [pdf, other]

DiCTI: Diffusion-based Clothing Designer via Text-guided Input

Authors: Ajda Lampe, Julija Stopar, Deepak Kumar Jain, Shinichiro Omachi, Peter Peer, Vitomir Štruc

Abstract: Recent developments in deep generative models have opened up a wide range of opportunities for image synthesis, leading to significant changes in various creative fields, including the fashion industry. While numerous methods have been proposed to benefit buyers, particularly in virtual try-on applications, there has been relatively less focus on facilitating fast prototyping for designers and customers seeking to order new designs. To address this gap, we introduce DiCTI (Diffusion-based Clothing Designer via Text-guided Input), a straightforward yet highly effective approach that allows designers to quickly visualize fashion-related ideas using text inputs only. Given an image of a person and a description of the desired garments as input, DiCTI automatically generates multiple high-resolution, photorealistic images that capture the expressed semantics. By leveraging a powerful diffusion-based inpainting model conditioned on text inputs, DiCTI is able to synthesize convincing, high-quality images with varied clothing designs that viably follow the provided text descriptions, while being able to process very diverse and challenging inputs, captured in completely unconstrained settings. We evaluate DiCTI in comprehensive experiments on two different datasets (VITON-HD and Fashionpedia) and in comparison to the state-of-the-art (SoTA). The results of our experiments show that DiCTI convincingly outperforms the SoTA competitor in generating higher quality images with more elaborate garments and superior text prompt adherence, both according to standard quantitative evaluation measures and human ratings, generated as part of a user study.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted to FG 2024

arXiv:2406.19800 [pdf, other]

Modeling the Real World with High-Density Visual Particle Dynamics

Authors: William F. Whitney, Jacob Varley, Deepali Jain, Krzysztof Choromanski, Sumeet Singh, Vikas Sindhwani

Abstract: We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neighbour attention layers. We demonstrate the capabilities of HD-VPD by modeling the dynamics of high degree-of-freedom bi-manual robots with two RGB-D cameras. Compared to the previous graph neural network approach, our Interlacer dynamics is twice as fast with the same prediction quality, and can achieve higher quality using 4x as many particles. We illustrate how HD-VPD can evaluate motion plan quality with robotic box pushing and can grasping tasks. See videos and particle dynamics rendered by HD-VPD at https://sites.google.com/view/hd-vpd.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.17740 [pdf, other]

Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning

Authors: Arijit Sehanobish, Avinava Dubey, Krzysztof Choromanski, Somnath Basu Roy Chowdhury, Deepali Jain, Vikas Sindhwani, Snigdha Chaturvedi

Abstract: Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative by allowing us to fine-tune models by updating only a small number of parameters. In this work, we propose a general framework for parameter efficient fine-tuning (PEFT), based on structured unrestricted-rank matrices (SURM) which can serve as a drop-in replacement for popular approaches such as Adapters and LoRA. Unlike other methods like LoRA, SURMs provide more flexibility in finding the right balance between compactness and expressiveness. This is achieved by using low displacement rank matrices (LDRMs), which have not been used in this context before. SURMs remain competitive with baselines, often providing significant quality improvements while using a smaller parameter budget. SURMs achieve 5-7% accuracy gains on various image classification tasks while replacing low-rank matrices in LoRA. They also result in up to 12x reduction of the number of parameters in adapters (with virtually no loss in quality) on the GLUE benchmark.

Submitted 17 December, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

Comments: Accepted at NeurIPS 2024

arXiv:2405.14878 [pdf, other]

Improving and Evaluating Machine Learning Methods for Forensic Shoeprint Matching

Authors: Divij Jain, Saatvik Kher, Lena Liang, Yufeng Wu, Ashley Zheng, Xizhen Cai, Anna Plantinga, Elizabeth Upton

Abstract: We propose a machine learning pipeline for forensic shoeprint pattern matching that improves on the accuracy and generalisability of existing methods. We extract 2D coordinates from shoeprint scans using edge detection and align the two shoeprints with iterative closest point (ICP). We then extract similarity metrics to quantify how well the two prints match and use these metrics to train a random forest that generates a probabilistic measurement of how likely two prints are to have originated from the same outsole. We assess the generalisability of machine learning methods trained on lab shoeprint scans to more realistic crime scene shoeprint data by evaluating the accuracy of our methods on several shoeprint scenarios: partial prints, prints with varying levels of blurriness, prints with different amounts of wear, and prints from different shoe models. We find that models trained on one type of shoeprint yield extremely high levels of accuracy when tested on shoeprint pairs of the same scenario but fail to generalise to other scenarios. We also discover that models trained on a variety of scenarios predict almost as accurately as models trained on specific scenarios.

Submitted 2 April, 2024; originally announced May 2024.

arXiv:2405.09373 [pdf, other]

PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models

Authors: Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap

Abstract: Recent advances in large language models (LLMs) have led to their extensive global deployment, and ensuring their safety calls for comprehensive and multilingual toxicity evaluations. However, existing toxicity benchmarks are overwhelmingly focused on English, posing serious risks to deploying LLMs in other languages. We address this by introducing PolygloToxicityPrompts (PTP), the first large-scale multilingual toxicity evaluation benchmark of 425K naturally occurring prompts spanning 17 languages. We overcome the scarcity of naturally occurring toxicity in web-text and ensure coverage across languages with varying resources by automatically scraping over 100M web-text documents. Using PTP, we investigate research questions to study the impact of model size, prompt language, and instruction and preference-tuning methods on toxicity by benchmarking over 60 LLMs. Notably, we find that toxicity increases as language resources decrease or model size increases. Although instruction- and preference-tuning reduce toxicity, the choice of preference-tuning method does not have any significant impact. Our findings shed light on crucial shortcomings of LLM safeguarding and highlight areas for future research.

Submitted 9 August, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: Accepted to COLM 2024

arXiv:2404.03570 [pdf, other]

Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity

Authors: Jake Varley, Sumeet Singh, Deepali Jain, Krzysztof Choromanski, Andy Zeng, Somnath Basu Roy Chowdhury, Avinava Dubey, Vikas Sindhwani

Abstract: We present an embodied AI system which receives open-ended natural language instructions from a human, and controls two arms to collaboratively accomplish potentially long-horizon tasks over a large workspace. Our system is modular: it deploys state of the art Large Language Models for task planning, Vision-Language models for semantic perception, and Point Cloud transformers for grasping. With semantic and physical safety in mind, these modules are interfaced with a real-time trajectory optimizer and a compliant tracking controller to enable human-robot proximity. We demonstrate performance for the following tasks: bi-arm sorting, bottle opening, and trash disposal tasks. These are done zero-shot where the models used have not been trained with any real world data from this bi-arm robot, scenes or workspace. Composing both learning- and non-learning-based components in a modular fashion with interpretable inputs and outputs allows the user to easily debug points of failures and fragilities. One may also in-place swap modules to improve the robustness of the overall platform, for instance with imitation-learned policies. Please see https://sites.google.com/corp/view/safe-robots .

Submitted 1 November, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

arXiv:2402.11477 [pdf, other]

Cross-Cultural Differences in Mental Health Expressions on Social Media

Authors: Sunny Rai, Khushi Shelat, Devansh R Jain, Kishen Sivabalan, Young Min Cho, Maitreyi Redkar, Samindara Sawant, Lyle H. Ungar, Sharath Chandra Guntuku

Abstract: Culture moderates the way individuals perceive and express mental distress. Current understandings of mental health expressions on social media, however, are predominantly derived from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) contexts. To address this gap, we examine mental health posts on Reddit made by individuals geolocated in India, to identify variations in social media language specific to the Indian context compared to users from Western nations. Our experiments reveal significant psychosocial variations in emotions and temporal orientation. This study demonstrates the potential of social media platforms for identifying cross-cultural differences in mental health expressions (e.g. seeking advice in India vs seeking support by Western users). Significant linguistic variations in online mental health-related language emphasize the importance of developing precision-targeted interventions that are culturally appropriate.

Submitted 8 February, 2025; v1 submitted 18 February, 2024; originally announced February 2024.

arXiv:2402.10051 [pdf, other]

SwissNYF: Tool Grounded LLM Agents for Black Box Setting

Authors: Somnath Sendhil Kumar, Dhruv Jain, Eshaan Agarwal, Raunak Pandey

Abstract: While Large Language Models (LLMs) have demonstrated enhanced capabilities in function-calling, these advancements primarily rely on accessing the functions' responses. This methodology is practical for simpler APIs but faces scalability issues with irreversible APIs that significantly impact the system, such as a database deletion API. Similarly, processes requiring extensive time for each API call and those necessitating forward planning, like automated action pipelines, present complex challenges. Furthermore, scenarios often arise where a generalized approach is needed because algorithms lack direct access to the specific implementations of these functions or secrets to use them. Traditional tool planning methods are inadequate in these cases, compelling the need to operate within black-box environments. Unlike their performance in tool manipulation, LLMs excel in black-box tasks, such as program synthesis. Therefore, we harness the program synthesis capabilities of LLMs to strategize tool usage in black-box settings, ensuring solutions are verified prior to implementation. We introduce TOPGUN, an ingeniously crafted approach leveraging program synthesis for black box tool planning. It is accompanied by SwissNYF, a comprehensive suite that integrates black-box algorithms for planning and verification tasks, addressing the aforementioned challenges and enhancing the versatility and effectiveness of LLMs in complex API interactions. The public code for SwissNYF is available at https://github.com/iclr-dummy-user/SwissNYF.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2401.13982 [pdf]

doi 10.1103/PhysRevMaterials.8.014203

Buffer-layer-controlled Nickeline vs Zinc-Blende/Wurtzite-type MnTe growths on c-plane Al2O3 substrates

Authors: Deepti Jain, Hee Taek Yi, Alessandro R. Mazza, Kim Kisslinger, Myung-Geun Han, Matthew Brahlek, Seongshik Oh

Abstract: In the recent past, MnTe has proven to be a crucial component of the intrinsic magnetic topological insulator (IMTI) family [MnTe]m[Bi2Te3]n, which hosts a wide range of magneto-topological properties depending on the choice of m and n. However, bulk crystal growth allows only a few combinations of m and n for these IMTIs due to the strict limitations of the thermodynamic growth conditions. One way to overcome this challenge is to utilize the atomic layer-by-layer molecular beam epitaxy (MBE) technique, which allows arbitrary sequences of [MnTe]m and [Bi2Te3]n to be formed beyond the thermodynamic limit. For such MBE growth, finding optimal growth templates and conditions for the parent building block, MnTe, is a key requirement. Here, we report that two different hexagonal phases of MnTe, the nickeline (NC) and zinc-blende/wurtzite (ZB-WZ) structures, with distinct in-plane lattice constants of 4.20 ± 0.04 Å and 4.39 ± 0.04 Å, respectively, can be selectively grown on c-plane Al2O3 substrates using different buffer layers and growth temperatures. Moreover, we provide the first comparative studies of different MnTe phases using atomic-resolution scanning transmission electron microscopy and show that ZB and WZ-like stacking sequences can easily alternate between the two. Surprisingly, the In2Se3 buffer layer, despite its lattice constant (4.02 Å) being closer to that of the NC phase, fosters the ZB-WZ instead, whereas Bi2Te3, sharing the same lattice constant (4.39 Å) with the ZB-WZ phase, fosters the NC phase. These discoveries suggest that lattice matching is not always the most critical factor determining the preferred phase during epitaxial growth. Overall, this will deepen our understanding of epitaxial growth modes for chalcogenide materials and accelerate progress toward new IMTI phases as well as other magneto-topological applications.

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.11095 [pdf, other]

doi 10.1145/3643834.3661556

SoundShift: Exploring Sound Manipulations for Accessible Mixed-Reality Awareness

Authors: Ruei-Che Chang, Chia-Sheng Hung, Bing-Yu Chen, Dhruv Jain, Anhong Guo

Abstract: Mixed-reality (MR) soundscapes blend real-world sound with virtual audio from hearing devices, presenting intricate auditory information that is hard to discern and differentiate. This is particularly challenging for blind or visually impaired individuals, who rely on sounds and descriptions in their everyday lives. To understand how complex audio information is consumed, we analyzed online forum posts within the blind community, identifying prevailing challenges, needs, and desired solutions. We synthesized the results and propose SoundShift for increasing MR sound awareness, which includes six sound manipulations: Transparency Shift, Envelope Shift, Position Shift, Style Shift, Time Shift, and Sound Append. To evaluate the effectiveness of SoundShift, we conducted a user study with 18 blind participants across three simulated MR scenarios, where participants identified specific sounds within intricate soundscapes. We found that SoundShift increased MR sound awareness and minimized cognitive load. Finally, we developed three real-world example applications to demonstrate the practicality of SoundShift.

Submitted 26 May, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

Comments: DIS 2024

arXiv:2401.06985 [pdf, other]

Electrodynamics of the quantum anomalous Hall state in a magnetically doped topological insulator

Authors: Zhenisbek Tagay, Hee Taek Yi, Deepti Jain, Seongshik Oh, N. P. Armitage

Abstract: Magnetically doped topological insulators have been extensively studied over the past decade as a material platform to exhibit quantum anomalous Hall effect. Most material realizations are magnetically doped and despite material advances suffer from large disorder effects. In such systems, it is believed that magnetic disorder leads to a spatially varying Dirac mass gap and chemical potential fluctuations, and hence quantized conductance is only observed at very low temperatures. Here, we use a recently developed high-precision time-domain terahertz (THz) polarimeter to study the low-energy electrodynamic response of Cr-doped (Bi,Sb)$_2$Te$_3$ thin films. These films have been recently shown to exhibit a dc quantized anomalous Hall response up to T = 2 K at zero gate voltage. We show that the real part of the THz range Hall conductance $σ_{xy}(ω)$ is slightly smaller than $e^2/h$ down to T = 2 K with an unconventional decreasing dependence on frequency. The imaginary (dissipative) part of $σ_{xy}(ω)$ is small, but increasing as a function of omega. We connect both aspects of our data to a simple model for effective magnetic gap disorder. Our work highlights the different effect that disorder can have on the dc vs. ac quantum anomalous Hall effect.

Submitted 13 January, 2024; originally announced January 2024.

Comments: 6 pages, 4 figures

arXiv:2312.13752 [pdf]

doi 10.1016/j.media.2024.103253

Hunting imaging biomarkers in pulmonary fibrosis: Benchmarks of the AIIB23 challenge

Authors: Yang Nan, Xiaodan Xing, Shiyi Wang, Zeyu Tang, Federico N Felder, Sheng Zhang, Roberta Eufrasia Ledda, Xiaoliu Ding, Ruiqi Yu, Weiping Liu, Feng Shi, Tianyang Sun, Zehong Cao, Minghui Zhang, Yun Gu, Hanxiao Zhang, Jian Gao, Pingyu Wang, Wen Tang, Pengxin Yu, Han Kang, Junqiang Chen, Xing Lu, Boyu Zhang, Michail Mamalakis , et al. (16 additional authors not shown)

Abstract: Airway-related quantitative imaging biomarkers are crucial for examination, diagnosis, and prognosis in pulmonary diseases. However, the manual delineation of airway trees remains prohibitively time-consuming. While significant efforts have been made towards enhancing airway modelling, current publicly available datasets concentrate on lung diseases with moderate morphological variations. The intricate honeycombing patterns present in the lung tissues of fibrotic lung disease patients exacerbate the challenges, often leading to various prediction errors. To address this issue, the 'Airway-Informed Quantitative CT Imaging Biomarker for Fibrotic Lung Disease 2023' (AIIB23) competition was organized in conjunction with the official 2023 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). The airway structures were meticulously annotated by three experienced radiologists. Competitors were encouraged to develop automatic airway segmentation models with high robustness and generalization abilities, followed by exploring the most correlated QIB of mortality prediction. A training set of 120 high-resolution computerised tomography (HRCT) scans was publicly released with expert annotations and mortality status. The online validation set incorporated 52 HRCT scans from patients with fibrotic lung disease and the offline test set included 140 cases from fibrosis and COVID-19 patients. The results have shown that the capacity of extracting airway trees from patients with fibrotic lung disease could be enhanced by introducing voxel-wise weighted general union loss and continuity loss. In addition to the competitive image biomarkers for prognosis, a strong airway-derived biomarker (Hazard ratio>1.5, p<0.0001) was revealed for survival prognostication compared with existing clinical measurements, clinician assessment and AI-based biomarkers.

Submitted 16 April, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 19 pages

arXiv:2312.10452 [pdf, ps, other]

Reply to the "Comment on `Effect of density and nucleon-nucleon potential on the fusion cross section within the relativistic mean field formalism'"

Authors: M. Bhuyan, Raj Kumar, Shilpa Rana, D. Jain, S. K. Patra, B. V. Carlson

Abstract: In reply to the Comment made by M. V. Chushnyakova et al. on our paper [Phys. Rev. C 101, 044603 (2020)], we argue that the calculations, results and conclusions of our paper remain valid. We have shown here the calculations for one reaction using the deformed densities and the R3Y nucleon-nucleon potential obtained within the relativistic mean-field (RMF) formalism. Suitable clarifications and justifications are given to address all the points raised in the Comment.

Submitted 16 December, 2023; originally announced December 2023.

arXiv:2312.09191 [pdf, other]

doi 10.1007/s11207-023-02244-0

Solar flare catalog from 3 years of Chandrayaan-2 XSM observations

Authors: Aravind Bharathi Valluvan, Ashwin Goyal, Devansh Jain, Abhinna Sundar Samantaray, Abhilash Sarwade, Kasiviswanathan Sankarasubramanian

Abstract: We present a catalog of 6266 solar flares detected by the X-Ray Solar Monitor onboard the Chandrayaan-2 lunar orbiter between 1.55 and 12.4 keV (1 and 8 Å) from 2019 September 12 to 2022 November 4, including 1469 type A flares. The catalog represents the first large sample, including both type A, hot thermal flares, and type B, impulsive flares, with a sub-A class sensitive instrument. We also detect 213 sub-A and 1330 A class flares. Individual flares are fit with an exponentially-modified Gaussian function and multi-flare groups are decomposed into individual flares. We validate our findings with flare catalogs made using visual inspection as well as automatic pipelines on Geostationary Operational Environmental Satellite and Solar Dynamics Observatory data. We find a clear bimodality in the ratio of the width to decay time between type A and B flares. We infer a power-law index of $α_F = 1.92 \pm 0.09$ for the background-subtracted peak flux distribution of XSM flares, which is consistent with the value $\sim 2$ reported in the literature. We also infer $α_F = 1.90 \pm 0.09$ for type B, and $α_F = 1.94 \pm 0.08$ for type A flares, which has previously not been reported in the literature. These comparable values hint at a similarity in their generative processes.

Submitted 8 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 29 pages, 15 figures, 5 tables

arXiv:2312.01990 [pdf, other]

SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

Authors: Isabel Leal, Krzysztof Choromanski, Deepali Jain, Avinava Dubey, Jake Varley, Michael Ryoo, Yao Lu, Frederick Liu, Vikas Sindhwani, Quan Vuong, Tamas Sarlos, Ken Oslund, Karol Hausman, Kanishka Rao

Abstract: We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on the new method of fine-tuning proposed by us, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models or VLAs), into their efficient linear-attention counterparts maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models, the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with the rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.

Submitted 4 December, 2023; originally announced December 2023.

arXiv:2311.16489 [pdf, other]

doi 10.1103/PhysRevB.109.094511

Signatures of Majorana Bound States in the Diffraction Patterns of Extended Superconductor-Topological Insulator-Superconductor Josephson Junctions

Authors: Guang Yue, Can Zhang, Erik D. Huemiller, Jessica H. Montone, Gilbert R. Arias, Drew G. Wild, Jered Y. Zhang, David R. Hamilton, Xiaoyu Yuan, Xiong Yao, Deepti Jain, Jisoo Moon, Maryam Salehi, Nikesh Koirala, Seongshik Oh, Dale J. Van Harlingen

Abstract: In an extended superconductor-topological insulator-superconductor (S-TI-S) Josephson junction in a magnetic field, localized Majorana bound states (MBS) are predicted to exist at the cores of Josephson vortices where the local phase difference across the junction is an odd-multiple of $π$. These states contribute a supercurrent with a $4π$-periodic current-phase relation (CPR) that adds to the conventional $2π$-periodic sinusoidal CPR. In this work, we present a comprehensive experimental study of the critical current vs. applied magnetic field diffraction patterns of lateral Nb-Bi$_2$Se$_3$-Nb Josephson junctions. We compare our observations to a model of the Josephson dynamics in the S-TI-S junction system to explore what features of MBS are, or are not, exhibited in these junctions. Consistent with the model, we find several distinct deviations from a Fraunhofer diffraction pattern that is expected for a uniform sin$(φ)$ CPR. In particular, we observe abrupt changes in the diffraction pattern at applied magnetic fields in which the current-carrying localized MBS are expected to enter the junction, and a lifting of the odd-numbered nodes consistent with a $4π$-periodic sin$(φ/2)$-component in the CPR. We also see that although the even-numbered nodes often remain fully-formed, we sometimes see deviations that are consistent with quasiparticle-induced fluctuations in the parity of the MBS pairs that encodes quantum information.

Submitted 21 February, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Journal ref: PHYS. REV. B 109, 094511 (2024)

arXiv:2311.10781 [pdf, other]

Can Language Model Moderators Improve the Health of Online Discourse?

Authors: Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

Abstract: Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establish a systematic definition of conversational moderation effectiveness grounded on moderation literature and establish design criteria for conducting realistic yet safe evaluation. We then propose a comprehensive evaluation framework to assess models' moderation capabilities independently of human intervention. With our framework, we conduct the first known study of language models as conversational moderators, finding that appropriately prompted models that incorporate insights from social science can provide specific and fair feedback on toxic behavior but struggle to influence users to increase their levels of respect and cooperation.

Submitted 6 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 9 pages, NAACL 2024 Main

arXiv:2311.03443 [pdf, other]

The S-matrix and boundary correlators in flat space

Authors: Diksha Jain, Suman Kundu, Shiraz Minwalla, Onkar Parrikar, Siddharth G. Prabhu, Pushkal Shrivastava

Abstract: We consider the path integral of a quantum field theory in Minkowski spacetime with fixed boundary values (for the elementary fields) on asymptotic boundaries. We define and study the corresponding boundary correlation functions obtained by taking derivatives of this path integral with respect to the boundary values. The S-matrix of the QFT can be extracted directly from these boundary correlation functions after smearing. We interpret this relation in terms of coherent state quantization and derive the constraints on the path-integral as a function of boundary values that follow from the unitarity of the S-matrix. We then study the locality structure of boundary correlation functions. In the massive case, we find that the boundary correlation functions for generic locations of boundary points are dominated by a saddle point which has the interpretation of particles scattering in a small elevator in the bulk, where the location of the elevator is determined dynamically, and the S-matrix can be recovered after stripping off some dynamically determined but non-local "renormalization" factors. In the massless case, we find that while the boundary correlation functions are generically analytic as a function on the whole manifold of locations of boundary points, they have special singularities on a sub-manifold, points on which correspond to light-like scattering in the bulk. We completely characterize this singular scattering sub-manifold, and find that the corresponding residues of the boundary correlations at these singularities are precisely given by S-matrices. This analysis parallels the analysis of bulk-point singularities in AdS/CFT and generalizes it to the case of multi-bulk point singularities.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.00315 [pdf, other]

Probing the Physics of Reionization Using kSZ Power Spectrum from Current and Upcoming CMB Surveys

Authors: Divesh Jain, Tirthankar Roy Choudhury, Srinivasan Raghunathan, Suvodip Mukherjee

Abstract: The patchiness in the reionization process alters the statistics of Cosmic Microwave Background (CMB), with the kinematic Sunyaev-Zeldovich (kSZ) effect in the CMB temperature power spectrum being a notable consequence. In this work, we aim to explore the potential of future kSZ power spectrum measurements in inferring the details of the reionization process. In this pursuit, we capitalize on the recent developments in foreground mitigation techniques using the Cross-Internal Linear Combination (Cross-ILC) technique, which enables robust detection of the kSZ power spectrum with signal-to-noise ($S/N$) roughly $20-30σ$ in this decade by SPT-3G and Simons Observatory (SO); and $\geq 80σ$ by CMB-S4, substantially improving on the recent evidence for kSZ binned at $\ell=3000$ using SPT-SZ+SPTpol surveys. We use a fiducial kSZ power spectrum along with realistic error bars expected from the above technique for SPT-3G, SO, and CMB-S4 to constrain the parameter space for a physical model of reionization. We find that with the improved error bars it will be possible to place stringent constraints on reionization using solely the Cross-ILC recovered SPT-3G kSZ without imposing any prior on $τ$ in the Bayesian inference. Notably, high-fidelity kSZ measurements from CMB-S4 coupled with $τ$ measurements through LiteBIRD will enable unprecedented constraint on the midpoint of reionization with an error bar of $\sim 0.25$ and the duration of reionization with an error bar at $\sim 0.21$ exclusively using CMB data. This study highlights the need to capture kSZ power spectrum on a broad range of multipoles to gain insights into the inhomogeneous reionization era.

Submitted 12 March, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

Comments: accepted for publication in MNRAS

arXiv:2310.09481 [pdf, other]

Coupled metamaterial-phonon terahertz range polaritons in a topological insulator

Authors: Sirak M. Mekonen, Deepti Jain, Seongshik Oh, N. P. Armitage

Abstract: We report terahertz time-domain spectroscopy (TDTS) experiments demonstrating strong light-matter coupling in a terahertz (THz) LC-metamaterial in which the phonon resonance of a topological insulator (TI) thin film is coupled to the photonic modes of an array of electronic split-ring resonators. As we tune the metamaterial resonance frequency through the frequency of the low frequency $α$ mode of (Bi$_x$Sb$_{1-x}$)$_2$Te$_3$ (BST), we observe strong mixing and level repulsion between phonon and metamaterial resonance. This hybrid resonance is a phonon polariton. We observe a normalized coupling strength, $η$ = $Ω_R$/$ω_c$ $\approx$ 0.09, using the measured vacuum Rabi frequency and cavity resonance. Our results demonstrate that one can tune the mechanical properties of materials by changing their electromagnetic environment and therefore modify their magnetic and topological degrees of freedom via coupling to the lattice in this fashion.

Submitted 19 May, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

arXiv:2309.03315 [pdf, other]

doi 10.15607/RSS.2023.XIX.006

Robotic Table Tennis: A Case Study into a High Speed Learning System

Authors: David B. D'Ambrosio, Jonathan Abelian, Saminda Abeyruwan, Michael Ahn, Alex Bewley, Justin Boyd, Krzysztof Choromanski, Omar Cortes, Erwin Coumans, Tianli Ding, Wenbo Gao, Laura Graesser, Atil Iscen, Navdeep Jaitly, Deepali Jain, Juhana Kangaspunta, Satoshi Kataoka, Gus Kouretas, Yuheng Kuang, Nevena Lazic, Corey Lynch, Reza Mahjourian, Sherry Q. Moore, Thinh Nguyen, Ken Oslund , et al. (10 additional authors not shown)

Abstract: We present a deep-dive into a real-world robotic learning system that, in previous work, was shown to be capable of hundreds of table tennis rallies with a human and has the ability to precisely return the ball to desired targets. This system puts together a highly optimized perception subsystem, a high-speed low-latency robot controller, a simulation paradigm that can prevent damage in the real world and also train policies for zero-shot transfer, and automated real world environment resets that enable autonomous training and evaluation on physical robots. We complement a complete system description, including numerous design decisions that are typically not widely disseminated, with a collection of studies that clarify the importance of mitigating various sources of latency, accounting for training and deployment distribution shifts, robustness of the perception system, sensitivity to policy hyper-parameters, and choice of action space. A video demonstrating the components of the system and details of experimental results can be found at https://youtu.be/uFcnWjB42I0.

Submitted 19 February, 2025; v1 submitted 6 September, 2023; originally announced September 2023.

Comments: Published and presented at Robotics: Science and Systems (RSS2023)

arXiv:2308.09446 [pdf, other]

Disentangling patchy reionization signatures from primordial gravitational waves using CMB $E$-mode and $B$-mode polarization

Authors: Divesh Jain, Suvodip Mukherjee, Tirthankar Roy Choudhury

Abstract: The detection of large angular scale $B$-mode in the Cosmic Microwave Background (CMB) polarization signal will open a direct window into not only the primary CMB anisotropies caused by the primordial gravitational waves (PGW) originating in the epoch of inflation, but also the secondary anisotropies imprinted during the epoch of cosmic reionization. The existence of patchiness in the electron density during reionization produces a unique distortion in the CMB $B$-mode polarization, which can be distinguished from the PGW signal with the aid of spatial frequency modes. In this work, we employ an $EB$ estimator by combining $E$-mode and $B$-mode polarization for the $τ$ power spectrum signal generated in a photon-conserving semi-numerical reionization model called SCRIPT. We developed a Bayesian framework for the joint detection of the PGW and reionization signal from CMB observations and show the efficacy of this technique for upcoming CMB experiments. We find that, for our model, the $τ$ power spectrum signal effectively tracks the inhomogeneous electron density field, allowing for robust constraints on the patchy $B$-mode signal. Further, our results indicate that employing the $EB$ estimator for the $τ$ signal will facilitate ground-based CMB-S4 to detect the patchy $B$-mode signal at approximately $\geq 2σ$ confidence level while observations with space-based PICO will improve this detection to $\geq 3σ$ going as high as $\geq 7σ$ for extreme reionization models. These findings not only highlight the future potential of these experiments to provide an improved picture of the reionization process but also have important implications towards an unbiased measurement of $r$.

Submitted 25 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

Comments: accepted for publication in MNRAS

arXiv:2308.08613 [pdf, ps, other]

doi 10.1103/PhysRevD.108.L081902

Coherent States in M-Theory: A Brane Scan using the Taub-NUT

Authors: Joydeep Chakravarty, Keshav Dasgupta, Diksha Jain, Dileep P. Jatkar, Archana Maji, Radu Tatar

Abstract: The Taub-NUT geometry corresponds to the Kaluza-Klein monopole solution of M-theory and on dimension reduction along the Taub-NUT circle direction it becomes the D6 brane of type IIA string theory. We show that the Taub-NUT geometry can be realised as a coherent state, or more appropriately as a Glauber-Sudarshan state in M-theory, once we take the underlying resurgence structure carefully. Using the duality chain it in turn implies that all D-branes as well as NS5-branes can be realised as Glauber-Sudarshan states in string theory. Our analysis also leads to an intriguing possibility of realizing the gravity duals of certain non-conformal minimally-supersymmetric gauge theories by deforming a class of Glauber-Sudarshan states.

Submitted 25 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: 6 pages, no figure, LaTex; v2: Typos corrected, references updated. Version appearing in Phys. Rev. D

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi , et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2306.08205 [pdf, other]

Agile Catching with Whole-Body MPC and Blackbox Policy Learning

Authors: Saminda Abeyruwan, Alex Bewley, Nicholas M. Boffi, Krzysztof Choromanski, David D'Ambrosio, Deepali Jain, Pannag Sanketi, Anish Shankar, Vikas Sindhwani, Sumeet Singh, Jean-Jacques Slotine, Stephen Tu

Abstract: We address a benchmark task in agile robotics: catching objects thrown at high speed. This is a challenging task that involves tracking, intercepting, and cradling a thrown object with access only to visual observations of the object and the proprioceptive state of the robot, all within a fraction of a second. We present the relative merits of two fundamentally different solution strategies: (i) Model Predictive Control using accelerated constrained trajectory optimization, and (ii) Reinforcement Learning using zeroth-order optimization. We provide insights into various performance trade-offs including sample efficiency, sim-to-real transfer, robustness to distribution shifts, and whole-body multimodality via extensive on-hardware experiments. We conclude with proposals on fusing "classical" and "learning-based" techniques for agile robot control. Videos of our experiments may be found at https://sites.google.com/view/agile-catching

Submitted 19 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: L4DC 2023

arXiv:2306.04552 [pdf]

doi 10.1021/acs.nanolett.3c01313

High temperature, gate-free quantum anomalous Hall effect with an active capping layer

Authors: Hee Taek Yi, Deepti Jain, Xiong Yao, Seongshik Oh

Abstract: Quantum anomalous Hall effect (QAHE) was discovered a decade ago, but is still not utilized beyond a handful of research groups, due to numerous limitations such as extremely low temperature, electric field-effect gating requirement, small sample sizes and environmental aging effect. Here, we present a robust platform that provides effective solutions to these problems. Specifically, on this platform, we observe QAH signatures at record high temperatures, with the Hall conductance of 1.00 $e^2/h$ at 2.0 K, 0.98 $e^2/h$ at 4.2 K, and 0.92 $e^2/h$ at 10 K, on centimeter-scale substrates, without electric-field-effect gating. The key ingredient is an active CrOx capping layer, which substantially boosts the ferromagnetism while suppressing environmental degradation. With this development, QAHE will now be accessible to much broader applications than before.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 20 pages, 8 figures, Accepted for publication in Nano Letters, https://pubs.acs.org/doi/full/10.1021/acs.nanolett.3c01313

arXiv:2305.14654 [pdf, other]

Barkour: Benchmarking Animal-level Agility with Quadruped Robots

Authors: Ken Caluwaerts, Atil Iscen, J. Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee, et al. (19 additional authors not shown)

Abstract: Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is a growing interest in developing legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite the interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agility. We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots. Inspired by dog agility competitions, it consists of diverse obstacles and a time-based scoring mechanism. This encourages researchers to develop controllers that not only move fast, but do so in a controllable and versatile way. To set strong baselines, we present two methods for tackling the benchmark. In the first approach, we train specialist locomotion skills using on-policy reinforcement learning methods and combine them with a high-level navigation controller. In the second approach, we distill the specialist skills into a Transformer-based generalist locomotion policy, named Locomotion-Transformer, that can handle various terrains and adjust the robot's gait based on the perceived environment and robot states. Using a custom-built quadruped robot, we demonstrate that our method can complete the course at half the speed of a dog. We hope that our work represents a step towards creating controllers that enable robots to reach animal-level agility.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 17 pages, 19 figures
