-
Nano-size fragmentation of Tantalum in Copper composite using additive manufacturing
Authors:
Rakesh Das,
Pawan Kumar Dubey,
Raphael Benjamim de Oliveira,
Douglas S. Galvao,
Indranil Manna,
Sameehan S. Joshi,
Peter Samora Owuor,
Leonardo D. Machado,
Nirmal Kumar Katiyar,
Suman Chakraborty,
Chandra Sekhar Tiwary
Abstract:
The biggest challenge in manufacturing an immiscible system is phase segregation and non-uniformity inside the composite matrix. Additive manufacturing has the potential to overcome these difficulties due to the high cooling rate achieved during the process. Here we have developed immiscible Copper-based composites reinforced with Tantalum, which were fabricated using the powder bed fusion melting (PBF-M) technique. The distinct advantage of utilizing Tantalum in this process resides in its high melting point, allowing it to remain in particle form within the composite and contribute to its mechanical and surface/wear properties. The PBF-M process results in the in situ fragmentation of micron-size Tantalum particles into nanoparticles through a surface roughening process during laser interaction, enhancing the composite's mechanical and wear properties. The microstructural evolution of Cu-Ta composites is explained through multiscale numerical modeling. The enhanced yield strength and the dynamics of the Ta particles were corroborated by molecular dynamics simulations. The maximum yield strength, 80 MPa, is exhibited by Cu-5wt%Ta. The addition of Ta also significantly improves the wear properties of the composites. The current results can be exploited to develop complex-shaped, highly energy-efficient copper-based composites.
Submitted 5 February, 2025;
originally announced February 2025.
-
Signaling Design
Authors:
Matteo Camboni,
Mingzi Niu,
Mallesh M. Pai,
Rakesh Vohra
Abstract:
We revisit the classic job-market signaling model of Spence (1973), introducing profit-seeking schools as intermediaries that design the mapping from candidates' efforts to job-market signals. Each school commits to an attendance fee and a monitoring policy. We show that, in equilibrium, a monopolist school captures the entire social surplus by committing to low information signals and charging fees that extract students' surplus from being hired. In contrast, competition shifts surplus to students, with schools vying to attract high-ability students, enabling them to distinguish themselves from their lower-ability peers. However, this increased signal informativeness leads to more wasteful effort in equilibrium, contrasting with the usual argument that competition enhances social efficiency. This result may be reversed if schools face binding fee caps or students are credit-constrained.
Submitted 4 February, 2025;
originally announced February 2025.
-
Logarithmic double phase problems with generalized critical growth
Authors:
Rakesh Arora,
Ángel Crespo-Blanco,
Patrick Winkert
Abstract:
In this paper we study logarithmic double phase problems with variable exponents involving nonlinearities that have generalized critical growth. We first prove new continuous and compact embedding results in order to guarantee the well-definedness by studying the Sobolev conjugate function of our generalized $N$-function. In the second part we prove the concentration compactness principle for Musielak-Orlicz Sobolev spaces having logarithmic double phase modular function structure. Based on this we are going to show multiplicity results for the problem under consideration for superlinear and sublinear growth, respectively.
Submitted 29 January, 2025;
originally announced January 2025.
-
Exploring the defect landscape and dopability of chalcogenide perovskite BaZrS3
Authors:
Rushik Desai,
Shubhanshu Agarwal,
Kiruba Catherine Vincent,
Alejandro Strachan,
Rakesh Agrawal,
Arun Mannodi-Kanakkithodi
Abstract:
BaZrS3 is a chalcogenide perovskite that has shown promise as a photovoltaic absorber, but its performance is limited because of defects and impurities that have a direct influence on carrier concentrations. Functional dopants that show lower donor-type or acceptor-type formation energies than naturally occurring defects can help tune the optoelectronic properties of BaZrS3. In this work, we applied first principles computations to comprehensively investigate the defect landscape of BaZrS3, including all intrinsic defects and a set of selected impurities and dopants. BaZrS3 intrinsically exhibits n-type equilibrium conductivity under both S-poor and S-rich conditions, which remains largely unchanged in the presence of O and H impurities. La and Nb dopants created stable donor-type defects, which made BaZrS3 even more n-type, whereas As and P dopants formed amphoteric defects with relatively high formation energies. This work highlights the difficulty of creating p-type BaZrS3 owing to the low formation energies of donor defects, both intrinsic and extrinsic. Defect formation energies were also used to compute expected defect concentrations and make comparisons with experimentally reported values. Our dataset of defects in BaZrS3 paves the path for training machine learning models to subsequently perform larger-scale prediction and screening of defects and dopants across many chalcogenide perovskites, including cation-site or anion-site alloys.
Submitted 7 April, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction
Authors:
Kritarth Prasad,
Mohammadi Zaki,
Pratik Singh,
Pankaj Wasnik
Abstract:
Ensembling neural machine translation (NMT) models to produce higher-quality translations than the $L$ individual models has been extensively studied. Recent methods typically employ a candidate selection block (CSB) and an encoder-decoder fusion block (FB), requiring inference across all candidate models, leading to significant computational overhead, generally $\Omega(L)$. This paper introduces SmartGen, a reinforcement learning (RL)-based strategy that improves the CSB by selecting a small, fixed number of candidates and identifying optimal groups to pass to the fusion block for each input sentence. Furthermore, the CSB and FB were previously trained independently, leading to suboptimal NMT performance. Our DQN-based SmartGen addresses this by using feedback from the FB as a reward during training. We also resolve a key issue in earlier methods, where candidates were passed to the FB without modification, by introducing a Competitive Correction Block (CCB). Finally, we validate our approach with extensive experiments on English-Hindi translation tasks in both directions.
Submitted 25 January, 2025;
originally announced January 2025.
-
Personalized Layer Selection for Graph Neural Networks
Authors:
Kartik Sharma,
Vineeth Rakesh,
Yingtong Dou,
Srijan Kumar,
Mahashweta Das
Abstract:
Graph Neural Networks (GNNs) combine node attributes over a fixed granularity of the local graph structure around a node to predict its label. However, different nodes may relate to a node-level property with a different granularity of their local neighborhood, and using the same level of smoothing for all nodes can be detrimental to their classification. In this work, we challenge the common assumption that a single GNN layer can classify all nodes of a graph by training GNNs with a distinct personalized layer for each node. Inspired by metric learning, we propose a novel algorithm, MetSelect, to select the optimal representation layer to classify each node. In particular, we identify a prototype representation of each class in a transformed GNN layer and then classify using the layer where the distance to a class prototype is smallest after normalizing with that layer's variance. Results on 10 datasets and 3 different GNNs show that we significantly improve the node classification accuracy of GNNs in a plug-and-play manner. We also find that using variable layers for prediction enables GNNs to be deeper and more robust to poisoning attacks. We hope this work can inspire future works to learn more adaptive and personalized graph representations.
Submitted 18 June, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment
Authors:
Wenyan Cong,
Hanqing Zhu,
Kevin Wang,
Jiahui Lei,
Colton Stearns,
Yuanhao Cai,
Dilin Wang,
Rakesh Ranjan,
Matt Feiszli,
Leonidas Guibas,
Zhangyang Wang,
Weiyao Wang,
Zhiwen Fan
Abstract:
Efficiently reconstructing 3D scenes from monocular video remains a core challenge in computer vision, vital for applications in virtual reality, robotics, and scene understanding. Recently, frame-by-frame progressive reconstruction without camera poses is commonly adopted, incurring high computational overhead and compounding errors when scaling to longer videos. To overcome these issues, we introduce VideoLifter, a novel video-to-3D pipeline that leverages a local-to-global strategy on a fragment basis, achieving both extreme efficiency and SOTA quality. Locally, VideoLifter leverages learnable 3D priors to register fragments, extracting essential information for subsequent 3D Gaussian initialization with enforced inter-fragment consistency and optimized efficiency. Globally, it employs a tree-based hierarchical merging method with key frame guidance for inter-fragment alignment, pairwise merging with Gaussian point pruning, and subsequent joint optimization to ensure global consistency while efficiently mitigating cumulative errors. This approach significantly accelerates the reconstruction process, reducing training time by over 82% while holding better visual quality than current SOTA methods.
Submitted 10 March, 2025; v1 submitted 3 January, 2025;
originally announced January 2025.
-
Search for continuous gravitational waves from known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
D. Agarwal,
M. Agathos,
M. Aghaei Abchouyeh,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi,
R. A. Alfaidi,
A. Al-Jodah,
C. Alléné,
et al. (1794 additional authors not shown)
Abstract:
Continuous gravitational wave (CW) emission from neutron stars carries information about their internal structure and equation of state, and it can provide tests of General Relativity. We present a search for CWs from a set of 45 known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run, known as O4a. We conducted a targeted search for each pulsar using three independent analysis methods considering the single-harmonic and the dual-harmonic emission models. We find no evidence of a CW signal in O4a data for either model and set upper limits on the signal amplitude and on the ellipticity, which quantifies the asymmetry in the neutron star mass distribution. For the single-harmonic emission model, 29 targets have the upper limit on the amplitude below the theoretical spin-down limit. The lowest upper limit on the amplitude is $6.4\!\times\!10^{-27}$ for the young energetic pulsar J0537-6910, while the lowest constraint on the ellipticity is $8.8\!\times\!10^{-9}$ for the bright nearby millisecond pulsar J0437-4715. Additionally, for a subset of 16 targets we performed a narrowband search that is more robust regarding the emission model, with no evidence of a signal. We also found no evidence of non-standard polarizations as predicted by the Brans-Dicke theory.
Submitted 2 January, 2025;
originally announced January 2025.
-
Photon-photon coupling induced bound state in the continuum and transparency
Authors:
Ekta Tunwal,
Kuldeep Kumar Shrivastava,
Rakesh Kumar Nayak,
Ravi Kumar,
Somak Bhattacharyya,
Rajeev Singh,
Biswanath Bhoi
Abstract:
This study presents coherent and dissipative coupling in hybrid photonic resonators, achieved via the constructive and destructive interference of the photonic resonator fields with the radiation of a common transmission line fed with microwave photons. In the dissipative coupling regime, we observe a bound state in the continuum (BIC) near the frequency crossing of the uncoupled resonators, where the Friedrich-Wintgen BIC condition is satisfied. Furthermore, simply by rotating one of the samples and dynamically adjusting a parameter, we achieve coupling-induced transparency between the photonic resonators. This transition from a BIC in the absorption regime to transparency opens avenues for various plain or programmable oscillators, filters, quantum information processors, sensors, etc.
Submitted 1 January, 2025;
originally announced January 2025.
-
MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation
Authors:
Chia-Yuan Chang,
Zhimeng Jiang,
Vineeth Rakesh,
Menghai Pan,
Chin-Chia Michael Yeh,
Guanchu Wang,
Mingzhi Hu,
Zhichao Xu,
Yan Zheng,
Mahashweta Das,
Na Zou
Abstract:
Large Language Models (LLMs) are becoming essential tools for various natural language processing tasks but often suffer from generating outdated or incorrect information. Retrieval-Augmented Generation (RAG) addresses this issue by incorporating external, real-time information retrieval to ground LLM responses. However, the existing RAG systems frequently struggle with the quality of retrieval documents, as irrelevant or noisy documents degrade performance, increase computational overhead, and undermine response reliability. To tackle this problem, we propose Multi-Agent Filtering Retrieval-Augmented Generation (MAIN-RAG), a training-free RAG framework that leverages multiple LLM agents to collaboratively filter and score retrieved documents. Specifically, MAIN-RAG introduces an adaptive filtering mechanism that dynamically adjusts the relevance filtering threshold based on score distributions, effectively minimizing noise while maintaining high recall of relevant documents. The proposed approach leverages inter-agent consensus to ensure robust document selection without requiring additional training data or fine-tuning. Experimental results across four QA benchmarks demonstrate that MAIN-RAG consistently outperforms traditional RAG approaches, achieving a 2-11% improvement in answer accuracy while reducing the number of irrelevant retrieved documents. Quantitative analysis further reveals that our approach achieves superior response consistency and answer accuracy over baseline methods, offering a competitive and practical alternative to training-based solutions.
Submitted 31 December, 2024;
originally announced January 2025.
-
Enhancing Entertainment Translation for Indian Languages using Adaptive Context, Style and LLMs
Authors:
Pratik Rakesh Singh,
Mohammadi Zaki,
Pankaj Wasnik
Abstract:
We address the challenging task of neural machine translation (NMT) in the entertainment domain, where the objective is to automatically translate a given dialogue from a source language content to a target language. This task has various applications, particularly in automatic dubbing, subtitling, and other content localization tasks, enabling source content to reach a wider audience. Traditional NMT systems typically translate individual sentences in isolation, without facilitating knowledge transfer of crucial elements such as the context and style from previously encountered sentences. In this work, we emphasize the significance of these fundamental aspects in producing pertinent and captivating translations. We demonstrate their significance through several examples and propose a novel framework for entertainment translation, which, to our knowledge, is the first of its kind. Furthermore, we introduce an algorithm to estimate the context and style of the current session and use these estimations to generate a prompt that guides a Large Language Model (LLM) to generate high-quality translations. Our method is both language and LLM-agnostic, making it a general-purpose tool. We demonstrate the effectiveness of our algorithm through various numerical studies and observe significant improvement in the COMET scores over various state-of-the-art LLMs. Moreover, our proposed method consistently outperforms baseline LLMs in terms of win-ratio.
Submitted 29 December, 2024;
originally announced December 2024.
-
INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models
Authors:
Aum Kendapadi,
Kerem Zaman,
Rakesh R. Menon,
Shashank Srivastava
Abstract:
Large language models (LLMs) excel at answering questions but remain passive learners, absorbing static data without the ability to question and refine knowledge. This paper explores how LLMs can transition to interactive, question-driven learning through student-teacher dialogues. We introduce INTERACT (INTERactive learning for Adaptive Concept Transfer), a framework in which a "student" LLM engages a "teacher" LLM through iterative inquiries to acquire knowledge across 1,347 contexts, including song lyrics, news articles, movie plots, academic papers, and images. Our experiments show that across a wide range of scenarios and LLM architectures, interactive learning consistently enhances performance, achieving up to a 25% improvement, with 'cold-start' student models matching static learning baselines in as few as five dialogue turns. Interactive setups can also mitigate the disadvantages of weaker teachers, showcasing the robustness of question-driven learning.
Submitted 31 May, 2025; v1 submitted 15 December, 2024;
originally announced December 2024.
-
Irradiation-driven Evaporation of Micro Droplets in an Optical Trap
Authors:
Jugal Rakesh Shah,
Max Huisman,
Devendra Deshmukh,
Dag Hanstorp,
Javier Tello Marmolejo
Abstract:
Small droplets are irradiated with visible and infrared light in many natural and industrial environments. One of the simplest ways to describe their evaporation is the D$^2$-Law. It states that the evaporation rate is proportional to $t^{-1/2}$ and $R^{-1}$. However, models like the D$^2$-Law do not account for the volumetric heating of light, and the effect of strong irradiation on individual droplets is not fully understood. Here we show the effects of IR irradiation on optically levitated water droplets. We find that, under strong irradiation of up to $10^8$ W/m$^2$, the droplet evaporation is initially driven by the heat from the laser following the power law $dR / dt \sim R$, i.e., the inverse of the D$^2$-Law. Then, when the droplets shrink to 2-3 $\mu$m in radius, a turnover occurs from irradiation-driven back to diffusion-driven evaporation. Our findings support the understanding of droplet evaporation in cases such as rocket engines or internal combustion, where the radiation from the flame will heat water and fuel droplets.
Submitted 14 December, 2024;
originally announced December 2024.
-
3D Mesh Editing using Masked LRMs
Authors:
Will Gao,
Dilin Wang,
Yuchen Fan,
Aljaz Bozic,
Tuur Stuyck,
Zhengqin Li,
Zhao Dong,
Rakesh Ranjan,
Nikolaos Sarafianos
Abstract:
We present a novel approach to mesh shape editing, building on recent progress in 3D reconstruction from multi-view images. We formulate shape editing as a conditional reconstruction problem, where the model must reconstruct the input shape with the exception of a specified 3D region, in which the geometry should be generated from the conditional signal. To this end, we train a conditional Large Reconstruction Model (LRM) for masked reconstruction, using multi-view consistent masks rendered from a randomly generated 3D occlusion, and using one clean viewpoint as the conditional signal. During inference, we manually define a 3D region to edit and provide an edited image from a canonical viewpoint to fill in that region. We demonstrate that, in just a single forward pass, our method not only preserves the input geometry in the unmasked region through reconstruction capabilities on par with SoTA, but is also expressive enough to perform a variety of mesh edits from a single image guidance that past works struggle with, while being 10x faster than the top-performing competing prior work.
Submitted 11 December, 2024;
originally announced December 2024.
-
Make-A-Texture: Fast Shape-Aware Texture Generation in 3 Seconds
Authors:
Xiaoyu Xiang,
Liat Sless Gorelik,
Yuchen Fan,
Omri Armstrong,
Forrest Iandola,
Yilei Li,
Ita Lifshitz,
Rakesh Ranjan
Abstract:
We present Make-A-Texture, a new framework that efficiently synthesizes high-resolution texture maps from textual prompts for given 3D geometries. Our approach progressively generates textures that are consistent across multiple viewpoints with a depth-aware inpainting diffusion model, in an optimized sequence of viewpoints determined by an automatic view selection algorithm.
A significant feature of our method is its remarkable efficiency, achieving a full texture generation within an end-to-end runtime of just 3.07 seconds on a single NVIDIA H100 GPU, significantly outperforming existing methods. Such an acceleration is achieved by optimizations in the diffusion model and a specialized backprojection method. Moreover, our method reduces the artifacts in the backprojection phase, by selectively masking out non-frontal faces, and internal faces of open-surfaced objects.
Experimental results demonstrate that Make-A-Texture matches or exceeds the quality of other state-of-the-art methods. Our work significantly improves the applicability and practicality of texture generation models for real-world 3D content creation, including interactive creation and text-guided texture editing.
Submitted 27 January, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
Authors:
Zhenggang Tang,
Yuchen Fan,
Dilin Wang,
Hongyu Xu,
Rakesh Ranjan,
Alexander Schwing,
Zhicheng Yan
Abstract:
Recent sparse multi-view scene reconstruction advances like DUSt3R and MASt3R no longer require camera calibration and camera pose estimation. However, they only process a pair of views at a time to infer pixel-aligned pointmaps. When dealing with more than two views, a combinatorial number of error prone pairwise reconstructions are usually followed by an expensive global optimization, which often fails to rectify the pairwise reconstruction errors. To handle more views, reduce errors, and improve inference time, we propose the fast single-stage feed-forward network MV-DUSt3R. At its core are multi-view decoder blocks which exchange information across any number of views while considering one reference view. To make our method robust to reference view selection, we further propose MV-DUSt3R+, which employs cross-reference-view blocks to fuse information across different reference view choices. To further enable novel view synthesis, we extend both by adding and jointly training Gaussian splatting heads. Experiments on multi-view stereo reconstruction, multi-view pose estimation, and novel view synthesis confirm that our methods improve significantly upon prior art. Code will be released.
Submitted 9 December, 2024;
originally announced December 2024.
-
Resource-Adaptive Successive Doubling for Hyperparameter Optimization with Large Datasets on High-Performance Computing Systems
Authors:
Marcel Aach,
Rakesh Sarma,
Helmut Neukirchen,
Morris Riedel,
Andreas Lintermann
Abstract:
On High-Performance Computing (HPC) systems, several hyperparameter configurations can be evaluated in parallel to speed up the Hyperparameter Optimization (HPO) process. State-of-the-art HPO methods follow a bandit-based approach and build on top of successive halving, where the final performance of a combination is estimated based on a lower than fully trained fidelity performance metric and more promising combinations are assigned more resources over time. Frequently, the number of epochs is treated as a resource, letting more promising combinations train longer. Another option is to use the number of workers as a resource and directly allocate more workers to more promising configurations via data-parallel training. This article proposes a novel Resource-Adaptive Successive Doubling Algorithm (RASDA), which combines a resource-adaptive successive doubling scheme with the plain Asynchronous Successive Halving Algorithm (ASHA). Scalability of this approach is shown on up to 1,024 Graphics Processing Units (GPUs) on modern HPC systems. It is applied to different types of Neural Networks (NNs) and trained on large datasets from the Computer Vision (CV), Computational Fluid Dynamics (CFD), and Additive Manufacturing (AM) domains, where performing more than one full training run is usually infeasible. Empirical results show that RASDA outperforms ASHA by a factor of up to 1.9 with respect to the runtime. At the same time, the solution quality of final ASHA models is maintained or even surpassed by the implicit batch size scheduling of RASDA. With RASDA, systematic HPO is applied to a terabyte-scale scientific dataset for the first time in the literature, enabling efficient optimization of complex models on massive scientific data. The implementation of RASDA is available on https://github.com/olympiquemarcel/rasda
Submitted 3 December, 2024;
originally announced December 2024.
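The successive-halving idea that the abstract above builds on can be illustrated with a minimal sketch. This is plain synchronous successive halving, not ASHA and not RASDA; the function name `successive_halving`, the `evaluate` callback, and the parameters `min_budget` and `eta` are assumptions introduced here for illustration only.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the configuration that survives all rungs.

    configs    -- candidate hyperparameter configurations (any objects)
    evaluate   -- hypothetical callback: evaluate(config, budget) -> loss (lower is better)
    min_budget -- resource given to every configuration in the first rung
    eta        -- reduction factor: keep the best 1/eta per rung, grow the budget by eta
    """
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Score every surviving configuration at the current low-fidelity budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the most promising fraction and give it more resources next rung.
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]
```

Here the budget could stand for epochs or, as the abstract suggests, for the number of workers; the sketch does not model either choice.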
-
Selective Thermalization, Chiral Excitations, and a Case of Quantum Hair in the Presence of Event Horizons
Authors:
Akhil U Nair,
Rakesh K. Jha,
Prasant Samantray,
Sashideep Gutti
Abstract:
The Unruh effect is a well-understood phenomenon, where one considers a vacuum state of a quantum field in Minkowski spacetime, which appears to be thermally populated for a uniformly accelerating Rindler observer. In this article, we derive a variant of the Unruh effect involving two distinct accelerating observers and aim to address the following questions: (i) Is it possible to selectively thermalize a subset of momentum modes for the case of massless scalar fields, and (ii) Is it possible to excite only the left-handed massless fermions while keeping right-handed fermions in a vacuum state or vice versa? To this end, we consider a Rindler wedge $R_1$ constructed from a class of accelerating observers and another Rindler wedge $R_2$ (with $R_2 \subset R_1$) constructed from another class of accelerating observers such that the wedge $R_2$ is displaced along a null direction with respect to $R_1$ by a parameter $Δ$. By first considering a massless scalar field in the $R_1$ vacuum, we show that if we choose the displacement $Δ$ along one null direction, the positive momentum modes are thermalized, whereas negative momentum modes remain in vacuum (and vice versa if we choose the displacement along the other null direction). We then consider a massless fermionic field in a vacuum state in $R_1$ and show that the reduced state in $R_2$ is such that the left-handed fermions are excited and are thermal for large frequencies. In contrast, the right-handed fermions have negligible particle density and vice versa. We argue that the toy models involving shifted Rindler spacetime may provide insights into the particle excitation aspects of evolving horizons and the possibility of Rindler spacetime having a quantum strand of hair. Additionally, based on our work, we hypothesize that massless fermions underwent selective chiral excitations during the radiation-dominated era of cosmology.
Submitted 3 December, 2024;
originally announced December 2024.
-
Efficient short-wave infrared upconversion by self-sensitized holmium-doped nanoparticles
Authors:
Rakesh Arul,
Zhao Jiang,
Xinjuan Li,
Fiona M. Bell,
Alasdair Tew,
Caterina Ducati,
Akshay Rao,
Zhongzheng Yu
Abstract:
Photon upconversion, combining several low-energy photons to generate one high-energy photon, is of wide interest for biomedical, catalytic and photonic applications. Lanthanide-doped nanoparticles (LnNPs) are a unique type of upconversion nanoconverter, which can realize ultralarge anti-Stokes shifts (>1000 nm) and high photostability, without photo-bleaching and photo-blinking. The excitation wavelength of LnNPs has been limited to the second near-infrared window (1000-1700 nm), mainly sensitized by erbium ions with absorption centered around 1.5 $μ$m. Here, we demonstrate novel self-sensitized holmium (Ho)-doped nanoconverters to further expand the sensitization range to the short-wave infrared at 2 $μ$m and achieve efficient upconversion to 640 nm. We show that this upconversion is a 4-photon conversion process with an underlying energy transfer upconversion mechanism. Via careful control of dopant concentration and shelling, we achieve a relative upconversion-to-downconversion efficiency of up to 15.2%, more than half the theoretical maximum. The placement of the Ho-doped LnNPs into a plasmonic nanocavity device enables large gains in emission intensity (up to 32-fold), due to the dramatic shortening of the emission lifetime of Ho from 29 $μ$s to <1 ns, indicating a high Purcell-enhancement factor of $3\times10^4$. These results open new possibilities at the frontier of short-wave infrared upconversion and the nanoplasmonic enhancement of LnNP emission, with potential applications in detection, theranostics, photonics and optoelectronics.
Submitted 29 November, 2024;
originally announced November 2024.
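The Purcell-enhancement figure quoted in the abstract above is consistent with a simple lifetime ratio. The following is a back-of-the-envelope check only, assuming the factor is estimated as the free-space lifetime $\tau_0$ divided by the in-cavity lifetime $\tau_{\mathrm{cav}}$ (the abstract does not state how the authors computed it); both symbols are introduced here.

```latex
\[
  F \approx \frac{\tau_0}{\tau_{\mathrm{cav}}}
    > \frac{29\,\mu\mathrm{s}}{1\,\mathrm{ns}}
    = \frac{29 \times 10^{-6}\,\mathrm{s}}{1 \times 10^{-9}\,\mathrm{s}}
    = 2.9 \times 10^{4}
    \approx 3 \times 10^{4}.
\]
```

Because the measured lifetime is only bounded above by 1 ns, this ratio is a lower bound under the stated assumption.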
-
Coupled Flow-Thermal Analysis of a Rocket Nozzle with Charring Ablative Thermal Protection System
Authors:
Basit G. Sheikh,
Rakesh Kumar
Abstract:
This study investigates conjugate heat transfer analysis of a rocket nozzle featuring a charring ablative thermal protection system. The study is carried out by weakly coupling a commercial computational fluid dynamics (CFD) flow solver, Fluent, with an in-house material thermal response solver. The two solvers are coupled by exchanging the boundary conditions at the fluid-solid interface using a non-iterative approach. Validation of the numerical framework is performed. A blowing correlation is used to model the pyrolysis gas behavior at the interface. Results highlight significant surface heating and ablation, particularly at the nozzle throat.
Submitted 29 November, 2024;
originally announced November 2024.
-
Hierarchical Information Flow for Generalized Efficient Image Restoration
Authors:
Yawei Li,
Bin Ren,
Jingyun Liang,
Rakesh Ranjan,
Mengyuan Liu,
Nicu Sebe,
Ming-Hsuan Yang,
Luca Benini
Abstract:
While vision transformers show promise in numerous image restoration (IR) tasks, the challenge remains in efficiently generalizing and scaling up a model for multiple IR tasks. To strike a balance between efficiency and model capacity for a generalized transformer-based IR method, we propose a hierarchical information flow mechanism for image restoration, dubbed Hi-IR, which progressively propagates information among pixels in a bottom-up manner. Hi-IR constructs a hierarchical information tree representing the degraded image across three levels. Each level encapsulates different types of information, with higher levels encompassing broader objects and concepts and lower levels focusing on local details. Moreover, the hierarchical tree architecture removes long-range self-attention, improves the computational efficiency and memory utilization, thus preparing it for effective model scaling. Based on that, we explore model scaling to improve our method's capabilities, which is expected to positively impact IR in large-scale training settings. Extensive experimental results show that Hi-IR achieves state-of-the-art performance in seven common image restoration tasks, affirming its effectiveness and generalizability.
Submitted 27 November, 2024;
originally announced November 2024.
-
Compact finite-difference scheme for some Sobolev type equations with Dirichlet boundary conditions
Authors:
Lavanya V Salian,
Samala Rathan,
Rakesh Kumar
Abstract:
This study aims to construct a stable, high-order compact finite difference method for solving Sobolev-type equations with Dirichlet boundary conditions in one space dimension. Approximating the higher-order mixed derivatives in some specific Sobolev-type equations requires larger stencils. One can approximate such derivatives on compact stencils, which are higher-order accurate and take less stencil information but are implicit and sparse. Spatial derivatives in this work are approximated using the sixth-order compact finite difference method (Compact6), while temporal derivatives are handled with the explicit forward Euler difference scheme. We examine the accuracy and convergence behavior of the proposed scheme. Using the von Neumann stability analysis, we establish $L_2$-stability theory for the linear case. We derive conditions under which fully discrete schemes are stable. Also, the amplification factor $\mathcal{C}(θ)$ is analyzed to ensure the decay property over time. Real parts of $\mathcal{C}(θ)$ lying on the negative real axis confirm the exponential decay of the solution. A series of numerical experiments was performed to verify the effectiveness of the proposed scheme. These tests include both one-dimensional and two-dimensional cases of advection-free and advection-diffusion flows. They also cover applications to the equal width equation, such as the propagation of a single solitary wave, interactions between two and three solitary waves, undular bore formation, and the Benjamin-Bona-Mahony-Burgers equation.
Submitted 3 June, 2025; v1 submitted 27 November, 2024;
originally announced November 2024.
-
Nonlocal elliptic equations involving logarithmic Laplacian: Existence, non-existence and uniqueness results
Authors:
Rakesh Arora,
Jacques Giacomoni,
Arshi Vaishnavi
Abstract:
In this work, we study the existence, non-existence, and uniqueness results for nonlocal elliptic equations involving the logarithmic Laplacian and subcritical, critical, and supercritical logarithmic nonlinearities. The Pohožaev identity and a Díaz-Saa type inequality are proved, which are of independent interest and can be applied to a larger class of problems. Depending upon the growth of nonlinearities and regularity of the weight function, we study the small-order asymptotics of nonlocal weighted elliptic equations involving the fractional Laplacian of order $2s$. We show that the least-energy solutions of a weighted nonlocal problem with superlinear or sublinear growth converge to a nontrivial nonnegative least-energy solution of a Brézis-Nirenberg type and a logistic-type limiting problem, respectively, involving the logarithmic Laplacian.
Submitted 26 April, 2025; v1 submitted 24 November, 2024;
originally announced November 2024.
-
Baryogenesis from a Majorana Fermion Coupled to Quarks
Authors:
Shrihari Gopalakrishna,
Rakesh Tibrewala
Abstract:
In the theory with a Majorana fermion ($X$) coupled to quark-like fermions ($Q$) via a dimension-six four-fermion vector-vector interaction, we have computed in an earlier work the baryon asymmetry generated in the decay and scattering processes of the $X$ with $Q$. In this work we consider such processes in the expanding early Universe, set up the Boltzmann equations governing the $X$ and net baryon number densities, and numerically solve them at example benchmark points, taking the thermally averaged decay and scattering rates and their temperature dependence from the earlier study. We find that, starting from a baryon-symmetric Universe at early times, the presently observed baryon asymmetry of the Universe (BAU) can be explained in this theory over a wide range of mass scales, $M_χ \in (10^4, 10^{16})$ GeV, for appropriately chosen couplings. We find that scattering processes play a crucial role in generating the baryon asymmetry in this theory. We present our results in a general manner that should be useful not just in our theory, but also in other related theories that share the essential ingredients. Our results should help guide promising ways to probe such new physics in terrestrial experiments. For instance, in regions of parameter space that yield the observed BAU, we present the rate for neutron-antineutron oscillation and discuss the prospects for observing this in upcoming experiments.
Submitted 14 April, 2025; v1 submitted 20 November, 2024;
originally announced November 2024.
-
Label Sharing Incremental Learning Framework for Independent Multi-Label Segmentation Tasks
Authors:
Deepa Anand,
Bipul Das,
Vyshnav Dangeti,
Antony Jerald,
Rakesh Mullick,
Uday Patil,
Pakhi Sharma,
Prasad Sudhakar
Abstract:
In a setting where segmentation models have to be built for multiple datasets, each with its own corresponding label set, a straightforward way is to learn one model for every dataset and its labels. Alternatively, multi-task architectures with shared encoders and multiple segmentation heads, or shared weights with compound labels, can also be used. This work proposes a novel label sharing framework where a shared common label space is constructed and each of the individual label sets is systematically mapped to the common labels. This transforms multiple datasets with disparate label sets into a single large dataset with shared labels, and therefore all the segmentation tasks can be addressed by learning a single model. This eliminates the need for task-specific adaptations in network architectures and also results in parameter- and data-efficient models. Furthermore, the label sharing framework is naturally amenable to incremental learning, where segmentations for new datasets can be easily learnt. We experimentally validate our method on various medical image segmentation datasets, each involving multi-label segmentation. Furthermore, we demonstrate the efficacy of the proposed method in terms of performance and incremental learning ability vis-a-vis alternative methods.
Submitted 17 November, 2024;
originally announced November 2024.
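The label-mapping step described in the abstract above can be pictured with a small sketch. The abstract does not say how the mapping is constructed, so the dataset names, the label ids, and the `LABEL_MAPS` table below are hypothetical placeholders; only the idea of remapping each dataset's labels into one shared label space comes from the text.

```python
# Hypothetical mapping from each dataset's own label ids to shared label ids.
# 0 is treated as background in every dataset (an assumption of this sketch).
LABEL_MAPS = {
    "dataset_a": {0: 0, 1: 1, 2: 2},
    "dataset_b": {0: 0, 5: 1, 7: 2},
}


def to_shared_labels(mask, dataset):
    """Remap a 2D segmentation mask (list of rows) into the shared label space."""
    mapping = LABEL_MAPS[dataset]
    return [[mapping[label] for label in row] for row in mask]


# After remapping, masks from both datasets use the same ids and could be
# pooled into one training set for a single model.
shared_mask = to_shared_labels([[0, 5], [7, 0]], "dataset_b")
```

Under this scheme, adding a new dataset would only require a new entry in the mapping table, which is one way to read the incremental-learning claim.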
-
Spectral sequences in unstable higher homotopy theory and applications to the coniveau filtration
Authors:
Frédéric Déglise,
Rakesh Pawar
Abstract:
With the aim of understanding Morel's result on the $\mathbb{A}^1$-homotopy sheaves over a field, we extend the theory of unstable spectral sequences of Bousfield and Kan to the $\infty$-categorical setting. With this natural extension, parallel to the classical formalism of cohomology theory with supports, we introduce the notion of cohomotopy theory with supports. We extend the Bloch-Ogus-Gabber theorem for cohomology theory with supports to the unstable setting, in order to obtain unstable Gersten (or Cousin) resolutions associated with the coniveau filtration, under suitable assumptions. We apply this theory to motivic homotopy, Nisnevich-local torsors and Artin-Mazur étale homotopy types.
Submitted 15 May, 2025; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Manifestations of the possible thermodynamic origin of water's anomalies in non-classical vapor nucleation at negative pressures
Authors:
Yuvraj Singh,
Mantu Santra,
Rakesh S. Singh
Abstract:
Over the years, various scenarios -- such as the stability-limit conjecture (SLC), two critical point (TCP), critical point-free (CPF), and singularity-free (SF) -- have been proposed to explain the thermodynamic origin of supercooled water's anomalies. However, direct experimental validation is challenging due to the rapid phase transition from metastable water. In this study, we explored whether the phase transition pathways from metastable water provide insight into the thermodynamic origin of these anomalies. Using a classical density functional theory approach with realistic theoretical water models, we examined how different thermodynamic scenarios influence vapor nucleation kinetics at negative pressures. Our findings show significant variations in nucleation kinetics and mechanism during both isobaric and isochoric cooling. In the TCP scenario, the nucleation barrier increases steadily during isobaric cooling, with a slight decrease near the Widom line at lower temperatures (Ts). In contrast, the SF scenario shows a monotonic increase in the nucleation barrier. For the CPF scenario, we observed non-classical mechanisms, such as wetting-mediated nucleation (where the growing vapor nucleus is wetted by the intermediate low-density liquid phase) and the Ostwald step rule at low temperatures. Isochoric cooling pathways also revealed notable differences in T-dependent nucleation barrier trends between the TCP and CPF scenarios. Overall, this study underscores the importance of analyzing phase transition kinetics and mechanism to understand the precise thermodynamic origin of supercooled water's anomalies.
Submitted 8 November, 2024;
originally announced November 2024.
-
GANESH: Generalizable NeRF for Lensless Imaging
Authors:
Rakesh Raj Madavan,
Akshat Kaimal,
Badhrinarayanan K V,
Vinayak Gupta,
Rohit Choudhary,
Chandrakala Shanmuganathan,
Kaushik Mitra
Abstract:
Lensless imaging offers a significant opportunity to develop ultra-compact cameras by removing the conventional bulky lens system. However, without a focusing element, the sensor's output is no longer a direct image but a complex multiplexed scene representation. Traditional methods have attempted to address this challenge by employing learnable inversions and refinement models, but these methods are primarily designed for 2D reconstruction and do not generalize well to 3D reconstruction. We introduce GANESH, a novel framework designed to enable simultaneous refinement and novel view synthesis from multi-view lensless images. Unlike existing methods that require scene-specific training, our approach supports on-the-fly inference without retraining on each scene. Moreover, our framework allows us to tune our model to specific scenes, enhancing the rendering and refinement quality. To facilitate research in this area, we also present the first multi-view lensless dataset, LenslessScenes. Extensive experiments demonstrate that our method outperforms current approaches in reconstruction accuracy and refinement quality. Code and video results are available at https://rakesh-123-cryp.github.io/Rakesh.github.io/
Submitted 7 November, 2024;
originally announced November 2024.
-
DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers
Authors:
Rakesh R. Menon,
Shashank Srivastava
Abstract:
Despite their high predictive accuracies, current machine learning systems often exhibit systematic biases stemming from annotation artifacts or insufficient support for certain classes in the dataset. Recent work proposes automatic methods for identifying and explaining systematic biases using keywords. We introduce DISCERN, a framework for interpreting systematic biases in text classifiers using language explanations. DISCERN iteratively generates precise natural language descriptions of systematic errors by employing an interactive loop between two large language models. Finally, we use the descriptions to improve classifiers by augmenting classifier training sets with synthetically generated instances or annotated examples via active learning. On three text-classification datasets, we demonstrate that language explanations from our framework induce consistent performance improvements that go beyond what is achievable with exemplars of systematic bias. Finally, in human evaluations, we show that users can interpret systematic biases more effectively (by over 25% relative) and efficiently when described through language explanations as opposed to cluster exemplars.
Submitted 29 October, 2024;
originally announced October 2024.
-
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
Authors:
Ankit Singh Rawat,
Veeranjaneyulu Sadhanala,
Afshin Rostamizadeh,
Ayan Chakrabarti,
Wittawat Jitkrittum,
Vladimir Feinberg,
Seungyeon Kim,
Hrayr Harutyunyan,
Nikunj Saunshi,
Zachary Nado,
Rakesh Shivanna,
Sashank J. Reddi,
Aditya Krishna Menon,
Rohan Anil,
Sanjiv Kumar
Abstract:
A primary challenge in large language model (LLM) development is their onerous pre-training cost. Typically, such pre-training involves optimizing a self-supervised objective (such as next-token prediction) over a large corpus. This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by suitably leveraging a small language model (SLM). In particular, this paradigm relies on an SLM to both (1) provide soft labels as additional training supervision, and (2) select a small subset of valuable ("informative" and "hard") training examples. Put together, this enables an effective transfer of the SLM's predictive distribution to the LLM, while prioritizing specific regions of the training data distribution. Empirically, this leads to reduced LLM training time compared to standard training, while improving the overall quality. Theoretically, we develop a statistical framework to systematically study the utility of SLMs in enabling efficient training of high-quality LLMs. In particular, our framework characterizes how the SLM's seemingly low-quality supervision can enhance the training of a much more capable LLM. Furthermore, it also highlights the need for an adaptive utilization of such supervision, by striking a balance between the bias and variance introduced by the SLM-provided soft labels. We corroborate our theoretical framework by improving the pre-training of an LLM with 2.8B parameters by utilizing a smaller LM with 1.5B parameters on the Pile dataset.
Submitted 24 October, 2024;
originally announced October 2024.
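The use of SLM-provided soft labels as additional supervision, as described in the abstract above, resembles a standard knowledge-distillation objective. The sketch below is a generic version of that objective, not the paper's exact loss; the function names and the mixing weight `alpha` are assumptions introduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted); zero-probability targets are skipped."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def distillation_loss(llm_logits, slm_probs, true_index, alpha=0.5):
    """Mix the usual next-token loss with a loss against the SLM's soft labels.

    alpha = 0 recovers standard training; alpha = 1 trains on soft labels only.
    """
    llm_probs = softmax(llm_logits)
    one_hot = [1.0 if i == true_index else 0.0 for i in range(len(llm_probs))]
    hard_loss = cross_entropy(one_hot, llm_probs)    # ground-truth token
    soft_loss = cross_entropy(slm_probs, llm_probs)  # SLM predictive distribution
    return (1 - alpha) * hard_loss + alpha * soft_loss
```

A fixed `alpha` is the simplest choice; the abstract's remark about adaptive utilization suggests the weight may need to vary, which this sketch does not attempt.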
-
Improving Pinterest Search Relevance Using Large Language Models
Authors:
Han Wang,
Mukuntha Narayanan Sundararaman,
Onur Gungor,
Yu Xu,
Krishna Kamath,
Rakesh Chalasani,
Kurchi Subhra Hazra,
Jinfeng Rao
Abstract:
To improve relevance scoring on Pinterest Search, we integrate Large Language Models (LLMs) into our search relevance model, leveraging carefully designed text representations to predict the relevance of Pins effectively. Our approach uses search queries alongside content representations that include captions extracted from a generative visual language model. These are further enriched with link-based text data, historically high-quality engaged queries, user-curated boards, Pin titles and Pin descriptions, creating robust models for predicting search relevance. We use a semi-supervised learning approach to efficiently scale up the amount of training data, expanding beyond the expensive human labeled data available. By utilizing multilingual LLMs, our system extends training data to include unseen languages and domains, despite initial data and annotator expertise being confined to English. Furthermore, we distill from the LLM-based model into real-time servable model architectures and features. We provide comprehensive offline experimental validation for our proposed techniques and demonstrate the gains achieved through the final deployed system at scale.
Submitted 22 October, 2024;
originally announced October 2024.
-
Search for gravitational waves emitted from SN 2023ixf
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
D. Agarwal,
M. Agathos,
M. Aghaei Abchouyeh,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
T. Akutsu,
S. Albanesi,
R. A. Alfaidi,
A. Al-Jodah,
C. Alléné,
A. Allocca,
et al. (1758 additional authors not shown)
Abstract:
We present the results of a search for gravitational-wave transients associated with core-collapse supernova SN 2023ixf, which was observed in the galaxy Messier 101 via optical emission on 2023 May 19th, during the LIGO-Virgo-KAGRA 15th Engineering Run. We define a five-day on-source window during which an accompanying gravitational-wave signal may have occurred. No gravitational waves have been identified in data when at least two gravitational-wave observatories were operating, which covered $\sim 14\%$ of this five-day window. We report the search detection efficiency for various possible gravitational-wave emission models. Considering the distance to M101 (6.7 Mpc), we derive constraints on the gravitational-wave emission mechanism of core-collapse supernovae across a broad frequency spectrum, ranging from 50 Hz to 2 kHz where we assume the gravitational-wave emission occurred when coincident data are available in the on-source window. Considering an ellipsoid model for a rotating proto-neutron star, our search is sensitive to gravitational-wave energy $1 \times 10^{-4} M_{\odot} c^2$ and luminosity $2.6 \times 10^{-4} M_{\odot} c^2/s$ for a source emitting at 82 Hz. These constraints are around an order of magnitude more stringent than those obtained so far with gravitational-wave data. The constraint on the ellipticity of the proto-neutron star that is formed is as low as 1.08, at frequencies above 1200 Hz, surpassing past results.
Submitted 11 March, 2025; v1 submitted 21 October, 2024;
originally announced October 2024.
-
Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus
Authors:
Raviraj Joshi,
Kanishk Singla,
Anusha Kamath,
Raunak Kalani,
Rakesh Paul,
Utkarsh Vaidya,
Sanjay Singh Chauhan,
Niranjan Wartikar,
Eileen Long
Abstract:
Multilingual LLMs support a variety of languages; however, their performance is suboptimal for low-resource languages. In this work, we emphasize the importance of continued pre-training of multilingual LLMs and the use of translation-based synthetic pre-training corpora for improving LLMs in low-resource languages. We conduct our study in the context of the low-resource Indic language Hindi. We introduce Nemotron-Mini-Hindi 4B, a bilingual SLM supporting both Hindi and English, based on Nemotron-Mini 4B. The model is trained using a mix of real and synthetic Hindi + English tokens, with continuous pre-training performed on 400B tokens. We demonstrate that both the base and instruct models achieve state-of-the-art results on Hindi benchmarks while remaining competitive on English tasks. Additionally, we observe that the continued pre-training approach enhances the model's overall factual accuracy. We perform an ablation study to highlight the impact of Hindi pre-training, showing significant improvements in Hindi chat capabilities and factual accuracy, which cannot be achieved through Hindi alignment alone.
Submitted 21 April, 2025; v1 submitted 18 October, 2024;
originally announced October 2024.
-
Effects of Soft-Domain Transfer and Named Entity Information on Deception Detection
Authors:
Steven Triplett,
Simon Minami,
Rakesh Verma
Abstract:
In the modern age an enormous amount of communication occurs online, and it is difficult to know when something written is genuine or deceitful. There are many reasons for someone to deceive online (e.g., monetary gain, political gain) and detecting this behavior without any physical interaction is a difficult task. Additionally, deception occurs in several text-only domains and it is unclear if these various sources can be leveraged to improve detection. To address this, eight datasets were utilized from various domains to evaluate their effect on classifier performance when combined with transfer learning via intermediate layer concatenation of fine-tuned BERT models. We find improvements in accuracy over the baseline. Furthermore, we evaluate multiple distance measurements between datasets and find that Jensen-Shannon distance correlates moderately with transfer learning performance. Finally, we evaluate the impact on BERT performance of multiple methods that add information to a dataset's text via named entities, and find a notable improvement in accuracy of up to 11.2%.
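As a rough illustration of the Jensen-Shannon distance mentioned in the abstract, the sketch below compares the token distributions of two tiny text collections. The helper names, the whitespace tokenization, and the sample texts are assumptions made for this example; the abstract does not say how the authors represent a dataset as a distribution.

```python
import math
from collections import Counter


def token_distribution(texts, vocab):
    """Relative frequency of each vocabulary token across a list of texts."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts[v] for v in vocab)
    return [counts[v] / total for v in vocab]


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (log base 2)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, (kl(p, m) + kl(q, m)) / 2))


# Hypothetical miniature "datasets" from two domains.
reviews = ["great hotel great staff", "terrible room"]
emails = ["urgent transfer money now", "great offer urgent"]
vocab = sorted({tok for t in reviews + emails for tok in t.lower().split()})

p = token_distribution(reviews, vocab)
q = token_distribution(emails, vocab)
d = js_distance(p, q)
```

With base-2 logarithms the distance lies between 0 (identical distributions) and 1 (disjoint support), which makes values comparable across dataset pairs.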
Submitted 18 October, 2024;
originally announced October 2024.
-
Probing the massive scalar mode in the levitated sensor detector of gravitational wave
Authors:
Rakesh Das,
Anirban Saha
Abstract:
Owing to the mass scale associated with the scalar longitudinal mode signal of gravitational waves predicted by modified theories of gravity, it should propagate at a subluminal speed and with a different frequency compared to the massless tensor mode signals, which move at the speed of light and are present in both standard general relativity and modified theories. This is ensured by the massless and massive dispersion relations obeyed respectively by the tensor and scalar modes of a gravitational wave coming from a given source and thus having the same propagation vector. We show that because of its wider operational frequency band the recently designed levitated sensor detector \cite{Aggarwal} of gravitational waves has a better chance of detecting both the scalar and tensor modes at these different frequencies and thus can provide observational evidence in favour of modified theories of gravity over general relativity. This detector works on the principle of optical trapping \cite{Ashkin_1970} of a dielectric nanosphere sensor \cite{Geraci}. By adjusting the intensity of the optical beam the frequency of the harmonic potential trap can be varied widely so that the nanosphere sensor can undergo distinct resonant transitions induced by the tensor and scalar modes. We demonstrate that the dynamics of the sensor mass obeys a geodesic deviation equation in the proper detector frame and construct a quantum mechanical description of this system in a modified gravity framework to compute the probabilities of resonant transitions in response to incoming gravitational wave signals of both periodic and aperiodic kinds.
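For orientation, the massless and massive dispersion relations referred to above can be written in their standard textbook form. The symbols $ω_T$, $ω_S$ and $m$ (tensor-mode frequency, scalar-mode frequency, and scalar mass scale) are notation assumed here, not taken from the paper:

$$ω_T = c\,k, \qquad ω_S^2 = c^2 k^2 + \frac{m^2 c^4}{\hbar^2}, \qquad v_g^{(S)} = \frac{\partial ω_S}{\partial k} = \frac{c^2 k}{ω_S} < c$$

For the same propagation vector $k$, the scalar mode therefore has a higher frequency than the tensor mode and a group velocity below $c$.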
Submitted 18 October, 2024;
originally announced October 2024.
-
Satellite Streaming Video QoE Prediction: A Real-World Subjective Database and Network-Level Prediction Models
Authors:
Bowen Chen,
Zaixi Shang,
Jae Won Chung,
David Lerner,
Werner Robitza,
Rakesh Rao Ramachandra Rao,
Alexander Raake,
Alan C. Bovik
Abstract:
Demand for streaming services, including satellite, continues to exhibit unprecedented growth. Internet Service Providers find themselves at the crossroads of technological advancements and rising customer expectations. To stay relevant and competitive, these ISPs must ensure their networks deliver optimal video streaming quality, a key determinant of user satisfaction. Towards this end, it is important to have accurate Quality of Experience prediction models in place. However, achieving robust performance by these models requires extensive data sets labeled by subjective opinion scores on videos impaired by diverse playback disruptions. To bridge this data gap, we introduce the LIVE-Viasat Real-World Satellite QoE Database. This database consists of 179 videos recorded from real-world streaming services affected by various authentic distortion patterns. We also conducted a comprehensive subjective study involving 54 participants, who contributed both continuous-time opinion scores and endpoint (retrospective) QoE scores. Our analysis sheds light on various determinants influencing subjective QoE, such as stall events, spatial resolutions, bitrate, and certain network parameters. We demonstrate the usefulness of this unique new resource by evaluating the efficacy of prevalent QoE-prediction models on it. We also created a new model that maps the network parameters to predicted human perception scores, which can be used by ISPs to optimize the video streaming quality of their networks. Our proposed model, which we call SatQA, is able to accurately predict QoE using only network parameters, without any access to pixel data or video-specific metadata, as measured by Spearman's Rank Order Correlation Coefficient (SROCC), Pearson Linear Correlation Coefficient (PLCC), and Root Mean Squared Error (RMSE), indicating high accuracy and reliability.
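The three evaluation metrics named above (SROCC, PLCC, RMSE) can be sketched in plain Python as follows. The function names and the sample scores are hypothetical and only illustrate how predicted QoE might be compared with subjective opinion scores; they are not data from the database.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(x, y):
    """Root mean squared error between predictions and targets."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical predicted scores and subjective opinion scores.
predicted = [1.0, 2.0, 3.0, 4.0]
subjective = [1.0, 4.0, 9.0, 16.0]

srocc = spearman(predicted, subjective)
plcc = pearson(predicted, subjective)
error = rmse(predicted, subjective)
```

SROCC rewards getting the ordering right even when the relationship is nonlinear, while PLCC and RMSE are sensitive to the actual score values, which is why the three are usually reported together.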
Submitted 17 October, 2024;
originally announced October 2024.
-
A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154
Authors:
The LIGO Scientific Collaboration,
the Virgo Collaboration,
the KAGRA Collaboration,
A. G. Abac,
R. Abbott,
I. Abouelfettouh,
F. Acernese,
K. Ackley,
S. Adhicary,
N. Adhikari,
R. X. Adhikari,
V. K. Adkins,
D. Agarwal,
M. Agathos,
M. Aghaei Abchouyeh,
O. D. Aguiar,
I. Aguilar,
L. Aiello,
A. Ain,
P. Ajith,
T. Akutsu,
S. Albanesi,
R. A. Alfaidi,
A. Al-Jodah,
C. Alléné
, et al. (1758 additional authors not shown)
Abstract:
The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50\% (90\%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.
Submitted 21 May, 2025; v1 submitted 11 October, 2024;
originally announced October 2024.
-
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
Authors:
Anvesh Rao Vijjini,
Rakesh R. Menon,
Jiayi Fu,
Shashank Srivastava,
Snigdha Chaturvedi
Abstract:
While much research has explored enhancing the reasoning capabilities of large language models (LLMs) in the last few years, there is a gap in understanding the alignment of these models with social values and norms. We introduce the task of judging social acceptance. Social acceptance requires models to judge and rationalize the acceptability of people's actions in social situations. For example, is it socially acceptable for a neighbor to ask others in the community to keep their pets indoors at night? We find that LLMs' understanding of social acceptance is often misaligned with human consensus. To alleviate this, we introduce SocialGaze, a multi-step prompting framework, in which a language model verbalizes a social situation from multiple perspectives before forming a judgment. Our experiments demonstrate that the SocialGaze approach improves the alignment with human judgments by up to 11 F1 points with the GPT-3.5 model. We also identify biases and correlations in how LLMs assign blame that are related to features such as gender (males are significantly more likely to be judged unfairly) and age (LLMs are more aligned with humans for older narrators).
Submitted 11 October, 2024;
originally announced October 2024.
-
Ultra-narrow linewidth laser across the C-band using polarization-controlled dual-cavity feedback
Authors:
Jeppe H. Surrow,
Simon T. Thomsen,
Rakesh R. Kumar,
Mónica Far Brusatori,
Maria Paula Montes,
Ahan S. Palsole,
Chris Hoede,
Holger N. Klein,
Nicolas Volet
Abstract:
A standard method to reduce the linewidth of semiconductor lasers involves the use of external optical feedback (EOF). However, feedback powers less than 1 % usually trigger coherence collapse (CC), leading to chaotic laser dynamics and linewidth broadening. This paper explores a method to mitigate CC through precise tuning of the feedback polarization depending on the feedback power. We report a semiconductor laser with a sub-100 Hz intrinsic linewidth, achieved via EOF. The laser features a U-shaped cavity with two sampled grating distributed Bragg reflectors (SG-DBRs), enabling broad tunability across a 42 nm wavelength range (1513-1555 nm). By injecting optical feedback into both sides of the laser cavity via an external fiber-based cavity, we reduce the intrinsic linewidth by more than three orders of magnitude, from MHz to sub-kHz across the laser's tuning range. By dynamically tuning the polarization, we demonstrate sub-100 Hz intrinsic linewidths at feedback powers up to 10 %, marking an improvement over prior studies where CC limited performance.
Submitted 17 March, 2025; v1 submitted 11 October, 2024;
originally announced October 2024.
-
MorCode: Face Morphing Attack Generation using Generative Codebooks
Authors:
Aravinda Reddy PN,
Raghavendra Ramachandra,
Sushma Venkatesh,
Krothapalli Sreenivasa Rao,
Pabitra Mitra,
Rakesh Krishna
Abstract:
Face recognition systems (FRS) can be compromised by face morphing attacks, which blend textural and geometric information from multiple facial images. The rapid evolution of generative AI, especially Generative Adversarial Networks (GANs) and diffusion models, in which encoded images are interpolated, has made it possible to generate high-quality face morphing images. In this work, we present \textit{MorCode}, a novel method for automatic face morphing generation, which leverages a contemporary encoder-decoder architecture conditioned on codebook learning to generate high-quality morphing images. Extensive experiments were performed on the newly constructed morphing dataset using five state-of-the-art morphing generation techniques on both digital and print-scan data. The attack potential of the proposed morphing generation technique, \textit{MorCode}, was benchmarked using three different face recognition systems. The obtained results indicate that the proposed \textit{MorCode} has the highest attack potential when compared with five state-of-the-art morphing generation methods on both digital and print-scan data.
Submitted 10 October, 2024;
originally announced October 2024.
-
Rigidity in fixed angle inverse scattering for Riemannian metrics
Authors:
Lauri Oksanen,
Rakesh,
Mikko Salo
Abstract:
The fixed angle inverse scattering problem for a velocity consists in determining a sound speed, or a Riemannian metric up to diffeomorphism, from measurements obtained by probing the medium with a single plane wave. This is a formally determined inverse problem that is open in general. In this article we consider the rigidity question of distinguishing a sound speed or a Riemannian metric from the Euclidean metric. We prove that a general smooth metric that is Euclidean outside a ball can be distinguished from the Euclidean metric. The methods involve distorted plane waves and a combination of geometric, topological and unique continuation arguments.
Submitted 8 June, 2025; v1 submitted 9 October, 2024;
originally announced October 2024.
-
Generalized Landau Yang Theorem
Authors:
T. R. Govindarajan,
Rakesh Tibrewala
Abstract:
The Landau-Yang theorem has been well known for several decades. It prohibits the decay of a massive spin 1 particle to two photons. This emerges simply from the representation theory of the Poincaré group and Bose statistics. It does not require any action or Lagrangian. We generalize this theorem to theories with supersymmetry (SUSY), where it also disallows the decay to two photinos (Majorana fermions) as well as the decay of a zino to a photon and a photino. We prove that if the photon has a mass, howsoever small, this theorem can be evaded. We also show that the supersymmetric selection rule above can be evaded through the Stueckelberg mass term. Further interesting implications are also pointed out.
Submitted 13 March, 2025; v1 submitted 7 October, 2024;
originally announced October 2024.
-
Round Trip Time Estimation Utilizing Cyclic Shift of Uplink Reference Signal
Authors:
Rajeev Gangula,
Tommaso Melodia,
Rakesh Mundlamuri,
Florian Kaltenberger
Abstract:
In the context of fifth-generation new radio (5G NR) technology, it is not possible to directly obtain an absolute uplink (UL) channel impulse response (CIR) at the base station (gNB) from a user equipment (UE). The UL CIR obtained through the sounding reference signal (SRS) is always time-shifted by the timing advance (TA) applied at the UE. The TA is crucial for maintaining UL synchronization, and transmitting SRS without applying the TA will result in interference. In this work, we propose a new method to obtain absolute UL CIR from a UE and then use it to estimate the round trip time (RTT) at the gNB. This method requires enhancing the current 5G protocol stack with a new Zadoff-Chu (ZC) based wideband uplink reference signal (URS). Capitalizing on the cyclic shift property of the URS sequence, we can obtain the RTT with a significant reduction in overhead and latency compared to existing schemes. The proposed method is experimentally validated using a real-world testbed based on OpenAirInterface (OAI).
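The cyclic shift property that the abstract relies on can be illustrated with a small, self-contained sketch: a circularly delayed Zadoff-Chu sequence correlates with the reference only at the lag equal to the delay. The root index, sequence length, and function names below are assumptions chosen for the example; this shows the general ZC property only and is not the proposed URS design or the OAI implementation.

```python
import cmath


def zadoff_chu(root, length):
    """Zadoff-Chu sequence for an odd `length` with gcd(root, length) == 1."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def cyclic_shift(seq, shift):
    """Delay the sequence circularly by `shift` samples."""
    shift %= len(seq)
    return seq[-shift:] + seq[:-shift] if shift else list(seq)


def estimate_shift(received, reference):
    """Lag that maximizes the magnitude of the circular cross-correlation."""
    n = len(reference)

    def correlation(lag):
        return abs(sum(received[(k + lag) % n] * reference[k].conjugate()
                       for k in range(n)))

    return max(range(n), key=correlation)


# Hypothetical parameters: root 25, prime length 139, delay of 17 samples.
reference = zadoff_chu(25, 139)
received = cyclic_shift(reference, 17)
lag = estimate_shift(received, reference)
```

Because the circular autocorrelation of a ZC sequence is ideally zero at every nonzero lag, the peak location identifies the shift unambiguously; dividing the lag by the sampling rate would convert it into a time offset.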
Submitted 6 October, 2024;
originally announced October 2024.
-
ROS2-Based Simulation Framework for Cyberphysical Security Analysis of UAVs
Authors:
Unmesh Patil,
Akshith Gunasekaran,
Rakesh Bobba,
Houssam Abbas
Abstract:
We present a new simulator of Uncrewed Aerial Vehicles (UAVs) that is tailored to the needs of testing cyber-physical security attacks and defenses. Recent investigations into UAV safety have unveiled various attack surfaces and some defense mechanisms. However, due to escalating regulations imposed by aviation authorities on security research on real UAVs, and the substantial costs associated with hardware test-bed configurations, there arises a necessity for a simulator capable of substituting for hardware experiments, and/or narrowing down their scope to the strictly necessary. The study of different attack mechanisms requires specific features in a simulator. We propose a simulation framework based on ROS2, leveraging some of its key advantages, including modularity, replicability, customization, and the utilization of open-source tools such as Gazebo. Our framework has a built-in motion planner, controller, communication models and attack models. We share examples of research use cases that our framework can enable, demonstrating its utility.
Submitted 4 October, 2024;
originally announced October 2024.
-
Exploring $β^+$ decay/EC residues in $^{118}$Sn($^{12}$C,x)$^{130}$Ba reaction
Authors:
Priyanka,
Amanjot,
Rupinderjeet Kaur,
Arshiya Sood,
Malika Kaushik,
Yashraj Jangid,
Rakesh Kumar,
Manoj K. Sharma,
Pushpendra P. Singh
Abstract:
The fusion cross-sections of $^{126}$Ba, $^{127,126,125}$Cs, $^{125,123,122}$Xe and $^{124,123}$I residues, populated via $x$n, p$x$n, $α$$x$n, and $α$p$x$n channels, have been measured in the $^{12}$C+$^{118}$Sn system at E$_{\textrm{lab}}$ $\approx$ 65-85 MeV. For an insight into the formation and decay modes of these residues, experimentally measured cross-sections have been analyzed in the framework of theoretical model codes PACE4 and EMPIRE. The cross-sections of p$x$n ($^{127,126,125}$Cs), $α$$x$n ($^{125}$Xe), and $α$p$x$n ($^{123}$I) channels have been found to be substantially fed from their higher charge isobars via $β^+$ decay and electron capture. In order to deduce the contribution of higher charge isobars in the population of these residues, the independent cross-sections of evaporation residues have been calculated using the prescription of Cavinato $et$ $al.$ and compared with those calculated using theoretical model codes. It has been found that the PACE4 and EMPIRE calculations reproduce the independent cross-sections of evaporation residues fairly well within the experimental uncertainties. Interestingly, it has been found that the $α$-emitting channels, contrary to established findings in reactions involving $α$ cluster projectiles (e.g., $^{12}$C, $^{16}$O, etc.) at low incident energies, display negligible contribution of incomplete fusion (ICF). A comparison of incomplete fusion fraction as a function of entrance channel mass-asymmetry for reactions involving $^{12}$C projectile with nearby targets indicates dissimilar behavior of the $^{118}$Sn target.
Submitted 10 November, 2024; v1 submitted 22 September, 2024;
originally announced October 2024.
-
Boosting the transparency of metallic SrNbO3 through Ti doping
Authors:
Shammi Kumar,
Liang Si,
Karsten Held,
Sankar Dhar,
Rakesh Kumar,
Priya Johari
Abstract:
In recent years, various materials have been developed to reduce the reliance of industries on indium, a primary component of transparent conducting oxides (TCOs) used in the current generation of devices. The leading candidates for indium-free TCOs are strontium vanadates, niobates and molybdates -- strongly correlated perovskite systems that exhibit high intrinsic electrical conductivity and optimal transparency. In this work, we focus on strontium niobate (SNO) thin films and manipulate their optical conductivity by Ti doping, which shifts the plasma frequency and reduces electronic correlations. This allows us to achieve a low resistance for Ti-doped SNO thin films, while maintaining a high transparency in the visible spectrum. We obtain the optimal figure-of-merit (FOM) of 10.3 ($10^{-3}Ω^{-1}$) for $x = 0.3$. This FOM significantly outperforms the optoelectronic capabilities of tin-doped indium oxide (ITO) and several other proposed transparent conductor materials. Our research paves the way for designing the next generation of transparent conductors, guided by insights from density-functional theory (DFT) and dynamical mean-field theory (DMFT).
Submitted 4 October, 2024;
originally announced October 2024.
-
BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale
Authors:
Randy Ardywibowo,
Rakesh Sunki,
Lucy Kuo,
Sankalp Nayak
Abstract:
Information Retrieval (IR) systems used in search and recommendation platforms frequently employ Learning-to-Rank (LTR) models to rank items in response to user queries. These models heavily rely on features derived from user interactions, such as clicks and engagement data. This dependence introduces cold start issues for items lacking user engagement and poses challenges in adapting to non-stationary shifts in user behavior over time. We address both challenges holistically as an online learning problem and propose BayesCNS, a Bayesian approach designed to handle cold start and non-stationary distribution shifts in search systems at scale. BayesCNS achieves this by estimating prior distributions for user-item interactions, which are continuously updated with new user interactions gathered online. This online learning procedure is guided by a ranker model, enabling efficient exploration of relevant items using contextual information provided by the ranker. We successfully deployed BayesCNS in a large-scale search system and demonstrated its efficacy through comprehensive offline and online experiments. Notably, an online A/B experiment showed a 10.60% increase in new item interactions and a 1.05% improvement in overall success metrics over the existing production baseline.
Submitted 9 December, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Impact of White-Box Adversarial Attacks on Convolutional Neural Networks
Authors:
Rakesh Podder,
Sudipto Ghosh
Abstract:
Autonomous vehicle navigation and healthcare diagnostics are among the many fields where the reliability and security of machine learning models for image data are critical. We conduct a comprehensive investigation into the susceptibility of Convolutional Neural Networks (CNNs), which are widely used for image data, to white-box adversarial attacks. We investigate the effects of various sophisticated attacks -- Fast Gradient Sign Method, Basic Iterative Method, Jacobian-based Saliency Map Attack, Carlini & Wagner, Projected Gradient Descent, and DeepFool -- on CNN performance metrics (e.g., loss, accuracy), the differential efficacy of adversarial techniques in increasing error rates, the relationship between perceived image quality metrics (e.g., ERGAS, PSNR, SSIM, and SAM) and classification performance, and the comparative effectiveness of iterative versus single-step attacks. Using the MNIST, CIFAR-10, CIFAR-100, and Fashion-MNIST datasets, we explore the effect of different attacks on the CNNs' performance metrics by varying the hyperparameters of the CNNs. Our study provides insights into the robustness of CNNs against adversarial threats, pinpoints vulnerabilities, and underscores the urgent need for developing robust defense mechanisms to protect CNNs and ensure their trustworthy deployment in real-world scenarios.
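To make the single-step attack in the list above concrete, here is a minimal sketch of the Fast Gradient Sign Method applied to a two-feature logistic regression model, used as a stand-in for a CNN so that the input gradient can be written in closed form. The weights, input, and epsilon are hypothetical values chosen for the example and do not come from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fgsm(weights, bias, x, label, epsilon):
    """Single-step attack: x_adv = x + epsilon * sign(dLoss/dx)."""
    p = predict(weights, bias, x)
    # For binary cross-entropy with a logistic model, dLoss/dx_i = (p - label) * w_i.
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * ((g > 0) - (g < 0)) for xi, g in zip(x, gradient)]


# Hypothetical white-box setting: the attacker knows the weights and bias.
weights, bias = [2.0, -1.0], 0.0
x, label = [0.5, 0.2], 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.5)
clean_score = predict(weights, bias, x)
adversarial_score = predict(weights, bias, x_adv)
```

Each feature moves by at most epsilon, yet the prediction crosses the 0.5 decision threshold. Iterative attacks such as the Basic Iterative Method repeat this step with a smaller step size.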
Submitted 2 October, 2024;
originally announced October 2024.
-
EMGTTL: Transformers-Based Transfer Learning for Classification of ADL using Raw Surface EMG Signals
Authors:
Ashraf Ali Kareemulla,
Rakesh Kumar Sanodiya,
Anish Chand Turlapaty,
Surya Naidu
Abstract:
Surface Electromyography (sEMG) is widely studied for its applications in rehabilitation, prosthetics, robotic arm control, and human-machine interaction. However, classifying Activities of Daily Living (ADL) using sEMG signals often requires extensive feature extraction, which can be time-consuming and energy-intensive. The objective of this study is as follows. Given sEMG datasets, such as electromyography analysis of human activity databases (DB1 and DB4), with multi-channel signals corresponding to ADL, is it possible to determine the ADL categories without explicit feature extraction from sEMG signals? Further, is it possible to learn across the datasets to improve classification performance? A classification framework, named EMGTTL, is developed that uses transformers for classification of ADL, and the performance is enhanced by cross-data transfer learning. The methodology is implemented on EMAHA-DB1 and EMAHA-DB4. Experiments have shown that the transformer architecture achieved 64.47% accuracy for DB1 and 68.82% for DB4. Further, using transfer learning, the accuracy improved to 66.75% for DB1 (pre-trained on DB4) and 71.04% for DB4 (pre-trained on DB1).
Submitted 1 October, 2024;
originally announced October 2024.
-
Static structure factor and the dispersion of the Girvin-MacDonald-Platzman density mode for fractional quantum Hall fluids on the Haldane sphere
Authors:
Rakesh K. Dora,
Ajit C. Balram
Abstract:
We study the neutral excitations in the bulk of the fractional quantum Hall (FQH) fluids generated by acting with the Girvin-MacDonald-Platzman (GMP) density operator on the uniform ground state. Creating these density modulations atop the ground state costs energy, since any density fluctuation in the FQH system has a gap stemming from underlying interparticle interactions. We calculate the GMP density-mode dispersion for many bosonic and fermionic FQH states on the Haldane sphere using the ground state static structure factor computed on the same geometry. Previously, this computation was carried out on the plane. Analogous to the GMP algebra of the lowest Landau level (LLL) projected density operators in the plane, we derive the algebra for the LLL-projected density operators on the sphere, which facilitates the computation of the density-mode dispersion. Contrary to previous results on the plane, we find that, in the long-wavelength limit, the GMP mode accurately describes the dynamics of the primary Jain states.
Submitted 13 March, 2025; v1 submitted 30 September, 2024;
originally announced October 2024.