-
Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch
Authors:
Aneeshan Sain,
Subhajit Maity,
Pinaki Nath Chowdhury,
Subhadeep Koley,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
As sketch research has collectively matured over time, its adaptation for at-mass commercialisation emerges on the immediate horizon. Despite an already mature research endeavour for photos, there is no research on efficient inference specifically designed for sketch data. In this paper, we first demonstrate that existing state-of-the-art efficient light-weight models designed for photos do not work on sketches. We then propose two sketch-specific components which work in a plug-n-play manner on any photo efficient network to adapt it to work on sketch data. We specifically chose fine-grained sketch-based image retrieval (FG-SBIR) as a demonstrator, as it is the most recognised sketch problem with immediate commercial value. Technically speaking, we first propose a cross-modal knowledge distillation network to transfer existing photo efficient networks to be compatible with sketch, which brings down the number of FLOPs and model parameters by 97.96% and 84.89% respectively. We then exploit the abstract trait of sketch to introduce an RL-based canvas selector that dynamically adjusts to the abstraction level, which further cuts down the number of FLOPs by two thirds. The end result is an overall reduction of 99.37% of FLOPs (from 40.18G to 0.254G) when compared with a full network, while retaining the accuracy (33.03% vs 32.77%) -- finally making an efficient network for the sparse sketch data that exhibits even fewer FLOPs than the best photo counterpart.
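The overall reduction quoted in this abstract can be checked from the two FLOP counts it gives. A minimal Python sketch of that arithmetic follows; the variable names are illustrative and do not come from the paper.

```python
# FLOP counts quoted in the abstract, in GFLOPs.
full_gflops = 40.18
efficient_gflops = 0.254

# Relative reduction, expressed as a percentage of the full network's cost.
reduction_pct = 100.0 * (1.0 - efficient_gflops / full_gflops)
print(f"FLOPs reduction: {reduction_pct:.2f}%")
```

The result agrees with the stated 99.37% to two decimal places.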
Submitted 29 May, 2025;
originally announced May 2025.
-
Constraining the Hubble parameter with the 21 cm brightness temperature signal in a universe with inhomogeneities
Authors:
Subhadeep Mukherjee,
Shashank Shekhar Pandey,
A. S. Majumdar
Abstract:
We consider the 21\,cm brightness temperature as a probe of the Hubble tension in the framework of an inhomogeneous cosmological model. Employing Buchert's averaging formalism to study the effect of inhomogeneities on the background evolution, we consider scaling laws for the backreaction and curvature consistent with structure formation simulations. We calibrate the effective matter density through an MCMC analysis of Union 2.1 Supernova Ia data. Our results show that a higher Hubble constant ($\sim73$\,km/s/Mpc) leads to a shallower absorption feature in the brightness temperature versus redshift curve. On the other hand, a lower value ($\sim67$\,km/s/Mpc) produces a remarkable dip in the brightness temperature $T_{21}$. Such a substantial difference is absent in the standard $Λ$CDM model. Our findings indicate that inhomogeneities could significantly affect the 21\,cm signal, and may shed further light on the different measurements of the Hubble constant.
Submitted 28 May, 2025;
originally announced May 2025.
-
Spectral clustering for dependent community Hawkes process models of temporal networks
Authors:
Lingfei Zhao,
Hadeel Soliman,
Kevin S. Xu,
Subhadeep Paul
Abstract:
Temporal networks observed continuously over time through timestamped relational events data are commonly encountered in application settings including online social media communications, financial transactions, and international relations. Temporal networks often exhibit community structure and strong dependence patterns among node pairs. This dependence can be modeled through mutual excitations, where an interaction event from a sender to a receiver node increases the possibility of future events among other node pairs.
We provide statistical results for a class of models that we call dependent community Hawkes (DCH) models, which combine the stochastic block model with mutually exciting Hawkes processes to model community structure and dependence among node pairs, respectively. We derive a non-asymptotic upper bound on the misclustering error of spectral clustering on the event count matrix as a function of the number of nodes and communities, time duration, and the amount of dependence in the model. Our result leverages recent results on bounding an appropriate distance between a multivariate Hawkes process count vector and a Gaussian vector, along with results from random matrix theory. We also propose a DCH model that incorporates only self and reciprocal excitation, along with highly scalable parameter estimation using a Generalized Method of Moments (GMM) estimator that we demonstrate to be consistent for growing network size and time duration.
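As an illustration of spectral clustering on an event count matrix, here is a minimal Python sketch. It is not the authors' implementation. It assumes that the node embedding is formed from the leading singular vectors of the count matrix and that a plain k-means step follows. The helper `simulate_block_counts` is hypothetical and draws independent Poisson counts as a simple stand-in for Hawkes-process event counts.

```python
import numpy as np

def simulate_block_counts(n_per_block, rate_in, rate_out, seed=0):
    """Hypothetical stand-in data: Poisson counts with two communities."""
    rng = np.random.default_rng(seed)
    truth = np.repeat([0, 1], n_per_block)
    # Higher expected count for pairs in the same community.
    rates = np.where(truth[:, None] == truth[None, :], rate_in, rate_out)
    counts = rng.poisson(rates)
    np.fill_diagonal(counts, 0)  # no self-events
    return counts, truth

def spectral_cluster_counts(counts, k, n_iter=50, seed=0):
    """Cluster nodes using the top-k left singular vectors of the count matrix."""
    u, _, _ = np.linalg.svd(counts.astype(float), full_matrices=False)
    emb = u[:, :k]  # one k-dimensional row per node

    # Farthest-point initialisation for k-means (an assumption of this sketch).
    rng = np.random.default_rng(seed)
    centers = [emb[rng.integers(len(emb))]]
    for _ in range(1, k):
        d = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(d)])
    centers = np.array(centers)

    # Lloyd iterations: assign each node to its nearest centre, then update centres.
    for _ in range(n_iter):
        dist = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels

counts, truth = simulate_block_counts(30, rate_in=5.0, rate_out=0.5, seed=1)
labels = spectral_cluster_counts(counts, k=2)
```

The recovered labels match the planted communities only up to a permutation of the label names, so any comparison with `truth` should allow for relabelling.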
Submitted 27 May, 2025;
originally announced May 2025.
-
Universal Semantic Disentangled Privacy-preserving Speech Representation Learning
Authors:
Biel Tura Vecino,
Subhadeep Maji,
Aravind Varier,
Antonio Bonafonte,
Ivan Valles,
Michael Owen,
Leif Rädel,
Grant Strimel,
Seyi Feyisetan,
Roberto Barra Chicote,
Ariya Rastrow,
Constantinos Papayiannis,
Volker Leutnant,
Trevor Wood
Abstract:
The use of audio recordings of human speech to train LLMs poses privacy concerns due to these models' potential to generate outputs that closely resemble artifacts in the training data. In this study, we propose a speaker privacy-preserving representation learning method through the Universal Speech Codec (USC), a computationally efficient encoder-decoder model that disentangles speech into: (i) privacy-preserving semantically rich representations, capturing content and speech paralinguistics, and (ii) residual acoustic and speaker representations that enable high-fidelity reconstruction. Extensive evaluations show that USC's semantic representation preserves content, prosody, and sentiment, while removing potentially identifiable speaker attributes. Combining both representations, USC achieves state-of-the-art speech reconstruction. Additionally, we introduce an evaluation methodology for measuring privacy-preserving properties, aligning with perceptual tests. We compare USC against other codecs in the literature and demonstrate its effectiveness on privacy-preserving representation learning, illustrating the trade-offs of speaker anonymization, paralinguistics retention and content preservation in the learned semantic representations. Audio samples are shared at https://www.amazon.science/usc-samples.
Submitted 20 May, 2025; v1 submitted 19 May, 2025;
originally announced May 2025.
-
commensurability: a Python package for classifying astronomical orbits based on their toroid volume
Authors:
Subhadeep Sarkar,
Michael S. Petersen
Abstract:
As a star orbits the center of its host galaxy, the trajectory is encompassed within a 3D toroid. The orbit probes all points in this toroid, unless its orbital frequencies exhibit integer ratios (commensurate frequencies), in which case a small sub-volume is traversed. commensurability is a Python package that implements a tessellation-based algorithm for identifying orbital families that satisfy commensurabilities by measuring the toroid volume traversed over orbit integration. Compared to standard orbit classification methods such as frequency analysis, tessellation analysis relies on configuration space properties alone, making classification results more robust to frequency instabilities or limited integration times. The package provides a framework for analyzing phase-space coordinates using tessellation analysis, including a subpackage for the implementation of the general tessellation algorithm. The package is to be used with a galactic dynamics library; it currently supports AGAMA, gala, and galpy.
Submitted 20 May, 2025; v1 submitted 15 May, 2025;
originally announced May 2025.
-
An Optimized Evacuation Plan for an Active-Shooter Situation Constrained by Network Capacity
Authors:
Joseph Lavalle-Rivera,
Aniirudh Ramesh,
Subhadeep Chakraborty
Abstract:
A total of more than 3400 public shootings have occurred in the United States between 2016 and 2022. Among these, 25.1% of them took place in an educational institution, 29.4% at the workplace including office buildings, 19.6% in retail store locations, and 13.4% in restaurants and bars. During these critical scenarios, making the right decisions while evacuating can make the difference between life and death. However, emergency evacuation is intensely stressful, which along with the lack of verifiable real-time information may lead to fatal incorrect decisions. To tackle this problem, we developed a multi-route routing optimization algorithm that determines multiple optimal safe routes for each evacuee while accounting for available capacity along the route, thus reducing the threat of crowding and bottlenecking. Overall, our algorithm reduces the total casualties by 34.16% and 53.3%, compared to our previous routing algorithm without capacity constraints and an expert-advised routing strategy respectively. Further, our approach to reduce crowding resulted in an approximate 50% reduction in occupancy in key bottlenecking nodes compared to both of the other evacuation algorithms.
Submitted 29 April, 2025;
originally announced May 2025.
-
Designing Non-Relativistic Spin Splitting in Oxide Perovskites
Authors:
Subhadeep Bandyopadhyay,
Silvia Picozzi,
Sayantika Bhowal
Abstract:
We investigate the role of atomic distortions in non-relativistic spin splitting in perovskite oxides with Pbnm symmetry. Using LaMnO3 as a representative material, we analyze its non-relativistic spin splitting through a combined phonon and multipolar analysis. Our study provides key insights into how structural distortions and magnetic ordering drive ferroically ordered magnetic multipoles, which, in turn, give rise to non-relativistic spin splitting. Based on these findings, we propose three strategies for engineering non-relativistic spin splitting: modifying the A-site cation size, strain engineering, and electric field control in superlattice structures. Our work establishes a framework for designing non-relativistic spin splitting in the Brillouin zone of oxide perovskites.
Submitted 21 March, 2025;
originally announced March 2025.
-
SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models
Authors:
Subhadeep Koley,
Tapas Kumar Dutta,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
While foundation models have revolutionised computer vision, their effectiveness for sketch understanding remains limited by the unique challenges of abstract, sparse visual inputs. Through systematic analysis, we uncover two fundamental limitations: Stable Diffusion (SD) struggles to extract meaningful features from abstract sketches (unlike its success with photos), and exhibits a pronounced frequency-domain bias that suppresses essential low-frequency components needed for sketch understanding. Rather than costly retraining, we address these limitations by strategically combining SD with CLIP, whose strong semantic understanding naturally compensates for SD's spatial-frequency biases. By dynamically injecting CLIP features into SD's denoising process and adaptively aggregating features across semantic levels, our method achieves state-of-the-art performance in sketch retrieval (+3.35%), recognition (+1.06%), segmentation (+29.42%), and correspondence learning (+21.22%), demonstrating the first truly universal sketch feature representation in the era of foundation models.
Submitted 18 March, 2025;
originally announced March 2025.
-
Magnetodielectric Properties in Two Dimensional Magnetic Insulators
Authors:
Koushik Dey,
Hasina Khatun,
Anudeepa Ghosh,
Soumik Das,
Bikash Das,
Subhadeep Datta
Abstract:
Magnetodielectric (MD) materials are important for their ability to enable spin-charge conversion and magnetic field control of electric polarization and vice versa. Among these, two-dimensional (2D) van der Waals (vdW) magnetic materials are of particular interest due to the presence of magnetic anisotropy (MA) originating from the interaction between the magnetic moments and the crystal field. Also, these materials exhibit a high degree of stability in the long-range spin order and may be described using suitable spin Hamiltonians of the Heisenberg, XY, or Ising type. Recent reports have suggested effective interactions between magnetization and electric polarization in 2D magnets. However, MD coupling studies on layered magnetic materials are still few. This review covers the fundamentals of magnetodielectric coupling by explaining related key terms. It includes the necessary conditions for having this coupling and sheds light on the possible physical mechanisms behind it, starting from phenomenological descriptions. In addition, this review classifies 2D magnetic materials into several categories so as to cover every class of materials, and summarizes recent advancements in some pioneering 2D magnetodielectric materials. Finally, the review provides possible research directions for enhancing magnetodielectric coupling in these materials and mentions the possibilities for future developments.
Submitted 25 February, 2025;
originally announced February 2025.
-
Event-by-event fluctuations of mean transverse momentum in proton-proton collisions at $\sqrt{s}$ = 13 TeV with PYTHIA8 and HERWIG7 models
Authors:
Subhadeep Roy,
Tanu Gahlaut,
Sadhana Dash
Abstract:
Estimations of event-by-event mean transverse momentum ($\langle p_{\rm T} \rangle$) fluctuations are reported in terms of the integral correlator, $\langle Δp_{\rm T} Δp_{\rm T}\rangle$, and the skewness of event-wise $\langle p_{\rm T} \rangle$ distribution in proton$-$proton (pp) collisions at $\sqrt{s}=13$ TeV with the Monte Carlo event generators PYTHIA8 and HERWIG7. The final-state charged particles with transverse momentum ($p_{\rm T}$) and pseudorapidity ($η$) ranges $0.15 \leq p_{\rm T}\leq 2.0$ GeV/$c$ and $|η| \leq 0.8$ were considered for the investigation. The correlator, $\langle Δp_{\rm T} Δp_{\rm T}\rangle$, is observed to follow distinct decreasing trends with average charged particle multiplicity ($\langle N_{\rm ch} \rangle$) for the models. Furthermore, both models yield positive finite skewness in low-multiplicity events. Fluctuations are additionally studied using the transverse spherocity estimator ($S_{\rm 0}$) to understand the relative contributions of hard scattering (jets) and other soft processes to the observed fluctuations. Comparing model predictions for $\langle p_{\rm T} \rangle$ fluctuations provides valuable insight into the sensitivity of these fluctuations to hadronization and parton shower models. This is essential for a reliable interpretation of the fluctuation dynamics in pp collisions. Moreover, such comparisons would help to establish a crucial baseline for identifying and studying non-trivial fluctuations in heavy-ion collisions.
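For readers unfamiliar with the integral correlator, the following Python sketch computes $\langle Δp_{\rm T} Δp_{\rm T}\rangle$ under one common convention: the per-event average over distinct particle pairs of the product of deviations from an inclusive mean, then averaged over events. The exact definition used in the paper may differ. The choice of reference mean, the function name `pt_correlator` and the toy input are assumptions of this sketch.

```python
import numpy as np

def pt_correlator(events):
    """Event-averaged two-particle correlator <dpT dpT> (one common convention)."""
    # Events need at least two particles to form a pair.
    events = [np.asarray(e, dtype=float) for e in events if len(e) >= 2]
    # Assumption: the reference mean is taken over all particles in all kept events.
    inclusive_mean = np.concatenate(events).mean()
    per_event = []
    for pt in events:
        d = pt - inclusive_mean
        n = len(pt)
        # Sum over ordered pairs i != j of d_i * d_j equals (sum d)^2 - sum d^2.
        pair_sum = d.sum() ** 2 - np.sum(d ** 2)
        per_event.append(pair_sum / (n * (n - 1)))
    return float(np.mean(per_event))

# Toy input: two events with two particles each (p_T in GeV/c).
toy_value = pt_correlator([[1.0, 2.0], [1.0, 3.0]])
```

Using the identity in the comment avoids an explicit double loop over pairs, which matters for high-multiplicity events.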
Submitted 21 April, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
Exploring the BSM parameter space with Neural Network aided Simulation-Based Inference
Authors:
Atrideb Chatterjee,
Arghya Choudhury,
Sourav Mitra,
Arpita Mondal,
Subhadeep Mondal
Abstract:
Some of the issues that make sampling parameter spaces of various beyond the Standard Model (BSM) scenarios computationally expensive are the high dimensionality of the input parameter space, complex likelihoods, and stringent experimental constraints. In this work, we explore likelihood-free approaches, leveraging neural network-aided Simulation-Based Inference (SBI) to alleviate this issue. We focus on three amortized SBI methods: Neural Posterior Estimation (NPE), Neural Likelihood Estimation (NLE), and Neural Ratio Estimation (NRE) and perform a comparative analysis through the validation test known as the \textit{ Test of Accuracy with Random Points} (TARP), as well as through posterior sample efficiency and computational time. As an example, we focus on the scalar sector of the phenomenological minimal supersymmetric SM (pMSSM) and observe that the NPE method outperforms the others and generates correct posterior distributions of the parameters with a minimal number of samples. The efficacy of this framework will be more evident with additional experimental data, especially for high dimensional parameter space.
Submitted 17 February, 2025;
originally announced February 2025.
-
Normalizing Flow-Assisted Nested Sampling on Type-II Seesaw Model
Authors:
Rajneil Baruah,
Subhadeep Mondal,
Sunando Kumar Patra,
Satyajit Roy
Abstract:
We propose a novel technique for sampling particle physics model parameter space. The main sampling method applied is Nested Sampling (NS), which is boosted by the application of multiple Machine Learning (ML) networks, e.g., Self-Normalizing Network (SNN) and Normalizing Flow (specifically RealNVP). We apply this to the Type-II Seesaw model to test the efficacy of the algorithm. We present the results of our detailed Bayesian exploration of the model parameter space subjected to theoretical constraints and experimental data corresponding to the 125 GeV Higgs boson, $ρ$-parameter, and the oblique parameters. All associated data, figures, and trained ML models can be found here: https://github.com/sunandopatra/MLNS-T2SS
Submitted 6 February, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
SketchYourSeg: Mask-Free Subjective Image Segmentation via Freehand Sketches
Authors:
Subhadeep Koley,
Viswanatha Reddy Gajjala,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Tao Xiang,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
We introduce SketchYourSeg, a novel framework that establishes freehand sketches as a powerful query modality for subjective image segmentation across entire galleries through a single exemplar sketch. Unlike text prompts that struggle with spatial specificity or interactive methods confined to single-image operations, sketches naturally combine semantic intent with structural precision. This unique dual encoding enables precise visual disambiguation for segmentation tasks where text descriptions would be cumbersome or ambiguous -- such as distinguishing between visually similar instances, specifying exact part boundaries, or indicating spatial relationships in composed concepts. Our approach addresses three fundamental challenges: (i) eliminating the need for pixel-perfect annotation masks during training with a mask-free framework; (ii) creating a synergistic relationship between sketch-based image retrieval (SBIR) models and foundation models (CLIP/DINOv2) where the former provides training signals while the latter generates masks; and (iii) enabling multi-granular segmentation capabilities through purpose-made sketch augmentation strategies. Our extensive evaluations demonstrate superior performance over existing approaches across diverse benchmarks, establishing a new paradigm for user-guided image segmentation that balances precision with efficiency.
Submitted 17 March, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
Dimensional Crossover and Emergence of Novel Phases in Puckered PdSe$_2$ under Pressure
Authors:
Tanima Kundu,
Soumik Das,
Koushik Dey,
Boby Joseph,
Mainak Palit,
Sanjoy Kr Mahatha,
Kapildeb Dolui,
Subhadeep Datta
Abstract:
We investigate the pressure-driven structural and electronic evolution of PdSe\(_2\) using powder X-ray diffraction, Raman spectroscopy, and first-principles calculations. Beyond 2.3 GPa, suppression of the Jahn-Teller distortion induces in-plane lattice expansion and metallization. Around 4.8 GPa, the interlayer \(d_{z^2}-π^*\) orbital hybridization drives the dimensional crossover, facilitating the transformation from the 2D distorted to a 3D undistorted pyrite phase. Above 9 GPa, a novel phase emerges, characterized by octahedral distortions in the $d$ orbitals of Pd. Structural analysis suggests the presence of marcasite (\(Pnnm\)) or arsenopyrite (\(P2_1/c\)) phase with orthorhombic and monoclinic configurations, respectively. Furthermore, the observed phonon anomaly and electronic structure modifications, including the emergence of flat bands in the high-pressure phases, elucidate the fundamental mechanisms underlying the previously reported exotic superconductivity with an enhanced critical temperature. These results highlight the pivotal role of dimensional crossover and structural transitions in tuning the electronic properties of puckered materials, providing pathways for novel functionalities.
Submitted 22 January, 2025;
originally announced January 2025.
-
Neutral Atoms in Optical Tweezers as Messenger Qubits for Scaling up a Trapped Ion Quantum Computer
Authors:
Svetlana Kotochigova,
Subhadeep Gupta,
Boris Blinov
Abstract:
We propose to combine neutral atom and trapped ion qubits in one scalable modular architecture that uses shuttling of individual neutral atoms in optical tweezers to realize atomic interconnects between trapped ion quantum registers. These interconnects are deterministic, and thus may be performed on-demand. The proposed protocol is as follows: a tweezer-trapped neutral atom qubit is brought close to a trapped ion in an ion chain serving as a module of a larger quantum computer, and an entangling gate is performed between the two qubits. Then the neutral atom is quickly moved to another, nearby trapped ion chain in the same modular ion trap and entangled with an ion in that chain, thus entangling the two separate ion chains. The optical dipole potential of the tweezer beam for the neutral atom does not measurably affect the trapped ions, while the RF ion trap does not affect the neutral atom. With realistic tweezer trap parameters, the neutral atom can be moved over millimeter scale distance in a few tens of microseconds, thus enabling a remote entanglement generation rate of over 10^3/s even with very modest assumptions for the atom-ion quantum gate speed, and possibly up to 10^4/s, which is two orders of magnitude higher than the current state-of-the-art with photonic interconnects.
Submitted 7 January, 2025;
originally announced January 2025.
-
Decoding active force fluctuations from spatial trajectories of active systems
Authors:
Anisha Majhi,
Biswajit Das,
Subhadeep Gupta,
Anand Dev Ranjan,
Amirul Islam Mallick,
Shuvojit Paul,
Ayan Banerjee
Abstract:
Mesoscopic active systems exhibit various unique behaviours - absent in passive systems - due to the forces generated by their constituents by converting their available free energies. However, estimating these forces - which are also stochastic and remain intertwined with the thermal noise - is especially non-trivial. Here, we introduce a technique to extract such fluctuating active forces acting on a passive particle immersed in an active bath with high statistical accuracy by filtering out the related thermal noise. We first test the efficacy of our method under numerical scenarios with different types of activity, and then apply it to the experimental trajectories of a microscopic particle (optically) trapped inside an active bath consisting of motile \textit{E. coli} bacteria. We believe that our simple yet powerful approach, which appears agnostic to the nature of the active force, should enable accurate measurement of force dynamics in living matter and potentially allow direct but reliable estimation of key thermodynamic parameters such as heat, work, and entropy production.
Submitted 2 May, 2025; v1 submitted 4 January, 2025;
originally announced January 2025.
-
Magnon-Phonon Coupling in Layered Antiferromagnet
Authors:
Somsubhra Ghosh,
Mainak Palit,
Sujan Maity,
Subhadeep Datta
Abstract:
We present a fully analytical model of hybridization between magnons and phonons observed experimentally in magneto-Raman scattering in van der Waals (vdW) antiferromagnets (AFM). Here, the representative material, FePS3, has been shown to be a quasi-two-dimensional Ising antiferromagnet, with additional features of spin-phonon coupling in the Raman spectra emerging below the Néel temperature (TN) of approximately 120 K. Using magneto-Raman spectroscopy as an optical probe of magnetic structure, we show that one of these Raman-active modes in the magnetically ordered state is a magnon with a frequency of 3.7 THz (~ 122 cm-1). In addition, one magnon band and three phonon bands are coupled via the magneto-elastic coupling, as evidenced by anti-crossing in the complete spectra. We consider a simple model involving only in-plane nearest neighbor exchange couplings (designed to give rise to a similar magnetic structure) and perpendicular anisotropy in the presence of an out-of-plane magnetic field. Exact diagonalization of the Hamiltonian leads to energy bands which show that the interaction term gives rise to avoided crossings between the hybridized magnon and phonon branches. Realizing magnon-phonon coupling in two-dimensional (2D) AFMs is important for the verification of the theoretical predictions on exotic quantum transport phenomena like spin-caloritronics, topological magnonics, etc.
Submitted 27 December, 2024;
originally announced December 2024.
-
Heterogeneous transfer learning for high dimensional regression with feature mismatch
Authors:
Jae Ho Chang,
Massimiliano Russo,
Subhadeep Paul
Abstract:
We consider the problem of transferring knowledge from a source, or proxy, domain to a new target domain for learning a high-dimensional regression model with possibly different features. Recently, the statistical properties of homogeneous transfer learning have been investigated. However, most homogeneous transfer and multi-task learning methods assume that the target and proxy domains have the same feature space, limiting their practical applicability. In applications, target and proxy feature spaces are frequently inherently different, for example, due to the inability to measure some variables in the target data-poor environments. Conversely, existing heterogeneous transfer learning methods do not provide statistical error guarantees, limiting their utility for scientific discovery. We propose a two-stage method that involves learning the relationship between the missing and observed features through a projection step in the proxy data and then solving a joint penalized regression optimization problem in the target data. We develop an upper bound on the method's parameter estimation risk and prediction risk, assuming that the proxy and the target domain parameters are sparsely different. Our results elucidate how estimation and prediction error depend on the complexity of the model, sample size, the extent of overlap, and correlation between matched and mismatched features.
Submitted 23 December, 2024;
originally announced December 2024.
-
Terrestrial Very-Long-Baseline Atom Interferometry: Summary of the Second Workshop
Authors:
Adam Abdalla,
Mahiro Abe,
Sven Abend,
Mouine Abidi,
Monika Aidelsburger,
Ashkan Alibabaei,
Baptiste Allard,
John Antoniadis,
Gianluigi Arduini,
Nadja Augst,
Philippos Balamatsias,
Antun Balaz,
Hannah Banks,
Rachel L. Barcklay,
Michele Barone,
Michele Barsanti,
Mark G. Bason,
Angelo Bassi,
Jean-Baptiste Bayle,
Charles F. A. Baynham,
Quentin Beaufils,
Slyan Beldjoudi,
Aleksandar Belic,
Shayne Bennetts,
Jose Bernabeu
, et al. (285 additional authors not shown)
Abstract:
This summary of the second Terrestrial Very-Long-Baseline Atom Interferometry (TVLBAI) Workshop provides a comprehensive overview of our meeting held in London in April 2024, building on the initial discussions during the inaugural workshop held at CERN in March 2023. Like the summary of the first workshop, this document records a critical milestone for the international atom interferometry community. It documents our concerted efforts to evaluate progress, address emerging challenges, and refine strategic directions for future large-scale atom interferometry projects. Our commitment to collaboration is manifested by the integration of diverse expertise and the coordination of international resources, all aimed at advancing the frontiers of atom interferometry physics and technology, as set out in a Memorandum of Understanding signed by over 50 institutions.
Submitted 19 December, 2024;
originally announced December 2024.
-
Bulk photovoltaic effect in ferroelectric and antiferroelectric phases of antimony sulphoiodide investigated by means of ab-initio simulations
Authors:
Giuseppe Cuono,
Subhadeep Bandyopadhyay,
Andrea Droghetti,
Silvia Picozzi
Abstract:
We employ first-principles calculations to investigate the ferroelectric properties and the bulk photovoltaic effect (BPVE) of antimony sulfur iodide (SbSI). The BPVE enables direct sunlight-to-electricity conversion in homogeneous materials and, in ferroelectric compounds, can be tuned via an electric field controlling the polarization. However, most ferroelectrics are oxides with large band gaps exceeding the energy of visible light, thereby limiting their photovoltaic performance. SbSI, featuring a visible-range band gap, combines remarkable photovoltaic capabilities with a spin-textured band structure, coupling charge and spin degrees of freedom. Our calculations predict ferroelectric and antiferroelectric phases with comparable band gaps but distinct spin textures, relevant for spintronics applications. The BPVE is driven by the linear and circular photogalvanic effects, exhibiting high photoconductivities under visible light. Furthermore, it serves as a diagnostic tool to identify the material phase, with the circular photogalvanic effect reflecting spin texture changes. Thanks to its multifunctional properties, SbSI emerges as a promising candidate for solar energy conversion and advanced electronics, with potential applications extending to spintronics.
Submitted 12 June, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
AI-driven Inverse Design of Band-Tunable Mechanical Metastructures for Tailored Vibration Mitigation
Authors:
Tanuj Gupta,
Arun Kumar Sharma,
Ankur Dwivedi,
Vivek Gupta,
Subhadeep Sahana,
Suryansh Pathak,
Ashish Awasthi,
Bishakh Bhattacharya
Abstract:
On-demand vibration mitigation in a mechanical system requires the suitable design of multiscale metastructures involving complex unit cells. In this study, the structural details of some interesting motifs, drawn from the world of patterns, are examined and extracted from the mechanical metastructure perspective. Nine interlaced metastructures are fabricated using additive manufacturing, and the corresponding vibration characteristics are studied experimentally and numerically. Further, the band-gap modulation with metallic inserts in the honeycomb interlaced metastructures is also studied. AI-driven inverse design of such complex metastructures with a desired vibration mitigation profile can pave the way for addressing engineering challenges in high-precision manufacturing. The current inverse design methodologies are limited to designing simple periodic structures based on limited variants of unit cells. Therefore, a novel forward analysis model with multi-head FEM-inspired spatial attention (FSA) is proposed to learn the complex geometry of the metastructures and predict the corresponding transmissibility. Subsequently, a multiscale Gaussian self-attention (MGSA) based inverse design model with a Gaussian function for 1D spectrum position encoding is developed to produce a suitable metastructure for the desired vibration transmittance. The proposed AI framework demonstrated outstanding performance corresponding to the expected locally resonant bandgaps in a targeted frequency range.
Submitted 28 February, 2025; v1 submitted 3 December, 2024;
originally announced December 2024.
-
Probing sub-TeV Higgsinos aided by a ML-based top tagger in the context of Trilinear RPV SUSY
Authors:
Rajneil Baruah,
Arghya Choudhury,
Kirtiman Ghosh,
Subhadeep Mondal,
Rameswar Sahu
Abstract:
Probing higgsinos remains a challenge at the LHC owing to their small production cross-sections and the complexity of the decay modes of the nearly mass degenerate higgsino states. The existing limits on higgsino mass are much weaker compared to its bino and wino counterparts. This leaves a large chunk of sub-TeV supersymmetric parameter space unexplored so far. In this work, we explore the possibility of probing higgsino masses in the 400 - 1000 GeV range. We consider a simplified supersymmetric scenario where R-Parity is violated through a baryon number violating trilinear coupling. We adopt a machine learning-based top tagger to tag the boosted top jets originating from higgsinos, and for our collider analysis, we use a BDT classifier to discriminate signal over SM backgrounds. We construct two signal regions characterized by at least one top jet and different multiplicities of $b$-jets and light jets. Combining the statistical significance obtained from the two signal regions, we show that higgsino mass as high as 925 GeV can be probed at the high luminosity LHC.
Submitted 16 December, 2024;
originally announced December 2024.
-
Tailored 1D/2D Van der Waals Heterostructures for Unified Analog and Digital Electronics
Authors:
Bipul Karmakar,
Bikash Das,
Shibnath Mandal,
Rahul Paramanik,
Sujan Maity,
Tanima Kundu,
Soumik Das,
Mainak Palit,
Koushik Dey,
Kapildeb Dolui,
Subhadeep Datta
Abstract:
We report a sequential two-step vapor deposition process for growing mixed-dimensional van der Waals (vdW) materials, specifically Te nanowires (1D) and MoS$_2$ (2D), on a single SiO$_2$ wafer. Our growth technique offers a unique potential pathway to create large-scale, high-quality, defect-free interfaces. The assembly of samples serves a twofold application: first, the as-prepared heterostructures (Te NW/MoS$_2$) provide insights into the atomically thin depletion region of a 1D/2D vdW diode, as revealed by electrical transport measurements and density functional theory-based quantum transport calculations. The charge transfer at the heterointerface is confirmed using Raman spectroscopy and Kelvin probe force microscopy (KPFM). We also observe modulation of the rectification ratio with varying applied gate voltage. Second, the non-hybrid regions on the substrate, consisting of the as-grown individual Te nanowires and MoS$_2$ microstructures, are utilized to fabricate separate p- and n-FETs, respectively. Furthermore, ionic liquid gating helps to realize a low-power CMOS inverter and all basic logic gate operations using a pair of n- and p-type field-effect transistors (FETs) on a Si/SiO$_2$ platform. This approach also demonstrates the potential for unifying diode and CMOS circuits on a single platform, opening opportunities for integrated analog and digital electronics.
Submitted 12 December, 2024;
originally announced December 2024.
-
DreamColour: Controllable Video Colour Editing without Training
Authors:
Chaitat Utintu,
Pinaki Nath Chowdhury,
Aneeshan Sain,
Subhadeep Koley,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
Video colour editing is a crucial task for content creation, yet existing solutions either require painstaking frame-by-frame manipulation or produce unrealistic results with temporal artefacts. We present a practical, training-free framework that makes precise video colour editing accessible through an intuitive interface while maintaining professional-quality output. Our key insight is that by decoupling spatial and temporal aspects of colour editing, we can better align with users' natural workflow -- allowing them to focus on precise colour selection in key frames before automatically propagating changes across time. We achieve this through a novel technical framework that combines: (i) a simple point-and-click interface merging grid-based colour selection with automatic instance segmentation for precise spatial control, (ii) bidirectional colour propagation that leverages inherent video motion patterns, and (iii) motion-aware blending that ensures smooth transitions even with complex object movements. Through extensive evaluation on diverse scenarios, we demonstrate that our approach matches or exceeds state-of-the-art methods while eliminating the need for training or specialized hardware, making professional-quality video colour editing accessible to everyone.
Submitted 6 December, 2024;
originally announced December 2024.
-
Phonon-assisted control of magnonic and electronic band splitting
Authors:
Subhadeep Bandyopadhyay,
Anoop Raj,
Philippe Ghosez,
Sumiran Pujari,
Sayantika Bhowal
Abstract:
We demonstrate theoretically the ability to control non-relativistic magnonic and electronic spin splitting by manipulating phonon modes. Using MnF$_2$ as a representative material, exhibiting non-relativistic spin splitting in its electronic bands, we identify an equivalent $d$-wave splitting in magnon modes of specific handedness. Our study reveals a direct correlation between magnonic and electronic splittings, showing that the energy splitting in both magnon and electronic bands can be tuned by jointly modulating the A$_{2u}$ and A$_{1g}$ phonon modes with frequencies of 8.52 and 9.74 THz, respectively. These findings highlight the intricate interplay between charge, spin, and lattice degrees of freedom in spin-split antiferromagnets, offering new pathways for phonon-driven control in magnonic applications.
Submitted 6 December, 2024;
originally announced December 2024.
-
Electron-Magnon Coupling Mediated Magnetotransport in Antiferromagnetic van der Waals Heterostructure
Authors:
Sujan Maity,
Soumik Das,
Mainak Palit,
Koushik Dey,
Bikash Das,
Tanima Kundu,
Rahul Paramanik,
Binoy Krishna De,
Hemant Singh Kunwar,
Subhadeep Datta
Abstract:
Electron-magnon coupling reveals key insights into the interfacial properties between non-magnetic metals and magnetic insulators, influencing charge transport and spin dynamics. Here, we present temperature-dependent Raman spectroscopy and magneto-transport measurements of few-layer graphene (FLG)/antiferromagnetic FePS$_3$ heterostructures. The magnon mode in FePS$_3$ softens below 40 K, and effective magnon stiffness decreases with cooling. Magnetotransport measurements show that FLG exhibits negative magnetoresistance (MR) in the heterostructure at low fields ($\pm 0.2$ T), persisting up to 100 K; beyond this, MR transitions to positive. Notably, as layer thickness decreases, the coupling strength at the interface reduces, leading to a suppression of negative MR. Additionally, magnetodielectric measurements in the FLG/FePS$_3$/FLG heterostructure show an upturn at temperatures significantly below $T_\text{N}$, suggesting a role for the magnon mode in capacitance, as indicated by hybridization between magnon and phonon bands in pristine FePS$_3$ via magnetoelastic coupling.
Submitted 13 November, 2024;
originally announced November 2024.
-
An MCMC analysis to probe trilinear RPV SUSY scenarios and possible LHC signatures
Authors:
Arghya Choudhury,
Sourav Mitra,
Arpita Mondal,
Subhadeep Mondal
Abstract:
In this article, we probe the trilinear R-parity violating (RPV) supersymmetric (SUSY) scenarios with specific non-zero interactions in the light of neutrino oscillation, Higgs, and flavor observables. We attempt to fit the set of observables using a state-of-the-art Markov Chain Monte Carlo (MCMC) set-up and study its impact on the model parameter space. Our main objective is to constrain the trilinear couplings individually, along with some other SUSY parameters relevant to the observables. We present the constrained parameter regions in the form of marginalized posterior distributions on different two-dimensional parameter planes. We perform our analyses with two different scenarios characterized by our choices for the lightest SUSY particle (LSP), bino, and stop. Our results indicate that the lepton number violating trilinear couplings $λ_{i33}$ ($i$=1,2) and $λ_{j33}^{\prime}$ ($j$=1,2,3) can be at most of the order of $10^{-4}$ or even smaller while $\tanβ$ is restricted to below 15 even when $3σ$ allowed regions are considered. We further comment on the possible LHC signatures of these LSPs focusing on and around the best-fit regions.
Submitted 12 November, 2024;
originally announced November 2024.
-
The co-varying ties between networks and item responses via latent variables
Authors:
Selena Wang,
Plamena Powla,
Tracy Sweet,
Subhadeep Paul
Abstract:
Relationships among teachers are known to influence their teaching-related perceptions. We study whether and how teachers' advising relationships (networks) are related to their perceptions of satisfaction, students, and influence over educational policies, recorded as their responses to a questionnaire (item responses). We propose a novel joint model of network and item responses (JNIRM) with correlated latent variables to understand these co-varying ties. This methodology allows the analyst to test and interpret the dependence between a network and item responses. Using JNIRM, we discover that teachers' advising relationships contribute to their perceptions of satisfaction and students more often than their perceptions of influence over educational policies. In addition, we observe that the complementarity principle applies in certain schools, where teachers tend to seek advice from those who are different from them. JNIRM shows superior parameter estimation and model fit over separately modeling the network and item responses with latent variable models.
Submitted 28 September, 2024;
originally announced September 2024.
-
Quantum signatures of bistability and limit cycle in Kerr-modified cavity magnomechanics
Authors:
Pooja Kumari Gupta,
Subhadeep Chakraborty,
Sampreet Kalita,
Amarendra K. Sarma
Abstract:
We study a Kerr-modified cavity magnomechanical system with a focus on its bistable regime. We identify a distinct parametric condition under which bistability appears, featuring two stable branches and one unstable branch in the middle. Interestingly, our study reveals a unique transition where the upper branch loses its stability under a sufficiently strong drive, giving rise to limit cycle oscillations. Consequently, we report a rich phase diagram consisting of both bistable and periodic solutions and study quantum correlations around them. In the bistable regime, we find that the entanglement reaches different steady-state values, whereas in the unstable regime the entanglement oscillates in time. This study is especially important for understanding quantum entanglement at the different stable and unstable points arising in a Kerr-modified cavity magnomechanical system.
Submitted 22 September, 2024;
originally announced September 2024.
-
On the family discrimination in 331-model
Authors:
Katri Huitu,
Niko Koivunen,
Timo Kärkkäinen,
Subhadeep Mondal
Abstract:
In the so-called 331-models, the gauge anomalies cancel only if there are three generations of fermions. This requires one of the quark generations to be in a different representation than the other two. But which generation is treated differently? In this work we study how the choice of the differently treated generation affects the quark flavour structure and how the discriminated generation can be deduced from experiments. We study a general model based on $β=-1/\sqrt{3}$, which contains exotic quarks with the same electric charges as the SM quarks. We take fully into account the effects of exotic quark mixing with the SM quarks, which is often omitted in the literature. We also pay particular attention to the $125$ GeV Higgs, and show analytically why its flavour-violating couplings between SM quarks are suppressed.
Submitted 19 September, 2024;
originally announced September 2024.
-
Structurally triggered orbital and charge orderings in TlMnO$_3$ and related compounds
Authors:
Subhadeep Bandyopadhyay,
Philippe Ghosez
Abstract:
Rare earth perovskites ($R^{3+}$M$^{3+}$O$_3$), with $e_g^1$ electronic occupation of the M $d$ states, display different types of metal-insulator transition. For manganites (M=Mn), the metal-insulator transition is usually induced by the Jahn-Teller ($JT$) distortions, which stabilize orbital orderings (OO) at Mn sites. Among them, LaMnO$_3$ shows a $C$-type OO and crystallizes with the $Pbnm$ structure. In contrast, TlMnO$_3$ shows a very distinct $G$-type OO with an unusual $P\overline{1}$ structure. Employing first-principles calculations and symmetry-mode analysis, we rationalize the structural and electronic origin of the $G$-type OO in TlMnO$_3$. Going further, we consider nickelates (M=Ni), where the metal-insulator transition is driven by a breathing distortion, which stabilizes the charge ordering (CO) at Ni sites. Interestingly, the different $JT$ and breathing distortions are very similar MO$_6$ octahedral distortions and stem from high-frequency phonon modes of the ideal $Pm\overline{3}m$ structure. Our comparative study reveals that, following a common triggering mechanism, these modes appear in their respective ground states.
Submitted 3 April, 2025; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Do Generalised Classifiers really work on Human Drawn Sketches?
Authors:
Hmrishav Bandyopadhyay,
Pinaki Nath Chowdhury,
Aneeshan Sain,
Subhadeep Koley,
Tao Xiang,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
This paper, for the first time, marries large foundation models with human sketch understanding. We demonstrate what this brings -- a paradigm shift in terms of generalised sketch representation learning (e.g., classification). This generalisation happens on two fronts: (i) generalisation across unknown categories (i.e., open-set), and (ii) generalisation traversing abstraction levels (i.e., good and bad sketches), both being timely challenges that remain unsolved in the sketch literature. Our design is intuitive and centred around transferring the already stellar generalisation ability of CLIP to benefit generalised learning for sketches. We first "condition" the vanilla CLIP model by learning sketch-specific prompts using a novel auxiliary head of raster to vector sketch conversion. This importantly makes CLIP "sketch-aware". We then make CLIP acute to the inherently different sketch abstraction levels. This is achieved by learning a codebook of abstraction-specific prompt biases, a weighted combination of which facilitates the representation of sketches across abstraction levels -- low abstract edge-maps, medium abstract sketches in TU-Berlin, and highly abstract doodles in QuickDraw. Our framework surpasses popular sketch representation learning algorithms in both zero-shot and few-shot setups and in novel settings across different abstraction boundaries.
Submitted 4 July, 2024;
originally announced July 2024.
-
Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval
Authors:
Aneeshan Sain,
Pinaki Nath Chowdhury,
Subhadeep Koley,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
In this paper, we delve into the intricate dynamics of Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) by addressing a critical yet overlooked aspect -- the choice of viewpoint during sketch creation. Unlike photo systems that seamlessly handle diverse views through extensive datasets, sketch systems, with limited data collected from fixed perspectives, face challenges. Our pilot study, employing a pre-trained FG-SBIR model, highlights the system's struggle when query-sketches differ in viewpoint from target instances. Interestingly, however, a questionnaire shows that users desire autonomy, with a significant percentage favouring view-specific retrieval. To reconcile this, we advocate for a view-aware system, seamlessly accommodating both view-agnostic and view-specific tasks. Overcoming dataset limitations, our first contribution leverages multi-view 2D projections of 3D objects, instilling cross-modal view awareness. The second contribution introduces a customisable cross-modal feature through disentanglement, allowing effortless mode switching. Extensive experiments on standard datasets validate the effectiveness of our method.
Submitted 1 July, 2024;
originally announced July 2024.
-
Unifying Mixed Gas Adsorption in Molecular Sieve Membranes and MOFs using Machine Learning
Authors:
Subhadeep Dasgupta,
Amal R S,
Prabal K. Maiti
Abstract:
Recent machine learning models to accurately obtain gas adsorption isotherms focus on polymers or metal-organic frameworks (MOFs) separately. Creating a unified model that can predict the adsorption trends in both types of adsorbents is challenging, owing to the diversity in their chemical structures. Moreover, models trained only on single gas adsorption data are incapable of predicting adsorption isotherms for binary gas mixtures. In this work, we address these problems using feature vectors comprising only the physical properties of the gas mixtures and adsorbents. Our model is trained on adsorption isotherms of both single and binary mixed gases inside a carbon molecular sieving membrane (CMSM), together with data available from the CoRE MOF database. The trained models are capable of accurately predicting the adsorption trends in both classes of materials, for both pure and binary components. An ML architecture designed for one class of material is not suitable for predicting the other class, even after proper training, signifying that the model must be trained jointly for proper predictions and transferability. The model is used to predict with good accuracy the CO2 uptake inside the CALF-20 framework. This work opens up a new avenue for predicting complex adsorption processes for gas mixtures in a wide range of materials.
Submitted 19 June, 2024;
originally announced June 2024.
-
Embedding Network Autoregression for time series analysis and causal peer effect inference
Authors:
Jae Ho Chang,
Subhadeep Paul
Abstract:
We propose an Embedding Network Autoregressive Model for multivariate networked longitudinal data. We assume the network is generated from a latent variable model, and these unobserved variables are included in a structural peer effect model or a time series network autoregressive model as additive effects. This approach takes a unified view of two related yet fundamentally different problems: (1) modeling and predicting multivariate networked time series data and (2) causal peer influence estimation in the presence of homophily from finite time longitudinal data. Our estimation strategy comprises estimating latent variables from the observed network followed by least squares estimation of the network autoregressive model. We show that the estimated momentum and peer effect parameters are consistent and asymptotically normally distributed in setups with a growing number of network vertices (N) while considering both a growing number of time points T (for the time series problem) and finite T cases (for the peer effect problem). We allow the number of latent vectors K to grow at appropriate rates, which improves upon existing rates when such results are available for related models. Our theoretical results encompass cases both when the network is modeled with the random dot product graph model (ENAR) and a more general latent space model with both additive and multiplicative effects (AMNAR). We also develop a selection criterion when K is unknown that provably does not under-select and show that the theoretical guarantees hold with the selected number for K as well. Interestingly, even though we propose a unified model, our theoretical results find that different growth rates and restrictions on the latent vectors are needed to induce omitted variable bias in the peer effect problem and to ensure consistent estimation in the time series problem.
Submitted 23 March, 2025; v1 submitted 9 June, 2024;
originally announced June 2024.
-
SketchDeco: Decorating B&W Sketches with Colour
Authors:
Chaitat Utintu,
Pinaki Nath Chowdhury,
Aneeshan Sain,
Subhadeep Koley,
Ayan Kumar Bhunia,
Yi-Zhe Song
Abstract:
This paper introduces a novel approach to sketch colourisation, inspired by the universal childhood activity of colouring and its professional applications in design and story-boarding. Striking a balance between precision and convenience, our method utilises region masks and colour palettes to allow intuitive user control, steering clear of the meticulousness of manual colour assignments or the limitations of textual prompts. By strategically combining ControlNet and staged generation, incorporating Stable Diffusion v1.5, and leveraging BLIP-2 text prompts, our methodology facilitates faithful image generation and user-directed colourisation. Addressing challenges of local and global consistency, we employ inventive solutions such as an inversion scheme, guided sampling, and a self-attention mechanism with a scaling factor. The resulting tool is not only fast and training-free but also compatible with consumer-grade Nvidia RTX 4090 Super GPUs, making it a valuable asset for both creative professionals and enthusiasts in various fields. Project Page: https://chaitron.github.io/SketchDeco/
Submitted 28 May, 2024;
originally announced May 2024.
-
Unraveling electronic structure of GeS through ARPES and its correlation with anisotropic optical and transport behavior
Authors:
Rahul Paramanik,
Tanima Kundu,
Soumik Das,
Alexey Barinov,
Bikash Das,
Sujan Maity,
Mainak Palit,
Sanjoy Kr Mahatha,
Subhadeep Datta
Abstract:
Two-dimensional (2D) van der Waals (vdW) materials with lower symmetry (triclinic, monoclinic or orthorhombic) exhibit intrinsic anisotropic in-plane structure desirable for future optoelectronic surface operating devices. Herein, we report one such material, the 2D $p$-type semiconductor germanium sulfide (GeS), a group IV monochalcogenide with puckered orthorhombic morphology, in which in-plane optical and transport properties can be correlated with its electronic structure. We systematically investigate the electronic band structure of bulk GeS with micro-focused angle-resolved photoemission spectroscopy ($μ$-ARPES) and relate it to the charge transport properties measured using the field-effect transistor (FET) device architecture, and to the optical anisotropy measured via angle-resolved polarization dependent Raman spectroscopy (ARPRS) on a micron-sized rectangle-shaped exfoliated bulk flake. The experimental valence band dispersion along the two high symmetry directions indicates highly anisotropic in-plane behavior of the charge carrier that agrees well with the density functional theory (DFT) calculations. In addition, we demonstrate the variation of the in-plane hole mobility (ratio $\sim$ 3.4) from the electrical conductivity with gate-sweep in a GeS-on-SiO$_2$ FET. Moreover, we use the angle-resolved fluctuation of the Raman intensity of the characteristic phonon modes to precisely determine the armchair and zigzag edges of the particular flake. The unique structural motif of GeS with correlated electronic and optical properties is of great interest both for the physical understanding of the all-optical switch and for its applications in memory devices.
Submitted 23 May, 2024;
originally announced May 2024.
-
Self-trapping phenomenon, multistability and chaos in open anisotropic Dicke dimer
Authors:
G. Vivek,
Debabrata Mondal,
Subhadeep Chakraborty,
S. Sinha
Abstract:
We investigate the semiclassical dynamics of a coupled atom-photon interacting system described by a dimer of anisotropic Dicke models in the presence of photon loss, exhibiting a rich variety of non-linear dynamics. Based on symmetries and dynamical classification, we characterize and chart out various dynamical phases in a phase diagram. A key feature of this system is the multistability of different dynamical states, particularly the coexistence of various superradiant phases as well as limit cycles. Remarkably, this dimer system manifests self-trapping phenomena, resulting in a photon population imbalance between the cavities. Such a self-trapped state arises from a saddle-node bifurcation, which can be understood from an equivalent Landau-Ginzburg description. Additionally, we identify a unique class of oscillatory dynamics, the self-trapped limit cycle, hosting self-trapping of photons. The absence of stable dynamical phases leads to the onset of chaos, which is diagnosed using the saturation value of the decorrelator dynamics. Moreover, the self-trapped states can coexist with a chaotic attractor, which may have intriguing consequences in quantum dynamics. Finally, we discuss the experimental relevance of our findings, which can be tested in cavity and circuit quantum electrodynamics setups.
Submitted 24 March, 2025; v1 submitted 22 May, 2024;
originally announced May 2024.
-
A Three-Phase Analysis of Synergistic Effects During Co-pyrolysis of Algae and Wood for Biochar Yield Using Machine Learning
Authors:
Subhadeep Chakrabarti,
Saish Shinde
Abstract:
Pyrolysis has proved to be a groundbreaking technique for effectively utilising natural and man-made biomass products like plastics, wood, crop residue, fruit peels etc. Recent advancements have shown a greater yield of essential products like biochar, bio-oil and other non-condensable gases by blending different biomasses in a certain ratio. This synergy effect of combining two pyrolytic raw materials, i.e. co-pyrolysis of algae and wood biomass, has been systematically studied and grouped into 3 phases in this research paper: kinetic analysis of co-pyrolysis, correlation among proximate and ultimate analysis with biochar yield, and lastly grouping of different weight ratios based on biochar yield up to a certain percentage. Different ML and DL algorithms have been utilized for regression and classification to give a comprehensive overview of the effect of the synergy of two different biomass materials on biochar yield. For the first phase, the best prediction of biochar yield was obtained by using a decision tree regressor with a perfect MSE score of 0.00, followed by a gradient-boosting regressor. The second phase was analyzed using both ML and DL techniques. Within ML, SVR proved to be the most convenient model with an accuracy score of 0.972, with a DNN employed as the deep learning technique. Finally, for the third phase, binary classification was applied to biochar yield with and without heating rate for biochar yield percentage above and below 40%. The best ML technique was Support Vector followed by Random Forest, while ANN was the most suitable deep learning technique.
Submitted 20 May, 2024;
originally announced May 2024.
-
Diagnosing and Predicting Autonomous Vehicle Operational Safety Using Multiple Simulation Modalities and a Virtual Environment
Authors:
Joe Beck,
Shean Huff,
Subhadeep Chakraborty
Abstract:
Even as technology and performance gains are made in the sphere of automated driving, safety concerns remain. Vehicle simulation has long been seen as a tool to overcome the cost associated with a massive amount of on-road testing for development and discovery of safety critical "edge-cases". However, purely software-based vehicle models may leave a large realism gap relative to their real-world counterparts in terms of dynamic response, and highly realistic vehicle-in-the-loop (VIL) simulations that encapsulate a virtual world around a physical vehicle may still be quite expensive to produce and as time intensive as on-road testing. In this work, we demonstrate an AV simulation test bed that combines the realism of vehicle-in-the-loop (VIL) simulation with the ease of implementation of model-in-the-loop (MIL) simulation. The setup demonstrated in this work allows for response diagnosis for the VIL simulations. By observing causal links between virtual weather and lighting conditions that surround the virtual depiction of our vehicle, the vision-based perception model and controller of Openpilot, and the dynamic response of our physical vehicle under test, we can draw conclusions regarding how the perceived environment contributed to vehicle response. Conversely, we also demonstrate response prediction for the MIL setup, where a physical vehicle is not required to draw richer conclusions around the impact of environmental conditions on AV performance than could be obtained with VIL simulation alone. These combine for a simulation setup with accurate real-world implications for edge-case discovery that is both cost effective and time efficient to implement.
Submitted 13 May, 2024;
originally announced May 2024.
-
Searches for the BSM scenarios at the LHC using decision tree based machine learning algorithms: A comparative study and review of Random Forest, Adaboost, XGboost and LightGBM frameworks
Authors:
Arghya Choudhury,
Arpita Mondal,
Subhadeep Sarkar
Abstract:
Machine learning algorithms are now being extensively used in our daily lives, spanning across diverse industries as well as academia. In the field of high energy physics (HEP), the most common and challenging task is separating a rare signal from a much larger background. The boosted decision tree (BDT) algorithm has been a cornerstone of high energy physics for analyzing event triggering, particle identification, jet tagging, object reconstruction, event classification, and other related tasks for quite some time. This article presents a comprehensive overview of research conducted by both HEP experimental and phenomenological groups that utilize decision tree algorithms in the context of the Standard Model and Supersymmetry (SUSY). We also summarize the basic concepts of machine learning and the decision tree algorithm along with the working principles of Random Forest, AdaBoost and two gradient boosting frameworks, XGBoost and LightGBM. Using a case study of electroweakino productions at the high luminosity LHC, we demonstrate how these algorithms lead to improvement in the search sensitivity compared to traditional cut-based methods in both compressed and non-compressed R-parity conserving SUSY scenarios. The effect of different hyperparameters and their optimization, and a feature importance study using SHapley values, are also discussed in detail.
Submitted 9 May, 2024;
originally announced May 2024.
-
Probing intractable beyond-standard-model parameter spaces armed with Machine Learning
Authors:
Rajneil Baruah,
Subhadeep Mondal,
Sunando Kumar Patra,
Satyajit Roy
Abstract:
This article attempts to summarize the effort by the particle physics community in addressing the tedious work of determining the parameter spaces of beyond-the-standard-model (BSM) scenarios allowed by data. These spaces, typically associated with a large number of dimensions, especially in the presence of nuisance parameters, suffer from the curse of dimensionality and thus render naive sampling of any kind -- even the computationally inexpensive ones -- ineffective. Over the years, various new sampling (from variations of Markov Chain Monte Carlo (MCMC) to dynamic nested sampling) and machine learning (ML) algorithms have been adopted by the community to alleviate this issue. We discuss in detail, if not all of them, potentially the most important among them and the significance of their results.
Submitted 3 April, 2024;
originally announced April 2024.
-
GENESIS-RL: GEnerating Natural Edge-cases with Systematic Integration of Safety considerations and Reinforcement Learning
Authors:
Hsin-Jung Yang,
Joe Beck,
Md Zahid Hasan,
Ekin Beyazit,
Subhadeep Chakraborty,
Tichakorn Wongpiromsarn,
Soumik Sarkar
Abstract:
In the rapidly evolving field of autonomous systems, the safety and reliability of the system components are fundamental requirements. These components are often vulnerable to complex and unforeseen environments, making natural edge-case generation essential for enhancing system resilience. This paper presents GENESIS-RL, a novel framework that leverages system-level safety considerations and reinforcement learning techniques to systematically generate naturalistic edge cases. By simulating challenging conditions that mimic real-world situations, our framework aims to rigorously test the entire system's safety and reliability. Although demonstrated within the autonomous driving application, our methodology is adaptable across diverse autonomous systems. Our experimental validation, conducted on a high-fidelity simulator, underscores the overall effectiveness of this framework.
Submitted 19 September, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Negative Capacitance for Stabilizing Logic State in Tunnel Field-Effect Transistor
Authors:
Koushik Dey,
Bikash Das,
Pabitra Kumar Hazra,
Tanima Kundu,
Sanjib Naskar,
Soumik Das,
Sujan Maity,
Poulomi Maji,
Bipul Karmakar,
Rahul Paramanik,
Subhadeep Datta
Abstract:
The study investigates the influence of negative capacitance (NC) on the transfer characteristics of vdW FETs on the heterophase of the CIPS ferroelectric. Notably, a less pronounced NC resulting from the spatial distribution of the ferroelectric and paraelectric phases plays a crucial role in stabilizing the n-channel conductance. This results in the emergence of a non-volatile logic state between the two binary states of TFETs. The study proposes NC-TFETs based on ferroionic crystals as promising devices for generating a stable logic state below Vth.
Submitted 18 March, 2024;
originally announced March 2024.
-
It's All About Your Sketch: Democratising Sketch Control in Diffusion Models
Authors:
Subhadeep Koley,
Ayan Kumar Bhunia,
Deeptanshu Sekhri,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Tao Xiang,
Yi-Zhe Song
Abstract:
This paper unravels the potential of sketches for diffusion models, addressing the deceptive promise of direct sketch control in generative AI. We importantly democratise the process, enabling amateur sketches to generate precise images, living up to the commitment of "what you sketch is what you get". A pilot study underscores the necessity, revealing that deformities in existing models stem from spatial-conditioning. To rectify this, we propose an abstraction-aware framework, utilising a sketch adapter, adaptive time-step sampling, and discriminative guidance from a pre-trained fine-grained sketch-based image retrieval model, working synergistically to reinforce fine-grained sketch-photo association. Our approach operates seamlessly during inference without the need for textual prompts; a simple, rough sketch akin to what you and I can create suffices! We welcome everyone to examine results presented in the paper and its supplementary. Contributions include democratising sketch control, introducing an abstraction-aware framework, and leveraging discriminative guidance, validated through extensive experiments.
Submitted 20 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval
Authors:
Subhadeep Koley,
Ayan Kumar Bhunia,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Tao Xiang,
Yi-Zhe Song
Abstract:
Two primary input modalities prevail in image retrieval: sketch and text. While text is widely used for inter-category retrieval tasks, sketches have been established as the sole preferred modality for fine-grained image retrieval due to their ability to capture intricate visual details. In this paper, we question the reliance on sketches alone for fine-grained image retrieval by simultaneously exploring the fine-grained representation capabilities of both sketch and text, orchestrating a duet between the two. The end result enables precise retrievals previously unattainable, allowing users to pose ever-finer queries and incorporate attributes like colour and contextual cues from text. For this purpose, we introduce a novel compositionality framework, effectively combining sketches and text using pre-trained CLIP models, while eliminating the need for extensive fine-grained textual descriptions. Last but not least, our system extends to novel applications in composed image retrieval, domain attribute transfer, and fine-grained generation, providing solutions for various real-world scenarios.
Submitted 20 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers
Authors:
Subhadeep Koley,
Ayan Kumar Bhunia,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Tao Xiang,
Yi-Zhe Song
Abstract:
This paper, for the first time, explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR). We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos. This proficiency is underpinned by their robust cross-modal capabilities and shape bias, findings that are substantiated through our pilot studies. In order to harness pre-trained diffusion models effectively, we introduce a straightforward yet powerful strategy focused on two key aspects: selecting optimal feature layers and utilising visual and textual prompts. For the former, we identify which layers are most enriched with information and are best suited for the specific retrieval requirements (category-level or fine-grained). Then we employ visual and textual prompts to guide the model's feature extraction process, enabling it to generate more discriminative and contextually relevant cross-modal representations. Extensive experiments on several benchmark datasets validate significant performance improvements.
Submitted 20 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
How to Handle Sketch-Abstraction in Sketch-Based Image Retrieval?
Authors:
Subhadeep Koley,
Ayan Kumar Bhunia,
Aneeshan Sain,
Pinaki Nath Chowdhury,
Tao Xiang,
Yi-Zhe Song
Abstract:
In this paper, we propose a novel abstraction-aware sketch-based image retrieval framework capable of handling sketch abstraction at varied levels. While prior works mainly focused on tackling sub-factors such as drawing style and order, we instead attempt to model abstraction as a whole, and propose feature-level and retrieval granularity-level designs so that the system builds into its DNA the necessary means to interpret abstraction. On learning abstraction-aware features, we for the first time harness the rich semantic embedding of a pre-trained StyleGAN model, together with a novel abstraction-level mapper that deciphers the level of abstraction and dynamically selects appropriate dimensions in the feature matrix correspondingly, to construct a feature matrix embedding that can be freely traversed to accommodate different levels of abstraction. For granularity-level abstraction understanding, we dictate that the retrieval model should not treat all abstraction-levels equally and introduce a differentiable surrogate Acc.@q loss to inject that understanding into the system. Different to the gold-standard triplet loss, our Acc.@q loss uniquely allows a sketch to narrow/broaden its focus in terms of how stringent the evaluation should be - the more abstract a sketch, the less stringent (higher q). Extensive experiments show that our method outperforms existing state-of-the-art methods in standard SBIR tasks along with challenging scenarios like early retrieval, forensic sketch-photo matching, and style-invariant retrieval.
Submitted 20 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
VLSI Architectures of Forward Kinematic Processor for Robotics Applications
Authors:
Sourav Roy,
Subhadeep Paul,
Tapas Kumar Maiti
Abstract:
This paper aims to provide a comprehensive review of current-day robotic computation technologies at the VLSI architecture level. We studied several reports in the domain of robotic processor architecture. In this work, we focused on forward kinematics architectures which consider CORDIC algorithms, VLSI circuits of the WE DSP16 chip, parallel processing and pipelined architecture, and lookup table formula and FPGA processor. This study gives us an understanding of different implementation methods for forward kinematics. Our goal is to develop a forward kinematics processor with FPGA for real-time applications, which require a fast response time and low latency, and which is useful for industrial automation where processing speed plays a major role.
Submitted 7 March, 2024;
originally announced March 2024.
-
Efficient data selection employing Semantic Similarity-based Graph Structures for model training
Authors:
Roxana Petcu,
Subhadeep Maji
Abstract:
Recent developments in natural language processing (NLP) have highlighted the need for substantial amounts of data for models to capture textual information accurately. This raises concerns regarding the computational resources and time required for training such models. This paper introduces Semantics for data SAliency in Model performance Estimation (SeSaME). It is an efficient data sampling mechanism solely based on textual information without passing the data through a compute-heavy model or other intensive pre-processing transformations. The application of this approach is demonstrated in the use case of low-resource automated speech recognition (ASR) models, which excessively rely on text-to-speech (TTS) calls when using augmented data. SeSaME learns to categorize new incoming data points into speech recognition difficulty buckets by employing semantic similarity-based graph structures and discrete ASR information from homophilous neighbourhoods through message passing. The results indicate reliable projections of ASR performance, with a 93% accuracy increase when using the proposed method compared to random predictions, bringing non-trivial information on the impact of textual representations in speech models. Furthermore, a series of experiments show both the benefits and challenges of using the ASR information on incoming data to fine-tune the model. We report a 7% drop in validation loss compared to random sampling, 7% WER drop with non-local aggregation when evaluating against a highly difficult dataset, and 1.8% WER drop with local aggregation and high semantic similarity between datasets.
Submitted 22 February, 2024;
originally announced February 2024.