-
Demystifying Transition Matching: When and Why It Can Beat Flow Matching
Authors:
Jaihoon Kim,
Rajarshi Saha,
Minhyuk Sung,
Youngsuk Park
Abstract:
Flow Matching (FM) underpins many state-of-the-art generative models, yet recent results indicate that Transition Matching (TM) can achieve higher quality with fewer sampling steps. This work answers the question of when and why TM outperforms FM. First, when the target is a unimodal Gaussian distribution, we prove that TM attains strictly lower KL divergence than FM for a finite number of steps. The improvement arises from stochastic difference latent updates in TM, which preserve target covariance that deterministic FM underestimates. We then characterize convergence rates, showing that TM achieves faster convergence than FM under a fixed compute budget, establishing its advantage in the unimodal Gaussian setting. Second, we extend the analysis to Gaussian mixtures and identify local-unimodality regimes in which the sampling dynamics approximate the unimodal case, where TM can outperform FM. The approximation error decreases as the minimal distance between component means increases, highlighting that TM is favored when the modes are well separated. However, when the target variance approaches zero, each TM update converges to the FM update, and the performance advantage of TM diminishes. In summary, we show that TM outperforms FM when the target distribution has well-separated modes and non-negligible variances. We validate our theoretical results with controlled experiments on Gaussian distributions, and extend the comparison to real-world applications in image and video generation.
Submitted 20 October, 2025;
originally announced October 2025.
-
On Affine Version of Hom-Lie Algebras
Authors:
Tarik Anowar,
Ripan Saha
Abstract:
This paper introduces Hom-type analogues of affine algebraic structures, termed Hom-affgebras. Extending Brzeziński's theory of affgebras and the Hom-algebra framework developed by Hartwig--Larsson--Silvestrov, we define and study Hom-associative, Hom-pre-Lie, and Hom-Lie affgebras, where the classical identities are twisted by an affine self-map. We show how Hom-associative, Hom-pre-Lie, and Hom-Lie affgebras are related to one another. The main focus of this paper is on Hom-Lie affgebras and their fibers. We study the concept of generalized derivations for Hom-Lie algebras, extending the notion of generalized derivations for Lie algebras. We explore the close relationship between Hom-Lie affgebras and such derivations. We show that every Hom-Lie affgebra both determines and is determined by a Hom-Lie algebra together with such a generalized derivation and a constant. Furthermore, we establish that a homomorphism between Lie affgebras corresponds to a homomorphism between their associated Lie fibers along with a constant, and vice versa.
Submitted 20 October, 2025;
originally announced October 2025.
-
High temperature Neel skyrmions in simple ferromagnets
Authors:
Peng Wang,
Rana Saha,
Holger L. Meyerheim,
Ke Gu,
Hakan Deniz,
David Eilmsteiner,
Andrea Migliorini,
Juan Rubio Zuazo,
Engenia Sebastiani-Tofano,
Ilya Kostanovski,
Abhay Kant Srivastava,
Arthur Ernst,
Stuart S. P. Parkin
Abstract:
A wide variety of chiral non-collinear spin textures have been discovered and have unique properties that make them highly interesting for technological applications. However, many of these are found in complex materials and only in a narrow window of temperature. Here, we show the formation of Neel-type skyrmions in thin layers of simple ferromagnetic alloys, namely Co-Al and Co-Ni-Al, over a wide range of temperature up to 770 K, by imposing a vertical strain gradient via epitaxy with an Ir-Al underlayer. The Neel skyrmions are directly observed using Lorentz transmission electron microscopy in freestanding membranes at high temperatures, and the strain gradient is directly measured from x-ray diffraction anomalous peak profiles. Our concept allows simple centrosymmetric ferromagnets with high magnetic ordering temperatures to exhibit hot skyrmions, thereby bringing skyrmionic electronics closer.
Submitted 7 October, 2025;
originally announced October 2025.
-
Introducing Spotlight: A Novel Approach for Generating Captivating Key Information from Documents
Authors:
Ankan Mullick,
Sombit Bose,
Rounak Saha,
Ayan Kumar Bhowmick,
Aditya Vempaty,
Prasenjit Dey,
Ravi Kokku,
Pawan Goyal,
Niloy Ganguly
Abstract:
In this paper, we introduce Spotlight, a novel paradigm for information extraction that produces concise, engaging narratives by highlighting the most compelling aspects of a document. Unlike traditional summaries, which prioritize comprehensive coverage, spotlights selectively emphasize intriguing content to foster deeper reader engagement with the source material. We formally differentiate spotlights from related constructs and support our analysis with a detailed benchmarking study using new datasets curated for this work. To generate high-quality spotlights, we propose a two-stage approach: fine-tuning a large language model on our benchmark data, followed by alignment via Direct Preference Optimization (DPO). Our comprehensive evaluation demonstrates that the resulting model not only identifies key elements with precision but also enhances readability and boosts the engagement value of the original document.
Submitted 21 October, 2025; v1 submitted 13 September, 2025;
originally announced September 2025.
-
Magnetocrystalline Anisotropy and 3D Hopping Conduction at the Surface of FeSb2
Authors:
Jarryd A. Horn,
Yun Suk Eo,
Keenan Avers,
Hyeok Yoon,
Ryan G. Dorman,
Shanta R. Saha,
Johnpierre Paglione
Abstract:
Motivated by the recent discovery of metallic surface states in the d-electron Kondo insulator candidates FeSi and FeSb2, along with some recent reports of magnetic correlations in the surface transport properties of FeSi, we have investigated the low temperature surface magnetotransport properties of FeSb2. By using a Corbino disk transport geometry, we were able to isolate the electrical transport properties of a single surface of our samples and study the [110] and [101] naturally forming faces separately. Studying the relationship between the applied magnetic field, current direction and crystal symmetry has allowed us to separate possible contributions to the magnetotransport anisotropy. Unlike previous studies of SmB6 surface states, we find no two-dimensional Drude-like dependence on field orientation relative to current direction, but instead a magnetocrystalline anisotropy that appears to originate from local moment scattering with a well defined easy-axis along the [100] direction. We compare these results with the magnetotransport properties of the conducting surface states on the [111] facet of FeSi. We also find evidence of 3D variable-range hopping conduction below the bulk-to-surface crossover, extending below 1 K, which implies that the electrical transport at the surface of these materials is carried by a thin, but 3D conducting channel, which is inconsistent with the lower dimensional states expected for a strong topological insulator.
Submitted 6 September, 2025;
originally announced September 2025.
-
Interfacial Control of both Magnetism and Polarization in a van der Waals Ferromagnet/Ferroelectric Heterostructure
Authors:
Priyanshu Raj,
Sourav Mal,
Rana Saha,
Prasenjit Sen
Abstract:
Two-dimensional multiferroic van der Waals heterostructures provide a promising platform for the simultaneous control of distinct ferroic orders, with potential applications in magnetoelectric devices and spintronics. The practical implementation of such technologies requires 2D magnets with high Curie temperatures and strong perpendicular magnetic anisotropy (PMA). Here, based on first-principles calculations, we propose a multiferroic heterostructure composed of the room-temperature ferromagnet $\text{Fe}_3\text{Ga}\text{Te}_2$ and the ferroelectric $\text{In}_2\text{Se}_3$. We show that intercalation of Fe atoms into the van der Waals gap of the $\text{Fe}_3\text{Ga}\text{Te}_2$/$\text{In}_2\text{Se}_3$ heterostructure enhances PMA by nearly an order of magnitude relative to the pristine $\text{Fe}_3\text{Ga}\text{Te}_2$ monolayer, while simultaneously allowing electric polarization to be modulated through interfacial charge redistribution. The enhancement of PMA arises from interfacial hybridization that modifies the spin-orbit coupling of Fe $d$-orbitals. Our results demonstrate an effective pathway to engineer magnetoelectric coupling in two-dimensional multiferroic heterostructures and pave the way toward energy-efficient spintronic devices.
Submitted 2 September, 2025;
originally announced September 2025.
-
A comprehensive study on the characterization of lyzed blood samples using dual-wavelength photoacoustics
Authors:
Subhadip Paul,
Hari Shankar Patel,
Vatsala Misra,
Ravi Rani,
Amaresh K. Sahoo,
Ratan K. Saha
Abstract:
Anemia is a global health concern, prompting the need for rapid and accurate diagnostic tools, especially for vulnerable populations. Estimating the blood lysis level (LL) and oxygen saturation (SO2) is essential not only for anemia but also for other hemolytic conditions. This study explores the potential of photoacoustic (PA) spectroscopy as a quantitative tool for evaluating hemolysis in anemia diagnosis. In vitro PA measurements on human blood samples were validated through computational modeling using the discrete dipole approximation, Monte Carlo, and k-Wave acoustic simulations. The quantitative values of blood hematocrit (H), LL and SO2 have been estimated using simulated and experimental PA signals. The wavelength pairs 700-905 nm and 700-1000 nm have been found to be optimal for the simultaneous estimation of these parameters and provided H and SO2 estimates with accuracy above approximately 90% up to LL = 14% and for LL = 0-30%, respectively. The correlation coefficient between the actual and evaluated lysis levels has been computed to be approximately 0.90. Further investigation is needed to enhance the robustness and clinical applicability of the developed method in an in vivo setting where both H and LL are unknown.
Submitted 1 September, 2025;
originally announced September 2025.
-
On Hom-Analogues of Heaps and Trusses
Authors:
Tarik Anowar,
Ripan Saha,
Sayan Thokdar
Abstract:
This paper develops the foundations of Hom-heaps, Hom-trusses, and Hom-braces as natural Hom-type analogues of their classical counterparts. We establish the correspondence between Hom-heaps and Hom-groups, showing that the retract of a Hom-heap at a point yields a Hom-group precisely when the point is fixed under the twisting map. Three interrelated definitions of Hom-trusses are introduced and their structural properties are investigated, including the construction of modules over Hom-trusses. For Hom-braces, we propose three variants and demonstrate their close connections with Hom-trusses, proving in particular that certain Hom-trusses naturally give rise to Hom-braces and vice versa. Our results provide a systematic framework that extends heap and truss theory into the Hom-algebraic setting, opening new perspectives for applications in Yang--Baxter theory, non-associative geometry, and categorical algebra.
Submitted 1 September, 2025;
originally announced September 2025.
-
The Tsetlin Machine Goes Deep: Logical Learning and Reasoning With Graphs
Authors:
Ole-Christoffer Granmo,
Youmna Abdelwahab,
Per-Arne Andersen,
Paul F. A. Clarke,
Kunal Dumbre,
Ylva Grønninsæter,
Vojtech Halenka,
Runar Helin,
Lei Jiao,
Ahmed Khalid,
Rebekka Omslandseter,
Rupsa Saha,
Mayur Shende,
Xuan Zhang
Abstract:
Pattern recognition with concise and flat AND-rules makes the Tsetlin Machine (TM) both interpretable and efficient, while the power of Tsetlin automata enables accuracy comparable to deep learning on an increasing number of datasets. We introduce the Graph Tsetlin Machine (GraphTM) for learning interpretable deep clauses from graph-structured input. Moving beyond flat, fixed-length input, the GraphTM becomes more versatile, supporting sequences, grids, relations, and multimodality. Through message passing, the GraphTM builds nested deep clauses to recognize sub-graph patterns with exponentially fewer clauses, increasing both interpretability and data utilization. For image classification, GraphTM preserves interpretability and achieves 3.86%-points higher accuracy on CIFAR-10 than a convolutional TM. For tracking action coreference, faced with increasingly challenging tasks, GraphTM outperforms other reinforcement learning methods by up to 20.6%-points. In recommendation systems, it tolerates increasing noise to a greater extent than a Graph Convolutional Neural Network (GCN); e.g., for noise ratio 0.1, GraphTM obtains accuracy 89.86% compared to GCN's 70.87%. Finally, for viral genome sequence data, GraphTM is competitive with BiLSTM-CNN and GCN accuracy-wise, training 2.5x faster than GCN. The GraphTM's application to these varied fields demonstrates how graph representation learning and deep clauses bring new possibilities for TM learning.
Submitted 20 July, 2025;
originally announced July 2025.
-
GPU Accelerated Transducer-Field Calculation using the Traditional Born Series Formulation for Realistic Media
Authors:
Ujjal Mandal,
Jagpreet Singh,
Ben T Cox,
Ratan K Saha
Abstract:
This study numerically solves inhomogeneous Helmholtz equations modeling acoustic wave propagation in homogeneous and lossless, absorbing and dispersive, inhomogeneous and nonlinear media. The traditional Born series (TBS) method has been employed to solve such equations. The full wave solution in this methodology is expressed as an infinite sum of the solution of the unperturbed equation weighted by increasing powers of the potential. Simulated pressure field patterns for a linear array of acoustic sources (a line source) estimated by the TBS procedure exhibit excellent agreement with those of a standard time domain approach (k-Wave toolbox). The TBS scheme, though iterative, is a very fast method. For example, GPU-enabled CUDA C code implementing the TBS procedure takes 5 s to calculate the pressure field for the homogeneous and lossless medium, whereas the latter takes nearly 500 s. The execution time for the corresponding CPU code is about 20 s. The findings of this study demonstrate the effectiveness of the TBS method for solving inhomogeneous Helmholtz equations, while the GPU-based implementation significantly reduces the computation time. This method can be explored in practice for calculation of pressure fields generated by real transducers designed for diverse applications.
Submitted 18 July, 2025;
originally announced July 2025.
-
Cohomology and Extensions of $C_p$-Green Functors of Lie Type
Authors:
Tarik Anowar,
Satyendra Kumar Mishra,
Ripan Saha
Abstract:
We develop a theory of $C_p$-Green functors of Lie type, unifying the axiomatic framework of Green functors with the structure of Lie algebras under the action of a cyclic group $C_p$ of prime order. Extending classical notions from representation theory and topology, we define tensor and exterior products, introduce an equivariant Chevalley-Eilenberg cohomology, and construct cup products that endow the cohomology with a graded Green functor of Lie type structure. A key result establishes a correspondence between equivalence classes of singular extensions and second cohomology groups, generalizing classical Lie algebra extension theory to the equivariant setting. This framework enriches the toolkit for studying equivariant algebraic structures and paves the way for further applications in deformation theory, homotopical algebra, and representation theory.
Submitted 10 July, 2025;
originally announced July 2025.
-
From toroids to helical tubules: Kirigami-inspired programmable assembly of two-periodic curved crystals
Authors:
Mason Price,
Daichi Hayakawa,
Thomas E. Videbæk,
Rupam Saha,
Botond Tyukodi,
Michael F. Hagan,
Seth Fraden,
Gregory M. Grason,
W. Benjamin Rogers
Abstract:
Biology is full of intricate molecular structures whose geometries are inextricably linked to their function. Many of these structures exhibit varying curvature, such as the helical structure of the bacterial flagellum, which is critical for motility. Because synthetic analogues of these shapes could be valuable platforms for nanotechnologies, including drug delivery and plasmonics, controllable synthesis of variable-curvature structures of various material systems, from fullerenes to supramolecular assemblies, has been a long-standing goal. Like two-dimensional crystals, these structures have a two-periodic symmetry, but unlike standard two-dimensional crystals, they are embedded in three dimensions with complex, spatially varying curvatures that cause the structures to close upon themselves in one or more dimensions. Here, we develop and implement a design strategy to program the self-assembly of a complex spectrum of two-periodic curved crystals with variable periodicity, spatial dimension, and topology, spanning from toroids to achiral serpentine tubules to both left- and right-handed helical tubules. Our design strategy uses a kirigami-based mapping of 2D planar tilings to 3D curved crystals that preserves the periodicity, two-fold rotational symmetries, and subunit dimensions via the arrangement of disclination defects. We survey the modular geometry of these curved crystals and infer the addressable subunit interactions required to assemble them from triangular subunits. To demonstrate this design strategy, we program the self-assembly of toroids and of helical and serpentine tubules from DNA origami subunits. A simulation model of the assembly pathways reveals physical considerations for programming the geometric specificity of angular folds in the curved crystal required to avoid defect-mediated misassembly.
Submitted 19 June, 2025;
originally announced June 2025.
-
Complex single-site magnetism and magnetotransport in single-crystalline Gd$_{2}$AlSi$_{3}$
Authors:
Ram Kumar,
Shanta R. Saha,
Jarryd Horn,
A. Ikeda,
Danila Sokratov,
Yash Anand,
Prathum Saraf,
Ryan Dorman,
E. Hemley,
K. K. Iyer,
Johnpierre Paglione
Abstract:
We present a detailed investigation of single-crystal samples of the magnetic compound Gd$_{2}$AlSi$_{3}$, which crystallizes in the $α$-ThSi$_2$ type tetragonal structure. We report the temperature and magnetic field dependence of the magnetic susceptibility, magnetization, heat capacity, electrical resistivity, and magnetoresistance for magnetic fields applied along both the tetragonal $c$-axis and in the basal $ab$-plane. X-ray diffraction measurements confirm a centrosymmetric $I4_{1}/amd$ space group of the crystal structure. Despite single-site occupancy of the Gd position in this tetragonal structure, we identify two successive antiferromagnetic phase transitions at Néel temperatures 32~K and 23~K via magnetic susceptibility, heat capacity and transport measurements, as well as a complex magnetic interaction with a magnetic anisotropy that plays an important role in the direction-dependent transport response. Our identification of multiple magnetic phases in Gd$_{2}$AlSi$_{3}$, where Gd is the only magnetic species, helps to elucidate the field-induced skyrmionic behavior in the Gd-based intermetallic compounds.
Submitted 17 June, 2025;
originally announced June 2025.
-
Immunometabolism at the Crossroads of Infection: Mechanistic and Systems-Level Perspectives from Host and Pathogen
Authors:
Sunayana Malla,
Nabia Shahreen,
Rajib Saha
Abstract:
The emerging field of immunometabolism has underscored the central role of metabolic pathways in orchestrating immune cell function. Far from being passive background processes, metabolic activities actively regulate key immune responses. Fundamental pathways such as glycolysis, the tricarboxylic acid (TCA) cycle, and oxidative phosphorylation critically shape the behavior of immune cells, influencing macrophage polarization, T cell activation, and dendritic cell function. In this review, we synthesize recent advances in immunometabolism, with a focus on the metabolic mechanisms that govern the responses of both innate and adaptive immune cells to bacterial, viral, and fungal pathogens. Drawing on experimental, computational, and integrative methodologies, we highlight how metabolic reprogramming contributes to host defense in response to infection. These findings reveal new opportunities for therapeutic intervention, suggesting that modulation of metabolic pathways could enhance immune function and improve pathogen clearance.
Submitted 2 June, 2025;
originally announced June 2025.
-
Dilute Paramagnetism and Non-Trivial Topology in Quasicrystal Approximant Fe$_4$Al$_{13}$
Authors:
Keenan E. Avers,
Jarryd A. Horn,
Ram Kumar,
Shanta R. Saha,
Yuanfeng Xu,
B. Andrei Bernevig,
Peter Zavalij,
Johnpierre Paglione
Abstract:
A very fundamental property of both weakly and strongly interacting materials is the nature of their magnetic response. In this work we detail the growth of crystals of the quasicrystal approximant Fe$_4$Al$_{13}$ with an Al flux solvent method. We characterize our samples using electrical transport and heat capacity, yielding results consistent with a simple non-magnetic metal. However, magnetization measurements portray an extremely unusual response for a dilute paramagnet and do not exhibit the characteristic Curie-Weiss behavior expected for a weakly interacting material at high temperature. Electronic structure calculations confirm metallic behavior, but also indicate that each isolated band near the Fermi energy hosts non-trivial topologies including strong, weak and nodal components, with resultant topological surface states distinguishable from bulk states on the (001) surface. With half-filled flat bands apparent in the calculation but no long-range magnetic order, the unusual response suggests that the dilute paramagnetic behavior in this quasicrystal approximant is surprising and may serve as a test of the fundamental assumptions that are taken for granted for the magnetic response of weakly interacting systems.
Submitted 23 May, 2025;
originally announced May 2025.
-
Nanomolding single-crystalline CoIn3 and RhIn3 nanowires
Authors:
Nghiep Khoan Duong,
Christian D. Multunas,
Thomas Whoriskey,
Mehrdad T. Kiani,
Shanta R. Saha,
Quynh P. Sam,
Han Wang,
Satya Kushwaha,
Johnpierre Paglione,
Ravishankar Sundararaman,
Judy J. Cha
Abstract:
Intermetallic compounds containing transition metals and group III-V metals tend to possess strong correlations and high catalytic activities, both of which can be enhanced via reduced dimensionality. Nanostructuring is an effective approach to explore this possibility, yet the synthesis of nanostructured intermetallics is challenging due to vast differences in melting points and vapor pressures of the constituent elements. In this work, we demonstrate that this challenge can be overcome with thermomechanical nanomolding (TMNM), exemplified by the synthesis of intermetallic CoIn3 and RhIn3 nanowires. We show that TMNM successfully extrudes single-crystalline nanowires of these compounds down to the 20 nm diameter range, and the nanowires remain metallic with resistivity values higher than calculated bulk resistivity. We discuss possible effects of surface roughness scattering, vacancy-induced scattering, and surface oxidation, on the measured resistivities of the nanowires. For CoIn3 nanowires, the measured resistivity values are the first reported values for this compound.
Submitted 28 March, 2025;
originally announced March 2025.
-
Direct Numerical Simulations of Droplet Impact onto Heated Surfaces using the Program Free Surface 3D (FS3D)
Authors:
Manish Kumar,
Rishav Saha,
Johanna Potyka,
Kathrin Schulte,
Bernhard Weigand
Abstract:
Droplet impact onto heated surfaces is a widespread process in industrial applications, particularly in the context of spray cooling techniques. Therefore, it is essential to study the complex phenomenon of droplet spreading, heat removal from a hot surface, and flow distribution during the impact. This study focuses on Direct Numerical Simulation (DNS) of the initial stage of a water droplet impact onto a highly conducting heated surface, below the saturation temperature of the liquid. The maximum spreading diameters at different impact velocities in the presence of a heated surface are analysed. Free Surface 3D (FS3D), an in-house code developed at the Institute of Aerospace Thermodynamics, University of Stuttgart, is used for this work. A grid independence study investigates the resolution required to resolve the flow field around the droplet. As evaporation effects during the initial stage of the droplet impact process are negligible, they are ignored. However, for longer simulation times, evaporation plays a significant role in the process. In preparation for such simulations, an evaporating droplet in cross flow is simulated to study the performance gain from the newly implemented hybrid OpenMP and MPI parallelisation and red-black optimization in the evaporation routines of FS3D. Both the scaling limit and efficiency were improved by using the hybrid (MPI with OpenMP) parallelisation, while the red-black scheme optimization raised the efficiency only. For a test case investigated with the tool MAQAO, the new version achieves a 23% performance improvement. Additionally, strong and weak scaling performance tests are conducted. The new version is found to scale up to 256 nodes compared to 128 nodes for the original version. The maximum time-cycles per hour (CPH) achieved with the new version is 35% higher than with the previous version.
Submitted 26 February, 2025;
originally announced March 2025.
-
Revealing isotropic abundant low-energy excitations in UTe$_2$ through complex microwave surface impedance
Authors:
Arthur Carlton-Jones,
Alonso Suarez,
Yun-Suk Eo,
Ian M. Hayes,
Shanta R. Saha,
Johnpierre Paglione,
Nicholas P. Butch,
Steven M. Anlage
Abstract:
The complex surface impedance is a well-established tool to study the super- and normal-fluid responses of superconductors. Fundamental properties of the superconductor, such as the pairing mechanism, Fermi surface, and topological properties, also influence the surface impedance. We explore the microwave surface impedance of spin-triplet UTe$_2$ single crystals as a function of temperature using resonant cavity perturbation measurements employing a novel multi-modal analysis to gain insight into these properties. We determine a composite surface impedance of the crystal for each mode using resonance data combined with the independently measured normal state dc resistivity tensor. The normal state surface impedance reveals the weighting of current flow directions in the crystal of each resonant mode. For UTe$_2$, we find an isotropic $Δλ(T) \sim T^α$ power-law temperature dependence for the magnetic penetration depth for $T\le T_c/3$ with $α< 2$, which is inconsistent with a single pair of point nodes on the Fermi surface under weak scattering. We also find a similar power-law temperature dependence for the low-temperature surface resistance $R_s(T) \sim T^{α_R}$ with $α_R < 2$. We observe a strong anisotropy of the residual microwave loss across these modes, with some modes showing loss below the universal line-nodal value and others showing substantially more. We compare to predictions for topological Weyl superconductivity in the context of the observed isotropic power laws and the anisotropy of the residual loss.
Submitted 4 June, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
Modular programming of interaction and geometric specificity enables assembly of complex DNA origami nanostructures
Authors:
Rupam Saha,
Daichi Hayakawa,
Thomas E. Videbaek,
Mason Price,
Wei-Shao Wei,
Juanita Pombo,
Daniel Duke,
Gaurav Arya,
Gregory M. Grason,
W. Benjamin Rogers,
Seth Fraden
Abstract:
We present a modular DNA origami design approach to address the challenges of assembling geometrically complex nanoscale structures, including those with nonuniform Gaussian curvature. This approach features a core structure that completely conserves the scaffold routing across different designs and preserves more than 70% of the DNA staples between designs, dramatically reducing both cost and effort, while enabling precise and independent programming of subunit interactions and binding angles through adjustable overhang lengths and sequences. Using cryogenic electron microscopy, gel electrophoresis, and coarse-grained molecular dynamics simulations, we validate a set of robust design rules. We demonstrate the method's utility by assembling a variety of self-limiting structures, including anisotropic shells with controlled inter-subunit interactions and curvature, and a toroid with globally varying curvature. Our strategy is both cost-effective and versatile, providing a promising and efficient solution for the synthetic fabrication of complex nanostructures.
Submitted 7 February, 2025;
originally announced February 2025.
-
ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
Authors:
Hongyi Liu,
Rajarshi Saha,
Zhen Jia,
Youngsuk Park,
Jiaji Huang,
Shoham Sabach,
Yu-Xiang Wang,
George Karypis
Abstract:
Large Language Models (LLMs) have demonstrated exceptional performance in natural language processing tasks, yet their massive size makes serving them inefficient and costly. Semi-structured pruning has emerged as an effective method for model acceleration, but existing approaches are suboptimal because they focus on local, layer-wise optimizations using heuristic rules, failing to leverage global feedback. We present ProxSparse, a learning-based framework for mask selection enabled by regularized optimization. ProxSparse transforms the rigid, non-differentiable mask selection process into a smoother optimization procedure, allowing gradual mask exploration with flexibility. ProxSparse does not involve additional weight updates once the mask is determined. Our extensive evaluations on 7 widely used models show that ProxSparse consistently outperforms previously proposed semi-structured mask selection methods with significant improvement, demonstrating the effectiveness of our learned approach towards semi-structured pruning.
Submitted 23 June, 2025; v1 submitted 31 January, 2025;
originally announced February 2025.
-
Economical and versatile subunit design principles for self-assembled DNA origami structures
Authors:
Wei-Shao Wei,
Thomas E. Videbæk,
Daichi Hayakawa,
Rupam Saha,
W. Benjamin Rogers,
Seth Fraden
Abstract:
Self-assembly of nanoscale synthetic subunits is a promising bottom-up strategy for fabrication of functional materials. Here, we introduce a design principle for DNA origami nanoparticles of 50-nm size, exploiting modularity, to make a family of versatile subunits that can target an abundant variety of self-assembled structures. The subunits are based on a core module that remains constant among all the subunits. Variable bond modules and angle modules are added to the exterior of the core to control interaction specificity, strength and structural geometry. A series of subunits with designed bond/angle modules are demonstrated to self-assemble into a rich variety of structures with different Gaussian curvatures, exemplified by sheets, spherical shells, and tubes. The design features flexible joints implemented using single-stranded angle modules between adjacent subunits whose mechanical properties, such as bending elastic moduli, are inferred from cryo-EM. Our findings suggest that incorporating a judicious amount of flexibility in the bond provides error tolerances in design and fabrication while still guaranteeing target fidelity. Lastly, while increasing flexibility could introduce greater variability and potential errors in assembly, these effects can be counterbalanced by increasing the number of distinct bonds, thereby allowing for precise targeting of specific structural binding angles within a broad range of configurations.
Submitted 14 November, 2024;
originally announced November 2024.
-
Room temperature ferromagnetism induced by high valence cation V$^{+5}$/V$^{+4}$ substitution in SrFeO$_{3-δ}$
Authors:
Rakhi Saha,
Koyal Suman Samantaray,
P Maneesha,
SC Baral,
Sachin Sarangi,
Rajashri Urkude,
Biplab Ghosh,
Abdelkrim Mekki,
Khalil Harrabi,
Somaditya Sen
Abstract:
The structural and magnetic effects of non-magnetic vanadium (V) doping in helimagnetic SrFeO$_{3-δ}$ (SFO) are investigated, focusing on up to 3% substitution at the Fe site. Structural analysis from X-ray diffraction (XRD) and Raman spectroscopy, supported by phonon mode calculations, reveals that pure SFO exists as a mixed tetragonal-orthorhombic phase, while V-doped samples exhibit an emerging cubic phase alongside tetragonal symmetry. Magnetic hysteresis (M-H) loops show notable ferromagnetic behavior within the antiferromagnetic matrix, persisting even at room temperature. Temperature-dependent magnetization measurements indicate a Neel temperature (TN ) shift from 70K to 55K, along with increased magnetization differences in field-cooled (FC) and zero field-cooled (ZFC) data, reflecting heightened magnetic frustration due to competing FM/AFM exchange interactions. X-ray photoelectron spectroscopy (XPS) and X-ray absorption near-edge structure (XANES) analyses reveal a rise in Fe$^{3+}$ and V$^{5+}$ states, affecting oxygen vacancy distributions and corresponding structural shifts seen in XRD and Raman results. The multivalent Fe$^{3+}$/Fe$^{4+}$ and V$^{4+}$/V$^{5+}$ states enhance double-exchange (DE) and super-exchange (SE) interactions (Fe$^{3+}$-O-Fe$^{4+}$ and Fe$^{3+}$-O-V$^{5+}$), promoting ferromagnetism. Frequency-dependent magnetization studies display a subtle susceptibility peak shift, indicating spin-glass-like behavior in V-doped samples.
Submitted 2 November, 2024;
originally announced November 2024.
-
A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach
Authors:
Lipismita Panigrahi,
Prianka Rani Saha,
Jurdana Masuma Iqrah,
Sushil Prasad
Abstract:
Clinical diagnosis of breast malignancy (BM) remains a challenging problem. Deep learning (DL) models continue to offer important solutions for early BM diagnosis, but they tend to overfit because of the limited volume of breast ultrasound (BUS) image data. Further, large BUS datasets are difficult to manage due to privacy and legal concerns. Hence, image augmentation is a necessary and challenging step to improve the performance of the DL models. However, the current DL-based augmentation models are inadequate and operate as a black box, resulting in a lack of information and justification about their suitability and efficacy. Additionally, pre- and post-augmentation need high-performance computational resources and time to produce the augmented image and evaluate the model performance. Thus, this study aims to develop a novel, efficient augmentation approach for BUS images with advanced neural style transfer (NST) and Explainable AI (XAI), harnessing GPU-based parallel infrastructure. We scale and distribute the training of the augmentation model across 8 GPUs using the Horovod framework on a DGX cluster, achieving a 5.09x speedup while maintaining the model's accuracy. The proposed model is evaluated on 800 (348 benign and 452 malignant) BUS images, and its performance is compared with other progressive techniques using different quantitative analyses. The results indicate that the proposed approach can successfully augment the BUS images with 92.47% accuracy.
△ Less
Submitted 31 October, 2024;
originally announced November 2024.
-
Room temperature Multiferroicity and Magnetoelectric coupling in Ca/Mn modified BaTiO3
Authors:
P. Maneesha,
Koyal Suman Samantaray,
Rakhi Saha,
Rajashri Urkude,
Biplab Ghosh,
Arjun K Pathak,
Indranil Bhaumik,
Abdelkrim Mekki,
Khalil Harrabi,
Somaditya Sen
Abstract:
Materials with magnetoelectric coupling (MEC) between ferroic orders at room temperature are an emerging field in modern technology and physics. BaTiO3 is a robust ferroelectric in which doping with several elements has led to MEC. Ca- and Mn-modified BaTiO3 has been studied with a series of Ba(1-x)Ca(x)Ti(1-y)Mn(y)O3 (x=y= 0, 0.03, 0.06, 0.09); in this series, MEC was observed only for x=0.03. The structural modifications with changing substitution reveal a reduced Ti-O-Ti bond angle for this sample, which is the most ferromagnetic in nature. A mixed phase of tetragonal P4mm and hexagonal P63/mmc space groups of BaTiO3 is observed in the substituted samples, with a nominal contribution of the hexagonal phase for x=0.03. A valence state study using XPS and XANES reveals the presence of an enhanced proportion of Mn3+ ions in the sample, which support a pseudo Jahn-Teller distortion, thereby supporting the ferroelectricity for x=0.03. Direct evidence of MEC was obtained from magnetoelectric measurements. A magnetoelectric coupling coefficient, αME ~44 mVcm-1Oe-1, was obtained for a dc magnetic field of 600 Oe and a 10Hz ac field of 40 Oe. Such MEC was not observed for higher substitution, which emphasizes the sensitivity of the structural properties to substitution.
Submitted 17 February, 2025; v1 submitted 29 October, 2024;
originally announced October 2024.
-
Exploring Curriculum Learning for Vision-Language Tasks: A Study on Small-Scale Multimodal Training
Authors:
Rohan Saha,
Abrar Fahim,
Alona Fyshe,
Alex Murphy
Abstract:
For specialized domains, there is often not a wealth of data with which to train large machine learning models. In such limited data / compute settings, various methods exist aiming to $\textit{do more with less}$, such as finetuning from a pretrained model, modulating difficulty levels as data are presented to a model (curriculum learning), and considering the role of model type / size. Approaches to efficient $\textit{machine}$ learning also take inspiration from $\textit{human}$ learning by considering use cases where machine learning systems have access to approximately the same number of words experienced by a 13-year-old child (100M words). We investigate the role of 3 primary variables in a limited data regime as part of the multimodal track of the BabyLM challenge. We contrast: (i) curriculum learning, (ii) pretraining (with text-only data), (iii) model type. We modulate these variables and assess them on two types of tasks: (a) multimodal (text+image), and (b) unimodal (text-only) tasks. We find that curriculum learning benefits multimodal evaluations over non-curriculum learning models, particularly when combined with text-only pretraining. On text-only tasks, curriculum learning appears to help models with smaller trainable parameter counts. We suggest possible reasons based on architectural differences and training designs as to why one might observe such results.
△ Less
Submitted 20 October, 2024;
originally announced October 2024.
-
LegalLens Shared Task 2024: Legal Violation Identification in Unstructured Text
Authors:
Ben Hagag,
Liav Harpaz,
Gil Semo,
Dor Bernsohn,
Rohit Saha,
Pashootan Vaezipoor,
Kyryl Truskovskyi,
Gerasimos Spanakis
Abstract:
This paper presents the results of the LegalLens Shared Task, focusing on detecting legal violations within text in the wild across two sub-tasks: LegalLens-NER for identifying legal violation entities and LegalLens-NLI for associating these violations with relevant legal contexts and affected individuals. Using an enhanced LegalLens dataset covering labor, privacy, and consumer protection domains, 38 teams participated in the task. Our analysis reveals that while a mix of approaches was used, the top-performing teams in both tasks consistently relied on fine-tuning pre-trained language models, outperforming legal-specific models and few-shot methods. The top-performing team achieved a 7.11% improvement in NER over the baseline, while NLI saw a more marginal improvement of 5.7%. Despite these gains, the complexity of legal texts leaves room for further advancements.
Submitted 15 October, 2024;
originally announced October 2024.
-
Accelerating PoT Quantization on Edge Devices
Authors:
Rappy Saha,
Jude Haris,
José Cano
Abstract:
Non-uniform quantization, such as power-of-two (PoT) quantization, matches data distributions better than uniform quantization, which reduces the quantization error of Deep Neural Networks (DNNs). PoT quantization also allows bit-shift operations to replace multiplications, but there are limited studies on the efficiency of shift-based accelerators for PoT quantization. Furthermore, existing pipelines for accelerating PoT-quantized DNNs on edge devices are not open-source. In this paper, we first design shift-based processing elements (shift-PE) for different PoT quantization methods and evaluate their efficiency using synthetic benchmarks. Then we design a shift-based accelerator using our most efficient shift-PE and propose PoTAcc, an open-source pipeline for end-to-end acceleration of PoT-quantized DNNs on resource-constrained edge devices. Using PoTAcc, we evaluate the performance of our shift-based accelerator across three DNNs. On average, it achieves a 1.23x speedup and 1.24x energy reduction compared to a multiplier-based accelerator, and a 2.46x speedup and 1.83x energy reduction compared to CPU-only execution. Our code is available at https://github.com/gicLAB/PoTAcc
Submitted 21 October, 2024; v1 submitted 30 September, 2024;
originally announced September 2024.
-
Turbulence Strength $C_n^2$ Estimation from Video using Physics-based Deep Learning
Authors:
Ripon Kumar Saha,
Esen Salcin,
Jihoo Kim,
Joseph Smith,
Suren Jayasuriya
Abstract:
Images captured from a long distance suffer from dynamic image distortion due to turbulent flow of air cells with random temperatures, and thus refractive indices. This phenomenon, known as image dancing, is commonly characterized by its refractive-index structure constant $C_n^2$ as a measure of the turbulence strength. For many applications, such as atmospheric forecast models, long-range/astronomy imaging, aviation safety, and optical communication technology, $C_n^2$ estimation is critical for accurately sensing the turbulent environment. Previous methods for $C_n^2$ estimation include estimation from meteorological data (temperature, relative humidity, wind shear, etc.) for single-point measurements, two-ended pathlength measurements from optical scintillometers for path-averaged $C_n^2$, and more recently estimating $C_n^2$ from passive video cameras for lower cost and hardware complexity. In this paper, we present a comparative analysis of classical image gradient methods for $C_n^2$ estimation and modern deep learning-based methods leveraging convolutional neural networks. To enable this, we collect a dataset of video capture along with reference scintillometer measurements for ground truth, and we release this unique dataset to the scientific community. We observe that deep learning methods can achieve higher accuracy when trained on similar data, but suffer from generalization errors on other, unseen imagery as compared to classical methods. To overcome this trade-off, we present a novel physics-based network architecture that combines learned convolutional layers with a differentiable image gradient method and maintains high accuracy while being generalizable across image datasets.
Submitted 29 August, 2024;
originally announced August 2024.
-
Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization
Authors:
Ankan Mullick,
Sombit Bose,
Rounak Saha,
Ayan Kumar Bhowmick,
Aditya Vempaty,
Pawan Goyal,
Niloy Ganguly,
Prasenjit Dey,
Ravi Kokku
Abstract:
The ever-increasing volume of digital information necessitates efficient methods for users to extract key insights from lengthy documents. Aspect-based summarization offers a targeted approach, generating summaries focused on specific aspects within a document. Despite advancements in aspect-based summarization research, there is a continuous quest for improved model performance. Given that large language models (LLMs) have demonstrated the potential to revolutionize diverse tasks within natural language processing, particularly in the problem of summarization, this paper explores the potential of fine-tuning LLMs for the aspect-based summarization task. We evaluate the impact of fine-tuning open-source foundation LLMs, including Llama2, Mistral, Gemma and Aya, on a publicly available domain-specific aspect based summary dataset. We hypothesize that this approach will enable these models to effectively identify and extract aspect-related information, leading to superior quality aspect-based summaries compared to the state-of-the-art. We establish a comprehensive evaluation framework to compare the performance of fine-tuned LLMs against competing aspect-based summarization methods and vanilla counterparts of the fine-tuned LLMs. Our work contributes to the field of aspect-based summarization by demonstrating the efficacy of fine-tuning LLMs for generating high-quality aspect-based summaries. Furthermore, it opens doors for further exploration of using LLMs for targeted information extraction tasks across various NLP domains.
Submitted 5 August, 2024;
originally announced August 2024.
-
Using a CNN Model to Assess Paintings' Creativity
Authors:
Zhehan Zhang,
Meihua Qian,
Li Luo,
Qianyi Gao,
Xianyong Wang,
Ripon Saha,
Xinxin Song
Abstract:
Assessing artistic creativity has long challenged researchers, with traditional methods proving time-consuming. Recent studies have applied machine learning to evaluate creativity in drawings, but not paintings. Our research addresses this gap by developing a CNN model to automatically assess the creativity of human paintings. Using a dataset of six hundred paintings by professionals and children, our model achieved 90% accuracy and faster evaluation times than human raters. This approach demonstrates the potential of machine learning in advancing artistic creativity assessment, offering a more efficient alternative to traditional methods.
Submitted 1 January, 2025; v1 submitted 2 August, 2024;
originally announced August 2024.
-
Designing Efficient LLM Accelerators for Edge Devices
Authors:
Jude Haris,
Rappy Saha,
Wenhao Hu,
José Cano
Abstract:
The increase in open-source availability of Large Language Models (LLMs) has enabled users to deploy them on more and more resource-constrained edge devices to reduce reliance on network connections and provide more privacy. However, the high computation and memory demands of LLMs make their execution on resource-constrained edge devices challenging and inefficient. To address this issue, designing new and efficient edge accelerators for LLM inference is crucial. FPGA-based accelerators are ideal for LLM acceleration due to their reconfigurability, as they enable model-specific optimizations and higher performance per watt. However, creating and integrating FPGA-based accelerators for LLMs (particularly on edge devices) has proven challenging, mainly due to the limited hardware design flows for LLMs in existing FPGA platforms.
To tackle this issue, in this paper we first propose a new design platform, named SECDA-LLM, that utilizes the SECDA methodology to streamline the process of designing, integrating, and deploying efficient FPGA-based LLM accelerators for the llama.cpp inference framework. We then demonstrate, through a case study, the potential benefits of SECDA-LLM by creating a new MatMul accelerator that supports block floating point quantized operations for LLMs. Our initial accelerator design, deployed on the PYNQ-Z1 board, reduces latency (1.7 seconds per token or ~2 seconds per word) by 11x over the dual-core Arm NEON-based CPU execution for the TinyLlama model.
Submitted 1 August, 2024;
originally announced August 2024.
-
An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models
Authors:
Jayanta Sadhu,
Maneesha Rani Saha,
Rifat Shahriyar
Abstract:
The influence of Large Language Models (LLMs) is rapidly growing, automating more jobs over time. Assessing the fairness of LLMs is crucial due to their expanding impact. Studies reveal the reflection of societal norms and biases in LLMs, which creates a risk of propagating societal stereotypes in downstream tasks. Many studies on bias in LLMs focus on gender bias in various NLP applications. However, there's a gap in research on bias in emotional attributes, despite the close societal link between emotion and gender. This gap is even larger for low-resource languages like Bangla. Historically, women are associated with emotions like empathy, fear, and guilt, while men are linked to anger, bravado, and authority. This pattern reflects societal norms in Bangla-speaking regions. We offer the first thorough investigation of gendered emotion attribution in Bangla for both closed and open source LLMs in this work. Our aim is to elucidate the intricate societal relationship between gender and emotion specifically within the context of Bangla. We have been successful in showing the existence of gender bias in the context of emotions in Bangla through analytical methods and also show how emotion attribution changes on the basis of gendered role selection in LLMs. All of our resources including code and data are made publicly available to support future research on Bangla NLP.
Warning: This paper contains explicit stereotypical statements that many may find offensive.
Submitted 8 July, 2024;
originally announced July 2024.
-
Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias
Authors:
Jayanta Sadhu,
Maneesha Rani Saha,
Rifat Shahriyar
Abstract:
The rapid growth of Large Language Models (LLMs) has made the study of biases a crucial field. It is important to assess the influence of different types of biases embedded in LLMs to ensure fair use in sensitive fields. Although there has been extensive work on bias assessment in English, such efforts are rare for a major language like Bangla. In this work, we examine two types of social biases in LLM-generated outputs for the Bangla language. Our main contributions in this work are: (1) bias studies on two different social biases for Bangla, (2) a curated dataset for bias measurement benchmarking, and (3) testing two different probing techniques for bias detection in the context of Bangla. To the best of our knowledge, this is the first work of its kind involving bias assessment of LLMs for Bangla. All our code and resources are publicly available to support progress in bias-related research in Bangla NLP.
Submitted 13 December, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Gapless dynamic magnetic ground state in the charge-gapped trimer iridate Ba$_4$NbIr$_3$O$_{12}$
Authors:
Abhisek Bandyopadhyay,
S. Lee,
D. T. Adroja,
M. R. Lees,
G. B. G. Stenning,
P. Aich,
Luca Tortora,
C. Meneghini,
G. Cibin,
Adam Berlie,
R. A. Saha,
D. Takegami,
A. Melendez-Sans,
G. Poelchen,
M. Yoshimura,
K. D. Tsuei,
Z. Hu,
Ting-Shan Chan,
S. Chattopadhyay,
G. S. Thakur,
Kwang-Yong Choi
Abstract:
We present an experimental investigation of the magnetic ground state in Ba$_4$NbIr$_3$O$_{12}$, a fractional valent trimer iridate. X-ray absorption and photoemission spectroscopy show that the Ir valence lies between 3+ and 4+ while Nb is pentavalent. Combined dc/ac magnetization, specific heat, and muon spin rotation/relaxation ($μ$SR) measurements reveal no magnetic phase transition down to 0.05~K. Despite a significant Weiss temperature ($Θ_{\mathrm{W}} \sim -15$ to $-25$~K) indicating antiferromagnetic correlations, a quantum spin-liquid (QSL) phase emerges and persists down to 0.1~K. This state likely arises from geometric frustration in the edge-sharing equilateral triangle Ir network. Our $μ$SR analysis reveals a two-component depolarization, arising from the coexistence of rapidly (90\%) and slowly (10\%) fluctuating Ir moments. Powder x-ray diffraction and Ir-L$_3$-edge x-ray absorption fine structure spectroscopy identify ~8-10\% Nb/Ir site-exchange, reducing frustration within part of the Ir network, and likely leading to the faster muon spin relaxation, while the structurally ordered Ir ions remain highly geometrically frustrated, giving rise to the rapidly spin-fluctuating QSL ground state. At low temperatures, the magnetic specific heat varies as $γT + αT^2$, indicating gapless spinon excitations, and possible Dirac QSL features with linear spinon dispersion, respectively.
Submitted 24 June, 2024;
originally announced June 2024.
-
Pressure-induced exciton formation and superconductivity in platinum-based mineral Sperrylite
Authors:
Limin Wang,
Rongwei Hu,
Yash Anand,
Shanta R. Saha,
Jason R. Jeffries,
Johnpierre Paglione
Abstract:
We report a comprehensive study of Sperrylite (PtAs2), the main platinum source in natural minerals, as a function of applied pressures up to 150 GPa. While no structural phase transition was detected from pressure-dependent X-ray measurements, the unit cell volume shrinks monotonically with pressure following the third-order Birch-Murnaghan equation of state. The mildly semiconducting behavior found in pure synthesized crystals at ambient pressures becomes more insulating upon increasing applied pressure before metalizing at higher pressures, giving way to the appearance of an abrupt decrease in resistance near 3 K at pressures above 92 GPa consistent with the onset of a superconducting phase. The pressure evolution of the calculated electronic band structure reveals the same physical trend as our transport measurements, with a non-monotonic evolution explained by a hole band that is pushed below the Fermi energy and an electron band that approaches it as a function of pressure, both reaching a touching point suggestive of an excitonic state. A topological Lifshitz transition of the electronic structure and an increase in the density of states may naturally explain the onset of superconductivity in this material.
Submitted 24 June, 2024;
originally announced June 2024.
-
Absence of a Bulk Thermodynamic Phase Transition to a Density Wave Phase in UTe2
Authors:
Florian Theuss,
Avi Shragai,
Gael Grissonnanche,
Luciano Peralta,
Gregorio de la Fuente Simarro,
Ian M Hayes,
Shanta R Saha,
Yun Suk Eo,
Alonso Suarez,
Andrea Capa Salinas,
Ganesh Pokharel,
Stephen D. Wilson,
Nicholas P Butch,
Johnpierre Paglione,
B. J. Ramshaw
Abstract:
Competing and intertwined orders are ubiquitous in strongly correlated electron systems, such as the charge, spin, and superconducting orders in the high-Tc cuprates. Recent scanning tunneling microscopy (STM) measurements provide evidence for a charge density wave (CDW) that coexists with superconductivity in the heavy-fermion metal UTe2. This CDW persists up to at least 7.5 K and, as a CDW breaks the translational symmetry of the lattice, its disappearance is necessarily accompanied by a thermodynamic phase transition. Here, we report high-precision thermodynamic measurements of the elastic moduli of UTe2. We observe no signature of a phase transition in the elastic moduli down to a level of 1 part in 10^7, strongly implying the absence of bulk CDW order in UTe2. We suggest that the CDW and associated pair density wave (PDW) observed by STM may be confined to the surface of UTe2.
Submitted 20 June, 2024;
originally announced June 2024.
-
Absence of a bulk charge density wave signature in x-ray measurements of UTe$_2$
Authors:
Caitlin S. Kengle,
Dipanjan Chaudhuri,
Xuefei Guo,
Thomas A. Johnson,
Simon Bettler,
Wolfgang Simeth,
Matthew J. Krogstad,
Zahir Islam,
Sheng Ran,
Shanta R. Saha,
Johnpierre Paglione,
Nicholas P. Butch,
Eduardo Fradkin,
Vidya Madhavan,
Peter Abbamonte
Abstract:
The long-sought pair density wave (PDW) is an exotic phase of matter in which charge density wave (CDW) order is intertwined with the amplitude or phase of coexisting, superconducting order \cite{Berg2009,Berg2009b}. Originally predicted to exist in copper-oxides, circumstantial evidence for PDW order now exists in a variety of materials. Recently, scanning tunneling microscopy (STM) studies have reported evidence for a three-component charge density wave (CDW) at the surface of the heavy-fermion superconductor, UTe$_2$, persisting below its superconducting transition temperature. Here, we use hard x-ray diffraction measurements on crystals of UTe$_2$ at $T = 1.9$ K and $12$ K to search for a bulk signature of this CDW. Using STM measurements as a constraint, we calculate the expected locations of CDW superlattice peaks, and sweep a large volume of reciprocal space in search of a signature. We failed to find any evidence for a CDW near any of the expected superlattice positions in many Brillouin zones. We estimate an upper bound on the CDW lattice distortion of $u_{max} \lesssim 4 \times 10^{-3}$ Å. Our results suggest that the CDW observed in STM is either purely electronic, somehow lacking a signature in the structural lattice, or is restricted to the material surface.
Submitted 14 October, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
Formal deformations and extensions of `twisted' Lie algebras
Authors:
I. Basdouri,
E. Peyghan,
M. A. Sadraoui,
R. Saha
Abstract:
The interplay between derivations and algebraic structures has been a subject of significant interest and exploration. Inspired by Yau's twist and the Leibniz rule, we investigate the formal deformation of twisted Lie algebras by invertible derivations, herein referred to as "InvDer Lie". We define representations of InvDer Lie, elucidate cohomology structures of order 1 and 2, and identify infinitesimals as 2-cocycles. Furthermore, we explore central extensions of InvDer Lie, revealing their intricate relationship with cohomology theory.
Submitted 19 June, 2024;
originally announced June 2024.
-
On The Persona-based Summarization of Domain-Specific Documents
Authors:
Ankan Mullick,
Sombit Bose,
Rounak Saha,
Ayan Kumar Bhowmick,
Pawan Goyal,
Niloy Ganguly,
Prasenjit Dey,
Ravi Kokku
Abstract:
In an ever-expanding world of domain-specific knowledge, the increasing complexity of consuming, and storing information necessitates the generation of summaries from large information repositories. However, every persona of a domain has different requirements of information and hence their summarization. For example, in the healthcare domain, a persona-based (such as Doctor, Nurse, Patient etc.) approach is imperative to deliver targeted medical information efficiently. Persona-based summarization of domain-specific information by humans is a high cognitive load task and is generally not preferred. The summaries generated by two different humans have high variability and do not scale in cost and subject matter expertise as domains and personas grow. Further, AI-generated summaries using generic Large Language Models (LLMs) may not necessarily offer satisfactory accuracy for different domains unless they have been specifically trained on domain-specific data and can also be very expensive to use in day-to-day operations. Our contribution in this paper is two-fold: 1) We present an approach to efficiently fine-tune a domain-specific small foundation LLM using a healthcare corpus and also show that we can effectively evaluate the summarization quality using AI-based critiquing. 2) We further show that AI-based critiquing has good concordance with Human-based critiquing of the summaries. Hence, such AI-based pipelines to generate domain-specific persona-based summaries can be easily scaled to other domains such as legal, enterprise documents, education etc. in a very efficient and cost-effective manner.
Submitted 6 June, 2024;
originally announced June 2024.
-
Privacy Preserving Semi-Decentralized Mean Estimation over Intermittently-Connected Networks
Authors:
Rajarshi Saha,
Mohamed Seif,
Michal Yemini,
Andrea J. Goldsmith,
H. Vincent Poor
Abstract:
We consider the problem of privately estimating the mean of vectors distributed across different nodes of an unreliable wireless network, where communications between nodes can fail intermittently. We adopt a semi-decentralized setup, wherein to mitigate the impact of intermittently connected links, nodes can collaborate with their neighbors to compute a local consensus, which they relay to a central server. In such a setting, the communications between any pair of nodes must ensure that the privacy of the nodes is rigorously maintained to prevent unauthorized information leakage. We study the tradeoff between collaborative relaying and privacy leakage due to the data sharing among nodes and, subsequently, propose PriCER: Private Collaborative Estimation via Relaying -- a differentially private collaborative algorithm for mean estimation to optimize this tradeoff. The privacy guarantees of PriCER arise (i) implicitly, by exploiting the inherent stochasticity of the flaky network connections, and (ii) explicitly, by adding Gaussian perturbations to the estimates exchanged by the nodes. Local and central privacy guarantees are provided against eavesdroppers who can observe different signals, such as the communications amongst nodes during local consensus and (possibly multiple) transmissions from the relays to the central server. We substantiate our theoretical findings with numerical simulations. Our implementation is available at https://github.com/rajarshisaha95/private-collaborative-relaying.
Submitted 6 June, 2024;
originally announced June 2024.
-
Exploring Effects of Hyperdimensional Vectors for Tsetlin Machines
Authors:
Vojtech Halenka,
Ahmed K. Kadhim,
Paul F. A. Clarke,
Bimal Bhattarai,
Rupsa Saha,
Ole-Christoffer Granmo,
Lei Jiao,
Per-Arne Andersen
Abstract:
Tsetlin machines (TMs) have been successful in several application domains, operating with high efficiency on Boolean representations of the input data. However, Booleanizing complex data structures such as sequences, graphs, images, signal spectra, chemical compounds, and natural language is not trivial. In this paper, we propose a hypervector (HV) based method for expressing arbitrarily large sets of concepts associated with any input data. Using a hyperdimensional space to build vectors drastically expands the capacity and flexibility of the TM. We demonstrate how images, chemical compounds, and natural language text are encoded according to the proposed method, and how the resulting HV-powered TM can achieve significantly higher accuracy and faster learning on well-known benchmarks. Our results open up a new research direction for TMs, namely how to expand and exploit the benefits of operating in hyperspace, including new booleanization strategies, optimization of TM inference and learning, as well as new TM applications.
Submitted 4 June, 2024;
originally announced June 2024.
-
Flexible Agent-based Modeling Framework to Evaluate Integrated Microtransit and Fixed-route Transit Designs: Mode Choice, Supernetworks, and Fleet Simulation
Authors:
Siwei Hu,
Michael F. Hyland,
Ritun Saha,
Jacob J. Berkel,
Geoffrey Vander Veen
Abstract:
The integration of traditional fixed-route transit (FRT) and more flexible microtransit has been touted as a means of improving mobility and access to opportunity, increasing transit ridership, and promoting environmental sustainability. To help evaluate integrated FRT and microtransit public transit (PT) system (henceforth ``integrated fixed-flex PT system'') designs, we propose a high-fidelity modeling framework that provides reliable estimates for a wide range of (i) performance metrics and (ii) integrated fixed-flex PT system designs. We formulate the mode choice equilibrium problem as a fixed-point problem wherein microtransit demand is a function of microtransit performance, and microtransit performance depends on microtransit demand. We propose a detailed agent-based simulation modeling framework that includes (i) a binary logit mode choice model (private auto vs. transit), (ii) a supernetwork-based model and pathfinding algorithm for multi-modal transit path choice where the supernetwork includes pedestrian, FRT, and microtransit layers, (iii) a detailed mobility-on-demand fleet simulator called FleetPy to model the supply-demand dynamics of the microtransit service. In this paper, we illustrate the capabilities of the modeling framework by analyzing integrated fixed-flex PT system designs that vary the following design parameters: FRT frequencies and microtransit fleet size, service region structure, virtual stop coverage, and operating hours. We include case studies in downtown San Diego and Lemon Grove, California. The computational results show that the proposed modeling framework converges to a mode choice equilibrium. Moreover, the scenario results imply that introducing a new microtransit service decreases FRT ridership and requires additional subsidies, but it significantly increases job accessibility and slightly reduces total VMT.
Submitted 29 May, 2024;
originally announced May 2024.
-
Compressing Large Language Models using Low Rank and Low Precision Decomposition
Authors:
Rajarshi Saha,
Naomi Sagan,
Varun Srivastava,
Andrea J. Goldsmith,
Mert Pilanci
Abstract:
The prohibitive sizes of Large Language Models (LLMs) today make it difficult to deploy them on memory-constrained edge devices. This work introduces $\rm CALDERA$ -- a new post-training LLM compression algorithm that harnesses the inherent low-rank structure of a weight matrix $\mathbf{W}$ by approximating it via a low-rank, low-precision decomposition as $\mathbf{W} \approx \mathbf{Q} + \mathbf{L}\mathbf{R}$. Here, $\mathbf{L}$ and $\mathbf{R}$ are low rank factors, and the entries of $\mathbf{Q}$, $\mathbf{L}$ and $\mathbf{R}$ are quantized. The model is compressed by substituting each layer with its $\mathbf{Q} + \mathbf{L}\mathbf{R}$ decomposition, and the zero-shot performance of the compressed model is evaluated. Additionally, $\mathbf{L}$ and $\mathbf{R}$ are readily amenable to low-rank adaptation, consequently enhancing the zero-shot performance. $\rm CALDERA$ obtains this decomposition by formulating it as an optimization problem $\min_{\mathbf{Q},\mathbf{L},\mathbf{R}}\lVert(\mathbf{Q} + \mathbf{L}\mathbf{R} - \mathbf{W})\mathbf{X}^\top\rVert_{\rm F}^2$, where $\mathbf{X}$ is the calibration data, and $\mathbf{Q}, \mathbf{L}, \mathbf{R}$ are constrained to be representable using low-precision formats. Theoretical upper bounds on the approximation error of $\rm CALDERA$ are established using a rank-constrained regression framework, and the tradeoff between compression ratio and model performance is studied by analyzing the impact of target rank and quantization bit budget. Results illustrate that compressing LlaMa-$2$ $7$B/$13$B/$70$B and LlaMa-$3$ $8$B models using $\rm CALDERA$ outperforms existing post-training LLM compression techniques in the regime of less than $2.5$ bits per parameter. The implementation is available at: https://github.com/pilancilab/caldera.
Submitted 3 November, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Foreground removal and angular power spectrum estimation of 21 cm signal using harmonic space ILC method
Authors:
Albin Joseph,
Rajib Saha
Abstract:
Mapping the distribution of neutral atomic hydrogen (HI) in the Universe through its 21 cm emission line provides a powerful cosmological probe to map the large-scale structures and shed light on various cosmological phenomena. The Baryon Acoustic Oscillations at low redshifts can potentially be probed by sensitive HI intensity mapping experiments and constrain the properties of dark energy. However, the 21 cm signal detection faces formidable challenges due to the dominance of various astrophysical foregrounds, which can be several orders of magnitude stronger. Our current work introduces a novel and model-independent Internal Linear Combination (ILC) method in harmonic space using the principal components of the 21 cm signal for accurate foreground removal and power spectrum estimation. We estimate the principal components by incorporating prior knowledge of the theoretical 21 cm covariance matrix. We test our methodology by detailed simulations of radio observations, incorporating synchrotron emission, free-free radiation, extragalactic point sources, and thermal noise. We estimate the full sky 21 cm angular power spectrum after application of a mask on the full sky cleaned 21 cm signal by using the mode-mode coupling matrix. These full sky estimates of angular spectra can be directly used to measure the cosmological parameters. For the first time, we demonstrate the effectiveness of a foreground model-independent ILC method in harmonic space to reconstruct the 21 cm signal.
Submitted 31 January, 2025; v1 submitted 4 May, 2024;
originally announced May 2024.
-
Swarm UAVs Communication
Authors:
Arindam Majee,
Rahul Saha,
Snehasish Roy,
Srilekha Mandal,
Sayan Chatterjee
Abstract:
Advances in cyber-physical systems have opened new possibilities in disaster management and rescue operations, where the use of UAVs is very promising. UAVs, mainly quadcopters, are small and have limited payload capacity, and a single UAV cannot traverse the whole area. Hence multiple UAVs, or swarms of UAVs, come into the picture, managing the entire payload in a modular and equiproportional manner. In this work we explore a broad range of topics related to UAVs, with quadcopters as the main focus. We cover the types of quadcopters, their flying strategies, communication protocols, architecture, and control techniques, followed by swarm behaviour in nature and in UAVs. Swarm behaviour and a few swarm optimization algorithms are explored. Swarm architecture and communication within swarm UAV networks also receive special attention. In disaster management the UAV swarm network must search a large area, which requires a proper path planning algorithm. We discuss the existing path planning algorithms and their advantages and disadvantages in detail. Formation maintenance of the swarm network is an important issue, which is explored through the leader-follower technique. The wireless path loss is modelled using the Friis and ground-ray reflection models. Using these path loss models we create the link budget and simulate how communication link performance varies with distance.
Submitted 24 February, 2024;
originally announced May 2024.
-
Accurate and Unbiased Reconstruction of CMB B Mode using Deep Learning
Authors:
Srikanta Pal,
Sarvesh Kumar Yadav,
Rajib Saha,
Tarun Souradeep
Abstract:
An ingeniously designed autoencoder (PrimeNet) using simulated observations of future generation ECHO satellite mission recovers CMB B mode map, angular spectrum for multipoles $\ell \lesssim 9$ and tensor to scalar ratio $r$ {\it limited only by cosmic variance down to $r= 0.0001$ and below}. We use diverse, realistically complex and detailed foreground models. PrimeNet predicts accurate results even when data with $r=0$ are tested which were not used in training, implying robust and efficient predictive power. The work eliminates a major bottleneck of weak CMB B mode reconstruction and takes a leap forward for understanding fundamental physics of the primordial Universe.
Submitted 28 April, 2024;
originally announced April 2024.
-
Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence
Authors:
Ripon Kumar Saha,
Dehao Qin,
Nianyi Li,
Jinwei Ye,
Suren Jayasuriya
Abstract:
Tackling image degradation due to atmospheric turbulence, particularly in dynamic environments, remains a challenge for long-range imaging systems. Existing techniques have been primarily designed for static scenes or scenes with small motion. This paper presents the first segment-then-restore pipeline for restoring videos of dynamic scenes in turbulent environments. We leverage mean optical flow with an unsupervised motion segmentation method to separate dynamic and static scene components prior to restoration. After camera shake compensation and segmentation, we introduce foreground/background enhancement leveraging the statistics of turbulence strength and a transformer model trained on a novel noise-based procedural turbulence generator for fast dataset augmentation. Benchmarked against existing restoration methods, our approach restores most of the geometric distortion and enhances sharpness for videos. We make our code, simulator, and data publicly available to advance the field of video restoration from turbulence: riponcs.github.io/TurbSegRes
Submitted 21 April, 2024;
originally announced April 2024.
-
Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism
Authors:
Trilokesh Ranjan Sarkar,
Nilanjan Das,
Pralay Sankar Maitra,
Bijoy Some,
Ritwik Saha,
Orijita Adhikary,
Bishal Bose,
Jaydip Sen
Abstract:
This technical report delves into an in-depth exploration of adversarial attacks specifically targeted at Deep Neural Networks (DNNs) utilized for image classification. The study also investigates defense mechanisms aimed at bolstering the robustness of machine learning models. The research focuses on comprehending the ramifications of two prominent attack methodologies: the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner (CW) approach. These attacks are examined concerning three pre-trained image classifiers: Resnext50_32x4d, DenseNet-201, and VGG-19, utilizing the Tiny-ImageNet dataset. Furthermore, the study proposes the robustness of defensive distillation as a defense mechanism to counter FGSM and CW attacks. This defense mechanism is evaluated using the CIFAR-10 dataset, where CNN models, specifically resnet101 and Resnext50_32x4d, serve as the teacher and student models, respectively. The proposed defensive distillation model exhibits effectiveness in thwarting attacks such as FGSM. However, it is noted to remain susceptible to more sophisticated techniques like the CW attack. The document presents a meticulous validation of the proposed scheme. It provides detailed and comprehensive results, elucidating the efficacy and limitations of the defense mechanisms employed. Through rigorous experimentation and analysis, the study offers insights into the dynamics of adversarial attacks on DNNs, as well as the effectiveness of defensive strategies in mitigating their impact.
Submitted 5 April, 2024;
originally announced April 2024.
-
Peculiar magnetic and magneto-transport properties in a non-centrosymmetric self-intercalated van der Waals ferromagnet Cr5Te8
Authors:
Banik Rai,
Sandip Kumar Kuila,
Rana Saha,
Sankalpa Hazra,
Chandan De,
Jyotirmoy Sau,
Venkatraman Gopalan,
Partha Pratim Jana,
Stuart S. P. Parkin,
Nitesh Kumar
Abstract:
Trigonal Cr$_5$Te$_8$, a self-intercalated van der Waals ferromagnet with an out-of-plane magnetic anisotropy, has long been known to crystallize in a centrosymmetric structure. However, optical second harmonic generation experiments, together with comprehensive structural analysis, indicate that this compound rather adopts a non-centrosymmetric structure. Lorentz transmission electron microscopy reveals the presence of Néel-type skyrmions, consistent with its non-centrosymmetric structure. A large anomalous Hall conductivity of 102 ohm$^{-1}$cm$^{-1}$ at low temperature stems from intrinsic origin, which is larger than any previously reported values in the bulk Cr-Te system. Notably, spontaneous topological Hall resistivity arising from the skyrmionic phase has been observed. Our findings not only elucidate the unique magnetic and magneto-transport properties of non-centrosymmetric trigonal Cr$_5$Te$_8$, but also open new avenues for investigating the effects of broken inversion symmetry on material properties and their potential applications.
Submitted 4 February, 2025; v1 submitted 4 April, 2024;
originally announced April 2024.
-
A cohomological study of modified Rota-Baxter associative algebras with derivations
Authors:
Imed Basdouri,
Sami Benabdelhafidh,
Mohamed Amin Sadraoui,
Ripan Saha
Abstract:
This paper presents a cohomological study of modified Rota-Baxter associative algebras in the presence of derivations. The Modified Rota-Baxter operator, which is a modified version and closely related to the classical Rota-Baxter operator, has garnered significant attention due to its applications in various mathematical and physical contexts. In this study, we define a cohomology theory and also investigate a one-parameter formal deformation theory and abelian extensions of modified Rota-Baxter associative algebras under the influence of derivations.
Submitted 25 June, 2024; v1 submitted 31 March, 2024;
originally announced April 2024.