-
COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation
Authors:
Raghvendra Kumar,
S. A. Mohammed Salman,
Aryan Sahu,
Tridib Nandi,
Pragathi Y. P.,
Sriparna Saha,
Jose G. Moreno
Abstract:
Despite progress in comment-aware multimodal and multilingual summarization for English and Chinese, research in Indian languages remains limited. This study addresses this gap by introducing COSMMIC, a pioneering comment-sensitive multimodal, multilingual dataset featuring nine major Indian languages. COSMMIC comprises 4,959 article-image pairs and 24,484 reader comments, with ground-truth summaries available in all included languages. Our approach enhances summaries by integrating reader insights and feedback. We explore summarization and headline generation across four configurations: (1) using article text alone, (2) incorporating user comments, (3) utilizing images, and (4) combining text, comments, and images. To assess the dataset's effectiveness, we employ state-of-the-art language models such as Llama 3 and GPT-4. We conduct a comprehensive study to evaluate different component combinations, including identifying supportive comments, filtering out noise with a dedicated IndicBERT-based comment classifier, and extracting valuable insights from images with a multilingual CLIP-based classifier. This helps determine the most effective configurations for natural language generation (NLG) tasks. Unlike many existing datasets that are either text-only or lack user comments in multimodal settings, COSMMIC uniquely integrates text, images, and user feedback. This holistic approach bridges gaps in Indian language resources, advancing NLP research and fostering inclusivity.
Submitted 18 June, 2025;
originally announced June 2025.
-
XSPECT on-board XPoSat: Calibration and First Results
Authors:
Rwitika Chatterjee,
Koushal Vadodariya,
Radhakrishna Vatedka,
Vivek Kumar Agrawal,
Anurag Tyagi,
Kiran M Jayasurya,
Shyam Prakash V P,
Ramadevi M C,
Vaishali S
Abstract:
XPoSat is India's first X-ray spectro-polarimetry mission, consisting of two co-aligned instruments, a polarimeter (POLIX) and a spectrometer (XSPECT), to study the X-ray emission from celestial sources. Since polarimetry is a photon-hungry technique, the mission is designed to observe sources for long integration times (from a few days to weeks). This provides a unique opportunity, enabling XSPECT to carry out long-term monitoring of sources and study their spectro-temporal evolution. To ensure that the instrument is able to fulfill its scientific objectives, it was extensively calibrated on-ground. Post launch, these calibrations were validated using on-board observations. Additionally, some aspects of the instrument, such as alignment and effective area, were also derived and fine-tuned from in-flight data. In this paper, we describe the calibration of the XSPECT instrument in detail, including some initial results derived from its data to establish its capabilities.
Submitted 11 June, 2025;
originally announced June 2025.
-
LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation
Authors:
Piyush Tiwary,
Kinjawl Bhattacharyya,
Prathosh A. P
Abstract:
Medical image segmentation models often struggle to generalize across different domains due to various reasons. Domain Generalization (DG) methods overcome this either through representation learning or data augmentation (DAug). While representation learning methods seek domain-invariant features, they often rely on ad-hoc techniques and lack formal guarantees. DAug methods, which enrich model representations through synthetic samples, have shown comparable or superior performance to representation learning approaches. We propose LangDAug, a novel $\textbf{Lang}$evin $\textbf{D}$ata $\textbf{Aug}$mentation for multi-source domain generalization in 2D medical image segmentation. LangDAug leverages Energy-Based Models (EBMs) trained via contrastive divergence to traverse between source domains, generating intermediate samples through Langevin dynamics. Theoretical analysis shows that LangDAug induces a regularization effect, and for GLMs, it upper-bounds the Rademacher complexity by the intrinsic dimensionality of the data manifold. Through extensive experiments on Fundus segmentation and 2D MRI prostate segmentation benchmarks, we show that LangDAug outperforms state-of-the-art domain generalization methods and effectively complements existing domain-randomization approaches. The codebase for our method is available at https://github.com/backpropagator/LangDAug.
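The Langevin sampling step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the LangDAug implementation: the trained energy-based model is replaced here by an assumed toy quadratic energy E(x) = 0.5 (x - mu)^2 in one dimension, and the names `grad_energy`, `langevin_chain`, `step`, and `n_steps` are hypothetical. The sketch only shows how the update x <- x - (step/2) * dE/dx + sqrt(step) * noise produces a chain of intermediate samples that drifts from a starting point toward the low-energy region.

```python
import math
import random


def grad_energy(x, mu):
    """Gradient of the assumed toy energy E(x) = 0.5 * (x - mu)^2."""
    return x - mu


def langevin_chain(x0, mu, step=0.01, n_steps=2000, rng=None):
    """Run unadjusted Langevin dynamics and return every visited state.

    Each update takes a half-step down the energy gradient and adds
    Gaussian noise scaled by sqrt(step).
    """
    rng = rng or random.Random(0)
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step * grad_energy(x, mu) + math.sqrt(step) * noise
        path.append(x)  # intermediate samples between start and target region
    return path


if __name__ == "__main__":
    path = langevin_chain(x0=5.0, mu=0.0)
    print("start:", path[0], "end:", round(path[-1], 3))
```

In this toy setting the starting point plays the role of a sample from one domain and the energy minimum the role of another; in the paper, the energy is learned, so the gradient would come from the trained model rather than a closed-form expression.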
Submitted 26 May, 2025;
originally announced May 2025.
-
Latent Mamba Operator for Partial Differential Equations
Authors:
Karn Tiwari,
Niladri Dutta,
N M Anoop Krishnan,
Prathosh A P
Abstract:
Neural operators have emerged as powerful data-driven frameworks for solving Partial Differential Equations (PDEs), offering significant speedups over numerical methods. However, existing neural operators struggle with scalability in high-dimensional spaces, incur high computational costs, and face challenges in capturing continuous and long-range dependencies in PDE dynamics. To address these limitations, we introduce the Latent Mamba Operator (LaMO), which integrates the efficiency of state-space models (SSMs) in latent space with the expressive power of kernel integral formulations in neural operators. We also establish a theoretical connection between SSMs and the kernel integral of neural operators. In extensive experiments across diverse PDE benchmarks on regular grids, structured meshes, and point clouds, covering solid and fluid physics datasets, LaMO achieves consistent state-of-the-art (SOTA) performance, with a 32.3% improvement over existing baselines in solution operator approximation, highlighting its efficacy in modeling complex PDE solutions.
Submitted 28 May, 2025; v1 submitted 25 May, 2025;
originally announced May 2025.
-
Connected dom-forcing sets in graphs
Authors:
Susanth P,
Charles Dominic,
Premodkumar K P
Abstract:
In a graph G, a dominating set Df ⊆ V(G) is called a dom-forcing set if the subgraph induced by Df forms a zero forcing set. The minimum cardinality of such a set is known as the dom-forcing number of the graph G, denoted by Fd(G). A connected dom-forcing set of a graph G is a dom-forcing set of G that induces a connected subgraph of G. The connected dom-forcing number of G, Fcd(G), is the minimum size of a connected dom-forcing set. This study examines the properties and characteristics of the connected dom-forcing number Fcd(G). Furthermore, it seeks to determine Fcd(G) exactly for several well-known graphs and their graph products.
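The definitions above can be checked by brute force on small graphs. The following is a minimal sketch, assuming the usual zero forcing color-change rule (a colored vertex with exactly one uncolored neighbor forces that neighbor) and reading "dom-forcing set" as a vertex set that is both dominating and zero forcing. The function names and the adjacency-dictionary representation are illustrative choices, not taken from the paper, and the exhaustive search is only practical for very small graphs.

```python
from itertools import combinations


def is_dominating(adj, s):
    """Every vertex is in s or has a neighbor in s."""
    return all(v in s or adj[v] & s for v in adj)


def is_zero_forcing(adj, s):
    """Apply the color-change rule until no more vertices can be forced."""
    colored = set(s)
    changed = True
    while changed:
        changed = False
        for v in list(colored):
            uncolored = adj[v] - colored
            if len(uncolored) == 1:  # v forces its single uncolored neighbor
                colored |= uncolored
                changed = True
    return len(colored) == len(adj)


def is_connected(adj, s):
    """The subgraph induced by s is connected (depth-first search inside s)."""
    s = set(s)
    if not s:
        return False
    start = next(iter(s))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & s:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == s


def min_dom_forcing(adj, connected=False):
    """Smallest size of a (connected) dom-forcing set, by exhaustive search."""
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for cand in combinations(vertices, k):
            s = set(cand)
            if (is_dominating(adj, s) and is_zero_forcing(adj, s)
                    and (not connected or is_connected(adj, s))):
                return k
    return None


def path_graph(n):
    """Path on vertices 0..n-1 as a dict of neighbor sets."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


if __name__ == "__main__":
    p5 = path_graph(5)
    print(min_dom_forcing(p5), min_dom_forcing(p5, connected=True))
```

On the path with five vertices, the set {0, 3} is dominating and zero forcing but not connected, while {1, 2, 3} is a smallest connected one, so the two numbers differ on this example.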
Submitted 16 May, 2025;
originally announced May 2025.
-
GOTHAM: Graph Class Incremental Learning Framework under Weak Supervision
Authors:
Aditya Hemant Shahane,
Prathosh A. P,
Sandeep Kumar
Abstract:
Graphs are growing rapidly, along with the number of distinct label categories associated with them. Applications like e-commerce, healthcare, recommendation systems, and various social media platforms are rapidly moving towards graph representation of data due to its ability to capture both structural and attribute information. One crucial task in graph analysis is node classification, where unlabeled nodes are categorized into predefined classes. In practice, novel classes appear incrementally, sometimes with just a few labels (seen classes) or even without any labels (unseen classes), either because they are new or haven't been explored much. Traditional methods assume abundant labeled data for training, which isn't always feasible. We investigate a broader objective: Graph Class Incremental Learning under Weak Supervision (GCL), addressing this challenge by meta-training on base classes with limited labeled instances. During the incremental streams, novel classes can have few-shot or zero-shot representation. Our proposed framework GOTHAM efficiently accommodates these unlabeled nodes by finding the closest prototype representation, serving as class representatives in the attribute space. For Text-Attributed Graphs (TAGs), our framework additionally incorporates semantic information to enhance the representation. By employing teacher-student knowledge distillation to mitigate forgetting, GOTHAM achieves promising results across various tasks. Experiments on datasets such as Cora-ML, Amazon, and OGBN-Arxiv showcase the effectiveness of our approach in handling evolving graph data under limited supervision. The code repository is available at https://github.com/adityashahane10/GOTHAM--Graph-based-Class-Incremental-Learning-Framework-under-Weak-Supervision.
Submitted 7 April, 2025;
originally announced April 2025.
-
Customer Analytics using Surveillance Video
Authors:
Earnest Paul Ijjina,
Aniruddha Srinivas Joshi,
Goutham Kanahasabai,
Keerthi Priyanka P
Abstract:
The analysis of sales information is a vital step in designing an effective marketing strategy. This work proposes a novel approach to analyse the shopping behaviour of customers to identify their purchase patterns. An extended version of the Multi-Cluster Overlapping k-Means Extension (MCOKE) algorithm with a weighted k-Means algorithm is utilized to map customers to the garments of interest. The age and gender traits of the customer, the time spent, and the expressions exhibited while selecting garments for purchase are utilized to associate a customer or a group of customers with the garments they are interested in. Such a study of the customer base of a retail business may help in inferring the products of interest to its consumers and enable it to develop effective business strategies, thus ensuring customer satisfaction, loyalty, increased sales and profits.
Submitted 1 March, 2025;
originally announced March 2025.
-
Performance evaluation of non-uniform sensor spacing in a linear array configuration for MUSIC algorithm
Authors:
Pradeep Dheerendra,
Sumit Saraogi,
Palanisamy P.,
Kalyanasundaram N
Abstract:
In this paper, the performance of non-uniform spacing of sensors is evaluated for the MUSIC algorithm, which estimates the direction of arrival (DOA) of a narrowband plane wave impinging on an array of sensors. Unlike the uniform sensor spacing arrangement, where sensors are equidistant (spaced at half the wavelength), we consider non-uniform spacing for the arrangement of sensors, where the distance between consecutive sensors increases progressively. We observe that, at identical array length, the non-uniform sensor spacing configuration can provide similar or better accuracy in DOA estimation with fewer sensors than the uniform sensor spacing configuration.
Submitted 25 January, 2025;
originally announced January 2025.
-
CoNOAir: A Neural Operator for Forecasting Carbon Monoxide Evolution in Cities
Authors:
Sanchit Bedi,
Karn Tiwari,
Prathosh A. P.,
Sri Harsha Kota,
N. M. Anoop Krishnan
Abstract:
Carbon Monoxide (CO) is a dominant pollutant in urban areas due to the energy generation from fossil fuels for industry, automobile, and domestic requirements. Forecasting the evolution of CO in real-time can enable the deployment of effective early warning systems and intervention strategies. However, the computational cost associated with the physics and chemistry-based simulation makes it prohibitive to implement such a model at the city and country scale. To address this challenge, we present a machine learning model based on a neural operator, namely, Complex Neural Operator for Air Quality (CoNOAir), that can effectively forecast CO concentrations. We demonstrate this by developing a country-level model for short-term (hourly) and long-term (72-hour) forecasts of CO concentrations. Our model outperforms state-of-the-art models such as Fourier neural operators (FNO) and provides reliable predictions for both short and long-term forecasts. We further analyse the capability of the model to capture extreme events and generate forecasts for cities in India. Interestingly, we observe that the model predicts the next-hour CO concentrations with $R^2$ values greater than 0.95 for all the cities considered. The deployment of such a model can greatly assist the governing bodies in providing early warning, planning intervention strategies, and developing effective strategies by considering several what-if scenarios. Altogether, the present approach could provide a fillip to real-time predictions of CO pollution in cities.
Submitted 13 January, 2025; v1 submitted 10 January, 2025;
originally announced January 2025.
-
GraPE: A Generate-Plan-Edit Framework for Compositional T2I Synthesis
Authors:
Ashish Goswami,
Satyam Kumar Modi,
Santhosh Rishi Deshineni,
Harman Singh,
Prathosh A. P,
Parag Singla
Abstract:
Text-to-image (T2I) generation has seen significant progress with diffusion models, enabling generation of photo-realistic images from text prompts. Despite this progress, existing methods still face challenges in following complex text prompts, especially those requiring compositional and multi-step reasoning. Given such complex instructions, SOTA models often make mistakes in faithfully modeling object attributes and the relationships among them. In this work, we present an alternate paradigm for T2I synthesis, decomposing the task of complex multi-step generation into three steps: (a) Generate: we first generate an image using existing diffusion models; (b) Plan: we make use of Multi-Modal LLMs (MLLMs) to identify the mistakes in the generated image, expressed in terms of individual objects and their properties, and produce a sequence of corrective steps in the form of an edit-plan; (c) Edit: we make use of existing text-guided image editing models to sequentially execute our edit-plan over the generated image to get the desired image, which is faithful to the original instruction. Our approach derives its strength from the fact that it is modular in nature, is training free, and can be applied over any combination of image generation and editing models. As an added contribution, we also develop a model capable of compositional editing, which further helps improve the overall accuracy of our proposed approach. Our method flexibly trades inference-time compute for performance on compositional text prompts. We perform extensive experimental evaluation across 3 benchmarks and 10 T2I models, including DALLE-3 and the latest, SD-3.5-Large. Our approach not only improves the performance of the SOTA models by up to 3 points, it also reduces the performance gap between weaker and stronger models. Project page: https://dair-iitd.github.io/GraPE/
Submitted 11 March, 2025; v1 submitted 8 December, 2024;
originally announced December 2024.
-
Bubble dynamics in a cavitating venturi
Authors:
Premchand V Chandra,
Anuja Vijayan,
Pradeep Kumar P
Abstract:
Cryogenic fluids have extensive applications as fuel for launch vehicles in space applications and research. The physics of cryogenic flows is highly complex due to the sensitive nature of the phase transformation from liquid to bubbly liquid and vapor, eventually resulting in cavitating flows at ambient temperature owing to the very low boiling point of cryogenic fluids, which leads us to classify such flows under the multi-phase flow physics regime. This work elucidates the modeling of bubbly flow for cryogenic fluids such as liquid nitrogen in a converging-diverging venturi-like flow device known as a cavitating venturi, a passive flow control metering device. The numerical works in the literature are usually limited to modeling isothermal bubbly flows such as water, without energy equations, because there is no interface heat transfer, as the latent heat of vaporization of water is higher, unlike cryogenic fluids, which are sensitive to phase change at ambient conditions. So, to realize an appropriate model for cryogenic bubbly flows such as liquid nitrogen flow, the effect of heat transfer at the interface and convective heat transfer from the surrounding liquid to the traversing bubble needs to be included. Numerical modeling is carried out using an in-house code based on a finite-difference method. The numerical results show the importance of including the heat transport equation, due to convection and at the bubble-fluid interface, as a significant source term for the bubble dynamics. The work is supported by computational simulation using a commercial CFD package for 2-dimensional simulations to predict a characterizing parameter, namely the cavitation length. A limited flow visualization experiment using a high-speed camera is performed to study the cavitating zone length.
Submitted 26 December, 2024; v1 submitted 6 December, 2024;
originally announced December 2024.
-
UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors
Authors:
Suhas Srinath,
Aditya Chandrasekar,
Hemang Jamadagni,
Rajiv Soundararajan,
Prathosh A P
Abstract:
With the rise of marine exploration, underwater imaging has gained significant attention as a research topic. Underwater video enhancement has become crucial for real-time computer vision tasks in marine exploration. However, most existing methods focus on enhancing individual frames and neglect video temporal dynamics, leading to visually poor enhancements. Furthermore, the lack of ground-truth references limits the use of abundant available underwater video data in many applications. To address these issues, we propose a two-stage framework for enhancing underwater videos. The first stage uses a denoising diffusion probabilistic model to learn a generative prior from unlabeled data, capturing robust and descriptive feature representations. In the second stage, this prior is incorporated into a physics-based image formulation for spatial enhancement, while also enforcing temporal consistency between video frames. Our method enables real-time and computationally-efficient processing of high-resolution underwater videos at lower resolutions, and offers efficient enhancement in the presence of diverse water-types. Extensive experiments on four datasets show that our approach generalizes well and outperforms existing enhancement methods. Our code is available at github.com/suhas-srinath/undive.
Submitted 8 November, 2024;
originally announced November 2024.
-
Dom-forcing sets in graphs
Authors:
Susanth P,
Charles Dominic,
Premodkumar K P
Abstract:
A dominating set $D_{f}\subseteq V(G)$ of vertices in a graph $G$ is called a \emph{dom-forcing set} if the sub-graph induced by $\langle D_{f} \rangle$ forms a zero forcing set. The minimum cardinality of such a set is known as the dom-forcing number of the graph $G$, denoted by $F_{d}(G)$. This article explores the dom-forcing number of a graph $G$. Additionally, it determines $F_{d}(G)$ exactly for certain well-known graphs.
Submitted 1 November, 2024;
originally announced November 2024.
-
Non-equillibrium ultrafast optical excitation as a stimulus for ultra-small field-free magnetic skyrmions in ferrimagnetic GdFeCo
Authors:
Syam Prasad P,
Jyoti Ranjan Mohanty
Abstract:
Generating and manipulating magnetic skyrmions at ultrafast time scales is essential for future skyrmion-based racetrack memory and logic gate applications. Using atomistic spin dynamics simulations, we demonstrate the nucleation of ultra-small field-free magnetic skyrmions in amorphous GdFeCo at picosecond timescales by femtosecond laser heating. The ultrafast nature of laser heating and subsequent cooling from a high-temperature state is crucial for forming magnetic skyrmions. Magnon localization and magnon coalescence are the key driving mechanisms responsible for stabilizing the magnetic skyrmions at zero field conditions. The polarization and, hence, the topological charge can be switched by exploiting the all-optical switching observed in GdFeCo. The skyrmion sizes and numbers can be controlled by varying the pulse width and fluence of the incident laser pulses. Applying an external magnetic field provides an additional degree of freedom to tune the skyrmion radius during the ultrafast optical creation of magnetic skyrmions. Our results provide a detailed understanding of the ultrafast creation of magnetic skyrmions using femtosecond laser pulses, a vital step in advancing next-generation skyrmion-based memory technologies.
Submitted 17 June, 2024;
originally announced June 2024.
-
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
Authors:
Pranoy Panda,
Ankush Agarwal,
Chaitanya Devaguptapu,
Manohar Kaul,
Prathosh A P
Abstract:
Given unstructured text, Large Language Models (LLMs) are adept at answering simple (single-hop) questions. However, as the complexity of the questions increases, the performance of LLMs degrades. We believe this is due to the overhead associated with understanding the complex question, followed by filtering and aggregating unstructured information in the raw text. Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text, aiming to provide a structured overview that simplifies information processing. However, this simplistic approach is query-agnostic, and the extracted facts are ambiguous as they lack context. To address these drawbacks and to enable LLMs to answer complex (multi-hop) questions with ease, we propose to use a knowledge graph (KG) that is context-aware and is distilled to contain query-relevant information. The use of our compressed distilled KG as input to the LLM results in our method utilizing up to $67\%$ fewer tokens to represent the query-relevant information present in the supporting documents, compared to the state-of-the-art (SoTA) method. Our experiments show consistent improvements over the SoTA across several metrics (EM, F1, BERTScore, and Human Eval) on two popular benchmark datasets (HotpotQA and MuSiQue).
Submitted 10 June, 2024;
originally announced June 2024.
-
Astrosat view of GX 339-4 during the peak of the recent outburst
Authors:
Shyam Prakash V. P.,
Ramadevi M. C.,
Vivek K. Agrawal
Abstract:
We present the spectral and timing analyses of \textit{AstroSat} observations of the Black Hole X-ray Binary GX 339-4 when the source was close to the peak of its outburst in 2024. We find that both the spectral and timing variability of the source indicate that it was in its steep power law (SPL) state during the observations. We used phenomenological and physical models to understand the physics and geometry of accretion during this spectral state of the source. Spectral fits indicate the presence of an accretion disc with a temperature of $kT\sim$0.82 keV and a hot corona with a spectral index of $\sim$2.2, along with a significant contribution from iron line emission from the accretion disc. Strong QPOs were detected at $\sim$4.6 Hz in the Power Density Spectra of the source, along with a harmonic feature. Time and phase lags at the QPO frequency are studied, and we find a hard lag at the QPO frequency and a soft lag at the harmonic frequency. We estimate the spin of the black hole to be $a = 0.99 \pm 0.003$. The height of the coronal region is estimated to be about 2.5 $R_{g}$, which is similar to that observed during the previous outbursts of the source. We discuss a possible physical scenario for the observed spectral and timing features exhibited by the source.
Submitted 9 May, 2024;
originally announced May 2024.
-
Gödel Number based Clustering Algorithm with Decimal First Degree Cellular Automata
Authors:
Vicky Vikrant,
Narodia Parth P,
Kamalika Bhattacharjee
Abstract:
In this paper, a decimal first degree cellular automata (FDCA) based clustering algorithm is proposed where clusters are created based on reachability. Cyclic spaces are created, and configurations which are in the same cycle are treated as the same cluster. Here, real-life data objects are encoded into decimal strings using Gödel number based encoding. The benefit of the scheme is that it reduces the encoded string length while maintaining the properties of the features. Candidate CA rules are identified based on some theoretical criteria such as self-replication and information flow. An iterative algorithm is developed to generate the desired number of clusters over three stages. The results of the clustering are evaluated based on benchmark clustering metrics such as the Silhouette score, Davies-Bouldin index, Calinski-Harabasz index and Dunn index. In comparison with the existing state-of-the-art clustering algorithms, our proposed algorithm gives better performance.
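For readers unfamiliar with Gödel numbering, the sketch below shows the classical prime-power construction, which maps a sequence of non-negative integers to a single integer and back. It is an illustration of the general idea only: the abstract does not specify the paper's encoding, which may well differ (for instance, in how feature values are mapped to decimal strings), and the function names here are hypothetical. Exponents are shifted by one so that zero values survive the round trip.

```python
def first_primes(n):
    """Return the first n primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def godel_encode(values):
    """Encode non-negative integers as the product of p_i ** (v_i + 1)."""
    number = 1
    for p, v in zip(first_primes(len(values)), values):
        number *= p ** (v + 1)
    return number


def is_prime(n):
    """Simple primality test, adequate for the small primes used here."""
    return n >= 2 and all(n % q for q in range(2, int(n ** 0.5) + 1))


def godel_decode(number):
    """Recover the sequence by reading off prime exponents in order."""
    values = []
    p = 2
    while number > 1:
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        if exponent == 0:
            raise ValueError("not a valid encoding")
        values.append(exponent - 1)
        p += 1
        while not is_prime(p):  # advance to the next prime
            p += 1
    return values


if __name__ == "__main__":
    code = godel_encode([1, 2, 3])  # 2**2 * 3**3 * 5**4
    print(code, godel_decode(code))
```

The classical construction produces integers that grow quickly with sequence length, so a practical encoding for clustering would need a more compact variant, as the abstract's emphasis on reduced string length suggests.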
Submitted 8 May, 2024;
originally announced May 2024.
-
A Weitzenböck formula on Sasakian holomorphic bundles
Authors:
Luis E. Portilla P.,
Eric Loubeau,
Henrique N. Sá Earp
Abstract:
This work seeks to advance the understanding of the smooth structure of the moduli space of self-dual contact instantons (SDCI) on Sasakian 7-manifolds M. A neighborhood of a smooth point of M is locally modeled on the first cohomological group of an elliptic complex (1.4). There is a cohomological obstruction to the smoothness of the moduli space, in terms of a second basic cohomological group. In this paper, we study conditions under which this obstruction disappears, by computing a Weitzenböck formula and using a Bochner-type method to obtain a vanishing theorem. Given an SDCI on a Sasakian bundle E, we find sufficient conditions for the vanishing of the obstruction in the positivity of a couple of operators R and F depending on the curvatures of the connection and the Riemann curvature of the Sasakian metric g. In particular, we find that if M is transversely Ricci positive and F positive, the moduli space of SDCI must be smooth. However, in general, the operator F is not positive definite and we describe bundles over the Stiefel manifold for which it is the case. Finally, we show that when the energy of the curvature is less than the first non-zero eigenvalue of RicT the obstruction vanishes.
Submitted 22 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
AstroSat View of the Neutron Star Low-mass X-Ray Binary GX 5-1
Authors:
Shyam Prakash V P,
Vivek K. Agrawal
Abstract:
We present the spectral and timing study of the bright NS-LMXB GX 5-1 using \textit{AstroSat}/\textit{LAXPC} and \textit{SXT} observations conducted in the year 2018. During the observation, the source traces out the complete HB and NB of the Z-track in the HID. Understanding the spectral and temporal evolution of the source along the 'Z' track can probe the accretion process in the vicinity of a neutron star. Spectral analysis was performed in the 0.7-20 keV energy range for different segments in the HID using a multi-temperature disc black body with an average temperature, $kT_{in} \sim$0.46 and a thermal Comptonization model. It is found that the optical depth of the corona drops from $\sim$6.68 in HB to $\sim$2.74 in NB. The timing analysis using the LAXPC instrument indicates the presence of quasi-periodic oscillations in HB, NB, and the hard apex of the Z-track. The observed QPO frequencies are similar to the characteristic frequencies of horizontal branch and normal branch oscillations. The HBO frequency increases from $\sim$12 to $\sim$46 Hz towards the hard apex. The timing studies conducted in the soft and hard bands indicate the association of HBO and NBO origin with the non-thermal component. Further research could explore the implications of this relationship for understanding the dynamics of accretion onto neutron stars.
Submitted 18 November, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Observations of the Crab Nebula with MACE (Major Atmospheric Cherenkov Experiment)
Authors:
Borwankar C.,
Sharma M.,
Hariharan J.,
Venugopal K.,
Godambe S.,
Mankuzhyil N.,
Chandra P.,
Khurana M.,
Pathania A.,
Chouhan N.,
Dhar V. K.,
Thubstan R.,
Norlha S.,
Keshavananda,
Sarkar D.,
Dar Z. A.,
Kotwal S. V.,
Godiyal S.,
Kushwaha C. P.,
Singh K. K.,
Das M. P.,
Tolamatti A.,
Ghosal B.,
Chanchalani K.,
Pandey P.
, et al. (10 additional authors not shown)
Abstract:
The Major Atmospheric Cherenkov Experiment (MACE) is a large (21 m) Imaging Atmospheric Cherenkov Telescope (IACT) installed at an altitude of 4270 m above sea level at Hanle, Ladakh in northern India. Here we report the detection of Very High Energy (VHE) gamma-ray emission from the Crab Nebula above 80 GeV. We analysed ~15 hours of data collected at low zenith angle between November 2022 and February 2023. The energy spectrum is well described by a log-parabola function with a flux of ~(3.46 $\pm$ 0.26$_{\rm stat}$) $\times$ 10$^{-10}$ TeV$^{-1}$ cm$^{-2}$ s$^{-1}$ at 400 GeV, a spectral index of 2.09 $\pm$ 0.06$_{\rm stat}$, and a curvature parameter of 0.08 $\pm$ 0.07$_{\rm stat}$. The gamma-rays are detected in an energy range spanning from 80 GeV to ~5 TeV. The energy resolution improves from ~34% at an analysis energy threshold of 80 GeV to ~21% above 1 TeV. The daily light curve and the spectral energy distribution obtained for the Crab Nebula are in agreement with previous measurements, considering statistical and systematic uncertainties.
Submitted 2 April, 2024;
originally announced April 2024.
-
Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective
Authors:
Subhodip Panda,
Shashwat Sourav,
Prathosh A. P
Abstract:
In order to adhere to regulatory standards governing individual data privacy and safety, machine learning models must systematically eliminate information derived from specific subsets of a user's training data that can no longer be utilized. The emerging discipline of Machine Unlearning has arisen as a pivotal area of research, facilitating the process of selectively discarding information designated to specific sets or classes of data from a pre-trained model, thereby eliminating the necessity for extensive retraining from scratch. The principal aim of this study is to formulate a methodology tailored for the purposeful elimination of information linked to a specific class of data from a pre-trained classification network. This intentional removal is crafted to degrade the model's performance specifically concerning the unlearned data class while concurrently minimizing any detrimental impacts on the model's performance in other classes. To achieve this goal, we frame the class unlearning problem from a Bayesian perspective, which yields a loss function that minimizes the log-likelihood associated with the unlearned data with a stability regularization in parameter space. This stability regularization incorporates the Mahalanobis distance with respect to the Fisher Information matrix and the $l_2$ distance from the pre-trained model parameters. Our novel approach, termed \textbf{Partially-Blinded Unlearning (PBU)}, surpasses existing state-of-the-art class unlearning methods, demonstrating superior effectiveness. Notably, PBU achieves this efficacy while requiring access only to the unlearned data points rather than the entire training dataset, marking a distinctive feature of its performance.
Submitted 24 March, 2024;
originally announced March 2024.
-
CoroNetGAN: Controlled Pruning of GANs via Hypernetworks
Authors:
Aman Kumar,
Khushboo Anand,
Shubham Mandloi,
Ashutosh Mishra,
Avinash Thakur,
Neeraj Kasera,
Prathosh A P
Abstract:
Generative Adversarial Networks (GANs) have proven to exhibit remarkable performance and are widely used across many generative computer vision applications. However, the unprecedented demand for the deployment of GANs on resource-constrained edge devices still poses a challenge due to the huge number of parameters involved in the generation process. This has led to focused attention on the area of compressing GANs. Most of the existing works use knowledge distillation with the overhead of teacher dependency. Moreover, there is no ability to control the degree of compression in these methods. Hence, we propose CoroNet-GAN for compressing GANs using a differentiable pruning method via hypernetworks. The proposed method provides the advantage of performing controllable compression while training along with reducing training time by a substantial factor. Experiments have been done on various conditional GAN architectures (Pix2Pix and CycleGAN) to demonstrate the effectiveness of our approach on multiple benchmark datasets such as Edges-to-Shoes, Horse-to-Zebra and Summer-to-Winter. The results obtained illustrate that our approach outperforms the baselines on Zebra-to-Horse and Summer-to-Winter, achieving the best FID scores of 32.3 and 72.3 respectively, and yielding high-fidelity images across all the datasets. Additionally, our approach outperforms the state-of-the-art methods in achieving better inference time on various smart-phone chipsets and data-types, making it a feasible solution for deployment on edge devices.
Submitted 13 March, 2024;
originally announced March 2024.
-
Guided Prompting in SAM for Weakly Supervised Cell Segmentation in Histopathological Images
Authors:
Aayush Kumar Tyagi,
Vaibhav Mishra,
Prathosh A. P.,
Mausam
Abstract:
Cell segmentation in histopathological images plays a crucial role in understanding, diagnosing, and treating many diseases. However, data annotation for this is expensive since there can be a large number of cells per image, and expert pathologists are needed for labelling images. Instead, our paper focuses on using weak supervision -- annotation from related tasks -- to induce a segmenter. Recent foundation models, such as Segment Anything (SAM), can use prompts to leverage additional supervision during inference. SAM has performed remarkably well in natural image segmentation tasks; however, its applicability to cell segmentation has not been explored.
In response, we investigate guiding the prompting procedure in SAM for weakly supervised cell segmentation when only bounding box supervision is available. We develop two workflows: (1) an object detector's output as a test-time prompt to SAM (D-SAM), and (2) SAM as a pseudo mask generator over training data to train a standalone segmentation model (SAM-S). On finding that both workflows have some complementary strengths, we develop an integer programming-based approach to reconcile the two sets of segmentation masks, achieving yet higher performance. We experiment on three publicly available cell segmentation datasets, namely ConSep, MoNuSeg, and TNBC, and find that all SAM-based solutions hugely outperform existing weakly supervised image segmentation models, obtaining 9-15 pt Dice gains.
Submitted 29 November, 2023;
originally announced November 2023.
-
Imaging detection of the inner dust belt and the four exoplanets in the HR8799 system with JWST's MIRI coronagraph
Authors:
Boccaletti A.,
Mâlin M.,
Baudoz P.,
Tremplin P.,
Perrot C.,
Rouan D.,
Lagage P. -O.,
Whiteford N.,
Mollière P.,
Waters R.,
Henning T.,
Decin L.,
Güdel M.,
Vadenbussche B.,
Absil O.,
Argyriou I.,
Bouwman J.,
Cossou C.,
Coulais A.,
Gastaud R.,
Glasse A.,
Glauser A.,
Kamp I.,
Kendrew S.,
Krause O.
, et al. (13 additional authors not shown)
Abstract:
The multi-planet system HR8799 is the first target observed with MIRI's coronagraphs as part of the MIRI-EC Guaranteed Time Observations exoplanets programme in Nov. 2022. We obtained deep observations in three coronagraphic filters from 10 to 15 $\mu$m (F1065C, F1140C, F1550C), and one standard imaging filter at 20 $\mu$m (F2100W), with the goal of extracting the photometry of the four planets, as well as detecting and investigating the distribution of circumstellar dust. Using dedicated observations of a reference star, we tested several algorithms to subtract the stellar diffraction pattern while preserving the fluxes of planets, which can be significantly affected by over-subtraction. Correctly measuring the planets' flux values requires accounting for the attenuation by the coronagraphs as a function of their position, and estimating the normalisation with respect to the central star. We tested several procedures to derive averaged photometric values and error bars. These observations have enabled us to obtain two main results. First, the four planets in the system are well recovered, and their mid-IR fluxes, combined with near-IR flux values from the literature, are compared to two exoplanet atmosphere models, ATMO and Exo-REM. As a main outcome, the MIRI photometric data points imply larger radii (0.86 or 1.07 $R_{\rm J}$ for planet b) and cooler temperatures (950 or 1100 K for planet b), especially for planet b, in better agreement with evolutionary models. Second, these JWST/MIRI coronagraphic data also deliver the first spatially resolved detection of the inner warm debris disk, the radius of which is constrained to about 15 au, with flux densities comparable to, but lower than, former unresolved spectroscopic measurements with Spitzer. abridged...
Submitted 20 October, 2023;
originally announced October 2023.
-
CoNO: Complex Neural Operator for Continuous Dynamical Systems
Authors:
Karn Tiwari,
N M Anoop Krishnan,
Prathosh A P
Abstract:
Neural operators extend data-driven models to map between infinite-dimensional functional spaces. These models have successfully solved continuous dynamical systems represented by differential equations, viz., weather forecasting, fluid flow, or solid mechanics. However, the existing operators still rely on real space, thereby losing rich representations potentially captured in the complex space by functional transforms. In this paper, we introduce a Complex Neural Operator (CoNO) that parameterizes the integral kernel in the complex fractional Fourier domain. Additionally, the model, which employs a complex-valued neural network along with aliasing-free activation functions, preserves the complex values and complex algebraic properties, thereby enabling improved representation, robustness to noise, and generalization. We show that the model effectively captures the underlying partial differential equation with a single complex fractional Fourier transform. We perform an extensive empirical evaluation of CoNO on several datasets and additional tasks such as zero-shot super-resolution, evaluation of out-of-distribution data, data efficiency, and robustness to noise. CoNO exhibits comparable or superior performance to all the state-of-the-art models in these tasks. Altogether, CoNO presents a robust and superior model for modeling continuous dynamical systems, providing a fillip to scientific machine learning.
Submitted 4 October, 2023; v1 submitted 3 October, 2023;
originally announced October 2023.
-
CoDBench: A Critical Evaluation of Data-driven Models for Continuous Dynamical Systems
Authors:
Priyanshu Burark,
Karn Tiwari,
Meer Mehran Rashid,
Prathosh A P,
N M Anoop Krishnan
Abstract:
Continuous dynamical systems, characterized by differential equations, are ubiquitously used to model several important problems: plasma dynamics, flow through porous media, weather forecasting, and epidemic dynamics. Recently, a wide range of data-driven models has been used successfully to model these systems. However, in contrast to established fields like computer vision, limited studies are available analyzing the strengths and potential applications of different classes of these models that could steer decision-making in scientific machine learning. Here, we introduce CoDBench, an exhaustive benchmarking suite comprising 11 state-of-the-art data-driven models for solving differential equations. Specifically, we comprehensively evaluate 4 distinct categories of models, viz., feed-forward neural networks, deep operator regression models, frequency-based neural operators, and transformer architectures against 8 widely applicable benchmark datasets encompassing challenges from fluid and solid mechanics. We conduct extensive experiments, assessing the operators' capabilities in learning, zero-shot super-resolution, data efficiency, robustness to noise, and computational efficiency. Interestingly, our findings highlight that current operators struggle with the newer mechanics datasets, motivating the need for more robust neural operators. All the datasets and codes will be shared in an easy-to-use fashion for the scientific community. We hope this resource will be an impetus for accelerated progress and exploration in modeling dynamical systems.
Submitted 2 October, 2023;
originally announced October 2023.
-
Adapt then Unlearn: Exploring Parameter Space Semantics for Unlearning in Generative Adversarial Networks
Authors:
Piyush Tiwary,
Atri Guha,
Subhodip Panda,
Prathosh A. P
Abstract:
Owing to the growing concerns about privacy and regulatory compliance, it is desirable to regulate the output of generative models. To that end, the objective of this work is to prevent the generation of outputs containing undesired features from a pre-trained Generative Adversarial Network (GAN) where the underlying training data set is inaccessible. Our approach is inspired by the observation that the parameter space of GANs exhibits meaningful directions that can be leveraged to suppress specific undesired features. However, such directions usually result in the degradation of the quality of generated samples. Our proposed two-stage method, known as 'Adapt-then-Unlearn,' excels at unlearning such undesirable features while also maintaining the quality of generated samples. In the initial stage, we adapt a pre-trained GAN on a set of negative samples (containing undesired features) provided by the user. Subsequently, we train the original pre-trained GAN using positive samples, along with a repulsion regularizer. This regularizer encourages the learned model parameters to move away from the parameters of the adapted model (first stage) while not degrading the generation quality. We provide theoretical insights into the proposed method. To the best of our knowledge, our approach stands as the first method addressing unlearning within the realm of high-fidelity GANs (such as StyleGAN). We validate the effectiveness of our method through comprehensive experiments, encompassing both class-level unlearning on the MNIST and AFHQ datasets and feature-level unlearning tasks on the CelebA-HQ dataset. Our code and implementation are available at: https://github.com/atriguha/Adapt_Unlearn.
Submitted 12 February, 2025; v1 submitted 25 September, 2023;
originally announced September 2023.
-
SN 2022jli: a type Ic supernova with periodic modulation of its light curve and an unusually long rise
Authors:
Moore T.,
Smartt S. J.,
Nicholl M.,
Srivastav S.,
Stevance H. F.,
Jess D. B.,
Grant S. D. T.,
Fulton M. D.,
Rhodes L.,
Sim S. A.,
Hirai R.,
Podsiadlowski P.,
Anderson J. P.,
Ashall C.,
Bate W.,
Fender R.,
Gutierrez C. P.,
Howell D. A.,
Huber M. E.,
Inserra C.,
Leloudas G.,
Monard L. A. G.,
Muller-Bravo T. E.,
Shappee B. J.,
Smith K. W.
, et al. (20 additional authors not shown)
Abstract:
We present multi-wavelength photometry and spectroscopy of SN 2022jli, an unprecedented Type Ic supernova discovered in the galaxy NGC 157 at a distance of $\approx$ 23 Mpc. The multi-band light curves reveal many remarkable characteristics. The supernova peaks at a magnitude of $g=15.11\pm0.02$, and the high-cadence photometry reveals $12.5\pm0.2$ day periodic undulations superimposed on the 200 day supernova decline. This periodicity is observed in the light curves from nine separate filter and instrument configurations with peak-to-peak amplitudes of $\simeq$ 0.1 mag. This is the first time that repeated periodic oscillations, over many cycles, have been detected in a supernova light curve. SN 2022jli also displays an extreme early excess which fades over $\approx$ 25 days followed by a rise to a peak luminosity of $L_{\rm opt} = 10^{42.1}$ erg s$^{-1}$. Although the exact explosion epoch is not constrained by data, the time from explosion to maximum light is $\gtrsim$ 59 days. The luminosity can be explained by a large ejecta mass ($M_{\rm ej}\approx12\pm6$ M$_{\odot}$) powered by $^{56}$Ni but we find difficulty in quantitatively modelling the early excess with circumstellar interaction and cooling. Collision between the supernova ejecta and a binary companion is a possible source of this emission. We discuss the origin of the periodic variability in the light curve, including interaction of the SN ejecta with nested shells of circumstellar matter and neutron stars colliding with binary companions.
Submitted 22 September, 2023;
originally announced September 2023.
-
Neural Discovery of Permutation Subgroups
Authors:
Pavan Karjol,
Rohan Kashyap,
Prathosh A P
Abstract:
We consider the problem of discovering a subgroup $H$ of the permutation group $S_{n}$. Unlike the traditional $H$-invariant networks wherein $H$ is assumed to be known, we present a method to discover the underlying subgroup, given that it satisfies certain conditions. Our results show that one could discover any subgroup of type $S_{k} (k \leq n)$ by learning an $S_{n}$-invariant function and a linear transformation. We also prove similar results for cyclic and dihedral subgroups. Finally, we provide a general theorem that can be extended to discover other subgroups of $S_{n}$. We also demonstrate the applicability of our results through numerical experiments on image-digit sum and symmetric polynomial regression tasks.
Submitted 11 September, 2023;
originally announced September 2023.
-
A Unified Framework for Discovering Discrete Symmetries
Authors:
Pavan Karjol,
Rohan Kashyap,
Aditya Gopalan,
Prathosh A. P
Abstract:
We consider the problem of learning a function respecting a symmetry from among a class of symmetries. We develop a unified framework that enables symmetry discovery across a broad range of subgroups including locally symmetric, dihedral and cyclic subgroups. At the core of the framework is a novel architecture composed of linear, matrix-valued and non-linear functions that expresses functions invariant to these subgroups in a principled manner. The structure of the architecture enables us to leverage multi-armed bandit algorithms and gradient descent to efficiently optimize over the linear and the non-linear functions, respectively, and to infer the symmetry that is ultimately learnt. We also discuss the necessity of the matrix-valued functions in the architecture. Experiments on image-digit sum and polynomial regression tasks demonstrate the effectiveness of our approach.
Submitted 27 October, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for Histopathological Image Segmentation
Authors:
Vishnuvardhan Purma,
Suhas Srinath,
Seshan Srirangarajan,
Aanchal Kakkar,
Prathosh A. P
Abstract:
Histopathological image segmentation is a laborious and time-intensive task, often requiring analysis from experienced pathologists for accurate examinations. To reduce this burden, supervised machine-learning approaches have been adopted using large-scale annotated datasets for histopathological image analysis. However, in several scenarios, the availability of large-scale annotated data is a bottleneck while training such models. Self-supervised learning (SSL) is an alternative paradigm that provides some respite by constructing models utilizing only the unannotated data, which is often abundant. The basic idea of SSL is to train a network to perform one or many pseudo or pretext tasks on unannotated data and use it subsequently as the basis for a variety of downstream tasks. It is seen that the success of SSL depends critically on the considered pretext task. While there have been many efforts in designing pretext tasks for classification problems, there have not been many attempts at SSL for histopathological segmentation. Motivated by this, we propose an SSL approach for segmenting histopathological images via generative diffusion models in this paper. Our method is based on the observation that diffusion models effectively solve an image-to-image translation task akin to a segmentation task. Hence, we propose generative diffusion as the pretext task for histopathological image segmentation. We also propose a multi-loss function-based fine-tuning for the downstream task. We validate our method using several metrics on two publicly available datasets along with a newly proposed head and neck (HN) cancer dataset containing hematoxylin and eosin (H\&E) stained images along with annotations. Codes will be made public at https://github.com/suhas-srinath/GenSelfDiff-HIS.
Submitted 10 September, 2024; v1 submitted 4 September, 2023;
originally announced September 2023.
-
SDLFormer: A Sparse and Dense Locality-enhanced Transformer for Accelerated MR Image Reconstruction
Authors:
Rahul G. S.,
Sriprabha Ramnarayanan,
Mohammad Al Fahim,
Keerthi Ram,
Preejith S. P,
Mohanasankar Sivaprakasam
Abstract:
Transformers have emerged as viable alternatives to convolutional neural networks owing to their ability to learn non-local region relationships in the spatial domain. The self-attention mechanism of the transformer enables it to capture long-range dependencies in the images, which might be desirable for accelerated MRI image reconstruction as the effect of undersampling is non-local in the image domain. Despite their computational efficiency, window-based transformers suffer from restricted receptive fields as the dependencies are limited to within the scope of the image windows. We propose a window-based transformer network that integrates a dilated attention mechanism and convolution for accelerated MRI image reconstruction. The proposed network consists of dilated and dense neighborhood attention transformers to enhance the distant neighborhood pixel relationship and introduces depth-wise convolutions within the transformer module to learn low-level translation invariant features for accelerated MRI image reconstruction. The proposed model is trained in a self-supervised manner. We perform extensive experiments for multi-coil MRI acceleration for coronal PD, coronal PDFS and axial T2 contrasts with 4x and 5x under-sampling in self-supervised learning based on k-space splitting. We compare our method against other reconstruction architectures and the parallel domain self-supervised learning baseline. Results show that the proposed model exhibits improvement margins of (i) around 1.40 dB in PSNR and around 0.028 in SSIM on average over other architectures and (ii) around 1.44 dB in PSNR and around 0.029 in SSIM over parallel domain self-supervised learning. The code is available at https://github.com/rahul-gs-16/sdlformer.git
Submitted 8 August, 2023;
originally announced August 2023.
-
Simultaneous study of scattering and fusion hindrance near Coulomb barrier in $F+Pb$ systems
Authors:
Kamala Kanta Jena,
Bidhubhusan Sahu,
Jajati K. Nayak,
Raj Preethi P,
B. K. Sharma,
Santosh Kumar Agarwalla
Abstract:
A phenomenological optical potential is used to study the elastic angular distributions for the system $^{19}F+^{208}Pb$ close to the Coulomb barrier. This potential is constructed by taking into account the flexible potential developed by Ginocchio. The fluctuations in the real and imaginary parts of the optical model potential follow the trends of the threshold anomaly. The set of optical potential parameters needed to analyze the fusion cross sections of the same system is obtained through analysis of the scattering cross sections. Theoretical fusion cross sections and results from four different experimental groups agree well over a range of energies. Several Fluorine (F) isotopes are used as projectiles in this study of fusion cross sections by slightly altering the radial parameter. It was found that the fusion process occurs unfettered in the $^{19}F +^{208}Pb$ system below the Coulomb barrier but is seriously hindered in the case of its isotopic projectiles.
Submitted 11 July, 2023;
originally announced July 2023.
-
Bayesian Pseudo-Coresets via Contrastive Divergence
Authors:
Piyush Tiwary,
Kumar Shubham,
Vivek V. Kashyap,
Prathosh A. P
Abstract:
Bayesian methods provide an elegant framework for estimating parameter posteriors and quantification of uncertainty associated with probabilistic models. However, they often suffer from slow inference times. To address this challenge, Bayesian Pseudo-Coresets (BPC) have emerged as a promising solution. BPC methods aim to create a small synthetic dataset, known as pseudo-coresets, that approximates the posterior inference achieved with the original dataset. This approximation is achieved by optimizing a divergence measure between the true posterior and the pseudo-coreset posterior. Various divergence measures have been proposed for constructing pseudo-coresets, with forward Kullback-Leibler (KL) divergence being the most successful. However, using forward KL divergence necessitates sampling from the pseudo-coreset posterior, often accomplished through approximate Gaussian variational distributions. Alternatively, one could employ Markov Chain Monte Carlo (MCMC) methods for sampling, but this becomes challenging in high-dimensional parameter spaces due to slow mixing. In this study, we introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence. Importantly, optimizing contrastive divergence eliminates the need for approximations in the pseudo-coreset construction process. Furthermore, it enables the use of finite-step MCMC methods, alleviating the requirement for extensive mixing to reach a stationary distribution. To validate our method's effectiveness, we conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
Submitted 8 May, 2024; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Two-Dimensional Numerical Analysis on Vortex-Induced Vibrations of an Elastically Mounted Circular Cylinder Placed Very Close to a Wall
Authors:
V Brahmini Priya P,
Aditya Karthik,
Vedanth N K,
Supradeepan K,
P S Gurugubelli
Abstract:
Two-dimensional numerical simulations have been performed to understand the effect of wall proximity on the vortex-induced vibrations (VIV) of an elastically mounted circular cylinder having two degrees of freedom for gap ratios g/D = {0.1,0.2,0.3,0.4,0.5,0.6}, where D is the cylinder diameter, and g is the gap between the cylinder's bottom surface and the wall. Parametric simulations have been performed using a quasi-monolithic coupled fluid-structure interaction solver with exact interface tracking for Re = 100 and m*=10 over a range of U* to present the effect of the wall's proximity on the vibration response, vortex structures, force dynamics, pressure distribution, the phase between the hydrodynamic forces and displacements, and finally different VIV branching response regimes. The numerical simulations reveal that as the gap ratio reduces, the maximum transverse vibration amplitude reduces, and the lock-in region widens. Periodic vortex shedding with a single "S" vortex street has been observed for all gap ratios g/D = 0.1 to 0.6. It has also been observed that the wall proximity affects the mean lift coefficient not only in the lock-in region but also in the pre and post lock-in regions. In the lock-in region, the response dynamics of the vibrating cylinder can be characterized into two branches, with the transition between the branches marked by a sudden jump in the phase angle $Φ$ between the lift and transverse displacement from 0 to a value slightly greater than $π$. As the gap ratio reduces, the widening of the lock-in region is accompanied by the widening of the IInd response branch and a narrowing down of the Ist response branch. For the gap ratio of g/D = 0.1, the Ist branch vanishes entirely. This work is directly relevant to the design, maintenance, and fatigue life estimation of underwater pipelines/cables laid on the seabed.
Submitted 3 December, 2022;
originally announced December 2022.
-
SurfMyoAiR: A surface Electromyography based framework for Airwriting Recognition
Authors:
Ayush Tripathi,
Lalan Kumar,
Prathosh A. P.,
Suriya Prakash Muthukrishnan
Abstract:
Airwriting Recognition is the task of identifying letters written in free space with finger movement. Electromyography (EMG) is a technique used to record electrical activity during muscle contraction and relaxation as a result of movement and is widely used for gesture recognition. Most of the current research in gesture recognition is focused on identifying static gestures. However, dynamic gestures are natural and user-friendly as alternate input methods in Human-Computer Interaction applications. Airwriting recognition using EMG signals recorded from forearm muscles is therefore a viable solution. Since the user does not need to learn any new gestures and a large range of words can be formed by concatenating these letters, it is generalizable to a wider population. There has been limited work on recognition of airwriting using EMG signals, and this forms the core idea of the current work. The SurfMyoAiR dataset, comprising EMG signals recorded while writing English uppercase alphabets, is constructed. Several different time-domain features to construct the EMG envelope and two different time-frequency image representations, Short-Time Fourier Transform and Continuous Wavelet Transform, were explored to form the input to a deep learning model for airwriting recognition. Several different deep learning architectures were exploited for this task. Additionally, the effect of various parameters such as signal length, window length and interpolation techniques on the recognition performance is comprehensively explored. The best accuracies achieved were 78.50% and 62.19% in user-dependent and user-independent scenarios, respectively, using Short-Time Fourier Transform in conjunction with a 2D Convolutional Neural Network based classifier. Airwriting has great potential as a user-friendly modality to be used as an alternate input method in Human-Computer Interaction applications.
Submitted 31 October, 2022;
originally announced October 2022.
-
ImAiR: Airwriting Recognition framework using Image Representation of IMU Signals
Authors:
Ayush Tripathi,
Arnab Kumar Mondal,
Lalan Kumar,
Prathosh A. P
Abstract:
The problem of Airwriting Recognition is focused on identifying letters written by movement of finger in free space. It is a type of gesture recognition where the dictionary corresponds to letters in a specific language. In particular, airwriting recognition using sensor data from wrist-worn devices can be used as a medium of user input for applications in Human-Computer Interaction (HCI). Recognition of in-air trajectories using such wrist-worn devices is limited in literature and forms the basis of the current work. In this paper, we propose an airwriting recognition framework by first encoding the time-series data obtained from a wearable Inertial Measurement Unit (IMU) on the wrist as images and then utilizing deep learning-based models for identifying the written alphabets. The signals recorded from 3-axis accelerometer and gyroscope in IMU are encoded as images using different techniques such as Self Similarity Matrix (SSM), Gramian Angular Field (GAF) and Markov Transition Field (MTF) to form two sets of 3-channel images. These are then fed to two separate classification models and letter prediction is made based on an average of the class conditional probabilities obtained from the two models. Several standard model architectures for image classification such as variants of ResNet, DenseNet, VGGNet, AlexNet and GoogleNet have been utilized. Experiments performed on two publicly available datasets demonstrate the efficacy of the proposed strategy. The code for our implementation will be made available at https://github.com/ayushayt/ImAiR.
Submitted 8 September, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
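The ImAiR abstract above mentions encoding IMU time series as images with techniques such as the Gramian Angular Field (GAF). As a rough illustration only, the sketch below computes the summation variant of a GAF for a single 1-D signal with NumPy. The function name `gramian_angular_field` and the min-max rescaling are assumptions made for this example; the listing does not describe the authors' preprocessing or channel layout.

```python
import numpy as np


def gramian_angular_field(series):
    """Return the Gramian Angular Summation Field of a 1-D series.

    Hypothetical helper for illustration; not taken from the ImAiR code.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = np.clip(2.0 * (x - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    # Polar encoding: each sample becomes an angle.
    phi = np.arccos(scaled)
    # Entry (i, j) is cos(phi_i + phi_j), giving a symmetric image.
    return np.cos(phi[:, None] + phi[None, :])


image = gramian_angular_field([0.0, 1.0, 2.0, 3.0])
print(image.shape)
```

Applying such an encoding to each accelerometer or gyroscope axis separately would yield one image channel per axis, which is one plausible reading of the "3-channel images" in the abstract.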
-
Spectral and Energy Efficient User Pairing for RIS-assisted Uplink NOMA Systems with Imperfect Phase Compensation
Authors:
Kusuma Priya P.,
Pavan Reddy M.,
Abhinav Kumar
Abstract:
Non-orthogonal multiple access (NOMA) is considered a key technology for improving the spectral efficiency of fifth-generation (5G) and beyond 5G cellular networks. NOMA is beneficial when the channel vectors of the users are in the same direction, which is not always possible in conventional wireless systems. With the help of a reconfigurable intelligent surface (RIS), the base station can control the directions of the channel vectors of the users. Thus, by combining both technologies, the RIS-assisted NOMA systems are expected to achieve greater improvements in the network throughput. However, ideal phase control at the RIS is unrealizable in practice because of the imperfections in the channel estimations and the hardware limitations. This imperfection in phase control can have a significant impact on the system performance. Motivated by this, in this paper, we consider an RIS-assisted uplink NOMA system in the presence of imperfect phase compensation. We formulate the criterion for pairing the users that achieves minimum required data rates. We propose adaptive user pairing algorithms that maximize spectral or energy efficiency. We then derive various bounds on power allocation factors for the paired users. Through extensive simulation results, we show that the proposed algorithms significantly outperform the state-of-the-art algorithms in terms of spectral and energy efficiency.
Submitted 6 January, 2022;
originally announced January 2022.
-
Clash of Titans: a MUSE dynamical study of the extreme cluster merger SPT-CL J0307-6225
Authors:
D. Hernández-Lang,
A. Zenteno,
A. Diaz-Ocampo,
H. Cuevas,
J. Clancy,
H. Prado P.,
F. Aldás,
D. Pallero,
R. Monteiro-Oliveira,
F. A. Gómez,
A. Ramirez,
J. Wynter,
E. R. Carrasco,
G. K. T. Hau,
B. Stalder,
M. McDonald,
M. Bayliss,
B. Floyd,
G. Garmire,
A. Katzenberger,
K. J. Kim,
M. Klein,
G. Mahler,
J. L. Nilo Castellon,
A. Saro
, et al. (1 additional authors not shown)
Abstract:
We present VLT/MUSE spectroscopy, along with archival Gemini/GMOS spectroscopy, Magellan/Megacam imaging, and Chandra X-ray emission for SPT-CL J0307-6225, a z=0.58 major merging galaxy cluster with a large BCG-SZ centroid separation and a highly disturbed X-ray morphology. The galaxy density distribution shows two main overdensities with separations of 0.144 and 0.017 arcmin to their respective BCGs. We characterize the central regions of the two colliding structures, namely 0307-6225N and 0307-6225S, finding velocity derived masses of $M_{200,N}=$ 2.44 $\pm$ 1.41 $\times10^{14}$ M$_\odot$ and $M_{200,S}=$ 3.16 $\pm$ 1.88 $\times10^{14}$ M$_\odot$, with a line-of-sight velocity difference of $|Δv| = 342$ km s$^{-1}$. The total dynamically derived mass is consistent with the SZ derived mass of 7.63 h$_{70}^{-1}$ $\pm$ 1.36 $\times10^{14}$ M$_\odot$. We model the merger using the Monte Carlo Merger Analysis Code, estimating a merging angle of 36$^{+14}_{-12}$ degrees with respect to the plane of the sky. Comparing with simulations of a merging system with a mass ratio of 1:3, we find that the best scenario is that of an ongoing merger that began 0.96$^{+0.31}_{-0.18}$ Gyr ago. We also characterize the galaxy population using H$δ$ and [OII] $λ3727$ Å lines. We find that most of the emission-line galaxies belong to 0307-6225S, close to the X-ray peak position, with a third of them corresponding to red-cluster sequence galaxies, and the rest to blue galaxies with velocities consistent with recent periods of accretion. Moreover, we suggest that 0307-6225S suffered a previous merger, evidenced through the two equally bright BCGs at the center with a velocity difference of $\sim$674 km s$^{-1}$.
Submitted 18 January, 2023; v1 submitted 30 November, 2021;
originally announced November 2021.
-
SCLAiR : Supervised Contrastive Learning for User and Device Independent Airwriting Recognition
Authors:
Ayush Tripathi,
Arnab Kumar Mondal,
Lalan Kumar,
Prathosh A. P
Abstract:
Airwriting Recognition is the problem of identifying letters written in free space with finger movement. It is essentially a specialized case of gesture recognition, wherein the vocabulary of gestures corresponds to letters as in a particular language. With the wide adoption of smart wearables in the general population, airwriting recognition using motion sensors from a smart-band can be used as a medium of user input for applications in Human-Computer Interaction. There has been limited work in the recognition of in-air trajectories using motion sensors, and the performance of the techniques in the case when the device used to record signals is changed has not been explored hitherto. Motivated by these, a new paradigm for device and user-independent airwriting recognition based on supervised contrastive learning is proposed. A two stage classification strategy is employed, the first of which involves training an encoder network with supervised contrastive loss. In the subsequent stage, a classification head is trained with the encoder weights kept frozen. The efficacy of the proposed method is demonstrated through experiments on a publicly available dataset and also with a dataset recorded in our lab using a different device. Experiments have been performed in both supervised and unsupervised settings and compared against several state-of-the-art domain adaptation techniques. Data and the code for our implementation will be made available at https://github.com/ayushayt/SCLAiR.
Submitted 29 December, 2021; v1 submitted 25 November, 2021;
originally announced November 2021.
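The SCLAiR abstract above describes a first stage in which an encoder is trained with a supervised contrastive loss. The sketch below is a minimal NumPy version of a commonly used form of that loss, in which each anchor is pulled toward the other samples sharing its label and pushed away from the rest. The function name, the temperature value of 0.1, and the toy embeddings are assumptions for illustration; the abstract does not give the authors' exact formulation or hyperparameters.

```python
import numpy as np


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    Hypothetical sketch; not the SCLAiR implementation.
    """
    z = np.asarray(embeddings, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length embeddings
    y = np.asarray(labels)
    n = z.shape[0]

    sim = z @ z.T / temperature
    sim = sim - sim.max(axis=1, keepdims=True)  # shift rows for numerical stability
    not_self = ~np.eye(n, dtype=bool)

    # Log-probability of each other sample under a softmax that excludes the anchor.
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Positives are the other samples that share the anchor's label.
    positives = (y[:, None] == y[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0
    if not has_pos.any():
        raise ValueError("no anchor has a same-label positive in this batch")

    per_anchor = -(log_prob * positives).sum(axis=1)[has_pos] / n_pos[has_pos]
    return float(per_anchor.mean())


toy = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
print(supervised_contrastive_loss(toy, [0, 0, 1, 1]))
```

In the two-stage setup the abstract outlines, a loss of this kind would be used only for the encoder; the classification head is then trained separately with the encoder weights frozen.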
-
Index Coded - NOMA in Vehicular Ad Hoc Networks
Authors:
Sreelakshmi P.,
Jesy Pachat,
Anjana A. Mahesh,
Deepthi P. P.,
B. Sundar Rajan
Abstract:
The demand for multimedia services is growing day by day in vehicular ad-hoc networks (VANETs), resulting in high spectral usage and network congestion. Non-orthogonal multiple access (NOMA) is a promising wireless communication technique to solve the problems related to spectral efficiency effectively. The index coding (IC) is a powerful method to improve spectral utilization, where a sender aims to satisfy the needs of multiple receivers with a minimum number of transmissions. By combining these two approaches, in this work, we propose a novel technique called index coded NOMA (IC-NOMA), where we apply NOMA techniques on index coded data to reduce the number of transmissions further. This work shows that the IC-NOMA system demands a specific design for index codes to reap the advantages of NOMA. We have done the feasibility analysis of the proposed method in a general scenario and proposed an index code design to integrate IC over NOMA for the best efficiency. Through detailed analytical studies it is validated that the proposed transmission system provides improved spectral efficiency and power saving compared to conventional IC systems.
Submitted 21 October, 2021;
originally announced October 2021.
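The abstract above describes index coding as a sender satisfying several receivers with a minimum number of transmissions. The textbook two-receiver case can be sketched as follows: each vehicle already holds the packet the other one wants, so a single XOR broadcast replaces two separate transmissions. The packet contents and variable names are hypothetical, and this shows plain index coding only, not the IC-NOMA design proposed in the paper.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("packets must have the same length")
    return bytes(p ^ q for p, q in zip(a, b))


# Hypothetical packets: vehicle A wants packet_1 and already has packet_2;
# vehicle B wants packet_2 and already has packet_1.
packet_1 = b"road-map"
packet_2 = b"traffic!"

# One coded broadcast serves both vehicles.
broadcast = xor_bytes(packet_1, packet_2)

# Each vehicle removes its side information to recover the wanted packet.
decoded_by_a = xor_bytes(broadcast, packet_2)
decoded_by_b = xor_bytes(broadcast, packet_1)
print(decoded_by_a, decoded_by_b)
```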
-
DFW-PP: Dynamic Feature Weighting based Popularity Prediction for Social Media Content
Authors:
Viswanatha Reddy G,
Chaitanya B S N V,
Prathyush P,
Sumanth M,
Mrinalini C,
Dileep Kumar P,
Snehasis Mukherjee
Abstract:
The increasing popularity of social media platforms makes it important to study user engagement, which is a crucial aspect of any marketing strategy or business model. The over-saturation of content on social media platforms has persuaded us to identify the important factors that affect content popularity. This comes from the fact that only an iota of the humongous content available online receives the attention of the target audience. Comprehensive research has been done in the area of popularity prediction using several Machine Learning techniques. However, we observe that there is still significant scope for improvement in analyzing the social importance of media content. We propose the DFW-PP framework to learn the importance of different features that vary over time. Further, the proposed method controls the skewness of the distribution of the features by applying a log-log normalization. The proposed method is evaluated on a benchmark dataset and shows promising results. The code will be made publicly available at https://github.com/chaitnayabasava/DFW-PP.
Submitted 16 October, 2021;
originally announced October 2021.
-
Unsupervised Domain Adaptation Schemes for Building ASR in Low-resource Languages
Authors:
Anoop C S,
Prathosh A P,
A G Ramakrishnan
Abstract:
Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common acoustic space with a high-resource language having enough annotated data to build an ASR. In such cases, we show that the domain-independent acoustic models lea…
▽ More
Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common acoustic space with a high-resource language having enough annotated data to build an ASR. In such cases, we show that the domain-independent acoustic models learned from the high-resource language through unsupervised domain adaptation (UDA) schemes can enhance the performance of the ASR in the low-resource language. We use the specific example of Hindi in the source domain and Sanskrit in the target domain. We explore two architectures: i) domain adversarial training using gradient reversal layer (GRL) and ii) domain separation networks (DSN). The GRL and DSN architectures give absolute improvements of 6.71% and 7.32%, respectively, in word error rate over the baseline deep neural network model when trained on just 5.5 hours of data in the target domain. We also show that choosing a proper language (Telugu) in the source domain can bring further improvement. The results suggest that UDA schemes can be helpful in the development of ASR systems for low-resource languages, mitigating the hassle of collecting large amounts of annotated speech data.
△ Less
Submitted 16 September, 2021; v1 submitted 12 September, 2021;
originally announced September 2021.
-
Competition of Core-Shell and Janus Morphology in Alloy Nanoparticles: Insights From a Phase-Field Model
Authors:
Pankaj P,
Saswata Bhattacharya,
Subhradeep Chatterjee
Abstract:
Bimetallic nanoparticles (BNPs) exhibit diverse morphologies such as core-shell, Janus, onion-like, quasi-Janus, and homogeneous structures. Although extensive effort has been directed towards understanding the equilibrium configurations of BNPs, kinetic mechanisms involved in their development have not been explored systematically. Since these systems often contain a miscibility gap, experimental studies have alluded to spinodal decomposition (SD) as a likely mechanism for the formation of such structures. We present a novel phase-field model for confined (embedded) systems to study SD-induced morphological evolution within a BNP. It initiates with the formation of compositionally modulated rings as a result of surface-directed SD and eventually develops into core-shell or Janus structures due to coarsening/breakdown of the rings. The final configuration depends crucially on contact angle and particle size: Janus is favored at smaller sizes and higher contact angles. Our simulations also illustrate the formation of metastable, kinetically trapped structures as a result of competition between capillarity and diffusion.
Submitted 20 May, 2021;
originally announced May 2021.
-
Observation of Planar Hall Effect in Topological Insulator -- Bi$_2$Te$_3$
Authors:
Archit Bhardwaj,
Syam Prasad P.,
Karthik Raman,
Dhavala Suri
Abstract:
Planar Hall effect (PHE) in topological insulators (TIs) is discussed as an effect that stems mostly from conduction due to topologically protected surface states. Although surface states play a critical role and are of utmost importance in TIs, our present study reflects the need for considering the bulk conduction in understanding PHE in TIs. Here, we demonstrate a threefold enhancement in PHE amplitude by doubling the thickness of Bi$_2$Te$_3$ film on Si (111). The PHE amplitude reaches $\approx$~6 n$Ω$m in a 30 quintuple layer (QL) device as compared to $\approx$~2 n$Ω$m in a 14 QL device. We find that the PHE amplitude increases with temperature in the 30 QL Bi$_2$Te$_3$ films grown on Si (111) and Al$_2$O$_3$ (0001). Our experiments indicate that the contribution of bulk states to PHE in TIs could be significant.
Submitted 1 June, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Classifying the Unstructured IT Service Desk Tickets Using Ensemble of Classifiers
Authors:
Ramya C,
Paramesh S. P,
Shreedhara K S
Abstract:
Manual classification of IT service desk tickets may result in routing of the tickets to the wrong resolution group. Incorrect assignment of IT service desk tickets leads to reassignment of tickets and unnecessary resource utilization, and delays the resolution time. Traditional machine learning algorithms can be used to automatically classify the IT service desk tickets. Service desk ticket classifier models can be trained by mining the historical unstructured ticket descriptions and the corresponding labels. The model can then be used to classify a new service desk ticket based on the ticket description. The performance of the traditional classifier systems can be further improved by using various ensemble classification techniques. This paper presents the three most popular ensemble methods, i.e., Bagging, Boosting and Voting ensemble, for combining the predictions from different models to further improve the accuracy of the ticket classifier system. The performance of the ensemble classifier system is checked against the individual base classifiers using various performance metrics. Ensembles of classifiers performed well in comparison with the corresponding base classifiers. The advantages of building such automated ticket classifier systems are a simplified user interface, faster resolution time, improved productivity, customer satisfaction and growth in business. Real-world service desk ticket data from a large enterprise IT infrastructure is used for our research purpose.
Submitted 30 March, 2021;
originally announced March 2021.
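Of the three ensemble methods named in the abstract above, the voting ensemble is the simplest to illustrate. The sketch below shows hard (majority) voting over the label predictions of several base classifiers. The function name, the ticket categories, and the tie-breaking rule are assumptions for this example; the abstract does not specify the base classifiers or whether hard or soft voting was used.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by hard majority voting.

    Ties go to the label predicted by the earliest model in the input.
    """
    lengths = {len(p) for p in predictions_per_model}
    if len(lengths) != 1:
        raise ValueError("every model must predict the same number of tickets")
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common keeps first-seen order among equally frequent labels.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three base classifiers for three tickets.
model_a = ["network", "email", "hardware"]
model_b = ["network", "hardware", "hardware"]
model_c = ["email", "email", "hardware"]

print(majority_vote([model_a, model_b, model_c]))
```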
-
Systematic Generalization in Neural Networks-based Multivariate Time Series Forecasting Models
Authors:
Hritik Bansal,
Gantavya Bhatt,
Pankaj Malhotra,
Prathosh A. P
Abstract:
Systematic generalization aims to evaluate reasoning about novel combinations from known components, an intrinsic property of human cognition. In this work, we study systematic generalization of NNs in forecasting future time series of dependent variables in a dynamical system, conditioned on past time series of dependent variables, and past and future control variables. We focus on systematic generalization wherein the NN-based forecasting model should perform well on previously unseen combinations or regimes of control variables after being trained on a limited set of the possible regimes. For NNs to depict such out-of-distribution generalization, they should be able to disentangle the various dependencies between control variables and dependent variables. We hypothesize that a modular NN architecture guided by the readily-available knowledge of independence of control variables can serve as a potentially useful inductive bias to this end. Through extensive empirical evaluation on a toy dataset and a simulated electric motor dataset, we show that our proposed modular NN architecture serves as a simple yet highly effective inductive bias that enables better forecasting of the dependent variables up to large horizons in contrast to standard NNs, and indeed captures the true dependency relations between the dependent and the control variables.
Submitted 7 March, 2021; v1 submitted 10 February, 2021;
originally announced February 2021.
-
Neural Compound-Word (Sandhi) Generation and Splitting in Sanskrit Language
Authors:
Sushant Dave,
Arun Kumar Singh,
Prathosh A. P.,
Brejesh Lall
Abstract:
This paper describes neural network based approaches to the process of the formation and splitting of word-compounding, respectively known as the Sandhi and Vichchhed, in Sanskrit language. Sandhi is an important idea essential to morphological analysis of Sanskrit texts. Sandhi leads to word transformations at word boundaries. The rules of Sandhi formation are well defined but complex, sometimes optional and in some cases, require knowledge about the nature of the words being compounded. Sandhi split or Vichchhed is an even more difficult task given its non uniqueness and context dependence. In this work, we propose the route of formulating the problem as a sequence to sequence prediction task, using modern deep learning techniques. Being the first fully data driven technique, we demonstrate that our model has an accuracy better than the existing methods on multiple standard datasets, despite not using any additional lexical or morphological resources. The code is being made available at https://github.com/IITD-DataScience/Sandhi_Prakarana
Submitted 24 October, 2020;
originally announced October 2020.
-
A Benchmark Corpus and Neural Approach for Sanskrit Derivative Nouns Analysis
Authors:
Arun Kumar Singh,
Sushant Dave,
Prathosh A. P.,
Brejesh Lall,
Shresth Mehta
Abstract:
This paper presents the first benchmark corpus of Sanskrit Pratyaya (suffix) and inflectional words (padas) formed due to suffixes, along with neural-network-based approaches to process the formation and splitting of inflectional words. Within the scope of the current work, inflectional words span the primary and secondary derivative nouns. Pratyayas are an important dimension of morphological analysis of Sanskrit texts. Sanskrit Computational Linguistics tools exist for processing and analyzing Sanskrit texts. Unfortunately, there has not been any work to standardize and validate these tools specifically for derivative noun analysis. In this work, we prepared a Sanskrit suffix benchmark corpus called Pratyaya-Kosh to evaluate the performance of tools. We also present our own neural approach for derivative noun analysis while evaluating the same on the most prominent Sanskrit morphological analysis tools. This benchmark will be freely available to researchers worldwide, and we hope it will motivate all to improve morphological analysis in the Sanskrit language.
Submitted 24 October, 2020;
originally announced October 2020.
-
RespVAD: Voice Activity Detection via Video-Extracted Respiration Patterns
Authors:
Arnab Kumar Mondal,
Prathosh A. P
Abstract:
Voice Activity Detection (VAD) refers to the task of identifying regions of human speech in digital signals such as audio and video. While VAD is a necessary first step in many speech processing systems, it poses challenges when there are high levels of ambient noise during the audio recording. To improve the performance of VAD in such conditions, several methods have been proposed that use visual information extracted from the mouth/lip region in the speakers' video recording. Even though these provide advantages over audio-only methods, they depend on faithful extraction of lip/mouth regions. Motivated by these observations, a new paradigm for VAD is proposed, based on the fact that respiration forms the primary source of energy for speech production. Specifically, an audio-independent VAD technique using the respiration pattern extracted from the speakers' video is developed. The respiration pattern is first extracted from the video, focusing on the abdominal-thoracic region of a speaker, using an optical-flow-based method. Subsequently, voice activity is detected from the respiration pattern signal using neural sequence-to-sequence prediction models. The efficacy of the proposed method is demonstrated through experiments on a challenging dataset recorded in real acoustic environments and compared with four previous methods based on audio and visual cues.
Submitted 21 August, 2020;
originally announced August 2020.