-
SINR: Sparsity Driven Compressed Implicit Neural Representations
Authors:
Dhananjaya Jayasundara,
Sudarshan Rajagopalan,
Yasiru Ranasinghe,
Trac D. Tran,
Vishal M. Patel
Abstract:
Implicit Neural Representations (INRs) are increasingly recognized as a versatile data modality for representing discretized signals, offering benefits such as infinite query resolution and reduced storage requirements. Existing signal compression approaches for INRs typically employ one of two strategies: 1. direct quantization with entropy coding of the trained INR; 2. deriving a latent code on…
▽ More
Implicit Neural Representations (INRs) are increasingly recognized as a versatile data modality for representing discretized signals, offering benefits such as infinite query resolution and reduced storage requirements. Existing signal compression approaches for INRs typically employ one of two strategies: 1. direct quantization with entropy coding of the trained INR; 2. deriving a latent code on top of the INR through a learnable transformation. Thus, their performance is heavily dependent on the quantization and entropy coding schemes employed. In this paper, we introduce SINR, an innovative compression algorithm that leverages the patterns in the vector spaces formed by weights of INRs. We compress these vector spaces using a high-dimensional sparse code within a dictionary. Further analysis reveals that the atoms of the dictionary used to generate the sparse code do not need to be learned or transmitted to successfully recover the INR weights. We demonstrate that the proposed approach can be integrated with any existing INR-based signal compression technique. Our results indicate that SINR achieves substantial reductions in storage requirements for INRs across various configurations, outperforming conventional INR-based compression baselines. Furthermore, SINR maintains high-quality decoding across diverse data modalities, including images, occupancy fields, and Neural Radiance Fields.
△ Less
Submitted 25 March, 2025;
originally announced March 2025.
-
Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models
Authors:
Yasiru Ranasinghe,
Vibashan VS,
James Uplinger,
Celso De Melo,
Vishal M. Patel
Abstract:
Automatic target recognition (ATR) plays a critical role in tasks such as navigation and surveillance, where safety and accuracy are paramount. In extreme use cases, such as military applications, these factors are often challenged due to the presence of unknown terrains, environmental conditions, and novel object categories. Current object detectors, including open-world detectors, lack the abili…
▽ More
Automatic target recognition (ATR) plays a critical role in tasks such as navigation and surveillance, where safety and accuracy are paramount. In extreme use cases, such as military applications, these factors are often challenged due to the presence of unknown terrains, environmental conditions, and novel object categories. Current object detectors, including open-world detectors, lack the ability to confidently recognize novel objects or operate in unknown environments, as they have not been exposed to these new conditions. However, Large Vision-Language Models (LVLMs) exhibit emergent properties that enable them to recognize objects in varying conditions in a zero-shot manner. Despite this, LVLMs struggle to localize objects effectively within a scene. To address these limitations, we propose a novel pipeline that combines the detection capabilities of open-world detectors with the recognition confidence of LVLMs, creating a robust system for zero-shot ATR of novel classes and unknown domains. In this study, we compare the performance of various LVLMs for recognizing military vehicles, which are often underrepresented in training datasets. Additionally, we examine the impact of factors such as distance range, modality, and prompting methods on the recognition performance, providing insights into the development of more reliable ATR systems for novel conditions and classes.
△ Less
Submitted 13 January, 2025;
originally announced January 2025.
-
Automated use case diagram generator using NLP and ML
Authors:
Rukshan Piyumadu Dias,
C. S. L. Vidanapathirana,
Rukshala Weerasinghe,
Asitha Manupiya,
R. M. S. J. Bandara,
Y. P. H. W. Ranasinghe
Abstract:
This paper presents a novel approach to generate a use case diagram by analyzing the given user story using NLP and ML. Use case diagrams play a major role in the designing phase of the SDLC. This proves the fact that automating the use case diagram designing process would save a lot of time and effort. Numerous manual and semi-automated tools have been developed previously. This paper also discus…
▽ More
This paper presents a novel approach to generate a use case diagram by analyzing the given user story using NLP and ML. Use case diagrams play a major role in the designing phase of the SDLC. This proves the fact that automating the use case diagram designing process would save a lot of time and effort. Numerous manual and semi-automated tools have been developed previously. This paper also discusses the need for use case diagrams and problems faced during designing that. This paper is an attempt to solve those issues by generating the use case diagram in a fully automatic manner.
△ Less
Submitted 12 June, 2023;
originally announced June 2023.
-
$CrowdDiff$: Multi-hypothesis Crowd Density Estimation using Diffusion Models
Authors:
Yasiru Ranasinghe,
Nithin Gopalakrishnan Nair,
Wele Gedara Chaminda Bandara,
Vishal M. Patel
Abstract:
Crowd counting is a fundamental problem in crowd analysis which is typically accomplished by estimating a crowd density map and summing over the density values. However, this approach suffers from background noise accumulation and loss of density due to the use of broad Gaussian kernels to create the ground truth density maps. This issue can be overcome by narrowing the Gaussian kernel. However, e…
▽ More
Crowd counting is a fundamental problem in crowd analysis which is typically accomplished by estimating a crowd density map and summing over the density values. However, this approach suffers from background noise accumulation and loss of density due to the use of broad Gaussian kernels to create the ground truth density maps. This issue can be overcome by narrowing the Gaussian kernel. However, existing approaches perform poorly when trained with ground truth density maps with broad kernels. To deal with this limitation, we propose using conditional diffusion models to predict density maps, as diffusion models show high fidelity to training data during generation. With that, we present $CrowdDiff$ that generates the crowd density map as a reverse diffusion process. Furthermore, as the intermediate time steps of the diffusion process are noisy, we incorporate a regression branch for direct crowd estimation only during training to improve the feature learning. In addition, owing to the stochastic nature of the diffusion model, we introduce producing multiple density maps to improve the counting performance contrary to the existing crowd counting pipelines. We conduct extensive experiments on publicly available datasets to validate the effectiveness of our method. $CrowdDiff$ outperforms existing state-of-the-art crowd counting methods on several public crowd analysis benchmarks with significant improvements.
△ Less
Submitted 4 April, 2024; v1 submitted 22 March, 2023;
originally announced March 2023.
-
GAUSS: Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness
Authors:
Yasiru Ranasinghe,
Kavinga Weerasooriya,
Roshan Godaliyadda,
Vijitha Herath,
Parakrama Ekanayake,
Dhananjaya Jayasundara,
Lakshitha Ramanayake,
Neranjan Senarath,
Dulantha Wickramasinghe
Abstract:
In recent hyperspectral unmixing (HU) literature, the application of deep learning (DL) has become more prominent, especially with the autoencoder (AE) architecture. We propose a split architecture and use a pseudo-ground truth for abundances to guide the `unmixing network' (UN) optimization. Preceding the UN, an `approximation network' (AN) is proposed, which will improve the association between…
▽ More
In recent hyperspectral unmixing (HU) literature, the application of deep learning (DL) has become more prominent, especially with the autoencoder (AE) architecture. We propose a split architecture and use a pseudo-ground truth for abundances to guide the `unmixing network' (UN) optimization. Preceding the UN, an `approximation network' (AN) is proposed, which will improve the association between the centre pixel and its neighbourhood. Hence, it will accentuate spatial correlation in the abundances as its output is the input to the UN and the reference for the `mixing network' (MN). In the Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness (GAUSS), we proposed using one-hot encoded abundances as the pseudo-ground truth to guide the UN; computed using the k-means algorithm to exclude the use of prior HU methods. Furthermore, we release the single-layer constraint on MN by introducing the UN generated abundances in contrast to the standard AE for HU. Secondly, we experimented with two modifications on the pre-trained network using the GAUSS method. In GAUSS$_\textit{blind}$, we have concatenated the UN and the MN to back-propagate the reconstruction error gradients to the encoder. Then, in the GAUSS$_\textit{prime}$, abundance results of a signal processing (SP) method with reliable abundance results were used as the pseudo-ground truth with the GAUSS architecture. According to quantitative and graphical results for four experimental datasets, the three architectures either transcended or equated the performance of existing HU algorithms from both DL and SP domains.
△ Less
Submitted 20 December, 2022; v1 submitted 16 April, 2022;
originally announced April 2022.
-
Convolutional Autoencoder for Blind Hyperspectral Image Unmixing
Authors:
Yasiru Ranasinghe,
Sanjaya Herath,
Kavinga Weerasooriya,
Mevan Ekanayake,
Roshan Godaliyadda,
Parakrama Ekanayake,
Vijitha Herath
Abstract:
In the remote sensing context spectral unmixing is a technique to decompose a mixed pixel into two fundamental representatives: endmembers and abundances. In this paper, a novel architecture is proposed to perform blind unmixing on hyperspectral images. The proposed architecture consists of convolutional layers followed by an autoencoder. The encoder transforms the feature space produced through c…
▽ More
In the remote sensing context spectral unmixing is a technique to decompose a mixed pixel into two fundamental representatives: endmembers and abundances. In this paper, a novel architecture is proposed to perform blind unmixing on hyperspectral images. The proposed architecture consists of convolutional layers followed by an autoencoder. The encoder transforms the feature space produced through convolutional layers to a latent space representation. Then, from these latent characteristics the decoder reconstructs the roll-out image of the monochrome image which is at the input of the architecture; and each single-band image is fed sequentially. Experimental results on real hyperspectral data concludes that the proposed algorithm outperforms existing unmixing methods at abundance estimation and generates competitive results for endmember extraction with RMSE and SAD as the metrics, respectively.
△ Less
Submitted 18 November, 2020;
originally announced November 2020.
-
Constrained Nonnegative Matrix Factorization for Blind Hyperspectral Unmixing incorporating Endmember Independence
Authors:
E. M. M. B. Ekanayake,
H. M. H. K. Weerasooriya,
D. Y. L. Ranasinghe,
S. Herath,
B. Rathnayake,
G. M. R. I. Godaliyadda,
M. P. B. Ekanayake,
H. M. V. R. Herath
Abstract:
Hyperspectral unmixing (HU) has become an important technique in exploiting hyperspectral data since it decomposes a mixed pixel into a collection of endmembers weighted by fractional abundances. The endmembers of a hyperspectral image (HSI) are more likely to be generated by independent sources and be mixed in a macroscopic degree before arriving at the sensor element of the imaging spectrometer…
▽ More
Hyperspectral unmixing (HU) has become an important technique in exploiting hyperspectral data since it decomposes a mixed pixel into a collection of endmembers weighted by fractional abundances. The endmembers of a hyperspectral image (HSI) are more likely to be generated by independent sources and be mixed in a macroscopic degree before arriving at the sensor element of the imaging spectrometer as mixed spectra. Over the past few decades, many attempts have focused on imposing auxiliary constraints on the conventional nonnegative matrix factorization (NMF) framework in order to effectively unmix these mixed spectra. As a promising step toward finding an optimum constraint to extract endmembers, this paper presents a novel blind HU algorithm, referred to as Kurtosis-based Smooth Nonnegative Matrix Factorization (KbSNMF) which incorporates a novel constraint based on the statistical independence of the probability density functions of endmember spectra. Imposing this constraint on the conventional NMF framework promotes the extraction of independent endmembers while further enhancing the parts-based representation of data. Experiments conducted on diverse synthetic HSI datasets (with numerous numbers of endmembers, spectral bands, pixels, and noise levels) and three standard real HSI datasets demonstrate the validity of the proposed KbSNMF algorithm compared to several state-of-the-art NMF-based HU baselines. The proposed algorithm exhibits superior performance especially in terms of extracting endmember spectra from hyperspectral data; therefore, it could uplift the performance of recent deep learning HU methods which utilize the endmember spectra as supervisory input data for abundance extraction.
△ Less
Submitted 7 August, 2021; v1 submitted 2 March, 2020;
originally announced March 2020.