-
Artificially Generated Visual Scanpath Improves Multi-label Thoracic Disease Classification in Chest X-Ray Images
Authors:
Ashish Verma,
Aupendu Kar,
Krishnendu Ghosh,
Sobhan Kanti Dhara,
Debashis Sen,
Prabir Kumar Biswas
Abstract:
Expert radiologists visually scan Chest X-Ray (CXR) images, sequentially fixating on anatomical structures to perform disease diagnosis. An automatic multi-label classifier of diseases in CXR images can benefit by incorporating aspects of the radiologists' approach. Recorded visual scanpaths of radiologists on CXR images can be used for the said purpose. But, such scanpaths are not available for m…
▽ More
Expert radiologists visually scan Chest X-Ray (CXR) images, sequentially fixating on anatomical structures to perform disease diagnosis. An automatic multi-label classifier of diseases in CXR images can benefit by incorporating aspects of the radiologists' approach. Recorded visual scanpaths of radiologists on CXR images can be used for the said purpose. But, such scanpaths are not available for most CXR images, which creates a gap even for modern deep learning based classifiers. This paper proposes to mitigate this gap by generating effective artificial visual scanpaths using a visual scanpath prediction model for CXR images. Further, a multi-class multi-label classifier framework is proposed that uses a generated scanpath and visual image features to classify diseases in CXR images. While the scanpath predictor is based on a recurrent neural network, the multi-label classifier involves a novel iterative sequential model with an attention module. We show that our scanpath predictor generates human-like visual scanpaths. We also demonstrate that the use of artificial visual scanpaths improves multi-class multi-label disease classification results on CXR images. The above observations are made from experiments involving around 0.2 million CXR images from 2 widely-used datasets considering the multi-label classification of 14 pathological findings. Code link: https://github.com/ashishverma03/SDC
△ Less
Submitted 1 March, 2025;
originally announced March 2025.
-
Self-supervision via Controlled Transformation and Unpaired Self-conditioning for Low-light Image Enhancement
Authors:
Aupendu Kar,
Sobhan K. Dhara,
Debashis Sen,
Prabir K. Biswas
Abstract:
Real-world low-light images captured by imaging devices suffer from poor visibility and require a domain-specific enhancement to produce artifact-free outputs that reveal details. In this paper, we propose an unpaired low-light image enhancement network leveraging novel controlled transformation-based self-supervision and unpaired self-conditioning strategies. The model determines the required deg…
▽ More
Real-world low-light images captured by imaging devices suffer from poor visibility and require a domain-specific enhancement to produce artifact-free outputs that reveal details. In this paper, we propose an unpaired low-light image enhancement network leveraging novel controlled transformation-based self-supervision and unpaired self-conditioning strategies. The model determines the required degrees of enhancement at the input image pixels, which are learned from the unpaired low-lit and well-lit images without any direct supervision. The self-supervision is based on a controlled transformation of the input image and subsequent maintenance of its enhancement in spite of the transformation. The self-conditioning performs training of the model on unpaired images such that it does not enhance an already-enhanced image or a well-lit input image. The inherent noise in the input low-light images is handled by employing low gradient magnitude suppression in a detail-preserving manner. In addition, our noise handling is self-conditioned by preventing the denoising of noise-free well-lit images. The training based on low-light image enhancement-specific attributes allows our model to avoid paired supervision without compromising significantly in performance. While our proposed self-supervision aids consistent enhancement, our novel self-conditioning facilitates adequate enhancement. Extensive experiments on multiple standard datasets demonstrate that our model, in general, outperforms the state-of-the-art both quantitatively and subjectively. Ablation studies show the effectiveness of our self-supervision and self-conditioning strategies, and the related loss functions.
△ Less
Submitted 1 March, 2025;
originally announced March 2025.
-
Variable Resolution Pixel Quantization for Low Power Machine Vision Application on Edge
Authors:
Senorita Deb,
Sai Sanjeet,
Prabir Kumar Biswas,
Bibhu Datta Sahoo
Abstract:
This work describes an approach towards pixel quantization using variable resolution which is made feasible using image transformation in the analog domain. The main aim is to reduce the average bits-per-pixel (BPP) necessary for representing an image while maintaining the classification accuracy of a Convolutional Neural Network (CNN) that is trained for image classification. The proposed algorit…
▽ More
This work describes an approach towards pixel quantization using variable resolution which is made feasible using image transformation in the analog domain. The main aim is to reduce the average bits-per-pixel (BPP) necessary for representing an image while maintaining the classification accuracy of a Convolutional Neural Network (CNN) that is trained for image classification. The proposed algorithm is based on the Hadamard transform that leads to a low-resolution variable quantization by the analog-to-digital converter (ADC) thus reducing the power dissipation in hardware at the sensor node. Despite the trade-offs inherent in image transformation, the proposed algorithm achieves competitive accuracy levels across various image sizes and ADC configurations, highlighting the importance of considering both accuracy and power consumption in edge computing applications. The schematic of a novel 1.5 bit ADC that incorporates the Hadamard transform is also proposed. A hardware implementation of the analog transformation followed by software-based variable quantization is done for the CIFAR-10 test dataset. The digitized data shows that the network can still identify transformed images with a remarkable 90% accuracy for 3-BPP transformed images following the proposed method.
△ Less
Submitted 7 October, 2024;
originally announced October 2024.
-
Self Supervised Low Dose Computed Tomography Image Denoising Using Invertible Network Exploiting Inter Slice Congruence
Authors:
Sutanu Bera,
Prabir Kumar Biswas
Abstract:
The resurgence of deep neural networks has created an alternative pathway for low-dose computed tomography denoising by learning a nonlinear transformation function between low-dose CT (LDCT) and normal-dose CT (NDCT) image pairs. However, those paired LDCT and NDCT images are rarely available in the clinical environment, making deep neural network deployment infeasible. This study proposes a nove…
▽ More
The resurgence of deep neural networks has created an alternative pathway for low-dose computed tomography denoising by learning a nonlinear transformation function between low-dose CT (LDCT) and normal-dose CT (NDCT) image pairs. However, those paired LDCT and NDCT images are rarely available in the clinical environment, making deep neural network deployment infeasible. This study proposes a novel method for self-supervised low-dose CT denoising to alleviate the requirement of paired LDCT and NDCT images. Specifically, we have trained an invertible neural network to minimize the pixel-based mean square distance between a noisy slice and the average of its two immediate adjacent noisy slices. We have shown the aforementioned is similar to training a neural network to minimize the distance between clean NDCT and noisy LDCT image pairs. Again, during the reverse mapping of the invertible network, the output image is mapped to the original input image, similar to cycle consistency loss. Finally, the trained invertible network's forward mapping is used for denoising LDCT images. Extensive experiments on two publicly available datasets showed that our method performs favourably against other existing unsupervised methods.
△ Less
Submitted 3 November, 2022;
originally announced November 2022.
-
Sub-Aperture Feature Adaptation in Single Image Super-resolution Model for Light Field Imaging
Authors:
Aupendu Kar,
Suresh Nehra,
Jayanta Mukhopadhyay,
Prabir Kumar Biswas
Abstract:
With the availability of commercial Light Field (LF) cameras, LF imaging has emerged as an up and coming technology in computational photography. However, the spatial resolution is significantly constrained in commercial microlens based LF cameras because of the inherent multiplexing of spatial and angular information. Therefore, it becomes the main bottleneck for other applications of light field…
▽ More
With the availability of commercial Light Field (LF) cameras, LF imaging has emerged as an up and coming technology in computational photography. However, the spatial resolution is significantly constrained in commercial microlens based LF cameras because of the inherent multiplexing of spatial and angular information. Therefore, it becomes the main bottleneck for other applications of light field cameras. This paper proposes an adaptation module in a pretrained Single Image Super Resolution (SISR) network to leverage the powerful SISR model instead of using highly engineered light field imaging domain specific Super Resolution models. The adaption module consists of a Sub aperture Shift block and a fusion block. It is an adaptation in the SISR network to further exploit the spatial and angular information in LF images to improve the super resolution performance. Experimental validation shows that the proposed method outperforms existing light field super resolution algorithms. It also achieves PSNR gains of more than 1 dB across all the datasets as compared to the same pretrained SISR models for scale factor 2, and PSNR gains 0.6 to 1 dB for scale factor 4.
△ Less
Submitted 26 July, 2022; v1 submitted 24 July, 2022;
originally announced July 2022.
-
Iterative Gradient Encoding Network with Feature Co-Occurrence Loss for Single Image Reflection Removal
Authors:
Sutanu Bera,
Prabir Kumar Biswas
Abstract:
Removing undesired reflections from a photo taken in front of glass is of great importance for enhancing visual computing systems' efficiency. Previous learning-based approaches have produced visually plausible results for some reflections type, however, failed to generalize against other reflection types. There is a dearth of literature for efficient methods concerning single image reflection rem…
▽ More
Removing undesired reflections from a photo taken in front of glass is of great importance for enhancing visual computing systems' efficiency. Previous learning-based approaches have produced visually plausible results for some reflections type, however, failed to generalize against other reflection types. There is a dearth of literature for efficient methods concerning single image reflection removal, which can generalize well in large-scale reflection types. In this study, we proposed an iterative gradient encoding network for single image reflection removal. Next, to further supervise the network in learning the correlation between the transmission layer features, we proposed a feature co-occurrence loss. Extensive experiments on the public benchmark dataset of SIR$^2$ demonstrated that our method can remove reflection favorably against the existing state-of-the-art method on all imaging settings, including diverse backgrounds. Moreover, as the reflection strength increases, our method can still remove reflection even where other state of the art methods failed.
△ Less
Submitted 29 March, 2021;
originally announced March 2021.
-
A Novel Approach for Earthquake Early Warning System Design using Deep Learning Techniques
Authors:
Tonumoy Mukherjee,
Chandrani Singh,
Prabir Kumar Biswas
Abstract:
Earthquake signals are non-stationary in nature and thus in real-time, it is difficult to identify and classify events based on classical approaches like peak ground displacement, peak ground velocity. Even the popular algorithm of STA/LTA requires extensive research to determine basic thresholding parameters so as to trigger an alarm. Also, many times due to human error or other unavoidable natur…
▽ More
Earthquake signals are non-stationary in nature and thus in real-time, it is difficult to identify and classify events based on classical approaches like peak ground displacement, peak ground velocity. Even the popular algorithm of STA/LTA requires extensive research to determine basic thresholding parameters so as to trigger an alarm. Also, many times due to human error or other unavoidable natural factors such as thunder strikes or landslides, the algorithm may end up raising a false alarm. This work focuses on detecting earthquakes by converting seismograph recorded data into corresponding audio signals for better perception and then uses popular Speech Recognition techniques of Filter bank coefficients and Mel Frequency Cepstral Coefficients (MFCC) to extract the features. These features were then used to train a Convolutional Neural Network(CNN) and a Long Short Term Memory(LSTM) network. The proposed method can overcome the above-mentioned problems and help in detecting earthquakes automatically from the waveforms without much human intervention. For the 1000Hz audio data set the CNN model showed a testing accuracy of 91.1% for 0.2-second sample window length while the LSTM model showed 93.99% for the same. A total of 610 sounds consisting of 310 earthquake sounds and 300 non-earthquake sounds were used to train the models. While testing, the total time required for generating the alarm was approximately 2 seconds which included individual times for data collection, processing, and prediction taking into consideration the processing and prediction delays. This shows the effectiveness of the proposed method for Earthquake Early Warning (EEW) applications. Since the input of the method is only the waveform, it is suitable for real-time processing, thus the models can also be used as an onsite EEW system requiring a minimum amount of preparation time and workload.
△ Less
Submitted 16 January, 2021;
originally announced January 2021.
-
Noise Conscious Training of Non Local Neural Network powered by Self Attentive Spectral Normalized Markovian Patch GAN for Low Dose CT Denoising
Authors:
Sutanu Bera,
Prabir Kumar Biswas
Abstract:
The explosive rise of the use of Computer tomography (CT) imaging in medical practice has heightened public concern over the patient's associated radiation dose. However, reducing the radiation dose leads to increased noise and artifacts, which adversely degrades the scan's interpretability. Consequently, an advanced image reconstruction algorithm to improve the diagnostic performance of low dose…
▽ More
The explosive rise of the use of Computer tomography (CT) imaging in medical practice has heightened public concern over the patient's associated radiation dose. However, reducing the radiation dose leads to increased noise and artifacts, which adversely degrades the scan's interpretability. Consequently, an advanced image reconstruction algorithm to improve the diagnostic performance of low dose ct arose as the primary concern among the researchers, which is challenging due to the ill-posedness of the problem. In recent times, the deep learning-based technique has emerged as a dominant method for low dose CT(LDCT) denoising. However, some common bottleneck still exists, which hinders deep learning-based techniques from furnishing the best performance. In this study, we attempted to mitigate these problems with three novel accretions. First, we propose a novel convolutional module as the first attempt to utilize neighborhood similarity of CT images for denoising tasks. Our proposed module assisted in boosting the denoising by a significant margin. Next, we moved towards the problem of non-stationarity of CT noise and introduced a new noise aware mean square error loss for LDCT denoising. Moreover, the loss mentioned above also assisted to alleviate the laborious effort required while training CT denoising network using image patches. Lastly, we propose a novel discriminator function for CT denoising tasks. The conventional vanilla discriminator tends to overlook the fine structural details and focus on the global agreement. Our proposed discriminator leverage self-attention and pixel-wise GANs for restoring the diagnostic quality of LDCT images. Our method validated on a publicly available dataset of the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge performed remarkably better than the existing state of the art method.
△ Less
Submitted 11 November, 2020;
originally announced November 2020.
-
Lightweight Modules for Efficient Deep Learning based Image Restoration
Authors:
Avisek Lahiri,
Sourav Bairagya,
Sutanu Bera,
Siddhant Haldar,
Prabir Kumar Biswas
Abstract:
Low level image restoration is an integral component of modern artificial intelligence (AI) driven camera pipelines. Most of these frameworks are based on deep neural networks which present a massive computational overhead on resource constrained platform like a mobile phone. In this paper, we propose several lightweight low-level modules which can be used to create a computationally low cost vari…
▽ More
Low level image restoration is an integral component of modern artificial intelligence (AI) driven camera pipelines. Most of these frameworks are based on deep neural networks which present a massive computational overhead on resource constrained platform like a mobile phone. In this paper, we propose several lightweight low-level modules which can be used to create a computationally low cost variant of a given baseline model. Recent works for efficient neural networks design have mainly focused on classification. However, low-level image processing falls under the image-to-image' translation genre which requires some additional computational modules not present in classification. This paper seeks to bridge this gap by designing generic efficient modules which can replace essential components used in contemporary deep learning based image restoration networks. We also present and analyse our results highlighting the drawbacks of applying depthwise separable convolutional kernel (a popular method for efficient classification network) for sub-pixel convolution based upsampling (a popular upsampling strategy for low-level vision applications). This shows that concepts from domain of classification cannot always be seamlessly integrated into image-to-image translation tasks. We extensively validate our findings on three popular tasks of image inpainting, denoising and super-resolution. Our results show that proposed networks consistently output visually similar reconstructions compared to full capacity baselines with significant reduction of parameters, memory footprint and execution speeds on contemporary mobile devices.
△ Less
Submitted 11 July, 2020;
originally announced July 2020.
-
A lightweight convolutional neural network for image denoising with fine details preservation capability
Authors:
Sutanu Bera,
Avisek Lahiri,
Prabir Kumar Biswas
Abstract:
Image denoising is a fundamental problem in image processing whose primary objective is to remove the noise while preserving the original image structure. In this work, we proposed a new architecture for image denoising. We have used several dense blocks to design our network. Additionally, we have forwarded feature extracted in the first layer to the input of every transition layer. Our experimen…
▽ More
Image denoising is a fundamental problem in image processing whose primary objective is to remove the noise while preserving the original image structure. In this work, we proposed a new architecture for image denoising. We have used several dense blocks to design our network. Additionally, we have forwarded feature extracted in the first layer to the input of every transition layer. Our experimental result suggests that the use of low-level feature helps in reconstructing better texture. Furthermore, we had trained our network with a combination of MSE and a differentiable multi-scale structural similarity index(MS-SSIM). With proper training, our proposed model with a much lower parameter can outperform other models which were with trained much higher parameters. We evaluated our algorithm on two grayscale benchmark dataset BSD68 and SET12. Our model had achieved similar PSNR with the current state of the art methods and most of the time better SSIM than other algorithms.
△ Less
Submitted 22 March, 2019;
originally announced March 2019.