-
Diff-SPORT: Diffusion-based Sensor Placement Optimization and Reconstruction of Turbulent flows in urban environments
Authors:
Abhijeet Vishwasrao,
Sai Bharath Chandra Gutha,
Andres Cremades,
Klas Wijk,
Aakash Patil,
Catherine Gorle,
Beverley J McKeon,
Hossein Azizpour,
Ricardo Vinuesa
Abstract:
Rapid urbanization demands accurate and efficient monitoring of turbulent wind patterns to support air quality, climate resilience and infrastructure design. Traditional sparse reconstruction and sensor placement strategies face major accuracy degradations under practical constraints. Here, we introduce Diff-SPORT, a diffusion-based framework for high-fidelity flow reconstruction and optimal senso…
▽ More
Rapid urbanization demands accurate and efficient monitoring of turbulent wind patterns to support air quality, climate resilience and infrastructure design. Traditional sparse reconstruction and sensor placement strategies face major accuracy degradations under practical constraints. Here, we introduce Diff-SPORT, a diffusion-based framework for high-fidelity flow reconstruction and optimal sensor placement in urban environments. Diff-SPORT combines a generative diffusion model with a maximum a posteriori (MAP) inference scheme and a Shapley-value attribution framework to propose a scalable and interpretable solution. Compared to traditional numerical methods, Diff-SPORT achieves significant speedups while maintaining both statistical and instantaneous flow fidelity. Our approach offers a modular, zero-shot alternative to retraining-intensive strategies, supporting fast and reliable urban flow monitoring under extreme sparsity. Diff-SPORT paves the way for integrating generative modeling and explainability in sustainable urban intelligence.
△ Less
Submitted 30 May, 2025;
originally announced June 2025.
-
Inverse Problems with Diffusion Models: A MAP Estimation Perspective
Authors:
Sai Bharath Chandra Gutha,
Ricardo Vinuesa,
Hossein Azizpour
Abstract:
Inverse problems have many applications in science and engineering. In Computer vision, several image restoration tasks such as inpainting, deblurring, and super-resolution can be formally modeled as inverse problems. Recently, methods have been developed for solving inverse problems that only leverage a pre-trained unconditional diffusion model and do not require additional task-specific training…
▽ More
Inverse problems have many applications in science and engineering. In Computer vision, several image restoration tasks such as inpainting, deblurring, and super-resolution can be formally modeled as inverse problems. Recently, methods have been developed for solving inverse problems that only leverage a pre-trained unconditional diffusion model and do not require additional task-specific training. In such methods, however, the inherent intractability of determining the conditional score function during the reverse diffusion process poses a real challenge, leaving the methods to settle with an approximation instead, which affects their performance in practice. Here, we propose a MAP estimation framework to model the reverse conditional generation process of a continuous time diffusion model as an optimization process of the underlying MAP objective, whose gradient term is tractable. In theory, the proposed framework can be applied to solve general inverse problems using gradient-based optimization methods. However, given the highly non-convex nature of the loss objective, finding a perfect gradient-based optimization algorithm can be quite challenging, nevertheless, our framework offers several potential research directions. We use our proposed formulation to develop empirically effective algorithms for image restoration. We validate our proposed algorithms with extensive experiments over multiple datasets across several restoration tasks.
△ Less
Submitted 18 September, 2024; v1 submitted 27 July, 2024;
originally announced July 2024.
-
Indirectly Parameterized Concrete Autoencoders
Authors:
Alfred Nilsson,
Klas Wijk,
Sai bharath chandra Gutha,
Erik Englesson,
Alexandra Hotti,
Carlo Saccardi,
Oskar Kviman,
Jens Lagergren,
Ricardo Vinuesa,
Hossein Azizpour
Abstract:
Feature selection is a crucial task in settings where data is high-dimensional or acquiring the full set of features is costly. Recent developments in neural network-based embedded feature selection show promising results across a wide range of applications. Concrete Autoencoders (CAEs), considered state-of-the-art in embedded feature selection, may struggle to achieve stable joint optimization, h…
▽ More
Feature selection is a crucial task in settings where data is high-dimensional or acquiring the full set of features is costly. Recent developments in neural network-based embedded feature selection show promising results across a wide range of applications. Concrete Autoencoders (CAEs), considered state-of-the-art in embedded feature selection, may struggle to achieve stable joint optimization, hurting their training time and generalization. In this work, we identify that this instability is correlated with the CAE learning duplicate selections. To remedy this, we propose a simple and effective improvement: Indirectly Parameterized CAEs (IP-CAEs). IP-CAEs learn an embedding and a mapping from it to the Gumbel-Softmax distributions' parameters. Despite being simple to implement, IP-CAE exhibits significant and consistent improvements over CAE in both generalization and training time across several datasets for reconstruction and classification. Unlike CAE, IP-CAE effectively leverages non-linear relationships and does not require retraining the jointly optimized decoder. Furthermore, our approach is, in principle, generalizable to Gumbel-Softmax distributions beyond feature selection.
△ Less
Submitted 16 August, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
End-to-End Whisper to Natural Speech Conversion using Modified Transformer Network
Authors:
Abhishek Niranjan,
Mukesh Sharma,
Sai Bharath Chandra Gutha,
M Ali Basha Shaik
Abstract:
Machine recognition of an atypical speech like whispered speech, is a challenging task. We introduce whisper-to-natural-speech conversion using sequence-to-sequence approach by proposing enhanced transformer architecture, which uses both parallel and non-parallel data. We investigate different features like Mel frequency cepstral coefficients and smoothed spectral features. The proposed networks a…
▽ More
Machine recognition of an atypical speech like whispered speech, is a challenging task. We introduce whisper-to-natural-speech conversion using sequence-to-sequence approach by proposing enhanced transformer architecture, which uses both parallel and non-parallel data. We investigate different features like Mel frequency cepstral coefficients and smoothed spectral features. The proposed networks are trained end-to-end using supervised approach for feature-to-feature transformation. Further, we also investigate the effectiveness of embedded auxillary decoder used after N encoder sub-layers, trained with the frame-level objective function for identifying source phoneme labels. We show results on opensource wTIMIT and CHAINS datasets by measuring word error rate using end-to-end ASR and also BLEU scores for the generated speech. Alternatively, we also propose a novel method to measure spectral shape of it by measuring formant distributions w.r.t. reference speech, as formant divergence metric. We have found whisper-to-natural converted speech formants probability distribution is similar to the groundtruth distribution. To the authors' best knowledge, this is the first time enhanced transformer has been proposed, both with and without auxiliary decoder for whisper-to-natural-speech conversion and vice versa.
△ Less
Submitted 5 April, 2021; v1 submitted 20 April, 2020;
originally announced April 2020.