Showing 1–2 of 2 results for author: Bao, H T

Search v0.5.6 released 2020-02-24

arXiv:2508.19581 [pdf, ps, other]

cs.CV

Guiding Noisy Label Conditional Diffusion Models with Score-based Discriminator Correction

Authors: Dat Nguyen Cong, Hieu Tran Bao, Hoang Thanh-Tung

Abstract: Diffusion models have gained prominence as state-of-the-art techniques for synthesizing images and videos, particularly due to their ability to scale effectively with large datasets. Recent studies have uncovered that these extensive datasets often contain mistakes from manual labeling processes. However, the extent to which such errors compromise the generative capabilities and controllability of… ▽ More Diffusion models have gained prominence as state-of-the-art techniques for synthesizing images and videos, particularly due to their ability to scale effectively with large datasets. Recent studies have uncovered that these extensive datasets often contain mistakes from manual labeling processes. However, the extent to which such errors compromise the generative capabilities and controllability of diffusion models is not well studied. This paper introduces Score-based Discriminator Correction (SBDC), a guidance technique for aligning noisy pre-trained conditional diffusion models. The guidance is built on discriminator training using adversarial loss, drawing on prior noise detection techniques to assess the authenticity of each sample. We further show that limiting the usage of our guidance to the early phase of the generation process leads to better performance. Our method is computationally efficient, only marginally increases inference time, and does not require retraining diffusion models. Experiments on different noise settings demonstrate the superiority of our method over previous state-of-the-art methods. △ Less

Submitted 27 August, 2025; originally announced August 2025.

Comments: 21 pages, 16 figures
arXiv:2502.10954 [pdf, other]

cs.CV cs.AI cs.LG

Learning to Stop Overthinking at Test Time

Authors: Hieu Tran Bao, Nguyen Cong Dat, Nguyen Duc Anh, Hoang Thanh-Tung

Abstract: Test time scaling is currently one of the most active research areas that shows promise after training time scaling has reached its limits. Deep-thinking (DT) models are a class of recurrent models that can perform easy-to-hard generalization by assigning more compute to harder test samples. However, due to their inability to determine the complexity of a test sample, DT models have to use a large… ▽ More Test time scaling is currently one of the most active research areas that shows promise after training time scaling has reached its limits. Deep-thinking (DT) models are a class of recurrent models that can perform easy-to-hard generalization by assigning more compute to harder test samples. However, due to their inability to determine the complexity of a test sample, DT models have to use a large amount of computation for both easy and hard test samples. Excessive test time computation is wasteful and can cause the ``overthinking'' problem where more test time computation leads to worse results. In this paper, we introduce a test time training method for determining the optimal amount of computation needed for each sample during test time. We also propose Conv-LiGRU, a novel recurrent architecture for efficient and robust visual reasoning. Extensive experiments demonstrate that Conv-LiGRU is more stable than DT, effectively mitigates the ``overthinking'' phenomenon, and achieves superior accuracy. △ Less

Submitted 17 February, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

Search v0.5.6 released 2020-02-24