
Showing 1–13 of 13 results for author: Yılmaz, M A

  1. arXiv:2505.21262  [pdf, ps, other]

    cs.CV eess.IV

    DiMoSR: Feature Modulation via Multi-Branch Dilated Convolutions for Efficient Image Super-Resolution

    Authors: M. Akin Yilmaz, Ahmet Bilican, A. Murat Tekalp

    Abstract: Balancing reconstruction quality versus model efficiency remains a critical challenge in lightweight single image super-resolution (SISR). Despite the prevalence of attention mechanisms in recent state-of-the-art SISR approaches that primarily emphasize or suppress feature maps, alternative architectural paradigms warrant further exploration. This paper introduces DiMoSR (Dilated Modulation Super-…

    Submitted 27 May, 2025; originally announced May 2025.

  2. arXiv:2505.12532  [pdf, ps, other]

    cs.CV cs.AI cs.LG eess.IV eess.SP

    Exploring Sparsity for Parameter Efficient Fine Tuning Using Wavelets

    Authors: Ahmet Bilican, M. Akın Yılmaz, A. Murat Tekalp, R. Gökberk Cinbiş

    Abstract: Efficiently adapting large foundation models is critical, especially with tight compute and memory budgets. Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA offer limited granularity and effectiveness in few-parameter regimes. We propose Wavelet Fine-Tuning (WaveFT), a novel PEFT method that learns highly sparse updates in the wavelet domain of residual matrices. WaveFT allows precise c…

    Submitted 3 June, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

  3. arXiv:2503.11343  [pdf, other]

    eess.IV cs.CV

    FG-DFPN: Flow Guided Deformable Frame Prediction Network

    Authors: M. Akın Yılmaz, Ahmet Bilican, A. Murat Tekalp

    Abstract: Video frame prediction remains a fundamental challenge in computer vision with direct implications for autonomous systems, video compression, and media synthesis. We present FG-DFPN, a novel architecture that harnesses the synergy between optical flow estimation and deformable convolutions to model complex spatio-temporal dynamics. By guiding deformable sampling with motion cues, our approach addr…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: Submitted to the 33rd European Signal Processing Conference (EUSIPCO) 2025

  4. arXiv:2409.08772  [pdf, other]

    cs.MM cs.CV eess.IV

    The Practice of Averaging Rate-Distortion Curves over Testsets to Compare Learned Video Codecs Can Cause Misleading Conclusions

    Authors: M. Akin Yilmaz, Onur Keleş, A. Murat Tekalp

    Abstract: This paper aims to demonstrate how the prevalent practice in the learned video compression community of averaging rate-distortion (RD) curves across a test video set can lead to misleading conclusions in evaluating codec performance. Through analytical analysis of a simple case and experimental results with two recent learned video codecs, we show how averaged RD curves can mislead comparative eva…

    Submitted 24 December, 2024; v1 submitted 13 September, 2024; originally announced September 2024.

    Comments: Submitted to IEEE Signal Processing Letters

  5. arXiv:2402.08550  [pdf, other]

    eess.IV

    Motion-Adaptive Inference for Flexible Learned B-Frame Compression

    Authors: M. Akin Yilmaz, O. Ugur Ulas, Ahmet Bilican, A. Murat Tekalp

    Abstract: While the performance of recent learned intra and sequential video compression models exceeds that of respective traditional codecs, the performance of learned B-frame compression models generally lags behind traditional B-frame coding. The performance gap is bigger for complex scenes with large motions. This is related to the fact that the distance between the past and future references varies in hie…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 7 pages, submitted to IEEE ICIP 2024

  6. arXiv:2306.16544  [pdf, other]

    eess.IV cs.CV

    Multi-Scale Deformable Alignment and Content-Adaptive Inference for Flexible-Rate Bi-Directional Video Compression

    Authors: M. Akın Yılmaz, O. Ugur Ulas, A. Murat Tekalp

    Abstract: The lack of ability to adapt the motion compensation model to video content is an important limitation of current end-to-end learned video compression models. This paper advances the state-of-the-art by proposing an adaptive motion-compensation model for end-to-end rate-distortion optimized hierarchical bi-directional video compression. In particular, we propose two novelties: i) a multi-scale def…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: Accepted for publication in IEEE International Conference on Image Processing (ICIP) 2023

  7. arXiv:2206.13613  [pdf, other]

    eess.IV cs.CV

    Flexible-Rate Learned Hierarchical Bi-Directional Video Compression With Motion Refinement and Frame-Level Bit Allocation

    Authors: Eren Cetin, M. Akin Yilmaz, A. Murat Tekalp

    Abstract: This paper presents improvements and novel additions to our recent work on end-to-end optimized hierarchical bi-directional video compression to further advance the state-of-the-art in learned video compression. As an improvement, we combine motion estimation and prediction modules and compress refined residual motion vectors for improved rate-distortion performance. As a novel addition, we adapted…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted for publication in IEEE International Conference on Image Processing (ICIP 2022)

    Report number: 1850

  8. arXiv:2112.09529  [pdf, other]

    eess.IV cs.CV

    End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional Video Compression

    Authors: M. Akın Yılmaz, A. Murat Tekalp

    Abstract: Conventional video compression (VC) methods are based on motion compensated transform coding, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to the combinatorial nature of the end-to-end optimization problem. Learned VC allows end-to-end rate-distortion (R-D) optimized training of nonlinear transform, motion and entr…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: Accepted for publication in IEEE Transactions on Image Processing on 15 Dec. 2021

  9. arXiv:2105.12794  [pdf, other]

    cs.CV eess.IV

    DFPN: Deformable Frame Prediction Network

    Authors: M. Akın Yılmaz, A. Murat Tekalp

    Abstract: Learned frame prediction is a current problem of interest in computer vision and video compression. Although several deep network architectures have been proposed for learned frame prediction, to the best of our knowledge, there is no work based on using deformable convolutions for frame prediction. To this effect, we propose a deformable frame prediction network (DFPN) for task oriented implicit…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: Accepted for publication in IEEE International Conference on Image Processing (ICIP) 2021

  10. arXiv:2105.12107  [pdf, other]

    eess.IV cs.CV

    Self-Organized Variational Autoencoders (Self-VAE) for Learned Image Compression

    Authors: M. Akın Yılmaz, Onur Keleş, Hilal Güven, A. Murat Tekalp, Junaid Malik, Serkan Kıranyaz

    Abstract: In end-to-end optimized learned image compression, it is standard practice to use a convolutional variational autoencoder with generalized divisive normalization (GDN) to transform images into a latent space. Recently, Operational Neural Networks (ONNs) that learn the best non-linearity from a set of alternatives, and their self-organized variants, Self-ONNs, that approximate any non-linearity via…

    Submitted 28 May, 2021; v1 submitted 25 May, 2021; originally announced May 2021.

    Comments: Accepted for publication in IEEE International Conference on Image Processing (ICIP) 2021

  11. arXiv:2104.14868  [pdf, other]

    eess.IV cs.MM

    On the Computation of PSNR for a Set of Images or Video

    Authors: Onur Keleş, M. Akın Yılmaz, A. Murat Tekalp, Cansu Korkmaz, Zafer Dogan

    Abstract: When comparing learned image/video restoration and compression methods, it is common to report peak signal-to-noise ratio (PSNR) results. However, there does not exist a generally agreed upon practice to compute PSNR for sets of images or video. Some authors report the average of individual image/frame PSNR, which is equivalent to computing a single PSNR from the geometric mean of individual image/fra…

    Submitted 30 April, 2021; originally announced April 2021.

    Comments: Accepted for publication in Picture Coding Symposium (PCS) 2021

  12. Effect of Architectures and Training Methods on the Performance of Learned Video Frame Prediction

    Authors: M. Akin Yilmaz, A. Murat Tekalp

    Abstract: We analyze the performance of feedforward vs. recurrent neural network (RNN) architectures and associated training methods for learned frame prediction. To this effect, we trained a residual fully convolutional neural network (FCNN), a convolutional RNN (CRNN), and a convolutional long short-term memory (CLSTM) network for next frame prediction using the mean square loss. We performed both statele…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at IEEE ICIP 2019

  13. End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression

    Authors: M. Akin Yilmaz, A. Murat Tekalp

    Abstract: Conventional video compression methods employ a linear transform and block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to the combinatorial nature of the end-to-end optimization problem. Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules, quantization…

    Submitted 26 May, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: This work is accepted for publication in IEEE ICIP 2020