Search | arXiv e-print repository

doi 10.3390/jlpea8040046

Low-Complexity Loeffler DCT Approximations for Image and Video Coding

Authors: D. F. G. Coelho, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake, P. A. C. Martinez, T. L. T. Silveira, R. S. Oliveira, V. S. Dimitrov

Abstract: This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where computational complexity, proximity, and coding performance are considered. Efficient approximations and their scaled 16- and 32-point versions are embedded into image and video encoders, including a JPEG-like codec and H.264/AVC and H.265/HEVC standards. Results are compared to the unmodified standard codecs. Efficient approximations are mapped and implemented on a Xilinx VLX240T FPGA and evaluated for area, speed, and power consumption.

Submitted 28 July, 2022; originally announced July 2022.

Comments: 25 pages, 11 figures, 7 tables

Journal ref: J. Low Power Electron. Appl. 2018, 8(4), 46

arXiv:1612.03461 [pdf, ps, other]

doi 10.1109/CONIELECOMP.2015.7086923

Low-complexity Pruned 8-point DCT Approximations for Image Encoding

Authors: V. A. Coutinho, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake

Abstract: Two multiplierless pruned 8-point discrete cosine transform (DCT) approximations are presented. Both transforms present lower arithmetic complexity than state-of-the-art methods. The performance of these new methods was assessed in the image compression context. A JPEG-like simulation was performed, demonstrating the adequateness and competitiveness of the introduced methods. Digital VLSI implementation in CMOS technology was also considered. Both presented methods were realized in Berkeley Emulation Engine (BEE3).

Submitted 11 December, 2016; originally announced December 2016.

Comments: 13 pages, 6 figures, 3 tables

arXiv:1612.00807 [pdf, ps, other]

doi 10.1007/s00034-015-0233-z

Energy-efficient 8-point DCT Approximations: Theory and Hardware Architectures

Authors: R. J. Cintra, F. M. Bayer, V. A. Coutinho, S. Kulasekera, A. Madanayake

Abstract: Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for both 1-D and 2-D signal analysis. The increasing demand for low-complexity, energy-efficient methods requires algorithms with even lower computational costs. In this paper, new 8-point DCT approximations with very low arithmetic complexity are presented. The new transforms are proposed based on pruning state-of-the-art DCT approximations. The proposed algorithms were assessed in terms of arithmetic complexity, energy retention capability, and image compression performance. In addition, a metric combining performance and computational complexity measures was proposed. Results showed good performance and extremely low computational complexity. The introduced algorithms were mapped into systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology. All hardware-related metrics showed low resource consumption of the proposed pruned approximate transforms. The best proposed transform according to the introduced metric presents a reduction in power consumption of 21–25%.

Submitted 2 December, 2016; originally announced December 2016.

Comments: 21 pages, 7 figures, 5 tables

Journal ref: Circuits, Systems, and Signal Processing, November 2016, Volume 35, Issue 11, pp 4009-4029

arXiv:1609.07630 [pdf, ps, other]

doi 10.1109/TCSVT.2016.2515378

Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform

Authors: P. A. M. Oliveira, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake, V. A. Coutinho

Abstract: The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data, and fast algorithms can be developed for real-time applications. However, the DTT fast algorithm presented in the literature possesses high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards show good performance of the proposed transform. Regarding hardware resource consumption, the FPGA realization shows a 43.1% reduction of configurable logic blocks, and the ASIC place-and-route realization shows a 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.

Submitted 10 October, 2024; v1 submitted 24 September, 2016; originally announced September 2016.

Comments: Fixed typo in $C_g$ and $η$ measurements from Table 1 (W A S Aleixo); 11 pages, 5 figures, 4 tables

arXiv:1606.05562 [pdf, ps, other]

doi 10.1007/s11045-014-0291-6

An Orthogonal 16-point Approximate DCT for Image and Video Compression

Authors: T. L. T. da Silveira, F. M. Bayer, R. J. Cintra, S. Kulasekera, A. Madanayake, A. J. Kozakevicius

Abstract: A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed, requiring only 60 additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed, being physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. Obtained results show negligible video degradation when compared to the Chen DCT algorithm in HEVC.

Submitted 26 May, 2016; originally announced June 2016.

Comments: 18 pages, 7 figures, 6 tables

Journal ref: Multidimensional Systems and Signal Processing, vol. 27, no. 1, pp. 87-104, 2016

arXiv:1505.06345 [pdf, ps, other]

doi 10.1109/MWSYM.2015.7167112

Multi-beam 4 GHz Microwave Apertures Using Current-Mode DFT Approximation on 65 nm CMOS

Authors: V. Ariyarathna, S. Kulasekera, A. Madanayake, D. Suarez, R. J. Cintra, F. M. Bayer, L. Belostotski

Abstract: A current-mode CMOS design is proposed for realizing receive-mode multi-beams in the analog domain using a novel DFT approximation. High-bandwidth CMOS RF transistors are employed in low-voltage current mirrors to achieve bandwidths exceeding 4 GHz with good beam fidelity. Current mirrors realize the coefficients of the considered DFT approximation, which take simple values in $\{0, \pm1, \pm2\}$ only. This allows high-bandwidth realizations using simple circuitry without needing phase-shifters or delays. The proposed design is used as a method to efficiently achieve spatial discrete Fourier transform operation across a ULA to obtain multiple simultaneous RF beams. An example using 1.2 V current-mode approximate DFT on 65 nm CMOS, with BSIM4 models from the RF kit, shows potential operation up to 4 GHz with eight independent aperture beams.

Submitted 23 May, 2015; originally announced May 2015.

Comments: 7 pages, 4 figures, In: IEEE International Microwave Symposium 2015

arXiv:1502.00555 [pdf, ps, other]

doi 10.1109/LSP.2015.2389899

A Discrete Tchebichef Transform Approximation for Image and Video Coding

Authors: P. A. M. Oliveira, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake

Abstract: In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, a Xilinx Virtex-6 FPGA-based hardware realization shows a 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.

Submitted 28 January, 2015; originally announced February 2015.

Comments: 13 pages, 5 figures, 2 tables

Journal ref: IEEE Signal Processing Letters, vol. 22, issue 8, pp. 1137-1141, 2015

arXiv:1501.02995 [pdf, ps, other]

doi 10.1109/TCSI.2013.2295022

Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions

Authors: U. S. Potluri, A. Madanayake, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Edirisuriya

Abstract: Demand from the multimedia market for low-energy video processing systems such as HEVC has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.

Submitted 13 January, 2015; originally announced January 2015.

Comments: 30 pages, 7 figures, 5 tables

Journal ref: Circuits and Systems I: Regular Papers, IEEE Transactions on, Volume 61, Issue 6, June 2014, pp. 1727–1740

arXiv:1402.5979 [pdf, ps, other]

doi 10.1007/s11554-015-0492-8

A Multiplierless Pruned DCT-like Transformation for Image and Video Compression that Requires 10 Additions Only

Authors: V. A. Coutinho, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake

Abstract: A multiplierless pruned approximate 8-point discrete cosine transform (DCT) requiring only 10 additions is introduced. The proposed algorithm was assessed in image and video compression, showing competitive performance with state-of-the-art methods. Digital implementation in 45 nm CMOS technology up to place-and-route level indicates a clock speed of 288 MHz at a 1.1 V supply. The 8x8 block rate is 36 MHz. The DCT approximation was embedded into HEVC reference software; resulting video frames, at up to 327 Hz for 8-bit RGB HEVC, presented negligible image degradation.

Submitted 11 December, 2016; v1 submitted 24 February, 2014; originally announced February 2014.

Comments: 13 pages, 4 figures, 5 tables

Journal ref: Journal of Real-Time Image Processing, August 2016, Volume 12, Issue 2, pp 247-255

Showing 1–9 of 9 results for author: Kulasekera, S