-
Low-Complexity Loeffler DCT Approximations for Image and Video Coding
Authors:
D. F. G. Coelho,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Madanayake,
P. A. C. Martinez,
T. L. T. Silveira,
R. S. Oliveira,
V. S. Dimitrov
Abstract:
This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where…
▽ More
This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where computational complexity, proximity, and coding performance are considered. Efficient approximations and their scaled 16- and 32-point versions are embedded into image and video encoders, including a JPEG-like codec and H.264/AVC and H.265/HEVC standards. Results are compared to the unmodified standard codecs. Efficient approximations are mapped and implemented on a Xilinx VLX240T FPGA and evaluated for area, speed, and power consumption.
△ Less
Submitted 28 July, 2022;
originally announced July 2022.
-
Low-complexity Pruned 8-point DCT Approximations for Image Encoding
Authors:
V. A. Coutinho,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Madanayake
Abstract:
Two multiplierless pruned 8-point discrete cosine transform (DCT) approximation are presented. Both transforms present lower arithmetic complexity than state-of-the-art methods. The performance of such new methods was assessed in the image compression context. A JPEG-like simulation was performed, demonstrating the adequateness and competitiveness of the introduced methods. Digital VLSI implementa…
▽ More
Two multiplierless pruned 8-point discrete cosine transform (DCT) approximation are presented. Both transforms present lower arithmetic complexity than state-of-the-art methods. The performance of such new methods was assessed in the image compression context. A JPEG-like simulation was performed, demonstrating the adequateness and competitiveness of the introduced methods. Digital VLSI implementation in CMOS technology was also considered. Both presented methods were realized in Berkeley Emulation Engine (BEE3).
△ Less
Submitted 11 December, 2016;
originally announced December 2016.
-
Energy-efficient 8-point DCT Approximations: Theory and Hardware Architectures
Authors:
R. J. Cintra,
F. M. Bayer,
V. A. Coutinho,
S. Kulasekera,
A. Madanayake
Abstract:
Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for both 1-D and 2-D signal analysis. The increasing demand for low-complexity, energy efficient methods require algorithms with even lower computational…
▽ More
Due to its remarkable energy compaction properties, the discrete cosine transform (DCT) is employed in a multitude of compression standards, such as JPEG and H.265/HEVC. Several low-complexity integer approximations for the DCT have been proposed for both 1-D and 2-D signal analysis. The increasing demand for low-complexity, energy efficient methods require algorithms with even lower computational costs. In this paper, new 8-point DCT approximations with very low arithmetic complexity are presented. The new transforms are proposed based on pruning state-of-the-art DCT approximations. The proposed algorithms were assessed in terms of arithmetic complexity, energy retention capability, and image compression performance. In addition, a metric combining performance and computational complexity measures was proposed. Results showed good performance and extremely low computational complexity. Introduced algorithms were mapped into systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45nm CMOS technology. All hardware-related metrics showed low resource consumption of the proposed pruned approximate transforms. The best proposed transform according to the introduced metric presents a reduction in power consumption of 21--25%.
△ Less
Submitted 2 December, 2016;
originally announced December 2016.
-
Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform
Authors:
P. A. M. Oliveira,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Madanayake,
V. A. Coutinho
Abstract:
The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data and fast algorithms can be developed to real time applications. However, the DTT fast algorithm presented…
▽ More
The usage of linear transformations has great relevance for data decorrelation applications, like image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data and fast algorithms can be developed to real time applications. However, the DTT fast algorithm presented in literature possess high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards shows good performance of the proposed transform. Regarding hardware resource consumption for FPGA shows 43.1% reduction of configurable logic blocks and ASIC place and route realization shows 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.
△ Less
Submitted 10 October, 2024; v1 submitted 24 September, 2016;
originally announced September 2016.
-
An Orthogonal 16-point Approximate DCT for Image and Video Compression
Authors:
T. L. T. da Silveira,
F. M. Bayer,
R. J. Cintra,
S. Kulasekera,
A. Madanayake,
A. J. Kozakevicius
Abstract:
A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed requiring only 60~additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image…
▽ More
A low-complexity orthogonal multiplierless approximation for the 16-point discrete cosine transform (DCT) was introduced. The proposed method was designed to possess a very low computational cost. A fast algorithm based on matrix factorization was proposed requiring only 60~additions. The proposed architecture outperforms classical and state-of-the-art algorithms when assessed as a tool for image and video compression. Digital VLSI hardware implementations were also proposed being physically realized in FPGA technology and implemented in 45 nm up to synthesis and place-route levels. Additionally, the proposed method was embedded into a high efficiency video coding (HEVC) reference software for actual proof-of-concept. Obtained results show negligible video degradation when compared to Chen DCT algorithm in HEVC.
△ Less
Submitted 26 May, 2016;
originally announced June 2016.
-
Multi-beam 4 GHz Microwave Apertures Using Current-Mode DFT Approximation on 65 nm CMOS
Authors:
V. Ariyarathna,
S. Kulasekera,
A. Madanayake,
D. Suarez,
R. J. Cintra,
F. M. Bayer,
L. Belostotski
Abstract:
A current-mode CMOS design is proposed for realizing receive mode multi-beams in the analog domain using a novel DFT approximation. High-bandwidth CMOS RF transistors are employed in low-voltage current mirrors to achieve bandwidths exceeding 4 GHz with good beam fidelity. Current mirrors realize the coefficients of the considered DFT approximation, which take simple values in $\{0, \pm1, \pm2\}$…
▽ More
A current-mode CMOS design is proposed for realizing receive mode multi-beams in the analog domain using a novel DFT approximation. High-bandwidth CMOS RF transistors are employed in low-voltage current mirrors to achieve bandwidths exceeding 4 GHz with good beam fidelity. Current mirrors realize the coefficients of the considered DFT approximation, which take simple values in $\{0, \pm1, \pm2\}$ only. This allows high bandwidths realizations using simple circuitry without needing phase-shifters or delays. The proposed design is used as a method to efficiently achieve spatial discrete Fourier transform operation across a ULA to obtain multiple simultaneous RF beams. An example using 1.2 V current-mode approximate DFT on 65 nm CMOS, with BSIM4 models from the RF kit, show potential operation up to 4 GHz with eight independent aperture beams.
△ Less
Submitted 23 May, 2015;
originally announced May 2015.
-
A Discrete Tchebichef Transform Approximation for Image and Video Coding
Authors:
P. A. M. Oliveira,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Madanayake
Abstract:
In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based ha…
▽ More
In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based hardware realization shows 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.
△ Less
Submitted 28 January, 2015;
originally announced February 2015.
-
Improved 8-point Approximate DCT for Image and Video Compression Requiring Only 14 Additions
Authors:
U. S. Potluri,
A. Madanayake,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Edirisuriya
Abstract:
Video processing systems such as HEVC requiring low energy consumption needed for the multimedia market has lead to extensive development in fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer sup…
▽ More
Video processing systems such as HEVC requiring low energy consumption needed for the multimedia market has lead to extensive development in fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.
△ Less
Submitted 13 January, 2015;
originally announced January 2015.
-
A Multiplierless Pruned DCT-like Transformation for Image and Video Compression that Requires 10 Additions Only
Authors:
V. A. Coutinho,
R. J. Cintra,
F. M. Bayer,
S. Kulasekera,
A. Madanayake
Abstract:
A multiplierless pruned approximate 8-point discrete cosine transform (DCT) requiring only 10 additions is introduced. The proposed algorithm was assessed in image and video compression, showing competitive performance with state-of-the-art methods. Digital implementation in 45 nm CMOS technology up to place-and-route level indicates clock speed of 288 MHz at a 1.1 V supply. The 8x8 block rate is…
▽ More
A multiplierless pruned approximate 8-point discrete cosine transform (DCT) requiring only 10 additions is introduced. The proposed algorithm was assessed in image and video compression, showing competitive performance with state-of-the-art methods. Digital implementation in 45 nm CMOS technology up to place-and-route level indicates clock speed of 288 MHz at a 1.1 V supply. The 8x8 block rate is 36 MHz.The DCT approximation was embedded into HEVC reference software; resulting video frames, at up to 327 Hz for 8-bit RGB HEVC, presented negligible image degradation.
△ Less
Submitted 11 December, 2016; v1 submitted 24 February, 2014;
originally announced February 2014.