-
SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems
Authors:
Leo F Isikdogan,
Bhavin V Nayak,
Chyuan-Tyng Wu,
Joao Peralta Moreira,
Sushma Rao,
Gilad Michael
Abstract:
We propose a system composed of fixed-topology neural networks having partially frozen weights, named SemifreddoNets. SemifreddoNets work as fully-pipelined hardware blocks that are optimized to have an efficient hardware implementation. Those blocks freeze a certain portion of the parameters at every layer and replace the corresponding multipliers with fixed scalers. Fixing the weights reduces the silicon area, logic delay, and memory requirements, leading to significant savings in cost and power consumption. Unlike traditional layer-wise freezing approaches, SemifreddoNets make a profitable trade-off between cost and flexibility by having some of the weights configurable at different scales and levels of abstraction in the model. Although fixing the topology and some of the weights somewhat limits the flexibility, we argue that the efficiency benefits of this strategy outweigh the advantages of a fully configurable model for many use cases. Furthermore, our system uses repeatable blocks, so it has the flexibility to adjust model complexity without requiring any hardware change. The hardware implementation of SemifreddoNets provides up to an order of magnitude reduction in silicon area and power consumption as compared to their equivalent implementation on a general-purpose accelerator.
Submitted 11 June, 2020;
originally announced June 2020.
-
A Machine Learning Imaging Core using Separable FIR-IIR Filters
Authors:
Masayoshi Asama,
Leo F. Isikdogan,
Sushma Rao,
Bhavin V. Nayak,
Gilad Michael
Abstract:
We propose fixed-function neural network hardware that is designed to perform pixel-to-pixel image transformations in a highly efficient way. We use a fully trainable, fixed-topology neural network to build a model that can perform a wide variety of image processing tasks. Our model uses compressed skip lines and hybrid FIR-IIR blocks to reduce the latency and hardware footprint. Our proposed Machine Learning Imaging Core, dubbed MagIC, uses a silicon area of ~3 mm^2 (in TSMC 16 nm), which is orders of magnitude smaller than a comparable pixel-wise dense prediction model. MagIC requires no DDR bandwidth, no SRAM, and practically no external memory. Each MagIC core consumes 56 mW (215 mW max power) at 500 MHz and achieves an energy-efficient throughput of 23 TOPS/W/mm^2. MagIC can be used as a multi-purpose image processing block in an imaging pipeline, approximating compute-heavy image processing applications, such as image deblurring, denoising, and colorization, within the power and silicon area limits of mobile devices.
Submitted 2 January, 2020;
originally announced January 2020.
-
VisionISP: Repurposing the Image Signal Processor for Computer Vision Applications
Authors:
Chyuan-Tyng Wu,
Leo F. Isikdogan,
Sushma Rao,
Bhavin Nayak,
Timo Gerasimow,
Aleksandar Sutic,
Liron Ain-kedem,
Gilad Michael
Abstract:
Traditional image signal processors (ISPs) are primarily designed and optimized to improve the image quality perceived by humans. However, optimal perceptual image quality does not always translate into optimal performance for computer vision applications. We propose a set of methods, which we collectively call VisionISP, to repurpose the ISP for machine consumption. VisionISP significantly reduces data transmission needs by reducing the bit-depth and resolution while preserving the relevant information. The blocks in VisionISP are simple, content-aware, and trainable. Experimental results show that VisionISP boosts the performance of a subsequent computer vision system trained to detect objects in an autonomous driving setting. The results demonstrate the potential and the practicality of VisionISP for computer vision applications.
Submitted 13 November, 2019;
originally announced November 2019.
-
Eye Contact Correction using Deep Neural Networks
Authors:
Leo F. Isikdogan,
Timo Gerasimow,
Gilad Michael
Abstract:
In a typical video conferencing setup, it is hard to maintain eye contact during a call since it requires looking into the camera rather than the display. We propose an eye contact correction model that restores the eye contact regardless of the relative position of the camera and display. Unlike previous solutions, our model redirects the gaze from an arbitrary direction to the center without requiring a redirection angle or camera/display/user geometry as inputs. We use a deep convolutional neural network that inputs a monocular image and produces a vector field and a brightness map to correct the gaze. We train this model in a bi-directional way on a large set of synthetically generated photorealistic images with perfect labels. The learned model is a robust eye contact corrector which also predicts the input gaze implicitly at no additional cost. Our system is primarily designed to improve the quality of video conferencing experience. Therefore, we use a set of control mechanisms to prevent creepy results and to ensure a smooth and natural video conferencing experience. The entire eye contact correction system runs end-to-end in real-time on a commodity CPU and does not require any dedicated hardware, making our solution feasible for a variety of devices.
Submitted 26 December, 2019; v1 submitted 12 June, 2019;
originally announced June 2019.
-
Automatic Channel Network Extraction from Remotely Sensed Images by Singularity Analysis
Authors:
F. Isikdogan,
A. C. Bovik,
P. Passalacqua
Abstract:
Quantitative analysis of channel networks plays an important role in river studies. To provide a quantitative representation of channel networks, we propose a new method that extracts channels from remotely sensed images and estimates their widths. Our fully automated method is based on a recently proposed Multiscale Singularity Index that responds strongly to curvilinear structures but weakly to edges. The algorithm produces a channel map, using a single image where water and non-water pixels have contrast, such as a Landsat near-infrared band image or a water index defined on multiple bands. The proposed method provides a robust alternative to the procedures that are used in remote sensing of fluvial geomorphology and makes classification and analysis of channel networks easier. The source code of the algorithm is available at: http://live.ece.utexas.edu/research/cne/.
Submitted 29 June, 2015;
originally announced June 2015.