Search | arXiv e-print repository

Mapping Image Transformations Onto Pixel Processor Arrays

Abstract: Pixel Processor Arrays (PPA) present a new vision sensor/processor architecture consisting of a SIMD array of processor elements, each capable of light capture, storage, processing and local communication. Such a device allows visual data to be efficiently stored and manipulated directly upon the focal plane, but also demands the invention of new approaches and algorithms, suitable for the massive… ▽ More Pixel Processor Arrays (PPA) present a new vision sensor/processor architecture consisting of a SIMD array of processor elements, each capable of light capture, storage, processing and local communication. Such a device allows visual data to be efficiently stored and manipulated directly upon the focal plane, but also demands the invention of new approaches and algorithms, suitable for the massively-parallel fine-grain processor arrays. In this paper we demonstrate how various image transformations, including shearing, rotation and scaling, can be performed directly upon a PPA. The implementation details are presented using the SCAMP-5 vision chip, that contains a 256x256 pixel-parallel array. Our approaches for performing the image transformations efficiently exploit the parallel computation in a cellular processor array, minimizing the number of SIMD instructions required. These fundamental image transformations are vital building blocks for many visual tasks. This paper aims to serve as a reference for future PPA research while demonstrating the flexibility of PPA architectures. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2402.12130 [pdf]

Factor Machine: Mixed-signal Architecture for Fine-Grained Graph-Based Computing

Authors: Piotr Dudek

Abstract: This paper proposes the design and implementation strategy of a novel computing architecture, the Factor Machine. The work is a step towards a general-purpose parallel system operating in a non-sequential manner, exploiting processing/memory co-integration and replacing the traditional Turing/von Neumann model of a computer system with a framework based on "factorised computation". This architectu… ▽ More This paper proposes the design and implementation strategy of a novel computing architecture, the Factor Machine. The work is a step towards a general-purpose parallel system operating in a non-sequential manner, exploiting processing/memory co-integration and replacing the traditional Turing/von Neumann model of a computer system with a framework based on "factorised computation". This architecture is inspired by neural information processing principles and aims to progress the development of brain-like machine intelligence systems, through providing a computing substrate designed from the ground up to enable efficient implementations of algorithms based on relational networks. The paper provides a rationale for such machine, in the context of the history of computing, and more recent developments in neuromorphic hardware, reviews its general features, and proposes a mixed-signal hardware implementation, based on using analogue circuits to carry out computation and localised and sparse communication between the compute units. △ Less

Submitted 19 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: An essay in contribution to the Festschrift for Professor Steve Furber, Manchester, 12 January 2024

ACM Class: C.1.4

arXiv:2304.05440 [pdf, other]

PixelRNN: In-pixel Recurrent Neural Networks for End-to-end-optimized Perception with Neural Sensors

Authors: Haley M. So, Laurie Bose, Piotr Dudek, Gordon Wetzstein

Abstract: Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing. This is challenging for perception systems operating on edge devices, because communication is power inefficient and induces latency. Fueled by innovations in stacked image sensor fabrication, emerging sensor-processors… ▽ More Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing. This is challenging for perception systems operating on edge devices, because communication is power inefficient and induces latency. Fueled by innovations in stacked image sensor fabrication, emerging sensor-processors offer programmability and minimal processing capabilities directly on the sensor. We exploit these capabilities by developing an efficient recurrent neural network architecture, PixelRNN, that encodes spatio-temporal features on the sensor using purely binary operations. PixelRNN reduces the amount of data to be transmitted off the sensor by a factor of 64x compared to conventional systems while offering competitive accuracy for hand gesture recognition and lip reading tasks. We experimentally validate PixelRNN using a prototype implementation on the SCAMP-5 sensor-processor platform. △ Less

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2202.00836

On-Sensor Binarized Fully Convolutional Neural Network with A Pixel Processor Array

Authors: Yanan Liu, Laurie Bose, Yao Lu, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: This work presents a method to implement fully convolutional neural networks (FCNs) on Pixel Processor Array (PPA) sensors, and demonstrates coarse segmentation and object localisation tasks. We design and train binarized FCN for both binary weights and activations using batchnorm, group convolution, and learnable threshold for binarization, producing networks small enough to be embedded on the fo… ▽ More This work presents a method to implement fully convolutional neural networks (FCNs) on Pixel Processor Array (PPA) sensors, and demonstrates coarse segmentation and object localisation tasks. We design and train binarized FCN for both binary weights and activations using batchnorm, group convolution, and learnable threshold for binarization, producing networks small enough to be embedded on the focal plane of the PPA, with limited local memory resources, and using parallel elementary add/subtract, shifting, and bit operations only. We demonstrate the first implementation of an FCN on a PPA device, performing three convolution layers entirely in the pixel-level processors. We use this architecture to demonstrate inference generating heat maps for object segmentation and localisation at over 280 FPS using the SCAMP-5 PPA vision chip. △ Less

Submitted 11 June, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

Comments: lack of experiments for method validation

arXiv:2112.05221 [pdf, other]

MantissaCam: Learning Snapshot High-dynamic-range Imaging with Perceptually-based In-pixel Irradiance Encoding

Authors: Haley M. So, Julien N. P. Martel, Piotr Dudek, Gordon Wetzstein

Abstract: The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encod… ▽ More The ability to image high-dynamic-range (HDR) scenes is crucial in many computer vision applications. The dynamic range of conventional sensors, however, is fundamentally limited by their well capacity, resulting in saturation of bright scene parts. To overcome this limitation, emerging sensors offer in-pixel processing capabilities to encode the incident irradiance. Among the most promising encoding schemes is modulo wrapping, which results in a computational photography problem where the HDR scene is computed by an irradiance unwrapping algorithm from the wrapped low-dynamic-range (LDR) sensor image. Here, we design a neural network--based algorithm that outperforms previous irradiance unwrapping methods and we design a perceptually inspired "mantissa" encoding scheme that more efficiently wraps an HDR scene into an LDR sensor. Combined with our reconstruction framework, MantissaCam achieves state-of-the-art results among modulo-type snapshot HDR imaging approaches. We demonstrate the efficacy of our method in simulation and show benefits of our algorithm on modulo images captured with a prototype implemented with a programmable sensor. △ Less

Submitted 20 April, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

arXiv:2106.07561 [pdf, other]

Direct Servo Control from In-Sensor CNN Inference with A Pixel Processor Array

Authors: Yanan Liu, Jianing Chen, Laurie Bose, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: This work demonstrates direct visual sensory-motor control using high-speed CNN inference via a SCAMP-5 Pixel Processor Array (PPA). We demonstrate how PPAs are able to efficiently bridge the gap between perception and action. A binary Convolutional Neural Network (CNN) is used for a classic rock, paper, scissors classification problem at over 8000 FPS. Control instructions are directly sent to a… ▽ More This work demonstrates direct visual sensory-motor control using high-speed CNN inference via a SCAMP-5 Pixel Processor Array (PPA). We demonstrate how PPAs are able to efficiently bridge the gap between perception and action. A binary Convolutional Neural Network (CNN) is used for a classic rock, paper, scissors classification problem at over 8000 FPS. Control instructions are directly sent to a servo motor from the PPA according to the CNN's classification result without any other intermediate hardware. △ Less

Submitted 26 May, 2021; originally announced June 2021.

arXiv:2105.10479 [pdf, other]

Bringing A Robot Simulator to the SCAMP Vision System

Authors: Yanan Liu, Jianing Chen, Laurie Bose, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: This work develops and demonstrates the integration of the SCAMP-5d vision system into the CoppeliaSim robot simulator, creating a semi-simulated environment. By configuring a camera in the simulator and setting up communication with the SCAMP python host through remote API, sensor images from the simulator can be transferred to the SCAMP vision sensor, where on-sensor image processing such as CNN… ▽ More This work develops and demonstrates the integration of the SCAMP-5d vision system into the CoppeliaSim robot simulator, creating a semi-simulated environment. By configuring a camera in the simulator and setting up communication with the SCAMP python host through remote API, sensor images from the simulator can be transferred to the SCAMP vision sensor, where on-sensor image processing such as CNN inference can be performed. SCAMP output is then fed back into CoppeliaSim. This proposed platform integration enables rapid prototyping validations of SCAMP algorithms for robotic systems. We demonstrate a car localisation and tracking task using this proposed semi-simulated platform, with a CNN inference on SCAMP to command the motion of a robot. We made this platform available online. △ Less

Submitted 21 May, 2021; originally announced May 2021.

arXiv:2009.12796 [pdf, other]

Agile Reactive Navigation for A Non-Holonomic Mobile Robot Using A Pixel Processor Array

Authors: Yanan Liu, Laurie Bose, Colin Greatwood, Jianing Chen, Rui Fan, Thomas Richardson, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: This paper presents an agile reactive navigation strategy for driving a non-holonomic ground vehicle around a preset course of gates in a cluttered environment using a low-cost processor array sensor. This enables machine vision tasks to be performed directly upon the sensor's image plane, rather than using a separate general-purpose computer. We demonstrate a small ground vehicle running through… ▽ More This paper presents an agile reactive navigation strategy for driving a non-holonomic ground vehicle around a preset course of gates in a cluttered environment using a low-cost processor array sensor. This enables machine vision tasks to be performed directly upon the sensor's image plane, rather than using a separate general-purpose computer. We demonstrate a small ground vehicle running through or avoiding multiple gates at high speed using minimal computational resources. To achieve this, target tracking algorithms are developed for the Pixel Processing Array and captured images are then processed directly on the vision sensor acquiring target information for controlling the ground vehicle. The algorithm can run at up to 2000 fps outdoors and 200fps at indoor illumination levels. Conducting image processing at the sensor level avoids the bottleneck of image transfer encountered in conventional sensors. The real-time performance of on-board image processing and robustness is validated through experiments. Experimental results demonstrate that the algorithm's ability to enable a ground vehicle to navigate at an average speed of 2.20 m/s for passing through multiple gates and 3.88 m/s for a 'slalom' task in an environment featuring significant visual clutter. △ Less

Submitted 27 September, 2020; originally announced September 2020.

Comments: 7 pages

arXiv:2004.12525 [pdf, other]

Fully Embedding Fast Convolutional Networks on Pixel Processor Arrays

Authors: Laurie Bose, Jianing Chen, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: We present a novel method of CNN inference for pixel processor array (PPA) vision sensors, designed to take advantage of their massive parallelism and analog compute capabilities. PPA sensors consist of an array of processing elements (PEs), with each PE capable of light capture, data storage and computation, allowing various computer vision processing to be executed directly upon the sensor devic… ▽ More We present a novel method of CNN inference for pixel processor array (PPA) vision sensors, designed to take advantage of their massive parallelism and analog compute capabilities. PPA sensors consist of an array of processing elements (PEs), with each PE capable of light capture, data storage and computation, allowing various computer vision processing to be executed directly upon the sensor device. The key idea behind our approach is storing network weights "in-pixel" within the PEs of the PPA sensor itself to allow various computations, such as multiple different image convolutions, to be carried out in parallel. Our approach can perform convolutional layers, max pooling, ReLu, and a final fully connected layer entirely upon the PPA sensor, while leaving no untapped computational resources. This is in contrast to previous works that only use a sensor-level processing to sequentially compute image convolutions, and must transfer data to an external digital processor to complete the computation. We demonstrate our approach on the SCAMP-5 vision system, performing inference of a MNIST digit classification network at over 3000 frames per second and over 93% classification accuracy. This is the first work demonstrating CNN inference conducted entirely upon the processor array of a PPA vision sensor device, requiring no external processing. △ Less

Submitted 26 April, 2020; originally announced April 2020.

arXiv:1909.05647 [pdf, other]

A Camera That CNNs: Towards Embedded Neural Networks on Pixel Processor Arrays

Authors: Laurie Bose, Jianing Chen, Stephen J. Carey, Piotr Dudek, Walterio Mayol-Cuevas

Abstract: We present a convolutional neural network implementation for pixel processor array (PPA) sensors. PPA hardware consists of a fine-grained array of general-purpose processing elements, each capable of light capture, data storage, program execution, and communication with neighboring elements. This allows images to be stored and manipulated directly at the point of light capture, rather than having… ▽ More We present a convolutional neural network implementation for pixel processor array (PPA) sensors. PPA hardware consists of a fine-grained array of general-purpose processing elements, each capable of light capture, data storage, program execution, and communication with neighboring elements. This allows images to be stored and manipulated directly at the point of light capture, rather than having to transfer images to external processing hardware. Our CNN approach divides this array up into 4x4 blocks of processing elements, essentially trading-off image resolution for increased local memory capacity per 4x4 "pixel". We implement parallel operations for image addition, subtraction and bit-shifting images in this 4x4 block format. Using these components we formulate how to perform ternary weight convolutions upon these images, compactly store results of such convolutions, perform max-pooling, and transfer the resulting sub-sampled data to an attached micro-controller. We train ternary weight filter CNNs for digit recognition and a simple tracking task, and demonstrate inference of these networks upon the SCAMP5 PPA system. This work represents a first step towards embedding neural network processing capability directly onto the focal plane of a sensor. △ Less

Submitted 13 September, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

Comments: Accepted into ICCV 2019

Showing 1–10 of 10 results for author: Dudek, P