Flexing RISC-V Instruction Subset Processors (RISPs) to Extreme Edge
Authors:
Alireza Raisiardali,
Konstantinos Iordanou,
Jedrzej Kufel,
Kowshik Gudimetla,
Kris Myny,
Emre Ozer
Abstract:
This paper presents a methodology for automatically generating processors that support a subset of the RISC-V instruction set for a new class of applications at the extreme edge. The electronics used in extreme edge applications must be power-efficient, but must also provide additional qualities such as low cost, conformability, comfort and sustainability. Flexible electronics, rather than silicon-based electronics, will be capable of meeting these requirements. For this purpose, we propose a methodology to generate RISPs (RISC-V instruction subset processors) customised to extreme edge applications and to implement them as flexible integrated circuits (FlexICs). The methodology is unique in that verification is an integral part of the design. The RISP methodology treats each instruction in the ISA as a discrete, fully functional, pre-verified hardware block. It automatically builds a custom processor by stitching together the hardware blocks of the instructions required by an application or by a set of applications in a specific domain. This approach significantly reduces processor verification effort and time-to-market. We generate RISPs using this methodology for three extreme edge applications and for embedded applications from the Embench benchmark suite, synthesize them as FlexICs, and compare their power, performance and area to the baselines. Our results show that, when synthesized, RISPs generated using this methodology achieve, on average, 30% reductions in power and area compared to a RISC-V processor supporting the full instruction set, and are nearly 30 times more energy efficient than Serv, the world's smallest 32-bit RISC-V processor. In addition, the full physical implementation of RISPs shows up to 21% and 26% less area and power, respectively, than Serv.
Submitted 8 May, 2025; v1 submitted 7 May, 2025;
originally announced May 2025.
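The block-stitching idea in the abstract above (one pre-verified hardware block per instruction, with only the blocks an application needs being assembled) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `BLOCK_LIBRARY` table, the `blk_*` block names, and the simplified one-instruction-per-line listing format are hypothetical and are not taken from the paper's tooling.

```python
# Hypothetical library: RISC-V mnemonic -> name of a pre-verified hardware block.
# The block names are placeholders, not the paper's actual artefacts.
BLOCK_LIBRARY = {
    "add": "blk_add",
    "addi": "blk_addi",
    "sub": "blk_sub",
    "lw": "blk_lw",
    "sw": "blk_sw",
    "beq": "blk_beq",
    "bne": "blk_bne",
    "jal": "blk_jal",
}


def required_instructions(listing):
    """Collect the distinct mnemonics used in a simplified assembly listing.

    Assumes one instruction per line with the mnemonic first; labels
    (lines ending in ':') and '#' comments are skipped.
    """
    mnemonics = set()
    for line in listing.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.endswith(":"):
            continue
        mnemonics.add(line.split()[0].lower())
    return mnemonics


def select_blocks(mnemonics, library=BLOCK_LIBRARY):
    """Return the blocks to stitch together, in a stable (sorted) order."""
    missing = sorted(m for m in mnemonics if m not in library)
    if missing:
        raise KeyError("no pre-verified block for: " + ", ".join(missing))
    return [library[m] for m in sorted(mnemonics)]


program = """
loop:
    addi a0, a0, -1      # decrement counter
    add  a1, a1, a0      # accumulate
    bne  a0, zero, loop
"""
subset = required_instructions(program)
blocks = select_blocks(subset)
```

In this sketch the processor for `program` would need only three of the eight library blocks; an instruction without a library entry raises an error rather than being silently dropped.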
Tiny Classifier Circuits: Evolving Accelerators for Tabular Data
Authors:
Konstantinos Iordanou,
Timothy Atkinson,
Emre Ozer,
Jedrzej Kufel,
John Biggs,
Gavin Brown,
Mikel Lujan
Abstract:
A typical machine learning (ML) development cycle for edge computing is to maximise the performance during model training and then minimise the memory/area footprint of the trained model for deployment on edge devices targeting CPUs, GPUs, microcontrollers, or custom hardware accelerators. This paper proposes a methodology for automatically generating predictor circuits for classification of tabular data with comparable prediction performance to conventional ML techniques while using substantially fewer hardware resources and less power. The proposed methodology uses an evolutionary algorithm to search over the space of logic gates and automatically generates a classifier circuit with maximised training prediction accuracy. Classifier circuits are so tiny (i.e., consisting of no more than 300 logic gates) that they are called "Tiny Classifier" circuits, and they can be implemented efficiently as an ASIC or on an FPGA. We empirically evaluate the automatic Tiny Classifier circuit generation methodology, or "Auto Tiny Classifiers", on a wide range of tabular datasets, and compare it against conventional ML techniques such as Amazon's AutoGluon, Google's TabNet and a neural search over Multi-Layer Perceptrons. Despite Tiny Classifiers being constrained to a few hundred logic gates, we observe no statistically significant difference in prediction performance in comparison to the best-performing ML baseline. When synthesised as a silicon chip, Tiny Classifiers use 8-18x less area and 4-8x less power. When implemented as an ultra-low-cost chip on a flexible substrate (i.e., FlexIC), they occupy 10-75x less area and consume 13-75x less power compared to the most hardware-efficient ML baseline. On an FPGA, Tiny Classifiers consume 3-11x fewer resources.
Submitted 28 September, 2023; v1 submitted 28 February, 2023;
originally announced March 2023.
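The evolutionary search over logic gates mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the encoding or the algorithm variant, so everything here is an assumption: the feed-forward netlist of two-input gates, the four-gate set, the single-gate mutation, and the names `evaluate`, `accuracy` and `evolve` are all hypothetical and should not be read as the authors' method.

```python
import random

# Assumed gate set; the paper's actual gate library is not given in the abstract.
GATES = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "nand": lambda a, b: 1 - (a & b),
}
GATE_NAMES = list(GATES)


def random_gate(index, n_inputs, rng):
    """Gate `index` may read any primary input or the output of an earlier gate."""
    limit = n_inputs + index
    return (rng.choice(GATE_NAMES), rng.randrange(limit), rng.randrange(limit))


def evaluate(circuit, bits):
    """Run a feed-forward netlist on one row of input bits; the last gate is the output."""
    signals = list(bits)
    for name, a, b in circuit:
        signals.append(GATES[name](signals[a], signals[b]))
    return signals[-1]


def accuracy(circuit, rows, labels):
    """Fraction of training rows the circuit classifies correctly."""
    hits = sum(evaluate(circuit, row) == y for row, y in zip(rows, labels))
    return hits / len(rows)


def evolve(rows, labels, n_gates=8, generations=200, offspring=4, seed=0):
    """Simple hill-climbing evolution: mutate one gate, keep children that are no worse."""
    rng = random.Random(seed)
    n_inputs = len(rows[0])
    best = [random_gate(i, n_inputs, rng) for i in range(n_gates)]
    best_acc = accuracy(best, rows, labels)
    for _ in range(generations):
        for _ in range(offspring):
            child = list(best)
            i = rng.randrange(n_gates)
            child[i] = random_gate(i, n_inputs, rng)
            acc = accuracy(child, rows, labels)
            if acc >= best_acc:  # accepting ties lets the search drift across plateaus
                best, best_acc = child, acc
        if best_acc == 1.0:
            break
    return best, best_acc


# Toy binarised dataset: the label is the XOR of the first two input bits.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, _ in rows]
circuit, train_acc = evolve(rows, labels, n_gates=4, generations=50)
```

The fitness here is training accuracy only, matching the abstract's description of maximising training prediction accuracy; real tabular data would first need to be encoded as bits, a step this sketch leaves out.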