You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference to ANN-Level Accuracy
Authors:
Srivatsa P,
Kyle Timothy Ng Chu,
Burin Amornpaisannon,
Yaswanth Tavva,
Venkata Pavan Kumar Miriyala,
Jibin Wu,
Malu Zhang,
Haizhou Li,
Trevor E. Carlson
Abstract:
In the past decade, advances in Artificial Neural Networks (ANNs) have allowed them to perform extremely well for a wide range of tasks. In fact, they have reached human parity when performing image recognition, for example. Unfortunately, the accuracy of these ANNs comes at the expense of a large number of cache and/or memory accesses and compute operations. Spiking Neural Networks (SNNs), a type…
▽ More
In the past decade, advances in Artificial Neural Networks (ANNs) have allowed them to perform extremely well for a wide range of tasks. In fact, they have reached human parity when performing image recognition, for example. Unfortunately, the accuracy of these ANNs comes at the expense of a large number of cache and/or memory accesses and compute operations. Spiking Neural Networks (SNNs), a type of neuromorphic, or brain-inspired network, have recently gained significant interest as power-efficient alternatives to ANNs, because they are sparse, accessing very few weights, and typically only use addition operations instead of the more power-intensive multiply-and-accumulate (MAC) operations. The vast majority of neuromorphic hardware designs support rate-encoded SNNs, where the information is encoded in spike rates. Rate-encoded SNNs could be seen as inefficient as an encoding scheme because it involves the transmission of a large number of spikes. A more efficient encoding scheme, Time-To-First-Spike (TTFS) encoding, encodes information in the relative time of arrival of spikes. While TTFS-encoded SNNs are more efficient than rate-encoded SNNs, they have, up to now, performed poorly in terms of accuracy compared to previous methods. Hence, in this work, we aim to overcome the limitations of TTFS-encoded neuromorphic systems. To accomplish this, we propose: (1) a novel optimization algorithm for TTFS-encoded SNNs converted from ANNs and (2) a novel hardware accelerator for TTFS-encoded SNNs, with a scalable and low-power design. Overall, our work in TTFS encoding and training improves the accuracy of SNNs to achieve state-of-the-art results on MNIST MLPs, while reducing power consumption by 1.46$\times$ over the state-of-the-art neuromorphic hardware.
△ Less
Submitted 8 November, 2020; v1 submitted 3 June, 2020;
originally announced June 2020.
Crossbar-Constrained Technology Mapping for ReRAM based In-Memory Computing
Authors:
Debjyoti Bhattacharjee,
Yaswanth Tavva,
Arvind Easwaran,
Anupam Chattopadhyay
Abstract:
In recent times, Resistive RAMs (ReRAMs) have gained significant prominence due to their unique feature of supporting both non-volatile storage and logic capabilities. ReRAM is also reported to provide extremely low power consumption compared to the standard CMOS storage devices. As a result, researchers have explored the mapping and design of diverse applications, ranging from arithmetic to neuro…
▽ More
In recent times, Resistive RAMs (ReRAMs) have gained significant prominence due to their unique feature of supporting both non-volatile storage and logic capabilities. ReRAM is also reported to provide extremely low power consumption compared to the standard CMOS storage devices. As a result, researchers have explored the mapping and design of diverse applications, ranging from arithmetic to neuromorphic computing structures to ReRAM-based platforms. ReVAMP, a general-purpose ReRAM computing platform, has been proposed recently to leverage the parallelism exhibited in a crossbar structure. However, the technology mapping on ReVAMP remains an open challenge. Though the technology mapping with device/area-constraints have been proposed, crossbar constraints are not considered so far. In this work, we address this problem. Two technology mapping flows are proposed, considering different runtime-efficiency trade-offs. Both the mapping flows take crossbar constraints into account and generate feasible mapping for a variety of crossbar dimensions. Our proposed algorithms are highly scalable and reveal important design hints for ReRAM-based implementations.
△ Less
Submitted 21 September, 2018;
originally announced September 2018.