SamurAI: A Versatile IoT Node With Event-Driven Wake-Up and Embedded ML Acceleration
Authors:
Ivan Miro-Panades,
Benoit Tain,
Jean-Frederic Christmann,
David Coriat,
Romain Lemaire,
Clement Jany,
Baudouin Martineau,
Fabrice Chaix,
Guillaume Waltener,
Emmanuel Pluchart,
Jean-Philippe Noel,
Adam Makosiej,
Maxime Montoya,
Simone Bacles-Min,
David Briand,
Jean-Marc Philippe,
Yvain Thonnart,
Alexandre Valentian,
Frederic Heitzmann,
Fabien Clermidy
Abstract:
Increased capabilities such as recognition and self-adaptability are now required from IoT applications. While IoT node power consumption is a major concern for these applications, cloud-based processing is becoming unsustainable due to continuous sensor or image data transmission over the wireless network. Thus, optimized ML capabilities and data transfers should be integrated in the IoT node. Moreover, IoT applications are torn between sporadic data-logging and energy-hungry data processing (e.g. image classification), so the versatility of the node is key in addressing this wide diversity of energy and processing needs. This paper presents SamurAI, a versatile IoT node bridging this gap in processing and in energy by leveraging two on-chip sub-systems: a low-power, clock-less, event-driven Always-Responsive (AR) part and an energy-efficient On-Demand (OD) part. AR contains a 1.7MOPS event-driven, asynchronous Wake-up Controller (WuC) with a 207ns wake-up time optimized for sporadic computing, while OD combines a deep-sleep RISC-V CPU and 1.3TOPS/W Machine Learning (ML) for more complex tasks up to 36GOPS. This architecture partitioning achieves best-in-class versatility metrics such as the peak-performance-to-idle-power ratio. In a classification application scenario, it demonstrates system power gains of up to 3.5x compared to cloud-based processing, and thus extended battery lifetime.
Submitted 11 April, 2023;
originally announced April 2023.
Storing non-uniformly distributed messages in networks of neural cliques
Authors:
Bartosz Boguslawski,
Vincent Gripon,
Fabrice Seguin,
Frédéric Heitzmann
Abstract:
Associative memories are data structures that allow retrieval of stored messages from part of their content. They thus behave similarly to the human brain, which is capable, for instance, of retrieving the end of a song given its beginning. Among the different families of associative memories, sparse ones are known to provide the best efficiency (ratio of the number of bits stored to the number of bits used). Nevertheless, it is well known that non-uniformity of the stored messages can lead to a dramatic decrease in performance. We introduce several strategies to allow efficient storage of non-uniform messages in recently introduced sparse associative memories. We analyse and discuss the methods introduced. We also present a practical application example.
Submitted 24 July, 2013;
originally announced July 2013.