-
Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing
Authors:
J. Luquin,
C. Mackin,
S. Ambrogio,
A. Chen,
F. Baldi,
G. Miralles,
M. J. Rasch,
J. Büchel,
M. Lalwani,
W. Ponghiran,
P. Solomon,
H. Tsai,
G. W. Burr,
P. Narayanan
Abstract:
Analog In-Memory Compute (AIMC) can improve the energy efficiency of Deep Learning by orders of magnitude. Yet analog-domain device and circuit non-idealities -- within the analog "Tiles" performing Matrix-Vector Multiply (MVM) operations -- can degrade neural-network task accuracy. We quantify the impact of low-level distortions and noise, and develop a mathematical model for Multiply-ACcumulate (MAC) operations mapped to analog tiles. Instantaneous-current IR-drop (the most significant circuit non-ideality) and ADC quantization effects are fully captured by this model, which can predict MVM tile-outputs both rapidly and accurately, as compared to much slower rigorous circuit simulations. A statistical model of PCM read noise at nanosecond timescales is derived from -- and matched against -- experimental measurements. We integrate these (statistical) device and (deterministic) circuit effects into a PyTorch-based framework to assess the accuracy impact on the BERT and ALBERT Transformer networks. We show that hardware-aware fine-tuning using simple Gaussian noise provides resilience against ADC quantization and PCM read noise effects, but is less effective against IR-drop. This is because IR-drop -- although deterministic -- is non-linear, changes significantly during the time-integration window, and is ultimately dependent on all the excitations being introduced in parallel into the analog tile. The apparent inability of simple Gaussian noise applied during training to properly prepare a DNN for IR-drop during inference implies that more complex training approaches -- incorporating advances such as the Tile-circuit model introduced here -- will be critical for resilient deployment of large neural networks onto AIMC hardware.
Submitted 5 May, 2025;
originally announced June 2025.
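The abstract above refers to simple Gaussian noise and ADC quantization acting on MVM tile outputs. A minimal sketch of those two effects follows, assuming additive output noise and a uniform signed ADC. The function names, the `full_scale` range, and the bit width are illustrative assumptions and are not taken from the paper. IR-drop and the PCM read-noise statistics are not modeled here.

```python
import random

def quantize(value, full_scale, bits):
    """Clip to [-full_scale, full_scale], then snap to a uniform signed ADC grid."""
    levels = 2 ** (bits - 1) - 1            # e.g. 127 positive levels for 8 bits
    step = full_scale / levels              # size of one ADC step
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / step) * step

def noisy_mvm(weights, x, noise_std=0.0, full_scale=4.0, bits=8, rng=None):
    """Matrix-vector multiply with additive Gaussian output noise and ADC quantization.

    All parameters are hypothetical illustration values, not the paper's model.
    """
    rng = rng or random.Random(0)           # fixed seed so runs are repeatable
    outputs = []
    for row in weights:
        ideal = sum(w * xi for w, xi in zip(row, x))   # ideal MAC result
        noisy = ideal + rng.gauss(0.0, noise_std)      # simple Gaussian noise
        outputs.append(quantize(noisy, full_scale, bits))
    return outputs

weights = [[1.0, 0.0], [0.5, 0.5]]
x = [1.0, 2.0]
clean = noisy_mvm(weights, x, noise_std=0.0)   # quantization error only
noisy = noisy_mvm(weights, x, noise_std=0.1)   # quantization plus Gaussian noise
```

With `noise_std=0.0` each output differs from the ideal MAC value by at most half an ADC step. Raising `noise_std` gives the kind of perturbation that Gaussian-noise fine-tuning injects during training.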
-
VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation
Authors:
Prashanth Vijayaraghavan,
Luyao Shi,
Stefano Ambrogio,
Charles Mackin,
Apoorva Nitsure,
David Beymer,
Ehsan Degan
Abstract:
With the unprecedented advancements in Large Language Models (LLMs), their application domains have expanded to include code generation tasks across various programming languages. While significant progress has been made in enhancing LLMs for popular programming languages, there exists a notable gap in comprehensive evaluation frameworks tailored for Hardware Description Languages (HDLs), particularly VHDL. This paper addresses this gap by introducing a comprehensive evaluation framework designed specifically for assessing LLM performance on the VHDL code generation task. We construct a dataset for this evaluation by translating a collection of Verilog evaluation problems to VHDL and aggregating publicly available VHDL problems, resulting in a total of 202 problems. To assess the functional correctness of the generated VHDL code, we utilize a curated set of self-verifying testbenches specifically designed for the aggregated VHDL problem set. We conduct an initial evaluation of different LLMs and their variants, including zero-shot code generation, in-context learning (ICL), and parameter-efficient fine-tuning (PEFT) methods. Our findings underscore the considerable challenges faced by existing LLMs in VHDL code generation, revealing significant scope for improvement. This study emphasizes the necessity of supervised fine-tuning of code generation models specifically for VHDL, offering potential benefits to VHDL designers seeking efficient code generation solutions.
Submitted 5 June, 2024;
originally announced June 2024.
-
Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference
Authors:
Manuel Le Gallo,
Corey Lammie,
Julian Buechel,
Fabio Carta,
Omobayode Fagbohungbe,
Charles Mackin,
Hsinyu Tsai,
Vijay Narayanan,
Abu Sebastian,
Kaoutar El Maghraoui,
Malte J. Rasch
Abstract:
Analog In-Memory Computing (AIMC) is a promising approach to reduce the latency and energy consumption of Deep Neural Network (DNN) inference and training. However, the noisy and non-linear device characteristics, and the non-ideal peripheral circuitry in AIMC chips, require adapting DNNs to be deployed on such hardware to achieve equivalent accuracy to digital computing. In this tutorial, we provide a deep dive into how such adaptations can be achieved and evaluated using the recently released IBM Analog Hardware Acceleration Kit (AIHWKit), freely available at https://github.com/IBM/aihwkit. The AIHWKit is a Python library that simulates inference and training of DNNs using AIMC. We present an in-depth description of the AIHWKit design, functionality, and best practices to properly perform inference and training. We also present an overview of the Analog AI Cloud Composer, a platform that provides the benefits of using the AIHWKit simulation in a fully managed cloud setting along with physical AIMC hardware access, freely available at https://aihw-composer.draco.res.ibm.com. Finally, we show examples of how users can expand and customize AIHWKit for their own needs. This tutorial is accompanied by comprehensive Jupyter Notebook code examples that can be run using AIHWKit, which can be downloaded from https://github.com/IBM/aihwkit/tree/master/notebooks/tutorial.
Submitted 26 January, 2024; v1 submitted 18 July, 2023;
originally announced July 2023.
-
Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators
Authors:
Malte J. Rasch,
Charles Mackin,
Manuel Le Gallo,
An Chen,
Andrea Fasoli,
Frederic Odermatt,
Ning Li,
S. R. Nandakumar,
Pritish Narayanan,
Hsinyu Tsai,
Geoffrey W. Burr,
Abu Sebastian,
Vijay Narayanan
Abstract:
Analog in-memory computing (AIMC) -- a promising approach for energy-efficient acceleration of deep learning workloads -- computes matrix-vector multiplications (MVMs) but only approximately, due to nonidealities that often are non-deterministic or nonlinear. This can adversely impact the achievable deep neural network (DNN) inference accuracy as compared to a conventional floating point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar-model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, not the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
Submitted 16 February, 2023;
originally announced February 2023.
-
Reachability-Based Safety and Goal Satisfaction of Unmanned Aerial Platoons on Air Highways
Authors:
Mo Chen,
Qie Hu,
Jaime Fisac,
Kene Akametalu,
Casey Mackin,
Claire Tomlin
Abstract:
Recently, there has been immense interest in using unmanned aerial vehicles (UAVs) for civilian operations. As a result, unmanned aerial systems traffic management is needed to ensure the safety and goal satisfaction of potentially thousands of UAVs flying simultaneously. Currently, the analysis of large multi-agent systems cannot tractably provide these guarantees if the agents' set of maneuvers is unrestricted. In this paper, platoons of UAVs flying on air highways are proposed to impose an airspace structure that allows for tractable analysis. For the air highway placement problem, the fast marching method is used to produce a sequence of air highways that minimizes the cost of flying from an origin to any destination. The placement of air highways can be updated in real-time to accommodate sudden airspace changes. Within platoons traveling on air highways, each vehicle is modeled as a hybrid system. Using Hamilton-Jacobi reachability, safety and goal satisfaction are guaranteed for all mode transitions. For a single altitude range, the proposed approach guarantees safety for one safety breach per vehicle; in the unlikely event of multiple safety breaches, safety can be guaranteed over multiple altitude ranges. We demonstrate the platooning concept through simulations of three representative scenarios.
Submitted 31 January, 2017; v1 submitted 25 February, 2016;
originally announced February 2016.