Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems
Authors:
Alexis Asseman,
Nicolas Antoine,
Ahmet S. Ozcan
Abstract:
Reinforcement learning, augmented by the representational power of deep neural networks, has shown promising results on high-dimensional problems such as game playing and robotic control. However, the sequential nature of these problems poses a fundamental challenge for computational efficiency. Recently, alternative approaches such as evolutionary strategies and deep neuroevolution have demonstrated competitive results with faster training times on distributed CPU cores. Here, we report record training times (running at about 1 million frames per second) for Atari 2600 games using deep neuroevolution implemented on distributed FPGAs. The acceleration was enabled by a combined hardware implementation of the game console, image pre-processing, and the neural network in an optimized pipeline, multiplied by system-level parallelism. These results are the first application demonstration on the IBM Neural Computer, a custom-designed system that consists of 432 Xilinx FPGAs interconnected in a 3D mesh network topology. In addition to high performance, experiments also showed improved accuracy for all games compared to the CPU implementation of the same algorithm.
Submitted 9 May, 2020;
originally announced May 2020.
Overview of the IBM Neural Computer Architecture
Authors:
Pritish Narayanan,
Charles E. Cox,
Alexis Asseman,
Nicolas Antoine,
Harald Huels,
Winfried W. Wilcke,
Ahmet S. Ozcan
Abstract:
The IBM Neural Computer (INC) is a highly flexible, reconfigurable parallel processing system intended as a research and development platform for emerging machine intelligence algorithms and computational neuroscience. It consists of hundreds of programmable nodes, primarily based on Xilinx's Field Programmable Gate Array (FPGA) technology. The nodes are interconnected in a scalable 3D mesh topology. We give an overview of INC, emphasizing unique features such as flexibility and scalability, both in the types of computations performed and in the available modes of communication. These features enable new machine intelligence approaches and learning strategies that are not well suited to the matrix manipulation/SIMD libraries that GPUs are optimized for. This paper describes the architecture of the machine; applications will be described in detail elsewhere.
Submitted 24 March, 2020;
originally announced March 2020.