-
Towards the Information-Theoretic Limit of Programmable Photonics
Authors:
Ryan Hamerly,
Jasvith Raj Basani,
Alexander Sludds,
Sri Krishna Vadlamani,
Dirk Englund
Abstract:
The scalability of many programmable photonic circuits is limited by the $2π$ tuning range needed for the constituent phase shifters. To address this problem, we introduce the concept of a phase-efficient circuit architecture, where the average phase shift is $\ll 2π$. We derive a universal information-theoretic limit to the phase-shift efficiency of universal multiport interferometers, and propose a "3-MZI" architecture that approaches this limit to within a factor of $2\times$, approximately a $10\times$ reduction in average phase shift over the prior art, where the average phase shift scales inversely with system size as $O(1/\sqrt{N})$. For non-unitary circuits, we show that the 3-MZI saturates the theoretical bound for Gaussian-distributed target matrices. Using this architecture, we show optical neural network training with all phase shifters constrained to $\lesssim 0.2$ radians without loss of accuracy.
Submitted 18 August, 2024;
originally announced August 2024.
-
Quantum-secure multiparty deep learning
Authors:
Kfir Sulimany,
Sri Krishna Vadlamani,
Ryan Hamerly,
Prahlad Iyengar,
Dirk Englund
Abstract:
Secure multiparty computation enables the joint evaluation of multivariate functions across distributed users while ensuring the privacy of their local inputs. This field has become increasingly urgent due to the exploding demand for computationally intensive deep learning inference. These computations are typically offloaded to cloud computing servers, leading to vulnerabilities that can compromise the security of the clients' data. To solve this problem, we introduce a linear algebra engine that leverages the quantum nature of light for information-theoretically secure multiparty computation using only conventional telecommunication components. We apply this linear algebra engine to deep learning and derive rigorous upper bounds on the information leakage of both the deep neural network weights and the client's data via the Holevo and the Cramér-Rao bounds, respectively. Applied to the MNIST classification task, we obtain test accuracies exceeding $96\%$ while leaking less than $0.1$ bits per weight symbol and $0.01$ bits per data symbol. This weight leakage is an order of magnitude below the minimum bit precision required for accurate deep learning using state-of-the-art quantization techniques. Our work lays the foundation for practical quantum-secure computation and unlocks secure cloud deep learning as a field.
Submitted 13 September, 2024; v1 submitted 10 August, 2024;
originally announced August 2024.
-
Hypermultiplexed Integrated-Photonics-based Tensor Optical Processor
Authors:
Shaoyuan Ou,
Kaiwen Xue,
Lian Zhou,
Chun-ho Lee,
Alexander Sludds,
Ryan Hamerly,
Ke Zhang,
Hanke Feng,
Reshma Kopparapu,
Eric Zhong,
Cheng Wang,
Dirk Englund,
Mengjie Yu,
Zaijun Chen
Abstract:
The escalating data volume and complexity resulting from the rapid expansion of artificial intelligence (AI), internet of things (IoT) and 5G/6G mobile networks are creating an urgent need for energy-efficient, scalable computing hardware. Here we demonstrate a hypermultiplexed integrated-photonics-based tensor optical processor (HITOP) that can perform trillions of operations per second (TOPS) at an energy efficiency of 40 TOPS/W. Space-time-wavelength three-dimensional (3D) optical parallelism enables O($N^{2}$) operations per clock cycle using O($N$) modulator devices. The system is built with wafer-fabricated III/V micron-scale lasers and high-speed thin-film Lithium-Niobate electro-optics for encoding at tens of femtojoules per symbol. Lasing threshold incorporates analog inline rectifier (ReLU) nonlinearity for low-latency activation. The system scalability is verified with machine learning models of 405,000 parameters. A combination of high clock rates, energy-efficient processing and programmability unlocks the potential of light for large-scale AI accelerators in applications ranging from training of large AI models to real-time decision making in edge deployment.
Submitted 27 October, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Transferable Learning on Analog Hardware
Authors:
Sri Krishna Vadlamani,
Dirk Englund,
Ryan Hamerly
Abstract:
While analog neural network (NN) accelerators promise massive energy and time savings, an important challenge is to make them robust to static fabrication error. Present-day training methods for programmable photonic interferometer circuits, a leading analog NN platform, do not produce networks that perform well in the presence of static hardware errors. Moreover, existing hardware error correction techniques either require individual re-training of every analog NN (which is impractical in an edge setting with millions of devices), place stringent demands on component quality, or introduce hardware overhead. We solve all three problems by introducing one-time error-aware training techniques that produce robust NNs that match the performance of ideal hardware and can be exactly transferred to arbitrary highly faulty photonic NNs with hardware errors up to 5x larger than present-day fabrication tolerances.
Submitted 12 October, 2022;
originally announced October 2022.
-
A Self-Similar Sine-Cosine Fractal Architecture for Multiport Interferometers
Authors:
Jasvith Raj Basani,
Sri Krishna Vadlamani,
Saumil Bandyopadhyay,
Dirk R. Englund,
Ryan Hamerly
Abstract:
Multiport interferometers based on integrated beamsplitter meshes have recently captured interest as a platform for many emerging technologies. In this paper, we present a novel architecture for multiport interferometers based on the Sine-Cosine fractal decomposition of a unitary matrix. Our architecture is unique in that it is self-similar, enabling the construction of modular multi-chiplet devices. Due to this modularity, our design enjoys improved resilience to hardware imperfections as compared to conventional multiport interferometers. Additionally, the structure of our circuit enables systematic truncation, which is key in reducing the hardware footprint of the chip as well as compute time in training optical neural networks, while maintaining full connectivity. Numerical simulations show that truncation of these meshes gives robust performance even under large fabrication errors. This design is a step forward in the construction of large-scale programmable photonics, removing a major hurdle in scaling up to practical machine learning and quantum computing applications.
Submitted 7 September, 2022;
originally announced September 2022.
-
Single chip photonic deep neural network with accelerated training
Authors:
Saumil Bandyopadhyay,
Alexander Sludds,
Stefan Krastanov,
Ryan Hamerly,
Nicholas Harris,
Darius Bunandar,
Matthew Streshinsky,
Michael Hochberg,
Dirk Englund
Abstract:
As deep neural networks (DNNs) revolutionize machine learning, energy consumption and throughput are emerging as fundamental limitations of CMOS electronics. This has motivated a search for new hardware architectures optimized for artificial intelligence, such as electronic systolic arrays, memristor crossbar arrays, and optical accelerators. Optical systems can perform linear matrix operations at exceptionally high rate and efficiency, motivating recent demonstrations of low latency linear algebra and optical energy consumption below a photon per multiply-accumulate operation. However, demonstrating systems that co-integrate both linear and nonlinear processing units in a single chip remains a central challenge. Here we introduce such a system in a scalable photonic integrated circuit (PIC), enabled by several key advances: (i) high-bandwidth and low-power programmable nonlinear optical function units (NOFUs); (ii) coherent matrix multiplication units (CMXUs); and (iii) in situ training with optical acceleration. We experimentally demonstrate this fully-integrated coherent optical neural network (FICONN) architecture for a 3-layer DNN comprising 12 NOFUs and three CMXUs operating in the telecom C-band. Using in situ training on a vowel classification task, the FICONN achieves 92.7% accuracy on a test set, which is identical to the accuracy obtained on a digital computer with the same number of weights. This work lends experimental evidence to theoretical proposals for in situ training, unlocking orders of magnitude improvements in the throughput of training data. Moreover, the FICONN opens the path to inference at nanosecond latency and femtojoule per operation energy efficiency.
Submitted 2 August, 2022;
originally announced August 2022.
-
RF-Photonic Deep Learning Processor with Shannon-Limited Data Movement
Authors:
Ronald Davis III,
Zaijun Chen,
Ryan Hamerly,
Dirk Englund
Abstract:
Edholm's Law predicts exponential growth in data rate and spectrum bandwidth for communications and is forecast to remain true for the upcoming deployment of 6G. Compounding this issue is the exponentially increasing demand for deep neural network (DNN) compute, including DNNs for signal processing. However, the slowing of Moore's Law due to the limitations of transistor-based electronics means that completely new paradigms for computing will be required to meet these increasing demands for advanced communications. Optical neural networks (ONNs) are promising DNN accelerators with ultra-low latency and energy consumption. Yet state-of-the-art ONNs struggle with scalability and with combining linear operations with in-line nonlinear operations. Here we introduce our multiplicative analog frequency transform ONN (MAFT-ONN) that encodes the data in the frequency domain, achieves matrix-vector products in a single shot using photoelectric multiplication, and uses a single electro-optic modulator for the nonlinear activation of all neurons in each layer. We experimentally demonstrate the first hardware accelerator that computes fully-analog deep learning on raw RF signals, performing single-shot modulation classification with 85% accuracy, where a 'majority vote' multi-measurement scheme can boost the accuracy to 95% within 5 consecutive measurements. In addition, we demonstrate frequency-domain finite impulse response (FIR) linear-time-invariant (LTI) operations, enabling a powerful combination of traditional and AI signal processing. We also demonstrate the scalability of our architecture by computing nearly 4 million fully-analog multiplies-and-accumulates for MNIST digit classification. Our latency estimation model shows that due to the Shannon capacity-limited analog data movement, MAFT-ONN is hundreds of times faster than traditional RF receivers operating at their theoretical peak performance.
Submitted 6 June, 2024; v1 submitted 8 July, 2022;
originally announced July 2022.
-
Deep Learning with Coherent VCSEL Neural Networks
Authors:
Zaijun Chen,
Alexander Sludds,
Ronald Davis,
Ian Christen,
Liane Bernstein,
Tobias Heuser,
Niels Heermeier,
James A. Lott,
Stephan Reitzenstein,
Ryan Hamerly,
Dirk Englund
Abstract:
Deep neural networks (DNNs) are reshaping the field of information processing. With their exponential growth challenging existing electronic hardware, optical neural networks (ONNs) are emerging to process DNN tasks in the optical domain with high clock rates, parallelism and low-loss data transmission. However, to explore the potential of ONNs, it is necessary to investigate the full-system performance incorporating the major DNN elements, including matrix algebra and nonlinear activation. Existing challenges to ONNs are high energy consumption due to low electro-optic (EO) conversion efficiency, low compute density due to large device footprint and channel crosstalk, and long latency due to the lack of inline nonlinearity. Here we experimentally demonstrate an ONN system that simultaneously overcomes all these challenges. We exploit neuron encoding with volume-manufactured micron-scale vertical-cavity surface-emitting laser (VCSEL) transmitter arrays that exhibit high EO conversion (<5 attojoule/symbol with $V_π$=4 mV), high operation bandwidth (up to 25 GS/s), and compact footprint (<0.01 mm$^2$ per device). Photoelectric multiplication allows low-energy matrix operations at the shot-noise quantum limit. Homodyne detection-based nonlinearity enables nonlinear activation with instantaneous response. The full-system energy efficiency and compute density reach 7 femtojoules per operation (fJ/OP) and 25 TeraOP/(mm$^2\cdot$ s), both representing a >100-fold improvement over state-of-the-art digital computers, with room for several more orders of magnitude of future improvement. Beyond neural network inference, its feature of rapid weight updating is crucial for training deep learning models. Our technique opens an avenue to large-scale optoelectronic processors to accelerate machine learning tasks from data centers to decentralized edge devices.
Submitted 12 July, 2022;
originally announced July 2022.
-
Netcast: Low-Power Edge Computing with WDM-defined Optical Neural Networks
Authors:
Ryan Hamerly,
Alexander Sludds,
Saumil Bandyopadhyay,
Zaijun Chen,
Zhizhen Zhong,
Liane Bernstein,
Dirk Englund
Abstract:
This paper analyzes the performance and energy efficiency of Netcast, a recently proposed optical neural-network architecture designed for edge computing. Netcast performs deep neural network inference by dividing the computational task into two steps, which are split between the server and (edge) client: (1) the server employs a wavelength-multiplexed modulator array to encode the network's weights onto an optical signal in an analog time-frequency basis, and (2) the client obtains the desired matrix-vector product through modulation and time-integrated detection. The simultaneous use of wavelength multiplexing, broadband modulation, and integration detection allows large neural networks to be run at the client by effectively pushing the energy and memory requirements back to the server. The performance and energy efficiency are fundamentally limited by crosstalk and detector noise, respectively. We derive analytic expressions for these limits and perform numerical simulations to verify these bounds.
Submitted 4 July, 2022;
originally announced July 2022.
-
Single-Shot Optical Neural Network
Authors:
Liane Bernstein,
Alexander Sludds,
Christopher Panuski,
Sivan Trajtenberg-Mills,
Ryan Hamerly,
Dirk Englund
Abstract:
As deep neural networks (DNNs) grow to solve increasingly complex problems, they are becoming limited by the latency and power consumption of existing digital processors. For improved speed and energy efficiency, specialized analog optical and electronic hardware has been proposed, but with limited scalability (input vector length $K$ of hundreds of elements). Here, we present a scalable, single-shot-per-layer analog optical processor that uses free-space optics to reconfigurably distribute an input vector and integrated optoelectronics for static, updatable weighting and the nonlinearity -- with $K \approx 1,000$ and beyond. We experimentally test classification accuracy of the MNIST handwritten digit dataset, achieving 94.7% (ground truth 96.3%) without data preprocessing or retraining on the hardware. We also determine the fundamental upper bound on throughput ($\sim$0.9 exaMAC/s), set by the maximum optical bandwidth before significant increase in error. Our combination of wide spectral and spatial bandwidths in a CMOS-compatible system enables highly efficient computing for next-generation DNNs.
Submitted 22 June, 2022; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Delocalized Photonic Deep Learning on the Internet's Edge
Authors:
Alexander Sludds,
Saumil Bandyopadhyay,
Zaijun Chen,
Zhizhen Zhong,
Jared Cochrane,
Liane Bernstein,
Darius Bunandar,
P. Ben Dixon,
Scott A. Hamilton,
Matthew Streshinsky,
Ari Novack,
Tom Baehr-Jones,
Michael Hochberg,
Manya Ghobadi,
Ryan Hamerly,
Dirk Englund
Abstract:
Advances in deep neural networks (DNNs) are transforming science and technology. However, the increasing computational demands of the most powerful DNNs limit deployment on low-power devices, such as smartphones and sensors -- and this trend is accelerated by the simultaneous move towards Internet-of-Things (IoT) devices. Numerous efforts are underway to lower power consumption, but a fundamental bottleneck remains due to energy consumption in matrix algebra, even for analog approaches including neuromorphic, analog memory and photonic meshes. Here we introduce and demonstrate a new approach that sharply reduces energy required for matrix algebra by doing away with weight memory access on edge devices, enabling orders of magnitude energy and latency reduction. At the core of our approach is a new concept that decentralizes the DNN for delocalized, optically accelerated matrix algebra on edge devices. Using a silicon photonic smart transceiver, we demonstrate experimentally that this scheme, termed Netcast, dramatically reduces energy consumption. We demonstrate operation in a photon-starved environment with 40 aJ/multiply of optical energy for 98.8% accurate image recognition and <1 photon/multiply using single photon detectors. Furthermore, we show realistic deployment of our system, classifying images with 3 THz of bandwidth over 86 km of deployed optical fiber in a Boston-area fiber network. Our approach enables computing on a new generation of edge devices with speeds comparable to modern digital electronics and power consumption that is orders of magnitude lower.
Submitted 1 April, 2022; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Asymptotically Fault-Tolerant Programmable Photonics
Authors:
Ryan Hamerly,
Saumil Bandyopadhyay,
Dirk Englund
Abstract:
Component errors limit the scaling of programmable coherent photonic circuits. These errors arise because the standard tunable photonic coupler -- the Mach-Zehnder interferometer (MZI) -- cannot be perfectly programmed to the cross state. Here, we introduce two modified circuit architectures that overcome this limitation: (1) a 3-splitter MZI mesh for generic errors, and (2) a broadband MZI+Crossing design for correlated errors. Because these designs allow for perfect realization of the cross state, the matrix fidelity no longer decreases with mesh size, allowing scaling to arbitrarily large meshes. The proposed architectures support progressive self-configuration, are more compact than previous MZI-doubling schemes, and do not require additional phase shifters. This eliminates a major obstacle to the development of very-large-scale linear photonic circuits.
Submitted 30 November, 2022; v1 submitted 11 September, 2021;
originally announced September 2021.
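The cross-state limitation described in this abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes a common textbook model in which each splitter is a directional coupler with angle π/4 plus a small error (`alpha`, `beta` are hypothetical error values) and a single internal phase `theta` sits on the top arm. Under those assumptions the residual bar-port amplitude cannot drop below |sin(alpha + beta)|, so a perfect cross state is only reachable when the two errors cancel.

```python
import cmath
import math


def splitter(angle):
    """2x2 directional-coupler matrix; angle = pi/4 is an ideal 50:50 splitter."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 1j * s], [1j * s, c]]


def matmul2(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi(theta, alpha, beta):
    """Assumed MZI model: splitter (error alpha), phase theta on top arm, splitter (error beta)."""
    phase = [[cmath.exp(1j * theta), 0], [0, 1]]
    first = splitter(math.pi / 4 + alpha)
    second = splitter(math.pi / 4 + beta)
    return matmul2(second, matmul2(phase, first))


def min_bar_amplitude(alpha, beta, steps=2001):
    """Smallest |T[0][0]| found by sweeping theta over [0, 2*pi]."""
    return min(abs(mzi(2 * math.pi * k / (steps - 1), alpha, beta)[0][0])
               for k in range(steps))


# Hypothetical splitter errors, chosen only for illustration.
ideal = min_bar_amplitude(0.0, 0.0)      # perfect splitters: cross state reachable
faulty = min_bar_amplitude(0.05, 0.03)   # imperfect splitters: leakage remains
print(ideal, faulty, math.sin(0.05 + 0.03))
```

With ideal splitters the minimum bar amplitude is zero to floating-point precision, while with the assumed errors it matches sin(alpha + beta), which is the kind of residual leakage the modified architectures in the abstract are designed to remove.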
-
Stability of Self-Configuring Large Multiport Interferometers
Authors:
Ryan Hamerly,
Saumil Bandyopadhyay,
Dirk Englund
Abstract:
Realistic multiport interferometers (beamsplitter meshes) are sensitive to component imperfections, and this sensitivity increases with size. Self-configuration techniques can be employed to correct these imperfections, but not all techniques are equal. This paper highlights the importance of algorithmic stability in self-configuration. Naive approaches based on sequentially setting matrix elements are unstable and perform poorly for large meshes, while techniques based on power ratios perform well in all cases, even in the presence of large errors. Based on this insight, we propose a self-configuration scheme for triangular meshes that requires only external detectors and works without prior knowledge of the component imperfections. This scheme extends to the rectangular mesh by adding a single array of detectors along the diagonal.
Submitted 30 November, 2022; v1 submitted 6 June, 2021;
originally announced June 2021.
-
Accurate Self-Configuration of Rectangular Multiport Interferometers
Authors:
Ryan Hamerly,
Saumil Bandyopadhyay,
Dirk Englund
Abstract:
Multiport interferometers based on integrated beamsplitter meshes are widely used in photonic technologies. While the rectangular mesh is favored for its compactness and uniformity, its geometry resists conventional self-configuration approaches, which are essential to programming large meshes in the presence of fabrication error. Here, we present a new configuration algorithm, related to the $2\times 2$ block decomposition of a unitary matrix, that overcomes this limitation. Our proposed algorithm is robust to errors, requires no prior knowledge of the process variations, and relies only on external sources and detectors. We show that self-configuration using this technique reduces the effect of fabrication errors by the same quadratic factor observed in triangular meshes. This relaxes a significant limit to the size of multiport interferometers, removing a major roadblock to the scaling of optical quantum and machine-learning hardware.
Submitted 30 November, 2022; v1 submitted 6 June, 2021;
originally announced June 2021.
-
Hardware error correction for programmable photonics
Authors:
Saumil Bandyopadhyay,
Ryan Hamerly,
Dirk Englund
Abstract:
Programmable photonic circuits of reconfigurable interferometers can be used to implement arbitrary operations on optical modes, facilitating a flexible platform for accelerating tasks in quantum simulation, signal processing, and artificial intelligence. A major obstacle to scaling up these systems is static fabrication error, where small component errors within each device accrue to produce significant errors within the circuit computation. Mitigating this error usually requires numerical optimization dependent on real-time feedback from the circuit, which can greatly limit the scalability of the hardware. Here we present a deterministic approach to correcting circuit errors by locally correcting hardware errors within individual optical gates. We apply our approach to simulations of large scale optical neural networks and infinite impulse response filters implemented in programmable photonics, finding that they remain resilient to component error well beyond modern day process tolerances. Our results highlight a new avenue for scaling up programmable photonics to hundreds of modes within current day fabrication processes.
Submitted 8 March, 2021;
originally announced March 2021.
-
Freely scalable and reconfigurable optical hardware for deep learning
Authors:
Liane Bernstein,
Alexander Sludds,
Ryan Hamerly,
Vivienne Sze,
Joel Emer,
Dirk Englund
Abstract:
As deep neural network (DNN) models grow ever-larger, they can achieve higher accuracy and solve more complex problems. This trend has been enabled by an increase in available compute power; however, efforts to continue to scale electronic processors are impeded by the costs of communication, thermal management, power delivery and clocking. To improve scalability, we propose a digital optical neural network (DONN) with intralayer optical interconnects and reconfigurable input values. The near path-length-independence of optical energy consumption enables information locality between a transmitter and arbitrarily arranged receivers, which allows greater flexibility in architecture design to circumvent scaling limitations. In a proof-of-concept experiment, we demonstrate optical multicast in the classification of 500 MNIST images with a 3-layer, fully-connected network. We also analyze the energy consumption of the DONN and find that optical data transfer is beneficial over electronics when the spacing of computational units is on the order of >10 micrometers.
Submitted 24 June, 2020;
originally announced June 2020.
-
Large-Scale Optical Neural Networks based on Photoelectric Multiplication
Authors:
Ryan Hamerly,
Liane Bernstein,
Alexander Sludds,
Marin Soljačić,
Dirk Englund
Abstract:
Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large ($N \gtrsim 10^6$) networks and can be operated at high (GHz) speeds and very low (sub-aJ) energies per multiply-and-accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit- and image-classification reveal a "standard quantum limit" for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully-connected and convolutional networks. We also present a scheme for back-propagation and training that can be performed in the same hardware. This architecture will enable a new class of ultra-low-energy processors for deep learning.
Submitted 18 May, 2019; v1 submitted 12 November, 2018;
originally announced December 2018.
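As a rough illustration of how coherent detection can multiply two optically encoded values, the sketch below models balanced homodyne detection, one common way to realize photoelectric multiplication. It is an idealized assumption rather than the paper's implementation: amplitudes are real, noise-free, and arrive as matched pulse trains, and the function names are hypothetical. Interfering an input `x` and a weight `w` on a 50:50 beamsplitter and subtracting the two photocurrents gives 2·x·w, and integrating over the pulse train accumulates a dot product.

```python
import math


def balanced_homodyne_product(x, w):
    """Difference photocurrent after an ideal 50:50 beamsplitter for real amplitudes x, w."""
    plus = (x + w) / math.sqrt(2)    # field at the first detector
    minus = (x - w) / math.sqrt(2)   # field at the second detector
    return plus ** 2 - minus ** 2    # algebraically equal to 2 * x * w


def optical_dot(xs, ws):
    """Time-integrate the difference photocurrent over one pulse train."""
    charge = sum(balanced_homodyne_product(x, w) for x, w in zip(xs, ws))
    return charge / 2  # rescale so the result equals the dot product


# Hypothetical input and weight vectors.
result = optical_dot([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
print(result)
```

The subtraction cancels the x² and w² intensity terms, which is why the signed product survives even though each detector only measures optical power.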
-
Experimental investigation of performance differences between Coherent Ising Machines and a quantum annealer
Authors:
Ryan Hamerly,
Takahiro Inagaki,
Peter L. McMahon,
Davide Venturelli,
Alireza Marandi,
Tatsuhiro Onodera,
Edwin Ng,
Carsten Langrock,
Kensuke Inaba,
Toshimori Honjo,
Koji Enbutsu,
Takeshi Umeki,
Ryoichi Kasahara,
Shoko Utsunomiya,
Satoshi Kako,
Ken-ichi Kawarabayashi,
Robert L. Byer,
Martin M. Fejer,
Hideo Mabuchi,
Dirk Englund,
Eleanor Rieffel,
Hiroki Takesue,
Yoshihisa Yamamoto
Abstract:
Physical annealing systems provide heuristic approaches to solving NP-hard Ising optimization problems. Here, we study the performance of two types of annealing machines--a commercially available quantum annealer built by D-Wave Systems, and measurement-feedback coherent Ising machines (CIMs) based on optical parametric oscillator networks--on two classes of problems, the Sherrington-Kirkpatrick (SK) model and MAX-CUT. The D-Wave quantum annealer outperforms the CIMs on MAX-CUT on regular graphs of degree 3. On denser problems, however, we observe an exponential penalty for the quantum annealer ($\exp(-α_\textrm{DW} N^2)$) relative to CIMs ($\exp(-α_\textrm{CIM} N)$) for fixed anneal times, on both the SK model and on 50%-edge-density MAX-CUT, where the coefficients $α_\textrm{CIM}$ and $α_\textrm{DW}$ are problem-class-dependent. On instances with over $50$ vertices, a several-orders-of-magnitude time-to-solution difference exists between CIMs and the D-Wave annealer. An optimal-annealing-time analysis is also consistent with a significant projected performance difference. The difference in performance between the sparsely connected D-Wave machine and the measurement-feedback facilitated all-to-all connectivity of the CIMs provides strong experimental support for efforts to increase the connectivity of quantum annealers.
Submitted 24 May, 2019; v1 submitted 14 May, 2018;
originally announced May 2018.
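The link between MAX-CUT and Ising optimization mentioned in this abstract can be made concrete on a toy instance. The sketch below assumes unit antiferromagnetic couplings on every edge, so that H = Σ s_i·s_j over edges; each cut edge contributes −1 and each uncut edge +1, giving cut = (|E| − H)/2. The 5-cycle graph and the brute-force search are illustrative assumptions only and say nothing about how the annealers in the paper operate.

```python
from itertools import product


def ising_energy(edges, spins):
    """H = sum over edges of s_i * s_j (unit antiferromagnetic couplings, assumed)."""
    return sum(spins[i] * spins[j] for i, j in edges)


def cut_size(edges, spins):
    """Number of edges whose endpoints carry opposite spins."""
    return sum(1 for i, j in edges if spins[i] != spins[j])


def brute_force_ground_state(edges, n):
    """Exhaustively minimize the Ising energy; feasible only for very small n."""
    return min(product((-1, 1), repeat=n), key=lambda s: ising_energy(edges, s))


# Hypothetical instance: a 5-cycle, whose maximum cut is 4 because the cycle is odd.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
best = brute_force_ground_state(edges, 5)
print(best, ising_energy(edges, best), cut_size(edges, best))
```

Exhaustive search scales as 2^N, which is why heuristic solvers such as the machines compared in the abstract are of interest for larger instances.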