-
The Gaussian-Multinoulli Restricted Boltzmann Machine: A Potts Model Extension of the GRBM
Authors:
Nikhil Kapasi,
William Whitehead,
Luke Theogarajan
Abstract:
Many real-world tasks, from associative memory to symbolic reasoning, demand discrete, structured representations that standard continuous latent models struggle to express naturally. We introduce the Gaussian-Multinoulli Restricted Boltzmann Machine (GM-RBM), a generative energy-based model that extends the Gaussian-Bernoulli RBM (GB-RBM) by replacing binary hidden units with $q$-state Potts variables. This modification enables a combinatorially richer latent space and supports learning over multivalued, interpretable latent concepts. We formally derive GM-RBM's energy function, learning dynamics, and conditional distributions, showing that it preserves tractable inference and training through contrastive divergence. Empirically, we demonstrate that GM-RBMs model complex multimodal distributions more effectively than binary RBMs, outperforming them on tasks involving analogical recall and structured memory. Our results highlight GM-RBMs as a scalable framework for discrete latent inference with enhanced expressiveness and interoperability.
Submitted 16 May, 2025;
originally announced May 2025.
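The conditional distributions mentioned in the GM-RBM abstract can be illustrated with a minimal sketch. The listing does not give the paper's energy function, so this assumes a common GB-RBM-style form, E(v, h) = sum_i (v_i - b_i)^2 / (2 sigma^2) - sum_{j,k} c_jk h_jk - sum_{i,j,k} v_i W_ijk h_jk / sigma^2, with each hidden unit h_j one-hot over q states. Under that assumption, p(h_j = k | v) is a softmax over the q states and p(v_i | h) is Gaussian. The names `W`, `b`, `c`, `sigma` and the function names are hypothetical, not taken from the paper.

```python
import math
import random


def potts_hidden_probs(v, W, c, sigma=1.0):
    """Return p(h_j = k | v) for every hidden unit j as a softmax over its q states.

    v: list of visible values (length n_visible)
    W: W[i][j][k] couples visible i to state k of hidden unit j (assumed layout)
    c: c[j][k] is the bias of state k of hidden unit j
    """
    probs = []
    for j in range(len(c)):
        logits = [
            c[j][k] + sum(v[i] * W[i][j][k] for i in range(len(v))) / sigma**2
            for k in range(len(c[j]))
        ]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - top) for x in logits]
        z = sum(exps)
        probs.append([e / z for e in exps])
    return probs


def sample_hidden(v, W, c, sigma=1.0, rng=random):
    """Draw one state index in range(q) for each Potts hidden unit."""
    return [
        rng.choices(range(len(p)), weights=p)[0]
        for p in potts_hidden_probs(v, W, c, sigma)
    ]


def sample_visible(h, W, b, sigma=1.0, rng=random):
    """Draw Gaussian visibles given hidden states h[j] in range(q)."""
    return [
        rng.gauss(b[i] + sum(W[i][j][k] for j, k in enumerate(h)), sigma)
        for i in range(len(b))
    ]
```

With q = 2 and one state's weights and bias fixed at zero, the softmax reduces to the logistic sigmoid of a binary hidden unit, which is one way to see the model as an extension of the GB-RBM. Alternating `sample_hidden` and `sample_visible` gives the block Gibbs step that contrastive divergence relies on.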
-
A CMOS Probabilistic Computing Chip With In-situ hardware Aware Learning
Authors:
Jinesh Jhonsa,
William Whitehead,
David McCarthy,
Shuvro Chowdhury,
Kerem Camsari,
Luke Theogarajan
Abstract:
This paper demonstrates a probabilistic-bit, physics-inspired solver with 440 spins configured in a Chimera graph, occupying an area of 0.44 mm^2. Area efficiency is maximized through a current-mode implementation of the neuron update circuit, standard cell design for analog blocks pitch-matched to digital blocks, and a shared power supply for both digital and analog components. Process-variation-related mismatches introduced by this approach are effectively mitigated using a hardware-aware contrastive divergence algorithm during training. We validate the chip's ability to perform probabilistic computing tasks such as modeling logic gates and full adders, as well as optimization tasks such as MaxCut, demonstrating its potential for AI and machine learning applications.
Submitted 30 April, 2025; v1 submitted 18 April, 2025;
originally announced April 2025.
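As a software illustration of the MaxCut task named in the abstract above, the following sketch anneals a small network of p-bits using the standard update m_i = sgn(tanh(beta * I_i) - r), with r uniform in [-1, 1) and I_i = sum_j J_ij m_j. It is an assumption-laden toy: the dense coupling matrix, the linear annealing schedule, and the function names are hypothetical and do not model the chip's Chimera topology, its current-mode circuit, or its hardware-aware training.

```python
import math
import random


def pbit_sweep(m, J, beta, rng):
    """Update every spin m[i] in {-1, +1} once, sequentially and in place."""
    n = len(m)
    for i in range(n):
        total_input = sum(J[i][j] * m[j] for j in range(n))
        r = rng.uniform(-1.0, 1.0)
        m[i] = 1 if math.tanh(beta * total_input) > r else -1
    return m


def cut_value(m, edges):
    """Total weight of edges whose endpoints lie on opposite sides."""
    return sum(w for i, j, w in edges if m[i] != m[j])


def maxcut_anneal(n, edges, sweeps=200, beta_max=3.0, seed=0):
    """Anneal p-bits with antiferromagnetic couplings; return (best cut, spins)."""
    rng = random.Random(seed)
    J = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        J[i][j] -= w  # negative coupling favours opposite spins across an edge
        J[j][i] -= w
    m = [rng.choice((-1, 1)) for _ in range(n)]
    best, best_m = cut_value(m, edges), m[:]
    for s in range(sweeps):
        beta = beta_max * (s + 1) / sweeps  # linear schedule (an assumption)
        pbit_sweep(m, J, beta, rng)
        c = cut_value(m, edges)
        if c > best:
            best, best_m = c, m[:]
    return best, best_m
```

For example, `maxcut_anneal(4, [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)])` searches for a cut of the 4-cycle, whose optimum separates alternating vertices.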
-
Pushing the Boundary of Quantum Advantage in Hard Combinatorial Optimization with Probabilistic Computers
Authors:
Shuvro Chowdhury,
Navid Anjum Aadit,
Andrea Grimaldi,
Eleonora Raimondo,
Atharva Raut,
P. Aaron Lott,
Johan H. Mentink,
Marek M. Rams,
Federico Ricci-Tersenghi,
Massimo Chiappini,
Luke S. Theogarajan,
Tathagata Srimani,
Giovanni Finocchio,
Masoud Mohseni,
Kerem Y. Camsari
Abstract:
Recent demonstrations on specialized benchmarks have reignited excitement for quantum computers, yet whether they can deliver an advantage for practical real-world problems remains an open question. Here, we show that probabilistic computers (p-computers), when co-designed with hardware to implement powerful Monte Carlo algorithms, can surpass state-of-the-art quantum annealers [King et al., Nature (2023), https://www.nature.com/articles/s41586-023-05867-2] in solving certain hard optimization problems. We focus on two key algorithms: discrete-time simulated quantum annealing (DT-SQA) and adaptive parallel tempering (APT), both applied to 3D spin glasses. For DT-SQA, we find that increasing the number of replicas improves residual energy scaling, while parallelizing fewer replicas across independent runs also achieves comparable scaling. Both strategies align with the theoretical expectations from extreme value theory. In addition, APT outperforms DT-SQA when supported by non-local isoenergetic cluster moves. Finite-size scaling analysis suggests a universal behavior that explains the superior performance of APT over both DT-SQA and quantum annealing. We show that these algorithms are readily implementable in modern hardware thanks to mature semiconductor technology. Unlike software simulations, replicas can be monolithically housed on a single chip and a large number of spins can be updated in parallel and asynchronously, similar to a quantum annealer. We project that custom Field Programmable Gate Arrays (FPGAs) or specialized chips leveraging massive parallelism can further accelerate these algorithms by orders of magnitude, while drastically improving energy efficiency. Our results raise the bar for a practical quantum advantage in optimization and present p-computers as scalable, energy-efficient hardware for real-world optimization problems.
Submitted 7 April, 2025; v1 submitted 13 March, 2025;
originally announced March 2025.
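The replica-exchange step underlying parallel tempering can be sketched as follows. This is plain parallel tempering with the textbook Metropolis swap rule, accepting an exchange with probability min(1, exp((beta_a - beta_b)(E_a - E_b))). It does not include the adaptive temperature selection or the isoenergetic cluster moves described in the abstract above, and the function names are hypothetical.

```python
import math
import random


def swap_probability(beta_a, beta_b, energy_a, energy_b):
    """Metropolis acceptance probability for exchanging two replicas' states."""
    delta = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if delta >= 0 else math.exp(delta)


def exchange_pass(betas, energies, states, rng):
    """Attempt swaps between neighbouring temperatures, modifying lists in place."""
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
```

A swap is always accepted when the colder replica (larger beta) currently holds the higher-energy configuration, which is how low-energy states migrate toward low temperatures while hot replicas keep exploring.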
-
A full-stack view of probabilistic computing with p-bits: devices, architectures and algorithms
Authors:
Shuvro Chowdhury,
Andrea Grimaldi,
Navid Anjum Aadit,
Shaila Niazi,
Masoud Mohseni,
Shun Kanai,
Hideo Ohno,
Shunsuke Fukami,
Luke Theogarajan,
Giovanni Finocchio,
Supriyo Datta,
Kerem Y. Camsari
Abstract:
The transistor celebrated its 75${}^\text{th}$ birthday in 2022. The scaling of the transistor defined by Moore's Law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.
Submitted 16 March, 2023; v1 submitted 13 February, 2023;
originally announced February 2023.
-
CMOS-compatible Ising and Potts Annealing Using Single Photon Avalanche Diodes
Authors:
William Whitehead,
Zachary Nelson,
Kerem Y. Camsari,
Luke Theogarajan
Abstract:
Massively parallel annealing processors may offer superior performance for a wide range of sampling and optimization problems. A key component dictating the size of these processors is the neuron update circuit, ideally implemented using special stochastic nanodevices. We leverage photon statistics using single photon avalanche diodes (SPADs) and temporal filtering to generate stochastic states. This method is a powerful alternative offering unique features not currently seen in annealing processors: the ability to continuously control the computational temperature and the seamless extension to the Potts model, an $n$-state generalization of the two-state Ising model. SPADs also offer a considerable practical advantage since they are readily manufacturable in current standard CMOS processes. As a first step towards realizing a CMOS SPAD-based annealer, we have designed Ising and Potts models driven by an array of discrete SPADs and show they accurately sample from their theoretical distributions.
Submitted 22 November, 2022;
originally announced November 2022.
-
Massively Parallel Probabilistic Computing with Sparse Ising Machines
Authors:
Navid Anjum Aadit,
Andrea Grimaldi,
Mario Carpentieri,
Luke Theogarajan,
John M. Martinis,
Giovanni Finocchio,
Kerem Y. Camsari
Abstract:
Inspired by the developments in quantum computing, building domain-specific classical hardware to solve computationally hard problems has received increasing attention. Here, by introducing systematic sparsification techniques, we demonstrate a massively parallel architecture: the sparse Ising Machine (sIM). Exploiting sparsity, sIM achieves ideal parallelism: its key figure of merit, flips per second, scales linearly with the number of probabilistic bits (p-bits) in the system. This makes sIM up to 6 orders of magnitude faster than a CPU implementing standard Gibbs sampling. Compared to optimized implementations in TPUs and GPUs, sIM delivers 5-18x speedup in sampling. In benchmark problems such as integer factorization, sIM can reliably factor semiprimes up to 32 bits, far larger than previous attempts from D-Wave and other probabilistic solvers. Strikingly, sIM beats competition-winning SAT solvers (by 4-700x in runtime to reach 95% accuracy) in solving 3SAT problems. Even when sampling is made inexact using faster clocks, sIM can find the correct ground state with further speedup. The problem encoding and sparsification techniques we introduce can be applied to other Ising Machines (classical and quantum) and the architecture we present can be used for scaling the demonstrated 5,000-10,000 p-bits to 1,000,000 or more through analog CMOS or nanodevices.
Submitted 21 February, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
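One common way to turn sparsity into parallelism for Gibbs sampling is graph coloring: p-bits that share no coupling receive the same color and can be updated simultaneously without violating the sequential-update requirement. The sketch below uses a simple greedy coloring to form such blocks. The abstract above does not spell out the sIM's mechanism, so treat this as an illustrative assumption. The function names are hypothetical.

```python
def greedy_coloring(adj):
    """Color a graph greedily, visiting high-degree nodes first.

    adj: dict mapping each node to the set of its neighbours.
    Returns a dict mapping each node to a color index (0, 1, 2, ...).
    """
    color = {}
    for node in sorted(adj, key=lambda u: -len(adj[u])):
        used = {color[v] for v in adj[node] if v in color}
        c = 0
        while c in used:  # pick the smallest color not used by a neighbour
            c += 1
        color[node] = c
    return color


def color_blocks(color):
    """Group nodes by color; each block can be updated in parallel."""
    blocks = {}
    for node, c in color.items():
        blocks.setdefault(c, []).append(node)
    return [blocks[c] for c in sorted(blocks)]
```

A sparse graph needs few colors, so a full sweep takes a number of parallel steps set by the color count instead of the number of p-bits, consistent with the flips-per-second scaling the abstract reports.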