-
Delay Balancing with Clock-Follow-Data: Optimizing Area Delay Trade-offs for Robust Rapid Single Flux Quantum Circuits
Authors:
Robert S. Aviles,
Phalgun G K,
Peter A. Beerel
Abstract:
This paper proposes an algorithm for synthesis of clock-follow-data designs that provides robustness against timing violations for RSFQ circuits while maintaining high performance and minimizing area costs. Since superconducting logic gates must be clocked, managing data flow is a challenging problem that often requires the insertion of many path-balancing D flip-flops (DFFs) to properly sequence data, leading to a substantial increase in area. To address this challenge, we present an algorithm to insert DFFs into clock-follow-data RSFQ circuits that partially balances the delays within the circuit to achieve a target throughput while minimizing area. Our algorithm can account for expected timing variations, and by adjusting the bias of the clock network and the clock frequency, we can mitigate unexpected timing violations post-fabrication. On a benchmark suite with nominal delays, our designs offer an average 1.48x improvement in area-delay product (ADP) over high-frequency full path balancing (FPB) designs and a 2.07x improvement in ADP over the robust circuits produced by state-of-the-art (SOTA) multi-phase clocking solutions.
Submitted 7 September, 2024;
originally announced September 2024.
-
A Joint Optimization of Buffer and Splitter Insertion for Phase-Skipping Adiabatic Quantum-Flux-Parametron Circuits
Authors:
Robert S. Aviles,
Peter A. Beerel
Abstract:
Adiabatic Quantum-Flux-Parametron (AQFP) logic is a promising emerging device technology with six orders of magnitude lower power than CMOS. However, AQFP is challenged by the fact that every gate must be clocked, where proper data transfer requires connected gates to have shifted but overlapping clocks. As a result, buffers need to be used to balance re-convergent logic paths, a problem that is exacerbated by every multi-node fanout needing a tree of clocked splitters. Recent AQFP circuit design techniques have offered an opportunity to reduce buffer costs by supporting a notion of phase-skipping, but the EDA support for these advanced circuits is limited. This paper proposes the first algorithm to optimize buffer and splitter insertion for phase-skipping AQFP circuits and achieves over 31% savings over existing buffer reduction schemes and up to 74% savings in buffer and splitter costs over the SOTA non-phase-skipping circuits.
Submitted 7 September, 2024; v1 submitted 14 January, 2024;
originally announced January 2024.
-
An Efficient and Scalable Clocking Assignment Algorithm for Multi-Threaded Multi-Phase Single Flux Quantum Circuits
Authors:
Robert S. Aviles,
Xi Li,
Lei Lu,
Zhaorui Ni,
Peter A. Beerel
Abstract:
A key distinguishing feature of single flux quantum (SFQ) circuits is that each logic gate is clocked. This feature forces the introduction of path-balancing flip-flops to ensure proper synchronization of inputs at each gate. This paper proposes a polynomial-time approximation algorithm for clocking assignments that minimizes the insertion of path-balancing buffers for multi-threaded multi-phase clocking of SFQ circuits. Existing SFQ multi-phase clocking solutions have been shown to effectively reduce the number of required buffers inserted while maintaining high throughput; however, the associated clock assignment algorithms have exponential complexity and can have prohibitively long runtimes for large circuits, limiting the scalability of this approach. Our proposed algorithm is based on a linear program (LP) that leads to solutions that are experimentally on average within 5% of the optimum and helps accelerate convergence towards the optimal integer linear program (ILP) based solution. The improved LP and ILP runtimes permit multi-phase clocking schemes to scale to larger SFQ circuits than previous state-of-the-art clocking assignment methods. We further extend the existing algorithm to support fanout sharing of the added buffers, saving, on average, an additional 10% of the inserted DFFs. Compared to traditional full path balancing (FPB) methods across 10 benchmarks, our enhanced LP saves 79.9%, 87.8%, and 91.2% of the inserted buffers for 2, 3, and 4 clock phases, respectively. Finally, we extend this approach to the generation of circuits that completely mitigate potential hold-time violations at the cost of either adding on average less than 10% more buffers (for designs with 3 or more clock phases) or, more generally, adding a clock phase and thereby reducing throughput.
Submitted 12 January, 2024;
originally announced January 2024.