CATCH: a Cost Analysis Tool for Co-optimization of chiplet-based Heterogeneous systems
Authors:
Alexander Graening,
Jonti Talukdar,
Saptadeep Pal,
Krishnendu Chakrabarty,
Puneet Gupta
Abstract:
With the increasing prevalence of chiplet systems in high-performance computing applications, the number of design options has increased dramatically. Instead of chips defaulting to a single die design, now there are options for 2.5D and 3D stacking along with a plethora of choices regarding configurations and processes. For chiplet-based designs, high-impact decisions such as those regarding the…
▽ More
With the increasing prevalence of chiplet systems in high-performance computing applications, the number of design options has increased dramatically. Instead of chips defaulting to a single die design, now there are options for 2.5D and 3D stacking along with a plethora of choices regarding configurations and processes. For chiplet-based designs, high-impact decisions such as those regarding the number of chiplets, the design partitions, the interconnect types, and other factors must be made early in the development process. In this work, we describe an open-source tool, CATCH, that can be used to guide these early design choices. We also present case studies showing some of the insights we can draw by using this tool. We look at case studies on optimal chip size, defect density, test cost, IO types, assembly processes, and substrates.
△ Less
Submitted 19 March, 2025;
originally announced March 2025.
SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration
Authors:
Shurui Li,
Wojciech Romaszkan,
Alexander Graening,
Puneet Gupta
Abstract:
Quantization is spearheading the increase in performance and efficiency of neural network computing systems making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient neural network inference acceleration delivering improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS ca…
▽ More
Quantization is spearheading the increase in performance and efficiency of neural network computing systems making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient neural network inference acceleration delivering improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to 54.3% (19.8%) point accuracy improvement compared to weight truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with retraining) showing the strength of leveraging shared bit-sparsity in weights. SWIS accelerator gives up to 6x speedup and 1.9x energy improvement overstate of the art bit-serial architectures.
△ Less
Submitted 2 March, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.