Showing 1–2 of 2 results for author: Graening, A

Search v0.5.6 released 2020-02-24

arXiv:2503.15753

cs.AR

CATCH: a Cost Analysis Tool for Co-optimization of chiplet-based Heterogeneous systems

Authors: Alexander Graening, Jonti Talukdar, Saptadeep Pal, Krishnendu Chakrabarty, Puneet Gupta

Abstract: With the increasing prevalence of chiplet systems in high-performance computing applications, the number of design options has increased dramatically. Instead of chips defaulting to a single die design, now there are options for 2.5D and 3D stacking along with a plethora of choices regarding configurations and processes. For chiplet-based designs, high-impact decisions such as those regarding the number of chiplets, the design partitions, the interconnect types, and other factors must be made early in the development process. In this work, we describe an open-source tool, CATCH, that can be used to guide these early design choices. We also present case studies showing some of the insights we can draw by using this tool. We look at case studies on optimal chip size, defect density, test cost, IO types, assembly processes, and substrates.

Submitted 19 March, 2025; originally announced March 2025.

Comments: 13 pages, 21 figures

arXiv:2103.01308

cs.LG

SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration

Authors: Shurui Li, Wojciech Romaszkan, Alexander Graening, Puneet Gupta

Abstract: Quantization is spearheading the increase in performance and efficiency of neural network computing systems making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient neural network inference acceleration delivering improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to 54.3% (19.8%) point accuracy improvement compared to weight truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with retraining), showing the strength of leveraging shared bit-sparsity in weights. The SWIS accelerator gives up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.

Submitted 2 March, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: 8 pages, 6 figures, accepted as a full-length paper at the 2021 TinyML Research Symposium (https://openreview.net/group?id=tinyml.org/tinyML/2021/Research_Symposium)
