-
Cost-Driven Hardware-Software Co-Optimization of Machine Learning Pipelines
Authors:
Ravit Sharma,
Wojciech Romaszkan,
Feiqian Zhu,
Puneet Gupta,
Ankur Mehta
Abstract:
Researchers have long touted a vision of the future enabled by a proliferation of internet-of-things devices, including smart sensors, homes, and cities. Increasingly, embedding intelligence in such devices involves the use of deep neural networks. However, their storage and processing requirements make them prohibitive for cheap, off-the-shelf platforms. Overcoming those requirements is necessary for enabling widely-applicable smart devices. While many ways of making models smaller and more efficient have been developed, there is a lack of understanding of which ones are best suited for particular scenarios. More importantly for edge platforms, those choices cannot be analyzed in isolation from cost and user experience. In this work, we holistically explore how quantization, model scaling, and multi-modality interact with system components such as memory, sensors, and processors. We perform this hardware/software co-design from the cost, latency, and user-experience perspective, and develop a set of guidelines for optimal system design and model deployment for the most cost-constrained platforms. We demonstrate our approach using an end-to-end, on-device, biometric user authentication system using a $20 ESP-EYE board.
Submitted 19 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
SWIS -- Shared Weight bIt Sparsity for Efficient Neural Network Acceleration
Authors:
Shurui Li,
Wojciech Romaszkan,
Alexander Graening,
Puneet Gupta
Abstract:
Quantization is spearheading the increase in performance and efficiency of neural network computing systems making headway into commodity hardware. We present SWIS - Shared Weight bIt Sparsity, a quantization framework for efficient neural network inference acceleration delivering improved performance and storage compression through an offline weight decomposition and scheduling algorithm. SWIS can achieve up to 54.3% (19.8%) point accuracy improvement compared to weight truncation when quantizing MobileNet-v2 to 4 (2) bits post-training (with retraining), showing the strength of leveraging shared bit-sparsity in weights. The SWIS accelerator gives up to 6x speedup and 1.9x energy improvement over state-of-the-art bit-serial architectures.
Submitted 2 March, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
FlashCam: A fully digital camera for the Cherenkov Telescope Array
Authors:
G. Pühlhofer,
C. Bauer,
F. Eisenkolb,
D. Florin,
C. Föhr,
A. Gadola,
G. Hermann,
C. Kalkuhl,
J. Kasperek,
T. Kihm,
J. Koziol,
A. Manalaysay,
A. Marszalek,
P. J. Rajda,
W. Romaszkan,
M. Rupinski,
T. Schanz,
S. Steiner,
U. Straumann,
C. Tenzer,
A. Vollhardt,
Q. Weitzel,
K. Winiarski,
K. Zietara
Abstract:
FlashCam is a Cherenkov camera development project centered around a fully digital trigger and readout scheme with smart, digital signal processing, and a "horizontal" architecture for the electromechanical implementation. The fully digital approach, based on commercial FADCs and FPGAs as key components, provides the option to easily implement different types of triggers as well as digitization and readout scenarios using identical hardware, by simply changing the firmware on the FPGAs. At the same time, a large dynamic range and high resolution of low-amplitude signals in a single readout channel per pixel is achieved using compression of high amplitude signals in the preamplifier and signal processing in the FPGA. The readout of the front-end modules into a camera server is Ethernet-based using standard Ethernet switches. In its current implementation, data transfer and backend processing rates of ~3.8 GBytes/sec have been achieved. Together with the dead-time-free front end event buffering on the FPGAs, this permits the cameras to operate at trigger rates of up to several tens of kHz.
In the horizontal architecture of FlashCam, the photon detector plane (PDP), consisting of photon detectors, preamplifiers, high voltage-, control-, and monitoring systems, is a self-contained unit, which is interfaced through analogue signal transmission to the digital readout system. The horizontal integration of FlashCam is expected not only to be more cost efficient, it also allows PDPs with different types of photon detectors to be adapted to the FlashCam readout system. This paper describes the FlashCam concept, its verification process, and its implementation for a 12 m class CTA telescope with PMT-based PDP.
Submitted 13 July, 2013;
originally announced July 2013.
-
4 m Davies-Cotton telescope for the Cherenkov Telescope Array
Authors:
R. Moderski,
J. A. Aguilar,
A. Barnacka,
A. Basili,
V. Boccone,
L. Bogacz,
F. Cadoux,
A. Christov,
M. Della Volpe,
M. Dyrda,
A. Frankowski,
M. Grudzińska,
M. Janiak,
M. Karczewski,
J. Kasperek,
W. Kochański,
P. Korohoda,
J. Kozioł,
P. Lubiński,
J. Ludwin,
E. Lyard,
A. Marszałek,
J. Michałowski,
T. Montaruli,
J. Nicolau-Kukliński,
et al. (17 additional authors not shown)
Abstract:
The Cherenkov Telescope Array (CTA) is the next generation very high energy gamma-ray observatory. It will consist of three classes of telescopes, of large, medium and small sizes. The small telescopes, of 4 m diameter, will be dedicated to the observations of the highest energy gamma-rays, above several TeV. We present the technical characteristics of a single mirror, 4 m diameter, Davies-Cotton telescope for the CTA and the performance of the sub-array consisting of telescopes of this type. The telescope will be equipped with a fully digital camera based on custom made, hexagonal Geiger-mode avalanche photodiodes. The development of cameras based on such devices is an R&D effort, since photomultipliers are traditionally used. The photodiodes are now being characterized at various institutions of the CTA Consortium. Glass mirrors will be used, although an alternative is being considered: composite mirrors that could be adopted if they meet the project requirements. We present a design of the telescope structure, its components and results of the numerical simulations of the telescope performance.
Submitted 11 July, 2013;
originally announced July 2013.
-
CTA contributions to the 33rd International Cosmic Ray Conference (ICRC2013)
Authors:
The CTA Consortium,
O. Abril,
B. S. Acharya,
M. Actis,
G. Agnetta,
J. A. Aguilar,
F. Aharonian,
M. Ajello,
A. Akhperjanian,
M. Alcubierre,
J. Aleksic,
R. Alfaro,
E. Aliu,
A. J. Allafort,
D. Allan,
I. Allekotte,
R. Aloisio,
E. Amato,
G. Ambrosi,
M. Ambrosio,
J. Anderson,
E. O. Angüner,
L. A. Antonelli,
V. Antonuccio,
et al. (1082 additional authors not shown)
Abstract:
Compilation of CTA contributions to the proceedings of the 33rd International Cosmic Ray Conference (ICRC2013), which took place on 2-9 July 2013 in Rio de Janeiro, Brazil.
Submitted 29 July, 2013; v1 submitted 8 July, 2013;
originally announced July 2013.