Scalable iterative pruning of large language and vision models using block coordinate descent

Authors: Gili Rosenberg, J. Kyle Brubaker, Martin J. A. Schuetz, Elton Yechao Zhu, Serdar Kadıoğlu, Sima E. Borujeni, Helmut G. Katzgraber

Abstract: Pruning neural networks, which involves removing a fraction of their weights, can often maintain high accuracy while significantly reducing model complexity, at least up to a certain limit. We present a neural network pruning technique that builds upon the Combinatorial Brain Surgeon, but solves an optimization problem over a subset of the network weights in an iterative, block-wise manner using block coordinate descent. The iterative, block-based nature of this pruning technique, which we dub "iterative Combinatorial Brain Surgeon" (iCBS), allows for scalability to very large models, including large language models (LLMs), that may not be feasible with a one-shot combinatorial optimization approach. When applied to large models like Mistral and DeiT, iCBS achieves higher performance metrics at the same density levels compared to existing pruning methods such as Wanda. This demonstrates the effectiveness of this iterative, block-wise pruning method in compressing and optimizing the performance of large deep learning models, even while optimizing over only a small fraction of the weights. Moreover, our approach allows for a quality-time (or cost) tradeoff that is not available when using a one-shot pruning technique alone. The block-wise formulation of the optimization problem enables the use of hardware accelerators, potentially offsetting the increased computational costs compared to one-shot pruning methods like Wanda. In particular, the optimization problem solved for each block is quantum-amenable in that it could, in principle, be solved by a quantum computer.

Submitted 26 November, 2024; originally announced November 2024.

Comments: 16 pages, 6 figures, 5 tables

arXiv:2405.05983 [pdf]

doi 10.1109/CISCE62493.2024.10653353

Real-Time Pill Identification for the Visually Impaired Using Deep Learning

Authors: Bo Dang, Wenchao Zhao, Yufeng Li, Danqing Ma, Qixuan Yu, Elly Yijun Zhu

Abstract: The prevalence of mobile technology offers unique opportunities for addressing healthcare challenges, especially for individuals with visual impairments. This paper explores the development and implementation of a deep learning-based mobile application designed to assist blind and visually impaired individuals in real-time pill identification. Utilizing the YOLO framework, the application aims to accurately recognize and differentiate between various pill types through real-time image processing on mobile devices. The system incorporates Text-to-Speech (TTS) to provide immediate auditory feedback, enhancing usability and independence for visually impaired users. Our study evaluates the application's effectiveness in terms of detection accuracy and user experience, highlighting its potential to improve medication management and safety among the visually impaired community. Keywords: Deep Learning; YOLO Framework; Mobile Application; Visual Impairment; Pill Identification; Healthcare

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2306.03976 [pdf, other]

doi 10.3390/make5040086

Explainable AI using expressive Boolean formulas

Authors: Gili Rosenberg, J. Kyle Brubaker, Martin J. A. Schuetz, Grant Salton, Zhihuai Zhu, Elton Yechao Zhu, Serdar Kadıoğlu, Sima E. Borujeni, Helmut G. Katzgraber

Abstract: We propose and implement an interpretable machine learning classification model for Explainable AI (XAI) based on expressive Boolean formulas. Potential applications include credit scoring and diagnosis of medical conditions. The Boolean formula defines a rule with tunable complexity (or interpretability), according to which input data are classified. Such a formula can include any operator that can be applied to one or more Boolean variables, thus providing higher expressivity compared to more rigid rule-based and tree-based approaches. The classifier is trained using native local optimization techniques, efficiently searching the space of feasible formulas. Shallow rules can be determined by fast Integer Linear Programming (ILP) or Quadratic Unconstrained Binary Optimization (QUBO) solvers, potentially powered by special purpose hardware or quantum devices. We combine the expressivity and efficiency of the native local optimizer with the fast operation of these devices by executing non-local moves that optimize over subtrees of the full Boolean formula. We provide extensive numerical benchmarking results featuring several baselines on well-known public datasets. Based on the results, we find that the native local rule classifier is generally competitive with the other classifiers. The addition of non-local moves achieves similar results with fewer iterations, and therefore using specialized or quantum hardware could lead to a speedup by fast proposal of non-local moves.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 28 pages, 16 figures, 4 tables

Journal ref: Mach. Learn. Knowl. Extr. 2023, 5(4), 1760-1795

arXiv:1708.04314 [pdf, other]

Superadditivity in trade-off capacities of quantum channels

Authors: Elton Yechao Zhu, Quntao Zhuang, Min-Hsiu Hsieh, Peter W. Shor

Abstract: In this article, we investigate the additivity phenomenon in the dynamic capacity of a quantum channel for trading classical communication, quantum communication and entanglement. Understanding this additivity property is important if we want to optimally use a quantum channel for general communication purposes. However, in a lot of cases, the channel one will be using only has an additive single or double resource capacity, and it is largely unknown if this could lead to a superadditive double or triple resource capacity. For example, if a channel has an additive classical and quantum capacity, can the classical-quantum capacity be superadditive? In this work, we answer such questions affirmatively. We give proof-of-principle requirements for these channels to exist. In most cases, we can provide an explicit construction of these quantum channels. The existence of these superadditive phenomena is surprising in contrast to the result that the additivity of both classical-entanglement and classical-quantum capacity regions implies the additivity of the triple capacity region.

Submitted 15 August, 2017; v1 submitted 14 August, 2017; originally announced August 2017.

Comments: 15 pages. v2: typo corrected

Report number: MIT-CTP/4917

arXiv:1704.06955 [pdf, other]

doi 10.1103/PhysRevLett.119.040503

Superadditivity of the Classical Capacity with Limited Entanglement Assistance

Authors: Elton Yechao Zhu, Quntao Zhuang, Peter W. Shor

Abstract: Finding the optimal encoding strategies can be challenging for communication using quantum channels, as classical and quantum capacities may be superadditive. Entanglement assistance can often simplify this task, as the entanglement-assisted classical capacity for any channel is additive, making entanglement across channel uses unnecessary. If the entanglement assistance is limited, the picture is much less clear. If the classical capacity is superadditive, then the classical capacity with limited entanglement assistance could retain superadditivity by continuity arguments. If the classical capacity is additive, it is unknown if superadditivity can still be developed with limited entanglement assistance. We show this is possible, by providing an example. We construct a channel for which the classical capacity is additive, but that with limited entanglement assistance can be superadditive. This shows entanglement plays a weird role in communication and we still understand very little about it.

Submitted 28 July, 2017; v1 submitted 23 April, 2017; originally announced April 2017.

Comments: 13 pages

Report number: MIT-CTP/4895

Journal ref: Phys. Rev. Lett. 119, 040503 (2017)

Showing 1–5 of 5 results for author: Zhu, E Y