-
A Comprehensive Survey on Knowledge Distillation
Authors:
Amir M. Mansourian,
Rozhan Ahmadi,
Masoud Ghafouri,
Amir Mohammad Babaei,
Elaheh Badali Golezani,
Zeynab Yasamani Ghamchi,
Vida Ramezanian,
Alireza Taherian,
Kimia Dinashi,
Amirali Miri,
Shohreh Kasaei
Abstract:
Deep Neural Networks (DNNs) have achieved notable performance in computer vision and natural language processing, with various applications in both academia and industry. However, with recent advancements in DNNs and transformer models with a tremendous number of parameters, deploying these large models on edge devices causes serious issues such as high runtime and memory consumption. This is especially concerning with the recent large-scale foundation models, Vision-Language Models (VLMs), and Large Language Models (LLMs). Knowledge Distillation (KD) is one of the prominent techniques proposed to address these problems using a teacher-student architecture. More specifically, a lightweight student model is trained using additional knowledge from a cumbersome teacher model. This work presents a comprehensive survey of knowledge distillation methods. It reviews KD from different aspects: distillation sources, distillation schemes, distillation algorithms, distillation by modalities, applications of distillation, and comparison among existing methods. In contrast to most existing surveys, which are either outdated or simply update former surveys, this work offers a new point of view and representation structure that categorizes and investigates the most recent methods in knowledge distillation. This survey considers various critically important subcategories, including KD for diffusion models, 3D inputs, foundational models, transformers, and LLMs. Furthermore, existing challenges in KD and possible future research directions are discussed. GitHub page of the project: https://github.com/IPL-Sharif/KD_Survey
Submitted 15 March, 2025;
originally announced March 2025.
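The teacher-student training described in the survey abstract above can be illustrated with the classic logit-based distillation loss. This is a minimal sketch of that well-known formulation, not code from the survey or its repository; the function names and the default temperature of 4.0 are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The default temperature is an illustrative assumption, not a value
    taken from the survey.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl
```

In practice this term is usually added to the ordinary supervised loss on ground-truth labels, with a weighting factor between the two.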
-
A physics-based data-driven model for CO$_2$ gas diffusion electrodes to drive automated laboratories
Authors:
Ivan Grega,
Félix Therrien,
Abhishek Soni,
Karry Ocean,
Kevan Dettelbach,
Ribwar Ahmadi,
Mehrdad Mokhtari,
Curtis P. Berlinguette,
Yoshua Bengio
Abstract:
The electrochemical reduction of atmospheric CO$_2$ into high-energy molecules with renewable energy is a promising avenue for energy storage that can take advantage of existing infrastructure especially in areas where sustainable alternatives to fossil fuels do not exist. Automated laboratories are currently being developed and used to optimize the composition and operating conditions of gas diffusion electrodes (GDEs), the device in which this reaction takes place. Improving the efficiency of GDEs is crucial for this technology to become viable. Here we present a modeling framework to efficiently explore the high-dimensional parameter space of GDE designs in an active learning context. At the core of the framework is an uncertainty-aware physics model calibrated with experimental data. The model has the flexibility to capture various input parameter spaces and any carbon products which can be modeled with Tafel kinetics. It is interpretable, and a Gaussian process layer can capture deviations of real data from the function space of the physical model itself. We deploy the model in a simulated active learning setup with real electrochemical data gathered by the AdaCarbon automated laboratory and show that it can be used to efficiently traverse the multi-dimensional parameter space.
Submitted 10 February, 2025;
originally announced February 2025.
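The abstract above mentions carbon products that can be modeled with Tafel kinetics. The sketch below shows only the textbook Tafel relation, j = j0 * 10^(eta / b), and how partial current densities could be turned into current shares per product. It is not the authors' model: the function names are hypothetical, and using one shared overpotential for all products is a simplifying assumption.

```python
def tafel_current_density(overpotential, exchange_current_density, tafel_slope):
    """Tafel relation: j = j0 * 10^(eta / b).

    overpotential (eta) and tafel_slope (b, per decade of current) must use
    the same voltage unit.
    """
    return exchange_current_density * 10.0 ** (overpotential / tafel_slope)


def current_shares(overpotential, products):
    """Share of the total current carried by each product.

    products maps a product name to (j0, b). Assumption for illustration:
    every product sees the same overpotential.
    """
    partial = {
        name: tafel_current_density(overpotential, j0, slope)
        for name, (j0, slope) in products.items()
    }
    total = sum(partial.values())
    return {name: j / total for name, j in partial.items()}
```

Raising the overpotential by one Tafel slope multiplies the partial current density by ten, which is the property a calibration against experimental data would fit per product.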
-
Comparison of Kinematics and Kinetics Between OpenCap and a Marker-Based Motion Capture System in Cycling
Authors:
Reza Kakavand,
Reza Ahmadi,
Atousa Parsaei,
W. Brent Edwards,
Amin Komeili
Abstract:
This study evaluates the agreement of marker-based and markerless (OpenCap) motion capture systems in assessing joint kinematics and kinetics during cycling. Markerless systems, such as OpenCap, offer the advantage of capturing natural movements without physical markers, making them more practical for real-world applications. However, the agreement of OpenCap with a marker-based system, particularly in cycling, remains underexplored. Ten participants cycled at varying speeds and resistances while motion data were recorded using both systems. Key metrics, including joint angles, moments, and joint reaction loads, were computed using OpenSim and compared using root mean squared error (RMSE) per trial across participants, Pearson correlation coefficients (r) per trial across participants, and repeated-measures Bland-Altman analysis to account for the dependency of trials within each subject. Results revealed very strong agreement (r > 0.9) for hip (flexion/extension), knee (flexion/extension), and ankle (dorsiflexion/plantarflexion) joint angles.
Submitted 30 April, 2025; v1 submitted 20 August, 2024;
originally announced September 2024.
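Two of the agreement metrics named in the abstract above, RMSE and the Pearson correlation coefficient, can be computed for a pair of joint-angle traces as follows. This is a minimal sketch using the standard definitions; the function names are illustrative, and the study's per-trial aggregation and Bland-Altman analysis are not reproduced.

```python
import math


def rmse(reference, estimate):
    """Root mean squared error between two equally long sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(reference, estimate):
    """Pearson correlation coefficient between two sequences."""
    if len(reference) != len(estimate) or len(reference) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    return cov / math.sqrt(var_r * var_e)
```

A constant offset between the two systems leaves r at 1 while RMSE equals the offset, which is why the two metrics are reported together.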
-
Attention-guided Feature Distillation for Semantic Segmentation
Authors:
Amir M. Mansourian,
Arya Jalali,
Rozhan Ahmadi,
Shohreh Kasaei
Abstract:
Deep learning models have achieved significant results across various computer vision tasks. However, due to the large number of parameters in these models, deploying them in real-time scenarios is a critical challenge, specifically in dense prediction tasks such as semantic segmentation. Knowledge distillation has emerged as a successful technique for addressing this problem by transferring knowledge from a cumbersome model (teacher) to a lighter model (student). In contrast to existing complex methodologies commonly employed for distilling knowledge from a teacher to a student, this paper showcases the efficacy of a simple yet powerful method for utilizing refined feature maps to transfer attention. The proposed method has proven to be effective in distilling rich information, outperforming existing methods in semantic segmentation as a dense prediction task. The proposed Attention-guided Feature Distillation (AttnFD) method employs the Convolutional Block Attention Module (CBAM), which refines feature maps by taking into account both channel-specific and spatial information content. Simply using the Mean Squared Error (MSE) loss function between the refined feature maps of the teacher and the student, AttnFD demonstrates outstanding performance in semantic segmentation, achieving state-of-the-art results in terms of improving the mean Intersection over Union (mIoU) of the student network on the Pascal VOC 2012, Cityscapes, COCO, and CamVid datasets.
Submitted 24 March, 2025; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation
Authors:
Rozhan Ahmadi,
Shohreh Kasaei
Abstract:
In recent years, weakly supervised semantic segmentation using image-level labels as supervision has received significant attention in the field of computer vision. Most existing methods have addressed the challenges arising from the lack of spatial information in these labels by focusing on facilitating supervised learning through the generation of pseudo-labels from class activation maps (CAMs). Due to the localized pattern detection of CNNs, CAMs often emphasize only the most discriminative parts of an object, making it challenging to accurately distinguish foreground objects from each other and the background. Recent studies have shown that Vision Transformer (ViT) features, due to their global view, are more effective in capturing the scene layout than CNNs. However, the use of hierarchical ViTs has not been extensively explored in this field. This work explores the use of Swin Transformer by proposing "SWTformer" to enhance the accuracy of the initial seed CAMs by bringing local and global views together. SWTformer-V1 generates class probabilities and CAMs using only the patch tokens as features. SWTformer-V2 incorporates a multi-scale feature fusion mechanism to extract additional information and utilizes a background-aware mechanism to generate more accurate localization maps with improved cross-object discrimination. Based on experiments on the PascalVOC 2012 dataset, SWTformer-V1 achieves a 0.98% mAP higher localization accuracy, outperforming state-of-the-art models. It also yields comparable performance by 0.82% mIoU on average higher than other methods in generating initial localization maps, depending only on the classification network. SWTformer-V2 further improves the accuracy of the generated seed CAMs by 5.32% mIoU, further proving the effectiveness of the local-to-global view provided by the Swin transformer. Code available at: https://github.com/RozhanAhmadi/SWTformer
Submitted 11 March, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
Integration of Swin UNETR and statistical shape modeling for a semi-automated segmentation of the knee and biomechanical modeling of articular cartilage
Authors:
Reza Kakavand,
Mehrdad Palizi,
Peyman Tahghighi,
Reza Ahmadi,
Neha Gianchandani,
Samer Adeeb,
Roberto Souza,
W. Brent Edwards,
Amin Komeili
Abstract:
Simulation studies like finite element (FE) modeling provide insight into knee joint mechanics without patient experimentation. Generic FE models represent the biomechanical behavior of the tissue while overlooking variations in geometry, loading, and material properties across a population. Subject-specific models, on the other hand, include these specifics, resulting in enhanced predictive precision. However, creating such models is laborious and time-intensive. The present study aimed to enhance subject-specific knee joint FE modeling by incorporating a semi-automated segmentation algorithm. The algorithm used a 3D Swin UNETR for an initial segmentation of the femur and tibia, followed by a statistical shape model (SSM) adjustment to improve surface roughness and continuity. Five hundred and seven magnetic resonance images (MRIs) from the Osteoarthritis Initiative (OAI) database were used to build and validate the segmentation model. A semi-automated FE model was developed using this semi-automated segmentation. For comparison, a manual FE model was developed through manual segmentation (i.e., the gold standard approach). Both FE models were subjected to gait loading, and their predicted mechanical responses were compared. The semi-automated segmentation achieved a Dice similarity coefficient (DSC) over 98% for both the femur and tibia. The mechanical results (max principal stress, max principal strain, fluid pressure, fibril strain, and contact area) showed no significant differences between the manual and semi-automated FE models, indicating the effectiveness of the proposed semi-automated segmentation in creating accurate knee joint FE models. (https://data.mendeley.com/datasets/k5hdc9cz7w/1)
Submitted 18 September, 2023;
originally announced December 2023.
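The Dice similarity coefficient reported in the abstract above compares a predicted segmentation mask with a reference mask. The sketch below uses the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), on flattened binary masks; the function name and the choice to return 1.0 when both masks are empty are assumptions for illustration.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient for two flattened binary masks.

    Each mask is a sequence of 0/1 (or False/True) voxel labels.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    size_sum = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if size_sum == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / size_sum
```

For a 3D MRI volume, the masks would be flattened per structure (for example femur and tibia separately) before calling the function.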
-
Free-Space Optical Spiking Neural Network
Authors:
Reyhane Ahmadi,
Amirreza Ahmadnejad,
Somayyeh Koohi
Abstract:
Neuromorphic engineering has emerged as a promising avenue for developing brain-inspired computational systems. However, conventional electronic AI-based processors often encounter challenges related to processing speed and thermal dissipation. As an alternative, optical implementations of such processors have been proposed, capitalizing on the intrinsic information-processing capabilities of light. Within the realm of optical neuromorphic engineering, various optical neural networks (ONNs) have been explored. Among these, Spiking Neural Networks (SNNs) have exhibited notable success in emulating the computational principles of the human brain. Nevertheless, the integration of optical SNN processors has presented formidable obstacles, particularly when dealing with the computational demands of large datasets. In response to these challenges, we introduce a pioneering concept: the Free-space Optical deep Spiking Convolutional Neural Network (OSCNN). This novel approach draws inspiration from computational models of the human eye. We have meticulously designed various optical components within the OSCNN to tackle object detection tasks across prominent benchmark datasets, including MNIST, ETH 80, and Caltech. Our results demonstrate promising performance with minimal latency and power consumption compared to their electronic ONN counterparts. Additionally, we conducted several pertinent simulations, such as optical intensity-to-latency conversion and synchronization. Of particular significance is the evaluation of the feature extraction layer, employing a Gabor filter bank, which stands to significantly impact the practical deployment of diverse ONN architectures.
Submitted 8 November, 2023;
originally announced November 2023.
-
AICSD: Adaptive Inter-Class Similarity Distillation for Semantic Segmentation
Authors:
Amir M. Mansourian,
Rozhan Ahmadi,
Shohreh Kasaei
Abstract:
In recent years, deep neural networks have achieved remarkable accuracy in computer vision tasks. With inference time being a crucial factor, particularly in dense prediction tasks such as semantic segmentation, knowledge distillation has emerged as a successful technique for improving the accuracy of lightweight student networks. Existing methods often neglect the information in channels and among different classes. To overcome these limitations, this paper proposes a novel method called Inter-Class Similarity Distillation (ICSD) for the purpose of knowledge distillation. The proposed method transfers high-order relations from the teacher network to the student network by independently computing intra-class distributions for each class from network outputs. This is followed by calculating inter-class similarity matrices for distillation using KL divergence between distributions of each pair of classes. To further improve the effectiveness of the proposed method, an Adaptive Loss Weighting (ALW) training strategy is proposed. Unlike existing methods, the ALW strategy gradually reduces the influence of the teacher network towards the end of the training process to account for errors in the teacher's predictions. Extensive experiments conducted on two well-known datasets for semantic segmentation, Cityscapes and Pascal VOC 2012, validate the effectiveness of the proposed method in terms of mIoU and pixel accuracy. The proposed method outperforms most existing knowledge distillation methods, as demonstrated by both quantitative and qualitative evaluations. Code is available at: https://github.com/AmirMansurian/AICSD
Submitted 8 August, 2023;
originally announced August 2023.
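The two steps described in the AICSD abstract above (per-class distributions from network outputs, then a matrix of pairwise KL divergences) can be sketched as follows. This is one possible reading of the abstract, not the authors' implementation, which is available at the linked repository. Several details are assumptions made here for illustration: the softmax over pixel positions, the mean squared difference used to compare teacher and student matrices, the linear teacher-weight schedule, and all function names.

```python
import math


def spatial_softmax(scores):
    """Turn one class channel (scores per pixel) into a distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def inter_class_matrix(logits):
    """logits: one list of per-pixel scores per class. Returns a CxC matrix."""
    dists = [spatial_softmax(channel) for channel in logits]
    return [[kl_divergence(p, q) for q in dists] for p in dists]


def icsd_loss(student_logits, teacher_logits):
    """Assumed comparison: mean squared difference of the two matrices."""
    s = inter_class_matrix(student_logits)
    t = inter_class_matrix(teacher_logits)
    c = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(c) for j in range(c)) / (c * c)


def teacher_weight(epoch, total_epochs):
    """Hypothetical linear decay of the teacher's influence over training."""
    return max(0.0, 1.0 - epoch / total_epochs)
```

The abstract states only that the teacher's influence is reduced gradually towards the end of training, so any monotonically decreasing schedule could stand in for `teacher_weight`.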
-
Mitigating Bias: Enhancing Image Classification by Improving Model Explanations
Authors:
Raha Ahmadi,
Mohammad Javad Rajabi,
Mohammad Khalooie,
Mohammad Sabokrou
Abstract:
Deep learning models have demonstrated remarkable capabilities in learning complex patterns and concepts from training data. However, recent findings indicate that these models tend to rely heavily on simple and easily discernible features present in the background of images rather than the main concepts or objects they are intended to classify. This phenomenon poses a challenge to image classifiers as the crucial elements of interest in images may be overshadowed. In this paper, we propose a novel approach to address this issue and improve the learning of main concepts by image classifiers. Our central idea revolves around concurrently guiding the model's attention toward the foreground during the classification task. By emphasizing the foreground, which encapsulates the primary objects of interest, we aim to shift the focus of the model away from the dominant influence of the background. To accomplish this, we introduce a mechanism that encourages the model to allocate sufficient attention to the foreground. We investigate various strategies, including modifying the loss function or incorporating additional architectural components, to enable the classifier to effectively capture the primary concept within an image. Additionally, we explore the impact of different foreground attention mechanisms on model performance and provide insights into their effectiveness. Through extensive experimentation on benchmark datasets, we demonstrate the efficacy of our proposed approach in improving the classification accuracy of image classifiers. Our findings highlight the importance of foreground attention in enhancing model understanding and representation of the main concepts within images. The results of this study contribute to advancing the field of image classification and provide valuable insights for developing more robust and accurate deep-learning models.
Submitted 22 September, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Distributed Energy Management and Demand Response in Smart Grids: A Multi-Agent Deep Reinforcement Learning Framework
Authors:
Amin Shojaeighadikolaei,
Arman Ghasemi,
Kailani Jones,
Yousif Dafalla,
Alexandru G. Bardas,
Reza Ahmadi,
Morteza Haashemi
Abstract:
This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems. In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users. DR has a widely recognized potential for improving power grid stability and reliability, while at the same time reducing end-users' energy bills. However, conventional DR techniques come with several shortcomings, such as the inability to handle operational uncertainties while incurring end-user disutility, which prevents widespread adoption in real-world applications. The proposed framework addresses these shortcomings by implementing DR and DEM based on a real-time pricing strategy that is achieved using deep reinforcement learning. Furthermore, this framework enables the power grid service provider to leverage distributed energy resources (i.e., PV rooftop panels and battery storage) as dispatchable assets to support the smart grid during peak hours, thus achieving management of distributed energy resources. Simulation results based on the Deep Q-Network (DQN) demonstrate significant improvements of the 24-hour accumulative profit for both prosumers and the power grid service provider, as well as major reductions in the utilization of the power grid reserve generators.
Submitted 28 November, 2022;
originally announced November 2022.
-
A Multi-Agent Deep Reinforcement Learning Approach for a Distributed Energy Marketplace in Smart Grids
Authors:
Arman Ghasemi,
Amin Shojaeighadikolaei,
Kailani Jones,
Morteza Hashemi,
Alexandru G. Bardas,
Reza Ahmadi
Abstract:
This paper presents a Reinforcement Learning (RL) based energy market for a prosumer-dominated microgrid. The proposed market model facilitates a real-time and demand-dependent dynamic pricing environment, which reduces grid costs and improves the economic benefits for prosumers. Furthermore, this market model enables the grid operator to leverage prosumers' storage capacity as a dispatchable asset for grid support applications. Simulation results based on the Deep Q-Network (DQN) framework demonstrate significant improvements of the 24-hour accumulative profit for both prosumers and the grid operator, as well as major reductions in grid reserve power utilization.
Submitted 22 September, 2020;
originally announced September 2020.
-
Demand Responsive Dynamic Pricing Framework for Prosumer Dominated Microgrids using Multiagent Reinforcement Learning
Authors:
Amin Shojaeighadikolaei,
Arman Ghasemi,
Kailani R. Jones,
Alexandru G. Bardas,
Morteza Hashemi,
Reza Ahmadi
Abstract:
Demand Response (DR) has a widely recognized potential for improving grid stability and reliability while reducing customers' energy bills. However, conventional DR techniques come with several shortcomings, such as an inability to handle operational uncertainties and incurring customer disutility, impeding their widespread adoption in real-world applications. This paper proposes a new multiagent Reinforcement Learning (RL) based decision-making environment for implementing a Real-Time Pricing (RTP) DR technique in a prosumer-dominated microgrid. The proposed technique addresses several shortcomings common to traditional DR methods and provides significant economic benefits to the grid operator and prosumers. To demonstrate its efficacy, the proposed DR method is compared to a baseline traditional operation scenario in a small-scale microgrid system. Finally, investigations on the use of prosumers' energy storage capacity in this microgrid highlight the advantages of the proposed method in establishing a balanced market setup.
Submitted 22 September, 2020;
originally announced September 2020.
-
A New Pseudo-color Technique Based on Intensity Information Protection for Passive Sensor Imagery
Authors:
Mohammad Reza Khosravi,
Habib Rostami,
Gholam Reza Ahmadi,
Suleiman Mansouri,
Ahmad Keshavarz
Abstract:
Remote sensing image processing is important in the geosciences. Images obtained by different types of sensors might initially be unrecognizable. To make the images visually acceptable, some pre-processing steps (such as noise removal) are performed, and these steps affect the analysis of the images. There are different types of processing according to the types of remote sensing images. The method introduced in this paper uses virtual colors to colorize the gray-scale images of satellite sensors. This approach enables a better analysis of a sample single-band image taken by the Landsat-8 (OLI) sensor (because it is a multi-band sensor with natural color bands, the natural color of its images can be compared to the synthetic color produced by our approach). A good feature of this method is that the original image can be recovered, which keeps a suitable resolution in the output images.
Submitted 8 April, 2017;
originally announced April 2017.
-
An Efficient Hybrid CS and K-Means Algorithm for the Capacitated PMedian Problem
Authors:
Hassan Gholami Mazinan,
Gholam Reza Ahmadi,
Erfan Khaji
Abstract:
The capacitated p-median problem (CPMP) is an important variation of the facility location problem in which p capacitated medians are economically selected to serve a set of demand vertices such that the total demand assigned to each median does not exceed its capacity. This paper surveys and analyses the combination of Cuckoo Search and K-Means algorithms to solve the CPMP. To check the quality and validity of the suggested method, we compared the final solutions produced on the two test problem sets of Osman and Christofides, each of which includes 10 sample tests. According to the results, the suggested meta-heuristic algorithm shows superiority over the other known algorithms in this field, as all the best known solutions in the first problem set and several in the second problem set have been improved within reasonable periods of time.
Submitted 29 June, 2014;
originally announced June 2014.
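The CPMP definition in the abstract above (p medians, each with a capacity that its assigned demand must not exceed) can be made concrete with a small solution evaluator. This is a minimal sketch of the feasibility check and the usual total-distance objective, not the hybrid Cuckoo Search and K-Means algorithm from the paper; the function name and the data layout are assumptions for illustration.

```python
def cpmp_cost(distances, demands, capacities, assignment, p):
    """Evaluate a candidate CPMP solution.

    distances[i][j]: distance from vertex i to vertex j
    demands[i]:      demand of vertex i
    capacities[j]:   capacity of vertex j when it acts as a median
    assignment[i]:   index of the median serving vertex i

    Returns the total assignment distance, or None if the solution does not
    use exactly p medians or overloads one of them.
    """
    medians = set(assignment)
    if len(medians) != p:
        return None
    load = {median: 0 for median in medians}
    for vertex, median in enumerate(assignment):
        load[median] += demands[vertex]
    if any(load[median] > capacities[median] for median in medians):
        return None
    return sum(distances[vertex][median] for vertex, median in enumerate(assignment))
```

A metaheuristic would call an evaluator like this on each candidate assignment it generates and keep the feasible one with the lowest cost.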