-
Clip4Retrofit: Enabling Real-Time Image Labeling on Edge Devices via Cross-Architecture CLIP Distillation
Authors:
Li Zhong,
Ahmed Ghazal,
Jun-Jun Wan,
Frederik Zilly,
Patrick Mackens,
Joachim E. Vollrath,
Bogdan Sorin Coseriu
Abstract:
Foundation models like CLIP (Contrastive Language-Image Pretraining) have revolutionized vision-language tasks by enabling zero-shot and few-shot learning through cross-modal alignment. However, their computational complexity and large memory footprint make them unsuitable for deployment on resource-constrained edge devices, such as in-car cameras used for image collection and real-time processing. To address this challenge, we propose Clip4Retrofit, an efficient model distillation framework that enables real-time image labeling on edge devices. The framework is deployed on the Retrofit camera, a cost-effective edge device retrofitted into thousands of vehicles, despite strict limitations on compute performance and memory. Our approach distills the knowledge of the CLIP model into a lightweight student model, combining EfficientNet-B3 with multi-layer perceptron (MLP) projection heads to preserve cross-modal alignment while significantly reducing computational requirements. We demonstrate that our distilled model achieves a balance between efficiency and performance, making it ideal for deployment in real-world scenarios. Experimental results show that Clip4Retrofit can perform real-time image labeling and object identification on edge devices with limited resources, offering a practical solution for applications such as autonomous driving and retrofitting existing systems. This work bridges the gap between state-of-the-art vision-language models and their deployment in resource-constrained environments, paving the way for broader adoption of foundation models in edge computing.
Submitted 23 May, 2025;
originally announced May 2025.
-
Surface-biased Multi-Level Context 3D Object Detection
Authors:
Sultan Abu Ghazal,
Jean Lahoud,
Rao Anwer
Abstract:
Object detection in 3D point clouds is a crucial task in a range of computer vision applications, including robotics, autonomous cars, and augmented reality. This work addresses the object detection task in 3D point clouds using a highly efficient, surface-biased feature extraction method (wang2022rbgnet) that also captures contextual cues on multiple levels. We propose a 3D object detector that extracts accurate feature representations of object candidates and leverages self-attention on point patches, on object candidates, and on the global 3D scene. Self-attention has been shown to be effective in encoding correlation information in 3D point clouds (xie2020mlcvnet). Other 3D detectors, however, focus on enhancing point cloud feature extraction by selectively obtaining more meaningful local features (wang2022rbgnet) and overlook contextual information. To this end, the proposed architecture uses ray-based, surface-biased feature extraction and multi-level context encoding to outperform the state-of-the-art 3D object detector. 3D detection experiments are performed on scenes from the ScanNet dataset, with the self-attention modules introduced one after the other to isolate the effect of self-attention at each level.
Submitted 13 February, 2023;
originally announced February 2023.
-
Channel Measurements and Models for High-Speed Train Wireless Communication Systems in Tunnel Scenarios: A Survey
Authors:
Yu Liu,
Ammar Ghazal,
Cheng-Xiang Wang,
Xiaohu Ge,
Yang Yang,
Yapei Zhang
Abstract:
The rapid development of high-speed trains (HSTs) introduces new challenges to HST wireless communication systems. Realistic HST channel models play a critical role in designing and evaluating HST communication systems. Owing to the limited length and bounded structure of tunnels and to the waveguide effect, channel characteristics in tunnel scenarios are very different from those in other HST scenarios. Therefore, accurate tunnel channel models that consider both large-scale and small-scale fading characteristics are essential for HST communication systems. Moreover, certain characteristics of tunnel channels have not been investigated sufficiently. This article provides a comprehensive review of the measurement campaigns in tunnels and presents some tunnel channel models using various modeling methods. Finally, future directions in HST tunnel channel measurements and modeling are discussed.
Submitted 30 December, 2016;
originally announced December 2016.
-
Effect of Local Optical Potential Parameters on the Pion-Nucleus Elastic Scattering in Framework of Second-Order Eikonal Model at Intermediate Energy
Authors:
N. A. Elmahdy,
A. Y. Ellithi,
A. SH. Ghazal,
Z. Metawei,
M. Y. M. Hassan
Abstract:
The microscopic eikonal phase shifts, with their first- and second-order corrections, for pion-nucleus collisions are calculated using the expression previously derived for the local Kisslinger potential. A physical description of different forms of the target density distributions is implemented. The roles of the Ericson-Ericson Lorentz-Lorentz (EELL) parameter and the adjustable scattering amplitude parameters are discussed. The calculated differential cross sections include the second-order corrections to the eikonal phase shift. The need for modifying the effective interaction to account for higher-order corrections, larger scattering angles, and lower incident projectile momentum is discussed. The results of this theory for ¹²C, ¹⁶O, ²⁸Si, ⁴⁰,⁴⁴,⁴⁸Ca, and ²⁰⁸Pb nuclei are shown to yield satisfactory agreement with experimental data in the momentum range of 114 to 292 MeV/c.
Submitted 14 March, 2022; v1 submitted 19 December, 2013;
originally announced December 2013.