-
Reconfigurable Intelligent Surfaces: Interplay of Unit-Cell- and Surface-Level Design and Performance under Quantifiable Benchmarks
Authors:
Ammar Rafique,
Naveed Ul Hassan,
Muhammad Zubair,
Ijaz Haider Naqvi,
Muhammad Qasim Mehmood,
Chau Yuen,
Marco Di Renzo,
Merouane Debbah
Abstract:
The ability of reconfigurable intelligent surfaces (RIS) to produce complex radiation patterns in the far-field is determined by various factors, such as the unit-cell's size, shape, spatial arrangement, tuning mechanism, the communication and control circuitry's complexity, and the illuminating source's type (point or plane wave). Research on RIS has mainly focused on two areas: first, the optimization and design of unit-cells to achieve desired electromagnetic responses within a specific frequency band; and second, the exploration of RIS applications in various settings, including system-level performance analysis. The former does not assume any specific radiation pattern on the surface level, while the latter does not consider any particular unit-cell design. Both approaches largely ignore the complexity and power requirements of the RIS control circuitry. As we progress towards the fabrication and use of RIS in real-world settings, it is becoming increasingly necessary to consider the interplay between the unit-cell design, the required surface-level radiation patterns, the control circuit's complexity, and the power requirements concurrently. In this paper, a benchmarking framework for RIS is employed to compare performance and analyze tradeoffs between the unit-cell's specified radiation patterns and the control circuit's complexity for far-field beamforming, considering different diode-based unit-cell designs for a given surface size. This work lays the foundation for optimizing the design of the unit-cells and surface-level radiation patterns, facilitating the optimization of RIS-assisted wireless communication systems.
Submitted 4 April, 2023;
originally announced April 2023.
-
3D Convolutional with Attention for Action Recognition
Authors:
Labina Shrestha,
Shikha Dubey,
Farrukh Olimov,
Muhammad Aasim Rafique,
Moongu Jeon
Abstract:
Human action recognition is one of the challenging tasks in computer vision. Current action recognition methods use computationally expensive models to learn the spatio-temporal dependencies of an action. Models that use RGB channels and optical flow separately, models that use a two-stream fusion technique, and models that combine a convolutional neural network (CNN) with a long short-term memory (LSTM) network are a few examples of such complex models. Moreover, fine-tuning such complex models is computationally expensive as well. This paper proposes a deep neural network architecture for learning such dependencies, consisting of a 3D convolutional layer, fully connected (FC) layers, and an attention layer, which is simpler to implement and gives competitive performance on the UCF-101 dataset. The proposed method first learns spatial and temporal features of actions through a 3D-CNN, and then the attention mechanism helps the model direct attention to the features essential for recognition.
Submitted 5 June, 2022;
originally announced June 2022.
-
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning
Authors:
Shikha Dubey,
Farrukh Olimov,
Muhammad Aasim Rafique,
Joonmo Kim,
Moongu Jeon
Abstract:
Automatic transcription of scene understanding in images and videos is a step towards artificial general intelligence. Image captioning is the task of describing meaningful information in an image using computer vision techniques. Automated image captioning techniques use an encoder-decoder architecture, where the encoder extracts features from an image and the decoder generates a transcript. In this work, we investigate two unexplored ideas for image captioning using transformers: first, enforcing the use of objects' relevance in the surrounding environment; second, learning an explicit association between labels and language constructs. We propose the label-attention Transformer with geometrically coherent objects (LATGeO). The proposed technique acquires proposals of geometrically coherent objects using a deep neural network (DNN) and generates captions by investigating their relationships using a label-attention module. Object coherence is defined using the localized ratio of the geometrical properties of the proposals. The label-attention module associates the extracted object classes with the available dictionary using self-attention layers. The experimental results show that objects' relevance in their surroundings, and the binding of their visual features with their geometrically localized ratios and associated labels, help in defining meaningful captions. The proposed framework is tested on the MSCOCO dataset, and a thorough evaluation, resulting in overall better quantitative scores, demonstrates its superiority.
Submitted 16 September, 2021;
originally announced September 2021.
-
RVMDE: Radar Validated Monocular Depth Estimation for Robotics
Authors:
Muhammad Ishfaq Hussain,
Muhammad Aasim Rafique,
Moongu Jeon
Abstract:
Stereoscopy provides a natural perception of distance in a scene, and its role in 3D world understanding is intuitive. However, a rigid calibration of binocular vision sensors is crucial for accurate depth estimation. Alternatively, a monocular camera alleviates this limitation at the expense of accuracy in estimating depth, and the challenge is exacerbated in harsh environmental conditions. Moreover, an optical sensor often fails to acquire vital signals in harsh environments, and radar is used instead, which gives coarse but more accurate signals. This work explores the utility of coarse signals from radar when fused with fine-grained data from a monocular camera for depth estimation in harsh environmental conditions. A variant of the feature pyramid network (FPN) operates extensively on fine-grained image features at multiple scales with fewer parameters. FPN feature maps are fused with sparse radar features extracted with a convolutional neural network. The concatenated hierarchical features are used to predict depth with ordinal regression. We performed experiments on the nuScenes dataset, and the proposed architecture leads in quantitative evaluations with reduced parameters and faster inference. The depth estimation results suggest that the proposed techniques can be used as an alternative to stereo depth estimation in critical applications in robotics and self-driving cars. The source code will be available at https://github.com/MI-Hussain/RVMDE.
Submitted 18 April, 2022; v1 submitted 11 September, 2021;
originally announced September 2021.
-
Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving
Authors:
Farzeen Munir,
Shoaib Azam,
Muhammad Aasim Rafique,
Ahmad Muqeem Sheri,
Moongu Jeon,
Witold Pedrycz
Abstract:
Underexposure regions are vital for constructing a complete perception of the surroundings for safe autonomous driving. The availability of thermal cameras has provided an essential alternative for exploring regions where other optical sensors fail to capture interpretable signals. A thermal camera captures an image using the heat difference emitted by objects in the infrared spectrum, and object detection in thermal images becomes effective for autonomous driving in challenging conditions. Although object detection in visible-spectrum imaging has matured, thermal object detection lacks effectiveness. A significant challenge is the scarcity of labeled data for the thermal domain, which is a prerequisite for state-of-the-art (SOTA) artificial intelligence techniques. This work proposes a domain adaptation framework that employs a style transfer technique for transfer learning from visible-spectrum images to thermal images. The framework uses a generative adversarial network (GAN) to transfer low-level features from the visible-spectrum domain to the thermal domain through style consistency. The efficacy of the proposed method for object detection in thermal images is evident from the improved results obtained with styled images from publicly available thermal image datasets (FLIR ADAS and KAIST Multi-Spectral).
Submitted 3 May, 2021; v1 submitted 1 June, 2020;
originally announced June 2020.
-
A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber
Authors:
MicroBooNE collaboration,
C. Adams,
M. Alrashed,
R. An,
J. Anthony,
J. Asaadi,
A. Ashkenazi,
M. Auger,
S. Balasubramanian,
B. Baller,
C. Barnes,
G. Barr,
M. Bass,
F. Bay,
A. Bhat,
K. Bhattacharya,
M. Bishai,
A. Blake,
T. Bolton,
L. Camilleri,
D. Caratelli,
I. Caro Terrazas,
R. Carr,
R. Castillo Fernandez,
F. Cavanna, et al. (148 additional authors not shown)
Abstract:
We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep-neural-network-based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stopping muon and $ν_μ$ charged-current neutral pion data samples.
Submitted 22 August, 2018;
originally announced August 2018.