-
Towards Efficient Real-Time Video Motion Transfer via Generative Time Series Modeling
Authors:
Tasmiah Haque,
Md. Asif Bin Syed,
Byungheon Jeong,
Xue Bai,
Sumit Mohan,
Somdyuti Paul,
Imtiaz Ahmed,
Srinjoy Das
Abstract:
We propose a deep learning framework designed to significantly optimize bandwidth for motion-transfer-enabled video applications, including video conferencing, virtual reality interactions, health monitoring systems, and vision-based real-time anomaly detection. To capture complex motion effectively, we utilize the First Order Motion Model (FOMM), which encodes dynamic objects by detecting keypoints and their associated local affine transformations. These keypoints are identified using a self-supervised keypoint detector and arranged into a time series corresponding to the successive frames. Forecasting is performed on these keypoints by integrating two advanced generative time series models into the motion transfer pipeline, namely the Variational Recurrent Neural Network (VRNN) and the Gated Recurrent Unit with Normalizing Flow (GRU-NF). The predicted keypoints are subsequently synthesized into realistic video frames using an optical flow estimator paired with a generator network, thereby facilitating accurate video forecasting and enabling efficient, low-frame-rate video transmission. We validate our results across three datasets for video animation and reconstruction using the following metrics: Mean Absolute Error, Joint Embedding Predictive Architecture Embedding Distance, Structural Similarity Index, and Average Pair-wise Displacement. Our results confirm that by utilizing the superior reconstruction property of the Variational Autoencoder, the VRNN integrated FOMM excels in applications involving multi-step ahead forecasts such as video conferencing. On the other hand, by leveraging the Normalizing Flow architecture for exact likelihood estimation, and enabling efficient latent space sampling, the GRU-NF based FOMM exhibits superior capabilities for producing diverse future samples while maintaining high visual quality for tasks like real-time video-based anomaly detection.
Submitted 7 April, 2025;
originally announced April 2025.
-
Large-Scale AI in Telecom: Charting the Roadmap for Innovation, Scalability, and Enhanced Digital Experiences
Authors:
Adnan Shahid,
Adrian Kliks,
Ahmed Al-Tahmeesschi,
Ahmed Elbakary,
Alexandros Nikou,
Ali Maatouk,
Ali Mokh,
Amirreza Kazemi,
Antonio De Domenico,
Athanasios Karapantelakis,
Bo Cheng,
Bo Yang,
Bohao Wang,
Carlo Fischione,
Chao Zhang,
Chaouki Ben Issaid,
Chau Yuen,
Chenghui Peng,
Chongwen Huang,
Christina Chaccour,
Christo Kurisummoottil Thomas,
Dheeraj Sharma,
Dimitris Kalogiros,
Dusit Niyato,
Eli De Poorter
, et al. (110 additional authors not shown)
Abstract:
This white paper discusses the role of large-scale AI in the telecommunications industry, with a specific focus on the potential of generative AI to revolutionize network functions and user experiences, especially in the context of 6G systems. It highlights the development and deployment of Large Telecom Models (LTMs), which are tailored AI models designed to address the complex challenges faced by modern telecom networks. The paper covers a wide range of topics, from the architecture and deployment strategies of LTMs to their applications in network management, resource allocation, and optimization. It also explores the regulatory, ethical, and standardization considerations for LTMs, offering insights into their future integration into telecom infrastructure. The goal is to provide a comprehensive roadmap for the adoption of LTMs to enhance scalability, performance, and user-centric innovation in telecom networks.
Submitted 6 March, 2025;
originally announced March 2025.
-
Fully Programmable Spatial Photonic Ising Machine by Focal Plane Division
Authors:
Daniele Veraldi,
Davide Pierangeli,
Silvia Gentilini,
Marcello Calvanese Strinati,
Jason Sakellariou,
James S. Cummins,
Airat Kamaletdinov,
Marvin Syed,
Richard Zhipeng Wang,
Natalia G. Berloff,
Dimitrios Karanikolopoulos,
Pavlos G. Savvidis,
Claudio Conti
Abstract:
Ising machines are an emerging class of hardware that promises ultrafast and energy-efficient solutions to NP-hard combinatorial optimization problems. Spatial photonic Ising machines (SPIMs) exploit optical computing in free space to accelerate the computation, showcasing parallelism, scalability, and low power consumption. However, current SPIMs can implement only a restricted class of problems. This partial programmability is a critical limitation that hampers their benchmark. Achieving full programmability of the device while preserving its scalability is an open challenge. Here, we report a fully programmable SPIM achieved through a novel operation method based on the division of the focal plane. In our scheme, a general Ising problem is decomposed into a set of Mattis Hamiltonians, whose energies are simultaneously computed optically by measuring the intensity on different regions of the camera sensor. Exploiting this concept, we experimentally demonstrate the computation with high success probability of ground-state solutions of up to 32-spin Ising models on unweighted maximum cut graphs with and without ferromagnetic bias. Simulations of the hardware prove a favorable scaling of the accuracy with the number of spins. Our fully programmable SPIM enables the implementation of many quadratic unconstrained binary optimization problems, further establishing SPIMs as a leading paradigm in non von Neumann hardware.
Submitted 14 October, 2024;
originally announced October 2024.
-
ReasonPlanner: Enhancing Autonomous Planning in Dynamic Environments with Temporal Knowledge Graphs and LLMs
Authors:
Minh Pham Dinh,
Munira Syed,
Michael G Yankoski,
Trenton W. Ford
Abstract:
Planning and performing interactive tasks, such as conducting experiments to determine the melting point of an unknown substance, is straightforward for humans but poses significant challenges for autonomous agents. We introduce ReasonPlanner, a novel generalist agent designed for reflective thinking, planning, and interactive reasoning. This agent leverages LLMs to plan hypothetical trajectories by building a World Model based on a Temporal Knowledge Graph. The agent interacts with the environment using a natural language actor-critic module, where the actor translates the imagined trajectory into a sequence of actionable steps, and the critic determines if replanning is necessary. ReasonPlanner significantly outperforms previous state-of-the-art prompting-based methods on the ScienceWorld benchmark by more than 1.8 times, while being more sample-efficient and interpretable. It relies solely on frozen weights thus requiring no gradient updates. ReasonPlanner can be deployed and utilized without specialized knowledge of Machine Learning, making it accessible to a wide range of users.
Submitted 11 October, 2024;
originally announced October 2024.
-
Microservice Vulnerability Analysis: A Literature Review with Empirical Insights
Authors:
Raveen Kanishka Jayalath,
Hussain Ahmad,
Diksha Goel,
Muhammad Shuja Syed,
Faheem Ullah
Abstract:
Microservice architectures are revolutionizing both small businesses and large corporations, igniting a new era of innovation with their exceptional advantages in maintainability, reusability, and scalability. However, these benefits come with significant security challenges, as the increased complexity of service interactions, expanded attack surfaces, and intricate dependency management introduce a new array of cybersecurity vulnerabilities. While security concerns are mounting, there is a lack of comprehensive research that integrates a review of existing knowledge with empirical analysis of microservice vulnerabilities. This study aims to fill this gap by gathering, analyzing, and synthesizing existing literature on security vulnerabilities associated with microservice architectures. Through a thorough examination of 62 studies, we identify, analyze, and report 126 security vulnerabilities inherent in microservice architectures. This comprehensive analysis enables us to (i) propose a taxonomy that categorizes microservice vulnerabilities based on the distinctive features of microservice architectures; (ii) conduct an empirical analysis by performing vulnerability scans on four diverse microservice benchmark applications using three different scanning tools to validate our taxonomy; and (iii) map our taxonomy vulnerabilities with empirically identified vulnerabilities, providing an in-depth vulnerability analysis at microservice, application, and scanning tool levels. Our study offers crucial guidelines for practitioners and researchers to advance both the state-of-the-practice and the state-of-the-art in securing microservice architectures.
Submitted 31 July, 2024;
originally announced August 2024.
-
Efficient Computation Using Spatial-Photonic Ising Machines: Utilizing Low-Rank and Circulant Matrix Constraints
Authors:
Richard Zhipeng Wang,
James S. Cummins,
Marvin Syed,
Nikita Stroev,
George Pastras,
Jason Sakellariou,
Symeon Tsintzos,
Alexis Askitopoulos,
Daniele Veraldi,
Marcello Calvanese Strinati,
Silvia Gentilini,
Davide Pierangeli,
Claudio Conti,
Natalia G. Berloff
Abstract:
We explore the potential of spatial-photonic Ising machines (SPIMs) to address computationally intensive Ising problems that employ low-rank and circulant coupling matrices. Our results indicate that the performance of SPIMs is critically affected by the rank and precision of the coupling matrices. By developing and assessing advanced decomposition techniques, we expand the range of problems SPIMs can solve, overcoming the limitations of traditional Mattis-type matrices. Our approach accommodates a diverse array of coupling matrices, including those with inherently low ranks, applicable to complex NP-complete problems. We explore the practical benefits of low-rank approximation in optimization tasks, particularly in financial optimization, to demonstrate the real-world applications of SPIMs. Finally, we evaluate the computational limitations imposed by SPIM hardware precision and suggest strategies to optimize the performance of these systems within these constraints.
Submitted 3 June, 2024;
originally announced June 2024.
-
Predictive Health Analysis in Industry 5.0: A Scientometric and Systematic Review of Motion Capture in Construction
Authors:
Md Hadisur Rahman,
Md Rabiul Hasan,
Nahian Ismail Chowdhury,
Md Asif Bin Syed,
Mst Ummul Farah
Abstract:
In an era of rapid technological advancement, the rise of Industry 4.0 has prompted industries to pursue innovative improvements in their processes. As we advance towards Industry 5.0, which focuses more on collaboration between humans and intelligent systems, there is a growing requirement for better sensing technologies for healthcare and safety purposes. Consequently, Motion Capture (MoCap) systems have emerged as critical enablers in this technological evolution by providing unmatched precision and versatility in various workplaces, including construction. As the construction workplace requires physically demanding tasks, leading to work-related musculoskeletal disorders (WMSDs) and health issues, the study explores the increasing relevance of MoCap systems within the concept of Industry 4.0 and 5.0. Despite this growing significance, comprehensive research is lacking, in particular a scientometric review that quantitatively assesses the role of MoCap systems in construction. Our study combines bibliometric, scientometric, and systematic review approaches to address this gap, analyzing articles sourced from the Scopus database. A total of 52 papers were carefully selected from a pool of 962 papers for a quantitative study using a scientometric approach and a qualitative, in-depth examination. Results showed that MoCap systems are employed to improve worker health and safety and reduce occupational hazards. The in-depth study also finds that the most tested construction tasks are masonry, lifting, training, and climbing, with a clear preference for markerless systems.
Submitted 22 January, 2024;
originally announced February 2024.
-
ANNA: A Deep Learning Based Dataset in Heterogeneous Traffic for Autonomous Vehicles
Authors:
Mahedi Kamal,
Tasnim Fariha,
Afrina Kabir Zinia,
Md. Abu Syed,
Fahim Hasan Khan,
Md. Mahbubur Rahman
Abstract:
Recent breakthroughs in artificial intelligence offer tremendous promise for the development of self-driving applications. Deep Neural Networks, in particular, are being utilized to support the operation of semi-autonomous cars through object identification and semantic segmentation. To assess the inadequacy of the current dataset in the context of autonomous and semi-autonomous cars, we created a new dataset named ANNA. This study discusses a custom-built dataset that includes some unidentified vehicles in the perspective of Bangladesh, which are not included in the existing dataset. A dataset validity check was performed by evaluating models using the Intersection Over Union (IOU) metric. The results demonstrated that the model trained on our custom dataset was more precise and efficient than the models trained on the KITTI or COCO dataset concerning Bangladeshi traffic. The research presented in this paper also emphasizes the importance of developing accurate and efficient object detection algorithms for the advancement of autonomous vehicles.
Submitted 20 January, 2024;
originally announced January 2024.
-
Analysis of the User Perception of Chatbots in Education Using A Partial Least Squares Structural Equation Modeling Approach
Authors:
Md Rabiul Hasan,
Nahian Ismail Chowdhury,
Md Hadisur Rahman,
Md Asif Bin Syed,
JuHyeong Ryu
Abstract:
The integration of Artificial Intelligence (AI) into education is a recent development, with chatbots emerging as a noteworthy addition to this transformative landscape. As online learning platforms rapidly advance, students need to adapt swiftly to excel in this dynamic environment. Consequently, understanding the acceptance of chatbots, particularly those employing Large Language Models (LLMs) such as Chat Generative Pretrained Transformer (ChatGPT), Google Bard, and other interactive AI technologies, is of paramount importance. However, existing research on chatbots in education has overlooked key behavior-related aspects, such as Optimism, Innovativeness, Discomfort, Insecurity, Transparency, Ethics, Interaction, Engagement, and Accuracy, creating a significant literature gap. To address this gap, this study employs Partial Least Squares Structural Equation Modeling (PLS-SEM) to investigate the determinants of chatbot adoption in education among students, considering the Technology Readiness Index (TRI) and Technology Acceptance Model (TAM). Utilizing a five-point Likert scale for data collection, we gathered a total of 185 responses, which were analyzed using R-Studio software. We established 12 hypotheses to achieve the study's objectives. The results showed that Optimism and Innovativeness are positively associated with Perceived Ease of Use (PEOU) and Perceived Usefulness (PU). Conversely, Discomfort and Insecurity negatively impact PEOU, with only Insecurity negatively affecting PU. These findings provide insights for future technology designers, elucidating critical user behavior factors influencing chatbot adoption and utilization in educational contexts.
Submitted 6 November, 2023;
originally announced November 2023.
-
Comparative Evaluation of Transfer Learning for Classification of Brain Tumor Using MRI
Authors:
Abu Kaisar Mohammad Masum,
Nusrat Badhon,
S. M. Saiful Islam Badhon,
Nushrat Jahan Ria,
Sheikh Abujar,
Muntaser Mansur Syed,
Naveed Mahmud
Abstract:
Abnormal growth of cells in the brain and its surrounding tissues is known as a brain tumor. There are two types: benign (non-cancerous) and malignant (cancerous), the latter of which may cause death. The radiologists' ability to diagnose malignancies is greatly aided by magnetic resonance imaging (MRI). Brain cancer diagnosis has been considerably expedited by the field of computer-assisted diagnostics, especially in machine learning and deep learning. In our study, we categorize three different kinds of brain tumors using four transfer learning techniques. Our models were tested on a benchmark dataset of $3064$ MRI pictures representing three different forms of brain cancer. Notably, ResNet-50 outperformed other models with a remarkable accuracy of $99.06\%$. We stress the significance of a balanced dataset for improving accuracy without the use of augmentation methods. Additionally, we experimentally demonstrate our method and compare it with other classification algorithms on the CE-MRI dataset using evaluation metrics such as F1-score, AUC, precision, and recall.
Submitted 23 September, 2023;
originally announced October 2023.
-
ML Algorithm Synthesizing Domain Knowledge for Fungal Spores Concentration Prediction
Authors:
Md Asif Bin Syed,
Azmine Toushik Wasi,
Imtiaz Ahmed
Abstract:
The pulp and paper manufacturing industry requires precise quality control to ensure pure, contaminant-free end products suitable for various applications. Fungal spore concentration is a crucial metric that affects paper usability, and current testing methods are labor-intensive with delayed results, hindering real-time control strategies. To address this, a machine learning algorithm utilizing time-series data and domain knowledge was proposed. The optimal model employed Ridge Regression achieving an MSE of 2.90 on training and validation data. This approach could lead to significant improvements in efficiency and sustainability by providing real-time predictions for fungal spore concentrations. This paper showcases a promising method for real-time fungal spore concentration prediction, enabling stringent quality control measures in the pulp-and-paper industry.
Submitted 23 September, 2023;
originally announced September 2023.
-
Cardiovascular Disease Risk Prediction via Social Media
Authors:
Al Zadid Sultan Bin Habib,
Md Asif Bin Syed,
Md Tanvirul Islam,
Donald A. Adjeroh
Abstract:
Researchers use Twitter and sentiment analysis to predict Cardiovascular Disease (CVD) risk. We developed a new dictionary of CVD-related keywords by analyzing emotions expressed in tweets. Tweets from eighteen US states, including the Appalachian region, were collected. Using the VADER model for sentiment analysis, users were classified as potentially at CVD risk. Machine Learning (ML) models were employed to classify individuals' CVD risk and applied to a CDC dataset with demographic information to make the comparison. Performance evaluation metrics such as Test Accuracy, Precision, Recall, F1 score, Matthews Correlation Coefficient (MCC), and Cohen's Kappa (CK) score were considered. Results demonstrated that analyzing tweets' emotions surpassed the predictive power of demographic data alone, enabling the identification of individuals at potential risk of developing CVD. This research highlights the potential of Natural Language Processing (NLP) and ML techniques in using tweets to identify individuals with CVD risks, providing an alternative approach to traditional demographic information for public health monitoring.
Submitted 28 September, 2023; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Predicting Real-time Crash Risks during Hurricane Evacuation Using Connected Vehicle Data
Authors:
Zaheen E Muktadi Syed,
Samiul Hasan
Abstract:
Hurricane evacuation, ordered to save lives of people of coastal regions, generates high traffic demand with increased crash risk. To mitigate such risk, transportation agencies need to anticipate highway locations with high crash risks to deploy appropriate countermeasures. With ubiquitous sensors and communication technologies, it is now possible to retrieve micro-level vehicular data containing individual vehicle trajectory and speed information. Such high-resolution vehicle data, potentially available in real time, can be used to assess prevailing traffic safety conditions. Using vehicle speed and acceleration profiles, potential crash risks can be predicted in real time. Previous studies on real-time crash risk prediction mainly used data from infrastructure-based sensors which may not cover many road segments. In this paper, we present methods to determine potential crash risks during hurricane evacuation from an emerging alternative data source known as connected vehicle data. Such data contain vehicle location, speed, and acceleration information collected at a very high frequency (less than 30 seconds). To predict potential crash risks, we utilized a dataset collected during the evacuation period of Hurricane Ida on Interstate-10 (I-10) in the state of Louisiana. Multiple machine learning models were trained considering weather features and different traffic characteristics extracted from the connected vehicle data in 5-minute intervals. The results indicate that the Gaussian Process Boosting (GPBoost) and Extreme Gradient Boosting (XGBoost) models perform better (recall = 0.91) than other models. The real-time connected vehicle data for crash risks assessment will allow traffic managers to efficiently utilize resources to proactively take safety measures.
Submitted 14 June, 2023;
originally announced June 2023.
-
Leveraging Natural Language Processing For Public Health Screening On YouTube: A COVID-19 Case Study
Authors:
Ahrar Bin Aslam,
Zafi Sherhan Syed,
Muhammad Faiz Khan,
Asghar Baloch,
Muhammad Shehram Shah Syed
Abstract:
Background: Social media platforms have become a viable source of medical information, with patients and healthcare professionals using them to share health-related information and track diseases. Similarly, YouTube, the largest video-sharing platform in the world, contains vlogs where individuals talk about their illnesses. The aim of our study was to investigate the use of Natural Language Processing (NLP) to identify the spoken content of YouTube vlogs related to the diagnosis of Coronavirus disease of 2019 (COVID-19) for public health screening. Methods: COVID-19 videos on YouTube were searched using relevant keywords. A total of 1000 English-language videos were downloaded, of which 791 were classified as vlogs, 192 were non-vlogs, and 17 were deleted by the channel. The videos were converted into a textual format using Microsoft Streams. The textual data was preprocessed using basic and advanced preprocessing methods. A lexicon of 200 words related to COVID-19 was created. The data was analyzed using topic modeling, word clouds, and lexicon matching. Results: The word cloud results revealed discussions about COVID-19 symptoms like "fever", along with generic terms such as "mask" and "isolation". Lexical analysis demonstrated that in 96.46% of videos, patients discussed generic terms, and in 95.45% of videos, people talked about COVID-19 symptoms. LDA Topic Modeling results also generated topics that successfully captured key themes and content related to our investigation of COVID-19 diagnoses in YouTube vlogs. Conclusion: By leveraging NLP techniques on YouTube vlogs, public health practitioners can enhance their ability to mitigate the effects of pandemics and effectively respond to public health challenges.
Submitted 1 June, 2023;
originally announced June 2023.
-
Multi model LSTM architecture for Track Association based on Automatic Identification System Data
Authors:
Md Asif Bin Syed,
Imtiaz Ahmed
Abstract:
For decades, track association has been a challenging problem in marine surveillance, which involves the identification and association of vessel observations over time. However, the Automatic Identification System (AIS) has provided a new opportunity for researchers to tackle this problem by offering a large database of dynamic and geo-spatial information of marine vessels. With the availability of such large databases, researchers can now develop sophisticated models and algorithms that leverage the increased availability of data to address the track association challenge effectively. Furthermore, with the advent of deep learning, track association can now be approached as a data-intensive problem. In this study, we propose a Long Short-Term Memory (LSTM) based multi-model framework for track association. LSTM is a recurrent neural network architecture that is capable of processing multivariate temporal data collected over time in a sequential manner, enabling it to predict current vessel locations from historical observations. Based on these predictions, a geodesic distance based similarity metric is then utilized to associate the unclassified observations to their true tracks (vessels). We evaluate the performance of our approach using standard performance metrics, such as precision, recall, and F1 score, which provide a comprehensive summary of the accuracy of the proposed framework.
Submitted 3 April, 2023;
originally announced April 2023.
-
A CNN-LSTM Architecture for Marine Vessel Track Association Using Automatic Identification System (AIS) Data
Authors:
Md Asif Bin Syed,
Imtiaz Ahmed
Abstract:
In marine surveillance, distinguishing between normal and anomalous vessel movement patterns is critical for identifying potential threats in a timely manner. Once detected, it is important to monitor and track these vessels until a necessary intervention occurs. To achieve this, track association algorithms are used, which take sequential observations comprising geological and motion parameters of the vessels and associate them with respective vessels. The spatial and temporal variations inherent in these sequential observations make the association task challenging for traditional multi-object tracking algorithms. Additionally, the presence of overlapping tracks and missing data can further complicate the trajectory tracking process. To address these challenges, in this study, we approach this tracking task as a multivariate time series problem and introduce a 1D CNN-LSTM architecture-based framework for track association. This special neural network architecture can capture the spatial patterns as well as the long-term temporal relations that exist among the sequential observations. During the training process, it learns and builds the trajectory for each of these underlying vessels. Once trained, the proposed framework takes the marine vessel's location and motion data collected through the Automatic Identification System (AIS) as input and returns the most likely vessel track as output in real-time. To evaluate the performance of our approach, we utilize an AIS dataset containing observations from 327 vessels traveling in a specific geographic region. We measure the performance of our proposed framework using standard performance metrics such as accuracy, precision, recall, and F1 score. When compared with other competitive neural network architectures our approach demonstrates a superior tracking performance.
Submitted 6 June, 2023; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Classification of Vocal Bursts for ACII 2022 A-VB-Type Competition using Convolutional Neural Networks and Deep Acoustic Embeddings
Authors:
Muhammad Shehram Shah Syed,
Zafi Sherhan Syed,
Abbas Syed
Abstract:
This report provides a brief description of our proposed solution for the Vocal Burst Type classification task of the ACII 2022 Affective Vocal Bursts (A-VB) Competition. We experimented with two approaches as part of our solution. The first is based on convolutional neural networks trained on Mel spectrograms, and the second is based on average pooling of deep acoustic embeddings from a pretrained wav2vec2 model. Our best-performing model achieves an unweighted average recall (UAR) of 0.5190 on the test partition, compared to the chance-level UAR of 0.1250 and a baseline of 0.4172, an improvement of around 20% over the challenge baseline. The results reported in this document demonstrate the efficacy of our proposed approaches for the A-VB Type classification task.
Submitted 13 October, 2022; v1 submitted 29 September, 2022;
originally announced September 2022.
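The UAR metric quoted in the abstract above is the mean of per-class recall, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that computation, not the authors' evaluation code: the function name and the toy labels are hypothetical, and the eight-class count is an assumption inferred from the reported chance level of 0.1250.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recall (macro-averaged recall)."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Average the per-class recalls with equal weight per class.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy labels, not data from the competition.
y_true = ["laugh", "laugh", "laugh", "cry", "gasp", "gasp"]
y_pred = ["laugh", "laugh", "cry", "cry", "gasp", "laugh"]
uar = unweighted_average_recall(y_true, y_pred)

# Assumption: eight classes, which gives a chance-level UAR of 1/8.
assumed_num_classes = 8
chance_level = 1 / assumed_num_classes
```

In the toy example the per-class recalls are 2/3, 1, and 1/2, so the UAR is their plain average; a classifier that always predicts one class scores recall 1 on that class and 0 on the rest, which is where the 1/8 chance level comes from under the eight-class assumption.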
-
Toward Ubiquitous and Flexible Coverage of UAV-IRS-Assisted NOMA Networks
Authors:
Chun-Hung Liu,
Md Asif Syed,
Lu Wei
Abstract:
This paper studies how to achieve a high and flexible coverage performance of a large-scale cellular network that enables unmanned aerial vehicles (UAVs) for non-orthogonal multiple access (NOMA) transmission to simultaneously serve multiple users. The considered cellular network consists of a tier of base stations and a tier of UAVs. Each UAV is mounted with an intelligent reflecting surface (IRS) in order to serve as an aerial IRS reflecting signals between a base station and a user in the network. All the UAVs in the network are deployed based on a newly proposed three-dimensional (3D) point process that leads to a tractable and accurate analysis of the association statistics, which is traditionally difficult to analyze due to the mobility of UAVs. In light of this, we are able to analyze the downlink coverage of UAV-IRS-assisted NOMA transmission for two users and derive the corresponding coverage probabilities. Our coverage analyses shed light on the optimal allocations of transmit power between NOMA users and UAVs to accomplish the goal of ubiquitous and flexible NOMA transmission. We also conduct numerical simulations to validate our coverage analytical results while demonstrating the improved coverage performance achieved by aerial IRSs.
Submitted 19 January, 2022; v1 submitted 10 October, 2021;
originally announced October 2021.
-
Long-Term, in-the-Wild Study of Feedback about Speech Intelligibility for K-12 Students Attending Class via a Telepresence Robot
Authors:
Matthew Rueben,
Mohammad Syed,
Emily London,
Mark Camarena,
Eunsook Shin,
Yulun Zhang,
Timothy S. Wang,
Thomas R. Groechel,
Rhianna Lee,
Maja J. Matarić
Abstract:
Telepresence robots offer presence, embodiment, and mobility to remote users, making them promising options for homebound K-12 students. It is difficult, however, for robot operators to know how well they are being heard in remote and noisy classroom environments. One solution is to estimate the operator's speech intelligibility to their listeners in order to provide feedback about it to the operator. This work contributes the first evaluation of a speech intelligibility feedback system for homebound K-12 students attending class remotely. In our four long-term, in-the-wild deployments we found that students speak at different volumes instead of adjusting the robot's volume, and that detailed audio calibration and network latency feedback are needed. We also contribute the first findings about the types and frequencies of multimodal comprehension cues given to homebound students by listeners in the classroom. By annotating and categorizing over 700 cues, we found that the most common cue modalities were conversation turn timing and verbal content. Conversation turn timing cues occurred more frequently overall, whereas verbal content cues contained more information and might be the most frequent modality for negative cues. Our work provides recommendations for telepresence systems that could intervene to ensure that remote users are being heard.
Submitted 23 August, 2021;
originally announced August 2021.
-
Social engineering: Concepts, Techniques and Security Countermeasures
Authors:
Adib Mohammed Syed
Abstract:
The purpose of this report is to research the topic called Social Engineering in Cyber Security and present the explanation of the meaning, concepts, techniques, and security countermeasures of Social Engineering based on factual academic research.
Submitted 23 June, 2021;
originally announced July 2021.
-
Generalized Latency Performance Estimation for Once-For-All Neural Architecture Search
Authors:
Muhtadyuzzaman Syed,
Arvind Akpuram Srinivasan
Abstract:
Neural Architecture Search (NAS) has enabled automated machine learning by streamlining the manual development of deep neural network architectures through a search space, search strategy, and performance estimation strategy. To address the need for multi-platform deployment of Convolutional Neural Network (CNN) models, Once-For-All (OFA) proposed to decouple training and search to deliver a one-shot model of sub-networks that are constrained to various accuracy-latency tradeoffs. We find that the performance estimation strategy for OFA's search generalizes poorly across hardware deployment platforms because it relies on single-hardware latency lookup tables that require a significant amount of time and manual effort to build beforehand. In this work, we demonstrate a framework for building latency predictors for neural network architectures to address the need for heterogeneous hardware support and remove the overhead of lookup tables altogether. We introduce two generalizability strategies: fine-tuning, which uses a base model trained on a specific hardware and NAS search space, and GPU-generalization, which trains a model on GPU hardware parameters such as Number of Cores, RAM Size, and Memory Bandwidth. With this, we provide a family of latency prediction models that achieve over 50% lower RMSE loss than ProxylessNAS. We also show that the use of these latency predictors matches, and in certain cases exceeds, the NAS performance of the lookup table baseline approach.
Submitted 3 January, 2021;
originally announced January 2021.
-
Planimation
Authors:
Gang Chen,
Yi Ding,
Hugo Edwards,
Chong Hin Chau,
Sai Hou,
Grace Johnson,
Mohammed Sharukh Syed,
Haoyuan Tang,
Yue Wu,
Ye Yan,
Gil Tidhar,
Nir Lipovetzky
Abstract:
Planimation is a modular and extensible open source framework to visualise sequential solutions of planning problems specified in PDDL. We introduce a preliminary declarative PDDL-like animation profile specification, expressive enough to synthesise animations of arbitrary initial states and goals of a benchmark with just a single profile.
Submitted 11 August, 2020;
originally announced August 2020.
-
A 3D Tractable Model for UAV-Enabled Cellular Networks With Multiple Antennas
Authors:
Chun-Hung Liu,
Di-Chun Liang,
Md Asif Syed,
Rung-Hung Gau
Abstract:
This paper aims to propose a three-dimensional (3D) point process model that can be employed to generally deploy unmanned aerial vehicles (UAVs) in a large-scale cellular network and tractably analyze the fundamental network-wide performances of the network. The proposed 3D point process is devised based on a 2D homogeneous marked Poisson point process (PPP) in which each point and its random mark uniquely correspond to the projection and the altitude of each point in the 3D point process, respectively. We study some of the important statistical properties of the proposed 3D point process and shed light on some crucial insights into these properties that facilitate the analyses of a UAV-enabled cellular network wherein all the UAVs equipped with multiple antennas are deployed by the proposed 3D point process to serve as aerial base stations. The salient features of the proposed 3D point process lie in its suitability in practical 3D channel modeling and tractability in analysis. The downlink coverage performances of the UAV-enabled cellular network are analyzed and found in neat expressions and their closed-form results for some special cases are also derived. Most importantly, their fundamental limits achieved by cell-free massive antenna array are characterized when coordinating all the UAVs to jointly perform non-coherent downlink transmission. Finally, numerical results are provided to validate some of the key findings in this paper.
Submitted 29 December, 2020; v1 submitted 19 July, 2020;
originally announced July 2020.
-
Calendar Graph Neural Networks for Modeling Time Structures in Spatiotemporal User Behaviors
Authors:
Daheng Wang,
Meng Jiang,
Munira Syed,
Oliver Conway,
Vishal Juneja,
Sriram Subramanian,
Nitesh V. Chawla
Abstract:
User behavior modeling is important for industrial applications such as demographic attribute prediction, content recommendation, and targeted advertising. Existing methods represent a behavior log as a sequence of adopted items and find sequential patterns; however, concrete location and time information in the behavior log, reflecting dynamic and periodic patterns jointly with the spatial dimension, can be useful for modeling users and predicting their characteristics. In this work, we propose a novel model based on graph neural networks for learning user representations from spatiotemporal behavior data. A behavior log comprises a sequence of sessions, and a session has a location, start time, end time, and a sequence of adopted items. Our model's architecture incorporates two networked structures. One is a tripartite network of items, sessions, and locations. The other is a hierarchical calendar network of hour, week, and weekday nodes. The model first aggregates embeddings of location and items into session embeddings via the tripartite network, and then generates user embeddings from the session embeddings via the calendar structure. The user embeddings preserve spatial patterns and temporal patterns of various periodicities (e.g., hourly, weekly, and weekday patterns). The model adopts the attention mechanism to capture complex interactions among the multiple patterns in user behaviors. Experiments on real datasets (i.e., clicks on news articles in a mobile app) show our approach outperforms strong baselines for predicting missing demographic attributes.
Submitted 17 July, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
-
The Challenges of Trace-Driven Wi-Fi Emulation
Authors:
Mohammad Imran Syed,
Renata Teixeira,
Sara Ayoubi,
Giulio Grassi
Abstract:
The Wi-Fi link is unpredictable, and it has never been easy to measure perfectly; there is always bound to be some bias. As wireless becomes the medium of choice, it is useful to capture Wi-Fi traces in order to evaluate, tune, and adapt different applications and protocols. Several methods have been used to experiment with different wireless conditions: simulation, experimentation, and trace-driven emulation. In this paper, we argue that trace-driven emulation is the most favorable approach. In the absence of a trace-driven emulation tool for Wi-Fi, we evaluate the state-of-the-art trace-driven emulation tool for cellular networks and identify issues for Wi-Fi: interference with concurrent traffic, interference with its own traffic if measurements are done on both uplink and downlink simultaneously, and packet loss. We provide a solid argument as to why this tool falls short of effectively capturing Wi-Fi traces. The outcome of our analysis guides us to propose a number of suggestions on how the existing tool can be tweaked to accurately capture Wi-Fi traces.
Submitted 18 November, 2024; v1 submitted 10 February, 2020;
originally announced February 2020.
-
Improved SVD-based Initialization for Nonnegative Matrix Factorization using Low-Rank Correction
Authors:
Atif Muhammad Syed,
Sameer Qazi,
Nicolas Gillis
Abstract:
Due to the iterative nature of most nonnegative matrix factorization (NMF) algorithms, initialization is a key aspect as it significantly influences both the convergence and the final solution obtained. Many initialization schemes have been proposed for NMF, among which one of the most popular classes of methods is based on the singular value decomposition (SVD). However, these SVD-based initializations do not satisfy a rather natural condition, namely that the error should decrease as the rank of factorization increases. In this paper, we propose a novel SVD-based NMF initialization to specifically address this shortcoming by taking into account the SVD factors that were discarded to obtain a nonnegative initialization. This method, referred to as nonnegative SVD with low-rank correction (NNSVD-LRC), allows us to significantly reduce the initial error at a negligible additional computational cost using the low-rank structure of the discarded SVD factors. NNSVD-LRC has two other advantages compared to previous SVD-based initializations: (1) it provably generates sparse initial factors, and (2) it is faster as it only requires computing a truncated SVD of rank $\lceil r/2 + 1 \rceil$ where $r$ is the factorization rank of the sought NMF decomposition (as opposed to a rank-$r$ truncated SVD for other methods). We show on several standard dense and sparse data sets that our new method competes favorably with state-of-the-art SVD-based initializations for NMF.
Submitted 11 July, 2018;
originally announced July 2018.
-
An Implementation of Web Services for Inter-Connectivity of Information Systems
Authors:
Aftab Ahmed Chandio,
Dingju Zhu,
Ali Hassan Sodhro,
Muhammad Umer Syed
Abstract:
As educational institutions and their departments rapidly increase, communication between their end-users becomes more and more difficult in traditional online management systems (OMS). The end-users, i.e., employees, teaching staff, and students, are associated with different sub-domains and use different subsystems that are executed on different platforms following different administrative policies. Because their intercommunication is not automatically integrated, the overall efficiency of the system is degraded and the communication time is increased. Therefore, a technique for better interoperability and automated integration of departments is urgently needed. Many existing systems, such as the system of the University of Sindh (UoS), do not yet have a set of connections. In this paper, we propose a system for the UoS, named integration of inter-connectivity of information system (i3), based on service oriented architecture (SOA) with web services. The system i3 monitors and exchanges students' information in support of verification across its heterogeneous and decentralized environment. Moreover, the proposed system provides interoperability between the subsystems that are deployed in different departments of UoS and use different programming languages and database management systems (DBMS).
Submitted 31 July, 2014;
originally announced July 2014.