-
Consistent and Compatible Modelling of Cyber Intrusions and Incident Response Demonstrated in the Context of Malware Attacks on Critical Infrastructure
Authors:
Peter Maynard,
Yulia Cherdantseva,
Avi Shaked,
Pete Burnap,
Arif Mehmood
Abstract:
Cyber Security Incident Response (IR) Playbooks are used to capture the steps required to recover from a cyber intrusion. Individual IR playbooks should focus on a specific type of incident and be aligned with the architecture of a system under attack. Intrusion modelling focuses on a specific potential cyber intrusion and is used to identify where and what countermeasures are needed, and the resulting intrusion models are expected to be used in effective IR, ideally by feeding IR playbook designs. IR playbooks and intrusion models, however, are created in isolation and at varying stages of the system's lifecycle. We take nine critical national infrastructure intrusion models - expressed using Sequential AND Attack Trees - and transform them into models of the same format as IR playbooks. We use Security Modelling Framework for modelling attacks and playbooks, and for demonstrating the feasibility of better integration between risk assessment and IR at the modelling level. This results in improved intrusion models and tighter coupling between IR playbooks and threat modelling which - as we demonstrate - yields novel insights into the analysis of attacks and response actions. The main contributions of this paper are (a) a novel way of representing attack trees using the Security Modelling Framework, (b) a new tool for converting Sequential AND attack trees into models compatible with playbooks, and (c) examples of nine intrusion models represented using the Security Modelling Framework.
Submitted 22 May, 2025;
originally announced May 2025.
-
A Novel Channel Boosted Residual CNN-Transformer with Regional-Boundary Learning for Breast Cancer Detection
Authors:
Aamir Mehmood,
Yue Hu,
Saddam Hussain Khan
Abstract:
Recent advancements in detecting tumors using deep learning on breast ultrasound images (BUSI) have demonstrated significant success. Deep CNNs and vision-transformers (ViTs) have demonstrated individually promising initial performance. However, challenges related to model complexity and contrast, texture, and tumor morphology variations introduce uncertainties that hinder the effectiveness of current methods. This study introduces a novel hybrid framework, CB-Res-RBCMT, combining customized residual CNNs and new ViT components for detailed BUSI cancer analysis. The proposed RBCMT uses stem convolution blocks with CNN Meet Transformer (CMT) blocks, followed by new Regional and boundary (RB) feature extraction operations for capturing contrast and morphological variations. Moreover, the CMT block incorporates global contextual interactions through multi-head attention, enhancing computational efficiency with a lightweight design. Additionally, the customized inverse residual and stem CNNs within the CMT effectively extract local texture information and handle vanishing gradients. Finally, the new channel-boosted (CB) strategy enriches the feature diversity of the limited dataset by combining the original RBCMT channels with transfer learning-based residual CNN-generated maps. These diverse channels are processed through a spatial attention block for optimal pixel selection, reducing redundancy and improving the discrimination of minor contrast and texture variations. The proposed CB-Res-RBCMT achieves an F1-score of 95.57%, accuracy of 95.63%, sensitivity of 96.42%, and precision of 94.79% on the standard harmonized stringent BUSI dataset, outperforming existing ViT and CNN methods. These results demonstrate the versatility of our integrated CNN-Transformer framework in capturing diverse features and delivering superior performance in BUSI cancer diagnosis.
Submitted 19 March, 2025;
originally announced March 2025.
-
MetaphorShare: A Dynamic Collaborative Repository of Open Metaphor Datasets
Authors:
Joanne Boisson,
Arif Mehmood,
Jose Camacho-Collados
Abstract:
The metaphor studies community has developed numerous valuable labelled corpora in various languages over the years. Many of these resources are not only unknown to the NLP community, but are also often not easily shared among the researchers. Both in human sciences and in NLP, researchers could benefit from a centralised database of labelled resources, easily accessible and unified under an identical format. To facilitate this, we present MetaphorShare, a website to integrate metaphor datasets making them open and accessible. With this effort, our aim is to encourage researchers to share and upload more datasets in any language in order to facilitate metaphor studies and the development of future metaphor processing NLP systems. The website has four main functionalities: upload, download, search and label metaphor datasets. It is accessible at www.metaphorshare.com.
Submitted 10 March, 2025; v1 submitted 27 November, 2024;
originally announced November 2024.
-
A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media
Authors:
Ayaz Mehmood,
Muhammad Tayyab Zamir,
Muhammad Asif Ayub,
Nasir Ahmad,
Kashif Ahmad
Abstract:
Over the last decade, similar to other application domains, social media content has been proven very effective in disaster informatics. However, due to the unstructured nature of the data, several challenges are associated with disaster analysis in social media content. To fully explore the potential of social media content in disaster informatics, access to relevant content and the correct geo-location information is critical. In this paper, we propose a three-step solution to tackling these challenges. Firstly, the proposed solution aims to classify social media posts into relevant and irrelevant posts, followed by the automatic extraction of location information from the posts' text through Named Entity Recognition (NER) analysis. Finally, to quickly analyze the topics covered in large volumes of social media posts, we perform topic modeling, resulting in a list of top keywords that highlight the issues discussed in the tweets. For the Relevant Classification of Twitter Posts (RCTP), we proposed a merit-based fusion framework combining the capabilities of four different models, namely BERT, RoBERTa, DistilBERT, and ALBERT, obtaining the highest F1-score of 0.933 on a benchmark dataset. For the Location Extraction from Twitter Text (LETT), we evaluated four models, namely BERT, RoBERTa, DistilBERT, and Electra, in an NER framework, obtaining the highest F1-score of 0.960. For topic modeling, we used the BERTopic library to discover the hidden topic patterns in the relevant tweets. The experimental results of all the components of the proposed end-to-end solution are very encouraging and hint at the potential of social media content and NLP in disaster management.
Submitted 1 May, 2024;
originally announced May 2024.
-
A Python Framework for Neutrosophic Sets and Mappings
Authors:
Giorgio Nordo,
Saeid Jafari,
Arif Mehmood,
Bhimraj Basumatary
Abstract:
In this paper we present an open source framework developed in Python and consisting of three distinct classes designed to manipulate in a simple and intuitive way both symbolic representations of neutrosophic sets over universes of various types as well as mappings between them. The capabilities offered by this framework extend and generalize previous attempts to provide software solutions to the manipulation of neutrosophic sets such as those proposed by Salama et al., Saranya et al., El-Ghareeb, Topal et al. and Sleem. The code is described in detail and many examples and use cases are also provided.
Submitted 24 March, 2024;
originally announced April 2024.
-
Cuff-less Arterial Blood Pressure Waveform Synthesis from Single-site PPG using Transformer & Frequency-domain Learning
Authors:
Muhammad Wasim Nawaz,
Muhammad Ahmad Tahir,
Ahsan Mehmood,
Muhammad Mahboob Ur Rahman,
Kashif Riaz,
Qammer H. Abbasi
Abstract:
We develop and evaluate two novel purpose-built deep learning (DL) models for synthesis of the arterial blood pressure (ABP) waveform in a cuff-less manner, using a single-site photoplethysmography (PPG) signal. We train and evaluate our DL models on the data of 209 subjects from the public UCI dataset on cuff-less blood pressure (CLBP) estimation. Our transformer model consists of an encoder-decoder pair that incorporates positional encoding, multi-head attention, layer normalization, and dropout techniques for ABP waveform synthesis. Secondly, under our frequency-domain (FD) learning approach, we first obtain the discrete cosine transform (DCT) coefficients of the PPG and ABP signals, and then learn a linear/non-linear (L/NL) regression between them. The transformer model (FD L/NL model) synthesizes the ABP waveform with a mean absolute error (MAE) of 3.01 (4.23). Further, the synthesis of ABP waveform also allows us to estimate the systolic blood pressure (SBP) and diastolic blood pressure (DBP) values. To this end, the transformer model reports an MAE of 3.77 mmHg and 2.69 mmHg, for SBP and DBP, respectively. On the other hand, the FD L/NL method reports an MAE of 4.37 mmHg and 3.91 mmHg, for SBP and DBP, respectively. Both methods fulfill the AAMI criterion. As for the BHS criterion, our transformer model (FD L/NL regression model) achieves grade A (grade B).
Submitted 8 June, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
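The frequency-domain step in the abstract above (take DCT coefficients of a signal, then regress between coefficient sets) can be illustrated with a plain DCT-II and its inverse. This is only a minimal sketch, not the authors' code: the function names `dct2`, `idct2`, and `keep_first`, the unnormalized DCT convention, and the synthetic test signal are all assumptions made for illustration.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_t x[t] * cos(pi * (t + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]

def idct2(c):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(c)
    return [
        (c[0] / 2.0
         + sum(c[k] * math.cos(math.pi * (t + 0.5) * k / n) for k in range(1, n)))
        * 2.0 / n
        for t in range(n)
    ]

def keep_first(c, k):
    """Zero all but the first k coefficients (a low-frequency approximation)."""
    return [v if i < k else 0.0 for i, v in enumerate(c)]

# Hypothetical pulse-like signal standing in for one PPG window.
signal = [math.sin(2 * math.pi * t / 32) + 0.3 * math.sin(4 * math.pi * t / 32)
          for t in range(64)]
coeffs = dct2(signal)
smooth = idct2(keep_first(coeffs, 16))  # reconstruction from 16 coefficients
```

A regression model would then map the coefficients of one signal to those of another; that mapping is not shown here because the abstract does not specify it.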
-
Multi-class Network Intrusion Detection with Class Imbalance via LSTM & SMOTE
Authors:
Muhammad Wasim Nawaz,
Rashid Munawar,
Ahsan Mehmood,
Muhammad Mahboob Ur Rahman,
Qammer H. Abbasi
Abstract:
Monitoring network traffic to maintain the quality of service (QoS) and to detect network intrusions in a timely and efficient manner is essential. As network traffic is sequential, recurrent neural networks (RNNs) such as long short-term memory (LSTM) are suitable for building network intrusion detection systems. However, in the case of a few dataset examples of the rare attack types, even these networks perform poorly. This paper proposes to use oversampling techniques along with appropriate loss functions to handle class imbalance for the detection of various types of network intrusions. Our deep learning model employs LSTM with fully connected layers to perform multi-class classification of network attacks. We enhance the representation of minority classes: i) through the application of the Synthetic Minority Over-sampling Technique (SMOTE), and ii) by employing categorical focal cross-entropy loss to apply a focal factor to down-weight examples of the majority classes and focus more on hard examples of the minority classes. Extensive experiments on KDD99 and CICIDS2017 datasets show promising results in detecting network intrusions (with many rare attack types, e.g., Probe, R2L, DDoS, PortScan, etc.).
Submitted 3 October, 2023;
originally announced October 2023.
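The focal factor mentioned in the abstract above can be sketched for a single example as follows. This is an illustrative version of categorical focal cross-entropy, not the authors' implementation: the function name and the default values `gamma=2.0` and `alpha=0.25` are assumptions.

```python
import math

def categorical_focal_loss(y_true, y_prob, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal cross-entropy for one example.

    y_true: one-hot label list; y_prob: predicted class probabilities.
    The factor (1 - p) ** gamma shrinks the loss of confidently correct
    predictions, so hard examples contribute relatively more.
    """
    loss = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        loss += -alpha * t * (1.0 - p) ** gamma * math.log(p)
    return loss

# An easy example (true class predicted with 0.90) versus a hard one (0.30).
easy = categorical_focal_loss([0, 1, 0], [0.05, 0.90, 0.05])
hard = categorical_focal_loss([0, 1, 0], [0.60, 0.30, 0.10])
```

With `gamma=0` and `alpha=1` the expression reduces to ordinary categorical cross-entropy, which makes the role of the focal factor easy to check.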
-
Go Together: Bridging the Gap between Learners and Teachers
Authors:
Asim Irfan,
Atif Nawaz,
Muhammad Turab,
Muhmmad Azeem,
Mashal Adnan,
Ahsan Mehmood,
Sarfaraz Ahmed,
Adnan Ashraf
Abstract:
After the pandemic, humanity has been facing different types of challenges. Social relationships, societal values, and academic and professional behavior have been hit the most. People are shifting their routines to social media and gadgets, and getting addicted to their isolation. This sudden change in their lives has caused an unusual social breakdown and endangered their mental health. In mid-2021, Pakistan's first Human Library was established under HelpingMind to overcome these effects. Despite online sessions and webinars, HelpingMind needs technology to reach the masses. In this work, we customized the UI or UX of a Go Together Mobile Application (GTMA) to meet the requirements of the client organization. A very interesting concept of the book (expert listener or psychologist) and the reader is introduced in GTMA. It offers separate dashboards, separate reviews or rating systems, booking, and venue information to engage the human reader with his or her favorite human book. The loyalty program enables the members to avail discounts through a mobile application and its membership is global where both the human-reader and human-books can register under the platform. The minimum viable product has been approved by our client organization.
Submitted 23 July, 2023;
originally announced August 2023.
-
Evaluation of a Low-Cost Single-Lead ECG Module for Vascular Ageing Prediction and Studying Smoking-induced Changes in ECG
Authors:
S. Anas Ali,
M. Saqib Niaz,
Mubashir Rehman,
Ahsan Mehmood,
M. Mahboob Ur Rahman,
Kashif Riaz,
Qammer H. Abbasi
Abstract:
Vascular age is traditionally measured using invasive methods or through 12-lead electrocardiogram (ECG). This paper utilizes a low-cost single-lead (lead-I) ECG module to predict the vascular age of an apparently healthy young person. In addition, we also study the impact of smoking on ECG traces of the light-but-habitual smokers. We begin by collecting (lead-I) ECG data from 42 apparently healthy subjects (smokers and non-smokers) aged 18 to 30 years, using our custom-built low-cost single-lead ECG module, and anthropometric data, e.g., body mass index, smoking status, blood pressure, etc. Under our proposed method, we first pre-process our dataset by denoising the ECG traces, followed by baseline drift removal, followed by z-score normalization. Next, we create another dataset by dividing the ECG traces into overlapping segments of five-second duration. We then feed both segmented and unsegmented datasets to a number of machine learning models, a 1D convolutional neural network, and ResNet18 model, for vascular ageing prediction. We also do transfer learning whereby we pre-train our models on a public PPG dataset, and later, fine-tune and evaluate them on our unsegmented ECG dataset. The random forest model outperforms all other models and previous works by achieving a mean squared error (MSE) of 0.07 and coefficient of determination R2 of 0.99, MSE of 3.56 and R2 of 0.26, MSE of 0.99 and R2 of 0.87, for segmented ECG dataset, for unsegmented ECG dataset, and for transfer learning scenario, respectively. Finally, we utilize the explainable AI framework to identify those ECG features that get affected due to smoking. This work is aligned with the sustainable development goals 3 and 10 of the United Nations which aim to provide low-cost but quality healthcare solutions to the unprivileged. This work also finds its applications in the broad domain of forensic science.
Submitted 25 November, 2024; v1 submitted 8 August, 2023;
originally announced August 2023.
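Two of the pre-processing steps named in the abstract above, z-score normalization and splitting a trace into overlapping five-second segments, can be sketched as below. The function names, the 100 Hz sampling rate, and the 50% overlap are assumptions; the abstract states only that segments overlap and last five seconds.

```python
import statistics

def zscore(signal):
    """Scale a trace to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        return [0.0 for _ in signal]  # constant trace: nothing to scale
    return [(x - mean) / std for x in signal]

def overlapping_segments(signal, fs_hz, window_s=5.0, overlap=0.5):
    """Split a trace into fixed-length windows that overlap by `overlap`."""
    win = int(round(window_s * fs_hz))
    step = max(1, int(round(win * (1.0 - overlap))))
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, step)]

# Hypothetical 20-second trace sampled at 100 Hz.
trace = [float((i * 37) % 101) for i in range(2000)]
segments = overlapping_segments(zscore(trace), fs_hz=100)
```

Trailing samples that do not fill a complete window are dropped in this sketch; padding them instead would be an equally reasonable choice.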
-
Permutation-Aware Action Segmentation via Unsupervised Frame-to-Segment Alignment
Authors:
Quoc-Huy Tran,
Ahmed Mehmood,
Muhammad Ahmed,
Muhammad Naufil,
Anas Zafar,
Andrey Konin,
M. Zeeshan Zia
Abstract:
This paper presents an unsupervised transformer-based framework for temporal activity segmentation which leverages not only frame-level cues but also segment-level cues. This is in contrast with previous methods which often rely on frame-level information only. Our approach begins with a frame-level prediction module which estimates framewise action classes via a transformer encoder. The frame-level prediction module is trained in an unsupervised manner via temporal optimal transport. To exploit segment-level information, we utilize a segment-level prediction module and a frame-to-segment alignment module. The former includes a transformer decoder for estimating video transcripts, while the latter matches frame-level features with segment-level features, yielding permutation-aware segmentation results. Moreover, inspired by temporal optimal transport, we introduce simple-yet-effective pseudo labels for unsupervised training of the above modules. Our experiments on four public datasets, i.e., 50 Salads, YouTube Instructions, Breakfast, and Desktop Assembly show that our approach achieves comparable or better performance than previous methods in unsupervised activity segmentation.
Submitted 26 October, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Your smartphone could act as a pulse-oximeter and as a single-lead ECG
Authors:
Ahsan Mehmood,
Asma Sarauji,
M. Mahboob Ur Rahman,
Tareq Y. Al-Naffouri
Abstract:
In the post-COVID-19 era, every new wave of the pandemic causes an increased concern among the masses to learn more about their state of well-being. Therefore, it is the need of the hour to come up with ubiquitous, low-cost, non-invasive tools for rapid and continuous monitoring of body vitals that reflect the status of one's overall health. Against this backdrop, this work proposes a deep learning approach to turn a smartphone, the popular hand-held personal gadget, into a diagnostic tool to measure/monitor the three most important body vitals, i.e., pulse rate (PR), blood oxygen saturation level (aka SpO2), and respiratory rate (RR). Furthermore, we propose another method that could extract a single-lead electrocardiograph (ECG) of the subject. The proposed methods include the following core steps: the subject records a short video of his/her fingertip by placing his/her finger on the rear camera of the smartphone, and the recorded video is pre-processed to extract the filtered and/or detrended video-photoplethysmography (vPPG) signal, which is then fed to custom-built convolutional neural networks (CNNs), which output the vitals (PR, SpO2, and RR) as well as a single-lead ECG of the subject. To be precise, the contribution of this paper is two-fold: 1) estimation of the three body vitals (PR, SpO2, RR) from the vPPG data using custom-built CNNs, a vision transformer, and, most importantly, the CLIP model; 2) a novel discrete cosine transform+feedforward neural network-based method that translates the recorded video-PPG signal to a single-lead ECG signal. The proposed method is anticipated to find its application in several use-case scenarios, e.g., remote healthcare, mobile health, fitness, sports, etc.
Submitted 21 May, 2023;
originally announced May 2023.
-
A Deep Learning & Fast Wavelet Transform-based Hybrid Approach for Denoising of PPG Signals
Authors:
Rabia Ahmed,
Ahsan Mehmood,
Muhammad Mahboob Ur Rahman,
Octavia A. Dobre
Abstract:
This letter presents a novel hybrid method that leverages deep learning to exploit the multi-resolution analysis capability of the wavelets, in order to denoise a photoplethysmography (PPG) signal. Under the proposed method, a noisy PPG sequence of length N is first decomposed into L detailed coefficients using the fast wavelet transform (FWT). Then, the clean PPG sequence is reconstructed as follows. A custom feedforward neural network (FFNN) provides the binary weights for each of the wavelet sub-signals outputted by the inverse-FWT block. This way, all those sub-signals which correspond to noise or artefacts are discarded during reconstruction. The FFNN is trained on the Beth Israel Deaconess Medical Center (BIDMC) dataset under the supervised learning framework, whereby we compute the mean squared-error (MSE) between the denoised sequence and the reference clean PPG signal, and compute the gradient of the MSE for the back-propagation. Numerical results show that the proposed method effectively denoises the corrupted PPG and video-PPG signal.
Submitted 16 January, 2023;
originally announced January 2023.
-
Energy Disaggregation & Appliance Identification in a Smart Home: Transfer Learning enables Edge Computing
Authors:
M. Hashim Shahab,
Hasan Mujtaba Buttar,
Ahsan Mehmood,
Waqas Aman,
M. Mahboob Ur Rahman,
M. Wasim Nawaz,
Haris Pervaiz,
Qammer H. Abbasi
Abstract:
Non-intrusive load monitoring (NILM) or energy disaggregation aims to extract the load profiles of individual consumer electronic appliances, given an aggregate load profile of the mains of a smart home. This work proposes a novel deep-learning and edge computing approach to solve the NILM problem and a few related problems as follows. 1) We build upon the reputed seq2-point convolutional neural network (CNN) model to come up with the proposed seq2-[3]-point CNN model to solve the (home) NILM problem and site-NILM problem (basically, NILM at a smaller scale). 2) We solve the related problem of appliance identification by building upon the state-of-the-art (pre-trained) 2D-CNN models, i.e., AlexNet, ResNet-18, and DenseNet-121, which are fine-tuned on two custom datasets that consist of wavelet and short-time Fourier transform (STFT)-based 2D electrical signatures of the appliances. 3) Finally, we do some basic qualitative inference about an individual appliance's health by comparing the power consumption of the same appliance across multiple homes. The low-frequency REDD dataset is used for all problems, except site-NILM, where the REFIT dataset is used. As for the results, we achieve a maximum accuracy of 94.6% for home-NILM, 81% for site-NILM, and 88.9% for appliance identification (with the ResNet-based model).
Submitted 14 March, 2024; v1 submitted 8 January, 2023;
originally announced January 2023.
-
Floods Relevancy and Identification of Location from Twitter Posts using NLP Techniques
Authors:
Muhammad Suleman,
Muhammad Asif,
Tayyab Zamir,
Ayaz Mehmood,
Jebran Khan,
Nasir Ahmad,
Kashif Ahmad
Abstract:
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related and non-relevant social posts, while LETT is a Named Entity Recognition (NER) task and aims at the extraction of location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
Submitted 31 December, 2022;
originally announced January 2023.
-
Towards Understanding Trends Manipulation in Pakistan Twitter
Authors:
Soufia Kausar,
Bilal Tahir,
Muhammad Amir Mehmood
Abstract:
The rapid adoption of online social media platforms has transformed the way of communication and interaction. On these platforms, discussions in the form of trending topics provide a glimpse of events happening around the world in real-time. Also, these trends are used for political campaigns, public awareness, and brand promotions. Consequently, these trends are sensitive to manipulation by malicious users who aim to mislead the mass audience. In this article, we identify and study the characteristics of users involved in the manipulation of Twitter trends in Pakistan. We propose 'Manipify', a framework for automatic detection and analysis of malicious users for Twitter trends. Our framework consists of three distinct modules: i) user classifier, ii) hashtag classifier, and iii) trend analyzer. The user classifier introduces a novel approach to automatically detect manipulators using tweet content and user behaviour features. Also, the module classifies human and bot users. Next, the hashtag classifier categorizes trending hashtags into six categories, assisting in examining manipulators' behaviour across different categories. Finally, the trend analyzer module examines users, hashtags, and tweets for hashtag reach, linguistic features and user behaviour. Our user classifier module achieves 0.91 accuracy in classifying the manipulators. We further test Manipify on a dataset comprising 665 trending hashtags with 5.4 million tweets and 1.9 million users. The analysis of trends reveals that the trending panel is mostly dominated by political hashtags. In addition, our results show a higher contribution of human accounts in trend manipulation as compared to bots. Furthermore, we present two case studies of hashtag-wars and anti-state propaganda to illustrate the real-world application of our research.
Submitted 30 September, 2021;
originally announced September 2021.
-
CURE: Collection for Urdu Information Retrieval Evaluation and Ranking
Authors:
Muntaha Iqbal,
Kamran Amjad,
Bilal Tahir,
Muhammad Amir Mehmood
Abstract:
Urdu is a widely spoken language with 163 million speakers worldwide. Information Retrieval (IR) for Urdu warrants special consideration from the research community due to its rich morphological features and large number of speakers. In general, the IR evaluation task has not been extensively explored for Urdu. The most important missing element is a standardized evaluation corpus specific to Urdu. In this research work, we propose and construct a standard test collection of Urdu documents for IR evaluation and name it Collection for Urdu Retrieval Evaluation (CURE). We select 1,096 unique documents against 50 diverse queries from a large collection of 0.5 million crawled documents using two IR models. The purpose of the test collection is the evaluation of IR models, ranking algorithms, and different natural language processing techniques. Next, we perform binary relevance judgment on the selected documents. We also built two other language resources for lemmatization and query expansion specific to our test collection. Evaluation of the test collection is carried out using four retrieval models as well as a stop-words list, lemmatization, and query expansion. Furthermore, error analysis was performed for each query with different NLP techniques. To the best of our knowledge, this work is the first attempt at preparing a standardized information retrieval evaluation test collection for the Urdu language.
Submitted 1 November, 2020;
originally announced November 2020.
-
Preventing Identity Attacks in RFID Backscatter Communication Systems: A Physical-Layer Approach
Authors:
Ahsan Mehmood,
Waqas Aman,
M. Mahboob Ur Rahman,
M. A. Imran,
Qammer H. Abbasi
Abstract:
This work considers an identity attack on a radio-frequency identification (RFID)-based backscatter communication system. Specifically, we consider a single-reader, single-tag RFID system whereby the reader and the tag undergo two-way signaling, which enables the reader to extract the tag ID in order to authenticate the legitimate tag (L-tag). We then consider a scenario whereby a malicious tag (M-tag), having the same ID as the L-tag programmed in its memory by a wizard, attempts to deceive the reader by pretending to be the L-tag. We counter the identity attack by exploiting the non-reciprocity of the end-to-end channel (i.e., the residual channel) between the reader and the tag as the fingerprint of the tag. The passive nature of the tag(s) (and thus the lack of any computational platform at the tag) implies that the proposed light-weight physical-layer authentication method is implemented at the reader. Concretely, in our proposed scheme, the reader acquires the raw data via a two-way (challenge-response) message exchange mechanism, performs least-squares estimation to extract the fingerprint, and performs binary hypothesis testing to authenticate the tag. We also provide closed-form expressions for the two error probabilities of interest (i.e., false alarm and missed detection). Simulation results attest to the efficacy of the proposed method.
Submitted 1 September, 2020;
originally announced September 2020.
-
SoftAdapt: Techniques for Adaptive Loss Weighting of Neural Networks with Multi-Part Loss Functions
Authors:
A. Ali Heydari,
Craig A. Thompson,
Asif Mehmood
Abstract:
Adaptive loss function formulation is an active area of research and has gained a great deal of popularity in recent years, following the success of deep learning. However, existing frameworks of adaptive loss functions often suffer from slow convergence and poor choice of weights for the loss components. Traditionally, the elements of a multi-part loss function are weighted equally or their weights are determined through heuristic approaches that yield near-optimal (or sub-optimal) results. To address this problem, we propose a family of methods, called SoftAdapt, that dynamically change function weights for multi-part loss functions based on live performance statistics of the component losses. SoftAdapt is mathematically intuitive, computationally efficient and straightforward to implement. In this paper, we present the mathematical formulation and pseudocode for SoftAdapt, along with results from applying our methods to image reconstruction (Sparse Autoencoders) and synthetic data generation (Introspective Variational Autoencoders).
Submitted 27 December, 2019;
originally announced December 2019.
-
Using PCA and Factor Analysis for Dimensionality Reduction of Bio-informatics Data
Authors:
M. Usman Ali,
Shahzad Ahmed,
Javed Ferzund,
Atif Mehmood,
Abbas Rehman
Abstract:
A large volume of genomics data is produced on a daily basis due to advances in sequencing technology. This data is of no value if it is not properly analysed. Different kinds of analytics are required to extract useful information from this raw data. Classification, prediction, clustering and pattern extraction are useful data mining techniques. These techniques require appropriate selection of data attributes to produce accurate results. However, bioinformatics data is high dimensional, usually having hundreds of attributes. Such a large number of attributes affects the performance of machine learning algorithms used for classification and prediction. Dimensionality reduction techniques are therefore required to reduce the number of attributes that can be further used for analysis. In this paper, Principal Component Analysis and Factor Analysis are used for dimensionality reduction of bioinformatics data. These techniques were applied to a Leukaemia data set and the number of attributes was reduced from to.
Submitted 22 July, 2017;
originally announced July 2017.
-
Modern Data Formats for Big Bioinformatics Data Analytics
Authors:
Shahzad Ahmed,
M. Usman Ali,
Javed Ferzund,
Muhammad Atif Sarwar,
Abbas Rehman,
Atif Mehmood
Abstract:
Next Generation Sequencing (NGS) technology has resulted in massive amounts of proteomics and genomics data. This data is of no use if it is not properly analyzed. ETL (Extraction, Transformation, Loading) is an important step in designing data analytics applications. ETL requires a proper understanding of the features of the data. The data format plays a key role in understanding and representing the data, and in determining the space required to store it, the data I/O during processing, the intermediate results of processing, the in-memory analysis of the data and the overall time required to process it. Different data mining and machine learning algorithms require input data in specific types and formats. This paper explores the data formats used by different tools and algorithms and also presents modern data formats that are used on Big Data platforms. It will help researchers and developers in choosing an appropriate data format for a particular tool or algorithm.
Submitted 5 May, 2017;
originally announced July 2017.
-
An exploratory study of the suitability of UML-based aspect modeling techniques with respect to their integration into Model-Driven Engineering context
Authors:
Abid Mehmood,
Dayang N. A. Jawawi
Abstract:
The integration of aspect-oriented modeling approaches with the model-driven engineering process, achieved through their direct transformation to aspect-oriented code, is expected to enhance software development from many perspectives. However, since no aspect modeling technique has been adopted as the standard while code generation has to be fully dependent on the input model, it becomes imperative to compare all ubiquitous techniques on the basis of appropriate criteria. This study aims to assess existing UML-based aspect-oriented modeling techniques from the perspective of their suitability with regard to integration into the model-driven engineering process through aspect-oriented code generation. We defined an evaluation framework and employed it to evaluate 14 well-published, UML-based aspect-oriented modeling approaches. Further, based on the comparison results, we selected 2 modeling approaches, Reusable Aspect Models and Theme/UML, and proceeded to evaluate them in detail from the specific perspectives of design and its mapping to the implementation code. Results of the comparison of the 14 approaches show that the majority of aspect modeling approaches are lacking in different respects, which reduces their use in practice within the context of model-driven engineering. The in-depth comparison of Reusable Aspect Models and Theme/UML reveals some points equally shared by both approaches, and identifies some areas where the former has an advantage over the latter.
Submitted 14 October, 2014;
originally announced October 2014.
-
Effect of Fast Moving Object on RSSI in WSN: An Experimental Approach
Authors:
Syed Hassan Ahmed,
Safdar H. Bouk,
Amjad Mehmood,
Nadeem Javaid,
Sasase Iwao
Abstract:
In this paper, we experimentally investigate the effect of a fast-moving object on the RSSI in wireless sensor networks in the presence of the ground effect and antenna orientation in the elevation direction. In the experimental setup, a MICAz mote pair was placed on the ground, where one mote acts as a transmitter and the other as a receiver. The transmitter mote's antenna was oriented in the elevation direction with respect to the receiver mote's antenna. A fast-moving object, i.e. a car, was passed between the motes and the fluctuations in the RSSI were observed. The experimental results show a sequential pattern in RSSI fluctuations when the car moves at a relatively slow speed. However, some irregularities were also observed when the antenna was oriented at 45 and 90 degrees in the elevation direction.
Submitted 19 February, 2012;
originally announced February 2012.
-
A New Teaching Model For The Subject Of Software Project Management
Authors:
M. Rizwan Jameel Qureshi,
Muhammad Rafiq Asim,
Muhammad Nadeem,
Asif Mehmood
Abstract:
Software (SW) development is a very tough task which requires a skilled project leader for its success. If the project leader is not skilled enough, the project may fail. In the real world of SW engineering, 65% of SW projects fail to meet their objectives, as in [1]. The main reason is the lack of training of project managers. This extreme failure rate can be reduced by teaching SW project management (SPM) to future project managers in a practical manner, so that they become skillful enough to handle projects in a better way. This paper proposes a model for teaching SPM to students of SW engineering in order to reduce the failure rate of projects.
Submitted 12 February, 2012;
originally announced February 2012.
-
Resource Management Services for a Grid Analysis Environment
Authors:
Arshad Ali,
Ashiq Anjum,
Tahir Azim,
Julian Bunn,
Atif Mehmood,
Richard McClatchey,
Harvey Newman,
Waqas ur Rehman,
Conrad Steenberg,
Michael Thomas,
Frank van Lingen,
Ian Willers,
Muhammad Adeel Zafar
Abstract:
Selecting optimal resources for submitting jobs on a computational Grid or accessing data from a data grid is one of the most important tasks of any Grid middleware. Most modern Grid software today satisfies this responsibility and gives a best-effort performance to solve this problem. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the entire process. To solve this problem, a more interactive set of services and middleware is desired that provides users more information about Grid weather, and gives them more control over the decision making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE) being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimator service that have been designed and written using a common Grid-enabled Web Services framework named Clarens. The paper also presents a performance analysis of the developed services to show that they have indeed resulted in a more interactive and powerful system for user-centric Grid-enabled physics analysis.
Submitted 10 April, 2005;
originally announced April 2005.
-
A Taxonomy and Survey of Grid Resource Planning and Reservation Systems for Grid Enabled Analysis Environment
Authors:
Arshad Ali,
Ashiq Anjum,
Atif Mehmood,
Richard McClatchey,
Ian Willers,
Julian Bunn,
Harvey Newman,
Michael Thomas,
Conrad Steenberg
Abstract:
The concept of coupling geographically distributed resources for solving large-scale problems is becoming increasingly popular, forming what is commonly called grid computing. Management of resources in the Grid environment becomes complex as the resources are geographically distributed, heterogeneous in nature and owned by different individuals and organizations, each having their own resource management policies and different access and cost models. There have been many projects that have designed and implemented resource management systems with a variety of architectures and services. In this paper we present the general requirements that a Resource Management system should satisfy. A taxonomy is also defined, based on which a survey of resource management systems in different existing Grid projects has been conducted to identify the key areas where these systems lack the desired functionality.
Submitted 14 January, 2018; v1 submitted 5 July, 2004;
originally announced July 2004.