-
Graph-Based Floor Separation Using Node Embeddings and Clustering of WiFi Trajectories
Authors:
Rabia Yasa Kostas,
Kahraman Kostas
Abstract:
Indoor positioning systems (IPSs) are increasingly vital for location-based services in complex multi-storey environments. This study proposes a novel graph-based approach for floor separation using Wi-Fi fingerprint trajectories, addressing the challenge of vertical localization in indoor settings. We construct a graph where nodes represent Wi-Fi fingerprints, and edges are weighted by signal sim…
▽ More
Indoor positioning systems (IPSs) are increasingly vital for location-based services in complex multi-storey environments. This study proposes a novel graph-based approach for floor separation using Wi-Fi fingerprint trajectories, addressing the challenge of vertical localization in indoor settings. We construct a graph where nodes represent Wi-Fi fingerprints, and edges are weighted by signal similarity and contextual transitions. Node2Vec is employed to generate low-dimensional embeddings, which are subsequently clustered using K-means to identify distinct floors. Evaluated on the Huawei University Challenge 2021 dataset, our method outperforms traditional community detection algorithms, achieving an accuracy of 68.97%, an F1- score of 61.99%, and an Adjusted Rand Index of 57.19%. By publicly releasing the preprocessed dataset and implementation code, this work contributes to advancing research in indoor positioning. The proposed approach demonstrates robustness to signal noise and architectural complexities, offering a scalable solution for floor-level localization.
△ Less
Submitted 12 May, 2025;
originally announced May 2025.
-
GeMID: Generalizable Models for IoT Device Identification
Authors:
Kahraman Kostas,
Rabia Yasa Kostas,
Mike Just,
Michael A. Lones
Abstract:
With the proliferation of Internet of Things (IoT) devices, ensuring their security has become paramount. Device identification (DI), which distinguishes IoT devices based on their traffic patterns, plays a crucial role in both differentiating devices and identifying vulnerable ones, closing a serious security gap. However, existing approaches to DI that build machine learning models often overloo…
▽ More
With the proliferation of Internet of Things (IoT) devices, ensuring their security has become paramount. Device identification (DI), which distinguishes IoT devices based on their traffic patterns, plays a crucial role in both differentiating devices and identifying vulnerable ones, closing a serious security gap. However, existing approaches to DI that build machine learning models often overlook the challenge of model generalizability across diverse network environments. In this study, we propose a novel framework to address this limitation and evaluate the generalizability of DI models across datasets collected within different network environments. Our approach involves a two-step process: first, we develop a feature and model selection method that is more robust to generalization issues by using a genetic algorithm with external feedback and datasets from distinct environments to refine the selections. Second, the resulting DI models are then tested on further independent datasets in order to robustly assess their generalizability. We demonstrate the effectiveness of our method by empirically comparing it to alternatives, highlighting how fundamental limitations of commonly employed techniques such as sliding window and flow statistics limit their generalizability. Our findings advance research in IoT security and device identification, offering insights into improving model effectiveness and mitigating risks in IoT networks.
△ Less
Submitted 5 November, 2024;
originally announced November 2024.
-
Physics-Informed Geometric Operators to Support Surrogate, Dimension Reduction and Generative Models for Engineering Design
Authors:
Shahroz Khan,
Zahid Masood,
Muhammad Usama,
Konstantinos Kostas,
Panagiotis Kaklis,
Wei,
Chen
Abstract:
In this work, we propose a set of physics-informed geometric operators (GOs) to enrich the geometric data provided for training surrogate/discriminative models, dimension reduction, and generative models, typically employed for performance prediction, dimension reduction, and creating data-driven parameterisations, respectively. However, as both the input and output streams of these models consist…
▽ More
In this work, we propose a set of physics-informed geometric operators (GOs) to enrich the geometric data provided for training surrogate/discriminative models, dimension reduction, and generative models, typically employed for performance prediction, dimension reduction, and creating data-driven parameterisations, respectively. However, as both the input and output streams of these models consist of low-level shape representations, they often fail to capture shape characteristics essential for performance analyses. Therefore, the proposed GOs exploit the differential and integral properties of shapes--accessed through Fourier descriptors, curvature integrals, geometric moments, and their invariants--to infuse high-level intrinsic geometric information and physics into the feature vector used for training, even when employing simple model architectures or low-level parametric descriptions. We showed that for surrogate modelling, along with the inclusion of the notion of physics, GOs enact regularisation to reduce over-fitting and enhance generalisation to new, unseen designs. Furthermore, through extensive experimentation, we demonstrate that for dimension reduction and generative models, incorporating the proposed GOs enriches the training data with compact global and local geometric features. This significantly enhances the quality of the resulting latent space, thereby facilitating the generation of valid and diverse designs. Lastly, we also show that GOs can enable learning parametric sensitivities to a great extent. Consequently, these enhancements accelerate the convergence rate of shape optimisers towards optimal solutions.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Individual Packet Features are a Risk to Model Generalisation in ML-Based Intrusion Detection
Authors:
Kahraman Kostas,
Mike Just,
Michael A. Lones
Abstract:
Machine learning is increasingly used for intrusion detection in IoT networks. This paper explores the effectiveness of using individual packet features (IPF), which are attributes extracted from a single network packet, such as timing, size, and source-destination information. Through literature review and experiments, we identify the limitations of IPF, showing they can produce misleadingly high…
▽ More
Machine learning is increasingly used for intrusion detection in IoT networks. This paper explores the effectiveness of using individual packet features (IPF), which are attributes extracted from a single network packet, such as timing, size, and source-destination information. Through literature review and experiments, we identify the limitations of IPF, showing they can produce misleadingly high detection rates. Our findings emphasize the need for approaches that consider packet interactions for robust intrusion detection. Additionally, we demonstrate that models based on IPF often fail to generalize across datasets, compromising their reliability in diverse IoT environments.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
Generative VS non-Generative Models in Engineering Shape Optimization
Authors:
Muhammad Usama,
Zahid Masood,
Shahroz Khan,
Konstantinos Kostas,
Panagiotis Kaklis
Abstract:
In this work, we perform a systematic comparison of the effectiveness and efficiency of generative and non-generative models in constructing design spaces for novel and efficient design exploration and shape optimization. We apply these models in the case of airfoil/hydrofoil design and conduct the comparison on the resulting design spaces. A conventional Generative Adversarial Network (GAN) and a…
▽ More
In this work, we perform a systematic comparison of the effectiveness and efficiency of generative and non-generative models in constructing design spaces for novel and efficient design exploration and shape optimization. We apply these models in the case of airfoil/hydrofoil design and conduct the comparison on the resulting design spaces. A conventional Generative Adversarial Network (GAN) and a state-of-the-art generative model, the Performance-Augmented Diverse Generative Adversarial Network (PaDGAN), are juxtaposed with a linear non-generative model based on the coupling of the Karhunen-Loève Expansion and a physics-informed Shape Signature Vector (SSV-KLE). The comparison demonstrates that, with an appropriate shape encoding and a physics-augmented design space, non-generative models have the potential to cost-effectively generate high-performing valid designs with enhanced coverage of the design space. In this work, both approaches are applied to two large foil profile datasets comprising real-world and artificial designs generated through either a profile-generating parametric model or deep-learning approach. These datasets are further enriched with integral properties of their members' shapes as well as physics-informed parameters. Our results illustrate that the design spaces constructed by the non-generative model outperform the generative model in terms of design validity, generating robust latent spaces with none or significantly fewer invalid designs when compared to generative models. We aspire that these findings will aid the engineering design community in making informed decisions when constructing designs spaces for shape optimization, as we have show that under certain conditions computationally inexpensive approaches can closely match or even outperform state-of-the art generative models.
△ Less
Submitted 13 February, 2024;
originally announced February 2024.
-
IoTGeM: Generalizable Models for Behaviour-Based IoT Attack Detection
Authors:
Kahraman Kostas,
Mike Just,
Michael A. Lones
Abstract:
Previous research on behaviour-based attack detection on networks of IoT devices has resulted in machine learning models whose ability to adapt to unseen data is limited, and often not demonstrated. In this paper we present an approach for modelling IoT network attacks that focuses on generalizability, yet also leads to better detection and performance. First, we present an improved rolling window…
▽ More
Previous research on behaviour-based attack detection on networks of IoT devices has resulted in machine learning models whose ability to adapt to unseen data is limited, and often not demonstrated. In this paper we present an approach for modelling IoT network attacks that focuses on generalizability, yet also leads to better detection and performance. First, we present an improved rolling window approach for feature extraction, and introduce a multi-step feature selection process that reduces overfitting. Second, we build and test models using isolated train and test datasets, thereby avoiding common data leaks that have limited the generalizability of previous models. Third, we rigorously evaluate our methodology using a diverse portfolio of machine learning models, evaluation metrics and datasets. Finally, we build confidence in the models by using explainable AI techniques, allowing us to identify the features that underlie accurate detection of attacks.
△ Less
Submitted 17 October, 2023;
originally announced January 2024.
-
Externally validating the IoTDevID device identification methodology using the CIC IoT 2022 Dataset
Authors:
Kahraman Kostas,
Mike Just,
Michael A. Lones
Abstract:
In the era of rapid IoT device proliferation, recognizing, diagnosing, and securing these devices are crucial tasks. The IoTDevID method (IEEE Internet of Things 2022) proposes a machine learning approach for device identification using network packet features. In this article we present a validation study of the IoTDevID method by testing core components, namely its feature set and its aggregatio…
▽ More
In the era of rapid IoT device proliferation, recognizing, diagnosing, and securing these devices are crucial tasks. The IoTDevID method (IEEE Internet of Things 2022) proposes a machine learning approach for device identification using network packet features. In this article we present a validation study of the IoTDevID method by testing core components, namely its feature set and its aggregation algorithm, on a new dataset. The new dataset (CIC-IoT-2022) offers several advantages over earlier datasets, including a larger number of devices, multiple instances of the same device, both IP and non-IP device data, normal (benign) usage data, and diverse usage profiles, such as active and idle states. Using this independent dataset, we explore the validity of IoTDevID's core components, and also examine the impacts of the new data on model performance. Our results indicate that data diversity is important to model performance. For example, models trained with active usage data outperformed those trained with idle usage data, and multiple usage data similarly improved performance. Results for IoTDevID were strong with a 92.50 F1 score for 31 IP-only device classes, similar to our results on previous datasets. In all cases, the IoTDevID aggregation algorithm improved model performance. For non-IP devices we obtained a 78.80 F1 score for 40 device classes, though with much less data, confirming that data quantity is also important to model performance.
△ Less
Submitted 3 July, 2023;
originally announced July 2023.
-
ShipHullGAN: A generic parametric modeller for ship hull design using deep convolutional generative model
Authors:
Shahroz Khan,
Kosa Goucher-Lambert,
Konstantinos Kostas,
Panagiotis Kaklis
Abstract:
In this work, we introduce ShipHullGAN, a generic parametric modeller built using deep convolutional generative adversarial networks (GANs) for the versatile representation and generation of ship hulls. At a high level, the new model intends to address the current conservatism in the parametric ship design paradigm, where parametric modellers can only handle a particular ship type. We trained Ship…
▽ More
In this work, we introduce ShipHullGAN, a generic parametric modeller built using deep convolutional generative adversarial networks (GANs) for the versatile representation and generation of ship hulls. At a high level, the new model intends to address the current conservatism in the parametric ship design paradigm, where parametric modellers can only handle a particular ship type. We trained ShipHullGAN on a large dataset of 52,591 \textit{physically validated} designs from a wide range of existing ship types, including container ships, tankers, bulk carriers, tugboats, and crew supply vessels. We developed a new shape extraction and representation strategy to convert all training designs into a common geometric representation of the same resolution, as typically GANs can only accept vectors of fixed dimension as input. A space-filling layer is placed right after the generator component to ensure that the trained generator can cover all design classes. During training, designs are provided in the form of a shape-signature tensor (SST) which harnesses the compact geometric representation using geometric moments that further enable the inexpensive incorporation of physics-informed elements in ship design. We have shown through extensive comparative studies and optimisation cases that ShipHullGAN can generate designs with augmented features resulting in versatile design spaces that produce traditional and novel designs with geometrically valid and practically feasible shapes.
△ Less
Submitted 29 April, 2023;
originally announced May 2023.
-
LSTM based IoT Device Identification
Authors:
Kahraman Kostas
Abstract:
While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number of devices being introduced to the market. In this environment, IoT device identification methods provide a preventive security measure as an important factor in identifying these devices and detecting the vulnerabilities they suffer from. In this study, we pr…
▽ More
While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number of devices being introduced to the market. In this environment, IoT device identification methods provide a preventive security measure as an important factor in identifying these devices and detecting the vulnerabilities they suffer from. In this study, we present a method that identifies devices in the Aalto dataset using Long short-term memory (LSTM)
△ Less
Submitted 26 April, 2023;
originally announced April 2023.
-
CNN based IoT Device Identification
Authors:
Kahraman Kostas
Abstract:
While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number of devices being introduced to the market. In this environment, IoT device identification methods provide a preventive security measure as an important factor in identifying these devices and detecting the vulnerabilities they suffer from. In this study, we pr…
▽ More
While the use of the Internet of Things is becoming more and more popular, many security vulnerabilities are emerging with the large number of devices being introduced to the market. In this environment, IoT device identification methods provide a preventive security measure as an important factor in identifying these devices and detecting the vulnerabilities they suffer from. In this study, we present a method that identifies devices in the Aalto dataset using the convolutional neural network (CNN).
△ Less
Submitted 26 April, 2023;
originally announced April 2023.
-
WiFi Based Distance Estimation Using Supervised Machine Learning
Authors:
Kahraman Kostas,
Rabia Yasa Kostas,
Francisco Zampella,
Firas Alsehly
Abstract:
In recent years WiFi became the primary source of information to locate a person or device indoor. Collecting RSSI values as reference measurements with known positions, known as WiFi fingerprinting, is commonly used in various positioning methods and algorithms that appear in literature. However, measuring the spatial distance between given set of WiFi fingerprints is heavily affected by the sele…
▽ More
In recent years WiFi became the primary source of information to locate a person or device indoor. Collecting RSSI values as reference measurements with known positions, known as WiFi fingerprinting, is commonly used in various positioning methods and algorithms that appear in literature. However, measuring the spatial distance between given set of WiFi fingerprints is heavily affected by the selection of the signal distance function used to model signal space as geospatial distance. In this study, the authors proposed utilization of machine learning to improve the estimation of geospatial distance between fingerprints. This research examined data collected from 13 different open datasets to provide a broad representation aiming for general model that can be used in any indoor environment. The proposed novel approach extracted data features by examining a set of commonly used signal distance metrics via feature selection process that includes feature analysis and genetic algorithm. To demonstrate that the output of this research is venue independent, all models were tested on datasets previously excluded during the training and validation phase. Finally, various machine learning algorithms were compared using wide variety of evaluation metrics including ability to scale out the test bed to real world unsolicited datasets.
△ Less
Submitted 15 August, 2022;
originally announced August 2022.
-
IoTDevID: A Behavior-Based Device Identification Method for the IoT
Authors:
Kahraman Kostas,
Mike Just,
Michael A. Lones
Abstract:
Device identification is one way to secure a network of IoT devices, whereby devices identified as suspicious can subsequently be isolated from a network. In this study, we present a machine learning-based method, IoTDevID, that recognizes devices through characteristics of their network packets. As a result of using a rigorous feature analysis and selection process, our study offers a generalizab…
▽ More
Device identification is one way to secure a network of IoT devices, whereby devices identified as suspicious can subsequently be isolated from a network. In this study, we present a machine learning-based method, IoTDevID, that recognizes devices through characteristics of their network packets. As a result of using a rigorous feature analysis and selection process, our study offers a generalizable and realistic approach to modelling device behavior, achieving high predictive accuracy across two public datasets. The model's underlying feature set is shown to be more predictive than existing feature sets used for device identification, and is shown to generalize to data unseen during the feature selection process. Unlike most existing approaches to IoT device identification, IoTDevID is able to detect devices using non-IP and low-energy protocols.
△ Less
Submitted 18 July, 2022; v1 submitted 17 February, 2021;
originally announced February 2021.