Search | arXiv e-print repository

arXiv:2505.20066 [pdf, ps, other]

Automated data curation for self-supervised learning in underwater acoustic analysis

Authors: Hilde I Hummel, Sandjai Bhulai, Burooj Ghani, Rob van der Mei

Abstract: The sustainability of the ocean ecosystem is threatened by increased levels of sound pollution, making monitoring crucial to understand its variability and impact. Passive acoustic monitoring (PAM) systems collect a large amount of underwater sound recordings, but the large volume of data makes manual analysis impossible, creating the need for automation. Although machine learning offers a potential solution, most underwater acoustic recordings are unlabeled. Self-supervised learning models have demonstrated success in learning from large-scale unlabeled data in various domains like computer vision, natural language processing, and audio. However, these models require large, diverse, and balanced datasets for training in order to generalize well. To address this, a fully automated self-supervised data curation pipeline is proposed to create a diverse and balanced dataset from raw PAM data. It integrates Automatic Identification System (AIS) data with recordings from various hydrophones in U.S. waters. Using hierarchical k-means clustering, the raw audio data is sampled and then combined with AIS samples to create a balanced and diverse dataset. The resulting curated dataset enables the development of self-supervised learning models, facilitating various tasks such as monitoring marine mammals and assessing sound pollution.

Submitted 26 May, 2025; originally announced May 2025.

arXiv:2505.12904 [pdf, ps, other]

The Computation of Generalized Embeddings for Underwater Acoustic Target Recognition using Contrastive Learning

Authors: Hilde I. Hummel, Arwin Gansekoele, Sandjai Bhulai, Rob van der Mei

Abstract: The increasing level of sound pollution in marine environments poses an increased threat to ocean health, making it crucial to monitor underwater noise. By monitoring this noise, the sources responsible for this pollution can be mapped. Monitoring is performed by passively listening to these sounds. This generates a large amount of data records, capturing a mix of sound sources such as ship activities and marine mammal vocalizations. Although machine learning offers a promising solution for automatic sound classification, current state-of-the-art methods implement supervised learning. This requires a large amount of high-quality labeled data that is not publicly available. In contrast, a massive amount of lower-quality unlabeled data is publicly available, offering the opportunity to explore unsupervised learning techniques. This research explores this possibility by implementing an unsupervised Contrastive Learning approach. Here, a Conformer-based encoder is optimized by the so-called Variance-Invariance-Covariance Regularization loss function on these lower-quality unlabeled data and the translation to the labeled data is made. Through classification tasks involving recognizing ship types and marine mammal vocalizations, our method is shown to produce robust and generalized embeddings. This shows the potential of unsupervised methods for various automatic underwater acoustic analysis tasks.

Submitted 19 May, 2025; originally announced May 2025.

arXiv:2504.08210 [pdf, other]

Optimizing Power Grid Topologies with Reinforcement Learning: A Survey of Methods and Challenges

Authors: Erica van der Sar, Alessandro Zocca, Sandjai Bhulai

Abstract: Power grid operation is becoming increasingly complex due to the rising integration of renewable energy sources and the need for more adaptive control strategies. Reinforcement Learning (RL) has emerged as a promising approach to power network control (PNC), offering the potential to enhance decision-making in dynamic and uncertain environments. The Learning To Run a Power Network (L2RPN) competitions have played a key role in accelerating research by providing standardized benchmarks and problem formulations, leading to rapid advancements in RL-based methods. This survey provides a comprehensive and structured overview of RL applications for power grid topology optimization, categorizing existing techniques, highlighting key design choices, and identifying gaps in current research. Additionally, we present a comparative numerical study evaluating the impact of commonly applied RL-based methods, offering insights into their practical effectiveness. By consolidating existing research and outlining open challenges, this survey aims to provide a foundation for future advancements in RL-driven power grid optimization.

Submitted 15 May, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

Comments: 60 pages, 26 figures, preprint

arXiv:2501.04730 [pdf, ps, other]

Relative Phase Equivariant Deep Neural Systems for Physical Layer Communications

Authors: Arwin Gansekoele, Sandjai Bhulai, Mark Hoogendoorn, Rob van der Mei

Abstract: In the era of telecommunications, the increasing demand for complex and specialized communication systems has led to a focus on improving physical layer communications. Artificial intelligence (AI) has emerged as a promising solution avenue for doing so. Deep neural receivers have already shown significant promise in improving the performance of communications systems. However, a major challenge lies in developing deep neural receivers that match the energy efficiency and speed of traditional receivers. This work investigates the incorporation of inductive biases in the physical layer using group-equivariant deep learning to improve the parameter efficiency of deep neural receivers. We do so by constructing a deep neural receiver that is equivariant with respect to the phase of arrival. We show that the inclusion of relative phase equivariance significantly reduces the error rate of deep neural receivers at similar model sizes. Thus, we show the potential of group-equivariant deep learning in the domain of physical layer communications.

Submitted 7 July, 2025; v1 submitted 6 January, 2025; originally announced January 2025.

Comments: Published at TMLR (https://openreview.net/forum?id=vttqWoSJIW)

arXiv:2408.16003 [pdf, other]

Meta-Learning for Federated Face Recognition in Imbalanced Data Regimes

Authors: Arwin Gansekoele, Emiel Hess, Sandjai Bhulai

Abstract: The growing privacy concerns surrounding face image data demand new techniques that can guarantee user privacy. One such face recognition technique that claims to achieve better user privacy is Federated Face Recognition (FFR), a subfield of Federated Learning (FL). However, FFR faces challenges due to the heterogeneity of the data, given the large number of classes that need to be handled. To overcome this problem, solutions are sought in the field of personalized FL. This work introduces three new data partitions based on the CelebA dataset, each with a different form of data heterogeneity. It also proposes Hessian-Free Model Agnostic Meta-Learning (HF-MAML) in an FFR setting. We show that HF-MAML scores higher in verification tests than current FFR models on three different CelebA data partitions. In particular, the verification scores improve the most in heterogeneous data partitions. To balance personalization with the development of an effective global model, an embedding regularization term is introduced for the loss function. This term can be combined with HF-MAML and is shown to increase global model verification performance. Lastly, this work performs a fairness analysis, showing that HF-MAML and its embedding regularization extension can improve fairness by reducing the standard deviation over the client evaluation scores.

Submitted 13 August, 2024; originally announced August 2024.

Comments: To appear in the IEEE FLTA 2024 proceedings

arXiv:2405.09909 [pdf, other]

A Machine Learning Approach for Simultaneous Demapping of QAM and APSK Constellations

Authors: Arwin Gansekoele, Alexios Balatsoukas-Stimming, Tom Brusse, Mark Hoogendoorn, Sandjai Bhulai, Rob van der Mei

Abstract: As telecommunication systems evolve to meet increasing demands, integrating deep neural networks (DNNs) has shown promise in enhancing performance. However, the trade-off between accuracy and flexibility remains challenging when replacing traditional receivers with DNNs. This paper introduces a novel probabilistic framework that allows a single DNN demapper to demap multiple QAM and APSK constellations simultaneously. We also demonstrate that our framework allows exploiting hierarchical relationships in families of constellations. The consequence is that we need fewer neural network outputs to encode the same function without an increase in Bit Error Rate (BER). Our simulation results confirm that our method approaches the optimal demodulation error bound under an Additive White Gaussian Noise (AWGN) channel for multiple constellations. Thereby, we address multiple important issues in making DNNs flexible enough for practical use as receivers.

Submitted 16 May, 2024; originally announced May 2024.

Comments: To appear in the ICMLCN 2024 proceedings

arXiv:2405.09902 [pdf, other]

doi 10.1109/WI-IAT59888.2023.00028

Unveiling the Potential: Harnessing Deep Metric Learning to Circumvent Video Streaming Encryption

Authors: Arwin Gansekoele, Tycho Bot, Rob van der Mei, Sandjai Bhulai, Mark Hoogendoorn

Abstract: Encryption on the internet with the shift to HTTPS has been an important step to improve the privacy of internet users. However, there is an increasing body of work about extracting information from encrypted internet traffic without having to decrypt it. Such attacks bypass security guarantees assumed to be given by HTTPS and thus need to be understood. Prior works showed that the variable bitrates of video streams are sufficient to identify which video someone is watching. These works generally have to make trade-offs in aspects such as accuracy, scalability, robustness, etc. These trade-offs complicate the practical use of these attacks. To that end, we propose a deep metric learning framework based on the triplet loss method. Through this framework, we achieve robust, generalisable, scalable and transferable encrypted video stream detection. First, the triplet loss is better able to deal with video streams not seen during training. Second, our approach can accurately classify videos not seen during training. Third, we show that our method scales well to a dataset of over 1000 videos. Finally, we show that a model trained on video streams over Chrome can also classify streams over Firefox. Our results suggest that this side-channel attack is more broadly applicable than originally thought. We provide our code alongside a diverse and up-to-date dataset for future research.

Submitted 16 May, 2024; originally announced May 2024.

Comments: Published in the WI-IAT 2023 proceedings
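
The triplet loss named in the abstract above can be illustrated with a minimal sketch. This is the standard textbook formulation, not the authors' released code; the function names, the Euclidean distance, the margin value, and the example embeddings are all assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings (illustrative values only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

# Well-separated triplet: 1 - 5 + 1 < 0, so the loss is clipped to 0.
loss_easy = triplet_loss(anchor, positive, negative)

# Swapped roles: 5 - 1 + 1 = 5, so the triplet is penalized.
loss_hard = triplet_loss(anchor, negative, positive)
```

Because the loss only compares distances between embeddings, a model trained this way can in principle be queried with items it never saw during training, which matches the generalization claims in the abstract.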

arXiv:2401.17789 [pdf, other]

Robustly overfitting latents for flexible neural image compression

Authors: Yura Perugachi-Diaz, Arwin Gansekoele, Sandjai Bhulai

Abstract: Neural image compression has made a great deal of progress. State-of-the-art models are based on variational autoencoders and are outperforming classical models. Neural compression models learn to encode an image into a quantized latent representation that can be efficiently sent to the decoder, which decodes the quantized latent into a reconstructed image. While these models have proven successful in practice, they lead to sub-optimal results due to imperfect optimization and limitations in the encoder and decoder capacity. Recent work shows how to use stochastic Gumbel annealing (SGA) to refine the latents of pre-trained neural image compression models. We extend this idea by introducing SGA+, which contains three different methods that build upon SGA. We show how our method improves the overall compression performance in terms of the R-D trade-off, compared to its predecessors. Additionally, we show how refinement of the latents with our best-performing method improves the compression performance on both the Tecnick and CLIC datasets. Our method is deployed for a pre-trained hyperprior and for a more flexible model. Further, we give a detailed analysis of our proposed methods and show that they are less sensitive to hyperparameter choices. Finally, we show how each method can be extended to three- instead of two-class rounding.

Submitted 5 November, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2024

arXiv:2310.02605 [pdf, other]

Multi-Agent Reinforcement Learning for Power Grid Topology Optimization

Authors: Erica van der Sar, Alessandro Zocca, Sandjai Bhulai

Abstract: Recent challenges in operating power networks arise from increasing energy demands and unpredictable renewable sources like wind and solar. While reinforcement learning (RL) shows promise in managing these networks, through topological actions like bus and line switching, efficiently handling large action spaces as networks grow is crucial. This paper presents a hierarchical multi-agent reinforcement learning (MARL) framework tailored for these expansive action spaces, leveraging the power grid's inherent hierarchical nature. Experimental results indicate the MARL framework's competitive performance with single-agent RL methods. We also compare different RL algorithms for lower-level agents alongside different policies for higher-order agents.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to PSCC 2024

arXiv:2301.04740 [pdf, ps, other]

The Berkelmans-Pries Feature Importance Method: A Generic Measure of Informativeness of Features

Authors: Joris Pries, Guus Berkelmans, Sandjai Bhulai, Rob van der Mei

Abstract: Over the past few years, the use of machine learning models has emerged as a generic and powerful means for prediction purposes. At the same time, there is a growing demand for interpretability of prediction models. To determine which features of a dataset are important to predict a target variable $Y$, a Feature Importance (FI) method can be used. By quantifying how important each feature is for predicting $Y$, irrelevant features can be identified and removed, which could increase the speed and accuracy of a model, and moreover, important features can be discovered, which could lead to valuable insights. A major problem with evaluating FI methods is that the ground truth FI is often unknown. As a consequence, existing FI methods do not give the exact correct FI values. This is one of the many reasons why it can be hard to properly interpret the results of an FI method. Motivated by this, we introduce a new global approach named the Berkelmans-Pries FI method, which is based on a combination of Shapley values and the Berkelmans-Pries dependency function. We prove that our method has many useful properties, and accurately predicts the correct FI values for several cases where the ground truth FI can be derived in an exact manner. We experimentally show for a large collection of FI methods (468) that existing methods do not have the same useful properties. This shows that the Berkelmans-Pries FI method is a highly valuable tool for analyzing datasets with complex interdependencies.

Submitted 11 January, 2023; originally announced January 2023.

arXiv:2301.03318 [pdf, ps, other]

The Optimal Input-Independent Baseline for Binary Classification: The Dutch Draw

Authors: Joris Pries, Etienne van de Bijl, Jan Klein, Sandjai Bhulai, Rob van der Mei

Abstract: Before any binary classification model is put into practice, it is important to validate its performance on a proper test set. Without a frame of reference given by a baseline method, it is impossible to determine if a score is `good' or `bad'. The goal of this paper is to examine all baseline methods that are independent of feature values and determine which model is the `best' and why. By identifying which baseline models are optimal, a crucial selection decision in the evaluation process is simplified. We prove that the recently proposed Dutch Draw baseline is the best input-independent classifier (independent of feature values) for all positional-invariant measures (independent of sequence order) assuming that the samples are randomly shuffled. This means that the Dutch Draw baseline is the optimal baseline under these intuitive requirements and should therefore be used in practice.

Submitted 9 January, 2023; originally announced January 2023.

arXiv:2208.00003 [pdf, other]

RangL: A Reinforcement Learning Competition Platform

Authors: Viktor Zobernig, Richard A. Saldanha, Jinke He, Erica van der Sar, Jasper van Doorn, Jia-Chen Hua, Lachlan R. Mason, Aleksander Czechowski, Drago Indjic, Tomasz Kosmala, Alessandro Zocca, Sandjai Bhulai, Jorge Montalvo Arvizu, Claude Klöckl, John Moriarty

Abstract: The RangL project hosted by The Alan Turing Institute aims to encourage the wider uptake of reinforcement learning by supporting competitions relating to real-world dynamic decision problems. This article describes the reusable code repository developed by the RangL team and deployed for the 2022 Pathways to Net Zero Challenge, supported by the UK Net Zero Technology Centre. The winning solutions to this particular Challenge seek to optimize the UK's energy transition policy to net zero carbon emissions by 2050. The RangL repository includes an OpenAI Gym reinforcement learning environment and code that supports both submission to, and evaluation in, a remote instance of the open source EvalAI platform as well as all winning learning agent strategies. The repository is an illustrative example of RangL's capability to provide a reusable structure for future challenges.

Submitted 28 July, 2022; originally announced August 2022.

Comments: Documents the RangL competition platform in general and its 2022 competition "Pathways to Net Zero" in particular. 10 pages, 2 figures, 1 table. Comments welcome!

arXiv:2203.13084 [pdf, other]

The Dutch Draw: Constructing a Universal Baseline for Binary Prediction Models

Authors: Etienne van de Bijl, Jan Klein, Joris Pries, Sandjai Bhulai, Mark Hoogendoorn, Rob van der Mei

Abstract: Novel prediction methods should always be compared to a baseline to know how well they perform. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is needed to evaluate the `goodness' of a performance score. Comparing with the latest state-of-the-art model is usually insightful. However, being state-of-the-art can change rapidly when newer models are developed. Contrary to an advanced model, a simple dummy classifier could be used. However, the latter could be beaten too easily, making the comparison less valuable. This paper presents a universal baseline method for all binary classification models, named the Dutch Draw (DD). This approach weighs simple classifiers and determines the best classifier to use as a baseline. We theoretically derive the DD baseline for many commonly used evaluation measures and show that in most situations it reduces to (almost) always predicting either zero or one. Summarizing, the DD baseline is: (1) general, as it is applicable to all binary classification problems; (2) simple, as it is quickly determined without training or parameter-tuning; (3) informative, as insightful conclusions can be drawn from the results. The DD baseline serves two purposes. First, to enable comparisons across research papers by this robust and universal baseline. Second, to provide a sanity check during the development process of a prediction model. It is a major warning sign when a model is outperformed by the DD baseline.

Submitted 24 March, 2022; originally announced March 2022.
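
As a rough illustration of the baseline idea in the abstract above, assume a classifier that ignores the features, draws k of the M test samples uniformly at random, and labels exactly those k as positive. With P actual positives in the test set, the expected number of true positives is kP/M, and because $F_1 = 2\,TP / (P + k)$ has a fixed denominator here, the expected $F_1$ is $2kP / (M(P + k))$. The sketch below maximizes this over k. The function names and the example counts are hypothetical, and this is a derivation under the stated assumption, not the authors' implementation.

```python
def expected_f1(k, positives, total):
    """Expected F1 when k of `total` samples are drawn at random and
    labeled positive, given `positives` actual positives.

    Uses E[TP] = k * positives / total and F1 = 2 * TP / (positives + k).
    """
    if k == 0 or positives == 0:
        return 0.0
    return 2.0 * k * positives / (total * (positives + k))


def best_random_draw_f1(positives, total):
    """Return (best expected F1, best k) over all k in 0..total."""
    return max((expected_f1(k, positives, total), k) for k in range(total + 1))


# Hypothetical test set: 100 samples, 20 of them positive.
baseline_score, baseline_k = best_random_draw_f1(positives=20, total=100)
```

For this example the maximum is reached at k = M, i.e. labeling every sample positive, which is consistent with the abstract's remark that the baseline often reduces to always predicting one class. A model whose $F_1$ on such a test set falls below `baseline_score` would be worth investigating.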

arXiv:2203.12329 [pdf, other]

The BP Dependency Function: a Generic Measure of Dependence between Random Variables

Authors: Guus Berkelmans, Joris Pries, Sandjai Bhulai, Rob van der Mei

Abstract: Measuring and quantifying dependencies between random variables (RVs) can give critical insights into a dataset. Typical questions are: `Do underlying relationships exist?', `Are some variables redundant?', and `Is some target variable $Y$ highly or weakly dependent on variable $X$?' Interestingly, despite the evident need for a general-purpose measure of dependency between RVs, most data analysts in practice use the Pearson correlation coefficient (PCC) to quantify dependence between RVs, while it is well-recognized that the PCC is essentially a measure for linear dependency only. Although many attempts have been made to define more generic dependency measures, there is as yet no consensus on a standard, general-purpose dependency function. In fact, several ideal properties of a dependency function have been proposed, but without much argumentation. Motivated by this, in this paper we will discuss and revise the list of desired properties and propose a new dependency function that meets all these requirements. This general-purpose dependency function provides data analysts a powerful means to quantify the level of dependence between variables. To this end, we also provide Python code to determine the dependency function for use in practice.

Submitted 23 March, 2022; originally announced March 2022.

MSC Class: 62H20 (Primary) 60A10; 62H05 (Secondary)
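
The remark in the abstract above that the Pearson correlation coefficient captures only linear dependence can be checked with a small example: below, one variable is fully determined by the other, yet the PCC is zero. The data and the function name are made up for illustration, and this is not the BP dependency function itself.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Y = X^2: Y is a deterministic function of X, but the relation is not linear.
ys_quadratic = [x * x for x in xs]

# Y = 3X + 1: a linear relation, for comparison.
ys_linear = [3.0 * x + 1.0 for x in xs]

r_quadratic = pearson(xs, ys_quadratic)
r_linear = pearson(xs, ys_linear)
```

Here `r_quadratic` is zero because the positive and negative halves of the symmetric input cancel in the covariance, while `r_linear` is one up to floating-point rounding. This is the kind of case a general-purpose dependency measure is meant to handle.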

arXiv:2111.13576 [pdf, ps, other]

Job Recommender Systems: A Review

Authors: Corné de Ruijt, Sandjai Bhulai

Abstract: This paper provides a review of the job recommender system (JRS) literature published in the past decade (2011-2021). Compared to previous literature reviews, we put more emphasis on contributions that incorporate the temporal and reciprocal nature of job recommendations. Previous studies on JRS suggest that taking such views into account in the design of the JRS can lead to improved model performance. Also, it may lead to a more uniform distribution of candidates over a set of similar jobs. We also consider the literature from the perspective of algorithm fairness. Here we find that this is rarely discussed in the literature, and if it is discussed, many authors wrongly assume that removing the discriminatory feature would be sufficient. With respect to the type of models used in JRS, authors frequently label their method as `hybrid'. Unfortunately, they thereby obscure what these methods entail. Using existing recommender taxonomies, we split this large class of hybrids into subcategories that are easier to analyse. We further find that data availability, and in particular the availability of click data, has a large impact on the choice of method and validation. Last, although the generalizability of JRS across different datasets is infrequently considered, results suggest that error scores may vary across these datasets.

Submitted 26 November, 2021; originally announced November 2021.

arXiv:2111.11314 [pdf, ps, other]

The Generalized Cascade Click Model: A Unified Framework for Estimating Click Models

Authors: Corné de Ruijt, Sandjai Bhulai

Abstract: Given the vital importance of search engines to find digital information, there has been much scientific attention on how users interact with search engines, and how such behavior can be modeled. Many models of the interaction between users and search engines, which in the literature are known as click models, come in the form of Dynamic Bayesian Networks. Although many authors have used the resemblance between the different click models to derive estimation procedures for these models, in particular in the form of expectation maximization (EM), this still commonly requires considerable work, in particular when it comes to deriving the E-step. What we propose in this paper is that this derivation is commonly unnecessary: many existing click models can in fact, under certain assumptions, be optimized as if they were Input-Output Hidden Markov Models (IO-HMMs), for which the forward-backward equations immediately provide this E-step. To arrive at that conclusion, we will present the Generalized Cascade Model (GCM) and show how this model can be estimated using the IO-HMM EM framework, and provide two examples of how existing click models can be mapped to GCM. Our GCM approach to estimating click models has also been implemented in the gecasmo Python package.

Submitted 22 November, 2021; originally announced November 2021.

arXiv:2108.06238 [pdf, other]

Jasmine: A New Active Learning Approach to Combat Cybercrime

Authors: Jan Klein, Sandjai Bhulai, Mark Hoogendoorn, Rob van der Mei

Abstract: Over the past decade, the advent of cybercrime has accelerated the research on cybersecurity. However, the deployment of intrusion detection methods falls short. One of the reasons for this is the lack of realistic evaluation datasets, which makes it a challenge to develop techniques and compare them. This is caused by the large amounts of effort it takes for a cyber analyst to classify network connections. This has raised the need for methods (i) that can learn from small sets of labeled data, (ii) that can make predictions on large sets of unlabeled data, and (iii) that request the label of only specially selected unlabeled data instances. Hence, Active Learning (AL) methods are of interest. These approaches use a query function to choose specific unlabeled instances that are expected to improve overall classification performance. The resulting query observations are labeled by a human expert and added to the labeled set. In this paper, we propose a new hybrid AL method called Jasmine. Firstly, it determines how suitable each observation is for querying, i.e., how likely it is to enhance classification. These properties are the uncertainty score and anomaly score. Secondly, Jasmine introduces dynamic updating. This allows the model to adjust the balance between querying uncertain, anomalous and randomly selected observations. To this end, Jasmine is able to learn the best query strategy during the labeling process. This is in contrast to the other AL methods in cybersecurity that all have static, predetermined query functions. We show that dynamic updating, and therefore Jasmine, consistently obtains good results that are more robust than those from querying only uncertainties, only anomalies or a fixed combination of the two.

Submitted 13 August, 2021; originally announced August 2021.

Comments: 35 pages, 15 figures

arXiv:2102.02694 [pdf, other]

Invertible DenseNets with Concatenated LipSwish

Authors: Yura Perugachi-Diaz, Jakub M. Tomczak, Sandjai Bhulai

Abstract: We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient extension of Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce invertibility of the network by satisfying the Lipschitz constant. Furthermore, we propose a learnable weighted concatenation, which not only improves the model performance but also indicates the importance of the concatenated weighted representation. Additionally, we introduce the Concatenated LipSwish as activation function, for which we show how to enforce the Lipschitz condition and which boosts performance. The new architecture, i-DenseNet, out-performs Residual Flow and other flow-based models on density estimation evaluated in bits per dimension, where we utilize an equal parameter budget. Moreover, we show that the proposed model out-performs Residual Flows when trained as a hybrid model where the model is both a generative and a discriminative model.

Submitted 23 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2021. This is an extension of Invertible DenseNets (arXiv:2010.02125). arXiv admin note: text overlap with arXiv:2010.02125

arXiv:2010.02125 [pdf, other]

Invertible DenseNets

Authors: Yura Perugachi-Diaz, Jakub M. Tomczak, Sandjai Bhulai

Abstract: We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient alternative to Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce the invertibility of the network by satisfying the Lipschitz constraint. Additionally, we extend this method by proposing a learnable concatenation, which not only improves the model performance but also indicates the importance of the concatenated representation. We demonstrate the performance of i-DenseNets and Residual Flows on toy, MNIST, and CIFAR10 data. Both i-DenseNets outperform Residual Flows evaluated in negative log-likelihood, on all considered datasets under an equal parameter budget.

Submitted 8 January, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: Accepted at 3rd Symposium on Advances in Approximate Bayesian Inference (AABI)

arXiv:1511.01861 [pdf, other]

Modeling trend progression through an extension of the Polya Urn Process

Authors: Marijn ten Thij, Sandjai Bhulai

Abstract: Knowing how and when trends are formed is a frequently visited research goal. In our work, we focus on the progression of trends through (social) networks. We use a random graph (RG) model to mimic the progression of a trend through the network. The context of the trend is not included in our model. We show that every state of the RG model maps to a state of the Polya process. We find that the limit of the component size distribution of the RG model shows power-law behaviour. These results are also supported by simulations.

Submitted 5 November, 2015; originally announced November 2015.

Comments: 11 pages, 2 figures, NetSci-X Conference, Wroclaw, Poland, 11-13 January 2016. arXiv admin note: text overlap with arXiv:1502.00166

arXiv:1502.00166 [pdf, other]

doi 10.1007/978-3-319-13123-8_11

Modelling of trends in Twitter using retweet graph dynamics

Authors: Marijn ten Thij, Tanneke Ouboter, Daniel Worm, Nelly Litvak, Hans van den Berg, Sandjai Bhulai

Abstract: In this paper we model user behaviour in Twitter to capture the emergence of trending topics. For this purpose, we first extensively analyse tweet datasets of several different events. In particular, for these datasets, we construct and investigate the retweet graphs. We find that the retweet graph for a trending topic has a relatively dense largest connected component (LCC). Next, based on the insights obtained from the analyses of the datasets, we design a mathematical model that describes the evolution of a retweet graph by three main parameters. We then quantify, analytically and by simulation, the influence of the model parameters on the basic characteristics of the retweet graph, such as the density of edges and the size and density of the LCC. Finally, we put the model in practice, estimate its parameters and compare the resulting behavior of the model to our datasets.

Submitted 31 January, 2015; originally announced February 2015.

Comments: 16 pages, 5 figures, presented at WAW 2014

Journal ref: Algorithms and Models for the Web Graph, 11th International Workshop, WAW 2014, Beijing, China, December 17-18, 2014, Proceedings pp 132-147, Lecture Notes in Computer Science, Springer

Showing 1–21 of 21 results for author: Bhulai, S