Showing 1–30 of 30 results for author: Roughan, M

  1. arXiv:2505.12806  [pdf, ps, other]

    q-fin.ST cs.NE

    Hierarchical Representations for Evolving Acyclic Vector Autoregressions (HEAVe)

    Authors: Cameron Cornell, Lewis Mitchell, Matthew Roughan

    Abstract: Causal networks offer an intuitive framework to understand influence structures within time series systems. However, the presence of cycles can obscure dynamic relationships and hinder hierarchical analysis. These networks are typically identified through multivariate predictive modelling, but enforcing acyclic constraints significantly increases computational and analytical complexity. Despite re…

    Submitted 19 May, 2025; originally announced May 2025.

  2. Evolutionary Generation of Random Surreal Numbers for Benchmarking

    Authors: Matthew Roughan

    Abstract: There are many areas of scientific endeavour where large, complex datasets are needed for benchmarking. Evolutionary computing provides a means towards creating such sets. As a case study, we consider Conway's Surreal numbers. They have largely been treated as a theoretical construct, with little effort towards empirical study, at least in part because of the difficulty of working with all but the…

    Submitted 9 April, 2025; originally announced April 2025.

    Comments: To appear in short form in Genetic and Evolutionary Computation Conference (GECCO '25), 2025

    Journal ref: Genetic and Evolutionary Computation Conference (GECCO '25), July 14–18, 2025, Malaga

  3. arXiv:2407.00939  [pdf]

    cs.NE math.OC

    Modified CMA-ES Algorithm for Multi-Modal Optimization: Incorporating Niching Strategies and Dynamic Adaptation Mechanism

    Authors: Wathsala Karunarathne, Indu Bala, Dikshit Chauhan, Matthew Roughan, Lewis Mitchell

    Abstract: This study modifies the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm for multi-modal optimization problems. The enhancements focus on addressing the challenges of multiple global minima, improving the algorithm's ability to maintain diversity and explore complex fitness landscapes. We incorporate niching strategies and dynamic adaptation mechanisms to refine the algorithm's p…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 15 pages, 1 figure, 16 tables. Submitted for GECCO 2024 competition on Benchmarking Niching Methods for Multimodal Optimization

  4. arXiv:2211.05350  [pdf, other]

    cs.IT physics.data-an

    The entropy rate of Linear Additive Markov Processes

    Authors: Bridget Smart, Matthew Roughan, Lewis Mitchell

    Abstract: This work derives a theoretical value for the entropy of a Linear Additive Markov Process (LAMP), an expressive model able to generate sequences with a given autocorrelation structure. While a first-order Markov Chain model generates new values by conditioning on the current state, the LAMP model takes the transition state from the sequence's history according to some distribution which does not h…

    Submitted 9 January, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: 9 pages, code available on Github

  5. Performance Analysis: Discovering Semi-Markov Models From Event Logs

    Authors: Anna Kalenkova, Lewis Mitchell, Matthew Roughan

    Abstract: Process mining is a well-established discipline of data analysis focused on the discovery of process models from information systems' event logs. Recently, an emerging subarea of process mining, known as stochastic process discovery, has started to evolve. Stochastic process discovery considers frequencies of events in the event data and allows for a more comprehensive analysis. In particular, whe…

    Submitted 6 March, 2025; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: This work has been accepted to IEEE Access journal

    Journal ref: IEEE Access, vol. 13, pp. 38035-38053, 2025

  6. arXiv:2205.06029  [pdf]

    physics.soc-ph cs.IT cs.SI

    Information flow estimation: a study of news on Twitter

    Authors: Tobin South, Bridget Smart, Matthew Roughan, Lewis Mitchell

    Abstract: News media has long been an ecosystem of creation, reproduction, and critique, where news outlets report on current events and add commentary to ongoing stories. Understanding the dynamics of news information creation and dispersion is important to accurately ascribe credit to influential work and understand how societal narratives develop. These dynamics can be modelled through a combination of i…

    Submitted 28 September, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Journal ref: Online Social Networks and Media, Volume 31, September 2022, 100231

  7. arXiv:2205.04210  [pdf, other]

    cs.CR

    Boolean Expressions in Firewall Analysis

    Authors: Adam Hamilton, Matthew Roughan, Giang T. Nguyen

    Abstract: Firewall policies are an important line of defence in cybersecurity, specifying which packets are allowed to pass through a network and which are not. These firewall policies are made up of a list of interacting rules. In practice, a firewall can consist of hundreds or thousands of rules. This can be very difficult for a human to correctly configure. One proposed solution is to model firewall polici…

    Submitted 3 May, 2022; originally announced May 2022.

  8. arXiv:2111.04070  [pdf, other]

    cs.DB cs.IR

    Em-K Indexing for Approximate Query Matching in Large-scale ER

    Authors: Samudra Herath, Matthew Roughan, Gary Glonek

    Abstract: Accurate and efficient entity resolution (ER) is a significant challenge in many data mining and analysis projects requiring integrating and processing massive data collections. It is becoming increasingly important in real-world applications to develop ER solutions that produce prompt responses for entity queries on large-scale databases. Some of these applications demand entity query matching ag…

    Submitted 7 November, 2021; originally announced November 2021.

    ACM Class: H.2.8

  9. arXiv:2111.04067  [pdf, other]

    cs.LG cs.DB cs.IR

    High Performance Out-of-sample Embedding Techniques for Multidimensional Scaling

    Authors: Samudra Herath, Matthew Roughan, Gary Glonek

    Abstract: The recent rapid growth of the dimension of many datasets means that many approaches to dimension reduction (DR) have gained significant attention. High-performance DR algorithms are required to make data analysis feasible for big and fast data sets. However, many traditional DR techniques are challenged by truly large data sets. In particular multidimensional scaling (MDS) does not scale well. MD…

    Submitted 7 November, 2021; originally announced November 2021.

    ACM Class: I.2.0

  10. arXiv:2110.14881  [pdf, ps, other]

    math.PR cs.IT

    Convergence of Conditional Entropy for Long Range Dependent Markov Chains

    Authors: Andrew Feutrill, Matthew Roughan

    Abstract: In this paper we consider the convergence of the conditional entropy to the entropy rate for Markov chains. Convergence of certain statistics of long range dependent processes, such as the sample mean, is slow. It has been shown in Carpio and Daley that the convergence of the $n$-step transition probabilities to the stationary distribution is slow, without quantifying the con…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: 16 pages

  11. arXiv:2105.11580  [pdf, other]

    cs.IT math.ST

    NPD Entropy: A Non-Parametric Differential Entropy Rate Estimator

    Authors: Andrew Feutrill, Matthew Roughan

    Abstract: The estimation of entropy rates for stationary discrete-valued stochastic processes is a well studied problem in information theory. However, estimating the entropy rate for stationary continuous-valued stochastic processes has not received as much attention. In fact, many current techniques are not able to accurately estimate or characterise the complexity of the differential entropy rate for str…

    Submitted 24 May, 2021; originally announced May 2021.

  12. arXiv:2102.05306  [pdf, other]

    cs.IT math.PR

    Differential Entropy Rate Characterisations of Long Range Dependent Processes

    Authors: Andrew Feutrill, Matthew Roughan

    Abstract: A quantity of interest to characterise continuous-valued stochastic processes is the differential entropy rate. The rate of convergence of many properties of LRD processes is slower than might be expected, based on the intuition for conventional processes, e.g. Markov processes. Is this also true of the entropy rate? In this paper we consider the properties of the differential entropy rate of st…

    Submitted 30 October, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

  13. arXiv:2010.09860  [pdf, other]

    math.NA cs.MS

    The Polylogarithm Function in Julia

    Authors: Matthew Roughan

    Abstract: The polylogarithm function is one of the constellation of important mathematical functions. It has a long history, and many connections to other special functions and series, and many applications, for instance in statistical physics. However, the practical aspects of its numerical evaluation have not received the type of comprehensive treatments lavished on its siblings. Only a handful of formal…

    Submitted 16 October, 2020; originally announced October 2020.

    MSC Class: 33E20; 33F05; 65B10

  14. arXiv:2009.03014  [pdf, other]

    stat.ME cs.IR cs.IT

    Simulating Name-like Vectors for Testing Large-scale Entity Resolution

    Authors: Samudra Herath, Matthew Roughan, Gary Glonek

    Abstract: Accurate and efficient entity resolution (ER) has been a problem in data analysis and data mining projects for decades. In our work, we are interested in developing ER methods to handle big data. Good public datasets are restricted in this area and usually small in size. Simulation is one technique for generating datasets for testing. Existing simulation tools have problems of complexity, scalabil…

    Submitted 7 September, 2020; originally announced September 2020.

  15. Popularity and Centrality in Spotify Networks: Critical transitions in eigenvector centrality

    Authors: Tobin South, Matthew Roughan, Lewis Mitchell

    Abstract: The modern age of digital music access has increased the availability of data about music consumption and creation, facilitating the large-scale analysis of the complex networks that connect music together. Data about user streaming behaviour, and the musical collaboration networks are particularly important with new data-driven recommendation systems. Without thorough analysis, such collaboration…

    Submitted 29 August, 2021; v1 submitted 26 August, 2020; originally announced August 2020.

    Journal ref: Journal of Complex Networks, Volume 8, Issue 6, 1 December 2020, cnaa050

  16. arXiv:1908.09996  [pdf, other]

    math.CO cs.DM

    Counting Candy Crush Configurations

    Authors: Adam Hamilton, Giang T. Nguyen, Matthew Roughan

    Abstract: A k-stable c-coloured Candy Crush grid is a weak proper c-colouring of a particular type of k-uniform hypergraph. In this paper we introduce a fully polynomial randomised approximation scheme (FPRAS) which counts the number of k-stable c-coloured Candy Crush grids of a given size (m, n) for certain values of c and k. We implemented this algorithm in Matlab, and found that in a Candy Crush grid wit…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: 19 pages, 3 figures

  17. arXiv:1908.03318  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    Bayesian inference of network structure from information cascades

    Authors: Caitlin Gray, Lewis Mitchell, Matthew Roughan

    Abstract: Contagion processes are strongly linked to the network structures on which they propagate, and learning these structures is essential for understanding and intervention on complex network processes such as epidemics and (mis)information propagation. However, using contagion data to infer network structure is a challenging inverse problem. In particular, it is imperative to have appropriate measure…

    Submitted 9 August, 2019; originally announced August 2019.

  18. arXiv:1906.08403  [pdf, other]

    cs.SI physics.soc-ph

    How the Avengers assemble: Ecological modelling of effective cast sizes for movies

    Authors: Matthew Roughan, Lewis Mitchell, Tobin South

    Abstract: The number of characters in a movie is an interesting feature. However, it is non-trivial to measure directly. Naive metrics such as the number of credited characters vary wildly. Here, we show that a metric based on the notion of "ecological diversity" as expressed through a Shannon-entropy based metric can characterise the number of characters in a movie, and is useful in taxonomic classificatio…

    Submitted 19 June, 2019; originally announced June 2019.

  19. arXiv:1902.05689  [pdf, other]

    cs.CR

    ForestFirewalls: Getting Firewall Configuration Right in Critical Networks (Technical Report)

    Authors: Dinesha Ranathunga, Matthew Roughan, Paul Tune, Phil Kernick, Nick Falkner

    Abstract: Firewall configuration is critical, yet often conducted manually with inevitable errors, leaving networks vulnerable to cyber attack [40]. The impact of misconfigured firewalls can be catastrophic in Supervisory Control and Data Acquisition (SCADA) networks. These networks control the distributed assets of industrial systems such as power generation and water distribution systems. Automation can m…

    Submitted 15 February, 2019; originally announced February 2019.

  20. Verifying and Monitoring IoTs Network Behavior using MUD Profiles

    Authors: Ayyoob Hamza, Dinesha Ranathunga, Hassan Habibi Gharakheili, Theophilus A. Benson, Matthew Roughan, Vijay Sivaraman

    Abstract: IoT devices are increasingly being implicated in cyber-attacks, raising community concern about the risks they pose to critical infrastructure, corporations, and citizens. In order to reduce this risk, the IETF is pushing IoT vendors to develop formal specifications of the intended purpose of their IoT devices, in the form of a Manufacturer Usage Description (MUD), so that their network behavior i…

    Submitted 7 February, 2019; originally announced February 2019.

    Comments: 17 pages, 17 figures. arXiv admin note: text overlap with arXiv:1804.04358

  21. arXiv:1811.01467  [pdf, other]

    cs.SI physics.soc-ph stat.AP

    The one comparing narrative social network extraction techniques

    Authors: Michelle Edwards, Lewis Mitchell, Jonathan Tuke, Matthew Roughan

    Abstract: Analysing narratives through their social networks is an expanding field in quantitative literary studies. Manually extracting a social network from any narrative can be time consuming, so automatic extraction methods of varying complexity have been developed. However, the effect of different extraction methods on the analysis is unknown. Here we model and compare three extraction methods for soci…

    Submitted 4 November, 2018; originally announced November 2018.

  22. arXiv:1806.11276  [pdf, other]

    cs.SI cs.DS stat.CO

    Generating Connected Random Graphs

    Authors: Caitlin Gray, Lewis Mitchell, Matthew Roughan

    Abstract: Sampling random graphs is essential in many applications, and often algorithms use Markov chain Monte Carlo methods to sample uniformly from the space of graphs. However, often there is a need to sample graphs with some property that we are unable, or it is too inefficient, to sample using standard approaches. In this paper, we are interested in sampling graphs from a conditional ensemble of the u…

    Submitted 25 October, 2018; v1 submitted 29 June, 2018; originally announced June 2018.

    Comments: Added references, Expanded Implementation - same conclusions

  23. Clear as MUD: Generating, Validating and Applying IoT Behaviorial Profiles (Technical Report)

    Authors: Ayyoob Hamza, Dinesha Ranathunga, H. Habibi Gharakheili, Matthew Roughan, Vijay Sivaraman

    Abstract: IoT devices are increasingly being implicated in cyber-attacks, driving community concern about the risks they pose to critical infrastructure, corporations, and citizens. In order to reduce this risk, the IETF is pushing IoT vendors to develop formal specifications of the intended purpose of their IoT devices, in the form of a Manufacturer Usage Description (MUD), so that their network behavior i…

    Submitted 12 April, 2018; originally announced April 2018.

  24. arXiv:1802.05039  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    Super-blockers and the effect of network structure on information cascades

    Authors: Caitlin Gray, Lewis Mitchell, Matthew Roughan

    Abstract: Modelling information cascades over online social networks is important in fields from marketing to civil unrest prediction, however the underlying network structure strongly affects the probability and nature of such cascades. Even with simple cascade dynamics the probability of large cascades are almost entirely dictated by network properties, with well-known networks such as Erdos-Renyi and Bar…

    Submitted 21 March, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

  25. arXiv:1706.02813  [pdf, ps, other]

    cs.NI stat.AP

    Rigorous statistical analysis of HTTPS reachability

    Authors: George Michaelson, Matthew Roughan, Jonathan Tuke, Matt P. Wand, Randy Bush

    Abstract: The use of secure connections using HTTPS as the default means, or even the only means, to connect to web servers is increasing. It is being pushed from both sides: from the bottom up by client distributions and plugins, and from the top down by organisations such as Google. However, there are potential technical hurdles that might lock some clients out of the modern web. This paper seeks to measu…

    Submitted 8 June, 2017; originally announced June 2017.

    MSC Class: 62P30

    ACM Class: C.2.2; C.2.3; C.2.6

  26. arXiv:1605.09115  [pdf, other]

    cs.CR

    The Mathematical Foundations for Mapping Policies to Network Devices (Technical Report)

    Authors: Dinesha Ranathunga, Matthew Roughan, Phil Kernick, Nick Falkner

    Abstract: A common requirement in policy specification languages is the ability to map policies to the underlying network devices. Doing so, in a provably correct way, is important in a security policy context, so administrators can be confident of the level of protection provided by the policies for their networks. Existing policy languages allow policy composition but lack formal semantics to allocate pol…

    Submitted 30 May, 2016; originally announced May 2016.

  27. arXiv:1512.03532  [pdf, other]

    cs.DS cs.SI

    Fast Generation of Spatially Embedded Random Networks

    Authors: Eric Parsonage, Matthew Roughan

    Abstract: Spatially Embedded Random Networks such as the Waxman random graph have been used in a variety of settings for synthesizing networks. However, little thought has been put into fast generation of these networks. Existing techniques are $O(n^2)$ where $n$ is the number of nodes in the graph. In this paper we present an $O(n + e)$ algorithm, where $e$ is the number of edges.

    Submitted 11 December, 2015; originally announced December 2015.

  28. arXiv:1512.00877  [pdf, other]

    stat.ME cs.SI

    All networks look the same to me: Testing for homogeneity in networks

    Authors: Jonathan Tuke, Matthew Roughan

    Abstract: How can researchers test for heterogeneity in the local structure of a network? In this paper, we present a framework that utilizes random sampling to give subgraphs which are then used in a goodness of fit test to test for heterogeneity. We illustrate how to use the goodness of fit test for an analytically derived distribution as well as an empirical distribution. To demonstrate our framework, we…

    Submitted 2 December, 2015; originally announced December 2015.

    Comments: 19 pages, 7 figures, 5 tables

  29. arXiv:1503.02781  [pdf, other]

    cs.DB cs.IR

    Unravelling Graph-Exchange File Formats

    Authors: Matthew Roughan, Jonathan Tuke

    Abstract: A graph is used to represent data in which the relationships between the objects in the data are at least as important as the objects themselves. Over the last two decades nearly a hundred file formats have been proposed or used to provide portable access to such data. This paper seeks to review these formats, and provide some insight to both reduce the ongoing creation of unnecessary formats, and…

    Submitted 10 March, 2015; originally announced March 2015.

  30. arXiv:1305.0321  [pdf, ps, other]

    cs.IT

    Hidden Markov Model Identifiability via Tensors

    Authors: Paul Tune, Hung X. Nguyen, Matthew Roughan

    Abstract: The prevalence of hidden Markov models (HMMs) in various applications of statistical signal processing and communications is a testament to the power and flexibility of the model. In this paper, we link the identifiability problem with tensor decomposition, in particular, the Canonical Polyadic decomposition. Using recent results in deriving uniqueness conditions for tensor decomposition, we are a…

    Submitted 1 May, 2013; originally announced May 2013.

    Comments: Accepted to ISIT 2013. 5 pages, no figures