Showing 1–13 of 13 results for author: Heard, N A

  1. arXiv:2407.19236

    stat.CO stat.ML

    Approximate learning of parsimonious Bayesian context trees

    Authors: Daniyar Ghani, Nicholas A. Heard, Francesco Sanna Passino

    Abstract: Models for categorical sequences typically assume exchangeable or first-order dependent sequence elements. These are common assumptions, for example, in models of computer malware traces and protein sequences. Although such simplifying assumptions lead to computational tractability, these models fail to capture long-range, complex dependence structures that may be harnessed for greater predictive…

    Submitted 27 July, 2024; originally announced July 2024.

  2. Nested Dirichlet models for unsupervised attack pattern detection in honeypot data

    Authors: Francesco Sanna Passino, Anastasia Mantziou, Daniyar Ghani, Philip Thiede, Ross Bevington, Nicholas A. Heard

    Abstract: Cyber-systems are under near-constant threat from intrusion attempts. Attack types vary, but each attempt typically has a specific underlying intent, and the perpetrators are typically groups of individuals with similar objectives. Clustering attacks appearing to share a common intent is very valuable to threat-hunting experts. This article explores Dirichlet distribution topic models for cluster…

    Submitted 21 December, 2024; v1 submitted 6 January, 2023; originally announced January 2023.

    Journal ref: Annals of Applied Statistics 2025, Vol. 19, No. 1, 586-613

  3. arXiv:2111.05054

    stat.ME

    Changepoint detection in non-exchangeable data

    Authors: Karl L. Hallgren, Nicholas A. Heard, Niall M. Adams

    Abstract: Changepoint models typically assume the data within each segment are independent and identically distributed conditional on some parameters which change across segments. This construction may be inadequate when data are subject to local correlation patterns, often resulting in many more changepoints fitted than preferable. This article proposes a Bayesian changepoint model which relaxes the assump…

    Submitted 9 November, 2021; originally announced November 2021.

  4. Latent structure blockmodels for Bayesian spectral graph clustering

    Authors: Francesco Sanna Passino, Nicholas A. Heard

    Abstract: Spectral embedding of network adjacency matrices often produces node representations living approximately around low-dimensional submanifold structures. In particular, hidden substructure is expected to arise when the graph is generated from a latent position model. Furthermore, the presence of communities within the network might generate community-specific submanifold structures in the embedding…

    Submitted 2 January, 2022; v1 submitted 4 July, 2021; originally announced July 2021.

    Journal ref: Statistics and Computing, 32(22) (2022)

  5. Mutually exciting point process graphs for modelling dynamic networks

    Authors: Francesco Sanna Passino, Nicholas A. Heard

    Abstract: A new class of models for dynamic networks is proposed, called mutually exciting point process graphs (MEG). MEG is a scalable network-wide statistical model for point processes with dyadic marks, which can be used for anomaly detection when assessing the significance of future events, including previously unobserved connections between nodes. The model combines mutually exciting point processes t…

    Submitted 22 December, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Journal ref: Journal of Computational and Graphical Statistics, 32:1, 116-130 (2023)

  6. Changepoint detection on a graph of time series

    Authors: Karl L. Hallgren, Nicholas A. Heard, Melissa J. M. Turcotte

    Abstract: When analysing multiple time series that may be subject to changepoints, it is sometimes possible to specify a priori, by means of a graph, which pairs of time series are likely to be impacted by simultaneous changepoints. This article proposes an informative prior for changepoints which encodes the information contained in the graph, inducing a changepoint model for multiple time series that borr…

    Submitted 8 February, 2023; v1 submitted 8 February, 2021; originally announced February 2021.

    MSC Class: 62-09

  7. Spectral clustering on spherical coordinates under the degree-corrected stochastic blockmodel

    Authors: Francesco Sanna Passino, Nicholas A. Heard, Patrick Rubin-Delanchy

    Abstract: Spectral clustering is a popular method for community detection in network graphs: starting from a matrix representation of the graph, the nodes are clustered on a low dimensional projection obtained from a truncated spectral decomposition of the matrix. Estimating correctly the number of communities and the dimension of the reduced latent space is critical for good performance of spectral cluster…

    Submitted 8 September, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Journal ref: Technometrics 64(3), 346-357 (2022)

  8. Graph link prediction in computer networks using Poisson matrix factorisation

    Authors: Francesco Sanna Passino, Melissa J. M. Turcotte, Nicholas A. Heard

    Abstract: Graph link prediction is an important task in cyber-security: relationships between entities within a computer network, such as users interacting with computers, or system libraries and the corresponding processes that use them, can provide key insights into adversary behaviour. Poisson matrix factorisation (PMF) is a popular model for link prediction in large networks, particularly useful for its…

    Submitted 21 May, 2021; v1 submitted 26 January, 2020; originally announced January 2020.

    Journal ref: Annals of Applied Statistics, 16(3), 1313-1332 (2022)

  9. Link prediction in dynamic networks using random dot product graphs

    Authors: Francesco Sanna Passino, Anna S. Bertiger, Joshua C. Neil, Nicholas A. Heard

    Abstract: The problem of predicting links in large networks is an important task in a variety of practical applications, including social sciences, biology and computer security. In this paper, statistical techniques for link prediction based on the popular random dot product graph model are carefully presented, analysed and extended to dynamic settings. Motivated by a practical application in cyber-securit…

    Submitted 13 July, 2021; v1 submitted 22 December, 2019; originally announced December 2019.

    Journal ref: Data Mining and Knowledge Discovery (2021), 35(5), 2168-2199

  10. arXiv:1904.05333

    cs.SI cs.LG stat.AP stat.ML

    Bayesian estimation of the latent dimension and communities in stochastic blockmodels

    Authors: Francesco Sanna Passino, Nicholas A. Heard

    Abstract: Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing a network in a lower dimensional latent space, with optimal theoretical guarantees. The embedding can be used to estimate the community structure of the network, with strong consistency results in the stochastic blockmodel framework. One of the main practical limitations of standard al…

    Submitted 28 May, 2020; v1 submitted 6 April, 2019; originally announced April 2019.

    Journal ref: Statistics and Computing 30(5), 1291-1307 (2020)

  11. arXiv:1509.08442

    stat.AP stat.ME

    Adaptive sequential Monte Carlo for multiple changepoint analysis

    Authors: Melissa J. M. Turcotte, Nicholas A. Heard

    Abstract: Process monitoring and control requires detection of structural changes in a data stream in real time. This article introduces an efficient sequential Monte Carlo algorithm designed for learning unknown changepoints in continuous time. The method is intuitively simple: new changepoints for the latest window of data are proposed by conditioning only on data observed since the most recent estimated…

    Submitted 28 September, 2015; originally announced September 2015.

    Comments: 23 pages, 6 figures

  12. arXiv:1408.3845

    stat.ME

    A test for dependence between two point processes on the real line

    Authors: Patrick Rubin-Delanchy, Nicholas A. Heard

    Abstract: Many scientific questions rely on determining whether two sequences of event times are associated. This article introduces a likelihood ratio test which can be parameterised in several ways to detect different forms of dependence. A common finite-sample distribution is derived, and shown to be asymptotically related to a weighted Kolmogorov-Smirnov test. Analysis leading to these results also moti…

    Submitted 20 December, 2014; v1 submitted 17 August, 2014; originally announced August 2014.

    Comments: 13 pages, 4 figures

    MSC Class: 62N03; 62P35; 62P30; 62M10; 62F03; 62F05

  13. Bayesian anomaly detection methods for social networks

    Authors: Nicholas A. Heard, David J. Weston, Kiriaki Platanioti, David J. Hand

    Abstract: Learning the network structure of a large graph is computationally demanding, and dynamically monitoring the network over time for any changes in structure threatens to be more challenging still. This paper presents a two-stage method for anomaly detection in dynamic graphs: the first stage uses simple, conjugate Bayesian models for discrete time counting processes to track the pairwise links of a…

    Submitted 8 November, 2010; originally announced November 2010.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOAS329

    Report number: IMS-AOAS-AOAS329

    Journal ref: Annals of Applied Statistics 2010, Vol. 4, No. 2, 645-662