-
A Survey on Archetypal Analysis
Authors:
Aleix Alcacer,
Irene Epifanio,
Sebastian Mair,
Morten Mørup
Abstract:
Archetypal analysis (AA) was originally proposed in 1994 by Adele Cutler and Leo Breiman as a computational procedure to extract the distinct aspects called archetypes in observations with each observational record approximated as a mixture (i.e., convex combination) of these archetypes. AA thereby provides straightforward, interpretable, and explainable representations for feature extraction and dimensionality reduction, facilitating the understanding of the structure of high-dimensional data with wide applications throughout the sciences. However, AA also faces challenges, particularly as the associated optimization problem is non-convex. This survey provides researchers and data mining practitioners with an overview of the methodologies and opportunities that AA has to offer, surveying the many applications of AA across disparate fields of science as well as best practices and limitations when modeling data using AA. The survey concludes by explaining important future research directions concerning AA.
Submitted 16 April, 2025;
originally announced April 2025.
-
How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings
Authors:
Nikolaos Nakis,
Niels Raunkjær Holm,
Andreas Lyhne Fiehn,
Morten Mørup
Abstract:
Low-dimensional embeddings are essential for machine learning tasks involving graphs, such as node classification, link prediction, community detection, network visualization, and network compression. Although recent studies have identified exact low-dimensional embeddings, the limits of the required embedding dimensions remain unclear. We presently prove that lower dimensional embeddings are possible when using Euclidean metric embeddings as opposed to vector-based Logistic PCA (LPCA) embeddings. In particular, we provide an efficient logarithmic search procedure for identifying the exact embedding dimension and demonstrate how metric embeddings enable inference of the exact embedding dimensions of large-scale networks by exploiting that the metric properties can be used to provide linearithmic scaling. Empirically, we show that our approach extracts substantially lower dimensional representations of networks than previously reported for small-sized networks. For the first time, we demonstrate that even large-scale networks can be effectively embedded in very low-dimensional spaces, and provide examples of scalable, exact reconstruction for graphs with up to a million nodes. Our approach highlights that the intrinsic dimensionality of networks is substantially lower than previously reported and provides a computationally efficient assessment of the exact embedding dimension also of large-scale networks. The surprisingly low dimensional representations achieved demonstrate that networks in general can be losslessly represented using very low dimensional feature spaces, which can be used to guide existing network analysis tasks from community detection and node classification to structure revealing exact network visualizations.
Submitted 3 March, 2025;
originally announced March 2025.
-
Archetypal Analysis for Binary Data
Authors:
A. Emilie J. Wedenborg,
Morten Mørup
Abstract:
Archetypal analysis (AA) is a matrix decomposition method that identifies distinct patterns using convex combinations of the data points denoted archetypes with each data point in turn reconstructed as convex combinations of the archetypes. AA thereby forms a polytope representing trade-offs of the distinct aspects in the data. Most existing methods for AA are designed for continuous data and do not exploit the structure of the data distribution. In this paper, we propose two new optimization frameworks for archetypal analysis for binary data. i) A second order approximation of the AA likelihood based on the Bernoulli distribution with efficient closed-form updates using an active set procedure for learning the convex combinations defining the archetypes, and a sequential minimal optimization strategy for learning the observation specific reconstructions. ii) A Bernoulli likelihood based version of the principal convex hull analysis (PCHA) algorithm originally developed for least squares optimization. We compare these approaches with the only existing binary AA procedure relying on multiplicative updates and demonstrate their superiority on both synthetic and real binary data. Notably, the proposed optimization frameworks for AA can easily be extended to other data distributions providing generic efficient optimization frameworks for AA based on tailored likelihood functions reflecting the underlying data distribution.
Submitted 6 February, 2025;
originally announced February 2025.
-
SepMamba: State-space models for speaker separation using Mamba
Authors:
Thor Højhus Avenstrup,
Boldizsár Elek,
István László Mádi,
András Bence Schin,
Morten Mørup,
Bjørn Sand Jensen,
Kenny Falkær Olsen
Abstract:
Deep learning-based single-channel speaker separation has improved significantly in recent years largely due to the introduction of the transformer-based attention mechanism. However, these improvements come at the expense of intense computational demands, precluding their use in many practical applications. As a computationally efficient alternative with similar modeling capabilities, Mamba was recently introduced. We propose SepMamba, a U-Net-based architecture composed primarily of bidirectional Mamba layers. We find that our approach outperforms similarly-sized prominent models - including transformer-based models - on the WSJ0 2-speaker dataset while enjoying a significant reduction in computational cost, memory usage, and forward pass time. We additionally report strong results for causal variants of SepMamba. Our approach provides a computationally favorable alternative to transformer-based architectures for deep speech separation.
Submitted 28 October, 2024;
originally announced October 2024.
-
Modeling Human Responses by Ordinal Archetypal Analysis
Authors:
Anna Emilie J. Wedenborg,
Michael Alexander Harborg,
Andreas Bigom,
Oliver Elmgreen,
Marcus Presutti,
Andreas Råskov,
Fumiko Kano Glückstad,
Mikkel Schmidt,
Morten Mørup
Abstract:
This paper introduces a novel framework for Archetypal Analysis (AA) tailored to ordinal data, particularly from questionnaires. Unlike existing methods, the proposed method, Ordinal Archetypal Analysis (OAA), bypasses the two-step process of transforming ordinal data into continuous scales and operates directly on the ordinal data. We extend traditional AA methods to handle the subjective nature of questionnaire-based data, acknowledging individual differences in scale perception. We introduce the Response Bias Ordinal Archetypal Analysis (RBOAA), which learns individualized scales for each subject during optimization. The effectiveness of these methods is demonstrated on synthetic data and the European Social Survey dataset, highlighting their potential to provide deeper insights into human behavior and perception. The study underscores the importance of considering response bias in cross-national research and offers a principled approach to analyzing ordinal data through Archetypal Analysis.
Submitted 12 September, 2024;
originally announced September 2024.
-
Evaluating the Influence of Temporal Context on Automatic Mouse Sleep Staging through the Application of Human Models
Authors:
Javier García Ciudad,
Morten Mørup,
Birgitte Rahbek Kornum,
Alexander Neergaard Zahid
Abstract:
In human sleep staging models, augmenting the temporal context of the input to the range of tens of minutes has recently demonstrated performance improvement. In contrast, the temporal context of mouse sleep staging models is typically in the order of tens of seconds. While long-term time patterns are less clear in mouse sleep, increasing the temporal context further than that of the current mouse sleep staging models might still result in a performance increase, given that the current methods only model very short term patterns. In this study, we examine the influence of increasing the temporal context in mouse sleep staging up to 15 minutes in three mouse cohorts using two recent and high-performing human sleep staging models that account for long-term dependencies. These are compared to two prominent mouse sleep staging models that use a local context of 12 s and 20 s, respectively. An increase in context up to 28 s is observed to have a positive impact on sleep stage classification performance, especially in REM sleep. However, the impact is limited for longer context windows. One of the human sleep scoring models, L-SeqSleepNet, outperforms both mouse models in all cohorts. This suggests that mouse sleep staging can benefit from more temporal context than currently used.
Submitted 6 June, 2024;
originally announced June 2024.
-
E$^2$M: Double Bounded $α$-Divergence Optimization for Tensor-based Discrete Density Estimation
Authors:
Kazu Ghalamkari,
Jesper Løve Hinrich,
Morten Mørup
Abstract:
Tensor-based discrete density estimation requires flexible modeling and proper divergence criteria to enable effective learning; however, traditional approaches using $α$-divergence face analytical challenges due to the $α$-power terms in the objective function, which hinder the derivation of closed-form update rules. We present a generalization of the expectation-maximization (EM) algorithm, called the E$^2$M algorithm. It circumvents this issue by first relaxing the optimization into minimization of a surrogate objective based on the Kullback-Leibler (KL) divergence, which is tractable via the standard EM algorithm, and subsequently applying a tensor many-body approximation in the M-step to enable simultaneous closed-form updates of all parameters. Our approach offers flexible modeling for not only a variety of low-rank structures, including the CP, Tucker, and Tensor Train formats, but also their mixtures, thus allowing us to leverage the strengths of different low-rank structures. We demonstrate the effectiveness of our approach in classification and density estimation tasks.
Submitted 23 May, 2025; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Coupled generator decomposition for fusion of electro- and magnetoencephalography data
Authors:
Anders Stevnhoved Olsen,
Jesper Duemose Nielsen,
Morten Mørup
Abstract:
Data fusion modeling can identify common features across diverse data sources while accounting for source-specific variability. Here we introduce the concept of a coupled generator decomposition and demonstrate how it generalizes sparse principal component analysis (SPCA) for data fusion. Leveraging data from a multisubject, multimodal (electro- and magnetoencephalography (EEG and MEG)) neuroimaging experiment, we demonstrate the efficacy of the framework in identifying common features in response to face perception stimuli, while accommodating modality- and subject-specific variability. Through split-half cross-validation of EEG/MEG trials, we investigate the optimal model order and regularization strengths for models of varying complexity, comparing these to a group-level model assuming shared brain responses to stimuli. Our findings reveal altered $\sim$170 ms fusiform face area activation for scrambled faces, as opposed to real faces, particularly evident in the multimodal, multisubject model. Model parameters were inferred using stochastic optimization in PyTorch, demonstrating comparable performance to conventional quadratic programming inference for SPCA but with considerably faster execution. We provide an easily accessible toolbox for coupled generator decomposition that includes data fusion for SPCA, archetypal analysis and directional archetypal analysis. Overall, our approach offers a promising new avenue for data fusion.
Submitted 2 March, 2024;
originally announced March 2024.
-
Time to Cite: Modeling Citation Networks using the Dynamic Impact Single-Event Embedding Model
Authors:
Nikolaos Nakis,
Abdulkadir Celikkanat,
Louis Boucherie,
Sune Lehmann,
Morten Mørup
Abstract:
Understanding the structure and dynamics of scientific research, i.e., the science of science (SciSci), has become an important area of research in order to address imminent questions including how scholars interact to advance science, how disciplines are related and evolve, and how research impact can be quantified and predicted. Central to the study of SciSci has been the analysis of citation networks. Here, two prominent modeling methodologies have been employed: one is to assess the citation impact dynamics of papers using parametric distributions, and the other is to embed the citation networks in a latent space optimal for characterizing the static relations between papers in terms of their citations. Interestingly, citation networks are a prominent example of single-event dynamic networks, i.e., networks for which each dyad only has a single event (i.e., the point in time of citation). We presently propose a novel likelihood function for the characterization of such single-event networks. Using this likelihood, we propose the Dynamic Impact Single-Event Embedding model (DISEE). The DISEE model characterizes the scientific interactions in terms of a latent distance model in which random effects account for citation heterogeneity while the time-varying impact is characterized using existing parametric representations for assessment of dynamic impact. We highlight the proposed approach on several real citation networks finding that the DISEE well reconciles static latent distance network embedding approaches with classical dynamic impact assessments.
Submitted 28 February, 2024;
originally announced March 2024.
-
Continuous-time Graph Representation with Sequential Survival Process
Authors:
Abdulkadir Celikkanat,
Nikolaos Nakis,
Morten Mørup
Abstract:
Over the past two decades, there has been a tremendous increase in the growth of representation learning methods for graphs, with numerous applications across various fields, including bioinformatics, chemistry, and the social sciences. However, current dynamic network approaches focus on discrete-time networks or treat links in continuous-time networks as instantaneous events. Therefore, these approaches have limitations in capturing the persistence or absence of links that continuously emerge and disappear over time for particular durations. To address this, we propose a novel stochastic process relying on survival functions to model the durations of links and their absences over time. This forms a generic new likelihood specification explicitly accounting for intermittent edge-persistent networks, namely GraSSP: Graph Representation with Sequential Survival Process. We apply the developed framework to a recent continuous time dynamic latent distance model characterizing network dynamics in terms of a sequence of piecewise linear movements of nodes in latent space. We quantitatively assess the developed framework in various downstream tasks, such as link prediction and network completion, demonstrating that the developed modeling framework accounting for link persistence and absence well tracks the intrinsic trajectories of nodes in a latent space and captures the underlying characteristics of evolving network structure.
Submitted 20 December, 2023;
originally announced December 2023.
-
CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion
Authors:
Anders Vestergaard Nørskov,
Alexander Neergaard Zahid,
Morten Mørup
Abstract:
Electroencephalography (EEG) is a prominent non-invasive neuroimaging technique providing insights into brain function. Unfortunately, EEG data exhibit a high degree of noise and variability across subjects hampering generalizable signal extraction. Therefore, a key aim in EEG analysis is to extract the underlying neural activation (content) as well as to account for the individual subject variability (style). We hypothesize that the ability to convert EEG signals between tasks and subjects requires the extraction of latent representations accounting for content and style. Inspired by recent advancements in voice conversion technologies, we propose a novel contrastive split-latent permutation autoencoder (CSLP-AE) framework that directly optimizes for EEG conversion. Importantly, the latent representations are guided using contrastive learning to promote the latent splits to explicitly represent subject (style) and task (content). We contrast CSLP-AE to conventional supervised, unsupervised (AE), and self-supervised (contrastive learning) training and find that the proposed approach provides favorable generalizable characterizations of subject and task. Importantly, the procedure also enables zero-shot conversion between unseen subjects. While the present work only considers conversion of EEG, the proposed CSLP-AE provides a general framework for signal conversion and extraction of content (task activation) and style (subject variability) components of general interest for the modeling and analysis of biological signals.
Submitted 13 November, 2023;
originally announced November 2023.
-
Probabilistic Block Term Decomposition for the Modelling of Higher-Order Arrays
Authors:
Jesper Løve Hinrich,
Morten Mørup
Abstract:
Tensors are ubiquitous in science and engineering and tensor factorization approaches have become important tools for the characterization of higher order structure. Factorizations include the outer-product rank Canonical Polyadic Decomposition (CPD) as well as the multi-linear rank Tucker decomposition in which the Block-Term Decomposition (BTD) is a structured intermediate interpolating between these two representations. Whereas CPD, Tucker, and BTD have traditionally relied on maximum-likelihood estimation, Bayesian inference has been used to form probabilistic CPD and Tucker. We propose an efficient variational Bayesian probabilistic BTD, which uses the von-Mises Fisher matrix distribution to impose orthogonality in the multi-linear Tucker parts forming the BTD. On synthetic and two real datasets, we highlight the Bayesian inference procedure and demonstrate using the proposed pBTD on noisy data and for model order quantification. We find that the probabilistic BTD can quantify suitable multi-linear structures providing a means for robust inference of patterns in multi-linear data.
Submitted 4 October, 2023;
originally announced October 2023.
-
A Hybrid Membership Latent Distance Model for Unsigned and Signed Integer Weighted Networks
Authors:
Nikolaos Nakis,
Abdulkadir Çelikkanat,
Morten Mørup
Abstract:
Graph representation learning (GRL) has become a prominent tool for furthering the understanding of complex networks providing tools for network embedding, link prediction, and node classification. In this paper, we propose the Hybrid Membership-Latent Distance Model (HM-LDM) by exploring how a Latent Distance Model (LDM) can be constrained to a latent simplex. By controlling the edge lengths of the corners of the simplex, the volume of the latent space can be systematically controlled. Thereby communities are revealed as the space becomes more constrained, with hard memberships being recovered as the simplex volume goes to zero. We further explore a recent likelihood formulation for signed networks utilizing the Skellam distribution to account for signed weighted networks and extend the HM-LDM to the signed Hybrid Membership-Latent Distance Model (sHM-LDM). Importantly, the induced likelihood function explicitly attracts nodes with positive links and deters nodes from having negative interactions. We demonstrate the utility of HM-LDM and sHM-LDM on several real networks. We find that the procedures successfully identify prominent distinct structures, as well as how nodes relate to the extracted aspects providing favorable performances in terms of link prediction when compared to prominent baselines. Furthermore, the learned soft memberships enable easily interpretable network visualizations highlighting distinct patterns.
Submitted 29 August, 2023;
originally announced August 2023.
-
Characterizing Polarization in Social Networks using the Signed Relational Latent Distance Model
Authors:
Nikolaos Nakis,
Abdulkadir Çelikkanat,
Louis Boucherie,
Christian Djurhuus,
Felix Burmester,
Daniel Mathias Holmelund,
Monika Frolcová,
Morten Mørup
Abstract:
Graph representation learning has become a prominent tool for the characterization and understanding of the structure of networks in general and social networks in particular. Typically, these representation learning approaches embed the networks into a low-dimensional space in which the role of each individual can be characterized in terms of their latent position. A major current concern in social networks is the emergence of polarization and filter bubbles promoting a mindset of "us-versus-them" that may be defined by extreme positions believed to ultimately lead to political violence and the erosion of democracy. Such polarized networks are typically characterized in terms of signed links reflecting likes and dislikes. We propose the latent Signed relational Latent dIstance Model (SLIM) utilizing for the first time the Skellam distribution as a likelihood function for signed networks and extend the modeling to the characterization of distinct extreme positions by constraining the embedding space to polytopes. On four real social signed networks of polarization, we demonstrate that the model extracts low-dimensional characterizations that well predict friendships and animosity while providing interpretable visualizations defined by extreme positions when endowing the model with an embedding space restricted to polytopes.
Submitted 3 March, 2023; v1 submitted 23 January, 2023;
originally announced January 2023.
-
Piecewise-Velocity Model for Learning Continuous-time Dynamic Node Representations
Authors:
Abdulkadir Çelikkanat,
Nikolaos Nakis,
Morten Mørup
Abstract:
Networks have become indispensable and ubiquitous structures in many fields to model the interactions among different entities, such as friendship in social networks or protein interactions in biological graphs. A major challenge is to understand the structure and dynamics of these systems. Although networks evolve through time, most existing graph representation learning methods target only static networks. Whereas approaches have been developed for the modeling of dynamic networks, there is a lack of efficient continuous time dynamic graph representation learning methods that can provide accurate network characterization and visualization in low dimensions while explicitly accounting for prominent network characteristics such as homophily and transitivity. In this paper, we propose the Piecewise-Velocity Model (PiVeM) for the representation of continuous-time dynamic networks. It learns dynamic embeddings in which the temporal evolution of nodes is approximated by piecewise linear interpolations based on a latent distance model with piecewise constant node-specific velocities. The model allows for analytically tractable expressions of the associated Poisson process likelihood with scalable inference invariant to the number of events. We further impose a scalable Kronecker structured Gaussian Process prior on the dynamics, accounting for community structure, temporal smoothness, and disentangled (uncorrelated) latent embedding dimensions optimally learned to characterize the network dynamics. We show that PiVeM can successfully represent network structure and dynamics in ultra-low two-dimensional spaces. It outperforms relevant state-of-the-art methods in downstream tasks such as link prediction. In summary, PiVeM enables easily interpretable dynamic network visualizations and characterizations that can further improve our understanding of the intrinsic dynamics of time-evolving networks.
Submitted 23 December, 2022;
originally announced December 2022.
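The piecewise linear node trajectories described in the abstract above can be sketched as follows. This is an illustrative reading of "piecewise constant node-specific velocities", not the authors' code: the function name `position_at`, the use of equally wide time bins, and the array layout are all assumptions made for this example.

```python
import numpy as np


def position_at(t: float, x0: np.ndarray, velocities: np.ndarray,
                bin_width: float) -> np.ndarray:
    """Latent position of one node at time t under piecewise constant velocity.

    x0:         (D,) initial position at time 0
    velocities: (B, D) constant velocity within each of B time bins
    bin_width:  width of every bin (equal widths are an assumption here)
    """
    n_bins = velocities.shape[0]
    if not 0.0 <= t <= n_bins * bin_width:
        raise ValueError("t lies outside the modeled time interval")
    full = min(int(t // bin_width), n_bins)    # number of fully elapsed bins
    # Displacement accumulated over the fully elapsed bins.
    x = x0 + bin_width * velocities[:full].sum(axis=0)
    if full < n_bins:
        # Partial displacement inside the current bin.
        x = x + (t - full * bin_width) * velocities[full]
    return x


x0 = np.array([0.0, 0.0])
velocities = np.array([[1.0, 0.0],    # bin 0: move along the first axis
                       [0.0, 2.0]])   # bin 1: move along the second axis
p = position_at(1.5, x0, velocities, bin_width=1.0)
```

Integrating a piecewise constant velocity gives a continuous, piecewise linear path, which is why positions at arbitrary times reduce to a cumulative sum plus one partial step.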
-
HM-LDM: A Hybrid-Membership Latent Distance Model
Authors:
Nikolaos Nakis,
Abdulkadir Çelikkanat,
Morten Mørup
Abstract:
A central aim of modeling complex networks is to accurately embed networks in order to detect structures and predict link and node properties. The latent space models (LSM) have become prominent frameworks for embedding networks and include the latent distance (LDM) and eigenmodel (LEM) as the most widely used LSM specifications. For latent community detection, the embedding space in LDMs has been endowed with a clustering model whereas LEMs have been constrained to part-based non-negative matrix factorization (NMF) inspired representations promoting community discovery. We presently reconcile LSMs with latent community detection by constraining the LDM representation to the D-simplex forming the hybrid-membership latent distance model (HM-LDM). We show that for sufficiently large simplex volumes this can be achieved without loss of expressive power whereas by extending the model to squared Euclidean distances, we recover the LEM formulation with constraints promoting part-based representations akin to NMF. Importantly, by systematically reducing the volume of the simplex, the model becomes unique and ultimately leads to hard assignments of nodes to simplex corners. We demonstrate experimentally how the proposed HM-LDM admits accurate node representations in regimes ensuring identifiability and valid community extraction. Importantly, HM-LDM naturally reconciles soft and hard community detection with network embeddings exploring a simple continuous optimization procedure on a volume constrained simplex that admits the systematic investigation of trade-offs between hard and mixed membership community detection.
Submitted 20 July, 2022; v1 submitted 7 June, 2022;
originally announced June 2022.
-
A Hierarchical Block Distance Model for Ultra Low-Dimensional Graph Representations
Authors:
Nikolaos Nakis,
Abdulkadir Çelikkanat,
Sune Lehmann Jørgensen,
Morten Mørup
Abstract:
Graph Representation Learning (GRL) has become central for characterizing structures of complex networks and performing tasks such as link prediction, node classification, network reconstruction, and community detection. Whereas numerous generative GRL models have been proposed, many approaches have prohibitive computational requirements hampering large-scale network analysis, fewer are able to explicitly account for structure emerging at multiple scales, and only a few explicitly respect important network properties such as homophily and transitivity. This paper proposes a novel scalable graph representation learning method named the Hierarchical Block Distance Model (HBDM). The HBDM imposes a multiscale block structure akin to stochastic block modeling (SBM) and accounts for homophily and transitivity by accurately approximating the latent distance model (LDM) throughout the inferred hierarchy. The HBDM naturally accommodates unipartite, directed, and bipartite networks whereas the hierarchy is designed to ensure linearithmic time and space complexity enabling the analysis of very large-scale networks. We evaluate the performance of the HBDM on massive networks consisting of millions of nodes. Importantly, we find that the proposed HBDM framework significantly outperforms recent scalable approaches in all considered downstream tasks. Surprisingly, we observe superior performance even imposing ultra-low two-dimensional embeddings facilitating accurate direct and hierarchical-aware network visualization and interpretation.
Submitted 9 August, 2023; v1 submitted 12 April, 2022;
originally announced April 2022.
-
Short Term Blood Glucose Prediction based on Continuous Glucose Monitoring Data
Authors:
Ali Mohebbi,
Alexander R. Johansen,
Nicklas Hansen,
Peter E. Christensen,
Jens M. Tarp,
Morten L. Jensen,
Henrik Bengtsson,
Morten Mørup
Abstract:
Continuous Glucose Monitoring (CGM) has enabled important opportunities for diabetes management. This study explores the use of CGM data as input for digital decision support tools. We investigate how Recurrent Neural Networks (RNNs) can be used for Short Term Blood Glucose (STBG) prediction and compare the RNNs to conventional time-series forecasting using Autoregressive Integrated Moving Average (ARIMA). A prediction horizon up to 90 min into the future is considered. In this context, we evaluate both population-based and patient-specific RNNs and contrast them to patient-specific ARIMA models and a simple baseline predicting future observations as the last observed. We find that the population-based RNN model is the best performing model across the considered prediction horizons without the need of patient-specific data. This demonstrates the potential of RNNs for STBG prediction in diabetes patients towards detecting/mitigating severe events in the STBG, in particular hypoglycemic events. However, further studies are needed in regards to the robustness and practical use of the investigated STBG prediction models.
Submitted 15 July, 2020; v1 submitted 6 February, 2020;
originally announced February 2020.
-
Probabilistic PARAFAC2
Authors:
Philip J. H. Jørgensen,
Søren F. V. Nielsen,
Jesper L. Hinrich,
Mikkel N. Schmidt,
Kristoffer H. Madsen,
Morten Mørup
Abstract:
The PARAFAC2 is a multimodal factor analysis model suitable for analyzing multi-way data when one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of the PARAFAC2 is desirable in order to improve robustness to noise and provide a well founded principle for determining the number of factors, but challenging because the factor loadings are constrained to be orthogonal. We develop two probabilistic formulations of the PARAFAC2 along with variational procedures for inference: In the one approach, the mean values of the factor loadings are orthogonal leading to closed form variational updates, and in the other, the factor loadings themselves are orthogonal using a matrix Von Mises-Fisher distribution. We contrast our probabilistic formulation to the conventional direct fitting algorithm based on maximum likelihood. On simulated data and real fluorescence spectroscopy and gas chromatography-mass spectrometry data, we compare our approach to the conventional PARAFAC2 model estimation and find that the probabilistic formulation is more robust to noise and model order misspecification. The probabilistic PARAFAC2 thus forms a promising framework for modeling multi-way data accounting for uncertainty.
Submitted 21 June, 2018;
originally announced June 2018.
-
Crowds, Bluetooth, and Rock-n-Roll. Understanding Music Festival Participant Behavior
Authors:
Jakob Eg Larsen,
Piotr Sapiezynski,
Arkadiusz Stopczynski,
Morten Moerup,
Rasmus Theodorsen
Abstract:
In this paper we present a study of sensing and analyzing an offline social network of participants at a large-scale music festival (8 days, 130,000+ participants). We place 33 fixed-location Bluetooth scanners in strategic spots around the festival area to discover Bluetooth-enabled mobile phones carried by the participants, and thus collect spatio-temporal traces of their mobility and interactions. We subsequently analyze the data on two levels. On the micro level, we run a community detection algorithm to reveal a variety of groups the festival participants form. On the macro level, we employ an Infinite Relational Model (IRM) in order to recover the structure of the social network related to participants' music preferences. The obtained structure in the form of clusters of concerts and participants is then interpreted using meta-information about music genres, band origins, stages, and dates of performances. We show that most of the concerts clusters can be described by one or more of the meta-features, effectively revealing preferences of participants (e.g. a cluster of US bands) and discuss the significance of the findings and the potential and limitations of the used method. Finally, we discuss the possibility of employing the described method and techniques for creating user-oriented applications and extending the sensing capabilities during large-scale events by introducing user involvement.
Submitted 14 June, 2013; v1 submitted 13 June, 2013;
originally announced June 2013.
-
Infinite Multiple Membership Relational Modeling for Complex Networks
Authors:
Morten Mørup,
Mikkel N. Schmidt,
Lars Kai Hansen
Abstract:
Learning latent structure in complex networks has become an important problem fueled by many types of networked data originating from practically all fields of science. In this paper, we propose a new non-parametric Bayesian multiple-membership latent feature model for networks. Contrary to existing multiple-membership models that scale quadratically in the number of vertices, the proposed model scales linearly in the number of links, admitting multiple-membership analysis in large scale networks. We demonstrate a connection between the single membership relational model and multiple membership models and show on "real" size benchmark network data that accounting for multiple memberships improves the learning of latent structure as measured by link prediction, while explicitly accounting for multiple membership results in a more compact representation of the latent structure of networks.
Submitted 26 January, 2011;
originally announced January 2011.
-
Semi-Supervised Kernel PCA
Authors:
Christian Walder,
Ricardo Henao,
Morten Mørup,
Lars Kai Hansen
Abstract:
We present three generalisations of Kernel Principal Components Analysis (KPCA) which incorporate knowledge of the class labels of a subset of the data points. The first, MV-KPCA, penalises within class variances similar to Fisher discriminant analysis. The second, LSKPCA is a hybrid of least squares regression and kernel PCA. The final LR-KPCA is an iteratively reweighted version of the previous which achieves a sigmoid loss function on the labeled points. We provide a theoretical risk bound as well as illustrative experiments on real and toy data sets.
Submitted 8 August, 2010;
originally announced August 2010.
-
Multiplicative updates For Non-Negative Kernel SVM
Authors:
Vamsi K. Potluru,
Sergey M. Plis,
Morten Morup,
Vince D. Calhoun,
Terran Lane
Abstract:
We present multiplicative updates for solving hard and soft margin support vector machines (SVM) with non-negative kernels. They follow as a natural extension of the updates for non-negative matrix factorization. No additional parameter setting, such as choosing a learning rate, is required. Experiments demonstrate rapid convergence to good classifiers. We analyze the rates of asymptotic convergence of the updates and establish tight bounds. We test the performance on several datasets using various non-negative kernels and report equivalent generalization errors to that of a standard SVM.
△ Less
Submitted 24 February, 2009;
originally announced February 2009.