-
CGAAL: Distributed On-The-Fly ATL Model Checker with Heuristics
Authors:
Falke B. Ø. Carlsen,
Lars Bo P. Frydenskov,
Nicolaj Ø. Jensen,
Jener Rasmussen,
Mathias M. Sørensen,
Asger G. Weirsøe,
Mathias C. Jensen,
Kim G. Larsen
Abstract:
We present CGAAL, our efficient on-the-fly model checker for alternating-time temporal logic (ATL) on concurrent game structures (CGS). We present how our tool encodes ATL as extended dependency graphs with negation edges and employs the distributed on-the-fly algorithm by Dalsgaard et al. Our tool offers multiple novel search strategies for the algorithm, including DHS which is inspired by PageRa…
▽ More
We present CGAAL, our efficient on-the-fly model checker for alternating-time temporal logic (ATL) on concurrent game structures (CGS). We present how our tool encodes ATL as extended dependency graphs with negation edges and employs the distributed on-the-fly algorithm by Dalsgaard et al. Our tool offers multiple novel search strategies for the algorithm, including DHS which is inspired by PageRank and uses the in-degree of configurations as a heuristic, IHS which estimates instability of assignment values, and LPS which estimates the distance to a state satisfying the constituent property using linear programming. CGS are input using our modelling language LCGS, where composition and synchronisation are easily described. We prove the correctness of our encoding, and our experiments show that our tool CGAAL is often one to three orders of magnitude faster than the popular tool PRISM-games on case studies from PRISM's documentation and among case studies we have developed. In our evaluation, we also compare and evaluate our search strategies, and find that our custom search strategies are often significantly faster than the usual breadth-first and depth-first search strategies.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
Learning to Taste: A Multimodal Wine Dataset
Authors:
Thoranna Bender,
Simon Moe Sørensen,
Alireza Kashani,
K. Eldjarn Hjorleifsson,
Grethe Hyldig,
Søren Hauberg,
Serge Belongie,
Frederik Warburg
Abstract:
We present WineSensed, a large multimodal wine dataset for studying the relations between visual perception, language, and flavor. The dataset encompasses 897k images of wine labels and 824k reviews of wines curated from the Vivino platform. It has over 350k unique bottlings, annotated with year, region, rating, alcohol percentage, price, and grape composition. We obtained fine-grained flavor anno…
▽ More
We present WineSensed, a large multimodal wine dataset for studying the relations between visual perception, language, and flavor. The dataset encompasses 897k images of wine labels and 824k reviews of wines curated from the Vivino platform. It has over 350k unique bottlings, annotated with year, region, rating, alcohol percentage, price, and grape composition. We obtained fine-grained flavor annotations on a subset by conducting a wine-tasting experiment with 256 participants who were asked to rank wines based on their similarity in flavor, resulting in more than 5k pairwise flavor distances. We propose a low-dimensional concept embedding algorithm that combines human experience with automatic machine similarity kernels. We demonstrate that this shared concept embedding space improves upon separate embedding spaces for coarse flavor classification (alcohol percentage, country, grape, price, rating) and aligns with the intricate human perception of flavor.
△ Less
Submitted 15 January, 2024; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Extremely weakly-supervised blood vessel segmentation with physiologically based synthesis and domain adaptation
Authors:
Peidi Xu,
Olga Sosnovtseva,
Charlotte Mehlin Sørensen,
Kenny Erleben,
Sune Darkner
Abstract:
Accurate analysis and modeling of renal functions require a precise segmentation of the renal blood vessels. Micro-CT scans provide image data at higher resolutions, making more small vessels near the renal cortex visible. Although deep-learning-based methods have shown state-of-the-art performance in automatic blood vessel segmentations, they require a large amount of labeled training data. Howev…
▽ More
Accurate analysis and modeling of renal functions require a precise segmentation of the renal blood vessels. Micro-CT scans provide image data at higher resolutions, making more small vessels near the renal cortex visible. Although deep-learning-based methods have shown state-of-the-art performance in automatic blood vessel segmentations, they require a large amount of labeled training data. However, voxel-wise labeling in micro-CT scans is extremely time-consuming given the huge volume sizes. To mitigate the problem, we simulate synthetic renal vascular trees physiologically while generating corresponding scans of the simulated trees by training a generative model on unlabeled scans. This enables the generative model to learn the mapping implicitly without the need for explicit functions to emulate the image acquisition process. We further propose an additional segmentation branch over the generative model trained on the generated scans. We demonstrate that the model can directly segment blood vessels on real scans and validate our method on both 3D micro-CT scans of rat kidneys and a proof-of-concept experiment on 2D retinal images. Code and 3D results are available at https://github.com/miccai2023anony/RenalVesselSeg
△ Less
Submitted 26 May, 2023;
originally announced May 2023.
-
A Hybrid Approach to Full-Scale Reconstruction of Renal Arterial Network
Authors:
Peidi Xu,
Niels-Henrik Holstein-Rathlou,
Stinne Byrholdt Søgaard,
Carsten Gundlach,
Charlotte Mehlin Sørensen,
Kenny Erleben,
Olga Sosnovtseva,
Sune Darkner
Abstract:
The renal vasculature, acting as a resource distribution network, plays an important role in both the physiology and pathophysiology of the kidney. However, no imaging techniques allow an assessment of the structure and function of the renal vasculature due to limited spatial and temporal resolution. To develop realistic computer simulations of renal function, and to develop new image-based diagno…
▽ More
The renal vasculature, acting as a resource distribution network, plays an important role in both the physiology and pathophysiology of the kidney. However, no imaging techniques allow an assessment of the structure and function of the renal vasculature due to limited spatial and temporal resolution. To develop realistic computer simulations of renal function, and to develop new image-based diagnostic methods based on artificial intelligence, it is necessary to have a realistic full-scale model of the renal vasculature. We propose a hybrid framework to build subject-specific models of the renal vascular network by using semi-automated segmentation of large arteries and estimation of cortex area from a micro-CT scan as a starting point, and by adopting the Global Constructive Optimization algorithm for generating smaller vessels. Our results show a statistical correspondence between the reconstructed data and existing anatomical data obtained from a rat kidney with respect to morphometric and hemodynamic parameters.
△ Less
Submitted 3 March, 2023;
originally announced March 2023.
-
SolarDK: A high-resolution urban solar panel image classification and localization dataset
Authors:
Maxim Khomiakov,
Julius Holbech Radzikowski,
Carl Anton Schmidt,
Mathias Bonde Sørensen,
Mads Andersen,
Michael Riis Andersen,
Jes Frellsen
Abstract:
The body of research on classification of solar panel arrays from aerial imagery is increasing, yet there are still not many public benchmark datasets. This paper introduces two novel benchmark datasets for classifying and localizing solar panel arrays in Denmark: A human annotated dataset for classification and segmentation, as well as a classification dataset acquired using self-reported data fr…
▽ More
The body of research on classification of solar panel arrays from aerial imagery is increasing, yet there are still not many public benchmark datasets. This paper introduces two novel benchmark datasets for classifying and localizing solar panel arrays in Denmark: A human annotated dataset for classification and segmentation, as well as a classification dataset acquired using self-reported data from the Danish national building registry. We explore the performance of prior works on the new benchmark dataset, and present results after fine-tuning models using a similar approach as recent works. Furthermore, we train models of newer architectures and provide benchmark baselines to our datasets in several scenarios. We believe the release of these datasets may improve future research in both local and global geospatial domains for identifying and mapping of solar panel arrays from aerial imagery. The data is accessible at https://osf.io/aj539/.
△ Less
Submitted 2 December, 2022;
originally announced December 2022.
-
Generative time series models using Neural ODE in Variational Autoencoders
Authors:
M. L. Garsdal,
V. Søgaard,
S. M. Sørensen
Abstract:
In this paper, we implement Neural Ordinary Differential Equations in a Variational Autoencoder setting for generative time series modeling. An object-oriented approach to the code was taken to allow for easier development and research and all code used in the paper can be found here: https://github.com/simonmoesorensen/neural-ode-project
The results were initially recreated and the reconstructi…
▽ More
In this paper, we implement Neural Ordinary Differential Equations in a Variational Autoencoder setting for generative time series modeling. An object-oriented approach to the code was taken to allow for easier development and research and all code used in the paper can be found here: https://github.com/simonmoesorensen/neural-ode-project
The results were initially recreated and the reconstructions compared to a baseline Long-Short Term Memory AutoEncoder. The model was then extended with a LSTM encoder and challenged by more complex data consisting of time series in the form of spring oscillations. The model showed promise, and was able to reconstruct true trajectories for all complexities of data with a smaller RMSE than the baseline model. However, it was able to capture the dynamic behavior of the time series for known data in the decoder but was not able to produce extrapolations following the true trajectory very well for any of the complexities of spring data. A final experiment was carried out where the model was also presented with 68 days of solar power production data, and was able to reconstruct just as well as the baseline, even when very little data is available.
Finally, the models training time was compared to the baseline. It was found that for small amounts of data the NODE method was significantly slower at training than the baseline, while for larger amounts of data the NODE method would be equal or faster at training.
The paper is ended with a future work section which describes the many natural extensions to the work presented in this paper, with examples being investigating further the importance of input data, including extrapolation in the baseline model or testing more specific model setups.
△ Less
Submitted 12 January, 2022;
originally announced January 2022.
-
Uncovering migration systems through spatio-temporal tensor co-clustering
Authors:
Zack W. Almquist,
Tri Duc Nguyen,
Mikael Sorensen,
Xiao Fu,
Nicholas D. Sidiropoulos
Abstract:
A central problem in the study of human mobility is that of migration systems. Typically, migration systems are defined as a set of relatively stable movements of people between two or more locations over time. While these emergent systems are expected to vary over time, they ideally contain a stable underlying structure that could be discovered empirically. There have been some notable attempts t…
▽ More
A central problem in the study of human mobility is that of migration systems. Typically, migration systems are defined as a set of relatively stable movements of people between two or more locations over time. While these emergent systems are expected to vary over time, they ideally contain a stable underlying structure that could be discovered empirically. There have been some notable attempts to formally or informally define migration systems, however they have been limited by being hard to operationalize, and by defining migration systems in ways that ignore origin/destination aspects and/or fail to account for migration dynamics. In this work we propose a novel method, spatio-temporal (ST) tensor co-clustering, stemming from signal processing and machine learning theory. To demonstrate its effectiveness for describing stable migration systems we focus on domestic migration between counties in the US from 1990-2018. Relevant data for this period has been made available through the US Internal Revenue Service. Specifically, we concentrate on three illustrative case studies: (i) US Metropolitan Areas, (ii) the state of California, and (iii) Louisiana, focusing on detecting exogenous events such as Hurricane Katrina in 2005. Finally, we conclude with discussion and limitations of this approach.
△ Less
Submitted 26 January, 2023; v1 submitted 30 December, 2021;
originally announced December 2021.
-
Optimal Size-Performance Tradeoffs: Weighing PoS Tagger Models
Authors:
Magnus Jacobsen,
Mikkel H. Sørensen,
Leon Derczynski
Abstract:
Improvement in machine learning-based NLP performance are often presented with bigger models and more complex code. This presents a trade-off: better scores come at the cost of larger tools; bigger models tend to require more during training and inference time. We present multiple methods for measuring the size of a model, and for comparing this with the model's performance.
In a case study over…
▽ More
Improvement in machine learning-based NLP performance are often presented with bigger models and more complex code. This presents a trade-off: better scores come at the cost of larger tools; bigger models tend to require more during training and inference time. We present multiple methods for measuring the size of a model, and for comparing this with the model's performance.
In a case study over part-of-speech tagging, we then apply these techniques to taggers for eight languages and present a novel analysis identifying which taggers are size-performance optimal. Results indicate that some classical taggers place on the size-performance skyline across languages. Further, although the deep models have highest performance for multiple scores, it is often not the most complex of these that reach peak performance.
△ Less
Submitted 16 April, 2021;
originally announced April 2021.
-
Generalized Canonical Correlation Analysis: A Subspace Intersection Approach
Authors:
Mikael Sørensen,
Charilaos I. Kanatsoulis,
Nicholas D. Sidiropoulos
Abstract:
Generalized Canonical Correlation Analysis (GCCA) is an important tool that finds numerous applications in data mining, machine learning, and artificial intelligence. It aims at finding `common' random variables that are strongly correlated across multiple feature representations (views) of the same set of entities. CCA and to a lesser extent GCCA have been studied from the statistical and algorit…
▽ More
Generalized Canonical Correlation Analysis (GCCA) is an important tool that finds numerous applications in data mining, machine learning, and artificial intelligence. It aims at finding `common' random variables that are strongly correlated across multiple feature representations (views) of the same set of entities. CCA and to a lesser extent GCCA have been studied from the statistical and algorithmic points of view, but not as much from the standpoint of linear algebra. This paper offers a fresh algebraic perspective of GCCA based on a (bi-)linear generative model that naturally captures its essence. It is shown that from a linear algebra point of view, GCCA is tantamount to subspace intersection; and conditions under which the common subspace of the different views is identifiable are provided. A novel GCCA algorithm is proposed based on subspace intersection, which scales up to handle large GCCA tasks. Synthetic as well as real data experiments are provided to showcase the effectiveness of the proposed approach.
△ Less
Submitted 25 March, 2020;
originally announced March 2020.
-
A Note on Shortest Developments
Authors:
Morten Heine Sørensen
Abstract:
De Vrijer has presented a proof of the finite developments theorem which, in addition to showing that all developments are finite, gives an effective reduction strategy computing longest developments as well as a simple formula computing their length.
We show that by applying a rather simple and intuitive principle of duality to de Vrijer's approach one arrives at a proof that some development…
▽ More
De Vrijer has presented a proof of the finite developments theorem which, in addition to showing that all developments are finite, gives an effective reduction strategy computing longest developments as well as a simple formula computing their length.
We show that by applying a rather simple and intuitive principle of duality to de Vrijer's approach one arrives at a proof that some developments are finite which in addition yields an effective reduction strategy computing shortest developments as well as a simple formula computing their length. The duality fails for general beta-reduction.
Our results simplify previous work by Khasidashvili.
△ Less
Submitted 5 November, 2007; v1 submitted 1 August, 2007;
originally announced August 2007.