-
When Does Neuroevolution Outcompete Reinforcement Learning in Transfer Learning Tasks?
Authors:
Eleni Nisioti,
Joachim Winther Pedersen,
Erwan Plantec,
Milton L. Montero,
Sebastian Risi
Abstract:
The ability to continuously and efficiently transfer skills across tasks is a hallmark of biological intelligence and a long-standing goal in artificial systems. Reinforcement learning (RL), a dominant paradigm for learning in high-dimensional control tasks, is known to suffer from brittleness to task variations and catastrophic forgetting. Neuroevolution (NE) has recently gained attention for its robustness, scalability, and capacity to escape local optima. In this paper, we investigate an understudied dimension of NE: its transfer learning capabilities. To this end, we introduce two benchmarks: a) in stepping gates, neural networks are tasked with emulating logic circuits, with designs that emphasize modular repetition and variation; b) ecorobot extends the Brax physics engine with objects such as walls and obstacles and the ability to easily switch between different robotic morphologies. Crucial in both benchmarks is the presence of a curriculum that enables evaluating skill transfer across tasks of increasing complexity. Our empirical analysis shows that NE methods vary in their transfer abilities and frequently outperform RL baselines. Our findings support the potential of NE as a foundation for building more adaptable agents and highlight future challenges for scaling NE to complex, real-world problems.
Submitted 28 May, 2025;
originally announced May 2025.
-
NLP Security and Ethics, in the Wild
Authors:
Heather Lent,
Erick Galinkin,
Yiyi Chen,
Jens Myrup Pedersen,
Leon Derczynski,
Johannes Bjerva
Abstract:
As NLP models are used by a growing number of end-users, an area of increasing importance is NLP Security (NLPSec): assessing the vulnerability of models to malicious attacks and developing comprehensive countermeasures against them. While work at the intersection of NLP and cybersecurity has the potential to create safer NLP for all, accidental oversights can result in tangible harm (e.g., breaches of privacy or proliferation of malicious models). In this emerging field, however, the research ethics of NLP have not yet faced many of the long-standing conundrums pertinent to cybersecurity, until now. We thus examine contemporary works across NLPSec, and explore their engagement with cybersecurity's ethical norms. We identify trends across the literature, ultimately finding alarming gaps on topics like harm minimization and responsible disclosure. To alleviate these concerns, we provide concrete recommendations to help NLP researchers navigate this space more ethically, bridging the gap between traditional cybersecurity and NLP ethics, which we frame as "white hat NLP". The goal of this work is to help cultivate an intentional culture of ethical research for those working in NLP Security.
Submitted 9 April, 2025;
originally announced April 2025.
-
Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots
Authors:
Binggwong Leung,
Worasuchad Haomachai,
Joachim Winther Pedersen,
Sebastian Risi,
Poramate Manoonpong
Abstract:
Artificial neural networks can be used to solve a variety of robotic tasks. However, they risk failing catastrophically when faced with out-of-distribution (OOD) situations. Several approaches have employed a type of synaptic plasticity known as Hebbian learning that can dynamically adjust weights based on local neural activities. Research has shown that synaptic plasticity can make policies more robust and help them adapt to unforeseen changes in the environment. However, networks augmented with Hebbian learning can lead to weight divergence, resulting in network instability. Furthermore, such Hebbian networks have not yet been applied to solve legged locomotion in complex real robots with many degrees of freedom. In this work, we improve the Hebbian network with a weight normalization mechanism for preventing weight divergence, analyze the principal components of the Hebbian network's weights, and perform a thorough evaluation of network performance in locomotion control for real 18-DOF dung beetle-like and 16-DOF gecko-like robots. We find that the Hebbian-based plastic network can execute zero-shot sim-to-real adaptation in locomotion and generalize to unseen conditions, such as uneven terrain and morphological damage.
Submitted 16 March, 2025;
originally announced March 2025.
-
Implicit Neural Representations for Registration of Left Ventricle Myocardium During a Cardiac Cycle
Authors:
Mathias Micheelsen Lowes,
Jonas Jalili Pedersen,
Bjørn S. Hansen,
Klaus Fuglsang Kofoed,
Maxime Sermesant,
Rasmus R. Paulsen
Abstract:
Understanding the movement of the left ventricle myocardium (LVmyo) during the cardiac cycle is essential for assessing cardiac function. One way to model this movement is through a series of deformable image registrations (DIRs) of the LVmyo. Traditional deep learning methods for DIRs, such as those based on convolutional neural networks, often require substantial memory and computational resources. In contrast, implicit neural representations (INRs) offer an efficient approach by operating on any number of continuous points. This study extends the use of INRs for DIR to cardiac computed tomography (CT), focusing on LVmyo registration. To enhance the precision of the registration around the LVmyo, we incorporate the signed distance field of the LVmyo with the Hounsfield Unit values from the CT frames. This guides the registration of the LVmyo, while keeping the tissue information from the CT frames. Our framework demonstrates high registration accuracy and provides a robust method for temporal registration that facilitates further analysis of LVmyo motion.
Submitted 13 January, 2025;
originally announced January 2025.
-
GERD: Geometric event response data generation
Authors:
Jens Egholm Pedersen,
Dimitris Korakovounis,
Jörg Conradt
Abstract:
Event-based vision sensors are appealing because of their time resolution, higher dynamic range, and low-power consumption. They also provide data that is fundamentally different from conventional frame-based cameras: events are sparse, discrete, and require integration in time. Unlike conventional models grounded in established geometric and physical principles, event-based models lack comparable foundations. We introduce a method to generate event-based data under controlled transformations. Specifically, we subject a prototypical object to transformations that change over time to produce carefully curated event videos. We hope this work simplifies studies for geometric approaches in event-based vision. GERD is available at https://github.com/ncskth/gerd
Submitted 4 December, 2024;
originally announced December 2024.
-
Neuromorphic Programming: Emerging Directions for Brain-Inspired Hardware
Authors:
Steven Abreu,
Jens E. Pedersen
Abstract:
The value of brain-inspired neuromorphic computers critically depends on our ability to program them for relevant tasks. Currently, neuromorphic hardware often relies on machine learning methods adapted from deep learning. However, neuromorphic computers have potential far beyond deep learning if we can only harness their energy efficiency and full computational power. Neuromorphic programming will necessarily be different from conventional programming, requiring a paradigm shift in how we think about programming. This paper presents a conceptual analysis of programming within the context of neuromorphic computing, challenging conventional paradigms and proposing a framework that aligns more closely with the physical intricacies of these systems. Our analysis revolves around five characteristics that are fundamental to neuromorphic programming and provides a basis for comparison to contemporary programming methods and languages. By studying past approaches, we contribute a framework that advocates for underutilized techniques and calls for richer abstractions to effectively instrument the new hardware class.
Submitted 15 October, 2024;
originally announced October 2024.
-
Long or Short or Both? An Exploration on Lookback Time Windows of Behavioral Features in Product Search Ranking
Authors:
Qi Liu,
Atul Singh,
Jingbo Liu,
Cun Mu,
Zheng Yan,
Jan Pedersen
Abstract:
Customer shopping behavioral features are core to product search ranking models in eCommerce. In this paper, we investigate the effect of lookback time windows when aggregating these features at the (query, product) level over history. By studying the pros and cons of using long and short time windows, we propose a novel approach to integrating these historical behavioral features of different time windows. In particular, we address the criticality of using query-level vertical signals in ranking models to effectively aggregate all information from different behavioral features. Anecdotal evidence for the proposed approach is also provided using live product search traffic on Walmart.com.
Submitted 25 September, 2024;
originally announced September 2024.
-
Spatio-temporal neural distance fields for conditional generative modeling of the heart
Authors:
Kristine Sørensen,
Paula Diez,
Jan Margeta,
Yasmin El Youssef,
Michael Pham,
Jonas Jalili Pedersen,
Tobias Kühl,
Ole de Backer,
Klaus Kofoed,
Oscar Camara,
Rasmus Paulsen
Abstract:
The rhythmic pumping motion of the heart stands as a cornerstone in life, as it circulates blood to the entire human body through a series of carefully timed contractions of the individual chambers. Changes in the size, shape and movement of the chambers can be important markers for cardiac disease and modeling this in relation to clinical demography or disease is therefore of interest. Existing methods for spatio-temporal modeling of the human heart require shape correspondence over time or suffer from large memory requirements, making them difficult to use for complex anatomies. We introduce a novel conditional generative model, where the shape and movement are modeled implicitly in the form of a spatio-temporal neural distance field and conditioned on clinical demography. The model is based on an auto-decoder architecture and aims to disentangle the individual variations from those related to the clinical demography. It is tested on the left atrium (including the left atrial appendage), where it outperforms current state-of-the-art methods for anatomical sequence completion and generates synthetic sequences that realistically mimic the shape and motion of the real left atrium. In practice, this means we can infer functional measurements from a static image, generate synthetic populations with specified demography or disease and investigate how non-imaging clinical data affect the shape and motion of cardiac anatomies.
Submitted 15 July, 2024;
originally announced July 2024.
-
From Text to Life: On the Reciprocal Relationship between Artificial Life and Large Language Models
Authors:
Eleni Nisioti,
Claire Glanois,
Elias Najarro,
Andrew Dai,
Elliot Meyerson,
Joachim Winther Pedersen,
Laetitia Teodorescu,
Conor F. Hayes,
Shyam Sudhakaran,
Sebastian Risi
Abstract:
Large Language Models (LLMs) have taken the field of AI by storm, but their adoption in the field of Artificial Life (ALife) has been, so far, relatively reserved. In this work we investigate the potential synergies between LLMs and ALife, drawing on a large body of research in the two fields. We explore the potential of LLMs as tools for ALife research, for example, as operators for evolutionary computation or the generation of open-ended environments. Reciprocally, principles of ALife, such as self-organization, collective intelligence and evolvability can provide an opportunity for shaping the development and functionalities of LLMs, leading to more adaptive and responsive models. By investigating this dynamic interplay, the paper aims to inspire innovative crossover approaches for both ALife and LLM research. Along the way, we examine the extent to which LLMs appear to increasingly exhibit properties such as emergence or collective intelligence, expanding beyond their original goal of generating text, and potentially redefining our perception of lifelike intelligence in artificial systems.
Submitted 14 June, 2024;
originally announced July 2024.
-
Evolving Self-Assembling Neural Networks: From Spontaneous Activity to Experience-Dependent Learning
Authors:
Erwan Plantec,
Joachim W. Pedersen,
Milton L. Montero,
Eleni Nisioti,
Sebastian Risi
Abstract:
Biological neural networks are characterized by their high degree of plasticity, a core property that enables the remarkable adaptability of natural organisms. Importantly, this ability affects both the synaptic strength and the topology of the nervous systems. Artificial neural networks, on the other hand, have been mainly designed as static, fully connected structures that can be notoriously brittle in the face of changing environments and novel inputs. Building on previous works on Neural Developmental Programs (NDPs), we propose a class of self-organizing neural networks capable of synaptic and structural plasticity in an activity and reward-dependent manner which we call Lifelong Neural Developmental Program (LNDP). We present an instance of such a network built on the graph transformer architecture and propose a mechanism for pre-experience plasticity based on the spontaneous activity of sensory neurons. Our results demonstrate the ability of the model to learn from experiences in different control tasks starting from randomly connected or empty networks. We further show that structural plasticity is advantageous in environments necessitating fast adaptation or with non-stationary rewards.
Submitted 14 June, 2024;
originally announced June 2024.
-
Q-S5: Towards Quantized State Space Models
Authors:
Steven Abreu,
Jens E. Pedersen,
Kade M. Heckel,
Alessandro Pierro
Abstract:
In the quest for next-generation sequence modeling architectures, State Space Models (SSMs) have emerged as a potent alternative to transformers, particularly for their computational efficiency and suitability for dynamical systems. This paper investigates the effect of quantization on the S5 model to understand its impact on model performance and to facilitate its deployment to edge and resource-constrained platforms. Using quantization-aware training (QAT) and post-training quantization (PTQ), we systematically evaluate the quantization sensitivity of SSMs across different tasks like dynamical systems modeling, Sequential MNIST (sMNIST) and most of the Long Range Arena (LRA). We present fully quantized S5 models whose test accuracy drops less than 1% on sMNIST and most of the LRA. We find that performance on most tasks degrades significantly for recurrent weights below 8-bit precision, but that other components can be compressed further without significant loss of performance. Our results further show that PTQ only performs well on language-based LRA tasks whereas all others require QAT. Our investigation provides necessary insights for the continued development of efficient and hardware-optimized SSMs.
Submitted 13 June, 2024;
originally announced June 2024.
-
Meta-Learning an Evolvable Developmental Encoding
Authors:
Milton L. Montero,
Erwan Plantec,
Eleni Nisioti,
Joachim W. Pedersen,
Sebastian Risi
Abstract:
Representations for black-box optimisation methods (such as evolutionary algorithms) are traditionally constructed using a delicate manual process. This is in contrast to the representation that maps DNAs to phenotypes in biological organisms, which is at the heart of biological complexity and evolvability. Additionally, the core of this process is fundamentally the same across nearly all forms of life, reflecting their shared evolutionary origin. Generative models have shown promise in being learnable representations for black-box optimisation but they are not per se designed to be easily searchable. Here we present a system that can meta-learn such a representation by directly optimising for a representation's ability to generate quality-diversity. In more detail, we show our meta-learning approach can find one Neural Cellular Automata, in which cells can attend to different parts of a "DNA" string genome during development, enabling it to grow different solvable 2D maze structures. We show that the evolved genotype-to-phenotype mappings become more and more evolvable, not only resulting in a faster search but also increasing the quality and diversity of grown artefacts.
Submitted 5 July, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Growing Artificial Neural Networks for Control: the Role of Neuronal Diversity
Authors:
Eleni Nisioti,
Erwan Plantec,
Milton Montero,
Joachim Winther Pedersen,
Sebastian Risi
Abstract:
In biological evolution, complex neural structures grow from a handful of cellular ingredients. As genomes in nature are bounded in size, this complexity is achieved by a growth process where cells communicate locally to decide whether to differentiate, proliferate and connect with other cells. This self-organisation is hypothesized to play an important part in the generalisation and robustness of biological neural networks. Artificial neural networks (ANNs), on the other hand, are traditionally optimized in the space of weights. Thus, the benefits and challenges of growing artificial neural networks remain understudied. Building on the previously introduced Neural Developmental Programs (NDP), in this work we present an algorithm for growing ANNs that solve reinforcement learning tasks. We identify a key challenge: ensuring phenotypic complexity requires maintaining neuronal diversity, but this diversity comes at the cost of optimization stability. To address this, we introduce two mechanisms: (a) equipping neurons with an intrinsic state inherited upon neurogenesis; (b) lateral inhibition, a mechanism inspired by biological growth, which controls the pace of growth, helping diversity persist. We show that both mechanisms contribute to neuronal diversity and that, equipped with them, NDPs achieve comparable results to existing direct and developmental encodings in complex locomotion tasks.
Submitted 14 May, 2024;
originally announced May 2024.
-
Covariant spatio-temporal receptive fields for spiking neural networks
Authors:
Jens Egholm Pedersen,
Jörg Conradt,
Tony Lindeberg
Abstract:
Biological nervous systems constitute important sources of inspiration towards computers that are faster, cheaper, and more energy efficient. Neuromorphic disciplines view the brain as a coevolved system, simultaneously optimizing the hardware and the algorithms running on it. There are clear efficiency gains when bringing the computations into a physical substrate, but we presently lack theories to guide efficient implementations. Here, we present a principled computational model for neuromorphic systems in terms of spatio-temporal receptive fields, based on affine Gaussian kernels over space and leaky-integrator and leaky integrate-and-fire models over time. Our theory is provably covariant to spatial affine and temporal scaling transformations, and has close similarities to the visual processing in mammalian brains. We use these spatio-temporal receptive fields as a prior in an event-based vision task, and show that this improves the training of spiking networks, which is otherwise known to be problematic for event-based vision. This work combines efforts within scale-space theory and computational neuroscience to identify theoretically well-founded ways to process spatio-temporal signals in neuromorphic systems. Our contributions are immediately relevant for signal processing and event-based vision, and can be extended to other processing tasks over space and time, such as memory and control.
Submitted 4 May, 2025; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Structurally Flexible Neural Networks: Evolving the Building Blocks for General Agents
Authors:
Joachim Winther Pedersen,
Erwan Plantec,
Eleni Nisioti,
Milton Montero,
Sebastian Risi
Abstract:
Artificial neural networks used for reinforcement learning are structurally rigid, meaning that each optimized parameter of the network is tied to its specific placement in the network structure. It also means that a network only works with pre-defined and fixed input and output sizes. This is a consequence of the number of optimized parameters being directly dependent on the structure of the network. Structural rigidity limits the ability to optimize parameters of policies across multiple environments that do not share input and output spaces. Here, we evolve a set of neurons and plastic synapses each represented by a gated recurrent unit (GRU). During optimization, the parameters of these fundamental units of a neural network are optimized in different random structural configurations. Earlier work has shown that parameter sharing between units is important for making structurally flexible neurons. We show that it is possible to optimize a set of distinct neuron and synapse types, allowing for a mitigation of the symmetry dilemma. We demonstrate this by optimizing a single set of neurons and synapses to solve multiple reinforcement learning control tasks simultaneously.
Submitted 17 May, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
Neuromorphic Intermediate Representation: A Unified Instruction Set for Interoperable Brain-Inspired Computing
Authors:
Jens E. Pedersen,
Steven Abreu,
Matthias Jobst,
Gregor Lenz,
Vittorio Fra,
Felix C. Bauer,
Dylan R. Muir,
Peng Zhou,
Bernhard Vogginger,
Kade Heckel,
Gianvito Urgese,
Sadasivan Shankar,
Terrence C. Stewart,
Sadique Sheik,
Jason K. Eshraghian
Abstract:
Spiking neural networks and neuromorphic hardware platforms that simulate neuronal dynamics are getting wide attention and are being applied to many relevant problems using Machine Learning. Despite a well-established mathematical foundation for neural dynamics, there exist numerous software and hardware solutions and stacks whose variability makes it difficult to reproduce findings. Here, we establish a common reference frame for computations in digital neuromorphic systems, titled Neuromorphic Intermediate Representation (NIR). NIR defines a set of computational and composable model primitives as hybrid systems combining continuous-time dynamics and discrete events. By abstracting away assumptions around discretization and hardware constraints, NIR faithfully captures the computational model, while bridging differences between the evaluated implementation and the underlying mathematical formalism. NIR supports an unprecedented number of neuromorphic systems, which we demonstrate by reproducing three spiking neural network models of different complexity across 7 neuromorphic simulators and 4 digital hardware platforms. NIR decouples the development of neuromorphic hardware and software, enabling interoperability between platforms and improving accessibility to multiple neuromorphic technologies. We believe that NIR is a key next step in brain-inspired hardware-software co-evolution, enabling research towards the implementation of energy efficient computational principles of nervous systems. NIR is available at neuroir.org
Submitted 30 September, 2024; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Which algorithm to select in sports timetabling?
Authors:
David Van Bulck,
Dries Goossens,
Jan-Patrick Clarner,
Angelos Dimitsas,
George H. G. Fonseca,
Carlos Lamas-Fernandez,
Martin Mariusz Lester,
Jaap Pedersen,
Antony E. Phillips,
Roberto Maria Rosati
Abstract:
Any sports competition needs a timetable, specifying when and where teams meet each other. The recent International Timetabling Competition (ITC2021) on sports timetabling showed that, although it is possible to develop general algorithms, the performance of each algorithm varies considerably over the problem instances. This paper provides an instance space analysis for sports timetabling, resulting in powerful insights into the strengths and weaknesses of eight state-of-the-art algorithms. Based on machine learning techniques, we propose an algorithm selection system that predicts which algorithm is likely to perform best when given the characteristics of a sports timetabling problem instance. Furthermore, we identify which characteristics are important in making that prediction, providing insights into the performance of the algorithms, and suggestions to further improve them. Finally, we assess the empirical hardness of the instances. Our results are based on large computational experiments involving about 50 years of CPU time on more than 500 newly generated problem instances.
Submitted 5 July, 2024; v1 submitted 4 September, 2023;
originally announced September 2023.
-
Learning to Act through Evolution of Neural Diversity in Random Neural Networks
Authors:
Joachim Winther Pedersen,
Sebastian Risi
Abstract:
Biological nervous systems consist of networks of diverse, sophisticated information processors in the form of neurons of different classes. In most artificial neural networks (ANNs), neural computation is abstracted to an activation function that is usually shared between all neurons within a layer or even the whole network; training of ANNs focuses on synaptic optimization. In this paper, we propose the optimization of neuro-centric parameters to attain a set of diverse neurons that can perform complex computations. Demonstrating the promise of the approach, we show that evolving neural parameters alone allows agents to solve various reinforcement learning tasks without optimizing any synaptic weights. While not aiming to be an accurate biological model, parameterizing neurons to a larger degree than the current common practice allows us to ask questions about the computational abilities afforded by neural diversity in random neural networks. The presented results open up interesting future research directions, such as combining evolved neural diversity with activity-dependent plasticity.
Submitted 8 June, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
AEStream: Accelerated event-based processing with coroutines
Authors:
Jens Egholm Pedersen,
Jörg Conradt
Abstract:
Neuromorphic sensors imitate the sparse and event-based communication seen in biological sensory organs and brains. Today's sensors can emit many millions of asynchronous events per second, which is challenging to process on conventional computers. To avoid bottleneck effects, there is a need to apply and improve concurrent and parallel processing of events.
We present AEStream: a library to efficiently stream asynchronous events from inputs to outputs on conventional computers. AEStream leverages cooperative multitasking primitives known as coroutines to concurrently process individual events, which dramatically simplifies the integration with event-based peripherals, such as event-based cameras and (neuromorphic) asynchronous hardware. We explore the effects of coroutines in concurrent settings by benchmarking them against conventional threading mechanisms, and find that AEStream provides at least twice the throughput. We then apply AEStream in a real-time edge detection task on a GPU and demonstrate 1.3 times faster processing with 5 times fewer memory operations.
Submitted 20 December, 2022;
originally announced December 2022.
-
Minimal Neural Network Models for Permutation Invariant Agents
Authors:
Joachim Winther Pedersen,
Sebastian Risi
Abstract:
Organisms in nature have evolved to exhibit flexibility in the face of changes to the environment and/or to themselves. Artificial neural networks (ANNs) have proven useful for controlling artificial agents acting in environments. However, most ANN models used for reinforcement learning-type tasks have a rigid structure that does not allow for varying input sizes. Further, they fail catastrophically if inputs are presented in an ordering unseen during optimization. We find that these two ANN inflexibilities can be mitigated and their solutions are simple and highly related. For permutation invariance, no optimized parameters can be tied to a specific index of the input elements. For size invariance, inputs must be projected onto a common space that does not grow with the number of projections. Based on these restrictions, we construct a conceptually simple model that exhibits flexibility most ANNs lack. We demonstrate the model's properties on multiple control problems, and show that it can cope with even very rapid permutations of input indices, as well as changes in input size. Ablation studies show that it is possible to achieve these properties with simple feedforward structures, but that it is much easier to optimize recurrent structures.
Submitted 12 May, 2022;
originally announced May 2022.
-
LTU Attacker for Membership Inference
Authors:
Joseph Pedersen,
Rafael Muñoz-Gómez,
Jiangnan Huang,
Haozhe Sun,
Wei-Wei Tu,
Isabelle Guyon
Abstract:
We address the problem of defending predictive models, such as machine learning classifiers (Defender models), against membership inference attacks, in both the black-box and white-box setting, when the trainer and the trained model are publicly released. The Defender aims at optimizing a dual objective: utility and privacy. Both utility and privacy are evaluated with an external apparatus including an Attacker and an Evaluator. On one hand, Reserved data, distributed similarly to the Defender training data, is used to evaluate Utility; on the other hand, Reserved data, mixed with Defender training data, is used to evaluate membership inference attack robustness. In both cases classification accuracy or error rate are used as the metric: Utility is evaluated with the classification accuracy of the Defender model; Privacy is evaluated with the membership prediction error of a so-called "Leave-Two-Unlabeled" LTU Attacker, having access to all of the Defender and Reserved data, except for the membership label of one sample from each. We prove that, under certain conditions, even a "naïve" LTU Attacker can achieve lower bounds on privacy loss with simple attack strategies, leading to concrete necessary conditions to protect privacy, including: preventing over-fitting and adding some amount of randomness. However, we also show that such a naïve LTU Attacker can fail to attack the privacy of models known to be vulnerable in the literature, demonstrating that knowledge must be complemented with strong attack strategies to turn the LTU Attacker into a powerful means of evaluating privacy. Our experiments on the QMNIST and CIFAR-10 datasets validate our theoretical results and confirm the roles of over-fitting prevention and randomness in the algorithms to protect against privacy attacks.
Submitted 4 February, 2022;
originally announced February 2022.
-
COVID-19 vaccination certificates in the Darkweb
Authors:
Dimitrios Georgoulias,
Jens Myrup Pedersen,
Morten Falch,
Emmanouil Vasilomanolakis
Abstract:
COVID-19 vaccines have been rolled out in many countries and with them a number of vaccination certificates. For instance, the EU is utilizing a digital certificate in the form of a QR-code that is digitally signed and can be easily validated throughout all EU countries. In this paper, we investigate the current state of the COVID-19 vaccination certificate market in the darkweb with a focus on the EU Digital Green Certificate (DGC). We investigate 17 marketplaces and 10 vendor shops that include vaccination certificates in their listings. Our results suggest that a multitude of sellers in both types of platforms are advertising selling capabilities. According to their claims, it is possible to buy fake vaccination certificates issued in most countries worldwide. We demonstrate some examples of such sellers, including how they advertise their capabilities, and the methods they claim to be using to provide their services. We highlight two particular cases of vendor shops, with one of them showing an elevated degree of professionalism, showcasing forged valid certificates, the validity of which we verify using two different national mobile COVID-19 applications.
Submitted 25 November, 2021; v1 submitted 24 November, 2021;
originally announced November 2021.
-
Gotta catch 'em all: a Multistage Framework for honeypot fingerprinting
Authors:
Shreyas Srinivasa,
Jens Myrup Pedersen,
Emmanouil Vasilomanolakis
Abstract:
Honeypots are decoy systems that lure attackers by presenting them with a seemingly vulnerable system. They provide an early detection mechanism as well as a method for learning how adversaries work and think. However, over the last years, a number of researchers have shown methods for fingerprinting honeypots. This significantly decreases the value of a honeypot; if an attacker is able to recognize the existence of such a system, they can evade it. In this article, we revisit the honeypot identification field, by providing a holistic framework that includes state of the art and novel fingerprinting components. We decrease the probability of false positives by proposing a rigid multi-step approach for labeling a system as a honeypot. We perform extensive scans covering 2.9 billion addresses of the IPv4 space and identify a total of 21,855 honeypot instances. Moreover, we present a number of interesting side-findings such as the identification of more than 354,431 non-honeypot systems that represent potentially vulnerable servers (e.g. SSH servers with default password configurations and vulnerable versions). Lastly, we discuss countermeasures against honeypot fingerprinting techniques.
Submitted 22 September, 2021;
originally announced September 2021.
-
Evolving and Merging Hebbian Learning Rules: Increasing Generalization by Decreasing the Number of Rules
Authors:
Joachim Winther Pedersen,
Sebastian Risi
Abstract:
Generalization to out-of-distribution (OOD) circumstances after training remains a challenge for artificial agents. To improve the robustness displayed by plastic Hebbian neural networks, we evolve a set of Hebbian learning rules, where multiple connections are assigned to a single rule. Inspired by the biological phenomenon of the genomic bottleneck, we show that by allowing multiple connections in the network to share the same local learning rule, it is possible to drastically reduce the number of trainable parameters, while obtaining a more robust agent. During evolution, by iteratively using simple K-Means clustering to combine rules, our Evolve and Merge approach is able to reduce the number of trainable parameters from 61,440 to 1,920, while at the same time improving robustness, all without increasing the number of generations used. While optimization of the agents is done on a standard quadruped robot morphology, we evaluate the agents' performances on slight morphology modifications in a total of 30 unseen morphologies. Our results add to the discussion on generalization, overfitting and OOD adaptation. To create agents that can adapt to a wider array of unexpected situations, Hebbian learning combined with a regularising "genomic bottleneck" could be a promising research direction.
Submitted 16 April, 2021;
originally announced April 2021.
-
Dynamic Coded Caching in Wireless Networks Using Multi-Agent Reinforcement Learning
Authors:
Jesper Pedersen,
Alexandre Graell i Amat,
Fredrik Brännström,
Eirik Rosnes
Abstract:
We consider distributed caching of content across several small base stations (SBSs) in a wireless network, where the content is encoded using a maximum distance separable code. Specifically, we apply soft time-to-live (STTL) cache management policies, where coded packets may be evicted from the caches at periodic times. We propose a reinforcement learning (RL) approach to find coded STTL policies minimizing the overall network load. We demonstrate that such caching policies achieve almost the same network load as policies obtained through optimization, where the latter assumes perfect knowledge of the distribution of times between file requests as well as the distribution of the number of SBSs within communication range of a user placing a request. We also suggest a multi-agent RL (MARL) framework for the scenario of non-uniformly distributed requests in space. For such a scenario, we show that MARL caching policies achieve lower network load as compared to optimized caching policies assuming a uniform request placement. We also provide convincing evidence that synchronous updates offer a lower network load than asynchronous updates for spatially homogeneous renewal request processes due to the memory of the renewal processes.
Submitted 14 April, 2021;
originally announced April 2021.
-
Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects
Authors:
Will Y. Zou,
Smitha Shyam,
Michael Mui,
Mingshi Wang,
Jan Pedersen,
Zoubin Ghahramani
Abstract:
Causal inference methods are widely applied in the fields of medicine, policy, and economics. Central to these applications is the estimation of treatment effects to make decisions. Current methods make binary yes-or-no decisions based on the treatment effect of a single outcome dimension. These methods are unable to capture continuous space treatment policies with a measure of intensity. They also lack the capacity to consider the complexity of treatment such as matching candidate treatments with the subject. We propose to formulate the effectiveness of treatment as a parametrizable model, expanding to a multitude of treatment intensities and complexities through the continuous policy treatment function, and the likelihood of matching. Our proposal to decompose treatment effect functions into effectiveness factors presents a framework to model a rich space of actions using causal inference. We utilize deep learning to optimize the desired holistic metric space instead of predicting single-dimensional treatment counterfactual. This approach employs a population-wide effectiveness measure and significantly improves the overall effectiveness of the model. The performance of our algorithms is demonstrated with experiments. When using generic continuous space treatments and matching architecture, we observe a 41% improvement upon prior art with cost-effectiveness and 68% improvement upon a similar method in the average treatment effect. The algorithms capture subtle variations in treatment space, structure efficient optimization techniques, and open up the arena for many applications.
Submitted 20 April, 2020;
originally announced April 2020.
-
Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing
Authors:
Will Y. Zou,
Shuyang Du,
James Lee,
Jan Pedersen
Abstract:
User marketing is a key focus of consumer-based internet companies. Learning algorithms are effective at optimizing marketing campaigns that increase user engagement and facilitate cross-marketing to related products. By attracting users with rewards, marketing methods are effective at boosting user activity in the desired products. Rewards incur significant cost that can be offset by an increase in future revenue. Most methodologies rely on churn predictions to prevent losing users to make marketing decisions, which cannot capture up-lift across counterfactual outcomes with business metrics. Other predictive models are capable of estimating heterogeneous treatment effects, but fail to capture the balance of cost versus benefit. We propose a treatment effect optimization methodology for user marketing. This algorithm learns from past experiments and utilizes novel optimization methods to optimize cost efficiency with respect to user selection. The method optimizes decisions using deep learning optimization models to treat and reward users, which is effective in producing cost-effective, impactful marketing campaigns. Our methodology demonstrates superior algorithmic flexibility through integration with deep learning methods and handling of business constraints. The effectiveness of our model surpasses the quasi-oracle estimation (R-learner) model and causal forests. We also established evaluation metrics that reflect the cost-efficiency and real-world business value. Our proposed constrained and direct optimization algorithms outperform the best-performing method among prior art and baseline methods by 24.6%. The methodology is useful in many product scenarios such as optimal treatment allocation and it has been deployed in production world-wide.
Submitted 20 April, 2020;
originally announced April 2020.
-
Dynamic Coded Caching in Wireless Networks
Authors:
Jesper Pedersen,
Alexandre Graell i Amat,
Jasper Goseling,
Fredrik Brännström,
Iryna Andriyanova,
Eirik Rosnes
Abstract:
We consider distributed and dynamic caching of coded content at small base stations (SBSs) in an area served by a macro base station (MBS). Specifically, content is encoded using a maximum distance separable code and cached according to a time-to-live (TTL) cache eviction policy, which allows coded packets to be removed from the caches at periodic times. Mobile users requesting a particular content download coded packets from SBSs within communication range. If additional packets are required to decode the file, these are downloaded from the MBS. We formulate an optimization problem that is efficiently solved numerically, providing TTL caching policies minimizing the overall network load. We demonstrate that distributed coded caching using TTL caching policies can offer significant reductions in terms of network load when request arrivals are bursty. We show how the distributed coded caching problem utilizing TTL caching policies can be analyzed as a specific single cache, convex optimization problem. Our problem encompasses static caching and the single cache as special cases. We prove that, interestingly, static caching is optimal under a Poisson request process, and that for a single cache the optimization problem has a surprisingly simple solution.
Submitted 22 December, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Graph Refinement based Airway Extraction using Mean-Field Networks and Graph Neural Networks
Authors:
Raghavendra Selvan,
Thomas Kipf,
Max Welling,
Antonio Garcia-Uceda Juarez,
Jesper H Pedersen,
Jens Petersen,
Marleen de Bruijne
Abstract:
Graph refinement, or the task of obtaining subgraphs of interest from over-complete graphs, can have many varied applications. In this work, we extract trees or collections of sub-trees from image data by first deriving a graph-based representation of the volumetric data and then posing the tree extraction as a graph refinement task. We present two methods to perform graph refinement. First, we use mean-field approximation (MFA) to approximate the posterior density over the subgraphs from which the optimal subgraph of interest can be estimated. Mean field networks (MFNs) are used for inference based on the interpretation that iterations of MFA can be seen as feed-forward operations in a neural network. This allows us to learn the model parameters using gradient descent. Second, we present a supervised learning approach using graph neural networks (GNNs) which can be seen as generalisations of MFNs. Subgraphs are obtained by training a GNN-based graph refinement model to directly predict edge probabilities. We discuss connections between the two classes of methods and compare them for the task of extracting airways from 3D, low-dose, chest CT data. We show that both the MFN and GNN models show significant improvement when compared to one baseline method, that is similar to a top performing method in the EXACT'09 Challenge, and a 3D U-Net based airway segmentation model, in detecting more branches with fewer false positives.
Submitted 2 June, 2020; v1 submitted 21 November, 2018;
originally announced November 2018.
-
Extracting Tree-structures in CT data by Tracking Multiple Statistically Ranked Hypotheses
Authors:
Raghavendra Selvan,
Jens Petersen,
Jesper H Pedersen,
Marleen de Bruijne
Abstract:
In this work, we adapt a method based on multiple hypothesis tracking (MHT) that has been shown to give state-of-the-art vessel segmentation results in interactive settings, for the purpose of extracting trees. Regularly spaced tubular templates are fit to image data forming local hypotheses. These local hypotheses are used to construct the MHT tree, which is then traversed to make segmentation decisions. However, some critical parameters in this method are scale-dependent and have an adverse effect when tracking structures of varying dimensions. We propose to use statistical ranking of local hypotheses in constructing the MHT tree, which yields a probabilistic interpretation of scores across scales and helps alleviate the scale-dependence of MHT parameters. This enables our method to track trees starting from a single seed point. Our method is evaluated on chest CT data to extract airway trees and coronary arteries. In both cases, we show that our method performs significantly better than the original MHT method.
Submitted 10 July, 2019; v1 submitted 23 June, 2018;
originally announced June 2018.
-
Extraction of Airways using Graph Neural Networks
Authors:
Raghavendra Selvan,
Thomas Kipf,
Max Welling,
Jesper H. Pedersen,
Jens Petersen,
Marleen de Bruijne
Abstract:
We present extraction of tree structures, such as airways, from image data as a graph refinement task. To this end, we propose a graph auto-encoder model that uses an encoder based on graph neural networks (GNNs) to learn embeddings from input node features and a decoder to predict connections between nodes. Performance of the GNN model is compared with mean-field networks in their ability to extract airways from 3D chest CT scans.
Submitted 12 April, 2018;
originally announced April 2018.
-
Mean Field Network based Graph Refinement with application to Airway Tree Extraction
Authors:
Raghavendra Selvan,
Max Welling,
Jesper H. Pedersen,
Jens Petersen,
Marleen de Bruijne
Abstract:
We present tree extraction in 3D images as a graph refinement task of obtaining a subgraph from an over-complete input graph. To this end, we formulate an approximate Bayesian inference framework on undirected graphs using mean field approximation (MFA). Mean field networks are used for inference based on the interpretation that iterations of MFA can be seen as feed-forward operations in a neural network. This allows us to learn the model parameters from training data using the back-propagation algorithm. We demonstrate the usefulness of the model to extract airway trees from 3D chest CT data. We first obtain probability images using a voxel classifier that distinguishes airways from background and use Bayesian smoothing to model individual airway branches. This yields joint Gaussian density estimates of position, orientation and scale as node features of the input graph. Performance of the method is compared with two methods: the first uses probability images from a trained voxel classifier with region growing, which is similar to one of the best performing methods at the EXACT'09 airway challenge, and the second method is based on Bayesian smoothing on these probability images. Using centerline distance as the error measure, the presented method shows significant improvement compared to these two methods.
Submitted 10 April, 2018;
originally announced April 2018.
-
Extraction of Airways with Probabilistic State-space Models and Bayesian Smoothing
Authors:
Raghavendra Selvan,
Jens Petersen,
Jesper H. Pedersen,
Marleen de Bruijne
Abstract:
Segmenting tree structures is common in several image processing applications. In medical image analysis, reliable segmentations of airways, vessels, neurons and other tree structures can enable important clinical applications. We present a framework for tracking tree structures comprising elongated branches using probabilistic state-space models and Bayesian smoothing. Unlike most existing methods that proceed with sequential tracking of branches, we present an exploratory method that is less sensitive to local anomalies in the data due to acquisition noise and/or interfering structures. The evolution of individual branches is modelled using a process model and the observed data is incorporated into the update step of the Bayesian smoother using a measurement model that is based on a multi-scale blob detector. Bayesian smoothing is performed using the RTS (Rauch-Tung-Striebel) smoother, which provides Gaussian density estimates of branch states at each tracking step. We select likely branch seed points automatically based on the response of the blob detection and track from all such seed points using the RTS smoother. We use covariance of the marginal posterior density estimated for each branch to discriminate false positive and true positive branches. The method is evaluated on 3D chest CT scans to track airways. We show that the presented method results in additional branches compared to a baseline method based on region growing on probability images.
Submitted 7 August, 2017;
originally announced August 2017.
-
Classification of COPD with Multiple Instance Learning
Authors:
Veronika Cheplygina,
Lauge Sørensen,
David M. J. Tax,
Jesper Holst Pedersen,
Marco Loog,
Marleen de Bruijne
Abstract:
Chronic obstructive pulmonary disease (COPD) is a lung disease where early detection benefits the survival rate. COPD can be quantified by classifying patches of computed tomography images, and combining patch labels into an overall diagnosis for the image. As labeled patches are often not available, image labels are propagated to the patches, incorrectly labeling healthy patches in COPD patients as being affected by the disease. We approach quantification of COPD from lung images as a multiple instance learning (MIL) problem, which is more suitable for such weakly labeled data. We investigate various MIL assumptions in the context of COPD and show that although a concept region with COPD-related disease patterns is present, considering the whole distribution of lung tissue patches improves the performance. The best method is based on averaging instances and obtains an AUC of 0.742, which is higher than the previously reported best of 0.713 on the same dataset. Using the full training set further increases performance to 0.776, which is significantly higher (DeLong test) than previous results.
Submitted 15 March, 2017;
originally announced March 2017.
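One possible reading of "averaging instances" in the abstract above is to represent each scan (bag) by the mean feature vector of its patches (instances) and then train an ordinary classifier on those means. The sketch below follows that reading. The abstract does not say whether features or classifier outputs are averaged, nor which classifier is used, so the nearest-class-mean classifier and all function names here are assumptions for illustration.

```python
import numpy as np

def bag_means(bags):
    """Represent each bag (n_instances x n_features) by the mean of its instances."""
    return np.stack([np.asarray(b, dtype=float).mean(axis=0) for b in bags])

def fit_nearest_mean(X, y):
    """Store one centroid per class (a stand-in classifier, assumed for this sketch)."""
    y = np.asarray(y)
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_nearest_mean(model, X):
    """Assign each bag-level vector to the class with the closest centroid."""
    classes, centroids = model
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Hypothetical toy data: two "healthy" bags (label 0) and two "COPD" bags (label 1).
train_bags = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.1], [0.0, 0.3], [0.2, 0.2]],
    [[1.0, 0.9], [0.8, 1.1]],
    [[1.2, 1.0], [0.9, 0.9], [1.0, 1.2]],
]
train_labels = [0, 0, 1, 1]
model = fit_nearest_mean(bag_means(train_bags), train_labels)
predictions = predict_nearest_mean(model, bag_means([[[0.1, 0.0]], [[1.1, 1.0]]]))
```

Because every bag collapses to a single vector, this uses the whole distribution of patches rather than searching for a single concept region, which matches the direction of the abstract's finding without reproducing its method.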
-
Optimizing MDS Coded Caching in Wireless Networks with Device-to-Device Communication
Authors:
Jesper Pedersen,
Alexandre Graell i Amat,
Iryna Andriyanova,
Fredrik Brännström
Abstract:
We consider the caching of content in the mobile devices in a dense wireless network using maximum distance separable (MDS) codes. We focus on an area, served by a base station (BS), where mobile devices move around according to a random mobility model. Users requesting a particular file download coded packets from caching devices within a communication range, using device-to-device communication. If additional packets are required to decode the file, these are downloaded from the BS. We analyze the device mobility and derive a good approximation of the distribution of caching devices within the communication range of mobile devices at any given time. We then optimize the MDS codes to minimize the network load under a cache size constraint and show that using optimized MDS codes results in significantly lower network load compared to caching the most popular files. We further show numerically that caching coded packets of each file on all mobile devices, i.e., maximal spreading, is optimal.
Submitted 30 October, 2018; v1 submitted 23 January, 2017;
originally announced January 2017.
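The download rule described in the abstract above rests on the defining property of an MDS code: a file split into k packets and encoded into n coded packets can be decoded from any k distinct coded packets. A minimal sketch of the resulting base-station load for one request follows. The function name and the assumption that all packets cached on nearby devices are distinct are illustrative and not taken from the paper.

```python
def packets_from_base_station(k, cached_in_range):
    """Number of coded packets a user must still fetch from the BS.

    k               -- packets needed to decode the file (MDS property)
    cached_in_range -- packets held by each caching device within range,
                       assumed distinct across devices for this sketch
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if any(c < 0 for c in cached_in_range):
        raise ValueError("cached packet counts must be non-negative")
    available = sum(cached_in_range)
    # Anything beyond k packets is redundant; any shortfall comes from the BS.
    return max(0, k - available)

# Hypothetical request: a file needing 4 packets, two neighbours caching 1 each.
missing = packets_from_base_station(4, [1, 1])
```

Averaging this quantity over the distribution of caching devices in range would give an expected load per request; the approximation of that distribution and the optimization over code parameters are the paper's contributions and are not reproduced here.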
-
Transfer learning for multi-center classification of chronic obstructive pulmonary disease
Authors:
Veronika Cheplygina,
Isabel Pino Peña,
Jesper Holst Pedersen,
David A. Lynch,
Lauge Sørensen,
Marleen de Bruijne
Abstract:
Chronic obstructive pulmonary disease (COPD) is a lung disease which can be quantified using chest computed tomography (CT) scans. Recent studies have shown that COPD can be automatically diagnosed using weakly supervised learning of intensity and texture distributions. However, until now such classifiers have only been evaluated on scans from a single domain, and it is unclear whether they would generalize across domains, such as different scanners or scanning protocols. To address this problem, we investigate classification of COPD in a multi-center dataset with a total of 803 scans from three different centers and four different scanners, with heterogeneous subject distributions. Our method is based on Gaussian texture features and a weighted logistic classifier, which increases the weights of samples similar to the test data. We show that Gaussian texture features outperform intensity features previously used in multi-center classification tasks. We also show that a weighting strategy based on a classifier that is trained to discriminate between scans from different domains can further improve the results. To encourage further research into transfer learning methods for classification of COPD, upon acceptance of the paper we will release two feature datasets used in this study on http://bigr.nl/research/projects/copd
Submitted 23 November, 2017; v1 submitted 18 January, 2017;
originally announced January 2017.
-
Extraction of airway trees using multiple hypothesis tracking and template matching
Authors:
Raghavendra Selvan,
Jens Petersen,
Jesper H. Pedersen,
Marleen de Bruijne
Abstract:
Knowledge of airway tree morphology has important clinical applications in diagnosis of chronic obstructive pulmonary disease. We present an automatic tree extraction method based on multiple hypothesis tracking and template matching for this purpose and evaluate its performance on chest CT images. The method is adapted from a semi-automatic method devised for vessel segmentation. Idealized tubular templates are constructed that match airway probability obtained from a trained classifier and ranked based on their relative significance. Several such regularly spaced templates form the local hypotheses used in constructing a multiple hypothesis tree, which is then traversed to reach decisions. The proposed modifications remove the need for local thresholding of hypotheses as decisions are made entirely based on statistical comparisons involving the hypothesis tree. The results show improvements in performance when compared to the original method and region growing on intensity images. We also compare the method with region growing on the probability images, where the presented method does not show substantial improvement, but we expect it to be less sensitive to local anomalies in the data.
Submitted 24 November, 2016;
originally announced November 2016.
-
Distributed Storage in Mobile Wireless Networks with Device-to-Device Communication
Authors:
Jesper Pedersen,
Alexandre Graell i Amat,
Iryna Andriyanova,
Fredrik Brännström
Abstract:
We consider the use of distributed storage (DS) to reduce the communication cost of content delivery in wireless networks. Content is stored (cached) in a number of mobile devices using an erasure correcting code. Users retrieve content from other devices using device-to-device communication or from the base station (BS), at the expense of higher communication cost. We address the repair problem when a device storing data leaves the cell. We introduce a repair scheduling where repair is performed periodically and derive analytical expressions for the overall communication cost of content download and data repair as a function of the repair interval. The derived expressions are then used to evaluate the communication cost entailed by DS using several erasure correcting codes. Our results show that DS can reduce the communication cost with respect to the case where content is downloaded only from the BS, provided that repairs are performed frequently enough. If devices storing content arrive at the cell, the communication cost using DS is further reduced and, for a large enough arrival rate, it is always beneficial. Interestingly, we show that MDS codes, which do not perform well for classical DS, can yield a low overall communication cost in wireless DS.
Submitted 27 September, 2016; v1 submitted 4 January, 2016;
originally announced January 2016.
-
Repair Scheduling in Wireless Distributed Storage with D2D Communication
Authors:
Jesper Pedersen,
Alexandre Graell i Amat,
Iryna Andriyanova,
Fredrik Brännström
Abstract:
We consider distributed storage (DS) for a wireless network where mobile devices arrive and depart according to a Poisson random process. Content is stored in a number of mobile devices, using an erasure correcting code. When requesting a piece of content, a user retrieves the content from the mobile devices using device-to-device communication or, if not possible, from the base station (BS), at the expense of a higher communication cost. We consider the repair problem when a device that stores data leaves the network. In particular, we introduce a repair scheduling where repair is performed (from storage devices or the BS) periodically. We derive analytical expressions for the overall communication cost of repair and download as a function of the repair interval. We illustrate the analysis by giving results for maximum distance separable codes and regenerating codes. Our results indicate that DS can reduce the overall communication cost with respect to the case where content is only downloaded from the BS, provided that repairs are performed frequently enough. The required repair frequency depends on the code used for storage and the network parameters. In particular, minimum bandwidth regenerating codes require very frequent repairs, while maximum distance separable codes give better performance if repair is performed less frequently. We also show that instantaneous repair is not always optimal.
Submitted 7 August, 2015; v1 submitted 23 April, 2015;
originally announced April 2015.
-
Geometric tree kernels: Classification of COPD from airway tree geometry
Authors:
Aasa Feragen,
Jens Petersen,
Dominik Grimm,
Asger Dirksen,
Jesper Holst Pedersen,
Karsten Borgwardt,
Marleen de Bruijne
Abstract:
Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector-valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or other vector-valued properties. In addition to being flexible in their ability to model different types of attributes, the presented kernels are computationally efficient and some of them can easily be computed for large datasets (N on the order of 10,000) of trees with 30-600 branches. Combining the kernels with standard machine learning tools enables us to analyze the relation between disease and anatomical tree structure and geometry. Experimental results: The kernels are used to compare airway trees segmented from low-dose CT, endowed with branch shape descriptors and airway wall area percentage measurements made along the tree. Using kernelized hypothesis testing we show that the geometric airway trees are significantly differently distributed in patients with Chronic Obstructive Pulmonary Disease (COPD) than in healthy individuals. The geometric tree kernels also give a significant increase in the classification accuracy of COPD from geometric tree structure endowed with airway wall thickness measurements in comparison with state-of-the-art methods, giving further insight into the relationship between airway wall thickness and COPD. Software: Software for computing kernels and statistical tests is available at http://image.diku.dk/aasa/software.php.
Submitted 8 April, 2013; v1 submitted 29 March, 2013;
originally announced March 2013.
-
On the Complexity of Buffer Allocation in Message Passing Systems
Authors:
Alex Brodsky,
Jan B. Pedersen,
Alan Wagner
Abstract:
Message passing programs commonly use buffers to avoid unnecessary synchronizations and to improve performance by overlapping communication with computation. Unfortunately, using buffers makes the program no longer portable, potentially unable to complete on systems without a sufficient number of buffers. Effective buffer use entails that the minimum number needed for a safe execution be allocated.
We explore a variety of problems related to buffer allocation for safe and efficient execution of message passing programs. We show that determining the minimum number of buffers or verifying a buffer assignment are intractable problems. However, we give a polynomial time algorithm to determine the minimum number of buffers needed to allow for asynchronous execution. We extend these results to several different buffering schemes, which in some cases make the problems tractable.
Submitted 30 January, 2003;
originally announced January 2003.