-
Aggregation Strategies for Efficient Annotation of Bioacoustic Sound Events Using Active Learning
Authors:
Richard Lindholm,
Oscar Marklund,
Olof Mogren,
John Martinsson
Abstract:
The vast amounts of audio data collected in Sound Event Detection (SED) applications require efficient annotation strategies to enable supervised learning. Manual labeling is expensive and time-consuming, making Active Learning (AL) a promising approach for reducing annotation effort. We introduce Top K Entropy, a novel uncertainty aggregation strategy for AL that prioritizes the most uncertain segments within an audio recording, instead of averaging uncertainty across all segments. This approach enables the selection of entire recordings for annotation, improving efficiency in sparse data scenarios. We compare Top K Entropy to random sampling and Mean Entropy, and show that fewer labels can lead to the same model performance, particularly in datasets with sparse sound events. Evaluations are conducted on audio mixtures of sound recordings from parks with meerkat, dog, and baby crying sound events, representing real-world bioacoustic monitoring scenarios. Using Top K Entropy for active learning, we can achieve comparable performance to training on the fully labeled dataset with only 8% of the labels. Top K Entropy outperforms Mean Entropy, suggesting that it is best to let the most uncertain segments represent the uncertainty of an audio file. The findings highlight the potential of AL for scalable annotation in audio and time-series applications, including bioacoustics.
Submitted 4 March, 2025;
originally announced March 2025.
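A minimal sketch of the two aggregation strategies contrasted in the abstract above. It assumes binary per-segment predictions and assumes that Top K Entropy averages the K highest segment entropies; the function names, the probabilities, and the choice K = 2 are illustrative and not taken from the paper.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli prediction with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mean_entropy(segment_probs):
    """Recording-level uncertainty: average entropy over all segments."""
    entropies = [binary_entropy(p) for p in segment_probs]
    return sum(entropies) / len(entropies)


def top_k_entropy(segment_probs, k):
    """Recording-level uncertainty: average of the k most uncertain segments.

    Assumption: the top-k entropies are averaged; the paper may aggregate
    them differently.
    """
    entropies = sorted((binary_entropy(p) for p in segment_probs), reverse=True)
    top = entropies[:k]
    return sum(top) / len(top)


# Hypothetical recordings, 20 segments each.
# "sparse": mostly confident predictions, two highly uncertain segments.
# "diffuse": mildly uncertain predictions everywhere.
sparse = [0.01] * 18 + [0.5, 0.45]
diffuse = [0.2] * 20

mean_scores = {"sparse": mean_entropy(sparse), "diffuse": mean_entropy(diffuse)}
top_k_scores = {"sparse": top_k_entropy(sparse, 2), "diffuse": top_k_entropy(diffuse, 2)}
```

In this toy setting, averaging over all segments ranks the diffuse recording as more uncertain, while the top-K variant ranks the sparse recording first, which is the behavior the abstract motivates for sparse sound events.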
-
The Accuracy Cost of Weakness: A Theoretical Analysis of Fixed-Segment Weak Labeling for Events in Time
Authors:
John Martinsson,
Olof Mogren,
Tuomas Virtanen,
Maria Sandsten
Abstract:
Accurate labels are critical for deriving robust machine learning models. Labels are used to train supervised learning models and to evaluate most machine learning paradigms. In this paper, we model the accuracy and cost of a common weak labeling process where annotators assign presence or absence labels to fixed-length data segments for a given event class. The annotator labels a segment as "present" if it sufficiently covers an event from that class, e.g., a birdsong sound event in audio data. We analyze how the segment length affects the label accuracy and the required number of annotations, and compare this fixed-length labeling approach with an oracle method that uses the true event activations to construct the segments. Furthermore, we quantify the gap between these methods and verify that in most realistic scenarios the oracle method is better than the fixed-length labeling method in both accuracy and cost. Our findings provide a theoretical justification for adaptive weak labeling strategies that mimic the oracle process, and a foundation for optimizing weak labeling processes in sequence labeling tasks.
Submitted 13 February, 2025;
originally announced February 2025.
-
Flexible SE(2) graph neural networks with applications to PDE surrogates
Authors:
Maria Bånkestad,
Olof Mogren,
Aleksis Pirinen
Abstract:
This paper presents a novel approach for constructing graph neural networks equivariant to 2D rotations and translations and leveraging them as PDE surrogates on non-gridded domains. We show that aligning the representations with the principal axis allows us to sidestep many constraints while preserving SE(2) equivariance. By applying our model as a surrogate for fluid flow simulations and conducting thorough benchmarks against non-equivariant models, we demonstrate significant gains in terms of both data efficiency and accuracy.
Submitted 30 May, 2024;
originally announced May 2024.
-
From Weak to Strong Sound Event Labels using Adaptive Change-Point Detection and Active Learning
Authors:
John Martinsson,
Olof Mogren,
Maria Sandsten,
Tuomas Virtanen
Abstract:
We propose an adaptive change point detection method (A-CPD) for machine guided weak label annotation of audio recording segments. The goal is to maximize the amount of information gained about the temporal activations of the target sounds. For each unlabeled audio recording, we use a prediction model to derive a probability curve used to guide annotation. The prediction model is initially pre-trained on available annotated sound event data with classes that are disjoint from the classes in the unlabeled dataset. The prediction model then gradually adapts to the annotations provided by the annotator in an active learning loop. We derive query segments to guide the weak label annotator towards strong labels, using change point detection on these probabilities. We show that it is possible to derive strong labels of high quality with a limited annotation budget, and show favorable results for A-CPD when compared to two baseline query segment strategies.
Submitted 26 August, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Impacts of Color and Texture Distortions on Earth Observation Data in Deep Learning
Authors:
Martin Willbo,
Aleksis Pirinen,
John Martinsson,
Edvin Listo Zec,
Olof Mogren,
Mikael Nilsson
Abstract:
Land cover classification and change detection are two important applications of remote sensing and Earth observation (EO) that have benefited greatly from the advances of deep learning. Convolutional and transformer-based U-net models are the state-of-the-art architectures for these tasks, and their performances have been boosted by an increased availability of large-scale annotated EO datasets. However, the influence of different visual characteristics of the input EO data on a model's predictions is not well understood. In this work we systematically examine model sensitivities with respect to several color- and texture-based distortions on the input EO data during inference, given models that have been trained without such distortions. We conduct experiments with multiple state-of-the-art segmentation networks for land cover classification and show that they are in general more sensitive to texture than to color distortions. Beyond revealing intriguing characteristics of widely used land cover classification models, our results can also be used to guide the development of more robust models within the EO domain.
Submitted 12 April, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Concept-aware clustering for decentralized deep learning under temporal shift
Authors:
Marcus Toftås,
Emilie Klefbom,
Edvin Listo Zec,
Martin Willbo,
Olof Mogren
Abstract:
Decentralized deep learning requires dealing with non-iid data across clients, which may also change over time due to temporal shifts. While non-iid data has been extensively studied in distributed settings, temporal shifts have received no attention. To the best of our knowledge, we are the first to tackle the novel and challenging problem of decentralized learning with non-iid and dynamic data. We propose a novel algorithm that can automatically discover and adapt to the evolving concepts in the network, without any prior knowledge or estimation of the number of concepts. We evaluate our algorithm on standard benchmark datasets and demonstrate that it outperforms previous methods for decentralized learning.
Submitted 22 June, 2023;
originally announced June 2023.
-
Grammatical gender in Swedish is predictable using recurrent neural networks
Authors:
Edvin Listo Zec,
Olof Mogren
Abstract:
The grammatical gender of Swedish nouns is a mystery. While there are a few rules that can indicate the gender with some certainty, it does not in general depend on either the meaning or the structure of the word. In this paper we demonstrate the surprising fact that grammatical gender for Swedish nouns can be predicted with high accuracy using a recurrent neural network (RNN) working on the raw character sequence of the word, without using any contextual information.
Submitted 19 June, 2023;
originally announced June 2023.
-
Fully Convolutional Networks for Dense Water Flow Intensity Prediction in Swedish Catchment Areas
Authors:
Aleksis Pirinen,
Olof Mogren,
Mårten Västerdal
Abstract:
Intensifying climate change will lead to more extreme weather events, including heavy rainfall and drought. Accurate stream flow prediction models which are adaptable and robust to new circumstances in a changing climate will be an important source of information for decisions on climate adaptation efforts, especially regarding mitigation of the risks of and damages associated with flooding. In this work we propose a machine learning-based approach for predicting water flow intensities in inland watercourses based on the physical characteristics of the catchment areas, obtained from geospatial data (including elevation and soil maps, as well as satellite imagery), in addition to temporal information about past rainfall quantities and temperature variations. We target the one-day-ahead regime, where a fully convolutional neural network model receives spatio-temporal inputs and predicts the water flow intensity in every coordinate of the spatial input for the subsequent day. To the best of our knowledge, we are the first to tackle the task of dense water flow intensity prediction; earlier works have considered predicting flow intensities at a sparse set of locations at a time. An extensive set of model evaluations and ablations are performed, which empirically justify our various design choices. Code and preprocessed data have been made publicly available at https://github.com/aleksispi/fcn-water-flow.
Submitted 4 April, 2023;
originally announced April 2023.
-
Efficient Node Selection in Private Personalized Decentralized Learning
Authors:
Edvin Listo Zec,
Johan Östman,
Olof Mogren,
Daniel Gillblad
Abstract:
Personalized decentralized learning is a promising paradigm for distributed learning, enabling each node to train a local model on its own data and collaborate with other nodes to improve without sharing any data. However, this approach poses significant privacy risks, as nodes may inadvertently disclose sensitive information about their data or preferences through their collaboration choices. In this paper, we propose Private Personalized Decentralized Learning (PPDL), a novel approach that combines secure aggregation and correlated adversarial multi-armed bandit optimization to protect node privacy while facilitating efficient node selection. By leveraging dependencies between different arms, represented by potential collaborators, we demonstrate that PPDL can effectively identify suitable collaborators solely based on aggregated models. Additionally, we show that PPDL surpasses previous non-private methods in model performance on standard benchmarks under label and covariate shift scenarios.
Submitted 15 January, 2024; v1 submitted 30 January, 2023;
originally announced January 2023.
-
EFFGAN: Ensembles of fine-tuned federated GANs
Authors:
Ebba Ekblom,
Edvin Listo Zec,
Olof Mogren
Abstract:
Generative adversarial networks have proven to be a powerful tool for learning complex and high-dimensional data distributions, but issues such as mode collapse have been shown to make it difficult to train them. This is an even harder problem when the data is decentralized over several clients in a federated learning setup, as problems such as client drift and non-iid data make it hard for federated averaging to converge.
In this work, we study the task of how to learn a data distribution when training data is heterogeneously decentralized over clients and cannot be shared. Our goal is to sample from this distribution centrally, while the data never leaves the clients. We show using standard benchmark image datasets that existing approaches fail in this setting, experiencing so-called client drift when the local number of epochs becomes too large. We thus propose a novel approach we call EFFGAN: Ensembles of fine-tuned federated GANs. Being an ensemble of local expert generators, EFFGAN is able to learn the data distribution over all clients and mitigate client drift. It is able to train with a large number of local epochs, making it more communication efficient than previous works.
Submitted 31 October, 2022; v1 submitted 23 June, 2022;
originally announced June 2022.
-
Decentralized adaptive clustering of deep nets is beneficial for client collaboration
Authors:
Edvin Listo Zec,
Ebba Ekblom,
Martin Willbo,
Olof Mogren,
Sarunas Girdzijauskas
Abstract:
We study the problem of training personalized deep learning models in a decentralized peer-to-peer setting, focusing on the setting where data distributions differ between the clients and where different clients have different local learning tasks. We study both covariate and label shift, and our contribution is an algorithm which for each client finds beneficial collaborations based on a similarity estimate for the local task. Our method does not rely on hyperparameters which are hard to estimate, such as the number of client clusters, but rather continuously adapts to the network topology using soft cluster assignment based on a novel adaptive gossip algorithm. We test the proposed method in various settings where data is not independent and identically distributed among the clients. The experimental evaluation shows that the proposed method performs better than previous state-of-the-art algorithms for this problem setting, and handles situations well where previous methods fail.
Submitted 31 October, 2022; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Decentralized federated learning of deep neural networks on non-iid data
Authors:
Noa Onoszko,
Gustav Karlsson,
Olof Mogren,
Edvin Listo Zec
Abstract:
We tackle the non-convex problem of learning a personalized deep learning model in a decentralized setting. More specifically, we study decentralized federated learning, a peer-to-peer setting where data is distributed among many clients and where there is no central server to orchestrate the training. In real-world scenarios, the data distributions are often heterogeneous between clients. Therefore, in this work we study the problem of how to efficiently learn a model in a peer-to-peer system with non-iid client data. We propose a method named Performance-Based Neighbor Selection (PENS) where clients with similar data distributions detect each other and cooperate by evaluating their training losses on each other's data to learn a model suitable for the local data distribution. Our experiments on benchmark datasets show that our proposed method achieves higher accuracy than strong baselines.
Submitted 20 July, 2021; v1 submitted 18 July, 2021;
originally announced July 2021.
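A toy sketch of loss-based neighbor selection in the spirit of the PENS abstract above. It assumes each client scores the models it receives from peers by their loss on its own local data and keeps the lowest-loss peers; the one-parameter linear models, the squared-error loss, the peer names, and the parameter `m` are all hypothetical and not taken from the paper.

```python
def mse(weight, data):
    """Mean squared error of the one-parameter model y = weight * x."""
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)


def select_neighbors(local_data, peer_models, m):
    """Return the m peers whose models have the lowest loss on local_data.

    Assumption: a low loss on the local data is used as a proxy for a
    similar data distribution at the peer.
    """
    losses = {peer: mse(w, local_data) for peer, w in peer_models.items()}
    return sorted(losses, key=losses.get)[:m]


# Hypothetical client whose local data follows y = 2x.
local_data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]

# Hypothetical model parameters received from four peers.
peer_models = {"a": 2.1, "b": -1.0, "c": 1.8, "d": 5.0}

selected = select_neighbors(local_data, peer_models, m=2)
```

Here peers "a" and "c" hold models closest to the local data-generating process, so they are the ones selected for further collaboration.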
-
Scaling Federated Learning for Fine-tuning of Large Language Models
Authors:
Agrin Hilmkil,
Sebastian Callh,
Matteo Barbieri,
Leon René Sütfeld,
Edvin Listo Zec,
Olof Mogren
Abstract:
Federated learning (FL) is a promising approach to distributed compute, as well as distributed data, and provides a level of privacy and compliance with legal frameworks. This makes FL attractive for both consumer and healthcare applications. While the area is actively being explored, few studies have examined FL in the context of larger language models and there is a lack of comprehensive reviews of robustness across tasks, architectures, numbers of clients, and other relevant factors. In this paper, we explore the fine-tuning of Transformer-based language models in a federated learning setting. We evaluate three popular BERT-variants of different sizes (BERT, ALBERT, and DistilBERT) on a number of text classification tasks such as sentiment analysis and author identification. We perform an extensive sweep over the number of clients, ranging up to 32, to evaluate the impact of distributed compute on task performance in the federated averaging setting. While our findings suggest that the large sizes of the evaluated models are not generally prohibitive to federated training, we found that the different models handle federated averaging to a varying degree. Most notably, DistilBERT converges significantly more slowly with larger numbers of clients, and under some circumstances, even collapses to chance-level performance. Investigating this issue presents an interesting perspective for future research.
Submitted 1 February, 2021;
originally announced February 2021.
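For readers unfamiliar with the federated averaging setting mentioned in the abstract above, the server-side aggregation step can be sketched as a sample-weighted average of client parameters. This is a generic illustration of federated averaging on flat parameter lists, not the paper's training code; the function name and the numbers are illustrative.

```python
def federated_average(client_params, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_params: list of equally long lists of floats, one per client.
    client_sizes:  number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients; the second holds three times as much data.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Each communication round would repeat this step after the clients have trained locally starting from the previous global parameters.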
-
Specialized federated learning using a mixture of experts
Authors:
Edvin Listo Zec,
Olof Mogren,
John Martinsson,
Leon René Sütfeld,
Daniel Gillblad
Abstract:
In federated learning, clients share a global model that has been trained on decentralized local client data. Although federated learning shows significant promise as a key approach when data cannot be shared or centralized, current methods show limited privacy properties and have shortcomings when applied to common real-world scenarios, especially when client data is heterogeneous. In this paper, we propose an alternative method to learn a personalized model for each client in a federated setting, with greater generalization abilities than previous methods. To achieve this personalization we propose a federated learning framework using a mixture of experts to combine the specialist nature of a locally trained model with the generalist knowledge of a global model. We evaluate our method on a variety of datasets with different levels of data heterogeneity, and our results show that the mixture of experts model is better suited as a personalized model for devices in these settings, outperforming both fine-tuned global models and local specialists.
Submitted 8 February, 2021; v1 submitted 5 October, 2020;
originally announced October 2020.
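A minimal sketch of how a mixture of experts could combine a local specialist with a global generalist, as described in the abstract above. It assumes a scalar input, a sigmoid gate, and a convex combination of the two model outputs; the gate parameterization and the stand-in models are hypothetical and not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mixture_predict(x, local_model, global_model, gate_weight, gate_bias):
    """Blend the local and global predictions with an input-dependent gate.

    Assumption: the gate is a logistic function of the input; g close to 1
    trusts the local specialist, g close to 0 trusts the global model.
    """
    g = sigmoid(gate_weight * x + gate_bias)
    return g * local_model(x) + (1.0 - g) * global_model(x)


# Hypothetical stand-ins for trained models that output a class probability.
def local_model(x):
    return 0.9


def global_model(x):
    return 0.1


balanced = mixture_predict(1.0, local_model, global_model, gate_weight=0.0, gate_bias=0.0)
mostly_local = mixture_predict(1.0, local_model, global_model, gate_weight=0.0, gate_bias=10.0)
```

With a zero gate the two experts are weighted equally, while a large positive bias makes the output follow the local specialist almost entirely.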
-
Adversarial representation learning for private speech generation
Authors:
David Ericsson,
Adam Östberg,
Edvin Listo Zec,
John Martinsson,
Olof Mogren
Abstract:
As more and more data is collected in various settings across organizations, companies, and countries, there has been an increase in the demand for user privacy. Developing privacy-preserving methods for data analytics is thus an important area of research. In this work we present a model based on generative adversarial networks (GANs) that learns to obfuscate specific sensitive attributes in speech data. We train a model that learns to hide sensitive information in the data, while preserving the meaning in the utterance. The model is trained in two steps: first to filter sensitive information in the spectrogram domain, and then to generate new and private information independent of the filtered one. The model is based on a U-Net CNN that takes mel-spectrograms as input. A MelGAN is used to invert the spectrograms back to raw audio waveforms. We show that it is possible to hide sensitive information such as gender by generating new data, trained adversarially to maintain utility and realism.
Submitted 17 June, 2020; v1 submitted 16 June, 2020;
originally announced June 2020.
-
Adversarial representation learning for synthetic replacement of private attributes
Authors:
John Martinsson,
Edvin Listo Zec,
Daniel Gillblad,
Olof Mogren
Abstract:
Data privacy is an increasingly important aspect of many real-world Data sources that contain sensitive information may have immense potential which could be unlocked using the right privacy-enhancing transformations, but current methods often fail to produce convincing output. Furthermore, finding the right balance between privacy and utility is often a tricky trade-off. In this work, we propose a novel approach for data privatization, which involves two steps: in the first step, it removes the sensitive information, and in the second step, it replaces this information with an independent random sample. Our method builds on adversarial representation learning which ensures strong privacy by training the model to fool an increasingly strong adversary. While previous methods only aim at obfuscating the sensitive information, we find that adding new random information in its place strengthens the provided privacy and provides better utility at any given level of privacy. The result is an approach that can provide stronger privatization on image data while preserving both the domain and the utility of the inputs, entirely independent of the downstream task.
Submitted 8 February, 2021; v1 submitted 14 June, 2020;
originally announced June 2020.
-
C-RNN-GAN: Continuous recurrent neural networks with adversarial training
Authors:
Olof Mogren
Abstract:
Generative adversarial networks have been proposed as a way of efficiently training deep generative neural networks. We propose a generative adversarial model that works on continuous sequential data, and apply it by training it on a collection of classical music. We conclude that it generates music that sounds better and better as the model is trained, report statistics on generated music, and let the reader judge the quality by downloading the generated songs.
Submitted 29 November, 2016;
originally announced November 2016.
-
Adaptive Dynamics of Realistic Small-World Networks
Authors:
Olof Mogren,
Oskar Sandberg,
Vilhelm Verendel,
Devdatt Dubhashi
Abstract:
Continuing in the steps of Jon Kleinberg's and others' celebrated work on decentralized search in small-world networks, we conduct an experimental analysis of a dynamic algorithm that produces small-world networks. We find that the algorithm adapts robustly to a wide variety of situations in realistic geographic networks with synthetic test data and with real-world data, even when vertices are unevenly and non-homogeneously distributed.
We investigate the same algorithm in the case where some vertices are more popular destinations for searches than others, for example obeying power-laws. We find that the algorithm adapts and adjusts the networks according to the distributions, leading to improved performance. The ability of the dynamic process to adapt and create small worlds in such diverse settings suggests a possible mechanism by which such networks appear in nature.
Submitted 7 April, 2008;
originally announced April 2008.