-
Intrinsic uncertainties and where to find them
Authors:
Francesco Farina,
Lawrence Phillips,
Nicola J Richmond
Abstract:
We introduce a framework for uncertainty estimation that both describes and extends many existing methods. We consider typical hyperparameters involved in classical training as random variables and marginalise them out to capture various sources of uncertainty in the parameter space. We investigate which forms and combinations of marginalisation are most useful from a practical point of view on standard benchmarking data sets. Moreover, we discuss how some marginalisations may produce reliable estimates of uncertainty without the need for extensive hyperparameter tuning and/or large-scale ensembling.
Submitted 6 July, 2021;
originally announced July 2021.
-
Data efficiency in graph networks through equivariance
Authors:
Francesco Farina,
Emma Slade
Abstract:
We introduce a novel architecture for graph networks which is equivariant to any transformation in the coordinate embeddings that preserves the distance between neighbouring nodes. In particular, it is equivariant to the Euclidean and conformal orthogonal groups in $n$ dimensions. Thanks to its equivariance properties, the proposed model is far more data efficient than classical graph architectures and is also intrinsically equipped with a better inductive bias. We show that, learning on a minimal amount of data, the architecture we propose can perfectly generalise to unseen data in a synthetic problem, while a standard model requires much more training data to reach comparable performance.
Submitted 11 July, 2021; v1 submitted 25 June, 2021;
originally announced June 2021.
-
Symmetry-driven graph neural networks
Authors:
Francesco Farina,
Emma Slade
Abstract:
Exploiting symmetries and invariance in data is a powerful, yet not fully exploited, way to achieve better generalisation with more efficiency. In this paper, we introduce two graph network architectures that are equivariant to several types of transformations affecting the node coordinates. First, we build equivariance to any transformation in the coordinate embeddings that preserves the distance between neighbouring nodes, allowing for equivariance to the Euclidean group. Then, we introduce angle attributes to build equivariance to any angle preserving transformation - thus, to the conformal group. Thanks to their equivariance properties, the proposed models can be vastly more data efficient with respect to classical graph architectures, intrinsically equipped with a better inductive bias and better at generalising. We demonstrate these capabilities on a synthetic dataset composed of $n$-dimensional geometric objects. Additionally, we provide examples of their limitations when (the right) symmetries are not present in the data.
Submitted 28 May, 2021;
originally announced May 2021.
-
Beyond permutation equivariance in graph networks
Authors:
Emma Slade,
Francesco Farina
Abstract:
In this draft paper, we introduce a novel architecture for graph networks which is equivariant to the Euclidean group in $n$-dimensions. The model is designed to work with graph networks in their general form and can be shown to include particular variants as special cases. Thanks to its equivariance properties, we expect the proposed model to be more data efficient with respect to classical graph architectures and also intrinsically equipped with a better inductive bias. We defer investigating this matter to future work.
Submitted 21 May, 2021; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Recognizability of languages via deterministic finite automata with values on a monoid: General Myhill-Nerode Theorem
Authors:
José Ramón González de Mendívil,
Federico Fariña
Abstract:
This paper deals with the problem of recognizability of functions l: Sigma* --> M that map words to values in the support set M of a monoid (M,.,1). These functions are called M-languages. M-languages are studied from the aspect of their recognition by deterministic finite automata whose components take values on M (M-DFAs). The characterization of an M-language l is based on providing a right congruence on Sigma* that is defined through l and a factorization on the set of all M-languages, L(Sigma*,M) (in short, L). A factorization on L is a pair of functions (g,f) such that, for each l in L, g(l).f(l) = l, where g(l) in M and f(l) in L. In essence, a factorization is a form of common factor extraction. A general Myhill-Nerode theorem, which is valid for any L(Sigma*,M), is provided. Basically, l is recognized by an M-DFA if and only if there exists a factorization on L, (g,f), such that the right congruence on Sigma* induced by the factorization (g,f) and f(l) has finite index. This paper shows that the existence of M-DFAs guarantees the existence of natural non-trivial factorizations on L without taking into account any additional property of the monoid.
Submitted 11 February, 2021;
originally announced February 2021.
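To make the notion of an M-language concrete, the following is a minimal sketch of a deterministic automaton whose transitions and final states carry values in a monoid. The class name `MDFA`, its fields, and the rule "multiply the initial value, the transition values along the run, and the final value" are assumptions made for illustration; the abstract does not fix the exact definition, and the paper's formalisation may differ. The example uses the monoid of non-negative integers under addition to count occurrences of the factor "ab".

```python
from dataclasses import dataclass
from typing import Callable, Dict, Generic, Tuple, TypeVar

M = TypeVar("M")


@dataclass
class MDFA(Generic[M]):
    """Hypothetical M-DFA: a DFA whose components carry monoid values."""
    op: Callable[[M, M], M]                      # monoid operation
    initial_state: int
    initial_value: M                             # value attached to the start
    delta: Dict[Tuple[int, str], Tuple[int, M]]  # (state, symbol) -> (state, value)
    final_value: Dict[int, M]                    # value attached to each state

    def evaluate(self, word: str) -> M:
        """Return l(word): the product of the values met along the unique run."""
        state, value = self.initial_state, self.initial_value
        for symbol in word:
            state, weight = self.delta[(state, symbol)]
            value = self.op(value, weight)
        return self.op(value, self.final_value[state])


# Monoid (N, +, 0); l(w) = number of occurrences of "ab" in w.
# State 1 means "the previous symbol was 'a'", state 0 means it was not.
count_ab = MDFA(
    op=lambda x, y: x + y,
    initial_state=0,
    initial_value=0,
    delta={
        (0, "a"): (1, 0),
        (0, "b"): (0, 0),
        (1, "a"): (1, 0),
        (1, "b"): (0, 1),  # reading 'b' right after 'a' contributes 1
    },
    final_value={0: 0, 1: 0},
)
```

Under these assumptions, `count_ab.evaluate("abab")` maps the word to the value 2, so the automaton recognizes a function from words to the monoid rather than a set of accepted words.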
-
GTAdam: Gradient Tracking with Adaptive Momentum for Distributed Online Optimization
Authors:
Guido Carnevale,
Francesco Farina,
Ivano Notarnicola,
Giuseppe Notarstefano
Abstract:
This paper deals with a network of computing agents aiming to solve an online optimization problem in a distributed fashion, i.e., by means of local computation and communication, without any central coordinator. We propose the gradient tracking with adaptive momentum estimation (GTAdam) distributed algorithm, which combines a gradient tracking mechanism with first and second order momentum estimates of the gradient. The algorithm is analyzed in the online setting for strongly convex cost functions with Lipschitz continuous gradients. We provide an upper bound for the dynamic regret given by a term related to the initial conditions and another term related to the temporal variations of the objective functions. Moreover, a linear convergence rate is guaranteed in the static setup. The algorithm is tested on a time-varying classification problem, on a (moving) target localization problem, and in a stochastic optimization setup from image classification. In these numerical experiments from multi-agent learning, GTAdam outperforms state-of-the-art distributed optimization methods.
Submitted 12 September, 2023; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Collective Learning
Authors:
Francesco Farina
Abstract:
In this paper, we introduce the concept of collective learning (CL) which exploits the notion of collective intelligence in the field of distributed semi-supervised learning. The proposed framework draws inspiration from the learning behavior of human beings, who alternate phases involving collaboration, confrontation and exchange of views with other phases consisting of studying and learning on their own. In this regard, CL comprises two main phases: a self-training phase in which learning is performed on local private (labeled) data only and a collective training phase in which proxy-labels are assigned to shared (unlabeled) data by means of a consensus-based algorithm. In the considered framework, heterogeneous systems can be connected over the same network, each with different computational capabilities and resources, and every agent in the network may take advantage of the cooperation and eventually reach higher performance than it could reach on its own. An extensive experimental campaign on an image classification problem emphasizes the properties of CL by analyzing the performance achieved by the cooperating agents.
Submitted 26 May, 2021; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Asynchronous Distributed Learning from Constraints
Authors:
Francesco Farina,
Stefano Melacci,
Andrea Garulli,
Antonio Giannitrapani
Abstract:
In this paper, we study the extension of the framework of Learning from Constraints (LfC) to a distributed setting where multiple parties, connected over the network, contribute to the learning process. LfC relies on the generic notion of "constraint" to inject knowledge into the learning problem and, due to its generality, it deals with possibly nonconvex constraints, enforced either in a hard or soft way. Motivated by recent progress in the field of distributed and constrained nonconvex optimization, we apply the (distributed) Asynchronous Method of Multipliers (ASYMM) to LfC. The study shows that such a method allows us to support scenarios where selected constraints (i.e., knowledge), data, and outcomes of the learning process can be locally stored in each computational node without being shared with the rest of the network, opening the road to further investigations into privacy-preserving LfC. Constraints act as a bridge between what is shared over the net and what is private to each node, and no central authority is required. We demonstrate the applicability of these ideas in two distributed real-world settings in the context of digit recognition and document classification.
Submitted 13 November, 2019;
originally announced November 2019.
-
DISROPT: a Python Framework for Distributed Optimization
Authors:
Francesco Farina,
Andrea Camisa,
Andrea Testa,
Ivano Notarnicola,
Giuseppe Notarstefano
Abstract:
In this paper we introduce DISROPT, a Python package for distributed optimization over networks. We focus on cooperative set-ups in which an optimization problem must be solved by peer-to-peer processors (without central coordinators) that have access only to partial knowledge of the entire problem. To reflect this, agents in DISROPT are modeled as entities that are initialized with their local knowledge of the problem. Agents then run local routines and communicate with each other to solve the global optimization problem. A simple syntax has been designed to make the problems easy to model. The package comes with many distributed optimization algorithms already built in. Moreover, the package provides full-fledged functionalities for communication and local computation, which can be used to design and implement new algorithms. DISROPT is available at github.com/disropt/disropt under the GPL license, with complete documentation and many examples.
Submitted 20 May, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Upper Body Pose Estimation Using Wearable Inertial Sensors and Multiplicative Kalman Filter
Authors:
Tommaso Lisini Baldi,
Francesco Farina,
Andrea Garulli,
Antonio Giannitrapani,
Domenico Prattichizzo
Abstract:
Estimating limb pose with wearable devices may benefit multiple areas such as rehabilitation, teleoperation, human-robot interaction, gaming, and many more. Several solutions are commercially available, but they are usually expensive or not wearable/portable. We present a wearable pose estimation system (WePosE), based on inertial measurement units (IMUs), for motion analysis and body tracking. Unlike camera-based approaches, the proposed system does not suffer from occlusion problems or sensitivity to lighting conditions, is cost effective, and can be used in indoor and outdoor environments. Moreover, since only accelerometers and gyroscopes are used to estimate the orientation, the system can also be used in the presence of iron and magnetic disturbances. An experimental validation using a high precision optical tracker has been performed. Results confirmed the effectiveness of the proposed approach.
Submitted 24 September, 2019; v1 submitted 23 September, 2019;
originally announced September 2019.
-
Distributed Submodular Minimization via Block-Wise Updates and Communications
Authors:
Andrea Testa,
Francesco Farina,
Giuseppe Notarstefano
Abstract:
In this paper we deal with a network of computing agents with local processing and neighboring communication capabilities that aim at solving (without any central unit) a submodular optimization problem. The cost function is the sum of many local submodular functions and each agent in the network has access to one function in the sum only. In this distributed set-up, in order to preserve their own privacy, agents communicate with neighbors but do not share their local cost functions. We propose a distributed algorithm in which agents resort to the Lovász extension of their local submodular functions and perform local updates and communications in terms of single blocks of the entire optimization variable. Updates are performed by means of a greedy algorithm which is run only until the selected block is computed, thus resulting in a reduced computational burden. The proposed algorithm is shown to converge in expected value to the optimal cost of the problem, and an approximate solution to the submodular problem is retrieved by a thresholding operation. As an application, we consider a distributed image segmentation problem in which each agent has access only to a portion of the entire image. While agents cannot segment the entire image on their own, they correctly complete the task by cooperating through the proposed distributed algorithm.
Submitted 25 May, 2020; v1 submitted 31 May, 2019;
originally announced May 2019.
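The abstract above relies on the Lovász extension, a greedy algorithm, and a final thresholding step. As a rough illustration of those three ingredients only, the sketch below evaluates the Lovász extension of a set function with the standard greedy (sorting) formula and recovers a set by thresholding. It is a centralized, full-vector version written under the assumption that F(∅) = 0; it is not the block-wise distributed algorithm of the paper, and the names `lovasz_extension`, `threshold`, and `path_cut` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

SetFunction = Callable[[FrozenSet[int]], float]


def lovasz_extension(F: SetFunction, x: Sequence[float]) -> Tuple[float, List[float]]:
    """Greedy evaluation of the Lovász extension of F at x.

    Returns the value f(x) and the greedy vector g with f(x) = <g, x>.
    Assumes F(frozenset()) == 0; when F is submodular, g is a subgradient.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])  # decreasing x
    g = [0.0] * len(x)
    chosen: set = set()
    previous = F(frozenset())
    for i in order:
        chosen.add(i)
        current = F(frozenset(chosen))
        g[i] = current - previous  # marginal gain of adding element i
        previous = current
    value = sum(gi * xi for gi, xi in zip(g, x))
    return value, g


def threshold(x: Sequence[float], tau: float = 0.5) -> FrozenSet[int]:
    """Recover a set from a relaxed point by keeping entries above tau."""
    return frozenset(i for i, xi in enumerate(x) if xi > tau)


def path_cut(S: FrozenSet[int]) -> float:
    """Cut function of the path graph 0-1-2, a simple submodular function."""
    edges = [(0, 1), (1, 2)]
    return float(sum((a in S) != (b in S) for a, b in edges))
```

On an indicator vector the extension agrees with the set function itself, which is a convenient sanity check: `lovasz_extension(path_cut, [1.0, 0.0, 1.0])` evaluates to the cut value of the set {0, 2}.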