-
Evaluating Uncertainty in Deep Gaussian Processes
Authors:
Matthijs van der Lende,
Jeremias Lino Ferrao,
Niclas Müller-Hof
Abstract:
Reliable uncertainty estimates are crucial in modern machine learning. Deep Gaussian Processes (DGPs) and Deep Sigma Point Processes (DSPPs) extend GPs hierarchically, offering promising methods for uncertainty quantification grounded in Bayesian principles. However, their empirical calibration and robustness under distribution shift relative to baselines like Deep Ensembles remain understudied. This work evaluates these models on regression (CASP dataset) and classification (ESR dataset) tasks, assessing predictive performance (MAE, Accuracy), calibration using Negative Log-Likelihood (NLL) and Expected Calibration Error (ECE), alongside robustness under various synthetic feature-level distribution shifts. Results indicate DSPPs provide strong in-distribution calibration leveraging their sigma point approximations. However, compared to Deep Ensembles, which demonstrated superior robustness in both performance and calibration under the tested shifts, the GP-based methods showed vulnerabilities, exhibiting particular sensitivity in the observed metrics. Our findings underscore ensembles as a robust baseline, suggesting that while deep GP methods offer good in-distribution calibration, their practical robustness under distribution shift requires careful evaluation. To facilitate reproducibility, we make our code available at https://github.com/matthjs/xai-gp.
Submitted 24 April, 2025;
originally announced April 2025.
-
Testing and validation of innovative eXtended Reality technologies for astronaut training in a partial-gravity parabolic flight campaign
Authors:
Florian Saling,
Andrea Emanuele Maria Casini,
Andreas Treuer,
Martial Costantini,
Leonie Bensch,
Tommy Nilsson,
Lionel Ferra
Abstract:
The use of eXtended Reality (XR) technologies in the space domain has increased significantly over the past few years, as they can offer many advantages when simulating complex and challenging environments. Space agencies are currently using these disruptive tools to train astronauts for Extravehicular Activities (EVAs), to test equipment and procedures, and to assess spacecraft and hardware designs. With the Moon being the current focus of the next generation of space exploration missions, simulating its harsh environment is one of the key areas where XR can be applied, particularly for astronaut training. Peculiar lunar lighting conditions in combination with reduced gravity levels will strongly affect human locomotion, especially for movements such as walking, jumping, and running. In order to execute operations on the lunar surface and to live safely on the Moon for an extended period of time, innovative training methodologies and tools such as XR are becoming paramount for pre-mission validation and certification. This work presents the findings of experiments aimed at exploring the integration of XR technology and parabolic flight activities for astronaut training. In addition, the study aims to consolidate these findings into a set of guidelines that can assist future researchers who wish to incorporate XR technology into lunar training and preparation activities, including the use of such XR tools during long-duration missions.
Submitted 18 October, 2024;
originally announced October 2024.
-
Robust Contract Evolution in a TypeSafe MicroServices Architecture
Authors:
João Costa Seco,
Paulo Ferreira,
Hugo Lourenço,
Carla Ferreira,
Lucio Ferrao
Abstract:
Microservices architectures allow for short deployment cycles and immediate effects but offer no safety mechanisms when service contracts need to be changed. Maintaining the soundness of microservice architectures is an error-prone task that is only accessible to the most disciplined development teams. We present a microservice management system that statically verifies service interfaces and supports the seamless evolution of compatible interfaces. We define a compatibility relation that captures real evolution patterns and embodies known good practices on the evolution of interfaces. Namely, we allow for the addition, removal, and renaming of data fields of a producer module without breaking or needing to upgrade consumer services. The evolution of interfaces is supported by runtime-generated proxy components that dynamically adapt data exchanged between services to match the statically checked service code. The model was instantiated in a core language whose semantics is defined by a labeled transition system and a type system that prevents breaking changes from being deployed. Standard soundness results for the core language entail the existence of adapters, hence the absence of adaptation errors and the correctness of the management model. This adaptive approach allows for gradual deployment of modules without halting the whole system, and avoids losing or misinterpreting data exchanged between system nodes. Experimental data shows that an average of 69% of deployments that would require adaptation and recompilation are safe under our approach.
Submitted 14 February, 2020;
originally announced February 2020.
-
Análise de Segurança Baseada em Roles para Fábricas de Software (Role-Based Security Analysis for Software Factories)
Authors:
Miguel Loureiro,
Luísa Lourenço,
Lúcio Ferrão,
Carla Ferreira
Abstract:
Most software factories contain applications with sensitive information that needs to be protected against breaches of confidentiality and integrity, which can have serious consequences. In the context of large factories with complex applications, it is not feasible to manually analyze accesses to sensitive information without some form of safety mechanisms. This article presents a static analysis technique for software factories, based on role-based security policies. We start by synthesising a graph representation of the relevant software factories, based on the security policy defined by the user. Later, the graph model is analysed to find access information where the security policy is breached, ensuring that all possible execution states are analysed. A proof of concept of our technique has been developed for the analysis of OutSystems software factories. The security reports generated by the tool allow developers to find and prioritise security breaches in their factories. The prototype was evaluated using large software factories with strong safety requirements. Several security flaws were found, including some serious ones that would be hard to detect without our analysis.
Submitted 9 September, 2019;
originally announced September 2019.