Search | arXiv e-print repository

Quantifying Local Model Validity using Active Learning

Authors: Sven Lämmle, Can Bogoclu, Robert Voßhall, Anselm Haselhoff, Dirk Roos

Abstract: Real-world applications of machine learning models are often subject to legal or policy-based regulations. Some of these regulations require ensuring the validity of the model, i.e., the approximation error being smaller than a threshold. A global metric is generally too insensitive to determine the validity of a specific prediction, whereas evaluating local validity is costly since it requires gathering additional data. We propose learning the model error to acquire a local validity estimate while reducing the amount of required data through active learning. Using model validation benchmarks, we provide empirical evidence that the proposed method can lead to an error model with sufficient discriminative properties using a relatively small amount of data. Furthermore, an increased sensitivity to local changes of the validity bounds compared to alternative approaches is demonstrated.

Submitted 17 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

Comments: 40th Conference on Uncertainty in Artificial Intelligence

arXiv:2403.15908 [pdf, other]

doi 10.1109/ACDSA59508.2024.10467448

Deep Gaussian Covariance Network with Trajectory Sampling for Data-Efficient Policy Search

Authors: Can Bogoclu, Robert Vosshall, Kevin Cremanns, Dirk Roos

Abstract: Probabilistic world models increase data efficiency of model-based reinforcement learning (MBRL) by guiding the policy with their epistemic uncertainty to improve exploration and acquire new samples. Moreover, the uncertainty-aware learning procedures in probabilistic approaches lead to robust policies that are less sensitive to noisy observations compared to uncertainty-unaware solutions. We propose to combine trajectory sampling and deep Gaussian covariance network (DGCN) for a data-efficient solution to MBRL problems in an optimal control setting. We compare trajectory sampling with density-based approximation for uncertainty propagation using three different probabilistic world models: Gaussian processes, Bayesian neural networks, and DGCNs. We provide empirical evidence using four different well-known test environments that our method improves the sample efficiency over other combinations of uncertainty propagation methods and probabilistic models. During our tests, we place particular emphasis on the robustness of the learned policies with respect to noisy initial states.

Submitted 23 March, 2024; originally announced March 2024.

arXiv:2401.10355 [pdf, other]

doi 10.1016/j.ymssp.2019.106250

Intelligent Optimization and Machine Learning Algorithms for Structural Anomaly Detection using Seismic Signals

Authors: Maximilian Trapp, Can Bogoclu, Tamara Nestorović, Dirk Roos

Abstract: The lack of anomaly detection methods during mechanized tunnelling can cause financial loss and deficits in drilling time. On-site excavation requires hard obstacles to be recognized prior to drilling in order to avoid damaging the tunnel boring machine and to adjust the propagation velocity. The efficiency of the structural anomaly detection can be increased with intelligent optimization techniques and machine learning. In this research, the anomaly in a simple structure is detected by comparing the experimental measurements of the structural vibrations with numerical simulations using parameter estimation methods.

Submitted 18 January, 2024; originally announced January 2024.

Journal ref: Mechanical Systems and Signal Processing, Volume 133, 2019, 106250

arXiv:2310.00110 [pdf, other]

doi 10.1016/j.cma.2023.116226

Gradient and Uncertainty Enhanced Sequential Sampling for Global Fit

Authors: Sven Lämmle, Can Bogoclu, Kevin Cremanns, Dirk Roos

Abstract: Surrogate models based on machine learning methods have become an important part of modern engineering to replace costly computer simulations. The data used for creating a surrogate model are essential for the model accuracy and often restricted due to cost and time constraints. Adaptive sampling strategies have been shown to reduce the number of samples needed to create an accurate model. This paper proposes a new sampling strategy for global fit called Gradient and Uncertainty Enhanced Sequential Sampling (GUESS). The acquisition function uses two terms: the predictive posterior uncertainty of the surrogate model for exploration of unseen regions and a weighted approximation of the second and higher-order Taylor expansion values for exploitation. Although various sampling strategies have been proposed so far, the selection of a suitable method is not trivial. Therefore, we compared our proposed strategy to 9 adaptive sampling strategies for global surrogate modeling, based on 26 different 1- to 8-dimensional deterministic benchmark functions. Results show that GUESS achieved on average the highest sample efficiency compared to other surrogate-based strategies on the tested examples. An ablation study considering the behavior of GUESS in higher dimensions and the importance of surrogate choice is also presented.

Submitted 29 September, 2023; originally announced October 2023.

arXiv:2205.09546 [pdf, other]

Deterministic training of generative autoencoders using invertible layers

Authors: Gianluigi Silvestri, Daan Roos, Luca Ambrogioni

Abstract: In this work, we provide a deterministic alternative to the stochastic variational training of generative autoencoders. We refer to these new generative autoencoders as AutoEncoders within Flows (AEF), since the encoder and decoder are defined as affine layers of an overall invertible architecture. This results in a deterministic encoding of the data, as opposed to the stochastic encoding of VAEs. The paper introduces two related families of AEFs. The first family relies on a partition of the ambient space and is trained by exact maximum-likelihood. The second family exploits a deterministic expansion of the ambient space and is trained by maximizing the log-probability in this extended space. This latter case leaves complete freedom in the choice of encoder, decoder and prior architectures, making it a drop-in replacement for the training of existing VAEs and VAE-style models. We show that these AEFs can have strikingly higher performance than architecturally identical VAEs in terms of log-likelihood and sample quality, especially for low dimensional latent spaces. Importantly, we show that AEF samples are substantially sharper than VAE samples.

Submitted 3 March, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

Comments: International Conference on Learning Representations 2023

arXiv:2108.08890 [pdf, other]

doi 10.1016/j.asoc.2021.107807

Local Latin Hypercube Refinement for Multi-objective Design Uncertainty Optimization

Authors: Can Bogoclu, Dirk Roos, Tamara Nestorović

Abstract: Optimizing the reliability and the robustness of a design is important but often unaffordable due to high sample requirements. Surrogate models based on statistical and machine learning methods are used to increase the sample efficiency. However, for higher dimensional or multi-modal systems, surrogate models may also require a large number of samples to achieve good results. We propose a sequential sampling strategy for the surrogate-based solution of multi-objective reliability-based robust design optimization problems. The proposed local Latin hypercube refinement (LoLHR) strategy is model-agnostic and can be combined with any surrogate model because there is no free lunch but possibly a budget one. The proposed method is compared to stationary sampling as well as other proposed strategies from the literature. Gaussian process and support vector regression are both used as surrogate models. Empirical evidence is presented, showing that LoLHR achieves on average better results compared to other surrogate-based strategies on the tested examples.

Submitted 5 May, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: The code repository can be found at https://github.com/canbooo/duqo

arXiv:1710.06202 [pdf, other]

Deep Gaussian Covariance Network

Authors: Kevin Cremanns, Dirk Roos

Abstract: The correlation length scale and the noise variance are the most commonly used hyperparameters for Gaussian processes. Typically, stationary covariance functions are used, which depend only on the distances between input points and are thus invariant to translations in the input space. The optimization of the hyperparameters is commonly done by maximizing the log marginal likelihood. This works quite well if the distances are uniformly distributed. In the case of a locally adapted or even sparse input space, the prediction of a test point can be worse depending on its position. A possible solution is to use a non-stationary covariance function, where the hyperparameters are calculated by a deep neural network, so that the correlation length scales and possibly the noise variance depend on the test point. Furthermore, different types of covariance functions are trained simultaneously, so that the Gaussian process prediction is an additive overlay of different covariance matrices. The right combination of covariance functions and its hyperparameters are learned by the deep neural network. Additionally, the Gaussian process can be trained in batches or online, and so it can handle arbitrarily large data sets. We call this framework Deep Gaussian Covariance Network (DGCN). Further extensions to this framework are also possible, for example to sequentially dependent problems like time series or to local mixtures of experts. The basic framework and some extension possibilities are presented in this work. Moreover, a comparison to some recent state-of-the-art surrogate model methods is performed, including for a time-dependent problem.
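The translation invariance of stationary covariance functions mentioned in this abstract can be illustrated with a minimal sketch. The squared-exponential (RBF) kernel is used here only as a familiar stationary example; the names `rbf_kernel` and `shift`, the fixed `length_scale`, and the sample points are assumptions for illustration and are not taken from the paper, whose method instead predicts such hyperparameters with a neural network.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Squared-exponential kernel: depends only on the distance between x and y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def shift(x, t):
    """Translate point x by offset t (hypothetical helper for this sketch)."""
    return [a + b for a, b in zip(x, t)]


x = [0.0, 1.0]
y = [2.0, 3.0]
t = [5.0, -4.0]

# Shifting both inputs by the same offset leaves the kernel value unchanged,
# because only the difference x - y enters the formula.
same = math.isclose(rbf_kernel(x, y), rbf_kernel(shift(x, t), shift(y, t)))
print(same)
```

A non-stationary kernel would break this property, for example by letting `length_scale` vary with the input location.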

Submitted 27 October, 2017; v1 submitted 17 October, 2017; originally announced October 2017.

arXiv:1602.00569 [pdf, other]

Improving PIE's performance over high-delay paths

Authors: Nicolas Kuhn, David Ros

Abstract: Bufferbloat is excessive latency due to over-provisioned network buffers. PIE and CoDel are two recently proposed Active Queue Management (AQM) algorithms, designed to tackle bufferbloat by lowering the queuing delay without degrading the bottleneck utilization. PIE uses a proportional integral controller to maintain the average queuing delay at a desired level; however, large Round Trip Times (RTT) result in large spikes in queuing delays, which induce high dropping probability and low utilization. To deal with this problem, we propose Maximum and Average queuing Delay with PIE (MADPIE). Loosely based on the drop policy used by CoDel to keep queuing delay bounded, MADPIE is a simple extension to PIE that adds deterministic packet drops at controlled intervals. By means of simulations, we observe that our proposed change does not affect PIE's performance when RTT < 100 ms. The deterministic drops are more dominant when the RTT increases, which results in lower maximum queuing delays and better performance for VoIP traffic and small file downloads, with no major impact on bulk transfers.
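As a rough illustration of the proportional integral control idea that the abstract attributes to PIE, the sketch below adjusts a drop probability from the deviation of the queuing delay from a target and from the delay trend. It is a simplified sketch, not the MADPIE extension and not a complete PIE implementation; the class name `PIDropController` and the values chosen for `target_delay`, `alpha`, and `beta` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class PIDropController:
    """Simplified PI-style drop-probability update (illustrative only)."""
    target_delay: float = 0.015  # assumed target queuing delay in seconds
    alpha: float = 0.125         # assumed gain on deviation from the target
    beta: float = 1.25           # assumed gain on the change in delay
    drop_prob: float = 0.0
    prev_delay: float = 0.0

    def update(self, queue_delay: float) -> float:
        # One term reacts to how far the delay is from the target,
        # the other to whether the delay is rising or falling.
        delta = (self.alpha * (queue_delay - self.target_delay)
                 + self.beta * (queue_delay - self.prev_delay))
        # Keep the result a valid probability.
        self.drop_prob = min(1.0, max(0.0, self.drop_prob + delta))
        self.prev_delay = queue_delay
        return self.drop_prob


controller = PIDropController()
for delay in (0.015, 0.015, 0.115):
    print(f"delay={delay:.3f}s drop_prob={controller.update(delay):.5f}")
```

In this sketch a sudden jump in queuing delay raises the drop probability sharply through the trend term, which loosely mirrors the delay spikes and high dropping probability the abstract describes for large RTTs.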

Submitted 1 February, 2016; originally announced February 2016.

arXiv:1010.5128 [pdf, ps, other]

TCP over low-power and lossy networks: tuning the segment size to minimize energy consumption

Authors: Ahmed Ayadi, Patrick Maillé, David Ros

Abstract: Low-power and Lossy Networks (LLNs), like wireless networks based upon the IEEE 802.15.4 standard, have strong energy constraints, and are moreover subject to frequent transmission errors, not only due to congestion but also to collisions and to radio channel conditions. This paper introduces an analytical model to compute the total energy consumption in an LLN due to the TCP protocol. The model allows us to highlight some tradeoffs as regards the choice of the TCP maximum segment size, of the Forward Error Correction (FEC) redundancy ratio, and of the number of link-layer retransmissions, in order to minimize the total energy consumption.

Submitted 25 October, 2010; originally announced October 2010.

Comments: TELECOM Bretagne Research Report

arXiv:1004.0050 [pdf, other]

A Study of Bandwidth-Perception Management Mechanisms in IEEE 802.16 Networks

Authors: Andres Arcia-Moret, Yubo Yang, Nicolas Montavont, David Ros

Abstract: Bandwidth request-grant mechanisms are used in 802.16 networks to manage the uplink bandwidth needs of subscriber stations (SSs). Requests may be sent by SSs to the base station (BS) by means of several mechanisms defined in the standard. Based on the incoming requests, the BS (which handles most of the bandwidth scheduling in the system) schedules the transmission of uplink traffic, by assigning transmission opportunities to the SSs in an implementation-dependent manner. In this paper we present a study of some bandwidth allocation issues, arising from the management of the perception of subscriber stations' bandwidth needs at the base station. We illustrate how the bandwidth perception varies depending on the policy used to handle requests and grants. By means of ns-2 simulations, we evaluate the potential impact of such policies on the system's aggregate throughput when the traffic is composed of Best-Effort TCP flows.

Submitted 1 April, 2010; originally announced April 2010.

Comments: 10 pages, 20 figs.

ACM Class: C.2.1

Showing 1–10 of 10 results for author: Ros, D