-
Risk-Aware Safe Reinforcement Learning for Control of Stochastic Linear Systems
Authors:
Babak Esmaeili,
Nariman Niknejad,
Hamidreza Modares
Abstract:
This paper presents a risk-aware safe reinforcement learning (RL) control design for stochastic discrete-time linear systems. Rather than using a safety certifier to myopically intervene with the RL controller, a risk-informed safe controller is learned alongside the RL controller, and the two controllers are combined. This approach has several advantages: 1) high-confidence safety can be certified without relying on a high-fidelity system model, using only the limited data available; 2) myopic interventions and convergence to an undesired equilibrium can be avoided by deciding on the contribution of two stabilizing controllers; and 3) highly efficient and computationally tractable solutions can be provided by optimizing over a scalar decision variable and linear programming polyhedral sets. To learn safe controllers with a large invariant set, piecewise affine controllers are learned instead of linear controllers. To this end, the closed-loop system is first represented using collected data, a decision variable, and noise. The effect of the decision variable on the variance of the safety violation of the closed-loop system is formalized. The decision variable is then designed such that the probability of safety violation for the learned closed-loop system is minimized. It is shown that this control-oriented approach reduces the data requirements and can also reduce the variance of safety violations. Finally, to integrate the safe and RL controllers, a new data-driven interpolation technique is introduced. This method aims to maintain the RL agent's optimal implementation while ensuring its safety in noisy environments. The study concludes with a simulation example that validates the theoretical results.
Submitted 14 May, 2025;
originally announced May 2025.
-
Col-Con: A Virtual Reality Simulation Testbed for Exploring Collaborative Behaviors in Construction
Authors:
Liuchuan Yu,
Ching-Yu Cheng,
William F Ranc,
Joshua Dow,
Michael Szilagyi,
Haikun Huang,
Sungsoo Ray Hong,
Behzad Esmaeili,
Lap-Fai Yu
Abstract:
Virtual reality is widely adopted for applications such as training, education, and collaboration. The construction industry, known for its complex projects and numerous personnel involved, relies heavily on effective collaboration. Setting up a real-world construction site for experiments can be expensive and time-consuming, whereas conducting experiments in VR is relatively low-cost, scalable, and efficient. We propose Col-Con, a virtual reality simulation testbed for exploring collaborative behaviors in construction. Col-Con is a multi-user testbed that supports users in completing tasks collaboratively. Additionally, Col-Con provides immersive and realistic simulated construction scenes, where real-time voice communication, along with synchronized transformations, animations, sounds, and interactions, enhances the collaborative experience. As a showcase, we implemented a pipe installation construction task based on Col-Con. A user study demonstrated that Col-Con excels in usability, and participants reported a strong sense of immersion and collaboration. We envision that Col-Con will facilitate research on exploring virtual reality-based collaborative behaviors in construction.
Submitted 31 October, 2024;
originally announced October 2024.
-
Variational Stochastic Gradient Descent for Deep Neural Networks
Authors:
Haotian Chen,
Anna Kuzina,
Babak Esmaeili,
Jakub M Tomczak
Abstract:
Current state-of-the-art optimizers are adaptive gradient-based optimization methods such as Adam. Recently, there has been an increasing interest in formulating gradient-based optimizers in a probabilistic framework for better modeling the uncertainty of the gradients. Here, we propose to combine both approaches, resulting in the Variational Stochastic Gradient Descent (VSGD) optimizer. We model gradient updates as a probabilistic model and utilize stochastic variational inference (SVI) to derive an efficient and effective update rule. Further, we show how our VSGD method relates to other adaptive gradient-based optimizers like Adam. Lastly, we carry out experiments on two image classification datasets and four deep neural network architectures, where we show that VSGD outperforms Adam and SGD.
Submitted 18 April, 2025; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Topological Obstructions and How to Avoid Them
Authors:
Babak Esmaeili,
Robin Walters,
Heiko Zimmermann,
Jan-Willem van de Meent
Abstract:
Incorporating geometric inductive biases into models can aid interpretability and generalization, but encoding to a specific geometric structure can be challenging due to the imposed topological constraints. In this paper, we theoretically and empirically characterize obstructions to training encoders with geometric latent spaces. We show that local optima can arise due to singularities (e.g. self-intersection) or due to an incorrect degree or winding number. We then discuss how normalizing flows can potentially circumvent these obstructions by defining multimodal variational distributions. Inspired by this observation, we propose a new flow-based model that maps data points to multimodal distributions over geometric spaces and empirically evaluate our model on 2 domains. We observe improved stability during training and a higher chance of converging to a homeomorphic encoder.
Submitted 12 December, 2023;
originally announced December 2023.
-
Using Unmanned Aerial Systems (UAS) for Assessing and Monitoring Fall Hazard Prevention Systems in High-rise Building Projects
Authors:
Yimeng Li,
Behzad Esmaeili,
Masoud Gheisari,
Jana Kosecka,
Abbas Rashidi
Abstract:
This study develops a framework for unmanned aerial systems (UASs) to monitor fall hazard prevention systems near unprotected edges and openings in high-rise building projects. A three-step machine-learning-based framework was developed and tested to detect guardrail posts from the images captured by UAS. First, a guardrail detector was trained to localize the candidate locations of posts supporting the guardrail. Since the images used in this process were collected from an actual job site, several false detections were identified. Therefore, additional constraints were introduced in the following steps to filter out false detections. Second, the research team applied a horizontal line detector to the image to properly detect floors and remove the detections that were not close to the floors. Finally, since guardrail posts are installed with approximately normally distributed spacing, the space between them was estimated and used to find the most likely distance between two posts. The research team used various combinations of the developed approaches to monitor guardrail systems in the captured images from a high-rise building project. Comparing the precision and recall metrics indicated that the cascade classifier achieves better performance with floor detection and guardrail spacing estimation. The research outcomes illustrate that the proposed guardrail recognition system can improve the assessment of guardrails and facilitate the safety engineer's task of identifying fall hazards in high-rise building projects.
Submitted 26 September, 2022;
originally announced September 2022.
-
Conjugate Energy-Based Models
Authors:
Hao Wu,
Babak Esmaeili,
Michael Wick,
Jean-Baptiste Tristan,
Jan-Willem van de Meent
Abstract:
In this paper, we propose conjugate energy-based models (CEBMs), a new class of energy-based models that define a joint density over data and latent variables. The joint density of a CEBM decomposes into an intractable distribution over data and a tractable posterior over latent variables. CEBMs have similar use cases as variational autoencoders, in the sense that they learn an unsupervised mapping from data to latent variables. However, these models omit a generator network, which allows them to learn more flexible notions of similarity between data points. Our experiments demonstrate that conjugate EBMs achieve competitive results in terms of image modelling, predictive power of latent space, and out-of-domain detection on a variety of datasets.
Submitted 25 June, 2021;
originally announced June 2021.
-
Nested Variational Inference
Authors:
Heiko Zimmermann,
Hao Wu,
Babak Esmaeili,
Jan-Willem van de Meent
Abstract:
We develop nested variational inference (NVI), a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. NVI is applicable to many commonly used importance sampling strategies and provides a mechanism for learning intermediate densities, which can serve as heuristics to guide the sampler. Our experiments apply NVI to (a) sample from a multimodal distribution using a learned annealing path, (b) learn heuristics that approximate the likelihood of future observations in a hidden Markov model, and (c) perform amortized inference in hierarchical deep generative models. We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
Submitted 21 June, 2021;
originally announced June 2021.
-
Active-Learning in the Online Environment
Authors:
Zahra Derakhshandeh,
Babak Esmaeili
Abstract:
Online learning is convenient for many learners; it lets them learn without having to attend a particular classroom at a specific time. While this opportunity can help users manage their lives better, many students may feel isolated or disconnected from the community of the instructor and the learners. Lack of interaction among students and the instructor may negatively impact their learning and cause adverse emotions like anxiety, sadness, and depression. Apart from the feeling of loneliness, students may sometimes run into issues or questions as they study the course, which can stop them from progressing confidently or leave them discouraged if they are left alone. To promote interaction and to overcome the limitations of geographic distance in online education, we propose a customized design, Tele-instruction, with useful features that supplement traditional online learning systems and enable peers and the instructor of the course to interact at their convenience when needed. The designed system can help students address their questions through the answers already provided to other students or ask for the instructor's point of view through two-way communication, similar to face-to-face forms of educational experiences. We believe our approach can help fill the gaps where online learning falls behind traditional classroom-based learning systems.
Submitted 2 April, 2020;
originally announced April 2020.
-
Rate-Regularization and Generalization in VAEs
Authors:
Alican Bozkurt,
Babak Esmaeili,
Jean-Baptiste Tristan,
Dana H. Brooks,
Jennifer G. Dy,
Jan-Willem van de Meent
Abstract:
Variational autoencoders optimize an objective that combines a reconstruction loss (the distortion) and a KL term (the rate). The rate is an upper bound on the mutual information, which is often interpreted as a regularizer that controls the degree of compression. We here examine whether inclusion of the rate also acts as an inductive bias that improves generalization. We perform rate-distortion analyses that control the strength of the rate term, the network capacity, and the difficulty of the generalization problem. Decreasing the strength of the rate paradoxically improves generalization in most settings, and reducing the mutual information typically leads to underfitting. Moreover, we show that generalization continues to improve even after the mutual information saturates, indicating that the gap on the bound (i.e. the KL divergence relative to the inference marginal) affects generalization. This suggests that the standard Gaussian prior is not an inductive bias that typically aids generalization, prompting work to understand what choices of priors improve generalization in VAEs.
Submitted 25 March, 2021; v1 submitted 11 November, 2019;
originally announced November 2019.
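For readers unfamiliar with the terminology in the abstract above, the rate bound it mentions can be written out as follows. This is a sketch of the standard decomposition, not a formula quoted from the paper; the symbols $\beta$ (strength of the rate term), $D$, $R$, $q(z \mid x)$, $q(z)$, and $p(z)$ are notation assumed here for illustration.

```latex
% Objective: distortion plus a weighted rate (beta is an assumed name for the weight).
\mathcal{L}_\beta = D + \beta R,
\qquad
D = -\,\mathbb{E}_{p_{\text{data}}(x)}\,\mathbb{E}_{q(z \mid x)}\bigl[\log p(x \mid z)\bigr],
\qquad
R = \mathbb{E}_{p_{\text{data}}(x)}\bigl[\mathrm{KL}\bigl(q(z \mid x)\,\|\,p(z)\bigr)\bigr].

% With the inference marginal q(z) = E_{p_data(x)}[q(z | x)], the rate splits into
% the mutual information plus a non-negative gap, so R upper-bounds I_q(x; z).
R = I_q(x; z) + \mathrm{KL}\bigl(q(z)\,\|\,p(z)\bigr) \;\ge\; I_q(x; z).
```

The second term, $\mathrm{KL}(q(z)\,\|\,p(z))$, is the "gap on the bound" the abstract refers to: it depends on the choice of prior but not on how much information the code carries about the data.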
-
Can VAEs Generate Novel Examples?
Authors:
Alican Bozkurt,
Babak Esmaeili,
Dana H. Brooks,
Jennifer G. Dy,
Jan-Willem van de Meent
Abstract:
An implicit goal in works on deep generative models is that such models should be able to generate novel examples that were not previously seen in the training data. In this paper, we investigate to what extent this property holds for widely employed variational autoencoder (VAE) architectures. VAEs maximize a lower bound on the log marginal likelihood, which implies that they will in principle overfit the training data when provided with a sufficiently expressive decoder. In the limit of an infinite capacity decoder, the optimal generative model is a uniform mixture over the training data. More generally, an optimal decoder should output a weighted average over the examples in the training data, where the magnitude of the weights is determined by the proximity in the latent space. This leads to the hypothesis that, for a sufficiently high capacity encoder and decoder, the VAE decoder will perform nearest-neighbor matching according to the coordinates in the latent space. To test this hypothesis, we investigate generalization on the MNIST dataset. We consider both generalization to new examples of previously seen classes, and generalization to the classes that were withheld from the training set. In both cases, we find that reconstructions are closely approximated by nearest neighbors for higher-dimensional parameterizations. When generalizing to unseen classes, however, lower-dimensional parameterizations offer a clear advantage.
Submitted 22 December, 2018;
originally announced December 2018.
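The "weighted average over the examples in the training data" described in the abstract above can be illustrated with a small sketch. It is not the authors' code: it assumes a Gaussian likelihood with fixed variance and a diagonal Gaussian encoder, in which case the weights are proportional to the encoder density q(z | x_n). The function name `weighted_average_decoder` and its arguments are hypothetical.

```python
import numpy as np

def weighted_average_decoder(z, enc_means, enc_stds, train_x):
    """Decode z as a weighted average of training examples.

    Assumption (not from the source): weights are proportional to the
    diagonal Gaussian encoder density q(z | x_n) evaluated at z.
    Shapes: z (d,), enc_means (N, d), enc_stds (N, d), train_x (N, D).
    """
    # Log-density of z under each encoder distribution, up to a shared constant.
    log_w = -0.5 * np.sum(((z - enc_means) / enc_stds) ** 2
                          + 2.0 * np.log(enc_stds), axis=1)
    # Normalize in log space for numerical stability.
    log_w -= log_w.max()
    w = np.exp(log_w)
    w /= w.sum()
    # Weighted average over training examples.
    return w @ train_x, w

# Two training points whose encodings are far apart in latent space.
enc_means = np.array([[0.0, 0.0], [5.0, 5.0]])
enc_stds = np.full((2, 2), 0.1)
train_x = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

# At a code near the first encoding, the output collapses to that example.
near_recon, near_w = weighted_average_decoder(np.array([0.0, 0.0]),
                                              enc_means, enc_stds, train_x)
# At the midpoint, both examples contribute equally.
mid_recon, mid_w = weighted_average_decoder(np.array([2.5, 2.5]),
                                            enc_means, enc_stds, train_x)
```

When the encoder distributions are narrow relative to the distance between codes, almost all of the weight falls on the nearest training point, which is the nearest-neighbor behavior the abstract hypothesizes.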
-
Structured Neural Topic Models for Reviews
Authors:
Babak Esmaeili,
Hongyi Huang,
Byron C. Wallace,
Jan-Willem van de Meent
Abstract:
We present Variational Aspect-based Latent Topic Allocation (VALTA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects. Experimental evaluation on a large number of datasets demonstrates that aspects are interpretable, yield higher coherence scores than non-structured autoencoding topic model variants, and can be utilized to perform aspect-based comparison and genre discovery.
Submitted 1 January, 2019; v1 submitted 12 December, 2018;
originally announced December 2018.
-
Structured Disentangled Representations
Authors:
Babak Esmaeili,
Hao Wu,
Sarthak Jain,
Alican Bozkurt,
N. Siddharth,
Brooks Paige,
Dana H. Brooks,
Jennifer Dy,
Jan-Willem van de Meent
Abstract:
Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks. We derive this objective as a generalization of the evidence lower bound, which allows us to explicitly represent the trade-offs between mutual information between data and representation, KL divergence between representation and prior, and coverage of the support of the empirical data distribution. Experiments on a variety of datasets demonstrate that our objective can not only disentangle discrete variables, but that doing so also improves disentanglement of other variables and, importantly, generalization even to unseen combinations of factors.
Submitted 12 December, 2018; v1 submitted 5 April, 2018;
originally announced April 2018.