Search | arXiv e-print repository

arXiv:2409.14473 [pdf, other]

A Large Language Model and Denoising Diffusion Framework for Targeted Design of Microstructures with Commands in Natural Language

Authors: Nikita Kartashov, Nikolaos N. Vlassis

Abstract: Microstructure plays a critical role in determining the macroscopic properties of materials, with applications spanning alloy design, MEMS devices, and tissue engineering, among many others. Computational frameworks have been developed to capture the complex relationship between microstructure and material behavior. However, despite these advancements, the steep learning curve associated with domain-specific knowledge and complex algorithms restricts the broader application of these tools. To lower this barrier, we propose a framework that integrates Natural Language Processing (NLP), Large Language Models (LLMs), and Denoising Diffusion Probabilistic Models (DDPMs) to enable microstructure design using intuitive natural language commands. Our framework employs contextual data augmentation, driven by a pretrained LLM, to generate and expand a diverse dataset of microstructure descriptors. A retrained NER model extracts relevant microstructure descriptors from user-provided natural language inputs, which are then used by the DDPM to generate microstructures with targeted mechanical properties and topological features. The NLP and DDPM components of the framework are modular, allowing for separate training and validation, which ensures flexibility in adapting the framework to different datasets and use cases. A surrogate model system is employed to rank and filter generated samples based on their alignment with target properties.
Demonstrated on a database of nonlinear hyperelastic microstructures, this framework serves as a prototype for accessible inverse design of microstructures, starting from intuitive natural language commands.

Submitted 22 September, 2024; originally announced September 2024.

Comments: 29 pages, 15 figures

arXiv:2405.03658 [pdf, other]

A review on data-driven constitutive laws for solids

Authors: Jan Niklas Fuhg, Govinda Anantha Padmanabha, Nikolaos Bouklas, Bahador Bahmani, WaiChing Sun, Nikolaos N. Vlassis, Moritz Flaschel, Pietro Carrara, Laura De Lorenzis

Abstract: This review article highlights state-of-the-art data-driven techniques to discover, encode, surrogate, or emulate constitutive laws that describe the path-independent and path-dependent response of solids. Our objective is to provide an organized taxonomy to a large spectrum of methodologies developed in the past decades and to discuss the benefits and drawbacks of the various techniques for interpreting and forecasting mechanics behavior across different scales. Distinguishing between machine-learning-based and model-free methods, we further categorize approaches based on their interpretability and on their learning process/type of required data, while discussing the key problems of generalization and trustworthiness. We attempt to provide a road map of how these can be reconciled in a data-availability-aware context. We also touch upon relevant aspects such as data sampling techniques, design of experiments, verification, and validation.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 57 pages, 7 Figures

MSC Class: 74-02 (Primary)

arXiv:2308.14165 [pdf, other]

Distributional Off-Policy Evaluation for Slate Recommendations

Authors: Shreyas Chaudhari, David Arbour, Georgios Theocharous, Nikos Vlassis

Abstract: Recommendation strategies are typically evaluated by using previously logged data, employing off-policy evaluation methods to estimate their expected performance. However, for strategies that present users with slates of multiple items, the resulting combinatorial action space renders many of these methods impractical. Prior work has developed estimators that leverage the structure in slates to estimate the expected off-policy performance, but the estimation of the entire performance distribution remains elusive. Estimating the complete distribution allows for a more comprehensive evaluation of recommendation strategies, particularly along the axes of risk and fairness that employ metrics computable from the distribution. In this paper, we propose an estimator for the complete off-policy performance distribution for slates and establish conditions under which the estimator is unbiased and consistent. This builds upon prior work on off-policy evaluation for slates and off-policy distribution estimation in reinforcement learning. We validate the efficacy of our method empirically on synthetic data as well as on a slate recommendation simulator constructed from real-world data (MovieLens-20M). Our results show a significant reduction in estimation variance and improved sample efficiency over prior work across a range of slate structures.

Submitted 27 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

Comments: Accepted in The 38th Annual AAAI Conference on Artificial Intelligence (AAAI-24)

arXiv:2306.04411 [pdf, other]

Synthesizing realistic sand assemblies with denoising diffusion in latent space

Authors: Nikolaos N. Vlassis, WaiChing Sun, Khalid A. Alshibli, Richard A. Regueiro

Abstract: The shapes and morphological features of grains in sand assemblies have far-reaching implications in many engineering applications, such as geotechnical engineering, computer animations, petroleum engineering, and concentrated solar power. Yet, our understanding of the influence of grain geometries on macroscopic response is often only qualitative, due to the limited availability of high-quality 3D grain geometry data. In this paper, we introduce a denoising diffusion algorithm that uses a set of point clouds collected from the surface of individual sand grains to generate grains in the latent space. By employing a point cloud autoencoder, the three-dimensional point cloud structures of sand grains are first encoded into a lower-dimensional latent space. A generative denoising diffusion probabilistic model is trained to produce synthetic sand that maximizes the log-likelihood of the generated samples belonging to the original data distribution measured by a Kullback-Leibler divergence. Numerical experiments suggest that the proposed method is capable of generating realistic grains with morphology, shapes and sizes consistent with the training data inferred from an F50 sand database. We then use a rigid contact dynamic simulator to pour the synthetic sand in a confined volume to form granular assemblies in a static equilibrium state with targeted distribution properties.
To ensure third-party validation, 50,000 synthetic sand grains and the 1,542 real synchrotron microcomputed tomography (SMT) scans of the F50 sand, as well as the granular assemblies composed of synthetic sand grains are made available in an open-source repository.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 28 pages, 13 figures, 4 tables

arXiv:2302.12881 [pdf, other]

doi 10.1016/j.cma.2023.116126

Denoising diffusion algorithm for inverse design of microstructures with fine-tuned nonlinear material properties

Authors: Nikolaos N. Vlassis, WaiChing Sun

Abstract: In this paper, we introduce a denoising diffusion algorithm to discover microstructures with nonlinear fine-tuned properties. Denoising diffusion probabilistic models are generative models that use diffusion-based dynamics to gradually denoise images and generate realistic synthetic samples. By learning the reverse of a Markov diffusion process, we design an artificial intelligence to efficiently manipulate the topology of microstructures to generate a massive number of prototypes that exhibit constitutive responses sufficiently close to designated nonlinear constitutive responses. To identify the subset of microstructures with sufficiently precise fine-tuned properties, a convolutional neural network surrogate is trained to replace high-fidelity finite element simulations to filter out prototypes outside the admissible range. The results of this study indicate that the denoising diffusion process is capable of creating microstructures of fine-tuned nonlinear material properties within the latent space of the training data. More importantly, the resulting algorithm can be easily extended to incorporate additional topological and geometric modifications by introducing high-dimensional structures embedded in the latent space. The algorithm is tested on the open-source mechanical MNIST data set.
Consequently, this algorithm is not only capable of performing inverse design of nonlinear effective media but also learns the nonlinear structure-property map to quantitatively understand the multiscale interplay among the geometry and topology and their effective macroscopic properties.

Submitted 24 February, 2023; originally announced February 2023.

Comments: 21 pages, 11 figures
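
As a rough illustration of the forward (noising) half of a denoising diffusion probabilistic model, the sketch below draws a noisy sample x_t directly from a clean sample x_0 using the standard closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, where abar_t is the cumulative product of (1 - beta_s). The linear beta schedule, the step count, and the function names are assumptions chosen for illustration; they are not taken from the paper, whose model additionally learns the reverse process and targets prescribed material properties.

```python
import math
import random


def alpha_bar_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s).

    The linear beta schedule and its endpoints are illustrative assumptions.
    """
    betas = [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]
    alpha_bar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) in one step."""
    a = alpha_bar[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    schedule = alpha_bar_schedule()
    x0 = [1.0, -1.0, 0.5]           # stand-in for flattened pixel values
    print(q_sample(x0, 10, schedule, rng))    # early step: close to x0
    print(q_sample(x0, 999, schedule, rng))   # final step: close to pure noise
```

A denoising network would be trained to predict the injected noise from x_t and t; that part is omitted here.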

arXiv:2212.11431 [pdf, other]

Local Policy Improvement for Recommender Systems

Authors: Dawen Liang, Nikos Vlassis

Abstract: Recommender systems predict what items a user will interact with next, based on their past interactions. The problem is often approached through supervised learning, but recent advancements have shifted towards policy optimization of rewards (e.g., user engagement). One challenge with the latter is policy mismatch: we are only able to train a new policy given data collected from a previously-deployed policy. The conventional way to address this problem is through importance sampling correction, but this comes with practical limitations. We suggest an alternative approach of local policy improvement without off-policy correction. Our method computes and optimizes a lower bound of expected reward of the target policy, which is easy to estimate from data and does not involve density ratios (such as those appearing in importance sampling correction). This local policy improvement paradigm is ideal for recommender systems, as previous policies are typically of decent quality and policies are updated frequently. We provide empirical evidence and practical recipes for applying our technique in a sequential recommendation setting.

Submitted 26 April, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

arXiv:2209.13126 [pdf, other]

Design of experiments for the calibration of history-dependent models via deep reinforcement learning and an enhanced Kalman filter

Authors: Ruben Villarreal, Nikolaos N. Vlassis, Nhon N. Phan, Tommie A. Catanach, Reese E. Jones, Nathaniel A. Trask, Sharlotte L. B. Kramer, WaiChing Sun

Abstract: Experimental data is costly to obtain, which makes it difficult to calibrate complex models. For many models an experimental design that produces the best calibration given a limited experimental budget is not obvious. This paper introduces a deep reinforcement learning (RL) algorithm for design of experiments that maximizes the information gain measured by Kullback-Leibler (KL) divergence obtained via the Kalman filter (KF). This combination enables experimental design for rapid online experiments where traditional methods are too costly. We formulate possible configurations of experiments as a decision tree and a Markov decision process (MDP), where a finite choice of actions is available at each incremental step. Once an action is taken, a variety of measurements are used to update the state of the experiment. This new data leads to a Bayesian update of the parameters by the KF, which is used to enhance the state representation. In contrast to the Nash-Sutcliffe efficiency (NSE) index, which requires additional sampling to test hypotheses for forward predictions, the KF can lower the cost of experiments by directly estimating the values of new data acquired through additional actions. In this work our applications focus on mechanical testing of materials. Numerical experiments with complex, history-dependent models are used to verify the implementation and benchmark the performance of the RL-designed experiments.

Submitted 26 September, 2022; originally announced September 2022.

Comments: 40 pages, 20 figures
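
To make the idea of "information gain measured by KL divergence obtained via the Kalman filter" concrete, here is a minimal scalar sketch: a Kalman measurement update turns a Gaussian prior on one parameter into a Gaussian posterior, and the KL divergence between the two scores how informative the measurement was. The scalar setting, the direction KL(posterior || prior), and all function names are assumptions made for illustration; the paper works with history-dependent models and an enhanced filter that this sketch does not reproduce.

```python
import math


def kalman_update(mean, var, y, h, noise_var):
    """Scalar Kalman measurement update for the model y = h * x + noise."""
    gain = var * h / (h * h * var + noise_var)
    new_mean = mean + gain * (y - h * mean)
    new_var = (1.0 - gain * h) * var
    return new_mean, new_var


def gaussian_kl(mean_q, var_q, mean_p, var_p):
    """KL(q || p) between two scalar Gaussian distributions."""
    return 0.5 * (
        var_q / var_p
        + (mean_q - mean_p) ** 2 / var_p
        - 1.0
        + math.log(var_p / var_q)
    )


def information_gain(mean, var, y, h, noise_var):
    """KL divergence from the prior to the Kalman posterior (assumed reward)."""
    post_mean, post_var = kalman_update(mean, var, y, h, noise_var)
    return gaussian_kl(post_mean, post_var, mean, var)


if __name__ == "__main__":
    # Two hypothetical experiments observing the same parameter:
    # the less noisy one yields the larger information gain.
    print(information_gain(0.0, 1.0, y=0.3, h=1.0, noise_var=1.0))
    print(information_gain(0.0, 1.0, y=0.3, h=1.0, noise_var=0.1))
```

In an RL setting, a quantity of this kind could serve as the per-step reward for choosing the next experimental action.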

arXiv:2208.00246 [pdf, other]

doi 10.1016/j.cma.2022.115768

Geometric deep learning for computational mechanics Part II: Graph embedding for interpretable multiscale plasticity

Authors: Nikolaos N. Vlassis, WaiChing Sun

Abstract: The history-dependent behaviors of classical plasticity models are often driven by internal variables evolved according to phenomenological laws. The difficulty to interpret how these internal variables represent a history of deformation, the lack of direct measurement of these internal variables for calibration and validation, and the weak physical underpinning of those phenomenological laws have long been criticized as barriers to creating realistic models. In this work, geometric machine learning on graph data (e.g. finite element solutions) is used as a means to establish a connection between nonlinear dimensional reduction techniques and plasticity models. Geometric learning-based encoding on graphs allows the embedding of rich time-history data onto a low-dimensional Euclidean space such that the evolution of plastic deformation can be predicted in the embedded feature space. A corresponding decoder can then convert these low-dimensional internal variables back into a weighted graph such that the dominating topological features of plastic deformation can be observed and analyzed.

Submitted 30 July, 2022; originally announced August 2022.

Comments: 34 pages, 23 figures

arXiv:2112.02077 [pdf, other]

MD-inferred neural network monoclinic finite-strain hyperelasticity models for $β$-HMX: Sobolev training and validation against physical constraints

Authors: Nikolaos N. Vlassis, Puhan Zhao, Ran Ma, Tommy Sewell, WaiChing Sun

Abstract: We present a machine learning framework to train and validate neural networks to predict the anisotropic elastic response of the monoclinic organic molecular crystal $β$-HMX in the geometrically nonlinear regime. A filtered molecular dynamics (MD) simulation database is used to train the neural networks with a Sobolev norm that uses the stress measure and a reference configuration to deduce the elastic stored energy functional. To improve the accuracy of the elasticity tangent predictions originating from the learned stored energy, a transfer learning technique is used to introduce additional tangential constraints from the data while necessary conditions (e.g. strong ellipticity, crystallographic symmetry) for the correctness of the model are either introduced as additional physical constraints or incorporated in the validation tests. Assessment of the neural networks is based on (1) the accuracy with which they reproduce the bottom-line constitutive responses predicted by MD, (2) detailed examination of their stability and uniqueness, and (3) admissibility of the predicted responses with respect to continuum mechanics theory in the finite-deformation regime. We compare the neural networks' training efficiency under different Sobolev constraints and assess the models' accuracy and robustness against MD benchmarks for $β$-HMX.

Submitted 29 November, 2021; originally announced December 2021.

Comments: 29 pages, 17 figures

arXiv:2106.07914 [pdf, other]

Control Variates for Slate Off-Policy Evaluation

Authors: Nikos Vlassis, Ashok Chandrashekar, Fernando Amat Gil, Nathan Kallus

Abstract: We study the problem of off-policy evaluation from batched contextual bandit data with multidimensional actions, often termed slates. The problem is common to recommender systems and user-interface optimization, and it is particularly challenging because of the combinatorially-sized action space. Swaminathan et al. (2017) have proposed the pseudoinverse (PI) estimator under the assumption that the conditional mean rewards are additive in actions. Using control variates, we consider a large class of unbiased estimators that includes as specific cases the PI estimator and (asymptotically) its self-normalized variant. By optimizing over this class, we obtain new estimators with risk improvement guarantees over both the PI and the self-normalized PI estimators. Experiments with real-world recommender data as well as synthetic data validate these improvements in practice.

Submitted 2 November, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Journal ref: NeurIPS 2021
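
For readers unfamiliar with the pseudoinverse (PI) estimator referenced in the abstract above, the following sketch shows its commonly quoted form when the logging policy factorizes over the K slots of the slate: each logged reward is weighted by 1 - K + sum_k pi(a_k | x) / mu(a_k | x). The simulator (additive slot rewards, uniform logging policy, deterministic target slate, context-free setting) and all names are hypothetical and exist only to exercise the formula; the control-variate estimators proposed in the paper are not implemented here.

```python
import random


def pi_estimate(logs, num_slots):
    """Pseudoinverse (PI) value estimate for a factored logging policy.

    Each log entry is (reward, ratios), where ratios[k] is
    pi(a_k | x) / mu(a_k | x) for the action logged in slot k.
    """
    total = 0.0
    for reward, ratios in logs:
        weight = 1.0 - num_slots + sum(ratios)
        total += reward * weight
    return total / len(logs)


def simulate(n, num_slots=3, actions_per_slot=4, seed=0):
    """Hypothetical slate bandit with rewards that are additive over slots."""
    rng = random.Random(seed)
    slot_value = [[rng.random() for _ in range(actions_per_slot)]
                  for _ in range(num_slots)]
    # Deterministic target policy: one fixed action per slot.
    target = [rng.randrange(actions_per_slot) for _ in range(num_slots)]
    true_value = sum(slot_value[k][target[k]] for k in range(num_slots))
    logs = []
    for _ in range(n):
        # Uniform logging policy, sampled independently per slot.
        slate = [rng.randrange(actions_per_slot) for _ in range(num_slots)]
        reward = sum(slot_value[k][slate[k]] for k in range(num_slots))
        reward += rng.gauss(0.0, 0.1)
        # mu = 1 / actions_per_slot; pi is 1 on the target action, else 0.
        ratios = [float(actions_per_slot) if slate[k] == target[k] else 0.0
                  for k in range(num_slots)]
        logs.append((reward, ratios))
    return logs, true_value


if __name__ == "__main__":
    logs, true_value = simulate(50000)
    print("PI estimate:", pi_estimate(logs, 3), "true value:", true_value)
```

Under the additivity assumption the PI estimate is unbiased, so on this synthetic problem it should approach the true value as the number of logged slates grows.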

arXiv:2105.09980 [pdf, other]

Data-driven discovery of interpretable causal relations for deep learning material laws with uncertainty propagation

Authors: Xiao Sun, Bahador Bahmani, Nikolaos N. Vlassis, WaiChing Sun, Yanxun Xu

Abstract: This paper presents a computational framework that generates ensemble predictive mechanics models with uncertainty quantification (UQ). We first develop a causal discovery algorithm to infer causal relations among time-history data measured during each representative volume element (RVE) simulation through a directed acyclic graph (DAG). With multiple plausible sets of causal relationships estimated from multiple RVE simulations, the predictions are propagated in the derived causal graph while using a deep neural network equipped with dropout layers as a Bayesian approximation for uncertainty quantification. We select two representative numerical examples (traction-separation laws for frictional interfaces, elastoplasticity models for granular assemblies) to examine the accuracy and robustness of the proposed causal discovery method for the common material law predictions in civil engineering applications.

Submitted 20 May, 2021; originally announced May 2021.

Comments: 43 pages, 27 figures

arXiv:2104.05608 [pdf, other]

doi 10.1615/IntJMultCompEng.2022042266

Equivariant geometric learning for digital rock physics: estimating formation factor and effective permeability tensors from Morse graph

Authors: Chen Cai, Nikolaos Vlassis, Lucas Magee, Ran Ma, Zeyu Xiong, Bahador Bahmani, Teng-Fong Wong, Yusu Wang, WaiChing Sun

Abstract: We present an SE(3)-equivariant graph neural network (GNN) approach that directly predicts the formation factor and effective permeability from micro-CT images. FFT solvers are established to compute both the formation factor and effective permeability, while the topology and geometry of the pore space are represented by a persistence-based Morse graph. Together, they constitute the database for training, validating, and testing the neural networks. While the graph and Euclidean convolutional approaches both employ neural networks to generate low-dimensional latent space to represent the features of the micro-structures for forward predictions, the SE(3) equivariant neural network is found to generate more accurate predictions, especially when the training data is limited. Numerical experiments have also shown that the new SE(3) approach leads to predictions that fulfill material frame indifference, whereas the predictions from classical convolutional neural networks (CNN) may suffer from spurious dependence on the coordinate system of the training data. Comparisons among predictions inferred from training the CNN and those from graph convolutional neural networks (GNN) with and without the equivariant constraint indicate that the equivariant graph neural network seems to perform better than the CNN and GNN without enforcing equivariant constraints.

Submitted 12 October, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

arXiv:2101.02553 [pdf, other]

Off-Policy Evaluation of Slate Policies under Bayes Risk

Authors: Nikos Vlassis, Fernando Amat Gil, Ashok Chandrashekar

Abstract: We study the problem of off-policy evaluation for slate bandits, for the typical case in which the logging policy factorizes over the slots of the slate. We slightly depart from the existing literature by taking Bayes risk as the criterion by which to evaluate estimators, and we analyze the family of 'additive' estimators that includes the pseudoinverse (PI) estimator of Swaminathan et al. (2017; arXiv:1605.04812). Using a control variate approach, we identify a new estimator in this family that is guaranteed to have lower risk than PI in the above class of problems. In particular, we show that the risk improvement over PI grows linearly with the number of slots, and linearly with the gap between the arithmetic and the harmonic mean of a set of slot-level divergences between the logging and the target policy. In the typical case of a uniform logging policy and a deterministic target policy, each divergence corresponds to slot size, showing that maximal gains can be obtained for slate problems with diverse numbers of actions per slot.

Submitted 5 January, 2021; originally announced January 2021.

arXiv:2010.11265 [pdf, other]

Sobolev training of thermodynamic-informed neural networks for smoothed elasto-plasticity models with level set hardening

Authors: Nikolaos N. Vlassis, WaiChing Sun

Abstract: We introduce a deep learning framework designed to train smoothed elastoplasticity models with interpretable components, such as a smoothed stored elastic energy function, a yield surface, and a plastic flow that are evolved based on a set of deep neural network predictions. By recasting the yield function as an evolving level set, we introduce a machine learning approach to predict the solutions of the Hamilton-Jacobi equation that governs the hardening mechanism. This machine learning hardening law may recover classical hardening models and discover new mechanisms that are otherwise very difficult to anticipate and hand-craft. This treatment enables us to use supervised machine learning to generate models that are thermodynamically consistent, interpretable, but also exhibit excellent learning capacity. Using a 3D FFT solver to create a polycrystal database, numerical experiments are conducted and the implementations of each component of the models are individually verified. Our numerical experiments reveal that this new approach provides more robust and accurate forward predictions of cyclic stress paths than those obtained from black-box deep neural network models such as a recurrent GRU neural network, a 1D convolutional neural network, and a multi-step feedforward model.

Submitted 15 October, 2020; originally announced October 2020.

Comments: 42 pages, 28 figures

arXiv:2001.04292 [pdf, other]

doi 10.1016/j.cma.2020.113299

Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity

Authors: Nikolaos Vlassis, Ran Ma, WaiChing Sun

Abstract: This paper is the first attempt to use geometric deep learning and Sobolev training to incorporate non-Euclidean microstructural data such that anisotropic hyperelastic material machine learning models can be trained in the finite deformation range. While traditional hyperelasticity models often incorporate homogenized measures of microstructural attributes, such as porosity or the averaged orientation of constituents, these measures cannot reflect the topological structures of the attributes. We fill this knowledge gap by introducing the concept of the weighted graph as a new means to store topological information, such as the connectivity of anisotropic grains in assemblies. Then, by leveraging a graph convolutional deep neural network architecture in the spectral domain, we introduce a mechanism to incorporate these non-Euclidean weighted graph data directly as input for training and for predicting the elastic responses of materials with complex microstructures. To ensure smoothness and prevent non-convexity of the trained stored energy functional, we introduce a Sobolev training technique for neural networks such that the stress measure is obtained implicitly from taking directional derivatives of the trained energy functional. By optimizing the neural network to approximate both the energy functional output and the stress measure, we introduce a training procedure that improves efficiency and generalizes the learned energy functional for different microstructures.
The trained hybrid neural network model is then used to generate new stored energy functionals for unseen microstructures in a parametric study to predict the influence of elastic anisotropy on the nucleation and propagation of fracture in the brittle regime.

Submitted 7 January, 2020; originally announced January 2020.

MSC Class: 74B20; 70-08; 05C62; 05C85; 68Q32 ACM Class: E.1.3
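
The Sobolev training idea described in the abstract above, fitting both the energy and its derivative (the stress), can be sketched with a one-dimensional toy loss. Everything here is an illustrative assumption: the scalar strain, the hypothetical `sobolev_loss` name, the weighting factor, and the use of central finite differences in place of the automatic differentiation a neural network framework would normally provide.

```python
def sobolev_loss(energy_fn, strains, energies, stresses, weight=1.0, h=1e-5):
    """Mean squared error on the energy plus a weighted error on its derivative.

    energy_fn : candidate stored-energy function psi(strain)
    strains   : sampled scalar strains
    energies  : reference energies psi at those strains
    stresses  : reference stresses d(psi)/d(strain) at those strains
    """
    total = 0.0
    for eps, psi_ref, sigma_ref in zip(strains, energies, stresses):
        psi_pred = energy_fn(eps)
        # Central finite difference stands in for automatic differentiation.
        sigma_pred = (energy_fn(eps + h) - energy_fn(eps - h)) / (2.0 * h)
        total += (psi_pred - psi_ref) ** 2 + weight * (sigma_pred - sigma_ref) ** 2
    return total / len(strains)


if __name__ == "__main__":
    modulus = 200.0  # hypothetical stiffness of a linear-elastic reference
    strains = [0.0, 0.05, 0.1]
    energies = [0.5 * modulus * e * e for e in strains]
    stresses = [modulus * e for e in strains]

    exact = lambda e: 0.5 * modulus * e * e
    too_soft = lambda e: 0.5 * 150.0 * e * e
    print(sobolev_loss(exact, strains, energies, stresses))
    print(sobolev_loss(too_soft, strains, energies, stresses))
```

Because the stress term penalizes errors in the slope of the energy, a candidate that matches energy values only approximately but has the wrong stiffness is still penalized strongly.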

arXiv:1912.06292 [pdf, other]

More Efficient Off-Policy Evaluation through Regularized Targeted Learning

Authors: Aurélien F. Bibaut, Ivana Malenica, Nikos Vlassis, Mark J. van der Laan

Abstract: We study the problem of off-policy evaluation (OPE) in Reinforcement Learning (RL), where the aim is to estimate the performance of a new policy given historical data that may have been generated by a different policy, or policies. In particular, we introduce a novel doubly-robust estimator for the OPE problem in RL, based on the Targeted Maximum Likelihood Estimation principle from the statistical causal inference literature. We also introduce several variance reduction techniques that lead to impressive performance gains in off-policy evaluation. We show empirically that our estimator uniformly wins over existing off-policy evaluation methods across multiple RL environments and various levels of model misspecification. Finally, we further the existing theoretical analysis of estimators for the RL off-policy estimation problem by showing their $O_P(1/\sqrt{n})$ rate of convergence and characterizing their asymptotic distribution.

Submitted 12 December, 2019; originally announced December 2019.

Comments: We are uploading the full paper with the appendix as of 12/12/2019, as we noticed that, unlike the main text, the appendix has not been made available on PMLR's website. The version of the appendix in this document is the same that we have been sending by email since June 2019 to readers who solicited it

Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:654-663, 2019

arXiv:1802.09646 [pdf, other]

Optimizing over a Restricted Policy Class in Markov Decision Processes

Authors: Ershad Banijamali, Yasin Abbasi-Yadkori, Mohammad Ghavamzadeh, Nikos Vlassis

Abstract: We address the problem of finding an optimal policy in a Markov decision process under a restricted policy class defined by the convex hull of a set of base policies. This problem is of great interest in applications in which a number of reasonably good (or safe) policies are already known and we are only interested in optimizing in their convex hull. We show that this problem is NP-hard to solve exactly as well as to approximate to arbitrary accuracy. However, under a condition that is akin to the occupancy measures of the base policies having large overlap, we show that there exists an efficient algorithm that finds a policy that is almost as good as the best convex combination of the base policies. The running time of the proposed algorithm is linear in the number of states and polynomial in the number of base policies. In practice, we demonstrate an efficient implementation for large state problems. Compared to traditional policy gradient methods, the proposed approach has the advantage that, apart from the computation of occupancy measures of some base policies, the iterative method need not interact with the environment during the optimization process. This is especially important in complex systems where estimating the value of a policy can be a time consuming process.

Submitted 26 February, 2018; originally announced February 2018.

Comments: 14 pages

arXiv:1711.07979 [pdf, other]

Posterior Sampling for Large Scale Reinforcement Learning

Authors: Georgios Theocharous, Zheng Wen, Yasin Abbasi-Yadkori, Nikos Vlassis

Abstract: We propose a practical non-episodic PSRL algorithm that unlike recent state-of-the-art PSRL algorithms uses a deterministic, model-independent episode switching schedule. Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity. We prove a Bayesian regret bound under mild assumptions. Our result is more generally applicable to multiple parameters and continuous state action problems. We compare our algorithm with state-of-the-art PSRL algorithms on standard discrete and continuous problems from the literature. Finally, we show how the assumptions of our algorithm satisfy a sensible parametrization for a large class of problems in sequential recommendations.

Submitted 22 October, 2018; v1 submitted 20 November, 2017; originally announced November 2017.

arXiv:1701.08716 [pdf, other]

Does Weather Matter? Causal Analysis of TV Logs

Authors: Shi Zong, Branislav Kveton, Shlomo Berkovsky, Azin Ashkan, Nikos Vlassis, Zheng Wen

Abstract: Weather affects our mood and behaviors, and many aspects of our life. When it is sunny, most people become happier; but when it rains, some people get depressed. Despite this evidence and the abundance of data, weather has mostly been overlooked in the machine learning and data science research. This work presents a causal analysis of how weather affects TV watching patterns. We show that some weather attributes, such as pressure and precipitation, cause major changes in TV watching patterns. To the best of our knowledge, this is the first large-scale causal study of the impact of weather on TV watching patterns.

Submitted 24 March, 2017; v1 submitted 25 January, 2017; originally announced January 2017.

Comments: Companion of the 26th International World Wide Web Conference

arXiv:1611.09957 [pdf, other]

Low-dimensional Data Embedding via Robust Ranking

Authors: Ehsan Amid, Nikos Vlassis, Manfred K. Warmuth

Abstract: We describe a new method called t-ETE for finding a low-dimensional embedding of a set of objects in Euclidean space. We formulate the embedding problem as a joint ranking problem over a set of triplets, where each triplet captures the relative similarities between three objects in the set. By exploiting recent advances in robust ranking, t-ETE produces high-quality embeddings even in the presence of a significant amount of noise and better preserves local scale than known methods, such as t-STE and t-SNE. In particular, our method produces significantly better results than t-SNE on signature datasets while also being faster to compute.

Submitted 16 May, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

arXiv:1607.06494 [pdf, ps, other]

Stochastic Control via Entropy Compression

Authors: Dimitris Achlioptas, Fotis Iliopoulos, Nikos Vlassis

Abstract: We consider an agent trying to bring a system to an acceptable state by repeated probabilistic action. Several recent works on algorithmizations of the Lovasz Local Lemma (LLL) can be seen as establishing sufficient conditions for the agent to succeed. Here we study whether such stochastic control is also possible in a noisy environment, where both the process of state-observation and the process of state-evolution are subject to adversarial perturbation (noise). The introduction of noise causes the tools developed for LLL algorithmization to break down since the key LLL ingredient, the sparsity of the causality (dependence) relationship, no longer holds. To overcome this challenge we develop a new analysis where entropy plays a central role, both to measure the rate at which progress towards an acceptable state is made and the rate at which noise undoes this progress. The end result is a sufficient condition that allows a smooth tradeoff between the intensity of the noise and the amenability of the system, recovering an asymmetric LLL condition in the noiseless case.

Submitted 26 November, 2016; v1 submitted 21 July, 2016; originally announced July 2016.

Comments: 18 pages

arXiv:1607.00514 [pdf, ps, other]

Approximate Joint Matrix Triangularization

Authors: Nicolo Colombo, Nikos Vlassis

Abstract: We consider the problem of approximate joint triangularization of a set of noisy jointly diagonalizable real matrices. Approximate joint triangularizers are commonly used in the estimation of the joint eigenstructure of a set of matrices, with applications in signal processing, linear algebra, and tensor decomposition. By assuming the input matrices to be perturbations of noise-free, simultaneously diagonalizable ground-truth matrices, the approximate joint triangularizers are expected to be perturbations of the exact joint triangularizers of the ground-truth matrices. We provide a priori and a posteriori perturbation bounds on the `distance' between an approximate joint triangularizer and its exact counterpart. The a priori bounds are theoretical inequalities that involve functions of the ground-truth matrices and noise matrices, whereas the a posteriori bounds are given in terms of observable quantities that can be computed from the input matrices. From a practical perspective, the problem of finding the best approximate joint triangularizer of a set of noisy matrices amounts to solving a nonconvex optimization problem. We show that, under a condition on the noise level of the input matrices, it is possible to find a good initial triangularizer such that the solution obtained by any local descent-type algorithm has certain global guarantees. Finally, we discuss the application of approximate joint matrix triangularization to canonical tensor decomposition and we derive novel estimation error bounds.

Submitted 2 July, 2016; originally announced July 2016.

Comments: 19 pages

MSC Class: 15A23; 15A42; 15A45; 15B10

arXiv:1407.6125 [pdf, other]

Spectral Sequence Motif Discovery

Authors: Nicolò Colombo, Nikos Vlassis

Abstract: Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, motif finding algorithms of increasingly high performance are required to process the big datasets produced by new high-throughput sequencing technologies. Most existing algorithms are computationally demanding and often cannot support the large size of new experimental data. We present a new motif discovery algorithm that is built on a recent machine learning technique, referred to as Method of Moments. Based on spectral decompositions, this method is robust under model misspecification and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. In a few minutes, we can process datasets of hundreds of thousand sequences and extract motif profiles that match those computed by various state-of-the-art algorithms.

Submitted 26 August, 2014; v1 submitted 23 July, 2014; originally announced July 2014.

Comments: 20 pages, 3 figures, 1 table

arXiv:1310.1930 [pdf, ps, other]

Polytopic uncertainty for linear systems: New and old complexity results

Authors: Nikos Vlassis, Raphaël Jungers

Abstract: We survey the problem of deciding the stability or stabilizability of uncertain linear systems whose region of uncertainty is a polytope. This natural setting has applications in many fields of applied science, from Control Theory to Systems Engineering to Biology. We focus on the algorithmic decidability of this property when one is given a particular polytope. This setting gives rise to several different algorithmic questions, depending on the nature of time (discrete/continuous), the property asked (stability/stabilizability), or the type of uncertainty (fixed/switching). Several of these questions have been answered in the literature in the last thirty years. We point out the ones that have remained open, and we answer all of them, except one which we raise as an open question. In all the cases, the results are negative in the sense that the questions are NP-hard. As a byproduct, we obtain complexity results for several other matrix problems in Systems and Control.

Submitted 11 February, 2014; v1 submitted 7 October, 2013; originally announced October 2013.

Comments: Fixed some typos and added some references

arXiv:1304.7992 [pdf, other]

doi 10.1371/journal.pcbi.1003424

Fast Reconstruction of Compact Context-Specific Metabolic Network Models

Authors: Nikos Vlassis, Maria Pires Pacheco, Thomas Sauter

Abstract: Systemic approaches to the study of a biological cell or tissue rely increasingly on the use of context-specific metabolic network models. The reconstruction of such a model from high-throughput data can routinely involve large numbers of tests under different conditions and extensive parameter tuning, which calls for fast algorithms. We present FASTCORE, a generic algorithm for reconstructing context-specific metabolic network models from global genome-wide metabolic network models such as Recon X. FASTCORE takes as input a core set of reactions that are known to be active in the context of interest (e.g., cell or tissue), and it searches for a flux consistent subnetwork of the global network that contains all reactions from the core set and a minimal set of additional reactions. Our key observation is that a minimal consistent reconstruction can be defined via a set of sparse modes of the global network, and FASTCORE iteratively computes such a set via a series of linear programs. Experiments on liver data demonstrate speedups of several orders of magnitude, and significantly more compact reconstructions, over a chief rival method. Given its simplicity and its excellent performance, FASTCORE can form the backbone of many future metabolic network reconstruction algorithms.

Submitted 23 November, 2013; v1 submitted 30 April, 2013; originally announced April 2013.

Comments: fixed an error in the functional analysis of the liver model

arXiv:1206.2059 [pdf, ps, other]

NP-hardness of polytope M-matrix testing and related problems

Authors: Nikos Vlassis

Abstract: In this note we prove NP-hardness of the following problem: Given a set of matrices, is there a convex combination of those that is a nonsingular M-matrix? Via known characterizations of M-matrices, our result establishes NP-hardness of several fundamental problems in systems analysis and control, such as testing the instability of an uncertain dynamical system, and minimizing the spectral radius of an affine matrix function.

Submitted 10 June, 2012; originally announced June 2012.

arXiv:1111.0062 [pdf, ps, other]

doi 10.1613/jair.2447

Optimal and Approximate Q-value Functions for Decentralized POMDPs

Authors: Frans A. Oliehoek, Matthijs T. J. Spaan, Nikos Vlassis

Abstract: Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.

Submitted 31 October, 2011; originally announced November 2011.

Journal ref: Journal Of Artificial Intelligence Research, Volume 32, pages 289-353, 2008

arXiv:1109.2145 [pdf, ps]

doi 10.1613/jair.1659

Perseus: Randomized Point-based Value Iteration for POMDPs

Authors: M. T. J. Spaan, N. Vlassis

Abstract: Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to dealing with continuous action spaces. Experimental results show the potential of Perseus in large scale POMDP problems.

Submitted 9 September, 2011; originally announced September 2011.

Journal ref: Journal Of Artificial Intelligence Research, Volume 24, pages 195-220, 2005

arXiv:1107.3090 [pdf, other]

On the Computational Complexity of Stochastic Controller Optimization in POMDPs

Authors: Nikos Vlassis, Michael L. Littman, David Barber

Abstract: We show that the problem of finding an optimal stochastic 'blind' controller in a Markov decision process is an NP-hard problem. The corresponding decision problem is NP-hard, in PSPACE, and SQRT-SUM-hard, hence placing it in NP would imply breakthroughs in long-standing open problems in computer science. Our result establishes that the more general problem of stochastic controller optimization in POMDPs is also NP-hard. Nonetheless, we outline a special case that is convex and admits efficient global solutions.

Submitted 4 October, 2012; v1 submitted 15 July, 2011; originally announced July 2011.

Comments: Corrected error in the proof of Theorem 2, and revised Section 5

ACM Class: F.2.1

Showing 1–29 of 29 results for author: Vlassis, N