Search | arXiv e-print repository

Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs

Authors: S. Chandra Mouli, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Andrew Stuart, Michael W. Mahoney, Yuyang Wang

Abstract: Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are even moderately out-of-domain (OOD), even when the model approximates the solution well for in-domain tasks. To address this limitation, we show that ensembling several NOs can identify high-error regions and provide good uncertainty estimates that are well-correlated with prediction errors. Based on this, we propose a cost-effective alternative, DiverseNO, that mimics the properties of the ensemble by encouraging diverse predictions from its multiple heads in the last feed-forward layer. We then introduce Operator-ProbConserv, a method that uses these well-calibrated UQ estimates within the ProbConserv framework to update the model. Our empirical results show that Operator-ProbConserv enhances OOD model performance for a variety of challenging PDE problems and satisfies physical constraints such as conservation laws.

Submitted 12 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: ICML 2024

arXiv:2403.03933 [pdf, ps, other]

Polynomial Calculus sizes over the Boolean and Fourier bases are incomparable

Authors: Sasank Mouli

Abstract: For every $n > 0$, we show the existence of a CNF tautology over $O(n^2)$ variables of width $O(\log n)$ such that it has a Polynomial Calculus Resolution refutation over $\{0,1\}$ variables of size $O(n^3 \operatorname{polylog}(n))$ but any Polynomial Calculus refutation over $\{+1,-1\}$ variables requires size $2^{\Omega(n)}$. This shows that Polynomial Calculus sizes over the $\{0,1\}$ and $\{+1,-1\}$ bases are incomparable (since Tseitin tautologies show a separation in the other direction) and answers an open problem posed by Sokolov [Sok20] and Razborov.

Submitted 30 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

arXiv:2303.03181 [pdf, other]

MetaPhysiCa: OOD Robustness in Physics-informed Machine Learning

Authors: S Chandra Mouli, Muhammad Ashraful Alam, Bruno Ribeiro

Abstract: A fundamental challenge in physics-informed machine learning (PIML) is the design of robust PIML methods for out-of-distribution (OOD) forecasting tasks. These OOD tasks require learning-to-learn from observations of the same (ODE) dynamical system with different unknown ODE parameters, and demand accurate forecasts even under out-of-support initial conditions and out-of-support ODE parameters. In this work we propose a solution for such tasks, which we define as a meta-learning procedure for causal structure discovery (including invariant risk minimization). Using three different OOD tasks, we empirically observe that the proposed approach significantly outperforms existing state-of-the-art PIML and deep learning methods.

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2209.05104 [pdf, other]

Bias Challenges in Counterfactual Data Augmentation

Authors: S Chandra Mouli, Yangze Zhou, Bruno Ribeiro

Abstract: Deep learning models tend not to be out-of-distribution robust primarily due to their reliance on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactual-invariant to spurious features, a requirement for out-of-distribution (OOD) robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactual-invariance if the augmentation is performed by a context-guessing machine, an abstract machine that guesses the most-likely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a context-guessing machine does not lead to robust OOD classifiers.

Submitted 13 September, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: Accepted at UAI 2022 Workshop on Causal Representation Learning

arXiv:2104.10105 [pdf, other]

Neural Networks for Learning Counterfactual G-Invariances from Single Environments

Authors: S Chandra Mouli, Bruno Ribeiro

Abstract: Despite -- or maybe because of -- their astonishing capacity to fit data, neural networks are believed to have difficulties extrapolating beyond training data distribution. This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity. Rather, the shortcoming is inherited from a learning hypothesis: Examples not explicitly observed with infinitely many training examples have underspecified outcomes in the learner's model. In order to endow neural networks with the ability to extrapolate over group transformations, we introduce a learning framework counterfactually-guided by the learning hypothesis that any group invariance to (known) transformation groups is mandatory even without evidence, unless the learner deems it inconsistent with the training data. Unlike existing invariance-driven methods for (counterfactual) extrapolations, this framework allows extrapolations from a single environment. Finally, we introduce sequence and image extrapolation tasks that validate our framework and showcase the shortcomings of traditional approaches.

Submitted 20 April, 2021; originally announced April 2021.

Comments: ICLR 2021

arXiv:2010.10584 [pdf, other]

doi 10.1109/ACCESS.2021.3058579

Incandescent Bulb and LED Brake Lights: Novel Analysis of Reaction Times

Authors: Ramaswamy Palaniappan, Surej Mouli, Evangelina Fringi, Howard Bowman, Ian McLoughlin

Abstract: Rear-end collision accounts for around 8% of all vehicle crashes in the UK, with the failure to notice or react to a brake light signal being a major contributory cause. Meanwhile traditional incandescent brake light bulbs on vehicles are increasingly being replaced by a profusion of designs featuring LEDs. In this paper, we investigate the efficacy of brake light design using a novel approach to recording subject reaction times in a simulation setting using physical brake light assemblies. The reaction times of 22 subjects were measured for ten pairs of LED and incandescent bulb brake lights. Three events were investigated for each subject, namely the latency of brake light activation to accelerator release (BrakeAcc), the latency of accelerator release to brake pedal depression (AccPdl), and the cumulative time from light activation to brake pedal depression (BrakePdl). To our knowledge, this is the first study in which reaction times have been split into BrakeAcc and AccPdl. Results indicate that the two brake lights containing incandescent bulbs led to significantly slower reaction times compared to the tested eight LED lights. BrakeAcc results also show that experienced subjects were quicker to respond to the activation of brake lights by releasing the accelerator pedal. Interestingly, the analysis also revealed that the type of brake light influenced the AccPdl time, although experienced subjects did not always act quicker than inexperienced subjects.
Overall, the study found that different designs of brake light can significantly influence driver response times.

Submitted 20 October, 2020; originally announced October 2020.

Comments: 10 pages, 18 figures

Journal ref: For a revised version and its published version refer to IEEE Access journal, 2021

arXiv:2005.14113 [pdf, other]

Deceptive Deletions for Protecting Withdrawn Posts on Social Platforms

Authors: Mohsen Minaei, S Chandra Mouli, Mainack Mondal, Bruno Ribeiro, Aniket Kate

Abstract: Over-sharing poorly-worded thoughts and personal information is prevalent on online social platforms. In many of these cases, users regret posting such content. To retrospectively rectify these errors in users' sharing decisions, most platforms offer (deletion) mechanisms to withdraw the content, and social media users often utilize them. Ironically and perhaps unfortunately, these deletions make users more susceptible to privacy violations by malicious actors who specifically hunt post deletions at large scale. The reason for such hunting is simple: deleting a post acts as a powerful signal that the post might be damaging to its owner. Today, multiple archival services are already scanning social media for these deleted posts. Moreover, as we demonstrate in this work, powerful machine learning models can detect damaging deletions at scale. Towards restraining such a global adversary against users' right to be forgotten, we introduce Deceptive Deletion, a decoy mechanism that minimizes the adversarial advantage. Our mechanism injects decoy deletions, hence creating a two-player minmax game between an adversary that seeks to classify damaging content among the deleted posts and a challenger that employs decoy deletions to masquerade real damaging deletions. We formalize the Deceptive Game between the two players, determine conditions under which either the adversary or the challenger provably wins the game, and discuss the scenarios in-between these two extremes.
We apply the Deceptive Deletion mechanism to a real-world task on Twitter: hiding damaging tweet deletions. We show that a powerful global adversary can be beaten by a powerful challenger, raising the bar significantly and giving a glimmer of hope in the ability to be really forgotten on social platforms.

Submitted 28 May, 2020; originally announced May 2020.

arXiv:1910.00547 [pdf, other]

Deep Lifetime Clustering

Authors: S Chandra Mouli, Leonardo Teixeira, Jennifer Neville, Bruno Ribeiro

Abstract: The goal of lifetime clustering is to develop an inductive model that maps subjects into $K$ clusters according to their underlying (unobserved) lifetime distribution. We introduce a neural-network based lifetime clustering model that can find cluster assignments by directly maximizing the divergence between the empirical lifetime distributions of the clusters. Accordingly, we define a novel clustering loss function over the lifetime distributions (of entire clusters) based on a tight upper bound of the two-sample Kuiper test p-value. The resultant model is robust to the modeling issues associated with the unobservability of termination signals, and does not assume proportional hazards. Our results in real and synthetic datasets show significantly better lifetime clusters (as evaluated by C-index, Brier Score, Logrank score and adjusted Rand index) as compared to competing approaches.

Submitted 1 October, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

arXiv:1703.03401 [pdf, other]

Identifying User Survival Types via Clustering of Censored Social Network Data

Authors: S Chandra Mouli, Abhishek Naik, Bruno Ribeiro, Jennifer Neville

Abstract: The goal of cluster analysis in survival data is to identify clusters that are decidedly associated with the survival outcome. Previous research has explored this problem primarily in the medical domain with relatively small datasets, but the need for such a clustering methodology could arise in other domains with large datasets, such as social networks. Concretely, we wish to identify different survival classes in a social network by clustering the users based on their lifespan in the network. In this paper, we propose a decision tree based algorithm that uses a global normalization of $p$-values to identify clusters with significantly different survival distributions. We evaluate the clusters from our model with the help of a simple survival prediction task and show that our model outperforms other competing methods.

Submitted 9 March, 2017; originally announced March 2017.

Showing 1–9 of 9 results for author: Mouli, S