-
Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs
Authors:
S. Chandra Mouli,
Danielle C. Maddix,
Shima Alizadeh,
Gaurav Gupta,
Andrew Stuart,
Michael W. Mahoney,
Yuyang Wang
Abstract:
Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are even moderately out-of-domain (OOD), even when the model approximates the solution well for in-domain tasks. To address this limitation, we show that ensembling several NOs can identify high-error regions and provide good uncertainty estimates that are well-correlated with prediction errors. Based on this, we propose a cost-effective alternative, DiverseNO, that mimics the properties of the ensemble by encouraging diverse predictions from its multiple heads in the last feed-forward layer. We then introduce Operator-ProbConserv, a method that uses these well-calibrated UQ estimates within the ProbConserv framework to update the model. Our empirical results show that Operator-ProbConserv enhances OOD model performance for a variety of challenging PDE problems and satisfies physical constraints such as conservation laws.
Submitted 12 June, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
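The ensemble idea in the abstract above can be illustrated with a minimal sketch: the per-point spread across ensemble members serves as an uncertainty estimate. This is not the authors' DiverseNO or Operator-ProbConserv code; the function name `ensemble_summary` and the toy predictions are hypothetical, and plain lists stand in for neural operator outputs on a grid.

```python
from statistics import mean, pstdev

def ensemble_summary(predictions):
    """Return per-point mean and spread for an ensemble.

    predictions: list of M lists, each of length N, one list per
    ensemble member (hypothetical stand-in for model outputs on a grid).
    """
    # Transpose so each tuple holds all members' predictions at one grid point.
    per_point = list(zip(*predictions))
    means = [mean(p) for p in per_point]
    # Population standard deviation across members acts as the uncertainty proxy.
    spreads = [pstdev(p) for p in per_point]
    return means, spreads

# Toy data (assumed): members agree on the first two points, disagree on the last.
preds = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [1.0, 2.0, 7.0],
]
means, spreads = ensemble_summary(preds)
```

In this toy case the spread is zero where the members agree and largest at the last point, which is the kind of signal the abstract describes as identifying high-error regions.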
-
Polynomial Calculus sizes over the Boolean and Fourier bases are incomparable
Authors:
Sasank Mouli
Abstract:
For every $n > 0$, we show the existence of a CNF tautology over $O(n^2)$ variables of width $O(\log n)$ such that it has a Polynomial Calculus Resolution refutation over $\{0,1\}$ variables of size $O(n^3 \,\mathrm{polylog}(n))$ but any Polynomial Calculus refutation over $\{+1,-1\}$ variables requires size $2^{\Omega(n)}$. This shows that Polynomial Calculus sizes over the $\{0,1\}$ and $\{+1,-1\}$ bases are incomparable (since Tseitin tautologies show a separation in the other direction) and answers an open problem posed by Sokolov [Sok20] and Razborov.
Submitted 30 June, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
MetaPhysiCa: OOD Robustness in Physics-informed Machine Learning
Authors:
S Chandra Mouli,
Muhammad Ashraful Alam,
Bruno Ribeiro
Abstract:
A fundamental challenge in physics-informed machine learning (PIML) is the design of robust PIML methods for out-of-distribution (OOD) forecasting tasks. These OOD tasks require learning-to-learn from observations of the same (ODE) dynamical system with different unknown ODE parameters, and demand accurate forecasts even under out-of-support initial conditions and out-of-support ODE parameters. In this work we propose a solution for such tasks, which we define as a meta-learning procedure for causal structure discovery (including invariant risk minimization). Using three different OOD tasks, we empirically observe that the proposed approach significantly outperforms existing state-of-the-art PIML and deep learning methods.
Submitted 6 March, 2023;
originally announced March 2023.
-
Bias Challenges in Counterfactual Data Augmentation
Authors:
S Chandra Mouli,
Yangze Zhou,
Bruno Ribeiro
Abstract:
Deep learning models tend not to be out-of-distribution robust primarily due to their reliance on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactual-invariant to spurious features, a requirement for out-of-distribution (OOD) robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactual-invariance if the augmentation is performed by a context-guessing machine, an abstract machine that guesses the most-likely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a context-guessing machine does not lead to robust OOD classifiers.
Submitted 13 September, 2022; v1 submitted 12 September, 2022;
originally announced September 2022.
-
Neural Networks for Learning Counterfactual G-Invariances from Single Environments
Authors:
S Chandra Mouli,
Bruno Ribeiro
Abstract:
Despite -- or maybe because of -- their astonishing capacity to fit data, neural networks are believed to have difficulties extrapolating beyond training data distribution. This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity. Rather, the shortcoming is inherited from a learning hypothesis: Examples not explicitly observed with infinitely many training examples have underspecified outcomes in the learner's model. In order to endow neural networks with the ability to extrapolate over group transformations, we introduce a learning framework counterfactually-guided by the learning hypothesis that any group invariance to (known) transformation groups is mandatory even without evidence, unless the learner deems it inconsistent with the training data. Unlike existing invariance-driven methods for (counterfactual) extrapolations, this framework allows extrapolations from a single environment. Finally, we introduce sequence and image extrapolation tasks that validate our framework and showcase the shortcomings of traditional approaches.
Submitted 20 April, 2021;
originally announced April 2021.
-
Incandescent Bulb and LED Brake Lights: Novel Analysis of Reaction Times
Authors:
Ramaswamy Palaniappan,
Surej Mouli,
Evangelina Fringi,
Howard Bowman,
Ian McLoughlin
Abstract:
Rear-end collisions account for around 8% of all vehicle crashes in the UK, with the failure to notice or react to a brake light signal being a major contributory cause. Meanwhile, traditional incandescent brake light bulbs on vehicles are increasingly being replaced by a profusion of designs featuring LEDs. In this paper, we investigate the efficacy of brake light design using a novel approach to recording subject reaction times in a simulation setting using physical brake light assemblies. The reaction times of 22 subjects were measured for ten pairs of LED and incandescent bulb brake lights. Three events were investigated for each subject, namely the latency of brake light activation to accelerator release (BrakeAcc), the latency of accelerator release to brake pedal depression (AccPdl), and the cumulative time from light activation to brake pedal depression (BrakePdl). To our knowledge, this is the first study in which reaction times have been split into BrakeAcc and AccPdl. Results indicate that the two brake lights containing incandescent bulbs led to significantly slower reaction times compared to the eight tested LED lights. BrakeAcc results also show that experienced subjects were quicker to respond to the activation of brake lights by releasing the accelerator pedal. Interestingly, the analysis also revealed that the type of brake light influenced the AccPdl time, although experienced subjects did not always act quicker than inexperienced subjects. Overall, the study found that different designs of brake light can significantly influence driver response times.
Submitted 20 October, 2020;
originally announced October 2020.
-
Deceptive Deletions for Protecting Withdrawn Posts on Social Platforms
Authors:
Mohsen Minaei,
S Chandra Mouli,
Mainack Mondal,
Bruno Ribeiro,
Aniket Kate
Abstract:
Over-sharing poorly-worded thoughts and personal information is prevalent on online social platforms. In many of these cases, users regret posting such content. To retrospectively rectify these errors in users' sharing decisions, most platforms offer (deletion) mechanisms to withdraw the content, and social media users often utilize them. Ironically and perhaps unfortunately, these deletions make users more susceptible to privacy violations by malicious actors who specifically hunt post deletions at large scale. The reason for such hunting is simple: deleting a post acts as a powerful signal that the post might be damaging to its owner. Today, multiple archival services are already scanning social media for these deleted posts. Moreover, as we demonstrate in this work, powerful machine learning models can detect damaging deletions at scale.
Towards restraining such a global adversary against users' right to be forgotten, we introduce Deceptive Deletion, a decoy mechanism that minimizes the adversarial advantage. Our mechanism injects decoy deletions, hence creating a two-player minmax game between an adversary that seeks to classify damaging content among the deleted posts and a challenger that employs decoy deletions to masquerade real damaging deletions. We formalize the Deceptive Game between the two players, determine conditions under which either the adversary or the challenger provably wins the game, and discuss the scenarios in-between these two extremes. We apply the Deceptive Deletion mechanism to a real-world task on Twitter: hiding damaging tweet deletions. We show that a powerful global adversary can be beaten by a powerful challenger, raising the bar significantly and giving a glimmer of hope in the ability to be really forgotten on social platforms.
Submitted 28 May, 2020;
originally announced May 2020.
-
Deep Lifetime Clustering
Authors:
S Chandra Mouli,
Leonardo Teixeira,
Jennifer Neville,
Bruno Ribeiro
Abstract:
The goal of lifetime clustering is to develop an inductive model that maps subjects into $K$ clusters according to their underlying (unobserved) lifetime distribution. We introduce a neural-network based lifetime clustering model that can find cluster assignments by directly maximizing the divergence between the empirical lifetime distributions of the clusters. Accordingly, we define a novel clustering loss function over the lifetime distributions (of entire clusters) based on a tight upper bound of the two-sample Kuiper test p-value. The resultant model is robust to the modeling issues associated with the unobservability of termination signals, and does not assume proportional hazards. Our results in real and synthetic datasets show significantly better lifetime clusters (as evaluated by C-index, Brier Score, Logrank score and adjusted Rand index) as compared to competing approaches.
Submitted 1 October, 2019; v1 submitted 1 October, 2019;
originally announced October 2019.
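For readers unfamiliar with the Kuiper test mentioned in the abstract above, the following is a minimal sketch of the two-sample Kuiper statistic on plain, uncensored samples. It does not reproduce the paper's loss, which is based on an upper bound of the p-value and handles unobserved termination; the function names `ecdf` and `kuiper_statistic` are illustrative only.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return the empirical CDF of a sample as a callable."""
    s = sorted(sample)
    n = len(s)
    # Fraction of observations less than or equal to x.
    return lambda x: bisect_right(s, x) / n

def kuiper_statistic(a, b):
    """Two-sample Kuiper statistic V = D+ + D-.

    D+ is the largest amount by which the ECDF of `a` exceeds that of `b`,
    and D- the largest amount by which the ECDF of `b` exceeds that of `a`.
    """
    fa, fb = ecdf(a), ecdf(b)
    # Both ECDFs are step functions, so checking the pooled sample points suffices.
    points = sorted(set(a) | set(b))
    d_plus = max(fa(x) - fb(x) for x in points)
    d_minus = max(fb(x) - fa(x) for x in points)
    return d_plus + d_minus

# Toy lifetimes (assumed): fully separated groups versus identical groups.
separated = kuiper_statistic([1, 2, 3], [4, 5, 6])
identical = kuiper_statistic([1, 2, 3], [1, 2, 3])
```

A larger statistic indicates more divergent empirical distributions, which is the direction a clustering objective of this kind would push toward.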
-
Identifying User Survival Types via Clustering of Censored Social Network Data
Authors:
S Chandra Mouli,
Abhishek Naik,
Bruno Ribeiro,
Jennifer Neville
Abstract:
The goal of cluster analysis in survival data is to identify clusters that are decidedly associated with the survival outcome. Previous research has explored this problem primarily in the medical domain with relatively small datasets, but the need for such a clustering methodology could arise in other domains with large datasets, such as social networks. Concretely, we wish to identify different survival classes in a social network by clustering the users based on their lifespan in the network. In this paper, we propose a decision tree based algorithm that uses a global normalization of $p$-values to identify clusters with significantly different survival distributions. We evaluate the clusters from our model with the help of a simple survival prediction task and show that our model outperforms other competing methods.
Submitted 9 March, 2017;
originally announced March 2017.