-
Neural Multivariate Regression: Qualitative Insights from the Unconstrained Feature Model
Authors:
George Andriopoulos,
Soyuj Jung Basnet,
Juan Guevara,
Li Guo,
Keith Ross
Abstract:
The Unconstrained Feature Model (UFM) is a mathematical framework that enables closed-form approximations for minimal training loss and related performance measures in deep neural networks (DNNs). This paper leverages the UFM to provide qualitative insights into neural multivariate regression, a critical task in imitation learning, robotics, and reinforcement learning. Specifically, we address two key questions: (1) How do multi-task models compare to multiple single-task models in terms of training performance? (2) Can whitening and normalizing regression targets improve training performance? The UFM theory predicts that multi-task models achieve strictly smaller training MSE than multiple single-task models when the same or stronger regularization is applied to the latter, and our empirical results confirm these findings. Regarding whitening and normalizing regression targets, the UFM theory predicts that they reduce training MSE when the average variance across the target dimensions is less than one, and our empirical results once again confirm these findings. These findings highlight the UFM as a powerful framework for deriving actionable insights into DNN design and data pre-processing strategies.
Submitted 14 May, 2025;
originally announced May 2025.
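The abstract above ties the benefit of target normalization to the average variance across target dimensions. A minimal sketch of that quantity and of per-dimension normalization is given below, assuming plain Python lists of target vectors. The function names and the toy data are hypothetical, full whitening (decorrelating the dimensions) is not shown, and this is not the authors' code.

```python
from statistics import fmean, pvariance


def average_target_variance(targets):
    """Mean of the per-dimension (population) variances of the targets."""
    dims = list(zip(*targets))  # transpose: one tuple per target dimension
    return fmean(pvariance(d) for d in dims)


def normalize_targets(targets):
    """Standardize each target dimension to zero mean and unit variance.

    Assumes every dimension has non-zero variance.
    """
    dims = list(zip(*targets))
    means = [fmean(d) for d in dims]
    stds = [pvariance(d) ** 0.5 for d in dims]
    return [
        [(y - m) / s for y, m, s in zip(row, means, stds)]
        for row in targets
    ]


# Toy two-dimensional targets (illustrative values only, not from the paper).
Y = [[0.1, 0.5], [0.3, 0.4], [0.2, 0.6], [0.4, 0.5]]
avg_var = average_target_variance(Y)
Y_norm = normalize_targets(Y)
```

Here `avg_var` is below one, which is the regime in which the abstract says normalization reduces training MSE; after normalization every dimension has unit variance by construction.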
-
Enhancing operational wind downscaling capabilities over Canada: Application of a Conditional Wasserstein GAN methodology
Authors:
Jorge Guevara,
Victor Nascimento,
Johannes Schmude,
Daniel Salles,
Simon Corbeil-Létourneau,
Madalina Surcel,
Dominique Brunet
Abstract:
Wind downscaling is essential for improving the spatial resolution of weather forecasts, particularly in operational Numerical Weather Prediction (NWP). This study advances wind downscaling by extending the DownGAN framework introduced by Annau et al. to operational datasets from the Global Deterministic Prediction System (GDPS) and High-Resolution Deterministic Prediction System (HRDPS), covering the entire Canadian domain. We enhance the model by incorporating high-resolution static covariates, such as HRDPS-derived topography, into a Conditional Wasserstein Generative Adversarial Network with Gradient Penalty, implemented using a UNET-based generator. Following the DownGAN framework, our methodology integrates low-resolution GDPS forecasts (15 km, 10-day horizon) and high-resolution HRDPS forecasts (2.5 km, 48-hour horizon) with Frequency Separation techniques adapted from computer vision. Through robust training and inference over the Canadian region, we demonstrate the operational scalability of our approach, achieving significant improvements in wind downscaling accuracy. Statistical validation highlights reductions in root mean square error (RMSE) and log spectral distance (LSD) metrics compared to the original DownGAN. High-resolution conditioning covariates and Frequency Separation strategies prove instrumental in enhancing model performance. This work underscores the potential for extending high-resolution wind forecasts beyond the 48-hour horizon, bridging the gap to the 10-day low-resolution global forecast window.
Submitted 26 February, 2025; v1 submitted 9 December, 2024;
originally announced December 2024.
-
Inclusive Design of AI's Explanations: Just for Those Previously Left Out, or for Everyone?
Authors:
Md Montaser Hamid,
Fatima Moussaoui,
Jimena Noa Guevara,
Andrew Anderson,
Puja Agarwal,
Jonathan Dodge,
Margaret Burnett
Abstract:
Motivations: Explainable Artificial Intelligence (XAI) systems aim to improve users' understanding of AI, but XAI research shows many cases of different explanations serving some users well and being unhelpful to others. In non-AI systems, some software practitioners have used inclusive design approaches and sometimes their improvements turned out to be "curb-cut" improvements -- not only addressing the needs of underserved users, but also making the products better for everyone. So, if AI practitioners used inclusive design approaches, they too might create curb-cut improvements, i.e., better explanations for everyone. Objectives: To find out, we investigated the curb-cut effects of inclusivity-driven fixes on users' mental models of AI when using an XAI prototype. The prototype and fixes came from an AI team who had adopted an inclusive design approach (GenderMag) to improve their XAI prototype. Methods: We ran a between-subject study with 69 participants with no AI background. 34 participants used the original version of the XAI prototype and 35 used the version with the inclusivity fixes. We compared the two groups' mental model concepts scores, prediction accuracy, and inclusivity. Results: We found four main results. First, the study revealed several curb-cut effects of the inclusivity fixes: overall increased engagement with explanations and better mental model concepts scores, indicating that the fixes had curb-cut properties. However (second), the inclusivity fixes did not improve participants' prediction accuracy scores -- instead, they appear to have harmed them. This "curb-fence" effect (opposite of the curb-cut effect) revealed the AI explanations' double-edged impact. Third, the AI team's inclusivity fixes brought significant improvements for users whose problem-solving styles had previously been underserved. Further (fourth), the AI team's fixes reduced the gender gap by 45%.
Submitted 2 December, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Community detection problem based on polarization measures:an application to Twitter: the COVID-19 case in Spain
Authors:
Inmaculada Gutiérrez,
Juan Antonio Guevara,
Daniel Gómez,
Javier Castro,
Rosa Espínola
Abstract:
In this paper, we address one of the most important topics in the field of Social Network Analysis: the community detection problem with additional information. That additional information is modeled by a fuzzy measure that represents the risk of polarization. In particular, we are interested in taking the polarization of nodes into account in the community detection problem. Adding this type of information makes the problem more realistic, as a community is more likely to be defined if the corresponding elements are willing to maintain a peaceful dialogue. The polarization capacity is modeled by a fuzzy measure based on the JDJpol measure of polarization related to two poles. We also present an efficient algorithm for finding groups whose elements are not polarized. We then apply it to a real case: a network obtained from Twitter, concerning the political position against the Spanish government taken by several influential users. We analyze how the resulting partitions change when additional information about the society's degree of polarization is added to the problem.
Submitted 7 February, 2024;
originally announced February 2024.
-
Self-Confirming Transformer for Belief-Conditioned Adaptation in Offline Multi-Agent Reinforcement Learning
Authors:
Tao Li,
Juan Guevara,
Xinhong Xie,
Quanyan Zhu
Abstract:
Offline reinforcement learning (RL) suffers from the distribution shift between the offline dataset and the online environment. In multi-agent RL (MARL), this distribution shift may arise from the nonstationary opponents in the online testing who display distinct behaviors from those recorded in the offline dataset. Hence, the key to the broader deployment of offline MARL is the online adaptation to nonstationary opponents. Recent advances in foundation models, e.g., large language models, have demonstrated the generalization ability of the transformer, an emerging neural network architecture, in sequence modeling, of which offline RL is a special case. One naturally wonders whether offline-trained transformer-based RL policies adapt to nonstationary opponents online. We propose a novel auto-regressive training to equip transformer agents with online adaptability based on the idea of self-augmented pre-conditioning. The transformer agent first learns offline to predict the opponent's action based on past observations. When deployed online, such a fictitious opponent play, referred to as the belief, is fed back to the transformer, together with other environmental feedback, to generate future actions conditional on the belief. Motivated by self-confirming equilibrium in game theory, the training loss consists of a belief consistency loss, requiring the beliefs to match the opponent's actual actions, and a best response loss, mandating the agent to behave optimally under the belief. We evaluate the online adaptability of the proposed self-confirming transformer (SCT) in a structured environment, iterated prisoner's dilemma games, to demonstrate SCT's belief consistency and equilibrium behaviors, as well as in more involved multi-particle environments to showcase its superior performance against nonstationary opponents over prior transformers and offline MARL baselines.
Submitted 24 February, 2025; v1 submitted 6 October, 2023;
originally announced October 2023.
-
Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles
Authors:
Andrew Anderson,
Jimena Noa Guevara,
Fatima Moussaoui,
Tianyi Li,
Mihaela Vorvoreanu,
Margaret Burnett
Abstract:
Motivations: Recent research has emerged on how to improve AI product user experiences in general, but relatively little is known about an AI product's inclusivity. For example, what kinds of users does it support well, and who does it leave out? And what changes in the product would make it more inclusive?
Objectives: Our overall objective is to help fill this gap, investigating what kinds of diverse users an AI product leaves out, and how to act upon that knowledge. To bring actionability to our findings, we focus on users' diversity of problem-solving attributes. Thus, our specific objectives were: (1) to reveal whether participants with diverse problem-solving styles were left behind in a set of AI products; and (2) to relate participants' problem-solving diversity to their demographic diversity, specifically, gender and age.
Methods: We performed 18 experiments, discarding two that failed manipulation checks. Each experiment was a 2x2 factorial experiment with online participants. Each experiment compared two AI products: one deliberately violating an HAI guideline and the other applying the guideline. For our first objective, we analyzed how much each AI product gained/lost inclusivity compared to its counterpart, where inclusivity was supportiveness to participants with particular problem-solving styles. For our second objective, we analyzed how participants' problem-solving styles aligned with their demographics, namely their genders and ages.
Results & Implications: Participants' diverse problem-solving styles revealed six types of inclusivity results: (1) the AI products that followed an HAI guideline were almost always more inclusive across diversity of problem-solving styles than the products that did not follow that guideline, but the "who" that got most of the inclusivity varied widely by guideline and by problem-solving style...
Submitted 16 February, 2024; v1 submitted 1 August, 2021;
originally announced August 2021.
-
A comparative study of stochastic and deep generative models for multisite precipitation synthesis
Authors:
Jorge Guevara,
Dario Borges,
Campbell Watson,
Bianca Zadrozny
Abstract:
Future climate change scenarios are usually hypothesized using simulations from weather generators. However, there are only a few works comparing and evaluating promising deep learning models for weather generation against classical approaches. This study presents preliminary results of such evaluations for the multisite precipitation synthesis task. We compared two open-source weather generators: IBMWeathergen (an extension of the Weathergen library) and RGeneratePrec, and two deep generative models: GAN and VAE, on a variety of metrics. Our preliminary results can serve as a guide for improving the design of deep learning architectures and algorithms for the multisite precipitation synthesis task.
Submitted 16 July, 2021;
originally announced July 2021.
-
A modular framework for extreme weather generation
Authors:
Bianca Zadrozny,
Campbell D. Watson,
Daniela Szwarcman,
Daniel Civitarese,
Dario Oliveira,
Eduardo Rodrigues,
Jorge Guevara
Abstract:
Extreme weather events have an enormous impact on society and are expected to become more frequent and severe with climate change. In this context, resilience planning becomes crucial for risk mitigation and coping with these extreme events. Machine learning techniques can play a critical role in resilience planning through the generation of realistic extreme weather event scenarios that can be used to evaluate possible mitigation actions. This paper proposes a modular framework that relies on interchangeable components to produce extreme weather event scenarios. We discuss possible alternatives for each of the components and show initial results comparing two approaches on the task of generating precipitation scenarios.
Submitted 5 February, 2021;
originally announced February 2021.
-
Elastic registration based on compliance analysis and biomechanical graph matching
Authors:
Jaime Garcia Guevara,
Igor Peterlik,
Marie-Odile Berger,
Stéphane Cotin
Abstract:
An automatic elastic registration method suited for vascularized organs is proposed. The vasculature in both the pre-operative and intra-operative images is represented as a graph. A typical application of this method is the fusion of pre-operative information onto the organ during surgery, to compensate for the limited details provided by the intra-operative imaging modality (e.g. CBCT) and to cope with changes in the shape of the organ. Due to differences between image modalities and to organ deformation, each graph has a different topology and shape. The Adaptive Compliance Graph Matching (ACGM) method presented does not require any manual initialization, handles intra-operative nonrigid deformations of up to 65 mm and computes a complete displacement field over the organ from only the matched vasculature. ACGM improves on the previous Biomechanical Graph Matching (BGM) method because it uses an efficient biomechanical vascularized liver model to compute the organ's transformation and the compliance of the vessel bifurcations. This makes it possible to efficiently find the best graph matches with a novel compliance-based adaptive search. These contributions are evaluated on ten realistic synthetic and two real porcine automatically segmented datasets. ACGM obtains better target registration error (TRE) than BGM, with an average TRE in the real datasets of 4.2 mm compared to 6.5 mm, respectively. It is also up to one order of magnitude faster, less dependent on the parameters used and more robust to noise.
Submitted 13 December, 2019;
originally announced December 2019.
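The abstract above reports results as an average target registration error (TRE). As a rough illustration of how such a figure is commonly computed, the sketch below takes the mean Euclidean distance between corresponding target points after registration and their reference positions. The function name and the coordinates are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from math import dist
from statistics import fmean


def mean_tre(registered_points, reference_points):
    """Mean Euclidean distance between corresponding 3D target points.

    Both arguments are equal-length sequences of (x, y, z) coordinates
    given in the same unit (for example millimetres).
    """
    if len(registered_points) != len(reference_points):
        raise ValueError("point lists must have the same length")
    return fmean(
        dist(p, q) for p, q in zip(registered_points, reference_points)
    )


# Made-up coordinates in millimetres, for illustration only.
registered = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]
tre = mean_tre(registered, reference)
```

The first pair is 5 mm apart and the second coincides, so the mean over the two targets is 2.5 mm.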
-
Kernels on fuzzy sets: an overview
Authors:
Jorge Guevara,
Roberto Hirata Jr,
Stéphane Canu
Abstract:
This paper introduces the concept of kernels on fuzzy sets as a similarity measure for $[0,1]$-valued functions, a.k.a. membership functions of fuzzy sets.
We define the following classes of kernels: the cross product, the intersection, the non-singleton and the distance-based kernels on fuzzy sets.
These kernels are applicable to machine learning and data science tasks where uncertainty in data has an ontic or epistemic interpretation.
Submitted 30 July, 2019;
originally announced July 2019.
-
A data-driven workflow for predicting horizontal well production using vertical well logs
Authors:
Jorge Guevara,
Matthias Kormaksson,
Bianca Zadrozny,
Ligang Lu,
John Tolle,
Tyler Croft,
Mingqi Wu,
Jan Limbeck,
Detlef Hohl
Abstract:
In recent work, a data-driven sweet-spotting technique for shale plays previously explored with vertical wells was proposed. Here, we extend this technique to multiple formations and formalize a general data-driven workflow to facilitate feature extraction from vertical well logs and predictive modeling of horizontal well production. We also develop an experimental framework that facilitates model selection and validation in a realistic drilling scenario. We present some experimental results using this methodology in a field with 90 vertical wells and 98 horizontal wells, showing that it can achieve better results in terms of predictive ability than kriging of known production values.
Submitted 15 May, 2017;
originally announced May 2017.
-
moco: Fast Motion Correction for Calcium Imaging
Authors:
Alexander Dubbs,
James Guevara,
Darcy S. Peterka,
Rafael Yuste
Abstract:
Motion correction is the first in a pipeline of algorithms to analyze calcium imaging videos and extract biologically relevant information, for example the network structure of the neurons therein. Fast motion correction would be especially critical for closed-loop, activity-triggered stimulation experiments, where accurate detection and targeting of specific cells is necessary. Our algorithm uses a Fourier-transform approach, and its efficiency derives from a combination of judicious downsampling and the accelerated computation of many $L_2$ norms using dynamic programming and two-dimensional, FFT-accelerated convolutions. Its accuracy is comparable to that of established community-used algorithms, and it is more stable to large translational motions. It is programmed in Java and is compatible with ImageJ.
Submitted 19 June, 2015;
originally announced June 2015.
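The abstract above describes translational motion correction built on the computation of many $L_2$ norms. The sketch below shows only a brute-force version of that kind of objective: it tries every integer shift up to a bound and keeps the one with the smallest mean squared difference over the overlapping pixels. It is not the moco algorithm, since the downsampling, dynamic programming and FFT-accelerated convolutions that the abstract credits for its speed are all omitted. The function name and the toy images are hypothetical.

```python
def best_shift(template, frame, max_shift):
    """Return the integer (dy, dx) that best aligns frame to template.

    For each candidate shift, frame[y + dy][x + dx] is compared with
    template[y][x] over the region where both indices are valid, and the
    shift with the smallest mean squared difference is kept.
    """
    h, w = len(template), len(template[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0.0, 0
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    diff = frame[y + dy][x + dx] - template[y][x]
                    total += diff * diff
                    count += 1
            if count and total / count < best_cost:
                best, best_cost = (dy, dx), total / count
    return best


# Toy 6x6 images: one bright pixel, moved down by 1 row and right by 2 columns.
template = [[0.0] * 6 for _ in range(6)]
template[2][2] = 1.0
frame = [[0.0] * 6 for _ in range(6)]
frame[3][4] = 1.0
shift = best_shift(template, frame, max_shift=2)
```

The cost of this exhaustive search grows with the image area times the number of candidate shifts, which is the expense the accelerations named in the abstract are meant to avoid.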