-
Hybrid Forecasting of Geopolitical Events
Authors:
Daniel M. Benjamin,
Fred Morstatter,
Ali E. Abbas,
Andres Abeliuk,
Pavel Atanasov,
Stephen Bennett,
Andreas Beger,
Saurabh Birari,
David V. Budescu,
Michele Catasta,
Emilio Ferrara,
Lucas Haravitch,
Mark Himmelstein,
KSM Tozammel Hossain,
Yuzhong Huang,
Woojeong Jin,
Regina Joseph,
Jure Leskovec,
Akira Matsui,
Mehrnoosh Mirtaheri,
Xiang Ren,
Gleb Satyukov,
Rajiv Sethi,
Amandeep Singh,
Rok Sosic,
et al. (4 additional authors not shown)
Abstract:
Sound decision-making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human- and machine-generated forecasts. The system provides a platform where users can interact with machine models and thus anchor their judgments on an objective benchmark. The system also aggregates human and machine forecasts, weighting both for propinquity and based on assessed skill while adjusting for overconfidence. We present results from the Hybrid Forecasting Competition (HFC), which was larger than comparable forecasting tournaments and included 1085 users forecasting 398 real-world forecasting problems over eight months. Our main result is that the hybrid system generated more accurate forecasts than a human-only baseline that had no machine-generated predictions. We found that skilled forecasters who had access to machine-generated forecasts outperformed those who only viewed historical data. We also demonstrated that the inclusion of machine-generated forecasts in our aggregation algorithms improved performance, both in terms of accuracy and scalability. This suggests that hybrid forecasting systems, which potentially require fewer human resources, can be a viable approach for maintaining a competitive level of accuracy over a larger number of forecasting questions.
Submitted 14 December, 2024;
originally announced December 2024.
-
Safe Autonomy for Uncrewed Surface Vehicles Using Adaptive Control and Reachability Analysis
Authors:
Karan Mahesh,
Tyler M. Paine,
Max L. Greene,
Nicholas Rober,
Steven Lee,
Sildomar T. Monteiro,
Anuradha Annaswamy,
Michael R. Benjamin,
Jonathan P. How
Abstract:
Marine robots must maintain precise control and ensure safety during tasks like ocean monitoring, even when encountering unpredictable disturbances that affect performance. Designing algorithms for uncrewed surface vehicles (USVs) requires accounting for these disturbances to control the vehicle and ensure it avoids obstacles. While adaptive control has addressed USV control challenges, real-world applications are limited, and certifying USV safety amidst unexpected disturbances remains difficult. To tackle control issues, we employ a model reference adaptive controller (MRAC) to stabilize the USV along a desired trajectory. For safety certification, we developed a reachability module with a moving horizon estimator (MHE) to estimate disturbances affecting the USV. This estimate is propagated through a forward reachable set calculation, predicting future states and enabling real-time safety certification. We tested our safe autonomy pipeline on a Clearpath Heron USV in the Charles River, near MIT. Our experiments demonstrated that the USV's MRAC controller and reachability module could adapt to disturbances like thruster failures and drag forces. The MRAC controller outperformed a PID baseline, showing a 45%-81% reduction in RMSE position error. Additionally, the reachability module provided real-time safety certification, ensuring the USV's safety. We further validated our pipeline's effectiveness in underway replenishment and canal scenarios, simulating relevant marine tasks.
Submitted 1 October, 2024;
originally announced October 2024.
-
Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games
Authors:
Tyler M. Paine,
Anastasia Bizyaeva,
Michael R. Benjamin
Abstract:
This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.
Submitted 18 December, 2024; v1 submitted 20 September, 2024;
originally announced September 2024.
-
A systematic dataset generation technique applied to data-driven automotive aerodynamics
Authors:
Mark Benjamin,
Gianluca Iaccarino
Abstract:
A novel strategy for generating datasets is developed within the context of drag prediction for automotive geometries using neural networks. A primary challenge in this space is constructing a training database of sufficient size and diversity. Our method relies on a small number of starting data points, and provides a recipe to interpolate systematically between them, generating an arbitrary number of samples at the desired quality. We test this strategy using a realistic automotive geometry, and demonstrate that convolutional neural networks perform exceedingly well at predicting drag coefficients and surface pressures. Promising results are obtained in testing extrapolation performance. Our method can be applied to other problems of aerodynamic shape optimization.
Submitted 14 August, 2024;
originally announced August 2024.
-
On the modification and revocation of open source licences
Authors:
Paul Gagnon,
Misha Benjamin,
Justine Gauthier,
Catherine Regis,
Jenny Lee,
Alexei Nordell-Markovits
Abstract:
Historically, open source commitments have been deemed irrevocable once materials are released under open source licenses. In this paper, the authors argue for the creation of a subset of rights that allows open source contributors to force users to (i) update to the most recent version of a model, (ii) accept new use case restrictions, or even (iii) cease using the software entirely. While this would be a departure from the traditional open source approach, the legal, reputational and moral risks related to open-sourcing AI models could justify contributors having more control over downstream uses. Recent legislative changes have also opened the door to liability of open source contributors in certain cases. The authors believe that contributors would welcome the ability to ensure that downstream users are implementing updates that address issues like bias, guardrail workarounds or adversarial attacks on their contributions. Finally, this paper addresses how this license category would interplay with RAIL licenses, and how it should be operationalized and adopted by key stakeholders such as OSS platforms and scanning tools.
Submitted 28 May, 2024;
originally announced July 2024.
-
Evaluating Collaborative Autonomy in Opposed Environments using Maritime Capture-the-Flag Competitions
Authors:
Jordan Beason,
Michael Novitzky,
John Kliem,
Tyler Errico,
Zachary Serlin,
Kevin Becker,
Tyler Paine,
Michael Benjamin,
Prithviraj Dasgupta,
Peter Crowley,
Charles O'Donnell,
John James
Abstract:
The objective of this work is to evaluate multi-agent artificial intelligence methods when deployed on teams of unmanned surface vehicles (USVs) in an adversarial environment. Autonomous agents were evaluated in real-world scenarios using the Aquaticus test-bed, which is a Capture-the-Flag (CTF) style competition involving teams of USV systems. Cooperative teaming algorithms of various foundations in behavior-based optimization and deep reinforcement learning (RL) were deployed on these USV systems in two-versus-two teams and tested against each other during a competition period in the fall of 2023. Deep reinforcement learning applied to USV agents was achieved via the Pyquaticus test bed, a lightweight gymnasium environment that allows simulated CTF training in a low-level environment. The results of the experiment demonstrate that rule-based cooperation for behavior-based agents outperformed agents trained in deep reinforcement learning paradigms as implemented in these competitions. Further integration of the Pyquaticus gymnasium environment for RL with MOOS-IvP in terms of configuration and control schema will allow for more competitive CTF games in future studies. As the development of experimental deep RL methods continues, the authors expect that the competitive gap between behavior-based autonomy and deep RL will be reduced. As such, this report outlines the overall competition, methods, and results, with an emphasis on future work such as reward shaping, sim-to-real methodologies, and extending rule-based cooperation among agents to react to safety and security events in accordance with human experts' intent and rules for executing safety and security processes.
Submitted 25 April, 2024;
originally announced April 2024.
-
A Model for Multi-Agent Autonomy That Uses Opinion Dynamics and Multi-Objective Behavior Optimization
Authors:
Tyler M. Paine,
Michael R. Benjamin
Abstract:
This paper reports a new hierarchical architecture for modeling autonomous multi-robot systems (MRSs): a nonlinear dynamical opinion process is used to model high-level group choice, and multi-objective behavior optimization is used to model individual decisions. Using previously reported theoretical results, we show it is possible to design the behavior of the MRS by the selection of a relatively small set of parameters. The resulting behavior, both collective actions and individual actions, can be understood intuitively. The approach is entirely decentralized and the communication cost scales with the number of group options, not agents. We demonstrated the effectiveness of this approach using a hypothetical 'explore-exploit-migrate' scenario in a two-hour field demonstration with eight unmanned surface vessels (USVs). The results from our preliminary field experiment show the collective behavior is robust even with time-varying network topology and agent dropouts.
Submitted 18 October, 2024; v1 submitted 18 November, 2023;
originally announced November 2023.
-
An ensemble of online estimation methods for one degree-of-freedom models of unmanned surface vehicles: applied theory and preliminary field results with eight vehicles
Authors:
Tyler M. Paine,
Michael R. Benjamin
Abstract:
In this paper we report an experimental evaluation of three popular methods for online system identification of unmanned surface vehicles (USVs) which were implemented as an ensemble: certifiably stable shallow recurrent neural network (RNN), adaptive identification (AID), and recursive least squares (RLS). The algorithms were deployed on eight USVs for a total of 30 hours of online estimation. During online training the loss function for the RNN was augmented to include a cost for violating a sufficient condition for the RNN to be stable in the sense of contraction stability. Additionally, we describe an efficient method to calculate the equilibrium points of the RNN and classify the associated stability properties about these points. We found the AID method had the lowest mean absolute error in the online prediction setting, but a weighted ensemble had lower error in offline processing.
Submitted 3 August, 2023; v1 submitted 1 August, 2023;
originally announced August 2023.
-
Morpheus: An A-sized AUV with morphing fins and algorithms for agile maneuvering
Authors:
Supun Randeni,
Michael Sacarny,
Michael Benjamin,
Michael Triantafyllou
Abstract:
We designed and constructed an A-sized base autonomous underwater vehicle (AUV), augmented with a stack of modular and extendable hardware and software, including autonomy, navigation, control and high fidelity simulation capabilities (A-size stands for the standard sonobuoy form factor, with a maximum diameter of 124 mm). Subsequently, we extended this base vehicle with a novel tuna-inspired morphing fin payload module (referred to as the Morpheus AUV), to achieve good directional stability and exceptional maneuverability, properties that are highly desirable for rigid hull AUVs, but are presently difficult to achieve because they impose contradictory requirements. The morphing fin payload allows the base AUV to dynamically change its stability-maneuverability qualities by using morphing fins, which can be deployed, deflected and retracted, as needed. The base vehicle and Morpheus AUV were both extensively field tested in-water in the Charles River, Massachusetts, USA, over hundreds of hours of operations across a period of two years. The maneuvering capability of the Morpheus AUV was evaluated with and without the use of morphing fins to quantify the performance improvement. The Morpheus AUV was able to showcase an exceptional turning rate of around 25-35 deg/s. A maximum turn rate improvement of around 35%-50% was gained through the use of morphing fins.
Submitted 22 December, 2022;
originally announced December 2022.
-
Towards Explaining Autonomy with Verbalised Decision Tree States
Authors:
Konstantinos Gavriilidis,
Andrea Munafo,
Helen Hastie,
Conlan Cesar,
Michael DeFilippo,
Michael R. Benjamin
Abstract:
The development of new AUV technology increased the range of tasks that AUVs can tackle and the length of their operations. As a result, AUVs are capable of handling highly complex operations. However, these missions do not fit easily into the traditional method of defining a mission as a series of pre-planned waypoints because it is not possible to know, in advance, everything that might occur during the mission. This results in a gap between the operator's expectations and actual operational performance. Consequently, this can create a diminished level of trust between the operators and AUVs, resulting in unnecessary mission interruptions. To bridge this gap between in-mission robotic behaviours and operators' expectations, this work aims to provide a framework to explain decisions and actions taken by an autonomous vehicle during the mission, in an easy-to-understand manner. Additionally, the objective is to have an autonomy-agnostic system that can be added as an additional layer on top of any autonomy architecture. To make the approach applicable across different autonomous systems equipped with different autonomies, this work decouples the inner workings of the autonomy from the decision points and the resulting executed actions applying Knowledge Distillation. Finally, to present the explanations to the operators in a more natural way, the output of the distilled decision tree is combined with natural language explanations and reported to the operators as sentences. For this reason, an additional step known as Concept2Text Generation is added at the end of the explanation pipeline.
Submitted 28 September, 2022;
originally announced September 2022.
-
Adaptive and Collaborative Bathymetric Channel-Finding Approach for Multiple Autonomous Marine Vehicles
Authors:
Nikolai Gershfeld,
Tyler M Paine,
Michael R. Benjamin
Abstract:
This paper reports an investigation into the problem of rapid identification of a channel that crosses a body of water using one or more Unmanned Surface Vehicles (USV). A new algorithm called Proposal Based Adaptive Channel Search (PBACS) is presented as a potential solution that improves upon current methods. The empirical performance of PBACS is compared to lawnmower surveying and to Markov decision process (MDP) planning with two state-of-the-art reward functions: Upper Confidence Bound (UCB) and Maximum Value Information (MVI). The performance of each method is evaluated through comparison of the time it takes to identify a continuous channel through an area, using one, two, three, or four USVs. The performance of each method is compared across ten simulated bathymetry scenarios and one field area, each with different channel layouts. The results from simulations and field trials indicate that on average multi-vehicle PBACS outperforms lawnmower, UCB, and MVI based methods, especially when at least three vehicles are used.
Submitted 27 May, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
-
MassMIND: Massachusetts Maritime INfrared Dataset
Authors:
Shailesh Nirgudkar,
Michael DeFilippo,
Michael Sacarny,
Michael Benjamin,
Paul Robinette
Abstract:
Recent advances in deep learning technology have triggered radical progress in the autonomy of ground vehicles. Marine coastal Autonomous Surface Vehicles (ASVs) that are regularly used for surveillance, monitoring and other routine tasks can benefit from this autonomy. Long haul deep sea transportation activities are additional opportunities. These two use cases present very different terrains: the first, coastal waters, has many obstacles, structures and human presence, while the latter is mostly devoid of such obstacles. Variations in environmental conditions are common to both terrains. Robust labeled datasets mapping such terrains are crucial in improving the situational awareness that can drive autonomy. However, only a limited number of such maritime datasets are available and these primarily consist of optical images. Although Long Wave Infrared (LWIR) is a strong complement to the optical spectrum that helps in extreme light conditions, a labeled public dataset with LWIR images does not currently exist. In this paper, we fill this gap by presenting a labeled dataset of over 2,900 LWIR segmented images captured in a coastal maritime environment under diverse conditions. The images are labeled using instance segmentation and classified into seven categories: sky, water, obstacle, living obstacle, bridge, self and background. We also evaluate this dataset across three deep learning architectures (UNet, PSPNet, DeepLabv3) and provide detailed analysis of its efficacy. While the dataset focuses on the coastal terrain, it can equally help deep sea use cases. Such terrain would have less traffic, and the classifier trained on a cluttered environment would be able to handle sparse scenes effectively. We share this dataset with the research community with the hope that it spurs new scene understanding capabilities in the maritime environment.
Submitted 8 September, 2022;
originally announced September 2022.
-
Models, Markets, and the Forecasting of Elections
Authors:
Rajiv Sethi,
Julie Seager,
Emily Cai,
Daniel M. Benjamin,
Fred Morstatter
Abstract:
We examine probabilistic forecasts for battleground states in the 2020 US presidential election, using daily data from two sources over seven months: a model published by The Economist, and prices from the PredictIt exchange. We find systematic differences in accuracy over time, with markets performing better several months before the election, and the model performing better as the election approached. A simple average of the two forecasts performs better than either one of them overall, even though no average can outperform both component forecasts for any given state-date pair. This effect arises because the model and the market make different kinds of errors in different states: the model was confidently wrong in some cases, while the market was excessively uncertain in others. We conclude that there is value in using hybrid forecasting methods, and propose a market design that incorporates model forecasts via a trading bot to generate synthetic predictions. We also propose and conduct a profitability test that can be used as a novel criterion for the evaluation of forecasting performance.
Submitted 25 May, 2021; v1 submitted 6 February, 2021;
originally announced February 2021.
-
A Multisite, Report-Based, Centralized Infrastructure for Feedback and Monitoring of Radiology AI/ML Development and Clinical Deployment
Authors:
Menashe Benjamin,
Guy Engelhard,
Alex Aisen,
Yinon Aradi,
Elad Benjamin
Abstract:
An infrastructure for multisite, geographically-distributed creation and collection of diverse, high-quality, curated and labeled radiology image data is crucial for the successful automated development, deployment, monitoring and continuous improvement of Artificial Intelligence (AI)/Machine Learning (ML) solutions in the real world. An interactive radiology reporting approach that integrates image viewing, dictation, natural language processing (NLP) and creation of hyperlinks between image findings and the report, provides localized labels during routine interpretation. These images and labels can be captured and centralized in a cloud-based system. This method provides a practical and efficient mechanism with which to monitor algorithm performance. It also supplies feedback for iterative development and quality improvement of new and existing algorithmic models. Both feedback and monitoring are achieved without burdening the radiologist. The method addresses proposed regulatory requirements for post-marketing surveillance and external data. Comprehensive multi-site data collection assists in reducing bias. Resource requirements are greatly reduced compared to dedicated retrospective expert labeling.
Submitted 31 August, 2020;
originally announced August 2020.
-
Towards Standardization of Data Licenses: The Montreal Data License
Authors:
Misha Benjamin,
Paul Gagnon,
Negar Rostamzadeh,
Chris Pal,
Yoshua Bengio,
Alex Shee
Abstract:
This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper's goal is to build towards a common framework for data licensing akin to the licensing of open source software. Increased transparency and resolving conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data through bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper's approach is summarized in a new family of data license language: the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper.
Submitted 20 March, 2019;
originally announced March 2019.