-
A Vision for Trustworthy, Fair, and Efficient Socio-Technical Control using Karma Economies
Authors:
Ezzat Elokda,
Andrea Censi,
Emilio Frazzoli,
Florian Dörfler,
Saverio Bolognani
Abstract:
Control systems will play a pivotal role in addressing societal-scale challenges as they drive the development of sustainable future smart cities. At the heart of these challenges is the trustworthy, fair, and efficient allocation of scarce public resources, including renewable energy, transportation, data, computation, etc. Historical evidence suggests that monetary control -- the prototypical mechanism for managing resource scarcity -- is not always well-accepted in socio-technical resource contexts. In this vision article, we advocate for karma economies as an emerging non-monetary mechanism for socio-technical control. Karma leverages the repetitive nature of many socio-technical resources to jointly attain trustworthy, fair, and efficient allocations by budgeting resource consumption over time and letting resource users ``play against their future selves.'' To motivate karma, we review related concepts in economics through a control systems lens, and make a case for a) shifting the viewpoint of resource allocations from single-shot and static to repeated and dynamic games; and b) adopting long-run Nash welfare as the formalization of ``fairness and efficiency'' in socio-technical contexts. We show that in many dynamic resource settings, karma Nash equilibria maximize long-run Nash welfare. Moreover, we discuss implications for a future smart city built on multi-karma economies: by choosing whether to combine different socio-technical resources, e.g., electricity and transportation, in a single karma economy, or separate them into resource-specific economies, karma provides new flexibility to design the scope of fairness and efficiency.
Submitted 20 June, 2025;
originally announced June 2025.
-
Learn to Bid as a Price-Maker Wind Power Producer
Authors:
Shobhit Singhal,
Marta Fochesato,
Liviu Aolaritei,
Florian Dörfler
Abstract:
Wind power producers (WPPs) participating in short-term power markets face significant imbalance costs due to their non-dispatchable and variable production. While some WPPs have a large enough market share to influence prices with their bidding decisions, existing optimal bidding methods rarely account for this aspect. Price-maker approaches typically model bidding as a bilevel optimization problem, but these methods require complex market models and estimates of other participants' actions, and are computationally demanding. To address these challenges, we propose an online learning algorithm that leverages contextual information to optimize WPP bids in the price-maker setting. We formulate the strategic bidding problem as a contextual multi-armed bandit, ensuring provable regret minimization. The algorithm's performance is evaluated against various benchmark strategies using a numerical simulation of the German day-ahead and real-time markets.
Submitted 20 March, 2025;
originally announced March 2025.
-
Decision-Dependent Stochastic Optimization: The Role of Distribution Dynamics
Authors:
Zhiyu He,
Saverio Bolognani,
Florian Dörfler,
Michael Muehlebach
Abstract:
Distribution shifts have long been regarded as troublesome external forces that a decision-maker should either counteract or conform to. An intriguing feedback phenomenon termed decision dependence arises when the deployed decision affects the environment and alters the data-generating distribution. In the realm of performative prediction, this is encoded by distribution maps parameterized by decisions due to strategic behaviors. In contrast, we formalize an endogenous distribution shift as a feedback process featuring nonlinear dynamics that couple the evolving distribution with the decision. Stochastic optimization in this dynamic regime provides a fertile ground to examine the various roles played by dynamics in the composite problem structure. To this end, we develop an online algorithm that achieves optimal decision-making by both adapting to and shaping the dynamic distribution. Throughout the paper, we adopt a distributional perspective and demonstrate how this view facilitates characterizations of distribution dynamics and the optimality and generalization performance of the proposed algorithm. We showcase the theoretical results in an opinion dynamics context, where an opportunistic party maximizes the affinity of a dynamic polarized population, and in a recommender system scenario, featuring performance optimization with discrete distributions in the probability simplex.
Submitted 10 March, 2025;
originally announced March 2025.
-
An Adaptive Data-Enabled Policy Optimization Approach for Autonomous Bicycle Control
Authors:
Niklas Persson,
Feiran Zhao,
Mojtaba Kaheni,
Florian Dörfler,
Alessandro V. Papadopoulos
Abstract:
This paper presents a unified control framework that integrates a Feedback Linearization (FL) controller in the inner loop with an adaptive Data-Enabled Policy Optimization (DeePO) controller in the outer loop to balance an autonomous bicycle. While the FL controller stabilizes and partially linearizes the inherently unstable and nonlinear system, its performance is compromised by unmodeled dynamics and time-varying characteristics. To overcome these limitations, the DeePO controller is introduced to enhance adaptability and robustness. The initial control policy of DeePO is obtained from a finite set of offline, persistently exciting input and state data. To improve stability and compensate for system nonlinearities and disturbances, a robustness-promoting regularizer refines the initial policy, while the adaptive section of the DeePO framework is enhanced with a forgetting factor to improve adaptation to time-varying dynamics. The proposed DeePO+FL approach is evaluated through simulations and real-world experiments on an instrumented autonomous bicycle. Results demonstrate its superiority over the FL-only approach, achieving more precise tracking of the reference lean angle and lean rate.
Submitted 19 February, 2025;
originally announced February 2025.
-
Optimizing Social Network Interventions via Hypergradient-Based Recommender System Design
Authors:
Marino Kühne,
Panagiotis D. Grontas,
Giulia De Pasquale,
Giuseppe Belgioioso,
Florian Dörfler,
John Lygeros
Abstract:
Although social networks have expanded the range of ideas and information accessible to users, they are also criticized for amplifying the polarization of user opinions. Given the inherent complexity of these phenomena, existing approaches to counteract these effects typically rely on handcrafted algorithms and heuristics. We propose an elegant solution: we act on the network weights that model user interactions on social networks (e.g., frequency of communication), to optimize a performance metric (e.g., polarization reduction), while users' opinions follow the classical Friedkin-Johnsen model. Our formulation gives rise to a challenging large-scale optimization problem with non-convex constraints, for which we develop a gradient-based algorithm. Our scheme is simple, scalable, and versatile, as it can readily integrate different, potentially non-convex, objectives. We demonstrate its merit by: (i) rapidly solving complex social network intervention problems with 3 million variables based on the Reddit and DBLP datasets; (ii) significantly outperforming competing approaches in terms of both computation time and disagreement reduction.
Submitted 18 February, 2025;
originally announced February 2025.
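For readers unfamiliar with the Friedkin-Johnsen model named in the abstract above, the following is a minimal sketch of its standard opinion update, x(k+1) = Λ W x(k) + (I − Λ) x(0), where W is a row-stochastic weight matrix and Λ holds each user's susceptibility to social influence. The two-user network, the susceptibility values, and the function names are illustrative assumptions. The sketch only simulates opinions for fixed weights; it is not the paper's hypergradient-based algorithm, which optimizes over those weights.

```python
def fj_step(W, lam, x, x0):
    """One Friedkin-Johnsen update: blend neighbors' opinions with the initial opinion."""
    n = len(x)
    return [
        lam[i] * sum(W[i][j] * x[j] for j in range(n)) + (1.0 - lam[i]) * x0[i]
        for i in range(n)
    ]


def fj_opinions(W, lam, x0, iters=200):
    """Iterate the update from x(0) = x0; converges when every lam[i] < 1."""
    x = list(x0)
    for _ in range(iters):
        x = fj_step(W, lam, x, x0)
    return x


# Hypothetical two-user network with equal weights and susceptibility 0.5.
W = [[0.5, 0.5], [0.5, 0.5]]
lam = [0.5, 0.5]
x0 = [0.0, 1.0]

opinions = fj_opinions(W, lam, x0)
print(opinions)
```

In this toy instance the two users keep distinct final opinions because each stays partly anchored to their initial view, which is the feature that makes the network weights a meaningful design variable.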
-
To Travel Quickly or to Park Conveniently: Coupled Resource Allocations with Multi-Karma Economies
Authors:
Ezzat Elokda,
Andrea Censi,
Saverio Bolognani,
Florian Dörfler,
Emilio Frazzoli
Abstract:
The large-scale allocation of public resources (e.g., transportation, energy) is among the core challenges of future Cyber-Physical-Human Systems (CPHS). In order to guarantee that these systems are efficient and fair, recent works have investigated non-monetary resource allocation schemes, including schemes that employ karma. Karma is a non-tradable token that flows from users gaining resources to users yielding resources. Thus far, karma-based solutions have considered the allocation of a single public resource; however, modern CPHS are complex, as they involve the allocation of multiple coupled resources. For example, a user might want to trade off fast travel on highways for convenient parking in the city center, and different users could have heterogeneous preferences for such coupled resources. In this paper, we explore how to optimally combine multiple karma economies for coupled resource allocations, using two mechanism-design instruments: (non-uniform) karma redistribution; and (non-unit) exchange rates. We first extend the existing Dynamic Population Game (DPG) model that predicts the Stationary Nash Equilibrium (SNE) of the multi-karma economies. Then, in a numerical case study, we demonstrate that the design of redistribution significantly affects the coupled resource allocations, while non-unit exchange rates play a minor role. To assess the allocation outcomes under user heterogeneity, we adopt Nash welfare as our social welfare function, since it makes no interpersonal comparisons and it is axiomatically rooted in social choice theory. Our findings suggest that the simplest mechanism design, that is, uniform redistribution with unit exchange rates, also attains maximum social welfare.
Submitted 22 December, 2024;
originally announced December 2024.
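The Nash welfare criterion mentioned in the abstract above can be illustrated with a small sketch. Nash welfare is the product of users' utilities, so comparing allocations by the sum of log-utilities gives the same ranking. The utility numbers and the function name below are made-up illustrations, not values from the paper. The second comparison shows why the criterion needs no interpersonal comparisons: rescaling one user's utility by a positive constant shifts every allocation's score by the same amount and leaves the ranking unchanged.

```python
import math


def log_nash_welfare(utilities):
    """Sum of log-utilities; ranks allocations the same way as the product of utilities."""
    if any(u <= 0 for u in utilities):
        return float("-inf")  # the product is zero (or undefined) if any user gets nothing
    return sum(math.log(u) for u in utilities)


# Two hypothetical allocations with the same total utility (4.0).
balanced = [2.0, 2.0]
skewed = [3.5, 0.5]

prefers_balanced = log_nash_welfare(balanced) > log_nash_welfare(skewed)

# Rescale user 0's utility by 10 in both allocations (a change of units).
balanced_rescaled = [10 * balanced[0], balanced[1]]
skewed_rescaled = [10 * skewed[0], skewed[1]]
still_prefers_balanced = (
    log_nash_welfare(balanced_rescaled) > log_nash_welfare(skewed_rescaled)
)

print(prefers_balanced, still_prefers_balanced)
```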
-
Contractivity and linear convergence in bilinear saddle-point problems: An operator-theoretic approach
Authors:
Colin Dirren,
Mattia Bianchi,
Panagiotis D. Grontas,
John Lygeros,
Florian Dörfler
Abstract:
We study the convex-concave bilinear saddle-point problem $\min_x \max_y f(x) + y^\top Ax - g(y)$, where both, only one, or none of the functions $f$ and $g$ are strongly convex, and suitable rank conditions on the matrix $A$ hold. The solution of this problem is at the core of many machine learning tasks. By employing tools from monotone operator theory, we systematically prove the contractivity (in turn, the linear convergence) of several first-order primal-dual algorithms, including the Chambolle-Pock method. Our approach results in concise proofs, and it yields new convergence guarantees and tighter bounds compared to known results.
Submitted 21 April, 2025; v1 submitted 18 October, 2024;
originally announced October 2024.
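As a concrete illustration of the problem class in the abstract above, here is a minimal sketch of the Chambolle-Pock primal-dual iteration on a scalar instance of $\min_x \max_y f(x) + y^\top Ax - g(y)$. The choices $f(x) = \tfrac{1}{2}(x-1)^2$, $g(y) = \tfrac{1}{2}y^2$, $A = 1$, and step sizes $\tau = \sigma = 0.5$ (so that $\tau\sigma A^2 < 1$) are assumptions made for illustration, as are the function names; the paper's analysis covers far more general settings.

```python
def prox_f(v, tau):
    """Proximal map of f(x) = 0.5 * (x - 1)^2 with step tau."""
    return (v + tau) / (1.0 + tau)


def prox_g(v, sigma):
    """Proximal map of g(y) = 0.5 * y^2 with step sigma."""
    return v / (1.0 + sigma)


def chambolle_pock(a=1.0, tau=0.5, sigma=0.5, iters=200):
    """Scalar Chambolle-Pock iteration with the usual extrapolation step."""
    x, y = 0.0, 0.0
    for _ in range(iters):
        x_new = prox_f(x - tau * a * y, tau)          # primal descent step
        x_bar = 2.0 * x_new - x                       # extrapolation
        y = prox_g(y + sigma * a * x_bar, sigma)      # dual ascent step
        x = x_new
    return x, y


x_star, y_star = chambolle_pock()
print(x_star, y_star)
```

For this instance the saddle point can be found by hand from the optimality conditions $x - 1 + y = 0$ and $x - y = 0$, which gives $x = y = 0.5$; both functions are strongly convex here, so the iterates approach it at a linear rate.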
-
Maximum likelihood inference for high-dimensional problems with multiaffine variable relations
Authors:
Jean-Sébastien Brouillon,
Florian Dörfler,
Giancarlo Ferrari-Trecate
Abstract:
Maximum Likelihood Estimation of continuous variable models can be very challenging in high dimensions, due to potentially complex probability distributions. The existence of multiple interdependencies among variables can make it very difficult to establish convergence guarantees. This leads to a wide use of brute-force methods, such as grid searching and Monte-Carlo sampling and, when applicable, complex and problem-specific algorithms. In this paper, we consider inference problems where the variables are related by multiaffine expressions. We propose a novel Alternating and Iteratively-Reweighted Least Squares (AIRLS) algorithm, and prove its convergence for problems with Generalized Normal Distributions. We also provide an efficient method to compute the variance of the estimates obtained using AIRLS. Finally, we show how the method can be applied to graphical statistical models. We perform numerical experiments on several inference problems, showing significantly better performance than state-of-the-art approaches in terms of scalability, robustness to noise, and convergence speed due to an empirically observed super-linear convergence rate.
Submitted 5 September, 2024;
originally announced September 2024.
-
Fairness in Social Influence Maximization via Optimal Transport
Authors:
Shubham Chowdhary,
Giulia De Pasquale,
Nicolas Lanzetti,
Ana-Andreea Stoica,
Florian Dorfler
Abstract:
We study fairness in social influence maximization, whereby one seeks to select seeds that spread a given information throughout a network, ensuring balanced outreach among different communities (e.g. demographic groups). In the literature, fairness is often quantified in terms of the expected outreach within individual communities. In this paper, we demonstrate that such fairness metrics can be misleading since they overlook the stochastic nature of information diffusion processes. When information diffusion occurs in a probabilistic manner, multiple outreach scenarios can occur. As such, outcomes such as ``In 50% of the cases, no one in group 1 gets the information, while everyone in group 2 does, and in the other 50%, it is the opposite'', which always results in largely unfair outcomes, are classified as fair by a variety of fairness metrics in the literature. We tackle this problem by designing a new fairness metric, mutual fairness, that captures variability in outreach through optimal transport theory. We propose a new seed-selection algorithm that optimizes both outreach and mutual fairness, and we show its efficacy on several real datasets. We find that our algorithm increases fairness with only a minor decrease (and at times, even an increase) in efficiency.
Submitted 30 January, 2025; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Learning diffusion at lightspeed
Authors:
Antonio Terpin,
Nicolas Lanzetti,
Martin Gadea,
Florian Dörfler
Abstract:
Diffusion regulates numerous natural processes and the dynamics of many successful generative models. Existing models to learn the diffusion terms from observational data rely on complex bilevel optimization problems and model only the drift of the system. We propose a new simple model, JKOnet*, which bypasses the complexity of existing architectures while presenting significantly enhanced representational capabilities: JKOnet* recovers the potential, interaction, and internal energy components of the underlying diffusion process. JKOnet* minimizes a simple quadratic loss and outperforms other baselines in terms of sample efficiency, computational complexity, and accuracy. Additionally, JKOnet* provides a closed-form optimal solution for linearly parametrized functionals, and, when applied to predict the evolution of cellular processes from real-world data, it achieves state-of-the-art accuracy at a fraction of the computational cost of all existing methods. Our methodology is based on the interpretation of diffusion processes as energy-minimizing trajectories in the probability space via the so-called JKO scheme, which we study via its first-order optimality conditions.
Submitted 18 October, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
NeoRL: Efficient Exploration for Nonepisodic RL
Authors:
Bhavya Sukhija,
Lenart Treven,
Florian Dörfler,
Stelian Coros,
Andreas Krause
Abstract:
We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical systems, where the system dynamics are unknown and the RL agent has to learn from a single trajectory, i.e., without resets. We propose Nonepisodic Optimistic RL (NeoRL), an approach based on the principle of optimism in the face of uncertainty. NeoRL uses well-calibrated probabilistic models and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics. Under continuity and bounded energy assumptions on the system, we provide a first-of-its-kind regret bound of $O(\Gamma_T \sqrt{T})$ for general nonlinear systems with Gaussian process dynamics. We compare NeoRL to other baselines on several deep RL environments and empirically demonstrate that NeoRL achieves the optimal average cost while incurring the least regret.
Submitted 11 February, 2025; v1 submitted 3 June, 2024;
originally announced June 2024.
-
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
Authors:
Lenart Treven,
Bhavya Sukhija,
Yarden As,
Florian Dörfler,
Andreas Krause
Abstract:
Reinforcement learning (RL) excels in optimizing policies for discrete-time Markov decision processes (MDP). However, various systems are inherently continuous in time, making discrete-time MDPs an inexact modeling choice. In many applications, such as greenhouse control or medical treatments, each interaction (measurement or switching of action) involves manual intervention and thus is inherently costly. Therefore, we generally prefer a time-adaptive approach with fewer interactions with the system. In this work, we formalize an RL framework, Time-adaptive Control & Sensing (TaCoS), that tackles this challenge by optimizing over policies that, besides the control, also predict the duration of its application. Our formulation results in an extended MDP that any standard RL algorithm can solve. We demonstrate that state-of-the-art RL algorithms trained on TaCoS drastically reduce the number of interactions compared to their discrete-time counterparts while retaining the same or improved performance, and exhibit robustness to the discretization frequency. Finally, we propose OTaCoS, an efficient model-based algorithm for our setting. We show that OTaCoS enjoys sublinear regret for systems with sufficiently smooth dynamics and empirically results in further sample-efficiency gains.
Submitted 30 October, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Carbon-Aware Computing in a Network of Data Centers: A Hierarchical Game-Theoretic Approach
Authors:
Enno Breukelman,
Sophie Hall,
Giuseppe Belgioioso,
Florian Dörfler
Abstract:
Over the past decade, the continuous surge in cloud computing demand has intensified data center workloads, leading to significant carbon emissions and driving the need for improving their efficiency and sustainability. This paper focuses on the optimal allocation problem of batch compute loads with temporal and spatial flexibility across a global network of data centers. We propose a bilevel game-theoretic solution approach that captures the inherent hierarchical relationship between supervisory control objectives, such as carbon reduction and peak shaving, and operational objectives, such as priority-aware scheduling. Numerical simulations with real carbon intensity data demonstrate that the proposed approach successfully reduces carbon emissions while simultaneously ensuring operational reliability and priority-aware scheduling.
Submitted 28 May, 2024;
originally announced May 2024.
-
Distributed Traffic Signal Control via Coordinated Maximum Pressure-plus-Penalty
Authors:
Vinzenz Tütsch,
Zhiyu He,
Florian Dörfler,
Kenan Zhang
Abstract:
This paper develops an adaptive traffic control policy inspired by Maximum Pressure (MP) while imposing coordination across intersections. The proposed Coordinated Maximum Pressure-plus-Penalty (CMPP) control policy features a local objective for each intersection that consists of the total pressure within the neighborhood and a penalty accounting for the queue capacities and continuous green time for certain movements. The corresponding control task is reformulated as a distributed optimization problem and solved via two customized algorithms: one based on the alternating direction method of multipliers (ADMM) and the other on a greedy heuristic augmented with a majority vote. CMPP not only provides a theoretical guarantee of queuing network stability but also outperforms several benchmark controllers in simulations on a large-scale real traffic network with lower average travel and waiting time per vehicle, as well as less network congestion. Furthermore, CMPP with the greedy algorithm enjoys computational efficiency comparable to that of fully decentralized controllers without significantly compromising the control performance, which highlights its great potential for real-world deployment.
Submitted 30 April, 2024;
originally announced April 2024.
-
Dynamic Resource Allocation with Karma: An Experimental Study
Authors:
Ezzat Elokda,
Heinrich Nax,
Saverio Bolognani,
Florian Dörfler
Abstract:
A system of non-tradable credits that flow between individuals like karma, hence proposed under that name, is a mechanism for repeated resource allocation that comes with attractive efficiency and fairness properties, in theory. In this study, we test karma in an online experiment in which human subjects repeatedly compete for a resource with time-varying and stochastic individual preferences or urgency to acquire the resource. We confirm that karma has significant and sustained welfare benefits even in a population with no prior training. We identify mechanism usage in contexts with sporadic high urgency, more so than with frequent moderate urgency, and implemented as a simple (binary) karma bidding scheme as particularly effective for welfare improvements: relatively larger aggregate efficiency gains are realized that are (almost) Pareto superior. These findings provide guidance for further testing and for future implementation plans of such mechanisms in the real world.
Submitted 25 December, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
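To make the idea of non-tradable credits that flow between individuals more tangible, the following is a rough sketch of a single round of a karma auction. Everything specific here is an assumption for illustration, not the experiment's actual protocol: the highest bidder wins the resource and pays their bid, the payment is redistributed uniformly to all users, ties go to the lowest index, and the balances and bids are made-up numbers. The function name `karma_round` is hypothetical.

```python
def karma_round(karma, bids):
    """Run one illustrative auction round and return (winner index, new balances).

    Assumed rules: bids are capped at each user's balance, the highest bidder
    wins and pays their bid, and the payment is split equally among all users.
    """
    capped = [min(b, k) for b, k in zip(bids, karma)]
    winner = max(range(len(karma)), key=lambda i: capped[i])  # first index wins ties
    payment = capped[winner]
    share = payment / len(karma)
    new_karma = [k + share for k in karma]
    new_karma[winner] -= payment
    return winner, new_karma


# Three hypothetical users with equal balances; user 0 is the most urgent.
karma = [10.0, 10.0, 10.0]
bids = [6.0, 3.0, 0.0]

winner, new_karma = karma_round(karma, bids)
print(winner, new_karma)
```

Under these assumed rules the total amount of karma is conserved: the winner's balance drops while the users who yielded gain, so winning today leaves less to bid with in future rounds.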
-
Mitigating Transient Bullwhip Effects Under Imperfect Demand Forecasts
Authors:
Sarah H. Q. Li,
Florian Dörfler
Abstract:
Motivated by how forecast errors exacerbate order fluctuations in supply chains, we leverage robust feedback controller synthesis to characterize, compute, and minimize the worst-case order fluctuation experienced by an individual supply chain vendor. Assuming bounded forecast errors and demand fluctuations, we model forecast error and demand fluctuations as inputs to linear inventory dynamics, and use the $\ell_\infty$ gain to define a transient Bullwhip measure. In contrast to the existing Bullwhip measure, the transient Bullwhip measure explicitly depends on the forecast error. This enables us to separately quantify the transient Bullwhip measure's sensitivity to forecast error and demand fluctuations. To compute the controller that minimizes the worst-case peak gain, we formulate an optimization problem with bilinear matrix inequalities and show that it is equivalent to minimizing a quasi-convex function on a bounded domain. We simulate our model for vendors with non-zero perishable rates and order backlogging rates, and prove that the transient Bullwhip measure can be bounded by a monotonic quasi-convex function whose dependency on the product backlog rate and perishing rate is verified in simulation.
Submitted 12 September, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Bridging the Sim-to-Real Gap with Bayesian Inference
Authors:
Jonas Rothfuss,
Bhavya Sukhija,
Lenart Treven,
Florian Dörfler,
Stelian Coros,
Andreas Krause
Abstract:
We present SIM-FSVGD for learning robot dynamics from data. As opposed to traditional methods, SIM-FSVGD leverages low-fidelity physical priors, e.g., in the form of simulators, to regularize the training of neural network models. While learning accurate dynamics already in the low data regime, SIM-FSVGD scales and excels also when more data is available. We empirically show that learning with implicit physical priors results in accurate mean model estimation as well as precise uncertainty quantification. We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system. Using model-based RL, we demonstrate a highly dynamic parking maneuver with drifting, using less than half the data compared to the state of the art.
Submitted 1 September, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
A Coupled Optimization Framework for Correlated Equilibria in Normal-Form Game
Authors:
Sarah H. Q. Li,
Yue Yu,
Florian Dörfler,
John Lygeros
Abstract:
In competitive multi-player interactions, simultaneous optimality is a key requirement for establishing strategic equilibria. This property is explicit when the game-theoretic equilibrium is the simultaneously optimal solution of coupled optimization problems. However, no such optimization problems exist for the correlated equilibrium, a strategic equilibrium where the players can correlate their actions. We address the lack of a coupled optimization framework for the correlated equilibrium by introducing an unnormalized game -- an extension of normal-form games in which the player strategies are lifted to unnormalized measures over the joint actions. We show that the set of fully mixed generalized Nash equilibria of this unnormalized game is a subset of the correlated equilibrium of the normal-form game. Furthermore, we introduce an entropy regularization to the unnormalized game and prove that the entropy-regularized generalized Nash equilibrium is a sub-optimal correlated equilibrium of the normal-form game where the degree of sub-optimality depends on the magnitude of regularization. We prove that the entropy-regularized unnormalized game has a closed-form solution, and empirically verify its computational efficacy at approximating the correlated equilibrium of normal-form games.
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Control Strategies for Recommendation Systems in Social Networks
Authors:
Ben Sprenger,
Giulia De Pasquale,
Raffaele Soloperto,
John Lygeros,
Florian Dörfler
Abstract:
A closed-loop control model to analyze the impact of recommendation systems on opinion dynamics within social networks is introduced. The core contribution is the development and formalization of model-free and model-based approaches to recommendation system design, integrating the dynamics of social interactions within networks via an extension of the Friedkin-Johnsen (FJ) model. Comparative analysis and numerical simulations demonstrate the effectiveness of the proposed control strategies in maximizing user engagement and their potential for influencing opinion formation processes.
Submitted 10 March, 2024;
originally announced March 2024.
-
To Spend or to Gain: Online Learning in Repeated Karma Auctions
Authors:
Damien Berriaud,
Ezzat Elokda,
Devansh Jalota,
Emilio Frazzoli,
Marco Pavone,
Florian Dörfler
Abstract:
Recent years have seen a surge of artificial currency-based mechanisms in contexts where monetary instruments are deemed unfair or inappropriate, e.g., in allocating food donations to food banks, course seats to students, and, more recently, even for traffic congestion management. Yet the applicability of these mechanisms remains limited in repeated auction settings, as it is challenging for users to learn how to bid an artificial currency that has no value outside the auctions. Indeed, users must jointly learn the value of the currency in addition to how to spend it optimally. Moreover, in the prominent class of karma mechanisms, in which artificial karma payments are redistributed to users at each time step, users do not only spend karma to obtain public resources but also gain karma for yielding them. For this novel class of karma auctions, we propose an adaptive karma pacing strategy that learns to bid optimally, and show that this strategy a) is asymptotically optimal for a single user bidding against competing bids drawn from a stationary distribution; b) leads to convergent learning dynamics when all users adopt it; and c) constitutes an approximate Nash equilibrium as the number of users grows. Our results require a novel analysis in comparison to adaptive pacing strategies in monetary auctions, since we depart from the classical assumption that the currency has known value outside the auctions, and consider that the currency is both spent and gained through the redistribution of payments.
Submitted 4 February, 2025; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Towards a Systems Theory of Algorithms
Authors:
Florian Dörfler,
Zhiyu He,
Giuseppe Belgioioso,
Saverio Bolognani,
John Lygeros,
Michael Muehlebach
Abstract:
Traditionally, numerical algorithms are seen as isolated pieces of code confined to an in silico existence. However, this perspective is not appropriate for many modern computational approaches in control, learning, or optimization, wherein in vivo algorithms interact with their environment. Examples of such open algorithms include various real-time optimization-based control strategies, reinforcement learning, decision-making architectures, online optimization, and many more. Further, even closed algorithms in learning or optimization are increasingly abstracted in block diagrams with interacting dynamic modules and pipelines. In this opinion paper, we state our vision on a to-be-cultivated systems theory of algorithms and argue in favor of viewing algorithms as open dynamical systems interacting with other algorithms, physical systems, humans, or databases. Remarkably, the manifold tools developed under the umbrella of systems theory are well suited for addressing a range of challenges in the algorithmic domain. We survey various instances where the principles of algorithmic systems theory are being developed and outline pertinent modeling, analysis, and design challenges.
Submitted 30 April, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Authors:
Lenart Treven,
Jonas Hübotter,
Bhavya Sukhija,
Florian Dörfler,
Andreas Krause
Abstract:
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics using nonlinear ordinary differential equations (ODEs). We capture epistemic uncertainty using well-calibrated probabilistic models, and use the optimistic principle for exploration. Our regret bounds surface the importance of the measurement selection strategy (MSS), since in continuous time we not only must decide how to explore, but also when to observe the underlying system. Our analysis demonstrates that the regret is sublinear when modeling ODEs with Gaussian Processes (GP) for common choices of MSS, such as equidistant sampling. Additionally, we propose an adaptive, data-dependent, practical MSS that, when combined with GP dynamics, also achieves sublinear regret with significantly fewer samples. We showcase the benefits of continuous-time modeling over its discrete-time counterpart, as well as our proposed adaptive MSS over standard baselines, on several applications.
Submitted 30 October, 2023;
originally announced October 2023.
-
Physics-Informed Graph Neural Network for Dynamic Reconfiguration of Power Systems
Authors:
Jules Authier,
Rabab Haider,
Anuradha Annaswamy,
Florian Dorfler
Abstract:
To maintain a reliable grid, we need fast decision-making algorithms for complex problems like Dynamic Reconfiguration (DyR). DyR optimizes distribution grid switch settings in real time to minimize grid losses and dispatches resources to supply loads with available generation. DyR is a mixed-integer problem and can be computationally intractable to solve for large grids and at fast timescales. We propose GraPhyR, a Physics-Informed Graph Neural Network (GNN) framework tailored for DyR. We incorporate essential operational and connectivity constraints directly within the GNN framework and train it end-to-end. Our results show that GraPhyR is able to learn to optimize the DyR task.
Submitted 2 April, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
The Impact of Recommendation Systems on Opinion Dynamics: Microscopic versus Macroscopic Effects
Authors:
Nicolas Lanzetti,
Florian Dörfler,
Nicolò Pagan
Abstract:
Recommendation systems are widely used in web services, such as social networks and e-commerce platforms, to serve personalized content to the users and, thus, enhance their experience. While personalization assists users in navigating through the available options, there have been growing concerns regarding its repercussions on the users and their opinions. Examples of negative impacts include the emergence of filter bubbles and the amplification of users' confirmation bias, which can cause opinion polarization and radicalization. In this paper, we study the impact of recommendation systems on users, both from a microscopic (i.e., at the level of individual users) and a macroscopic (i.e., at the level of a homogeneous population) perspective. Specifically, we build on recent work on the interactions between opinion dynamics and recommendation systems to propose a model for this closed loop, which we then study both analytically and numerically. Among other findings, our analysis reveals that shifts in the opinions of individual users do not always align with shifts in the opinion distribution of the population. In particular, even in settings where the opinion distribution appears unaltered (e.g., measured via surveys across the population), the opinion of individual users might be significantly distorted by the recommendation system.
Submitted 7 December, 2023; v1 submitted 16 September, 2023;
originally announced September 2023.
-
Nash equilibrium seeking over digraphs with row-stochastic matrices and network-independent step-sizes
Authors:
Duong Thuy Anh Nguyen,
Mattia Bianchi,
Florian Dörfler,
Duong Tung Nguyen,
Angelia Nedić
Abstract:
In this paper, we address the challenge of Nash equilibrium (NE) seeking in non-cooperative convex games with partial-decision information. We propose a distributed algorithm, where each agent refines its strategy through projected-gradient steps and an averaging procedure. Each agent uses estimates of competitors' actions obtained solely from local neighbor interactions, in a directed communication network. Unlike previous approaches that rely on (strong) monotonicity assumptions, this work establishes the convergence towards an NE under a diagonal dominance property of the pseudo-gradient mapping that can be checked locally by the agents. Further, this condition is physically interpretable and of relevance for many applications, as it suggests that an agent's objective function is primarily influenced by its individual strategic decisions, rather than by the actions of its competitors. By virtue of a novel block-infinity norm convergence argument, we provide explicit bounds for constant step-size that are independent of the communication structure, and can be computed in a totally decentralized way. Numerical simulations on an optical network's power control problem validate the algorithm's effectiveness.
Submitted 14 September, 2023;
originally announced September 2023.
-
Strategic Interactions in Multi-modal Mobility Systems: A Game-Theoretic Perspective
Authors:
Gioele Zardini,
Nicolas Lanzetti,
Giuseppe Belgioioso,
Christian Hartnik,
Saverio Bolognani,
Florian Dörfler,
Emilio Frazzoli
Abstract:
The evolution of existing transportation systems, mainly driven by urbanization and increased availability of mobility options, such as private, profit-maximizing ride-hailing companies, calls for tools to reason about their design and regulation. To study this complex socio-technical problem, one needs to account for the strategic interactions of the heterogeneous stakeholders involved in the mobility ecosystem and analyze how they influence the system. In this paper, we focus on the interactions between citizens who compete for the limited resources of a mobility system to complete their desired trip. Specifically, we present a game-theoretic framework for multi-modal mobility systems, where citizens, characterized by heterogeneous preferences, have access to various mobility options and seek individually-optimal decisions. We study the arising game and prove the existence of an equilibrium, which can be efficiently computed via a convex optimization problem. Through both an analytical and a numerical case study for the classic scenario of Sioux Falls, USA, we illustrate the capabilities of our model and perform sensitivity analyses. Importantly, we show how to embed our framework into a "larger" game among stakeholders of the mobility ecosystem (e.g., municipality, Mobility Service Providers, and citizens), effectively giving rise to tools to inform strategic interventions and policy-making in the mobility ecosystem.
Submitted 9 August, 2023;
originally announced August 2023.
-
Designing Optimal Personalized Incentive for Traffic Routing using BIG Hype algorithm
Authors:
Panagiotis D. Grontas,
Carlo Cenedese,
Marta Fochesato,
Giuseppe Belgioioso,
John Lygeros,
Florian Dörfler
Abstract:
We study the problem of optimally routing plug-in electric and conventional fuel vehicles on a city level. In our model, commuters selfishly aim to minimize a local cost that combines travel time, from a fixed origin to a desired destination, and the monetary cost of using city facilities, parking or service stations. The traffic authority can influence the commuters' preferred routing choice by means of personalized discounts on parking tickets and on the energy price at service stations. We formalize the problem of designing these monetary incentives optimally as a large-scale bilevel game, where constraints arise at both levels due to the finite capacities of city facilities and incentives budget. Then, we develop an efficient decentralized solution scheme with convergence guarantees based on BIG Hype, a recently-proposed hypergradient-based algorithm for hierarchical games. Finally, we validate our model via numerical simulations on the Anaheim network, and show that the proposed approach produces sensible results in terms of traffic decongestion and can solve problems with more than 48000 variables and 110000 constraints in minutes.
Submitted 24 April, 2023;
originally announced April 2023.
-
Nash Equilibria, Regularization and Computation in Optimal Transport-Based Distributionally Robust Optimization
Authors:
Soroosh Shafiee,
Liviu Aolaritei,
Florian Dörfler,
Daniel Kuhn
Abstract:
We study optimal transport-based distributionally robust optimization problems where a fictitious adversary, often envisioned as nature, can choose the distribution of the uncertain problem parameters by reshaping a prescribed reference distribution at a finite transportation cost. In this framework, we show that robustification is intimately related to various forms of variation and Lipschitz regularization even if the transportation cost function fails to be (some power of) a metric. We also derive conditions for the existence and the computability of a Nash equilibrium between the decision-maker and nature, and we demonstrate numerically that nature's Nash strategy can be viewed as a distribution that is supported on remarkably deceptive adversarial samples. Finally, we identify practically relevant classes of optimal transport-based distributionally robust optimization problems that can be addressed with efficient gradient descent algorithms even if the loss function or the transportation cost function are nonconvex (but not both at the same time).
Submitted 1 June, 2025; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Designing Fairness in Autonomous Peer-to-peer Energy Trading
Authors:
Varsha Behrunani,
Andrew Irvine,
Giuseppe Belgioioso,
Philipp Heer,
John Lygeros,
Florian Dörfler
Abstract:
Several autonomous energy management and peer-to-peer trading mechanisms for future energy markets have been recently proposed based on optimization and game theory. In this paper, we study the impact of trading prices on the outcome of these market designs for energy-hub networks. We prove that, for a generic choice of trading prices, autonomous peer-to-peer trading is always network-wide beneficial but not necessarily individually beneficial for each hub. Therefore, we leverage hierarchical game theory to formalize the problem of designing locally-beneficial and network-wide fair peer-to-peer trading prices. Then, we propose a scalable and privacy-preserving price-mediation algorithm that provably converges to a profile of such prices. Numerical simulations on a 3-hub network show that the proposed algorithm can indeed incentivize active participation of energy hubs in autonomous peer-to-peer trading schemes.
Submitted 9 February, 2023;
originally announced February 2023.
-
Follow the Clairvoyant: an Imitation Learning Approach to Optimal Control
Authors:
Andrea Martin,
Luca Furieri,
Florian Dörfler,
John Lygeros,
Giancarlo Ferrari-Trecate
Abstract:
We consider control of dynamical systems through the lens of competitive analysis. Most prior work in this area focuses on minimizing regret, that is, the loss relative to an ideal clairvoyant policy that has noncausal access to past, present, and future disturbances. Motivated by the observation that the optimal cost only provides coarse information about the ideal closed-loop behavior, we instead propose directly minimizing the tracking error relative to the optimal trajectories in hindsight, i.e., imitating the clairvoyant policy. By embracing a system level perspective, we present an efficient optimization-based approach for computing follow-the-clairvoyant (FTC) safe controllers. We prove that these attain minimal regret if no constraints are imposed on the noncausal benchmark. In addition, we present numerical experiments to show that our policy retains the hallmark of competitive algorithms of interpolating between classical $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control laws, while consistently outperforming regret minimization methods in constrained scenarios thanks to the superior ability to chase the clairvoyant.
Submitted 14 November, 2022;
originally announced November 2022.
-
Stability and Robustness of Distributed Suboptimal Model Predictive Control
Authors:
Giuseppe Belgioioso,
Dominic Liao-McPherson,
Mathias Hudoba de Badyn,
Nicolas Pelzmann,
John Lygeros,
Florian Dörfler
Abstract:
In distributed model predictive control (MPC), the control input at each sampling time is computed by solving a large-scale optimal control problem (OCP) over a finite horizon using distributed algorithms. Typically, such algorithms require several (virtually, infinite) communication rounds between the subsystems to converge, which is a major drawback both computationally and from an energetic perspective (for wireless systems). Motivated by these challenges, we propose a suboptimal distributed MPC scheme in which the total communication burden is distributed also in time, by maintaining a running solution estimate for the large-scale OCP and updating it at each sampling time. We demonstrate that, under some regularity conditions, the resulting suboptimal MPC control law recovers the qualitative robust stability properties of optimal MPC, if the communication budget at each sampling time is large enough.
Submitted 27 March, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions
Authors:
Antonio Terpin,
Nicolas Lanzetti,
Batuhan Yardim,
Florian Dörfler,
Giorgia Ramponi
Abstract:
Policy Optimization (PO) algorithms have been proven particularly suited to handle the high dimensionality of real-world continuous control tasks. In this context, Trust Region Policy Optimization methods represent a popular approach to stabilize the policy updates. These usually rely on the Kullback-Leibler (KL) divergence to limit the change in the policy. The Wasserstein distance represents a natural alternative, in place of the KL divergence, to define trust regions or to regularize the objective function. However, state-of-the-art works either resort to its approximations or do not provide an algorithm for continuous state-action spaces, reducing the applicability of the method. In this paper, we explore optimal transport discrepancies (which include the Wasserstein distance) to define trust regions, and we propose a novel algorithm, Optimal Transport Trust Region Policy Optimization (OT-TRPO), for continuous state-action spaces. We circumvent the infinite-dimensional optimization problem for PO by providing a one-dimensional dual reformulation for which strong duality holds. We then analytically derive the optimal policy update given the solution of the dual problem. This way, we bypass the computation of optimal transport costs and of optimal transport maps, which we implicitly characterize by solving the dual formulation. Finally, we provide an experimental evaluation of our approach across various control tasks. Our results show that optimal transport discrepancies can offer an advantage over state-of-the-art approaches.
Submitted 20 October, 2022;
originally announced October 2022.
-
A self-contained karma economy for the dynamic allocation of common resources
Authors:
Ezzat Elokda,
Saverio Bolognani,
Andrea Censi,
Florian Dörfler,
Emilio Frazzoli
Abstract:
This paper presents karma mechanisms, a novel approach to the repeated allocation of a scarce resource among competing agents over an infinite time horizon. Examples include deciding which ride-hailing trip requests to serve during peak demand, granting the right of way in intersections or lane mergers, or admitting internet content to a regulated fast channel. We study a simplified yet insightful formulation of these problems where at every instant two agents from a large population get randomly matched to compete over the resource. The intuitive interpretation of a karma mechanism is "If I give in now, I will be rewarded in the future." Agents compete in an auction-like setting where they bid units of karma, which circulates directly among them and is self-contained in the system. We demonstrate that this allows a society of self-interested agents to achieve high levels of efficiency without resorting to a (possibly problematic) monetary pricing of the resource. We model karma mechanisms as dynamic population games and guarantee the existence of a stationary Nash equilibrium. We then analyze the performance at the stationary Nash equilibrium numerically. For the case of homogeneous agents, we compare different mechanism design choices, showing that it is possible to achieve an efficient and ex-post fair allocation when the agents are future aware. Finally, we test the robustness against agent heterogeneity and propose remedies to some of the observed phenomena via karma redistribution.
Submitted 8 May, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Wasserstein Distributionally Robust Estimation in High Dimensions: Performance Analysis and Optimal Hyperparameter Tuning
Authors:
Liviu Aolaritei,
Soroosh Shafiee,
Florian Dörfler
Abstract:
Distributionally robust optimization (DRO) has become a powerful framework for estimation under uncertainty, offering strong out-of-sample performance and principled regularization. In this paper, we propose a DRO-based method for linear regression and address a central question: how to optimally choose the robustness radius, which controls the trade-off between robustness and accuracy. Focusing on high-dimensional settings where the dimension and the number of samples are both large and comparable in size, we employ tools from high-dimensional asymptotic statistics to precisely characterize the estimation error of the resulting estimator. Remarkably, this error can be recovered by solving a simple convex-concave optimization problem involving only four scalar variables. This characterization enables efficient selection of the radius that minimizes the estimation error. In doing so, it achieves the same effect as cross-validation, but at a fraction of the computational cost. Numerical experiments confirm that our theoretical predictions closely match empirical performance and that the optimal radius selected through our method aligns with that chosen by cross-validation, highlighting both the accuracy and the practical benefits of our approach.
Submitted 2 May, 2025; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Data-Driven Behaviour Estimation in Parametric Games
Authors:
Anna M. Maddux,
Nicolò Pagan,
Giuseppe Belgioioso,
Florian Dörfler
Abstract:
A central question in multi-agent strategic games deals with learning the underlying utilities driving the agents' behaviour. Motivated by the increasing availability of large datasets, we develop a unifying data-driven technique to estimate agents' utility functions from their observed behaviour, irrespective of whether the observations correspond to equilibrium configurations or to temporal sequences of action profiles. Under standard assumptions on the parametrization of the utilities, the proposed inference method is computationally efficient and finds all the parameters that rationalize the observed behaviour best. We numerically validate our theoretical findings on the market share estimation problem under advertising competition, using historical data from the Coca-Cola Company and Pepsi Inc. duopoly.
Submitted 14 January, 2024; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Posetal Games: Efficiency, Existence, and Refinement of Equilibria in Games with Prioritized Metrics
Authors:
Alessandro Zanardi,
Gioele Zardini,
Sirish Srinivasan,
Saverio Bolognani,
Andrea Censi,
Florian Dörfler,
Emilio Frazzoli
Abstract:
Modern applications require robots to comply with multiple, often conflicting rules and to interact with other agents. We present Posetal Games as a class of games in which each player expresses a preference over the outcomes via a partially ordered set of metrics. This allows one to combine hierarchical priorities of each player with the interactive nature of the environment. By contextualizing standard game-theoretical notions, we provide two sufficient conditions on the preference of the players to prove existence of pure Nash Equilibria in finite action sets. Moreover, we define formal operations on the preference structures and link them to a refinement of the game solutions, showing how the set of equilibria can be systematically shrunk. The presented results are showcased in a driving game where autonomous vehicles select from a finite set of trajectories. The outcomes are interpretable in terms of minimum-rank-violation for each player.
Submitted 13 November, 2021;
originally announced November 2021.
-
Learning Stable Deep Dynamics Models for Partially Observed or Delayed Dynamical Systems
Authors:
Andreas Schlaginhaufen,
Philippe Wenk,
Andreas Krause,
Florian Dörfler
Abstract:
Learning how complex dynamical systems evolve over time is a key challenge in system identification. For safety-critical systems, it is often crucial that the learned model is guaranteed to converge to some equilibrium point. To this end, neural ODEs regularized with neural Lyapunov functions are a promising approach when states are fully observed. For practical applications, however, partial observations are the norm. As we will demonstrate, initialization of unobserved augmented states can become a key problem for neural ODEs. To alleviate this issue, we propose to augment the system's state with its history. Inspired by state augmentation in discrete-time systems, we thus obtain neural delay differential equations. Based on classical time delay stability analysis, we then show how to ensure stability of the learned models, and theoretically analyze our approach. Our experiments demonstrate its applicability to stable system identification of partially observed systems and learning a stabilizing feedback policy in delayed feedback control.
Submitted 10 December, 2021; v1 submitted 27 October, 2021;
originally announced October 2021.
-
Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models
Authors:
Lenart Treven,
Philippe Wenk,
Florian Dörfler,
Andreas Krause
Abstract:
Differential equations in general and neural ODEs in particular are an essential technique in continuous-time system identification. While many deterministic learning algorithms have been designed based on numerical integration via the adjoint method, many downstream tasks such as active learning, exploration in reinforcement learning, robust control, or filtering require accurate estimates of predictive uncertainties. In this work, we propose a novel approach towards estimating epistemically uncertain neural ODEs, avoiding the numerical integration bottleneck. Instead of modeling uncertainty in the ODE parameters, we directly model uncertainties in the state space. Our algorithm - distributional gradient matching (DGM) - jointly trains a smoother and a dynamics model and matches their gradients via minimizing a Wasserstein loss. Our experiments show that, compared to traditional approximate inference methods based on numerical integration, our approach is faster to train, faster at predicting previously unseen trajectories, and in the context of neural ODEs, significantly more accurate.
Submitted 15 October, 2021; v1 submitted 22 June, 2021;
originally announced June 2021.
-
Dynamic Population Games: A Tractable Intersection of Mean-Field Games and Population Games
Authors:
Ezzat Elokda,
Saverio Bolognani,
Andrea Censi,
Florian Dörfler,
Emilio Frazzoli
Abstract:
In many real-world large-scale decision problems, self-interested agents have individual dynamics and optimize their own long-term payoffs. Important examples include the competitive access to shared resources (e.g., roads, energy, or bandwidth) but also non-engineering domains like epidemic propagation and control. These problems are natural to model as mean-field games. Existing mathematical formulations of mean-field games have had limited applicability in practice, since they require solving non-standard initial-terminal-value problems that are tractable only in limited special cases. In this letter, we propose a novel formulation, along with computational tools, for a practically relevant class of Dynamic Population Games (DPGs), which correspond to discrete-time, finite-state-and-action, stationary mean-field games. Our main contribution is a mathematical reduction of Stationary Nash Equilibria (SNE) in DPGs to standard Nash Equilibria (NE) in static population games. This reduction is leveraged to guarantee the existence of an SNE, develop an evolutionary dynamics-based SNE computation algorithm, and derive simple conditions that guarantee stability and uniqueness of the SNE. We provide two examples of applications: fair resource allocation with heterogeneous agents and control of epidemic propagation. Open source software for SNE computation: https://gitlab.ethz.ch/elokdae/dynamic-population-games
Submitted 4 June, 2024; v1 submitted 29 April, 2021;
originally announced April 2021.
-
Game Theory to Study Interactions between Mobility Stakeholders
Authors:
Gioele Zardini,
Nicolas Lanzetti,
Laura Guerrini,
Emilio Frazzoli,
Florian Dörfler
Abstract:
Increasing urbanization and exacerbation of sustainability goals threaten the operational efficiency of current transportation systems and confront cities with complex choices with huge impact on future generations. At the same time, the rise of private, profit-maximizing Mobility Service Providers leveraging public resources, such as ride-hailing companies, entangles current regulation schemes. This calls for tools to study such complex socio-technical problems. In this paper, we provide a game-theoretic framework to study interactions between stakeholders of the mobility ecosystem, modeling regulatory aspects such as taxes and public transport prices, as well as operational matters for Mobility Service Providers such as pricing strategy, fleet sizing, and vehicle design. Our framework is modular and can readily accommodate different types of Mobility Service Providers, actions of municipalities, and low-level models of customers' choices in the mobility system. Through both an analytical and a numerical case study for the city of Berlin, Germany, we showcase the ability of our framework to compute equilibria of the problem, to study fundamental tradeoffs, and to inform stakeholders and policy makers on the effects of interventions. Among others, we show tradeoffs between customer satisfaction, environmental impact, and public revenue, as well as the impact of strategic decisions on these metrics.
Submitted 6 November, 2021; v1 submitted 21 April, 2021;
originally announced April 2021.
-
Structural Balance and Interpersonal Appraisals Dynamics: Beyond All-to-All and Two-Faction Networks
Authors:
Wenjun Mei,
Ge Chen,
Noah E. Friedkin,
Florian Dörfler
Abstract:
Structural balance theory describes stable configurations of topologies of signed interpersonal appraisal networks. Existing models explaining the convergence of appraisal networks to structural balance either diverge in finite time, or could get stuck in jammed states, or converge to only complete graphs. In this paper, we study the open problem of how steady non-all-to-all structural balance emerges via local dynamics of interpersonal appraisals. We first compare two well-justified definitions of structural balance for general non-all-to-all graphs, i.e., the triad-wise structural balance and the two-faction structural balance, and thoroughly study their relations. Secondly, based on three widely adopted sociological mechanisms: the symmetry mechanism, the influence mechanism, and the homophily mechanism, we propose two simple models of gossip-like appraisal dynamics, the symmetry-influence-homophily (SIH) dynamics and the symmetry-influence-opinion-homophily (SIOH) dynamics. In these models, the appraisal network starting from any initial condition almost surely achieves non-all-to-all triad-wise and two-faction structural balance in finite time respectively. Moreover, the SIOH dynamics capture the co-evolution of interpersonal appraisals and individuals' opinions. Regarding the theoretical contributions, we show that the equilibrium set of the SIH (SIOH resp.) dynamics corresponds to the set of all the possible triad-wise (two-faction resp.) structural balance configurations of the appraisal networks. Moreover, we prove that, for any initial condition, the appraisal networks in the SIH (SIOH resp.) dynamics almost surely achieve triad-wise (two-faction resp.) structural balance in finite time. Numerical studies of the SIH dynamics also imply some insightful take-home messages on whether multilateral relations reduce or exacerbate conflicts.
Submitted 23 December, 2020; v1 submitted 18 December, 2020;
originally announced December 2020.
-
Rethinking the Micro-Foundation of Opinion Dynamics: Rich Consequences of the Weighted-Median Mechanism
Authors:
Wenjun Mei,
Francesco Bullo,
Ge Chen,
Julien Hendrickx,
Florian Dörfler
Abstract:
To identify the main mechanisms underlying complex opinion formation processes in social systems, researchers have long been exploring simple mechanistic mathematical models. Most existing opinion dynamics models are built on a common micro-foundation, i.e., the weighted-averaging opinion update. However, we argue that this universally-adopted mechanism has a non-negligible unrealistic feature, which brings unnecessary difficulties in seeking a proper balance between model complexity and predictive power. In this paper, we propose the weighted-median mechanism as a new micro-foundation of opinion dynamics, which, with minimal assumptions, fundamentally resolves the inherent unrealistic feature of the weighted-averaging mechanism. Derived from the cognitive dissonance theory in psychology, the weighted-median mechanism is supported by online experiment data and broadens the applicability of opinion dynamics models to multiple-choice issues with ordered discrete options. Moreover, the weighted-median mechanism, despite being the simplest in form, captures various non-trivial real-world features of opinion evolution, while some widely-studied averaging-based models fail to.
Submitted 17 December, 2022; v1 submitted 13 September, 2019;
originally announced September 2019.
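To illustrate the contrast drawn in the abstract above, the following is a minimal Python sketch of one synchronous opinion update under weighted averaging versus a weighted median. It is not taken from the paper: the function names, the influence matrix `W`, and the tie-breaking rule (the lower median is returned) are assumptions made for illustration only.

```python
def weighted_average(values, weights):
    """Classical weighted-averaging update for one individual."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total.

    Assumption: ties are resolved by returning the lower median.
    """
    total = sum(weights)
    cumulative = 0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if 2 * cumulative >= total:
            return value
    raise ValueError("weights must be non-negative with a positive sum")


def median_update(opinions, W):
    """One synchronous step: each individual adopts the weighted median
    of all opinions, using its own row of the (hypothetical) matrix W."""
    return [weighted_median(opinions, row) for row in W]


opinions = [0.0, 0.1, 1.0]
W = [[2, 2, 1], [1, 3, 1], [1, 1, 3]]  # illustrative influence weights
new_opinions = median_update(opinions, W)
```

Unlike the average, the median update always returns one of the opinions already present, which is why it remains meaningful for ordered discrete options.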
-
Incentive Design in Peer Review: Rating and Repeated Endogenous Matching
Authors:
Yuanzhang Xiao,
Florian Dörfler,
Mihaela van der Schaar
Abstract:
Peer review (e.g., grading assignments in Massive Open Online Courses (MOOCs), academic paper review) is an effective and scalable method to evaluate the products (e.g., assignments, papers) of a large number of agents when the number of dedicated reviewing experts (e.g., teaching assistants, editors) is limited. Peer review poses two key challenges: 1) identifying the reviewers' intrinsic capabilities (i.e., adverse selection) and 2) incentivizing the reviewers to exert high effort (i.e., moral hazard). Some works in mechanism design address pure adverse selection using one-shot matching rules, and pure moral hazard was addressed in repeated games with exogenously given and fixed matching rules. However, in peer review systems exhibiting both adverse selection and moral hazard, one-shot or exogenous matching rules do not link agents' current behavior with future matches and future payoffs, and as we prove, will induce myopic behavior (i.e., exerting the lowest effort) resulting in the lowest review quality.
In this paper, we propose for the first time a solution that simultaneously solves adverse selection and moral hazard. Our solution exploits the repeated interactions of agents, utilizes ratings to summarize agents' past review quality, and designs matching rules that endogenously depend on agents' ratings. Our proposed matching rules are easy to implement and require no knowledge about agents' private information (e.g., their benefit and cost functions). Yet, they are effective in guiding the system to an equilibrium where the agents are incentivized to exert high effort and receive ratings that precisely reflect their review quality. Using several illustrative examples, we quantify the significant performance gains obtained by our proposed mechanism as compared to existing one-shot or exogenous matching rules.
Submitted 8 November, 2014;
originally announced November 2014.
-
Kron Reduction of Graphs with Applications to Electrical Networks
Authors:
Florian Dorfler,
Francesco Bullo
Abstract:
Consider a weighted and undirected graph, possibly with self-loops, and its corresponding Laplacian matrix, possibly augmented with additional diagonal elements corresponding to the self-loops. The Kron reduction of this graph is again a graph whose Laplacian matrix is obtained by the Schur complement of the original Laplacian matrix with respect to a subset of nodes. The Kron reduction process is ubiquitous in classic circuit theory and in related disciplines such as electrical impedance tomography, smart grid monitoring, transient stability assessment in power networks, or analysis and simulation of induction motors and power electronics. More general applications of Kron reduction occur in sparse matrix algorithms, multi-grid solvers, finite-element analysis, and Markov chains. The Schur complement of a Laplacian matrix and related concepts have also been studied under different names and as purely theoretic problems in the literature on linear algebra. In this paper we propose a general graph-theoretic framework for Kron reduction that leads to novel and deep insights both on the mathematical and the physical side. We show the applicability of our framework to various practical problem setups arising in engineering applications and computation. Furthermore, we provide a comprehensive and detailed graph-theoretic analysis of the Kron reduction process encompassing topological, algebraic, spectral, resistive, and sensitivity analyses. Throughout our theoretic elaborations we especially emphasize the practical applicability of our results.
Submitted 14 February, 2011;
originally announced February 2011.
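The Schur-complement definition of Kron reduction in the abstract above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `kron_reduce` and the example graph are assumptions, and the Schur complement is computed by eliminating one interior node at a time, which is equivalent to eliminating them all at once.

```python
def kron_reduce(laplacian, keep):
    """Return the Schur complement of `laplacian` with respect to the
    nodes NOT listed in `keep` (the eliminated, or interior, nodes)."""
    L = [[float(x) for x in row] for row in laplacian]
    active = list(range(len(L)))
    for k in range(len(L)):
        if k in keep:
            continue
        pivot = L[k][k]
        if pivot == 0.0:
            raise ValueError(f"node {k} cannot be eliminated (zero pivot)")
        active.remove(k)
        # Rank-one update on the remaining nodes:
        # L_ij <- L_ij - L_ik * L_kj / L_kk
        for i in active:
            for j in active:
                L[i][j] -= L[i][k] * L[k][j] / pivot
    return [[L[i][j] for j in keep] for i in keep]


# Path graph 0 - 1 - 2 with unit edge weights; eliminate the middle node.
path_laplacian = [
    [1, -1, 0],
    [-1, 2, -1],
    [0, -1, 1],
]
reduced = kron_reduce(path_laplacian, keep=[0, 2])
```

In this example the reduced matrix is the Laplacian of a single edge of weight 0.5 between nodes 0 and 2, matching the circuit intuition of two unit conductances in series.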