
Exploring Competitive and Collusive Behaviors in Algorithmic Pricing with Deep Reinforcement Learning

Authors: Shidi Deng, Maximilian Schiffer, Martin Bichler

Abstract: Nowadays, a significant share of the business-to-consumer sector is based on online platforms like Amazon and Alibaba and uses AI for pricing strategies. This has sparked debate on whether pricing algorithms may tacitly collude to set supra-competitive prices without being explicitly designed to do so. Our study addresses these concerns by examining the risk of collusion when Reinforcement Learning (RL) algorithms are used to decide on pricing strategies in competitive markets. Prior research in this field focused on Tabular Q-learning (TQL) and led to opposing views on whether learning-based algorithms can result in supra-competitive prices. Building on this, our work contributes to this ongoing discussion by providing a more nuanced numerical study that goes beyond TQL, additionally capturing off- and on-policy Deep Reinforcement Learning (DRL) algorithms, two distinct families of DRL algorithms that recently gained attention for algorithmic pricing. We study multiple Bertrand oligopoly variants and show that algorithmic collusion depends on the algorithm used. In our experiments, we observed that TQL tends to exhibit higher collusion and price dispersion. Moreover, it suffers from instability and disparity, as agents with higher learning rates consistently achieve higher profits, and it lacks robustness in state representation, with pricing dynamics varying significantly based on information access. In contrast, DRL algorithms, such as PPO and DQN, generally converge to lower prices closer to the Nash equilibrium. Additionally, we show that when pre-trained TQL agents interact with DRL agents, the latter quickly outperforms the former, highlighting the advantages of DRL in pricing competition. Lastly, we find that competition between heterogeneous DRL algorithms, such as PPO and DQN, tends to reduce the likelihood of supra-competitive pricing.

Submitted 14 March, 2025; originally announced March 2025.

arXiv:2406.02437 [pdf, other]

Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning

Authors: Shidi Deng, Maximilian Schiffer, Martin Bichler

Abstract: Nowadays, a significant share of the Business-to-Consumer sector is based on online platforms like Amazon and Alibaba and uses Artificial Intelligence for pricing strategies. This has sparked debate on whether pricing algorithms may tacitly collude to set supra-competitive prices without being explicitly designed to do so. Our study addresses these concerns by examining the risk of collusion when Reinforcement Learning algorithms are used to decide on pricing strategies in competitive markets. Prior research in this field focused on Tabular Q-learning (TQL) and led to opposing views on whether learning-based algorithms can lead to supra-competitive prices. Our work contributes to this ongoing discussion by providing a more nuanced numerical study that goes beyond TQL by additionally capturing off- and on-policy Deep Reinforcement Learning (DRL) algorithms. We study multiple Bertrand oligopoly variants and show that algorithmic collusion depends on the algorithm used. In our experiments, TQL exhibits higher collusion and price dispersion phenomena compared to DRL algorithms. We show that the severity of collusion depends not only on the algorithm used but also on the characteristics of the market environment. We further find that Proximal Policy Optimization appears to be less sensitive to collusive outcomes compared to other state-of-the-art DRL algorithms.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2306.17355 [pdf, other]

Recurring Auctions with Costly Entry: Theory and Evidence

Authors: Shanglyu Deng, Qiyao Zhou

Abstract: Recurring auctions are ubiquitous for selling durable assets like artworks and homes, with follow-up auctions held for unsold items. We investigate such auctions theoretically and empirically. Theoretical analysis demonstrates that recurring auctions outperform single-round auctions when buyers face entry costs, enhancing efficiency and revenue due to sorted entry of potential buyers. Optimal reserve price sequences are characterized. Empirical findings from home foreclosure auctions in China reveal significant annual gains in efficiency (3.40 billion USD, 16.60%) and revenue (2.97 billion USD, 15.92%) using recurring auctions compared to single-round auctions. Implementing optimal reserve prices can further improve efficiency (3.35%) and revenue (3.06%).

Submitted 18 February, 2025; v1 submitted 29 June, 2023; originally announced June 2023.

arXiv:2203.03044 [pdf, other]

doi 10.1016/j.jet.2023.105692

Speculation in Procurement Auctions

Authors: Shanglyu Deng

Abstract: A speculator can take advantage of a procurement auction by acquiring items for sale before the auction. The accumulated market power can then be exercised in the auction and may lead to a large enough gain to cover the acquisition costs. I show that speculation always generates a positive expected profit in second-price auctions but could be unprofitable in first-price auctions. In the case where speculation is profitable in first-price auctions, it is more profitable in second-price auctions. This comparison in profitability is driven by different competition patterns in the two auction mechanisms. In terms of welfare, speculation causes private value destruction and harms efficiency. Sellers benefit from the acquisition offer made by the speculator. Therefore, speculation comes at the expense of the auctioneer.

Submitted 16 May, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

arXiv:2012.02394 [pdf, ps, other]

Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics

Authors: Bo Cowgill, Fabrizio Dell'Acqua, Samuel Deng, Daniel Hsu, Nakul Verma, Augustin Chaintreau

Abstract: Why do biased predictions arise? What interventions can prevent them? We evaluate 8.2 million algorithmic predictions of math performance from $\approx$400 AI engineers, each of whom developed an algorithm under a randomly assigned experimental condition. Our treatment arms modified programmers' incentives, training data, awareness, and/or technical knowledge of AI ethics. We then assess out-of-sample predictions from their algorithms using randomized audit manipulations of algorithm inputs and ground-truth math performance for 20K subjects. We find that biased predictions are mostly caused by biased training data. However, one-third of the benefit of better training data comes through a novel economic mechanism: Engineers exert greater effort and are more responsive to incentives when given better training data. We also assess how performance varies with programmers' demographic characteristics, and their performance on a psychological test of implicit bias (IAT) concerning gender and careers. We find no evidence that female, minority and low-IAT engineers exhibit lower bias or discrimination in their code. However, we do find that prediction errors are correlated within demographic groups, which creates performance improvements through cross-demographic averaging. Finally, we quantify the benefits and tradeoffs of practical managerial or policy interventions such as technical advice, simple reminders, and improved incentives for decreasing algorithmic bias.

Submitted 3 December, 2020; originally announced December 2020.

Comments: Part of the Navigating the Broader Impacts of AI Research Workshop at NeurIPS 2020

Showing 1–5 of 5 results for author: Deng, S