MSPM: A Modularized and Scalable Multi-Agent Reinforcement Learning-based System for Financial Portfolio Management

Huang, Zhenhan; Tanaka, Fumihide

doi:10.1371/journal.pone.0263689

arXiv:2102.03502 (q-fin)

[Submitted on 6 Feb 2021 (v1), last revised 19 Feb 2022 (this version, v4)]

Abstract: Financial portfolio management (PM) is one of the most applicable problems in reinforcement learning (RL) owing to its sequential decision-making nature. However, existing RL-based approaches rarely focus on scalability or reusability to adapt to ever-changing markets. These approaches are rigid and cannot scale to accommodate the varying number of assets in portfolios or the increasing need for heterogeneous data. Also, RL agents in existing systems are trained ad hoc and are hardly reusable for different portfolios. To confront these problems, a modular design is desired so that systems are compatible with reusable asset-dedicated agents. In this paper, we propose a multi-agent RL-based system for PM (MSPM). MSPM involves two types of asynchronously updated modules: the Evolving Agent Module (EAM) and the Strategic Agent Module (SAM). An EAM is an information-generating module with a DQN agent; it receives heterogeneous data and generates signal-comprised information for a particular asset. An SAM is a decision-making module with a PPO agent for portfolio optimization; it connects to EAMs to reallocate the assets in a portfolio. Trained EAMs can be connected to any SAM at will. With its modularized architecture, the multi-step condensation of volatile market information, and the reusable design of EAM, MSPM simultaneously addresses the two challenges in RL-based PM: scalability and reusability. Experiments on 8-year U.S. stock market data demonstrate the effectiveness of MSPM in profit accumulation: it outperforms five baselines in terms of accumulated rate of return (ARR), daily rate of return, and Sortino ratio. MSPM improves ARR by at least 186.5% compared to CRP, a widely used PM strategy. To validate the indispensability of EAM, we back-test and compare MSPMs on four portfolios. EAM-enabled MSPMs improve ARR by at least 1341.8% compared to EAM-disabled MSPMs.
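
For readers unfamiliar with the evaluation terms in the abstract, the following is a minimal Python sketch of a constant rebalanced portfolio (CRP) baseline, the accumulated rate of return (ARR), and the Sortino ratio, using their common textbook definitions. It is not taken from the paper: the function names are hypothetical, transaction costs are ignored, the Sortino target return is assumed to be zero, and no annualization is applied. The paper's own definitions may differ in these details.

```python
import math


def crp_portfolio_values(price_relatives, weights, initial_value=1.0):
    """Value path of a constant rebalanced portfolio (CRP).

    price_relatives: one list per period, holding p_t / p_(t-1) for each asset.
    weights: fixed target weights; the portfolio is rebalanced back to them
             every period (assumption: no transaction costs).
    """
    values = [initial_value]
    for relatives in price_relatives:
        growth = sum(w * x for w, x in zip(weights, relatives))
        values.append(values[-1] * growth)
    return values


def accumulated_rate_of_return(values):
    """ARR as final value over initial value, minus one."""
    return values[-1] / values[0] - 1.0


def period_returns(values):
    """Simple per-period returns from a value path."""
    return [b / a - 1.0 for a, b in zip(values, values[1:])]


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by downside deviation below `target`."""
    n = len(returns)
    mean_excess = sum(r - target for r in returns) / n
    downside = math.sqrt(sum(min(0.0, r - target) ** 2 for r in returns) / n)
    if downside == 0.0:
        # No return fell below the target, so the ratio is undefined;
        # returning inf (or 0.0 for a zero mean) is a convention chosen here.
        return math.inf if mean_excess > 0 else 0.0
    return mean_excess / downside


# Two assets, two periods, equal weights (illustrative numbers only).
values = crp_portfolio_values([[1.1, 1.0], [1.0, 0.9]], [0.5, 0.5])
arr = accumulated_rate_of_return(values)
ratio = sortino_ratio(period_returns(values))
```

In a comparison like the one the abstract reports, the same two metric functions would be applied to the value path of each strategy over the same back-test window, so that differences reflect the strategies rather than the metric definitions.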

Subjects:	Portfolio Management (q-fin.PM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Computational Finance (q-fin.CP)
Cite as:	arXiv:2102.03502 [q-fin.PM]
	(or arXiv:2102.03502v4 [q-fin.PM] for this version)
	https://doi.org/10.48550/arXiv.2102.03502
Journal reference:	PLoS ONE 17(2): e0263689 (2022)
Related DOI:	https://doi.org/10.1371/journal.pone.0263689

Submission history

From: Zhenhan Huang [view email]
[v1] Sat, 6 Feb 2021 04:04:57 UTC (7,700 KB)
[v2] Tue, 9 Feb 2021 16:19:01 UTC (6,194 KB)
[v3] Fri, 11 Jun 2021 08:42:30 UTC (5,009 KB)
[v4] Sat, 19 Feb 2022 03:54:41 UTC (6,308 KB)
