Search | arXiv e-print repository

Community Bail Fund Systems: Fluid Limits and Approximations

Abstract: Community bail funds (CBFs) assist individuals who have been arrested and cannot afford bail, preventing unnecessary pretrial incarceration along with its harmful or sometimes fatal consequences. By posting bail, CBFs allow defendants to stay at home and maintain their livelihoods until trial. This paper introduces new stochastic models that combine queueing theory with classic insurance risk models to capture the dynamics of the remaining funds in a CBF. We first analyze a model where all bail requests are accepted. Although the remaining fund balance can go negative, this model provides insight for CBFs that are not financially constrained. We then apply the Skorokhod map to make sure the CBF balance does not go negative and show that the Skorokhod map produces a model where requests are partially fulfilled. Finally, we analyze a model where bail requests can be blocked if there is not enough money to satisfy the request upon arrival. Although the blocking model prevents the CBF from being negative, the blocking feature gives rise to new analytical challenges for a direct stochastic analysis. Thus, we prove a functional law of large numbers or a fluid limit for the blocking model and show that the fluid limit is a distributed delay equation. We assess the quality of our fluid limit via simulation and show that the fluid limit accurately describes the large-scale stochastic dynamics of the CBF. Finally, we prove stochastic ordering results for the CBF processes we analyze.

Submitted 7 July, 2025; originally announced July 2025.

arXiv:2505.16136 [pdf, ps, other]

Interpretable Machine Learning for Macro Alpha: A News Sentiment Case Study

Authors: Yuke Zhang

Abstract: This study introduces an interpretable machine learning (ML) framework to extract macroeconomic alpha from global news sentiment. We process the Global Database of Events, Language, and Tone (GDELT) Project's worldwide news feed using FinBERT -- a Bidirectional Encoder Representations from Transformers (BERT) based model pretrained on finance-specific language -- to construct daily sentiment indices incorporating mean tone, dispersion, and event impact. These indices drive an XGBoost classifier, benchmarked against logistic regression, to predict next-day returns for EUR/USD, USD/JPY, and 10-year U.S. Treasury futures (ZN). Rigorous out-of-sample (OOS) backtesting (5-fold expanding-window cross-validation, OOS period: c. 2017-April 2025) demonstrates exceptional, cost-adjusted performance for the XGBoost strategy: Sharpe ratios achieve 5.87 (EUR/USD), 4.65 (USD/JPY), and 4.65 (Treasuries), with respective compound annual growth rates (CAGRs) exceeding 50% in Foreign Exchange (FX) and 22% in bonds. Shapley Additive Explanations (SHAP) affirm that sentiment dispersion and article impact are key predictive features. Our findings establish that integrating domain-specific Natural Language Processing (NLP) with interpretable ML offers a potent and explainable source of macro alpha.

Submitted 21 May, 2025; originally announced May 2025.

Comments: 18 pages (including references), 1 figure, 1 table. Code available at https://github.com/yukepenn/macro-news-sentiment-trading. Keywords: Macro Sentiment, News Sentiment, Algorithmic Trading, GDELT, FinBERT, NLP, Alternative Data, Foreign Exchange, Treasury Futures, Quantitative Finance, Machine Learning, SHAP, Interpretability

arXiv:2504.20349 [pdf, other]

ClusterLOB: Enhancing Trading Strategies by Clustering Orders in Limit Order Books

Authors: Yichi Zhang, Mihai Cucuringu, Alexander Y. Shestopaloff, Stefan Zohren

Abstract: In the rapidly evolving world of financial markets, understanding the dynamics of limit order book (LOB) is crucial for unraveling market microstructure and participant behavior. We introduce ClusterLOB as a method to cluster individual market events in a stream of market-by-order (MBO) data into different groups. To do so, each market event is augmented with six time-dependent features. By applying the K-means++ clustering algorithm to the resulting order features, we are then able to assign each new order to one of three distinct clusters, which we identify as directional, opportunistic, and market-making participants, each capturing unique trading behaviors. Our experimental results are performed on one year of MBO data containing small-tick, medium-tick, and large-tick stocks from NASDAQ. To validate the usefulness of our clustering, we compute order flow imbalances across each cluster within 30-minute buckets during the trading day. We treat each cluster's imbalance as a signal that provides insights into trading strategies and participants' responses to varying market conditions. To assess the effectiveness of these signals, we identify the trading strategy with the highest Sharpe ratio in the training dataset, and demonstrate that its performance in the test dataset is superior to benchmark trading strategies that do not incorporate clustering.
We also evaluate trading strategies based on order flow imbalance decompositions across different market event types, including add, cancel, and trade events, to assess their robustness in various market conditions. This work establishes a robust framework for clustering market participant behavior, which helps us to better understand market microstructure, and inform the development of more effective predictive trading signals with practical applications in algorithmic trading and quantitative finance.

Submitted 9 May, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

arXiv:2504.17468 [pdf, other]

Optimal design of reinsurance contracts under adverse selection with a continuum of types

Authors: Ka Chun Cheung, Sheung Chi Phillip Yam, Fei Lung Yuen, Yiying Zhang

Abstract: In this paper, we use the principal-agent model to study the optimal contract design in a monopolistic reinsurance market under adverse selection with a continuum of types of insurers. Instead of adopting the classical expected utility paradigm, we model the risk preference of each insurer (agent) by his Value-at-Risk at his own chosen risk tolerance level. Under information asymmetry, the reinsurer (principal) aims to maximize her expected profit by designing an optimal menu of reinsurance contracts for a continuum of insurers with hidden characteristics. The optimization problem is constrained by agents' individual compatibility and rationality constraints. By making use of the notion of indirect utility functions, the problem is completely solved for the following three commonly encountered classes of reinsurance indemnities: stop-loss, quota-share, and change loss. Some numerical examples are provided as illustrations.

Submitted 24 April, 2025; originally announced April 2025.

Comments: 40 pages, 2 figures

arXiv:2504.15809 [pdf, other]

A Line Graph-Based Framework for Identifying Optimal Routing Paths in Decentralized Exchanges

Authors: Yu Zhang, Yafei Li, Claudio Tessone

Abstract: Decentralized exchanges, such as those employing constant product market makers (CPMMs) like Uniswap V2, play a crucial role in the blockchain ecosystem by enabling peer-to-peer token swaps without intermediaries. Despite the increasing volume of transactions, there remains limited research on identifying optimal trading paths across multiple DEXs. This paper presents a novel line-graph-based algorithm (LG) designed to efficiently discover profitable trading routes within DEX environments. We benchmark LG against the widely adopted Depth-First Search (DFS) algorithm under a linear routing scenario, encompassing platforms such as Uniswap, SushiSwap, and PancakeSwap. Experimental results demonstrate that LG consistently identifies trading paths that are as profitable as, or more profitable than, those found by DFS, while incurring comparable gas costs. Evaluations on Uniswap V2 token graphs across two temporal snapshots further validate LG's performance. Although LG exhibits exponential runtime growth with respect to graph size in empirical tests, it remains viable for practical, real-world use cases. Our findings underscore the potential of the LG algorithm for industrial adoption, offering tangible benefits to traders and market participants in the DeFi space.

Submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.12771 [pdf, other]

Classification-Based Analysis of Price Pattern Differences Between Cryptocurrencies and Stocks

Authors: Yu Zhang, Zelin Wu, Claudio Tessone

Abstract: Cryptocurrencies are digital tokens built on blockchain technology, with thousands actively traded on centralized exchanges (CEXs). Unlike stocks, which are backed by real businesses, cryptocurrencies are recognized as a distinct class of assets by researchers. How do investors treat this new category of asset in trading? Are they similar to stocks as an investment tool for investors? We answer these questions by investigating cryptocurrencies' and stocks' price time series which can reflect investors' attitudes towards the targeted assets. Concretely, we use different machine learning models to classify cryptocurrencies' and stocks' price time series in the same period and get an extremely high accuracy rate, which reflects that cryptocurrency investors behave differently in trading from stock investors. We then extract features from these price time series to explain the price pattern difference, including mean, variance, maximum, minimum, kurtosis, skewness, and first to third-order autocorrelation, etc., and then use machine learning methods including logistic regression (LR), random forest (RF), support vector machine (SVM), etc. for classification. The classification results show that these extracted features can help to explain the price time series pattern difference between cryptocurrencies and stocks.

Submitted 17 April, 2025; originally announced April 2025.

arXiv:2502.19615 [pdf]

A Method for Evaluating the Interpretability of Machine Learning Models in Predicting Bond Default Risk Based on LIME and SHAP

Authors: Yan Zhang, Lin Chen, Yixiang Tian

Abstract: Interpretability analysis methods for artificial intelligence models, such as LIME and SHAP, are widely used, though they primarily serve as post-model tools for analyzing model outputs. While it is commonly believed that the transparency and interpretability of AI models diminish as their complexity increases, currently there is no standardized method for assessing the inherent interpretability of the models themselves. This paper uses bond market default prediction as a case study, applying commonly used machine learning algorithms within AI models. First, the classification performance of these algorithms in default prediction is evaluated. Then, leveraging LIME and SHAP to assess the contribution of sample features to prediction outcomes, the paper proposes a novel method for evaluating the interpretability of the models themselves. The results of this analysis are consistent with the intuitive understanding and logical expectations regarding the interpretability of these models.

Submitted 26 February, 2025; originally announced February 2025.

Comments: 12 pages, 9 figures

ACM Class: F.2.2

arXiv:2411.19436 [pdf, ps, other]

Self-protection and insurance demand with convex premium principles

Authors: Qiqi Li, Wei Wang, Yiying Zhang

Abstract: In economic analysis, rational decision-makers often take actions to reduce their risk exposure. These actions include purchasing market insurance and implementing prevention measures to modify the shape of the loss distribution. Under the assumption that the insureds' actions are fully observed by the insurer, this paper investigates the interaction between self-protection and insurance demand when insurance premiums are determined by convex premium principles within the framework of distortion risk measures. Specifically, the insured selects an optimal proportional insurance share and prevention effort to minimize the risk measure of their end-of-period exposure. We explicitly characterize the optimal combination of prevention effort and insurance demand in a self-protection model when the insured adopts tail value-at-risk or a subclass with strictly concave distortion functions. Additionally, we conduct comparative static analyses to illustrate our main findings under various premium structures, risk aversion levels, and loss distributions. Our results indicate that market insurance and self-protection are complementary, supporting classical insights from the literature regarding corner insurance policies (i.e., null and full insurance) in the absence of ex ante moral hazard. Finally, we consider the effects of moral hazard on the interaction between self-protection and insurance demand. Our findings show that ex ante moral hazard shifts the complementary effect into a substitution effect.

Submitted 20 February, 2025; v1 submitted 28 November, 2024; originally announced November 2024.

arXiv:2411.13384 [pdf, other]

doi 10.1017/S026996482500004X

On multivariate contribution measures of systemic risk with applications in cryptocurrency market

Authors: Limin Wen, Junxue Li, Tong Pu, Yiying Zhang

Abstract: Conditional risk measures and their associated risk contribution measures are commonly employed in finance and actuarial science for evaluating systemic risk and quantifying the effects of risk interactions. This paper introduces various types of contribution ratio measures based on the MCoVaR, MCoES, and MMME studied in Ortega-Jiménez et al. (2021) and Das & Fasen-Hartmann (2018) to assess the relative effects of a single risk when other risks in a group are in distress. The properties of these contribution risk measures are examined, and sufficient conditions for comparing these measures between two sets of random vectors are established using univariate and multivariate stochastic orders and statistically dependent notions. Numerical examples are presented to validate these conditions. Finally, a real dataset from the cryptocurrency market is used to analyze the spillover effects through our proposed contribution measures.

Submitted 3 March, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

arXiv:2411.09676 [pdf, other]

On Vulnerability Conditional Risk Measures: Comparisons and Applications in Cryptocurrency Market

Authors: Tong Pu, Yunran Wei, Yiying Zhang

Abstract: We introduce a novel class of systemic risk measures, the Vulnerability Conditional risk measures, which try to capture the "tail risk" of a risky position in scenarios where one or more market participants is experiencing financial distress. Various theoretical properties of Vulnerability Conditional risk measures, along with a series of related contribution measures, have been considered in this paper. We further introduce the backtesting procedures of VCoES and MCoES. Through numerical examples, we validate our theoretical insights and further apply our newly proposed risk measures to the empirical analysis of cryptocurrencies, demonstrating their practical relevance and utility in capturing systemic risk.

Submitted 14 November, 2024; originally announced November 2024.

arXiv:2411.09657 [pdf, other]

Asymptotics of Sum of Heavy-tailed Risks with Copulas

Authors: Fan Yang, Yi Zhang

Abstract: We study the tail asymptotics of the sum of two heavy-tailed random variables. The dependence structure is modeled by copulas with the so-called tail order property. Examples are presented to illustrate the approach. Further for each example we apply the main results to obtain the asymptotic expansions for Value-at-Risk of aggregate risk.

Submitted 14 November, 2024; originally announced November 2024.

arXiv:2410.19291 [pdf, other]

A Stock Price Prediction Approach Based on Time Series Decomposition and Multi-Scale CNN using OHLCT Images

Authors: Zhiyuan Pei, Jianqi Yan, Jin Yan, Bailing Yang, Ziyuan Li, Lin Zhang, Xin Liu, Yang Zhang

Abstract: Recently, deep learning in stock prediction has become an important branch. Image-based methods show potential by capturing complex visual patterns and spatial correlations, offering advantages in interpretability over time series models. However, image-based approaches are more prone to overfitting, hindering robust predictive performance. To improve accuracy, this paper proposes a novel method, named Sequence-based Multi-scale Fusion Regression Convolutional Neural Network (SMSFR-CNN), for predicting stock price movements in the China A-share market. By utilizing CNN to learn sequential features and combining them with image features, we improve the accuracy of stock trend prediction on the A-share market stock dataset. This approach reduces the search space for image features, stabilizes, and accelerates the training process. Extensive comparative experiments on 4,454 A-share stocks show that the model achieves a 61.15% positive predictive value and a 63.37% negative predictive value for the next 5 days, resulting in a total profit of 165.09%.

Submitted 29 October, 2024; v1 submitted 24 October, 2024; originally announced October 2024.

Comments: 32 pages, 5 figures, 5 tables

arXiv:2410.14059 [pdf, other]

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Authors: Yuzhe Yang, Yifei Zhang, Yan Hu, Yilin Guo, Ruoli Gan, Yueru He, Mingcong Lei, Xiao Zhang, Haining Wang, Qianqian Xie, Jimin Huang, Honghai Yu, Benyou Wang

Abstract: This paper introduces the UCFE: User-Centric Financial Expertise benchmark, an innovative framework designed to evaluate the ability of large language models (LLMs) to handle complex real-world financial tasks. The UCFE benchmark adopts a hybrid approach that combines human expert evaluations with dynamic, task-specific interactions to simulate the complexities of evolving financial scenarios. Firstly, we conducted a user study involving 804 participants, collecting their feedback on financial tasks. Secondly, based on this feedback, we created our dataset that encompasses a wide range of user intents and interactions. This dataset serves as the foundation for benchmarking 11 LLM services using the LLM-as-Judge methodology. Our results show a significant alignment between benchmark scores and human preferences, with a Pearson correlation coefficient of 0.78, confirming the effectiveness of the UCFE dataset and our evaluation approach. The UCFE benchmark not only reveals the potential of LLMs in the financial domain but also provides a robust framework for assessing their performance and user satisfaction.

Submitted 7 February, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

arXiv:2409.13957 [pdf]

The Impact of Implicit Government Guarantee on Credit Rating of Municipal Investment Bonds

Authors: Yan Zhang, Yixiang Tian, Lin Chen

Abstract: One type of bond with the most implicit government guarantee is municipal investment bonds. In recent years, there have been an increasing number of downgrades in the credit ratings of municipal bonds, which has led some people to question whether the implicit government guarantee may affect the objectivity of the bond ratings. This paper uses text mining methods to mine relevant policy documents related to municipal investment bond issuance, and calculates the implicit guarantee strength of municipal investment bonds based on the PMC index model. It further analyzes the impact of the implicit guarantee strength of municipal bonds on their credit evaluation. The study found that the implicit government guarantee on municipal investment bonds does indeed help to raise the credit ratings assigned by credit rating agencies. Moreover, the study found that the government's implicit guarantee has a more pronounced effect in boosting credit ratings in less developed western regions.

Submitted 20 September, 2024; originally announced September 2024.

Comments: 16 pages, 1 figure

arXiv:2409.12831 [pdf]

Implicit Government Guarantee Measurement Based on PMC Index Model

Authors: Yan Zhang, Yixiang Tian, Lin Chen, Qi Wang

Abstract: The implicit government guarantee hampers the recognition and management of risks by all stakeholders in the bond market, and it has led to excessive debt for local governments or state-owned enterprises. To prevent the risk of local government debt defaults and reduce investors' expectations of implicit government guarantees, various regulatory departments have issued a series of policy documents related to municipal investment bonds. By employing text mining techniques on policy documents related to municipal investment bonds, and utilizing the PMC index model to assess the effectiveness of policy documents, this paper proposes a novel method for quantifying the intensity of implicit governmental guarantees based on the PMC index model. The intensity of implicit governmental guarantees is inversely correlated with the PMC index of policies aimed at de-implicitizing governmental guarantees. Thus, as these policies become more effective, the intensity of implicit governmental guarantees diminishes correspondingly. These findings indicate that recent policies related to municipal investment bonds have indeed succeeded in reducing implicit governmental guarantee intensity, and these policies have achieved the goal of risk management. Furthermore, it was shown that the intensity of implicit governmental guarantees is affected by diverse aspects of these policies such as effectiveness, clarity, and specificity, as well as incentive and assurance mechanisms.

Submitted 19 September, 2024; originally announced September 2024.

Comments: 22 pages, 6 figures

arXiv:2409.11569 [pdf, ps, other]

Optimal Investment with Costly Expert Opinions

Authors: Christoph Knochenhauer, Alexander Merkel, Yufei Zhang

Abstract: We consider the Merton problem of optimizing expected power utility of terminal wealth in the case of an unobservable Markov-modulated drift. What makes the model special is that the agent is allowed to purchase costly expert opinions of varying quality on the current state of the drift, leading to a mixed stochastic control problem with regular and impulse controls involving random consequences. Using ideas from filtering theory, we first embed the original problem with unobservable drift into a full information problem on a larger state space. The value function of the full information problem is characterized as the unique viscosity solution of the dynamic programming PDE. This characterization is achieved by a new variant of the stochastic Perron's method, which additionally allows us to show that, in between purchases of expert opinions, the problem reduces to an exit time control problem which is known to admit an optimal feedback control. Under the assumption of sufficient regularity of this feedback map, we are able to construct optimal trading and expert opinion strategies.

Submitted 17 September, 2024; originally announced September 2024.

arXiv:2409.04233 [pdf, other]

Pricing and hedging of decentralised lending contracts

Authors: Lukasz Szpruch, Marc Sabaté Vidales, Tanut Treetanthiploet, Yufei Zhang

Abstract: We study the loan contracts offered by decentralised loan protocols (DLPs) through the lens of financial derivatives. DLPs, which effectively are clearinghouses, facilitate transactions between option buyers (i.e. borrowers) and option sellers (i.e. lenders). The loan-to-value at which the contract is initiated determines the option premium borrowers pay for entering the contract, and this can be deduced from the non-arbitrage pricing theory. We show that when there are no market frictions, and there is no spread between lending and borrowing rates, it is optimal to never enter the lending contract. Next, by accounting for the spread between rates and transactional costs, we develop a deep neural network-based algorithm for learning trading strategies on the external markets that allow us to replicate the payoff of the lending contracts that are not necessarily optimally exercised. This allows hedging the risk lenders carry by issuing options sold to the borrowers, which can complement (or even replace) the liquidation mechanism used to protect lenders' capital. Our approach can also be used to exploit (statistical) arbitrage opportunities that may arise when DLPs allow users to enter lending contracts with a loan-to-value that is not appropriately calibrated to market conditions and/or when different markets price risk differently. We present thorough simulation experiments using historical data and simulations to validate our approach.

Submitted 6 September, 2024; originally announced September 2024.

arXiv:2409.01908 [pdf, other]

Bayesian CART models for aggregate claim modeling

Authors: Yaojun Zhang, Lanpeng Ji, Georgios Aivaliotis, Charles C. Taylor

Abstract: This paper proposes three types of Bayesian CART (or BCART) models for aggregate claim amount, namely, frequency-severity models, sequential models and joint models. We propose a general framework for the BCART models applicable to data with multivariate responses, which is particularly useful for the joint BCART models with a bivariate response: the number of claims and aggregate claim amount. To facilitate frequency-severity modeling, we investigate BCART models for the right-skewed and heavy-tailed claim severity data by using various distributions. We discover that the Weibull distribution is superior to gamma and lognormal distributions, due to its ability to capture different tail characteristics in tree models. Additionally, we find that sequential BCART models and joint BCART models, which incorporate dependence between the number of claims and average severity, are beneficial and thus preferable to the frequency-severity BCART models in which independence is assumed. The effectiveness of these models' performance is illustrated by carefully designed simulations and real insurance data.

Submitted 3 September, 2024; originally announced September 2024.

arXiv:2408.11878 [pdf, ps, other]

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Authors: Jimin Huang, Mengxi Xiao, Dong Li, Zihao Jiang, Yuzhe Yang, Yifei Zhang, Lingfei Qian, Yan Wang, Xueqing Peng, Yang Ren, Ruoyu Xiang, Zhengyu Chen, Xiao Zhang, Yueru He, Weiguang Han, Shunian Chen, Lihang Shen, Daniel Kim, Yangyang Yu, Yupeng Cao, Zhiyang Deng, Haohang Li, Duanyu Feng, Yongfu Dai, VijayaSai Somasundaram, et al. (19 additional authors not shown)

Abstract: Financial LLMs hold promise for advancing financial tasks and domain-specific applications. However, they are limited by scarce corpora, weak multimodal capabilities, and narrow evaluations, making them less suited for real-world application. To address this, we introduce Open-FinLLMs, the first open-source multimodal financial LLMs designed to handle diverse tasks across text, tabular, time-series, and chart data, excelling in zero-shot, few-shot, and fine-tuning settings. The suite includes FinLLaMA, pre-trained on a comprehensive 52-billion-token corpus; FinLLaMA-Instruct, fine-tuned with 573K financial instructions; and FinLLaVA, enhanced with 1.43M multimodal tuning pairs for strong cross-modal reasoning. We comprehensively evaluate Open-FinLLMs across 14 financial tasks, 30 datasets, and 4 multimodal tasks in zero-shot, few-shot, and supervised fine-tuning settings, introducing two new multimodal evaluation datasets. Our results show that Open-FinLLMs outperforms advanced financial and general LLMs such as GPT-4 across financial NLP, decision-making, and multi-modal tasks, highlighting their potential to tackle real-world challenges. To foster innovation and collaboration across academia and industry, we release all codes (https://anonymous.4open.science/r/PIXIU2-0D70/B1D7/LICENSE) and models under OSI-approved licenses.

Submitted 6 June, 2025; v1 submitted 20 August, 2024; originally announced August 2024.

Comments: 33 pages, 13 figures

arXiv:2407.18957 [pdf, other]

When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments

Authors: Chong Zhang, Xinyi Liu, Zhongmou Zhang, Mingyu Jin, Lingyao Li, Zhenting Wang, Wenyue Hua, Dong Shu, Suiyuan Zhu, Xiaobo Jin, Sujian Li, Mengnan Du, Yongfeng Zhang

Abstract: Can AI Agents simulate real-world trading environments to investigate the impact of external factors on stock trading activities (e.g., macroeconomics, policy changes, company fundamentals, and global events)? These factors, which frequently influence trading behaviors, are critical elements in the quest for maximizing investors' profits. Our work attempts to solve this problem through large language model based agents. We have developed a multi-agent AI system called StockAgent, driven by LLMs, designed to simulate investors' trading behaviors in response to the real stock market. The StockAgent allows users to evaluate the impact of different external factors on investor trading and to analyze trading behavior and profitability effects. Additionally, StockAgent avoids the test set leakage issue present in existing trading simulation systems based on AI Agents. Specifically, it prevents the model from leveraging prior knowledge it may have acquired related to the test data. We evaluate different LLMs under the framework of StockAgent in a stock trading environment that closely resembles real-world conditions. The experimental results demonstrate the impact of key external factors on stock market trading, including trading behavior and stock price fluctuation rules. This research explores the study of agents' free trading gaps in the context of no prior knowledge related to market data. The patterns identified through StockAgent simulations provide valuable insights for LLM-based investment advice and stock recommendation. The code is available at https://github.com/MingyuJ666/Stockagent.

Submitted 20 September, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

Comments: 33 pages, 10 figures

arXiv:2406.16600 [pdf, other]

Profit Maximization In Arbitrage Loops

Authors: Yu Zhang, Zichen Li, Tao Yan, Qianyu Liu, Nicolo Vallarano, Claudio Tessone

Abstract: Cyclic arbitrage opportunities exist abundantly among decentralized exchanges (DEXs), like Uniswap V2. For an arbitrage cycle (loop), researchers or practitioners usually choose a specific token, such as Ether, as input, and optimize their input amount to get the maximal net amount of that token as arbitrage profit. By considering the tokens' prices from CEXs in this paper, the new arbitrage profit, called monetized arbitrage profit, will be quantified as the product of the net amount of a specific token obtained from the arbitrage loop and its corresponding price in CEXs. Based on this concept, we put forward three different strategies to maximize the monetized arbitrage profit for each arbitrage loop. The first strategy is called the MaxPrice strategy. Under this strategy, arbitrageurs start arbitrage only from the token with the highest CEX price. The second strategy is called the MaxMax strategy. Under this strategy, we calculate the monetized arbitrage profit for each token as input in turn in the arbitrage loop. Then, we pick the maximal monetized arbitrage profit among them as the monetized arbitrage profit of the MaxMax strategy. The third one is called the Convex Optimization strategy. By mapping the MaxMax strategy to a convex optimization problem, we proved that the Convex Optimization strategy could get more profit in theory than the MaxMax strategy, which is confirmed in a given example. We also proved that if no arbitrage profit exists according to the MaxMax strategy, then the Convex Optimization strategy cannot detect any arbitrage profit, either. However, the empirical data analysis indicates that the profitability of the Convex Optimization strategy is almost equal to that of the MaxMax strategy, and the MaxPrice strategy is not reliable in getting the maximal monetized arbitrage profit compared to the MaxMax strategy.

Submitted 24 June, 2024; originally announced June 2024.
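
The MaxMax strategy described in the abstract above can be illustrated with a minimal sketch. It assumes Uniswap V2-style constant-product pools with a 0.3% fee charged on the input amount; the reserves, CEX prices, function names, and the search bound are all hypothetical, and the ternary search is a stand-in for whatever optimizer the authors use. The search relies on the net profit being concave in the input amount, which holds because each swap is an increasing concave function of its input.

```python
def swap_out(amount_in, reserve_in, reserve_out, fee=0.003):
    """Output of a constant-product swap (Uniswap V2 style, fee on input)."""
    effective_in = amount_in * (1.0 - fee)
    return effective_in * reserve_out / (reserve_in + effective_in)


def loop_profit(amount_in, pools):
    """Net gain in the starting token after trading once around the loop."""
    amount = amount_in
    for reserve_in, reserve_out in pools:
        amount = swap_out(amount, reserve_in, reserve_out)
    return amount - amount_in


def best_profit(pools, upper):
    """Maximize the concave profit over [0, upper] by ternary search.

    `upper` is an assumed search bound, not something the abstract specifies.
    """
    lo, hi = 0.0, upper
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loop_profit(m1, pools) < loop_profit(m2, pools):
            lo = m1
        else:
            hi = m2
    return loop_profit((lo + hi) / 2.0, pools)


def monetized_profits(tokens, pools, cex_price):
    """Monetized profit for each possible starting token of the loop.

    pools[i] holds (reserve of tokens[i], reserve of tokens[(i + 1) % n]).
    """
    result = {}
    for start, token in enumerate(tokens):
        rotated = pools[start:] + pools[:start]
        upper = rotated[0][0]  # hypothetical bound: reserve of the start token
        result[token] = best_profit(rotated, upper) * cex_price[token]
    return result


def maxmax(tokens, pools, cex_price):
    """Pick the starting token with the largest monetized profit."""
    profits = monetized_profits(tokens, pools, cex_price)
    best_token = max(profits, key=profits.get)
    return best_token, profits[best_token]


# Hypothetical loop A -> B -> C -> A with mispriced pools.
tokens = ["A", "B", "C"]
pools = [(100.0, 200.0), (200.0, 200.0), (200.0, 150.0)]
cex_price = {"A": 10.0, "B": 5.0, "C": 6.0}
best_token, best_value = maxmax(tokens, pools, cex_price)
```

The MaxPrice strategy corresponds to evaluating only the token with the highest CEX price, so comparing `monetized_profits(...)[token]` for that token against `maxmax(...)` shows how much the restriction can cost on a given loop.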

arXiv:2406.16573 [pdf, other]

An Improved Algorithm to Identify More Arbitrage Opportunities on Decentralized Exchanges

Authors: Yu Zhang, Tao Yan, Jianhong Lin, Benjamin Kraner, Claudio Tessone

Abstract: In decentralized exchanges (DEXs), the arbitrage paths exist abundantly in the form of both arbitrage loops (e.g. the arbitrage path starts from token A and back to token A again in the end, A, B,..., A) and non-loops (e.g. the arbitrage path starts from token A and stops at a different token N, A, B,..., N). The Moore-Bellman-Ford algorithm, often coupled with the "walk to the root" technique, is commonly employed for detecting arbitrage loops in the token graph of DEXs such as Uniswap. However, a limitation of this algorithm is its ability to recognize only a limited number of arbitrage loops in each run. Additionally, it cannot specify the starting token of the detected arbitrage loops, further constraining its effectiveness in certain scenarios. Another limitation of this algorithm is its incapacity to detect non-loop arbitrage paths between any specified pairs of tokens. In this paper, we develop a new method to solve these problems by combining the line graph and a modified Moore-Bellman-Ford algorithm (MMBF). This method can help to find more arbitrage loops by detecting at least one arbitrage loop starting from any specified tokens in the DEXs and can detect the non-loop arbitrage paths between any pair of tokens. Then, we applied our algorithm to Uniswap V2 and indeed found more arbitrage loops and non-loops than with the Moore-Bellman-Ford (MBF) combined algorithm. The arbitrage profit found by our method in some arbitrage paths can be as high as one million dollars, far larger than that found by the MBF combined algorithm. Finally, we statistically compare the distribution of arbitrage path lengths and the arbitrage profit detected by both our method and the MBF combined algorithm, and depict how potential arbitrage opportunities change with time by our method.

Submitted 24 June, 2024; originally announced June 2024.
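
For readers unfamiliar with the baseline mentioned in the abstract above, the following is a minimal sketch of Bellman-Ford-based arbitrage loop detection. It is the textbook technique, not the paper's MMBF/line-graph method. It assumes each edge carries a marginal exchange rate (fees and price impact ignored), uses weights of -log(rate) so that a loop with rate product above 1 becomes a negative cycle, and recovers the loop by walking predecessor links. The function name and the example rates are hypothetical.

```python
import math


def find_arbitrage_loop(tokens, rates):
    """Return one arbitrage loop as a token list (first == last), or None.

    rates maps (src, dst) -> units of dst received per unit of src.
    A loop is profitable when the product of its rates exceeds 1, which is
    equivalent to the sum of -log(rate) along the loop being negative.
    """
    edges = [(u, v, -math.log(r)) for (u, v), r in rates.items()]
    # Distance 0 everywhere acts like a virtual source linked to every token,
    # so negative cycles anywhere in the graph can be detected.
    dist = {t: 0.0 for t in tokens}
    pred = {t: None for t in tokens}

    last_relaxed = None
    for _ in range(len(tokens)):
        last_relaxed = None
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                pred[v] = u
                last_relaxed = v
        if last_relaxed is None:
            return None  # distances converged, so no negative cycle exists

    # A relaxation in the final pass implies a negative cycle. Stepping back
    # len(tokens) times guarantees we stand on a node inside that cycle.
    node = last_relaxed
    for _ in range(len(tokens)):
        node = pred[node]

    # Collect the cycle by following predecessors, then flip to trade order.
    loop = [node]
    cur = pred[node]
    while cur != node:
        loop.append(cur)
        cur = pred[cur]
    loop.append(node)
    loop.reverse()
    return loop


# Hypothetical rates: A -> B -> C -> A multiplies holdings by 2 * 1 * 0.6 = 1.2.
example_rates = {("A", "B"): 2.0, ("B", "C"): 1.0, ("C", "A"): 0.6}
example_loop = find_arbitrage_loop(["A", "B", "C"], example_rates)
```

The sketch also shows the limitations the abstract points out: a run returns a single loop, and the caller has no control over which token that loop starts from.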

arXiv:2405.11444 [pdf, other]

Adaptive Optimal Market Making Strategies with Inventory Liquidation Cos

Authors: Jonathan Chávez-Casillas, José E. Figueroa-López, Chuyi Yu, Yi Zhang

Abstract: A novel high-frequency market-making approach in discrete time is proposed that admits closed-form solutions. By taking advantage of demand functions that are linear in the quoted bid and ask spreads with random coefficients, we model the variability of the partial filling of limit orders posted in a limit order book (LOB). As a result, we uncover new patterns as to how the demand's randomness affects the optimal placement strategy. We also allow the price process to follow general dynamics without any Brownian or martingale assumption as is commonly adopted in the literature. The most important feature of our optimal placement strategy is that it can react or adapt to the behavior of market orders online. Using LOB data, we train our model and reproduce the anticipated final profit and loss of the optimal strategy on a given testing date using the actual flow of orders in the LOB. Our adaptive optimal strategies outperform the non-adaptive strategy and those that quote limit orders at a fixed distance from the midprice.

Submitted 19 May, 2024; originally announced May 2024.

Comments: A preprint of this paper was distributed under the title of "Market Making with Stochastic Liquidity Demand: Simultaneous Order Arrival and Price Change Forecasts". The present paper extends the results in the referred preprint, which will remain as an unpublished manuscript

arXiv:2405.08047 [pdf, other]

Autonomous Sparse Mean-CVaR Portfolio Optimization

Authors: Yizun Lin, Yangyu Zhang, Zhao-Rong Lai, Cheng Li

Abstract: The $\ell_0$-constrained mean-CVaR model poses a significant challenge due to its NP-hard nature, typically tackled through combinatorial methods characterized by high computational demands. From a markedly different perspective, we propose an innovative autonomous sparse mean-CVaR portfolio model, capable of approximating the original $\ell_0$-constrained mean-CVaR model with arbitrary accuracy. The core idea is to convert the $\ell_0$ constraint into an indicator function and subsequently handle it through a tailed approximation. We then propose a proximal alternating linearized minimization algorithm, coupled with a nested fixed-point proximity algorithm (both convergent), to iteratively solve the model. Autonomy in sparsity refers to retaining a significant portion of assets within the selected asset pool during adjustments in pool size. Consequently, our framework offers a theoretically guaranteed approximation of the $\ell_0$-constrained mean-CVaR model, improving computational efficiency while providing a robust asset selection scheme.

Submitted 13 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2405.07549 [pdf, other]

On Joint Marginal Expected Shortfall and Associated Contribution Risk Measures

Authors: Tong Pu, Yifei Zhang, Yiying Zhang

Abstract: Systemic risk is the risk that a company- or industry-level risk could trigger a huge collapse of another or even the whole institution. Various systemic risk measures have been proposed in the literature to quantify the domino and (relative) spillover effects induced by systemic risks such as the well-known CoVaR, CoES, MES and CoD risk measures, and associated contribution measures. This paper proposes another new type of systemic risk measure, called the joint marginal expected shortfall (JMES), to measure whether the MES of one entity's risk-taking adds to another one or the overall risk conditioned on the event that the entity is already in some specified distress level. We further introduce two useful systemic risk contribution measures based on the difference function or relative ratio function of the JMES and the conventional ES, respectively. Some basic properties of these proposed measures are studied such as monotonicity, comonotonic additivity, non-identifiability and non-elicitability. For both risk measures and two different vectors of bivariate risks, we establish sufficient conditions imposed on copula structure, stress levels, and stochastic orders to compare these new measures. We further provide some numerical examples to illustrate our main findings. A real application in analyzing the risk contagion among several stock market indices is implemented to show the performances of our proposed measures compared with other commonly used measures including CoVaR, CoES, MES, and their associated contribution measures.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.03624 [pdf, ps, other]

$ε$-Policy Gradient for Online Pricing

Authors: Lukasz Szpruch, Tanut Treetanthiploet, Yufei Zhang

Abstract: Combining model-based and model-free reinforcement learning approaches, this paper proposes and analyzes an $ε$-policy gradient algorithm for the online pricing learning task. The algorithm extends the $ε$-greedy algorithm by replacing greedy exploitation with a gradient descent step and facilitates learning via model inference. We optimize the regret of the proposed algorithm by quantifying the exploration cost in terms of the exploration probability $ε$ and the exploitation cost in terms of the gradient descent optimization and gradient estimation errors. The algorithm achieves an expected regret of order $\mathcal{O}(\sqrt{T})$ (up to a logarithmic factor) over $T$ trials.

Submitted 6 May, 2024; originally announced May 2024.

MSC Class: 62J12; 68Q32; 65Y20

arXiv:2403.09532 [pdf, other]

Robust SGLD algorithm for solving non-convex distributionally robust optimisation problems

Authors: Ariel Neufeld, Matthew Ng Cheng En, Ying Zhang

Abstract: In this paper we develop a Stochastic Gradient Langevin Dynamics (SGLD) algorithm tailored for solving a certain class of non-convex distributionally robust optimisation (DRO) problems. By deriving non-asymptotic convergence bounds, we build an algorithm which for any prescribed accuracy $\varepsilon>0$ outputs an estimator whose expected excess risk is at most $\varepsilon$. As a concrete application, we consider the problem of identifying the best non-linear estimator of a given regression model involving a neural network using adversarially corrupted samples. We formulate this problem as a DRO problem and demonstrate both theoretically and numerically the applicability of the proposed robust SGLD algorithm. Moreover, numerical experiments show that the robust SGLD estimator outperforms the estimator obtained using vanilla SGLD in terms of test accuracy, which highlights the advantage of incorporating model uncertainty when optimising with perturbed samples.

Submitted 13 March, 2025; v1 submitted 14 March, 2024; originally announced March 2024.

arXiv:2402.11231 [pdf]

Enhancing Security in Blockchain Networks: Anomalies, Frauds, and Advanced Detection Techniques

Authors: Joerg Osterrieder, Stephen Chan, Jeffrey Chu, Yuanyuan Zhang, Branka Hadji Misheva, Codruta Mare

Abstract: Blockchain technology, a foundational distributed ledger system, enables secure and transparent multi-party transactions. Despite its advantages, blockchain networks are susceptible to anomalies and frauds, posing significant risks to their integrity and security. This paper offers a detailed examination of blockchain's key definitions and properties, alongside a thorough analysis of the various anomalies and frauds that undermine these networks. It describes an array of detection and prevention strategies, encompassing statistical and machine learning methods, game-theoretic solutions, digital forensics, reputation-based systems, and comprehensive risk assessment techniques. Through case studies, we explore practical applications of anomaly and fraud detection in blockchain networks, extracting valuable insights and implications for both current practice and future research. Moreover, we spotlight emerging trends and challenges within the field, proposing directions for future investigation and technological development. Aimed at both practitioners and researchers, this paper seeks to provide a technical, in-depth overview of anomaly and fraud detection within blockchain networks, marking a significant step forward in the search for enhanced network security and reliability.

Submitted 17 February, 2024; originally announced February 2024.

arXiv:2401.04702 [pdf, other]

Scaling Laws And Statistical Properties of The Transaction Flows And Holding Times of Bitcoin

Authors: Didier Sornette, Yu Zhang

Abstract: We study the temporal evolution of the holding-time distribution of bitcoins and find that the average distribution of holding-time is a heavy-tailed power law extending from one day to over at least $200$ weeks with an exponent approximately equal to $0.9$, indicating very long memory effects. We also report significant sample-to-sample variations of the distribution of holding times, which can be best characterized as multiscaling, with power-law exponents varying between $0.3$ and $2.5$ depending on bitcoin price regimes. We document significant differences between the distributions of book-to-market and of realized returns, showing that traders obtain far from optimal performance. We also report strong direct qualitative and quantitative evidence of the disposition effect in the Bitcoin Blockchain data. Defining age-dependent transaction flows as the fraction of bitcoins that are traded at a given time and that were born (last traded) at some specific earlier time, we document that the time-averaged transaction flow fraction has a power law dependence as a function of age, with an exponent close to $-1.5$, a value compatible with priority queuing theory. We document the existence of multifractality on the measure defined as the normalized number of bitcoins exchanged at a given time.

Submitted 9 January, 2024; originally announced January 2024.

arXiv:2309.08800 [pdf, other]

Dynamic Time Warping for Lead-Lag Relationships in Lagged Multi-Factor Models

Authors: Yichi Zhang, Mihai Cucuringu, Alexander Y. Shestopaloff, Stefan Zohren

Abstract: In multivariate time series systems, lead-lag relationships reveal dependencies between time series when they are shifted in time relative to each other. Uncovering such relationships is valuable in downstream tasks, such as control, forecasting, and clustering. By understanding the temporal dependencies between different time series, one can better comprehend the complex interactions and patterns within the system. We develop a cluster-driven methodology based on dynamic time warping for robust detection of lead-lag relationships in lagged multi-factor models. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our algorithm is able to robustly detect lead-lag relationships in financial markets, which can be subsequently leveraged in trading strategies with significant economic benefits.

Submitted 15 September, 2023; originally announced September 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2305.06704

arXiv:2309.02994 [pdf, ps, other]

An Offline Learning Approach to Propagator Models

Authors: Eyal Neuman, Wolfgang Stockinger, Yufei Zhang

Abstract: We consider an offline learning problem for an agent who first estimates an unknown price impact kernel from a static dataset, and then designs strategies to liquidate a risky asset while creating transient price impact. We propose a novel approach for a nonparametric estimation of the propagator from a dataset containing correlated price trajectories, trading signals and metaorders. We quantify the accuracy of the estimated propagator using a metric which depends explicitly on the dataset. We show that a trader who tries to minimise her execution costs by using a greedy strategy purely based on the estimated propagator will encounter suboptimality due to so-called spurious correlation between the trading strategy and the estimator and due to intrinsic uncertainty resulting from a biased cost functional. By adopting an offline reinforcement learning approach, we introduce a pessimistic loss functional taking the uncertainty of the estimated propagator into account, with an optimiser which eliminates the spurious correlation, and derive an asymptotically optimal bound on the execution costs even without precise information on the true propagator. Numerical experiments are included to demonstrate the effectiveness of the proposed propagator estimator and the pessimistic trading strategy.

Submitted 6 September, 2023; originally announced September 2023.

Comments: 12 figures

MSC Class: 62L05; 60H30; 91G80; 68Q32; 93C73; 93E35; 62G08

arXiv:2308.06935 [pdf, other]

Insurance pricing on price comparison websites via reinforcement learning

Authors: Tanut Treetanthiploet, Yufei Zhang, Lukasz Szpruch, Isaac Bowers-Barnard, Henrietta Ridley, James Hickey, Chris Pearce

Abstract: The emergence of price comparison websites (PCWs) has presented insurers with unique challenges in formulating effective pricing strategies. Operating on PCWs requires insurers to strike a delicate balance between competitive premiums and profitability, amidst obstacles such as low historical conversion rates, limited visibility of competitors' actions, and a dynamic market environment. In addition to this, the capital intensive nature of the business means pricing below the risk levels of customers can result in solvency issues for the insurer. To address these challenges, this paper introduces a reinforcement learning (RL) framework that learns the optimal pricing policy by integrating model-based and model-free methods. The model-based component is used to train agents in an offline setting, avoiding cold-start issues, while model-free algorithms are then employed in a contextual bandit (CB) manner to dynamically update the pricing policy to maximise the expected revenue. This facilitates quick adaptation to evolving market dynamics and enhances algorithm efficiency and decision interpretability. The paper also highlights the importance of evaluating pricing policies using an offline dataset in a consistent fashion and demonstrates the superiority of the proposed methodology over existing off-the-shelf RL/CB approaches. We validate our methodology using synthetic data, generated to reflect private commercially available data within real-world insurers, and compare against 6 other benchmark approaches. Our hybrid agent outperforms these benchmarks in terms of sample efficiency and cumulative reward with the exception of an agent that has access to perfect market information which would not be available in a real-world set-up.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2305.08740 [pdf, other]

doi 10.1145/3511808.3557089

Temporal and Heterogeneous Graph Neural Network for Financial Time Series Prediction

Authors: Sheng Xiang, Dawei Cheng, Chencheng Shang, Ying Zhang, Yuqi Liang

Abstract: The price movement prediction of the stock market has been a classical yet challenging problem, attracting the attention of both economists and computer scientists. In recent years, graph neural networks have significantly improved prediction performance by employing deep learning on company relations. However, existing relation graphs are usually constructed by handcrafted human labeling or natural language processing, which suffer from heavy resource requirements and low accuracy. Besides, they cannot effectively respond to the dynamic changes in relation graphs. Therefore, in this paper, we propose a temporal and heterogeneous graph neural network-based (THGNN) approach to learn the dynamic relations among price movements in financial time series. In particular, we first generate the company relation graph for each trading day according to their historic price. Then we leverage a transformer encoder to encode the price movement information into temporal representations. Afterward, we propose a heterogeneous graph attention network to jointly optimize the embeddings of the financial time series data by transformer encoder and infer the probability of target movements. Finally, we conduct extensive experiments on the stock market in the United States and China. The results demonstrate the effectiveness and superior performance of our proposed methods compared with state-of-the-art baselines. Moreover, we also deploy the proposed THGNN in a real-world quantitative algorithm trading system; the accumulated portfolio return obtained by our method significantly outperforms other baselines.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 10 pages, 6 figures, ACM CIKM'22, Code: https://github.com/CharlieSCC/alpha/tree/main/alpha/model/THGNN

arXiv:2305.06704 [pdf, other]

Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models

Authors: Yichi Zhang, Mihai Cucuringu, Alexander Y. Shestopaloff, Stefan Zohren

Abstract: In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, via a sliding window approach. This is then followed by an application of various clustering techniques (such as k-means++ and spectral clustering), employing a variety of pairwise similarity measures, including nonlinear ones. Once the clusters have been extracted, lead-lag estimates across clusters are robustly aggregated to enhance the identification of the consistent relationships in the original universe. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our method is not only able to robustly detect lead-lag relationships in financial markets, but can also yield insightful results when applied to an environmental data set.

Submitted 18 September, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

arXiv:2304.00323 [pdf, other]

Company Competition Graph

Authors: Yanci Zhang, Yutong Lu, Haitao Mao, Jiawei Huang, Cien Zhang, Xinyi Li, Rui Dai

Abstract: Financial market participants frequently rely on numerous business relationships to make investment decisions. Investors can learn about potential risks and opportunities associated with other connected entities through these corporate connections. Nonetheless, human annotation of a large corpus to extract such relationships is highly time-consuming, not to mention that it requires a considerable amount of industry expertise and professional training. Meanwhile, we have yet to observe means to generate reliable knowledge graphs of corporate relationships due to the lack of impartial and granular data sources. This study proposes a system to process financial reports and construct the public competitor graph to fill the void. Our method can retrieve more than 83% of the competition relationships of the S&P 500 index companies. Based on the output from our system, we construct a knowledge graph with more than 700 nodes and 1200 edges. A demo interactive graph interface is available.

Submitted 1 April, 2023; originally announced April 2023.

arXiv:2303.04688 [pdf, other]

Form 10-K Itemization

Authors: Yanci Zhang, Mengjia Xia, Mingyang Li, Haitao Mao, Yutong Lu, Yupeng Lan, Jinlin Ye, Rui Dai

Abstract: The Form 10-K report is a financial report disclosing the annual financial state of a public company. It is important evidence for financial analysis, e.g., asset pricing and corporate finance. Practitioners and researchers are constantly designing algorithms to better analyze the information in the Form 10-K report. The vast majority of previous works focus on quantitative data. With recent advances in natural language processing (NLP), textual data in financial filings attracts more attention. However, to incorporate textual data into the analysis, Form 10-K Itemization is a necessary pre-processing step. It aims to segment the whole document into several Item sections, where each Item section focuses on a specific financial aspect of the company. With the segmented Item sections, NLP techniques can be applied directly to the Item sections relevant to downstream tasks. In this paper, we develop a Form 10-K Itemization system which can automatically segment all the Item sections in 10-K documents. The system is both effective and efficient. It reaches a retrieval rate of 93%.

Submitted 18 February, 2023; originally announced March 2023.

Comments: For demo website, see http://review10-k.ddns.net

arXiv:2303.01923 [pdf, other]

Bayesian CART models for insurance claims frequency

Authors: Yaojun Zhang, Lanpeng Ji, Georgios Aivaliotis, Charles Taylor

Abstract: Accuracy and interpretability of a (non-life) insurance pricing model are essential qualities to ensure fair and transparent premiums for policy-holders that reflect their risk. In recent years, the classification and regression trees (CARTs) and their ensembles have gained popularity in the actuarial literature, since they offer good prediction performance and are relatively easily interpretable. In this paper, we introduce Bayesian CART models for insurance pricing, with a particular focus on claims frequency modelling. In addition to the common Poisson and negative binomial (NB) distributions used for claims frequency, we implement Bayesian CART for the zero-inflated Poisson (ZIP) distribution to address the difficulty arising from the imbalanced insurance claims data. To this end, we introduce a general MCMC algorithm using data augmentation methods for posterior tree exploration. We also introduce the deviance information criterion (DIC) for the tree model selection. The proposed models are able to identify trees which can better classify the policy-holders into risk groups. Some simulations and real insurance data will be discussed to illustrate the applicability of these models.

Submitted 1 December, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: 46 pages

MSC Class: 62P05

arXiv:2301.05157 [pdf, other]

Statistical Learning with Sublinear Regret of Propagator Models

Authors: Eyal Neuman, Yufei Zhang

Abstract: We consider a class of learning problems in which an agent liquidates a risky asset while creating both transient price impact driven by an unknown convolution propagator and linear temporary price impact with an unknown parameter. We characterize the trader's performance as maximization of a revenue-risk functional, where the trader also exploits available information on a price predicting signal. We present a trading algorithm that alternates between exploration and exploitation phases and achieves sublinear regrets with high probability. For the exploration phase we propose a novel approach for non-parametric estimation of the price impact kernel by observing only the visible price process and derive sharp bounds on the convergence rate, which are characterised by the singularity of the propagator. These kernel estimation methods extend existing methods from the area of Tikhonov regularisation for inverse problems and are of independent interest. The bound on the regret in the exploitation phase is obtained by deriving stability results for the optimizer and value function of the associated class of infinite-dimensional stochastic control problems. As a complementary result we propose a regression-based algorithm to estimate the conditional expectation of non-Markovian signals and derive its convergence rate.

Submitted 21 January, 2025; v1 submitted 12 January, 2023; originally announced January 2023.

Comments: 57 pages, accepted by The Annals of Applied Probability

MSC Class: 62L05; 60H30; 91G80; 68Q32; 93C73; 93E35; 62G08

arXiv:2212.08518 [pdf, ps, other]

Systemic robustness: a mean-field particle system approach

Authors: Erhan Bayraktar, Gaoyue Guo, Wenpin Tang, Yuming Paul Zhang

Abstract: This paper is concerned with the problem of budget control in a large particle system modeled by stochastic differential equations involving hitting times, which arises from considerations of systemic risk in a regional financial network. Motivated by Tang and Tsai (Ann. Probab., 46 (2018), pp. 1597–1650), we focus on the number or proportion of surviving entities that never default to measure the systemic robustness. First we show that both the mean-field particle system and its limiting McKean-Vlasov equation are well-posed by virtue of the notion of minimal solutions. We then establish a connection between the proportion of surviving entities in the large particle system and the probability of default in the limiting McKean-Vlasov equation as the size of the interacting particle system N tends to infinity. Finally, we study the asymptotic efficiency of budget control in different economy regimes: the expected number of surviving entities is of constant order in a negative economy; it is of order of the square root of N in a neutral economy; and it is of order N in a positive economy where the budget's effect is negligible.

Submitted 29 August, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Comments: 33 pages

arXiv:2212.05632 [pdf, other]

Blockchain Network Analysis: A Comparative Study of Decentralized Banks

Authors: Yufan Zhang, Zichao Chen, Yutong Sun, Yulin Liu, Luyao Zhang

Abstract: Decentralized finance (DeFi) is known for its unique mechanism design, which applies smart contracts to facilitate peer-to-peer transactions. The decentralized bank is a typical DeFi application. Ideally, a decentralized bank should be decentralized in the transaction. However, many recent studies have found that decentralized banks have not achieved a significant degree of decentralization. This research conducts a comparative study among mainstream decentralized banks. We apply core-periphery network features analysis using the transaction data from four decentralized banks, Liquity, Aave, MakerDao, and Compound. We extract six features and compare the banks' levels of decentralization cross-sectionally. According to the analysis results, we find that: 1) MakerDao and Compound are more decentralized in the transactions than Aave and Liquity. 2) Although decentralized banking transactions are supposed to be decentralized, the data show that four banks have primary external transaction core addresses such as Huobi, Coinbase, and Binance, etc. We also discuss four design features that might affect network decentralization. Our research contributes to the literature at the interface of decentralized finance, financial technology (Fintech), and social network analysis and inspires future protocol designs to live up to the promise of decentralized finance for a truly peer-to-peer transaction network.

Submitted 8 July, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

MSC Class: 91D30; 91-11; ACM Class: J.4; C.2; K.4

arXiv:2212.05369 [pdf]

doi 10.47191/afmj/v8i2.03

Time Series Analysis in American Stock Market Recovering in Post COVID-19 Pandemic Period

Authors: Weilin Fu, Zhuoran Li, Yupeng Zhang, Xingyou Zhou

Abstract: Every financial crisis has caused a dual shock to the global economy. The shortage of market liquidity, such as default in debt and bonds, has led to the spread of bankruptcies, such as Lehman Brothers in 2008. Using the data for the ETFs of the S&P 500, Nasdaq 100, and Dow Jones Industrial Average collected from Yahoo Finance, this study implemented deep learning, neural network, and time-series methods to analyze the trend of the American stock market in the post-COVID-19 period. An LSTM neural network model is used to predict the future trend, and it suggests that the US stock market keeps falling in the post-COVID-19 period. This study reveals a reasonable allocation method of Long Short-Term Memory for which there is strong evidence.

Submitted 10 December, 2022; originally announced December 2022.

Comments: 9 pages, 4 figures, Submitted to the Cambridge University Press Journal

arXiv:2205.08743 [pdf, other]

Mean-variance portfolio selection with dynamic attention behavior in a hidden Markov model

Authors: Y. Zhang, Z. Jin, J. Wei, G. Yin

Abstract: In this paper, we study closed-loop equilibrium strategies for a mean-variance portfolio selection problem in a hidden Markov model with dynamic attention behavior. In addition to the investment strategy, the investor's attention to news is introduced as a control of the accuracy of the news signal process. The objective is to find equilibrium strategies by numerically solving an extended HJB equation using the Markov chain approximation method. An iterative algorithm is constructed and its convergence is established. Numerical examples are also provided to illustrate the results.

Submitted 18 May, 2022; originally announced May 2022.

Comments: 15 pages, 4 figures

arXiv:2204.09544 [pdf]

Digging into Primary Financial Market: Challenges and Opportunities of Adopting Blockchain

Authors: Ji Liu, Zheng Xu, Yanmei Zhang, Wei Dai, Hao Wu, Shiping Chen

Abstract: Since the emergence of blockchain technology, its application in the financial market has always been an area of focus and exploration by all parties. With the characteristics of anonymity, trust, tamper-proof, etc., blockchain technology can effectively solve some problems faced by the financial market, such as trust issues and information asymmetry issues. To deeply understand the application scenarios of blockchain in the financial market, the issue of securities issuance and trading in the primary market is a problem that must be studied clearly. We conducted an empirical study to investigate the main difficulties faced by primary market participants in their business practices and the potential challenges of the deepening application of blockchain technology in the primary market. We adopted a hybrid method combining interviews (qualitative methods) and surveys (quantitative methods) to conduct this research in two stages. In the first stage, we interviewed 15 major primary market participants with different backgrounds and expertise. In the second stage, we conducted a verification survey of 54 primary market practitioners to confirm various insights from the interviews, including challenges and desired improvements. Our interviews and survey results revealed several significant challenges facing blockchain applications in the primary market: complex due diligence, mismatch, and difficult monitoring. On this basis, we believe that our future research can focus on some aspects of these challenges.

Submitted 20 April, 2022; originally announced April 2022.

Comments: 11 pages and 7 figures

arXiv:2202.08962 [pdf, ps, other]

Volatility forecasting with machine learning and intraday commonality

Authors: Chao Zhang, Yihuang Zhang, Mihai Cucuringu, Zhongmin Qian

Abstract: We apply machine learning models to forecast intraday realized volatility (RV), by exploiting commonality in intraday volatility via pooling stock data together, and by incorporating a proxy for the market volatility. Neural networks dominate linear regressions and tree-based models in terms of performance, due to their ability to uncover and model complex latent interactions among variables. Our findings remain robust when we apply trained models to new stocks that have not been included in the training set, thus providing new empirical evidence for a universal volatility mechanism among stocks. Finally, we propose a new approach to forecasting one-day-ahead RVs using past intraday RVs as predictors, and highlight interesting time-of-day effects that aid the forecasting mechanism. The results demonstrate that the proposed methodology yields superior out-of-sample forecasts over a strong set of traditional baselines that only rely on past daily RVs.

Submitted 24 February, 2023; v1 submitted 8 February, 2022; originally announced February 2022.

Comments: 40 pages, 12 figures, 6 tables; to appear in Journal of Financial Econometrics

arXiv:2109.15060 [pdf]

doi 10.1016/j.eswa.2020.113688

Stock index futures trading impact on spot price volatility. The CSI 300 studied with a TGARCH model

Authors: Marcel Ausloos, Yining Zhang, Gurjeet Dhesi

Abstract: TGARCH modeling is argued to be the optimal basis for investigating the impact of index futures trading on spot price variability. We discuss the CSI-300 index (China-Shanghai-Shenzhen-300-Stock Index) as a test case. The results prove that the introduction of CSI-300 index futures (CSI-300-IF) trading significantly reduces the volatility in the corresponding spot market. It is also found that there is a stationary equilibrium relationship between the CSI-300 spot and CSI-300-IF markets. A bidirectional Granger causality is also detected. Finally, it is deduced that spot prices are predicted with greater accuracy over a 3 or 4 lag day time span.

Submitted 29 August, 2021; originally announced September 2021.

Comments: 31 pages, 10 tables, 2 figures, 109 references

Journal ref: Expert Systems with Applications 160 (2020) 113688

arXiv:2104.11783 [pdf, other]

Form 10-Q Itemization

Authors: Yanci Zhang, Tianming Du, Yujie Sun, Lawrence Donohue, Rui Dai

Abstract: The quarterly financial statement, or Form 10-Q, is one of the most frequently required filings for US public companies to disclose financial and other important business information. Due to the massive volume of 10-Q filings and the enormous variations in the reporting format, it has been a long-standing challenge to retrieve item-specific information from 10-Q filings that lack machine-readable hierarchy. This paper presents a solution for itemizing 10-Q files by complementing a rule-based algorithm with a Convolutional Neural Network (CNN) image classifier. This solution demonstrates a pipeline that can be generalized to a rapid data retrieval solution among a large volume of textual data using only typographic items. The extracted textual data can be used as unlabeled content-specific data to train transformer models (e.g., BERT) or fit into various field-focus natural language processing (NLP) applications.

Submitted 19 October, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

Comments: 6 pages, 3 figures, 3 tables, http://review10q.ddns.net/

arXiv:2102.03417 [pdf, ps, other]

Reward Design in Risk-Taking Contests

Authors: Marcel Nutz, Yuchong Zhang

Abstract: Following the risk-taking model of Seel and Strack, $n$ players decide when to stop privately observed Brownian motions with drift and absorption at zero. They are then ranked according to their level of stopping and paid a rank-dependent reward. We study the problem of a principal who aims to induce a desirable equilibrium performance of the players by choosing how much reward is attributed to each rank. Specifically, we determine optimal reward schemes for principals interested in the average performance and the performance at a given rank. While the former can be related to reward inequality in the Lorenz sense, the latter can have a surprising shape.

Submitted 8 November, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

Comments: To appear in SIAM Journal on Financial Mathematics

MSC Class: 91A65; 91A15; 91A55

arXiv:2012.13121 [pdf, other]

Memory-Gated Recurrent Networks

Authors: Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang

Abstract: The essence of multivariate sequential learning is how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, oftentimes exhibit not only strong serial dependencies in the individual components (the "marginal" memory) but also non-negligible memories in the cross-sectional dependencies (the "joint" memory). Because of the multivariate complexity in the evolution of the joint distribution that underlies the data generating process, we take a data-driven approach and construct a novel recurrent network architecture, termed Memory-Gated Recurrent Networks (mGRN), with gates explicitly regulating two distinct types of memories: the marginal memory and the joint memory. Through a combination of comprehensive simulation studies and empirical experiments on a range of public datasets, we show that our proposed mGRN architecture consistently outperforms state-of-the-art architectures targeting multivariate time series.

Submitted 30 December, 2020; v1 submitted 24 December, 2020; originally announced December 2020.

Comments: This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

arXiv:2010.14646 [pdf, ps, other]

McKean-Vlasov equations involving hitting times: blow-ups and global solvability

Authors: Erhan Bayraktar, Gaoyue Guo, Wenpin Tang, Yuming Zhang

Abstract: This paper is concerned with the analysis of blow-ups for two McKean-Vlasov equations involving hitting times. Let $(B(t); \, t \ge 0)$ be standard Brownian motion, and $\tau := \inf\{t \ge 0: X(t) \le 0\}$ be the hitting time to zero of a given process $X$. The first equation is $X(t) = X(0) + B(t) - \alpha \mathbb{P}(\tau \le t)$. We provide a simple condition on $\alpha$ and the distribution of $X(0)$ such that the corresponding Fokker-Planck equation has no blow-up, and thus the McKean-Vlasov dynamics is well-defined for all time $t \ge 0$. Our approach relies on a connection between the McKean-Vlasov equation and the supercooled Stefan problem, as well as several comparison principles. The second equation is $X(t) = X(0) + \beta t + B(t) + \alpha \log \mathbb{P}(\tau > t)$, whose Fokker-Planck equation is non-local. We prove that for $\beta > 0$ sufficiently large and $\alpha$ no greater than a sufficiently small positive constant, there is no blow-up and the McKean-Vlasov dynamics is well-defined for all time $t \ge 0$. The argument is based on a new transform, which removes the non-local term, followed by a relative entropy analysis.

Submitted 1 July, 2023; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 22 pages

MSC Class: 35K61; 60H30

arXiv:2007.01672 [pdf, other]

A fully data-driven approach to minimizing CVaR for portfolio of assets via SGLD with discontinuous updating

Authors: Sotirios Sabanis, Ying Zhang

Abstract: A new approach in stochastic optimization via the use of stochastic gradient Langevin dynamics (SGLD) algorithms, which are a variant of stochastic gradient descent (SGD) methods, allows us to efficiently approximate global minimizers of possibly complicated, high-dimensional landscapes. With this in mind, we extend here the non-asymptotic analysis of SGLD to the case of discontinuous stochastic gradients. We are thus able to provide theoretical guarantees for the algorithm's convergence in (standard) Wasserstein distances for both convex and non-convex objective functions. We also provide explicit upper estimates of the expected excess risk associated with the approximation of global minimizers of these objective functions. All these findings allow us to devise and present a fully data-driven approach for the optimal allocation of weights for the minimization of CVaR of a portfolio of assets with complete theoretical guarantees for its performance. Numerical results illustrate our main findings.

Submitted 2 July, 2020; originally announced July 2020.

Comments: arXiv admin note: text overlap with arXiv:1910.02008

Showing 1–50 of 107 results for author: Zhang, Y