Search | arXiv e-print repository

SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation

Authors: Qilong Wu, Xiaoneng Xiang, Hejia Huang, Xuan Wang, Yeo Wei Jie, Ranjan Satapathy, Ricardo Shirota Filho, Bharadwaj Veeravalli

Abstract: The rapid growth of the financial sector and the rising focus on Environmental, Social, and Governance (ESG) considerations highlight the need for advanced NLP tools. However, open-source LLMs proficient in both finance and ESG domains remain scarce. To address this gap, we introduce SusGen-30K, a category-balanced dataset comprising seven financial NLP tasks and ESG report generation, and propose TCFD-Bench, a benchmark for evaluating sustainability report generation. Leveraging this dataset, we developed SusGen-GPT, a suite of models achieving state-of-the-art performance across six adapted and two off-the-shelf tasks, trailing GPT-4 by only 2% despite using 7-8B parameters compared to GPT-4's 1,700B. Based on this, we propose the SusGen system, integrated with Retrieval-Augmented Generation (RAG), to assist in sustainability report generation. This work demonstrates the efficiency of our approach, advancing research in finance and ESG.

Submitted 14 December, 2024; originally announced December 2024.

arXiv:2405.17770 [pdf, other]

Risk-Neutral Generative Networks

Authors: Zhonghao Xian, Xing Yan, Cheuk Hang Leung, Qi Wu

Abstract: We present a functional generative approach to extract risk-neutral densities from market prices of options. Specifically, we model the log-returns on the time-to-maturity continuum as a stochastic curve driven by a standard normal. We then use neural nets to represent the term structures of the location, the scale, and the higher-order moments, and impose stringent conditions on the learning process to ensure the neural net-based curve representation is free of static arbitrage. This specification is structurally clear in that it separates the modeling of randomness from the modeling of the term structures of the parameters. It is data adaptive in that we use neural nets to represent the shape of the stochastic curve. It is also generative in that the functional form of the stochastic curve, although parameterized by neural nets, is an explicit and deterministic function of the standard normal. This explicitness allows for the efficient generation of samples to price options across strikes and maturities, without compromising data adaptability. We have validated the effectiveness of this approach by benchmarking it against a comprehensive set of baseline models. Experiments show that the extracted risk-neutral densities accommodate a diverse range of shapes. The approach significantly outperforms the extensive set of baseline models, including three parametric models and nine stochastic process models, in terms of accuracy and stability. The success of this approach is attributed to its capacity to offer flexible term structures for risk-neutral skewness and kurtosis.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2403.13138 [pdf, ps, other]

Max- and min-stability under first-order stochastic dominance

Authors: Christopher Chambers, Alan Miller, Ruodu Wang, Qinyu Wu

Abstract: Max-stability is the property that taking a maximum between two inputs results in a maximum between two outputs. We study max-stability with respect to first-order stochastic dominance, the most fundamental notion of stochastic dominance in decision theory. Under two additional standard axioms of nondegeneracy and lower semicontinuity, we establish a representation theorem for functionals satisfying max-stability, which turns out to be represented by the supremum of a bivariate function. A parallel characterization result for min-stability, that is, with the maximum replaced by the minimum in max-stability, is also established. By combining both max-stability and min-stability, we obtain a new characterization for a class of functionals, called the Lambda-quantiles, that appear in finance and political science.

Submitted 3 February, 2025; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2312.10388 [pdf, other]

The Causal Impact of Credit Lines on Spending Distributions

Authors: Yijun Li, Cheuk Hang Leung, Xiangqian Sun, Chaoqun Wang, Yiyan Huang, Xing Yan, Qi Wu, Dongdong Wang, Zhixiang Huang

Abstract: Consumer credit services offered by e-commerce platforms provide customers with convenient loan access during shopping and have the potential to stimulate sales. To understand the causal impact of credit lines on spending, previous studies have employed causal estimators, based on direct regression (DR), inverse propensity weighting (IPW), and double machine learning (DML) to estimate the treatment effect. However, these estimators do not consider the notion that an individual's spending can be understood and represented as a distribution, which captures the range and pattern of amounts spent across different orders. By disregarding the outcome as a distribution, valuable insights embedded within the outcome distribution might be overlooked. This paper develops a distribution-valued estimator framework that extends existing real-valued DR-, IPW-, and DML-based estimators to distribution-valued estimators within Rubin's causal framework. We establish their consistency and apply them to a real dataset from a large e-commerce platform. Our findings reveal that credit lines positively influence spending across all quantiles; however, as credit lines increase, consumers allocate more to luxuries (higher quantiles) than necessities (lower quantiles).

Submitted 16 December, 2023; originally announced December 2023.

arXiv:2312.01034 [pdf, other]

Monotonic mean-deviation risk measures

Authors: Xia Han, Ruodu Wang, Qinyu Wu

Abstract: Mean-deviation models, along with the existing theory of coherent risk measures, are well studied in the literature. In this paper, we characterize monotonic mean-deviation (risk) measures from a general mean-deviation model by applying a risk-weighting function to the deviation part. The form is a combination of the deviation-related functional and the expectation, and such measures belong to the class of consistent risk measures. The monotonic mean-deviation measures admit an axiomatic foundation via preference relations. By further assuming the convexity and linearity of the risk-weighting function, the characterizations for convex and coherent risk measures are obtained, giving rise to many new explicit examples of convex and nonconvex consistent risk measures. Further, we specialize in the convex case of the monotonic mean-deviation measure and obtain its dual representation. The worst-case values of the monotonic mean-deviation measures are analyzed under two popular settings of model uncertainty. Further, we establish asymptotic consistency and normality of the natural estimators of the monotonic mean-deviation measures. Finally, the monotonic mean-deviation measures are applied to a problem of portfolio selection using financial data.

Submitted 8 August, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

arXiv:2305.19499 [pdf, other]

Deep into The Domain Shift: Transfer Learning through Dependence Regularization

Authors: Shumin Ma, Zhiri Yuan, Qi Wu, Yiyan Huang, Xixu Hu, Cheuk Hang Leung, Dongdong Wang, Zhixiang Huang

Abstract: Classical Domain Adaptation methods acquire transferability by regularizing the overall distributional discrepancies between features in the source domain (labeled) and features in the target domain (unlabeled). They often do not differentiate whether the domain differences come from the marginals or the dependence structures. In many business and financial applications, the labeling function usually has different sensitivities to the changes in the marginals versus changes in the dependence structures. Measuring the overall distributional differences will not be discriminative enough in acquiring transferability. Without the needed structural resolution, the learned transfer is less optimal. This paper proposes a new domain adaptation approach in which one can measure the differences in the internal dependence structure separately from those in the marginals. By optimizing the relative weights among them, the new regularization strategy greatly relaxes the rigidness of the existing approaches. It allows a learning machine to pay special attention to places where the differences matter the most. Experiments on three real-world datasets show that the improvements are quite notable and robust compared to various benchmark domain adaptation models.

Submitted 30 May, 2023; originally announced May 2023.

Comments: 15 pages

arXiv:2301.12420 [pdf, ps, other]

Conditional generalized quantiles based on expected utility model and equivalent characterization of properties

Authors: Qinyu Wu, Fan Yang, Ping Zhang

Abstract: As a counterpart to the (static) risk measures of generalized quantiles and motivated by Bellini et al. (2018), we propose a new kind of conditional risk measure called conditional generalized quantiles. We first show that they are well defined and can be equivalently characterized by a conditional first-order condition. We also discuss their main properties and, in particular, give the characterization of coherency/convexity. For potential applications as a dynamic risk measure, we study their time consistency properties, and establish their equivalent characterizations among conditional generalized quantiles.

Submitted 29 January, 2023; originally announced January 2023.

arXiv:2301.07318 [pdf, other]

Dynamic CVaR Portfolio Construction with Attention-Powered Generative Factor Learning

Authors: Chuting Sun, Qi Wu, Xing Yan

Abstract: The dynamic portfolio construction problem requires dynamic modeling of the joint distribution of multivariate stock returns. To achieve this, we propose a dynamic generative factor model which uses random variable transformation as an implicit way of distribution modeling and relies on the Attention-GRU network for dynamic learning and forecasting. The proposed model captures the dynamic dependence among multivariate stock returns, especially focusing on the tail-side properties. We also propose a two-step iterative algorithm to train the model and then predict the time-varying model parameters, including the time-invariant tail parameters. At each investment date, we can easily simulate new samples from the learned generative model, and we further perform CVaR portfolio optimization with the simulated samples to form a dynamic portfolio strategy. The numerical experiment on stock data shows that our model leads to wiser investments that promise higher reward-risk ratios and present lower tail risks.

Submitted 16 January, 2024; v1 submitted 18 January, 2023; originally announced January 2023.

arXiv:2209.03425 [pdf, ps, other]

Probabilistic risk aversion for generalized rank-dependent functions

Authors: Ruodu Wang, Qinyu Wu

Abstract: Probabilistic risk aversion, defined through quasi-convexity in probabilistic mixtures, is a common useful property in decision analysis. We study a general class of non-monotone mappings, called the generalized rank-dependent functions, which includes the preference models of expected utilities, dual utilities, and rank-dependent utilities as special cases, as well as signed Choquet functions used in risk management. Our results fully characterize probabilistic risk aversion for generalized rank-dependent functions: This property is determined by the distortion function, which is precisely one of the two cases: those that are convex and those that correspond to scaled quantile-spread mixtures. Our result also leads to seven equivalent conditions for quasi-convexity in probabilistic mixtures of dual utilities and signed Choquet functions. As a consequence, although probabilistic risk aversion is quite different from the classic notion of strong risk aversion for generalized rank-dependent functions, these two notions coincide for dual utilities under an additional continuity assumption.

Submitted 26 September, 2024; v1 submitted 7 September, 2022; originally announced September 2022.

arXiv:2209.01805 [pdf, other]

Robust Causal Learning for the Estimation of Average Treatment Effects

Authors: Yiyan Huang, Cheuk Hang Leung, Xing Yan, Qi Wu, Shumin Ma, Zhiri Yuan, Dongdong Wang, Zhixiang Huang

Abstract: Many practical decision-making problems in economics and healthcare seek to estimate the average treatment effect (ATE) from observational data. The Double/Debiased Machine Learning (DML) is one of the prevalent methods to estimate ATE in the observational study. However, the DML estimators can suffer an error-compounding issue and even give an extreme estimate when the propensity scores are misspecified or very close to 0 or 1. Previous studies have overcome this issue through some empirical tricks such as propensity score trimming, yet none of the existing literature solves this problem from a theoretical standpoint. In this paper, we propose a Robust Causal Learning (RCL) method to offset the deficiencies of the DML estimators. Theoretically, the RCL estimators i) are as consistent and doubly robust as the DML estimators, and ii) can get rid of the error-compounding issue. Empirically, the comprehensive experiments show that i) the RCL estimators give more stable estimations of the causal parameters than the DML estimators, and ii) the RCL estimators outperform the traditional estimators and their variants when applying different machine learning models on both simulation and benchmark datasets.

Submitted 5 September, 2022; originally announced September 2022.

Comments: This paper was accepted and will be published at The 2022 International Joint Conference on Neural Networks (IJCNN2022). arXiv admin note: substantial text overlap with arXiv:2103.11869

arXiv:2201.06370 [pdf, other]

Model Aggregation for Risk Evaluation and Robust Optimization

Authors: Tiantian Mao, Ruodu Wang, Qinyu Wu

Abstract: We introduce a new approach for prudent risk evaluation based on stochastic dominance, which will be called the model aggregation (MA) approach. In contrast to the classic worst-case risk (WR) approach, the MA approach produces not only a robust value of risk evaluation but also a robust distributional model, independent of any specific risk measure. The MA risk evaluation can be computed through explicit formulas in the lattice theory of stochastic dominance, and under some standard assumptions, the MA robust optimization admits a convex-program reformulation. The MA approach for Wasserstein and mean-variance uncertainty sets admits explicit formulas for the obtained robust models. Via an equivalence property between the MA and the WR approaches, new axiomatic characterizations are obtained for the Value-at-Risk (VaR) and the Expected Shortfall (ES, also known as CVaR). The new approach is illustrated with various risk measures and examples from portfolio optimization.

Submitted 8 June, 2024; v1 submitted 17 January, 2022; originally announced January 2022.

arXiv:2110.15102 [pdf, other]

Risk and return prediction for pricing portfolios of non-performing consumer credit

Authors: Siyi Wang, Xing Yan, Bangqi Zheng, Hu Wang, Wangli Xu, Nanbo Peng, Qi Wu

Abstract: We design a system for risk-analyzing and pricing portfolios of non-performing consumer credit loans. The rapid development of credit lending business for consumers heightens the need for trading portfolios formed by overdue loans as a means of risk transfer. However, the problem is technically nontrivial and related research is absent. We tackle the challenge by building a bottom-up architecture, in which we model the distribution of every single loan's repayment rate, followed by modeling the distribution of the portfolio's overall repayment rate. To address the technical issues encountered, we adopt the approaches of simultaneous quantile regression, R-copula, and Gaussian one-factor copula model. To the best of our knowledge, this is the first study that successfully adopts a bottom-up system for analyzing credit portfolio risks of consumer loans. We conduct experiments on a vast amount of data and show that our methodology can be applied successfully in real business tasks.

Submitted 28 October, 2021; originally announced October 2021.

Comments: Accepted by 2nd ACM International Conference on AI in Finance (ICAIF'21)

arXiv:2108.05066 [pdf, ps, other]

Risk Concentration and the Mean-Expected Shortfall Criterion

Authors: Xia Han, Bin Wang, Ruodu Wang, Qinyu Wu

Abstract: Expected Shortfall (ES, also known as CVaR) is the most important coherent risk measure in finance, insurance, risk management, and engineering. Recently, Wang and Zitikis (2021) put forward four economic axioms for portfolio risk assessment and provide the first economic axiomatic foundation for the family of ES. In particular, the axiom of no reward for concentration (NRC) is arguably quite strong, which imposes an additive form of the risk measure on portfolios with a certain dependence structure. We move away from the axiom of NRC by introducing the notion of concentration aversion, which does not impose any specific form of the risk measure. It turns out that risk measures with concentration aversion are functions of ES and the expectation. Together with the other three standard axioms of monotonicity, translation invariance and lower semicontinuity, concentration aversion uniquely characterizes the family of ES. In addition, we establish an axiomatic foundation for the problem of mean-ES portfolio selection and new explicit formulas for convex and consistent risk measures. Finally, we provide an economic justification for concentration aversion via a few axioms on the attitude of a regulator towards dependence structures.

Submitted 4 April, 2022; v1 submitted 11 August, 2021; originally announced August 2021.

arXiv:2107.09629 [pdf, other]

Order Book Queue Hawkes-Markovian Modeling

Authors: Philip Protter, Qianfan Wu, Shihao Yang

Abstract: This article presents a Hawkes process model with Markovian baseline intensities for high-frequency order book data modeling. We classify intraday order book trading events into a range of categories based on their order types and the price changes after their arrivals. To capture the stimulating effects between multiple types of order book events, we use the multivariate Hawkes process to model the self- and mutually-exciting event arrivals. We also integrate a Markovian baseline intensity into the event arrival dynamic, by including the impacts of order book liquidity state and time factor on the baseline intensity. A regression-based non-parametric estimation procedure is adopted to estimate the model parameters in our Hawkes+Markovian model. To eliminate redundant model parameters, LASSO regularization is incorporated in the estimation procedure. In addition, a model selection method based on the Akaike Information Criterion is applied to evaluate the effect of each part of the proposed model. An implementation example based on real LOB data is provided. Through the example, we study the empirical shapes of Hawkes excitement functions, the effects of liquidity state as well as time factors, the LASSO variable selection, and the explanatory power of Hawkes and Markovian elements for the dynamics of the order book.

Submitted 5 January, 2022; v1 submitted 20 July, 2021; originally announced July 2021.

Comments: 71 pages, 80 figures

MSC Class: 62P05 (Primary) 62G05 (Secondary)

arXiv:2103.11869 [pdf, other]

Robust Orthogonal Machine Learning of Treatment Effects

Authors: Yiyan Huang, Cheuk Hang Leung, Qi Wu, Xing Yan

Abstract: Causal learning is the key to obtaining stable predictions and answering "what if" problems in decision-making. In causal learning, it is central to seek methods to estimate the average treatment effect (ATE) from observational data. The Double/Debiased Machine Learning (DML) is one of the prevalent methods to estimate ATE. However, the DML estimators can suffer from an error-compounding issue and even give extreme estimates when the propensity scores are close to 0 or 1. Previous studies have overcome this issue through some empirical tricks such as propensity score trimming, yet none of the existing works solves it from a theoretical standpoint. In this paper, we propose a Robust Causal Learning (RCL) method to offset the deficiencies of DML estimators. Theoretically, the RCL estimators i) satisfy the (higher-order) orthogonal condition and are as consistent and doubly robust as the DML estimators, and ii) get rid of the error-compounding issue. Empirically, the comprehensive experiments show that: i) the RCL estimators give more stable estimations of the causal parameters than DML; ii) the RCL estimators outperform traditional estimators and their variants when applying different machine learning models on both simulation and benchmark datasets, and a mimic consumer credit dataset generated by WGAN.

Submitted 5 December, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

arXiv:2012.13121 [pdf, other]

Memory-Gated Recurrent Networks

Authors: Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang

Abstract: The essence of multivariate sequential learning is all about how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, often exhibit not only strong serial dependencies in the individual components (the "marginal" memory) but also non-negligible memories in the cross-sectional dependencies (the "joint" memory). Because of the multivariate complexity in the evolution of the joint distribution that underlies the data generating process, we take a data-driven approach and construct a novel recurrent network architecture, termed Memory-Gated Recurrent Networks (mGRN), with gates explicitly regulating two distinct types of memories: the marginal memory and the joint memory. Through a combination of comprehensive simulation studies and empirical experiments on a range of public datasets, we show that our proposed mGRN architecture consistently outperforms state-of-the-art architectures targeting multivariate time series.

Submitted 30 December, 2020; v1 submitted 24 December, 2020; originally announced December 2020.

Comments: This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

arXiv:2012.09448 [pdf, other]

The Causal Learning of Retail Delinquency

Authors: Yiyan Huang, Cheuk Hang Leung, Xing Yan, Qi Wu, Nanbo Peng, Dongdong Wang, Zhixiang Huang

Abstract: This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be substantial. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.

Submitted 17 December, 2020; originally announced December 2020.

Comments: This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

arXiv:2011.13132

Generative Learning of Heterogeneous Tail Dependence

Authors: Xiangqian Sun, Xing Yan, Qi Wu

Abstract: We propose a multivariate generative model to capture the complex dependence structure often encountered in business and financial data. Our model features heterogeneous and asymmetric tail dependence between all pairs of individual dimensions while also allowing heterogeneity and asymmetry in the tails of the marginals. A significant merit of our model structure is that it is not prone to error propagation in the parameter estimation process, hence very scalable, as the dimensions of datasets grow large. However, the likelihood methods are infeasible for parameter estimation in our case due to the lack of a closed-form density function. Instead, we devise a novel moment learning algorithm to learn the parameters. To demonstrate the effectiveness of the model and its estimator, we test them on simulated as well as real-world datasets. Results show that this framework gives better finite-sample performance compared to the copula-based benchmarks as well as recent similar models.

Submitted 12 November, 2023; v1 submitted 26 November, 2020; originally announced November 2020.

Comments: Major technical flaws in theoretical aspects

arXiv:2010.08263 [pdf, other]

Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning

Authors: Xing Yan, Weizhong Zhang, Lin Ma, Wei Liu, Qi Wu

Abstract: We propose a parsimonious quantile regression framework to learn the dynamic tail behaviors of financial asset returns. Our model captures well both the time-varying characteristic and the asymmetrical heavy-tail property of financial time series. It combines the merits of a popular sequential neural network model, i.e., LSTM, with a novel parametric quantile function that we construct to represent the conditional distribution of asset returns. Our model also captures individually the serial dependences of higher moments, rather than just the volatility. Across a wide range of asset classes, the out-of-sample forecasts of conditional quantiles or VaR of our model outperform the GARCH family. Further, the proposed approach does not suffer from the issue of quantile crossing, nor does it expose to the ill-posedness comparing to the parametric probability density function approach.

Submitted 16 October, 2020; originally announced October 2020.

Comments: NeurIPS 2018:1582-1592

arXiv:1909.04497 [pdf, other]

Equity2Vec: End-to-end Deep Learning Framework for Cross-sectional Asset Pricing

Authors: Qiong Wu, Christopher G. Brinton, Zheng Zhang, Andrea Pizzoferrato, Zhenming Liu, Mihai Cucuringu

Abstract: Pricing assets has attracted significant attention from the financial technology community. We observe that the existing solutions overlook the cross-sectional effects and not fully leveraged the heterogeneous data sets, leading to sub-optimal performance. To this end, we propose an end-to-end deep learning framework to price the assets. Our framework possesses two main properties: 1) We propose Equity2Vec, a graph-based component that effectively captures both long-term and evolving cross-sectional interactions. 2) The framework simultaneously leverages all the available heterogeneous alpha sources including technical indicators, financial news signals, and cross-sectional signals. Experimental results on datasets from the real-world stock market show that our approach outperforms the existing state-of-the-art approaches. Furthermore, market trading simulations demonstrate that our framework monetizes the signals effectively.

Submitted 26 October, 2021; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: 9 pages

Journal ref: International Conference on AI in Finance, 2021

arXiv:1906.09024 [pdf, ps, other]

BERT-based Financial Sentiment Index and LSTM-based Stock Return Predictability

Authors: Joshua Zoen Git Hiew, Xin Huang, Hao Mou, Duan Li, Qi Wu, Yabo Xu

Abstract: Traditional sentiment construction in finance relies heavily on the dictionary-based approach, with a few exceptions using simple machine learning techniques such as Naive Bayes classifier. While the current literature has not yet invoked the rapid advancement in the natural language processing, we construct in this research a textual-based sentiment index using a well-known pre-trained model BERT developed by Google, especially for three actively trading individual stocks in Hong Kong market with at the same time the hot discussion on Weibo.com. On the one hand, we demonstrate a significant enhancement of applying BERT in financial sentiment analysis when compared with the existing models. On the other hand, by combining with the other two commonly-used methods when it comes to building the sentiment index in the financial literature, i.e., the option-implied and the market-implied approaches, we propose a more general and comprehensive framework for the financial sentiment analysis, and further provide convincing outcomes for the predictability of individual stock return by combining LSTM (with a feature of a nonlinear mapping). It is significantly distinct with the dominating econometric methods in sentiment influence analysis which are all of a nature of linear regression.

Submitted 7 July, 2022; v1 submitted 21 June, 2019; originally announced June 2019.

Comments: Manuscript

arXiv:1906.01981 [pdf, ps, other]

Understanding Distributional Ambiguity via Non-robust Chance Constraint

Authors: Qi Wu, Shumin Ma, Cheuk Hang Leung, Wei Liu, Nanbo Peng

Abstract: This paper provides a non-robust interpretation of the distributionally robust optimization (DRO) problem by relating the distributional uncertainties to the chance probabilities. Our analysis allows a decision-maker to interpret the size of the ambiguity set, which is often lack of business meaning, through the chance parameters constraining the objective function. We first show that, for general $φ$-divergences, a DRO problem is asymptotically equivalent to a class of mean-deviation problems. These mean-deviation problems are not subject to uncertain distributions, and the ambiguity radius in the original DRO problem now plays the role of controlling the risk preference of the decision-maker. We then demonstrate that a DRO problem can be cast as a chance-constrained optimization (CCO) problem when a boundedness constraint is added to the decision variables. Without the boundedness constraint, the CCO problem is shown to perform uniformly better than the DRO problem, irrespective of the radius of the ambiguity set, the choice of the divergence measure, or the tail heaviness of the center distribution. Thanks to our high-order expansion result, a notable feature of our analysis is that it applies to divergence measures that accommodate well heavy tail distributions such as the student $t$-distribution and the lognormal distribution, besides the widely-used Kullback-Leibler (KL) divergence, which requires the distribution of the objective function to be exponentially bounded.
Using the portfolio selection problem as an example, our comprehensive testings on multivariate heavy-tail datasets, both synthetic and real-world, shows that this business-interpretation approach is indeed useful and insightful.

Submitted 21 September, 2020; v1 submitted 3 June, 2019; originally announced June 2019.

Comments: 8 pages, 3 figures, Accepted for publication in ICAIF 2020

arXiv:1906.01923 [pdf, other]

Neural Learning of Online Consumer Credit Risk

Authors: Di Wang, Qi Wu, Wen Zhang

Abstract: This paper takes a deep learning approach to understand consumer credit risk when e-commerce platforms issue unsecured credit to finance customers' purchase. The "NeuCredit" model can capture both serial dependences in multi-dimensional time series data when event frequencies in each dimension differ. It also captures nonlinear cross-sectional interactions among different time-evolving features. Also, the predicted default probability is designed to be interpretable such that risks can be decomposed into three components: the subjective risk indicating the consumers' willingness to repay, the objective risk indicating their ability to repay, and the behavioral risk indicating consumers' behavioral differences. Using a unique dataset from one of the largest global e-commerce platforms, we show that the inclusion of shopping behavioral data, besides conventional payment records, requires a deep learning approach to extract the information content of these data, which turns out significantly enhancing forecasting performance than the traditional machine learning methods.

Submitted 5 June, 2019; originally announced June 2019.

Comments: 49 pages, 11 tables, 7 figures

arXiv:1905.13425 [pdf, other]

Cross-sectional Learning of Extremal Dependence among Financial Assets

Authors: Xing Yan, Qi Wu, Wen Zhang

Abstract: We propose a novel probabilistic model to facilitate the learning of multivariate tail dependence of multiple financial assets. Our method allows one to construct from known random vectors, e.g., standard normal, sophisticated joint heavy-tailed random vectors featuring not only distinct marginal tail heaviness, but also flexible tail dependence structure. The novelty lies in that pairwise tail dependence between any two dimensions is modeled separately from their correlation, and can vary respectively according to its own parameter rather than the correlation parameter, which is an essential advantage over many commonly used methods such as multivariate $t$ or elliptical distribution. It is also intuitive to interpret, easy to track, and simple to sample comparing to the copula approach. We show its flexible tail dependence structure through simulation. Coupled with a GARCH model to eliminate serial dependence of each individual asset return series, we use this novel method to model and forecast multivariate conditional distribution of stock returns, and obtain notable performance improvements in multi-dimensional coverage tests. Besides, our empirical finding about the asymmetry of tails of the idiosyncratic component as well as the market component is interesting and worth to be well studied in the future.

Submitted 27 October, 2019; v1 submitted 31 May, 2019; originally announced May 2019.

Journal ref: Advances in Neural Information Processing Systems, pages 3852-3862, 2019

Showing 1–24 of 24 results for author: Wu, Q