Search | arXiv e-print repository

arXiv:2411.19285 [pdf, other]

BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning

Authors: Jianming Pan, Zeqi Ye, Xiao Yang, Xu Yang, Weiqing Liu, Lewen Wang, Jiang Bian

Abstract: Data-driven decision-making processes increasingly utilize end-to-end learnable deep neural networks to render final decisions. Sometimes, the output of the forward functions in certain layers is determined by the solutions to mathematical optimization problems, leading to the emergence of differentiable optimization layers that permit gradient back-propagation. However, real-world scenarios often involve large-scale datasets and numerous constraints, presenting significant challenges. Current methods for differentiating optimization problems typically rely on implicit differentiation, which necessitates costly computations on the Jacobian matrices, resulting in low efficiency. In this paper, we introduce BPQP, a differentiable convex optimization framework designed for efficient end-to-end learning. To enhance efficiency, we reformulate the backward pass as a simplified and decoupled quadratic programming problem by leveraging the structural properties of the KKT matrix. This reformulation enables the use of first-order optimization algorithms in calculating the backward pass gradients, allowing our framework to potentially utilize any state-of-the-art solver. As solver technologies evolve, BPQP can continuously adapt and improve its efficiency. Extensive experiments on both simulated and real-world datasets demonstrate that BPQP achieves a significant improvement in efficiency--typically an order of magnitude faster in overall execution time compared to other differentiable optimization layers. Our results not only highlight the efficiency gains of BPQP but also underscore its superiority over differentiable optimization layer baselines.

Submitted 29 December, 2024; v1 submitted 28 November, 2024; originally announced November 2024.

Comments: NeurIPS 2024 Spotlight
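
Illustration for the entry above: for a QP with equality constraints only, the implicit-differentiation backward pass reduces to one linear solve with the KKT matrix, and that linear system is itself the optimality condition of a second, simpler QP. The NumPy sketch below shows this special case. It is not the BPQP implementation: the function names `solve_eq_qp` and `backward` are hypothetical, inequality constraints are left out, and a dense direct solve stands in for the first-order solvers the abstract mentions.

```python
import numpy as np


def solve_eq_qp(P, q, A, b):
    """Forward pass: minimize 0.5 x^T P x + q^T x  subject to  A x = b.

    Assumes P is positive definite and A has full row rank, so the
    KKT matrix K = [[P, A^T], [A, 0]] is nonsingular.
    Returns the primal solution x, the multipliers nu, and K.
    """
    n, m = P.shape[0], A.shape[0]
    K = np.block([[P, A.T], [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    return sol[:n], sol[n:], K


def backward(K, n, grad_x):
    """Backward pass: gradients of a loss L(x*) with respect to q and b.

    grad_x is dL/dx evaluated at x*. Solving K [u; w] = [grad_x; 0] is
    the KKT system of the QP
        minimize 0.5 u^T P u - grad_x^T u  subject to  A u = 0,
    so the backward pass is itself a QP. Because K is symmetric,
    dL/dq = -u and dL/db = w.
    """
    m = K.shape[0] - n
    sol = np.linalg.solve(K, np.concatenate([grad_x, np.zeros(m)]))
    return -sol[:n], sol[n:]


if __name__ == "__main__":
    # Small made-up problem; the loss is L(x*) = c^T x*, so dL/dx = c.
    P = np.array([[2.0, 0.5], [0.5, 1.0]])
    q = np.array([1.0, -1.0])
    A = np.array([[1.0, 1.0]])
    b = np.array([1.0])
    c = np.array([0.3, -0.7])

    x, nu, K = solve_eq_qp(P, q, A, b)
    dq, db = backward(K, n=2, grad_x=c)

    # Finite-difference check of dL/dq.
    eps = 1e-6
    fd = np.array([
        (c @ solve_eq_qp(P, q + eps * e, A, b)[0] - c @ x) / eps
        for e in np.eye(2)
    ])
    print("analytic dL/dq:", dq)
    print("finite-diff dL/dq:", fd)
```

The backward pass reuses the forward KKT matrix with a different right-hand side, so any solver that handles the forward problem could in principle be pointed at the backward one as well.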

arXiv:2409.17392 [pdf, other]

Trading through Earnings Seasons using Self-Supervised Contrastive Representation Learning

Authors: Zhengxin Joseph Ye, Bjoern Schuller

Abstract: Earnings release is a key economic event in the financial markets and crucial for predicting stock movements. Earnings data gives a glimpse into how a company is doing financially and can hint at where its stock might go next. However, the irregularity of its release cycle makes it a challenge to incorporate this data in a medium-frequency algorithmic trading model and the usefulness of this data fades fast after it is released, making it tough for models to stay accurate over time. Addressing this challenge, we introduce the Contrastive Earnings Transformer (CET) model, a self-supervised learning approach rooted in Contrastive Predictive Coding (CPC), aiming to optimise the utilisation of earnings data. To ascertain its effectiveness, we conduct a comparative study of CET against benchmark models across diverse sectors. Our research delves deep into the intricacies of stock data, evaluating how various models, and notably CET, handle the rapidly changing relevance of earnings data over time and over different sectors. The research outcomes shed light on CET's distinct advantage in extrapolating the inherent value of earnings data over time. Its foundation on CPC allows for a nuanced understanding, facilitating consistent stock predictions even as the earnings data ages. This finding about CET presents a fresh approach to better use earnings data in algorithmic trading for predicting stock price trends.

Submitted 25 September, 2024; originally announced September 2024.
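
Illustration for the entry above: Contrastive Predictive Coding, which the abstract names as the basis of CET, is trained with the InfoNCE objective, where each context vector must pick out its own target among the other targets in the batch. The sketch below is a generic NumPy version of that loss, not the CET model. The function name `info_nce`, the dot-product scoring, and the use of in-batch negatives are assumptions made for the illustration.

```python
import numpy as np


def info_nce(context, future):
    """Generic InfoNCE loss over a batch.

    context, future: arrays of shape (batch, dim). Row i of `context`
    and row i of `future` form the positive pair; every other row of
    `future` serves as a negative for context i.
    """
    scores = context @ future.T                          # (batch, batch)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal (the positive pairs) as targets.
    return -np.mean(np.diag(log_probs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ctx = rng.normal(size=(8, 16))
    # Targets that are noisy copies of the contexts should score a lower
    # loss than unrelated random targets.
    print("aligned:", info_nce(ctx, ctx + 0.1 * rng.normal(size=ctx.shape)))
    print("random: ", info_nce(ctx, rng.normal(size=ctx.shape)))
```

With uninformative embeddings the loss equals the log of the batch size, and it approaches zero as each context scores its own target far above the negatives.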

arXiv:2404.11276 [pdf, other]

Towards Data-Centric Automatic R&D

Authors: Haotian Chen, Xinjie Shen, Zeqi Ye, Wenjun Feng, Haoxue Wang, Xiao Yang, Xu Yang, Weiqing Liu, Jiang Bian

Abstract: The progress of humanity is driven by those successful discoveries accompanied by countless failed experiments. Researchers often seek the potential research directions by reading and then verifying them through experiments. The process imposes a significant burden on researchers. In the past decade, the data-driven black-box deep learning method has demonstrated its effectiveness in a wide range of real-world scenarios, which exacerbates the experimental burden of researchers and thus renders the potential successful discoveries veiled. Therefore, automating such a research and development (R&D) process is an urgent need. In this paper, we serve as the first effort to formalize the goal by proposing a Real-world Data-centric automatic R&D Benchmark, namely RD2Bench. RD2Bench benchmarks all the operations in data-centric automatic R&D (D-CARD) as a whole to navigate future work toward our goal directly. We focus on evaluating the interaction and synergistic effects of various model capabilities and aiding in selecting well-performing trustworthy models. Although RD2Bench is very challenging to the state-of-the-art (SOTA) large language model (LLM) named GPT-4, indicating ample research opportunities and more research efforts, LLMs possess promising potential to bring more significant development to D-CARD: They are able to implement some simple methods without adopting any additional techniques. We appeal to future work to take developing techniques for tackling automatic R&D into consideration, thus bringing the opportunities of the potential revolutionary upgrade to human productivity.

Submitted 30 July, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: 17 pages, 3 figures

arXiv:2312.15730 [pdf, other]

Deep Reinforcement Learning for Quantitative Trading

Authors: Maochun Xu, Zixun Lan, Zheng Tao, Jiawei Du, Zongao Ye

Abstract: Artificial Intelligence (AI) and Machine Learning (ML) are transforming the domain of Quantitative Trading (QT) through the deployment of advanced algorithms capable of sifting through extensive financial datasets to pinpoint lucrative investment openings. AI-driven models, particularly those employing ML techniques such as deep learning and reinforcement learning, have shown great prowess in predicting market trends and executing trades at a speed and accuracy that far surpass human capabilities. Its capacity to automate critical tasks, such as discerning market conditions and executing trading strategies, has been pivotal. However, persistent challenges exist in current QT methods, especially in effectively handling noisy and high-frequency financial data. Striking a balance between exploration and exploitation poses another challenge for AI-driven trading agents. To surmount these hurdles, our proposed solution, QTNet, introduces an adaptive trading model that autonomously formulates QT strategies through an intelligent trading agent. Incorporating deep reinforcement learning (DRL) with imitative learning methodologies, we bolster the proficiency of our model. To tackle the challenges posed by volatile financial datasets, we conceptualize the QT mechanism within the framework of a Partially Observable Markov Decision Process (POMDP). Moreover, by embedding imitative learning, the model can capitalize on traditional trading tactics, nurturing a balanced synergy between discovery and utilization. For a more realistic simulation, our trading agent undergoes training using minute-frequency data sourced from the live financial market. Experimental findings underscore the model's proficiency in extracting robust market features and its adaptability to diverse market conditions.

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2310.11249 [pdf, other]

Leveraging Large Language Model for Automatic Evolving of Industrial Data-Centric R&D Cycle

Authors: Xu Yang, Xiao Yang, Weiqing Liu, Jinhui Li, Peng Yu, Zeqi Ye, Jiang Bian

Abstract: In the wake of relentless digital transformation, data-driven solutions are emerging as powerful tools to address multifarious industrial tasks such as forecasting, anomaly detection, planning, and even complex decision-making. Although data-centric R&D has been pivotal in harnessing these solutions, it often comes with significant costs in terms of human, computational, and time resources. This paper delves into the potential of large language models (LLMs) to expedite the evolution cycle of data-centric R&D. Assessing the foundational elements of data-centric R&D, including heterogeneous task-related data, multi-facet domain knowledge, and diverse computing-functional tools, we explore how well LLMs can understand domain-specific requirements, generate professional ideas, utilize domain-specific tools to conduct experiments, interpret results, and incorporate knowledge from past endeavors to tackle new challenges. We take quantitative investment research as a typical example of industrial data-centric R&D scenario and verified our proposed framework upon our full-stack open-sourced quantitative research platform Qlib and obtained promising results which shed light on our vision of automatic evolving of industrial data-centric R&D cycle.

Submitted 17 October, 2023; originally announced October 2023.

Comments: 29 pages, 11 figures

arXiv:2009.03094 [pdf, other]

Capturing dynamics of post-earnings-announcement drift using genetic algorithm-optimised supervised learning

Authors: Zhengxin Joseph Ye, Bjorn W. Schuller

Abstract: While Post-Earnings-Announcement Drift (PEAD) is one of the most studied stock market anomalies, the current literature is often limited in explaining this phenomenon by a small number of factors using simpler regression methods. In this paper, we use a machine learning based approach instead, and aim to capture the PEAD dynamics using data from a large group of stocks and a wide range of both fundamental and technical factors. Our model is built around the Extreme Gradient Boosting (XGBoost) and uses a long list of engineered input features based on quarterly financial announcement data from 1,106 companies in the Russell 1000 index between 1997 and 2018. We perform numerous experiments on PEAD predictions and analysis and have the following contributions to the literature. First, we show how Post-Earnings-Announcement Drift can be analysed using machine learning methods and demonstrate such methods' prowess in producing credible forecasting on the drift direction. It is the first time PEAD dynamics are studied using XGBoost. We show that the drift direction is in fact driven by different factors for stocks from different industrial sectors and in different quarters and XGBoost is effective in understanding the changing drivers. Second, we show that an XGBoost well optimised by a Genetic Algorithm can help allocate out-of-sample stocks to form portfolios with higher positive returns to long and portfolios with lower negative returns to short, a finding that could be adopted in the process of developing market neutral strategies. Third, we show how theoretical event-driven stock strategies have to grapple with ever changing market prices in reality, reducing their effectiveness. We present a tactic to remedy the difficulty of buying into a moving market when dealing with PEAD signals.

Submitted 7 September, 2020; originally announced September 2020.

Comments: 13 pages of main article plus 6 pages of data in appendix. 7 figures and 4 tables

arXiv:0706.3331 [pdf, ps, other]

A Model for Counterparty Risk with Geometric Attenuation Effect and the Valuation of CDS

Authors: Yunfen Bai, Xinhua Hu, Zhongxing Ye

Abstract: In this paper, a geometric function is introduced to reflect the attenuation speed of the impact of one firm's default on its partner. If two firms are competitors (partners), the default intensity of one firm will decrease (increase) abruptly when the other firm defaults. As time goes on, the impact will decrease gradually until it vanishes. In this model, the joint distribution and marginal distributions of default times are derived by employing the change of measure, so that we can value the fair swap premium of a CDS.

Submitted 22 June, 2007; originally announced June 2007.

Comments: 8 pages

MSC Class: 62P05

Showing 1–7 of 7 results for author: Ye, Z