Search | arXiv e-print repository

Contrastive Similarity Learning for Market Forecasting: The ContraSim Framework

Authors: Nicholas Vinden, Raeid Saqur, Zining Zhu, Frank Rudzicz

Abstract: We introduce the Contrastive Similarity Space Embedding Algorithm (ContraSim), a novel framework for uncovering the global semantic relationships between daily financial headlines and market movements. ContraSim operates in two key stages: (I) Weighted Headline Augmentation, which generates augmented financial headlines along with a semantic fine-grained similarity score, and (II) Weighted Self-Supervised Contrastive Learning (WSSCL), an extended version of classical self-supervised contrastive learning that uses the similarity metric to create a refined weighted embedding space. This embedding space clusters semantically similar headlines together, facilitating deeper market insights. Empirical results demonstrate that integrating ContraSim features into financial forecasting tasks improves classification accuracy from WSJ headlines by 7%. Moreover, leveraging an information density analysis, we find that the similarity spaces constructed by ContraSim intrinsically cluster days with homogeneous market movement directions, indicating that ContraSim captures market dynamics independent of ground truth labels. Additionally, ContraSim enables the identification of historical news days that closely resemble the headlines of the current day, providing analysts with actionable insights to predict market trends by referencing analogous past events.

Submitted 21 February, 2025; originally announced February 2025.

Comments: 8 pages, 3 appendices

ACM Class: I.2.4; I.2.6; I.5.1; J.1

arXiv:2406.15508 [pdf, other]

What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs

Authors: Raeid Saqur

Abstract: Machine learning techniques applied to the problem of financial market forecasting struggle with dynamic regime switching, or underlying correlation and covariance shifts in true (hidden) market variables. Drawing inspiration from the success of reinforcement learning in robotics, particularly in agile locomotion adaptation of quadruped robots to unseen terrains, we introduce an innovative approach that leverages the world knowledge of pretrained LLMs (aka 'privileged information' in robotics) and dynamically adapts them using intrinsic, natural market rewards via an LLM alignment technique we dub "Reinforcement Learning from Market Feedback" (**RLMF**). Strong empirical results demonstrate the efficacy of our method in adapting to regime shifts in financial markets, a challenge that has long plagued predictive models in this domain. The proposed algorithmic framework outperforms the best-performing SOTA LLM models on the existing (FLARE) benchmark stock-movement (SM) tasks by more than 15% in accuracy. On the recently proposed NIFTY SM task, our adaptive policy outperforms the best-performing SOTA trillion-parameter models like GPT-4. The paper details the dual-phase, teacher-student architecture and implementation of our model, the empirical results obtained, and an analysis of the role of language embeddings in terms of Information Gain.

Submitted 19 June, 2024; originally announced June 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2405.09747

ACM Class: I.2.0; I.2.6; I.2.7; I.2.9

arXiv:2406.02969 [pdf, other]

Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models

Authors: Raeid Saqur, Anastasis Kratsios, Florian Krach, Yannick Limmer, Jacob-Junqi Tian, John Willes, Blanka Horvath, Frank Rudzicz

Abstract: We propose MoE-F - a formalized mechanism for combining $N$ pre-trained Large Language Models (LLMs) for online time-series prediction by adaptively forecasting the best weighting of LLM predictions at every time step. Our mechanism leverages the conditional information in each expert's running performance to forecast the best combination of LLMs for predicting the time series in its next step. Diverging from static (learned) Mixture of Experts (MoE) methods, our approach employs time-adaptive stochastic filtering techniques to combine experts. By framing the expert selection problem as a finite state-space, continuous-time Hidden Markov model (HMM), we can leverage the Wonham-Shiryaev filter. Our approach first constructs $N$ parallel filters corresponding to each of the $N$ individual LLMs. Each filter proposes its best combination of LLMs, given the information that they have access to. Subsequently, the $N$ filter outputs are optimally aggregated to maximize their robust predictive power, and this update is computed efficiently via a closed-form expression, generating our ensemble predictor.
Our contributions are: **(I)** the MoE-F plug-and-play filtering harness algorithm, **(II)** theoretical optimality guarantees of the proposed filtering-based gating algorithm (via optimality guarantees for its parallel Bayesian filtering and its robust aggregation steps), and **(III)** empirical evaluation and ablative results using state-of-the-art foundational and MoE LLMs on a real-world __Financial Market Movement__ task where MoE-F attains a remarkable 17% absolute and 48.5% relative F1 measure improvement over the next best performing individual LLM expert predicting short-horizon market movement based on streaming news. Further, we provide empirical evidence of substantial performance gains in applying MoE-F over specialized models in the long-horizon time-series forecasting domain.

Submitted 20 February, 2025; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: 33 pages, 5 Appendix sections

MSC Class: 60J05; 60G35; 68T20; 68T42; 68T50 ACM Class: I.2.6; I.2.7; G.3

arXiv:2405.09747 [pdf, other]

NIFTY Financial News Headlines Dataset

Authors: Raeid Saqur, Ken Kato, Nicholas Vinden, Frank Rudzicz

Abstract: We introduce and make publicly available the NIFTY Financial News Headlines dataset, designed to facilitate and advance research in financial market forecasting using large language models (LLMs). This dataset comprises two distinct versions tailored for different modeling approaches: (i) NIFTY-LM, which targets supervised fine-tuning (SFT) of LLMs with an auto-regressive, causal language-modeling objective, and (ii) NIFTY-RL, formatted specifically for alignment methods (like reinforcement learning from human feedback (RLHF)) to align LLMs via rejection sampling and reward modeling. Each dataset version provides curated, high-quality data incorporating comprehensive metadata, market indices, and deduplicated financial news headlines systematically filtered and ranked to suit modern LLM frameworks. We also include experiments demonstrating some applications of the dataset in tasks like stock price movement and the role of LLM embeddings in information acquisition/richness. The NIFTY dataset along with utilities (like truncating prompt's context length systematically) are available on Hugging Face at https://huggingface.co/datasets/raeidsaqur/NIFTY.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:1801.00091 [pdf, other]

PrivySense: $\underline{Pri}$ce $\underline{V}$olatilit$\underline{y}$ based $\underline{Sen}$timent$\underline{s}$ $\underline{E}$stimation from Financial News using Machine Learning

Authors: Raeid Saqur, Nicole Langballe

Abstract: As machine learning ascends the peak of the computer science zeitgeist, the usage of and experimentation with sentiment analysis using various forms of textual data seem pervasive. The effect is especially pronounced in formulating securities trading strategies, for a plethora of reasons including the relative ease of implementation and the abundance of academic research suggesting automated sentiment analysis can be productively used in trading strategies. The source data for such analyzers spans a broad spectrum, including social media feeds, micro-blogs, real-time news feeds, ex-post financial data, etc. The abstract technique underlying these analyzers involves supervised learning of sentiment classification, where the classifier is trained on an annotated source corpus and accuracy is measured by testing how well the classifier generalizes to unseen test data from the corpus. After training and validation of the fitted models, the classifiers are used to execute trading strategies, and the corresponding returns are compared with appropriate benchmark returns (e.g., the S&P 500 returns). In this paper, we introduce $\underline{a\ novel\ technique\ of\ using\ price\ volatilities\ to\ empirically\ determine\ the\ sentiment\ in\ news\ data}$, instead of the traditional reverse approach. We also perform meta sentiment analysis by evaluating the efficacy of existing sentiment classifiers and the precise definition of sentiment in a securities trading context.
We scrutinize the efficacy of using human-annotated sentiment classification and the tacit assumptions that introduce subjective bias in existing financial news sentiment classifiers.

Submitted 21 February, 2018; v1 submitted 30 December, 2017; originally announced January 2018.

Comments: Initial draft, updates are w.i.p

ACM Class: I.2.0; I.2.1; I.2.6; I.2.7

Showing 1–5 of 5 results for author: Saqur, R