Search | arXiv e-print repository

LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data

Authors: Hanyu Zhang, Chuck Arvin, Dmitry Efimov, Michael W. Mahoney, Dominique Perrault-Joncas, Shankar Ramasubramanian, Andrew Gordon Wilson, Malcolm Wolff

Abstract: Modern time-series forecasting models often fail to make full use of rich unstructured information about the time series themselves. This lack of proper conditioning can lead to obvious model failures; for example, models may be unaware of the details of a particular product, and hence fail to anticipate seasonal surges in customer demand in the lead-up to major exogenous events like holidays for clearly relevant products. To address this shortcoming, this paper introduces a novel forecast post-processor -- which we call LLMForecaster -- that fine-tunes large language models (LLMs) to incorporate unstructured semantic and contextual information and historical data to improve the forecasts from an existing demand forecasting pipeline. In an industry-scale retail application, we demonstrate that our technique yields statistically significant forecast improvements across several sets of products subject to holiday-driven demand surges.

Submitted 3 December, 2024; originally announced December 2024.

Comments: Presented at NeurIPS Time Series in the Age of Large Models (2024)

arXiv:2411.10811 [pdf]

Detecting collusion in procurement auctions

Authors: Konstantin D. Efimov

Abstract: The study, aimed at detecting cartel collusion, involved analyzing decisions of the Russian Federal Antimonopoly Service and data on auctions. As a result, a machine learning model was developed that predicts with 91% accuracy the signs of collusion between bidders based on their history after dividing 40 auctions into test and training samples in a 30/70 ratio. Decomposition of the model using the Shapley vector allowed the interpretation of the decision-making process. The behavior of honest companies in auctions was also studied, confirmed by independent simulation validation.

Submitted 16 November, 2024; originally announced November 2024.

Comments: 22 pages, 9 figures, in Russian

arXiv:2411.05852 [pdf, other]

$\spadesuit$ SPADE $\spadesuit$ Split Peak Attention DEcomposition

Authors: Malcolm Wolff, Kin G. Olivares, Boris Oreshkin, Sunny Ruan, Sitan Yang, Abhinav Katoch, Shankar Ramasubramanian, Youxin Zhang, Michael W. Mahoney, Dmitry Efimov, Vincent Quenneville-Bélair

Abstract: Demand forecasting faces challenges induced by Peak Events (PEs) corresponding to special periods such as promotions and holidays. Peak events create significant spikes in demand followed by demand ramp down periods. Neural networks like MQCNN and MQT overreact to demand peaks by carrying over the elevated PE demand into subsequent Post-Peak-Event (PPE) periods, resulting in significantly over-biased forecasts. To tackle this challenge, we introduce a neural forecasting model called Split Peak Attention DEcomposition, SPADE. This model reduces the impact of PEs on subsequent forecasts by modeling forecasting as consisting of two separate tasks: one for PEs; and the other for the rest. Its architecture then uses masked convolution filters and a specialized Peak Attention module. We show SPADE's performance on a worldwide retail dataset with hundreds of millions of products. Our results reveal an overall PPE improvement of 4.5%, a 30% improvement for most affected forecasts after promotions and holidays, and an improvement in PE accuracy by 3.9%, relative to current production models.

Submitted 21 January, 2025; v1 submitted 6 November, 2024; originally announced November 2024.

Journal ref: 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Time Series in the Age of Large Models Workshop, 2024

arXiv:2012.15330 [pdf, other]

Sequential Deep Learning for Credit Risk Monitoring with Tabular Financial Data

Authors: Jillian M. Clements, Di Xu, Nooshin Yousefi, Dmitry Efimov

Abstract: Machine learning plays an essential role in preventing financial losses in the banking industry. Perhaps the most pertinent prediction task that can result in billions of dollars in losses each year is the assessment of credit risk (i.e., the risk of default on debt). Today, much of the gains from machine learning to predict credit risk are driven by gradient boosted decision tree models. However, these gains begin to plateau without the addition of expensive new data sources or highly engineered features. In this paper, we present our attempts to create a novel approach to assessing credit risk using deep learning that does not rely on new model inputs. We propose a new credit card transaction sampling technique to use with deep recurrent and causal convolution-based neural networks that exploits long historical sequences of financial data without costly resource requirements. We show that our sequential deep learning approach using a temporal convolutional network outperformed the benchmark non-sequential tree-based model, achieving significant financial savings and earlier detection of credit risk. We also demonstrate the potential for our approach to be used in a production environment, where our sampling technique allows for sequences to be stored efficiently in memory and used for fast online learning and inference.

Submitted 30 December, 2020; originally announced December 2020.

ACM Class: I.2.1

arXiv:2002.10816 [pdf, other]

Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs

Authors: Edouard Leurent, Denis Efimov, Odalric-Ambrym Maillard

Abstract: We consider the problem of robust and adaptive model predictive control (MPC) of a linear system, with unknown parameters that are learned along the way (adaptive), in a critical setting where failures must be prevented (robust). This problem has been studied from different perspectives by different communities. However, the existing theory deals only with the case of quadratic costs (the LQ problem), which limits applications to stabilisation and tracking tasks only. In order to handle more general (non-convex) costs that naturally arise in many practical problems, we carefully select and bring together several tools from different communities, namely non-asymptotic linear regression, recent results in interval prediction, and tree-based planning. Combining and adapting the theoretical guarantees at each layer is non-trivial, and we provide the first end-to-end suboptimality analysis for this setting. Interestingly, our analysis naturally adapts to handle many models and combines with a data-driven robust model selection strategy, which makes it possible to relax the modelling assumptions. Last, we strive to preserve tractability at every stage of the method, which we illustrate on two challenging simulated environments.

Submitted 21 October, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

arXiv:2002.02271 [pdf, other]

Using generative adversarial networks to synthesize artificial financial datasets

Authors: Dmitry Efimov, Di Xu, Luyang Kong, Alexey Nefedov, Archana Anandakrishnan

Abstract: Generative Adversarial Networks (GANs) have become very popular for generating realistic-looking images. In this paper, we propose to use GANs to synthesize artificial financial data for research and benchmarking purposes. We test this approach on three American Express datasets, and show that properly trained GANs can replicate these datasets with high fidelity. For our experiments, we define a novel type of GAN, and suggest methods for data preprocessing that allow good training and testing performance of GANs. We also discuss methods for evaluating the quality of generated data, and their comparison with the original real data.

Submitted 6 February, 2020; originally announced February 2020.

Journal ref: Robust AI in FS 2019 : NeurIPS 2019 Workshop on Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy, December 2019, Vancouver, Canada

arXiv:1903.00220 [pdf, other]

Approximate Robust Control of Uncertain Dynamical Systems

Authors: Edouard Leurent, Yann Blanco, Denis Efimov, Odalric-Ambrym Maillard

Abstract: This work studies the design of safe control policies for large-scale non-linear systems operating in uncertain environments. In such a case, the robust control framework is a principled approach to safety that aims to maximize the worst-case performance of a system. However, the resulting optimization problem is generally intractable for non-linear systems with continuous states. To overcome this issue, we introduce two tractable methods that are based either on sampling or on a conservative approximation of the robust objective. The proposed approaches are applied to the problem of autonomous driving.

Submitted 1 March, 2019; originally announced March 2019.

Journal ref: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018) Workshop, Dec 2018, Montréal, Canada

Showing 1–7 of 7 results for author: Efimov, D