Search | arXiv e-print repository

Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting

Authors: Luka Hobor, Mario Brcic, Lidija Polutnik, Ante Kapetanovic

Abstract: Accurate forecasting is key for all business planning. When estimated sales are too high, brick-and-mortar retailers may incur higher costs due to unsold inventories, higher labor and storage space costs, etc. On the other hand, when forecasts underestimate the level of sales, firms experience lost sales, shortages, and damage to the retailer's reputation in its relevant market. Accurate forecasting presents a competitive advantage for companies. It facilitates the achievement of revenue and profit goals and the execution of pricing strategy and tactics. In this study, we provide an exhaustive assessment of forecasting models applied to a high-resolution brick-and-mortar retail dataset. Our forecasting framework addresses the problems found in retail environments, including intermittent demand, missing values, and frequent product turnover. We compare tree-based ensembles (such as XGBoost and LightGBM) and state-of-the-art neural network architectures (including N-BEATS, NHITS, and the Temporal Fusion Transformer) across various experimental settings. Our results show that localized modeling strategies, especially those using tree-based models on individual groups with non-imputed data, consistently deliver superior forecasting accuracy and computational efficiency. In contrast, neural models benefit from advanced imputation methods, yet still fall short in handling the irregularities typical of physical retail data. These results further the practical understanding of model selection in retail environments and highlight the significance of data preprocessing for improving forecast performance.

Submitted 6 June, 2025; originally announced June 2025.

Comments: 20 total pages, 10 pages article, 10 pages appendix, 3 figures, 24 tables

arXiv:2305.02260

Standardized Benchmark Dataset for Localized Exposure to a Realistic Source at 10–90 GHz

Authors: Ante Kapetanovic, Dragan Poljak, Kun Li

Abstract: The lack of freely available standardized datasets is an aggravating factor in the development and performance testing of novel computational techniques in exposure assessment and dosimetry research. This hinders progress, as researchers are required to generate numerical data (field, power, and temperature distributions) anew using simulation software for each exposure scenario. Besides being time-consuming, this approach is highly susceptible to errors that occur during the configuration of the electromagnetic model. To address this issue, in this paper, the limited available data on the incident power density and resultant maximum temperature rise on the skin surface, considering various steady-state exposure scenarios at 10–90 GHz, have been statistically modeled. The synthetic data have been sampled from the fitted multivariate statistical distribution with respect to predetermined dosimetric constraints. We thus present a comprehensive, open-source dataset of high-fidelity numerical data covering various exposures to a realistic source. Furthermore, different surrogate models for predicting the maximum temperature rise on the skin surface were fitted to the synthetic dataset. All surrogate models were tested on the originally available data, where satisfactory predictive performance was demonstrated. A simple technique of combining quadratic polynomial and tensor-product spline surrogates, each operating on its own cluster of data, achieved the lowest mean absolute error of 0.058 °C. Overall, the experimental results therefore indicate the validity of the proposed synthetic dataset.

Submitted 3 May, 2023; originally announced May 2023.

Comments: 6 pages, 3 figures, in proceedings of BioEM2023

arXiv:2011.06861

IoT Wallet: Machine Learning-based Sensor Portfolio Application

Authors: Petar Šolić, Ante Lojić Kapetanović, Tomislav Županović, Ivo Kovačević, Toni Perković, Petar Popovski

Abstract: In this paper, an application for building a sensor wallet is presented. Currently, the system collects sensor data from The Things Network (TTN) cloud system, stores the data in the Influx database, and presents the processed data on the user dashboard. Depending on the type of user, the data can be view-only or controlled, and the top user can register sensors with the system. Moreover, the system can notify users based on rules that can be adjusted through the user interface. The special feature of the system is a machine learning service that can be used in various scenarios; it is presented through a case study that gives a novel approach to estimating soil moisture from the signal strength of a given underground LoRa beacon node.

Submitted 13 November, 2020; originally announced November 2020.

Comments: 5 pages, 6 figures, in proceedings of the 5th International Conference on Smart and Sustainable Technologies 2020, SpliTech2020

Showing 1–3 of 3 results for author: Kapetanovic, A