-
Comparative Analysis of Modern Machine Learning Models for Retail Sales Forecasting
Authors:
Luka Hobor,
Mario Brcic,
Lidija Polutnik,
Ante Kapetanovic
Abstract:
Accurate forecasting is key for all business planning. When estimated sales are too high, brick-and-mortar retailers may incur higher costs due to unsold inventories and higher labor and storage space costs. On the other hand, when forecasts underestimate the level of sales, firms experience lost sales, shortages, and harm to the retailer's reputation in its relevant market. Accurate forecasting presents a competitive advantage for companies. It facilitates the achievement of revenue and profit goals and the execution of pricing strategy and tactics. In this study, we provide an exhaustive assessment of forecasting models applied to a high-resolution brick-and-mortar retail dataset. Our forecasting framework addresses the problems found in retail environments, including intermittent demand, missing values, and frequent product turnover. We compare tree-based ensembles (such as XGBoost and LightGBM) and state-of-the-art neural network architectures (including N-BEATS, NHITS, and the Temporal Fusion Transformer) across various experimental settings. Our results show that localized modeling strategies, especially those using tree-based models on individual groups with non-imputed data, consistently deliver superior forecasting accuracy and computational efficiency. In contrast, neural models benefit from advanced imputation methods, yet still fall short in handling the irregularities typical of physical retail data. These results advance the practical understanding of model selection in retail environments and highlight the significance of data preprocessing for improving forecast performance.
Submitted 6 June, 2025;
originally announced June 2025.
-
Standardized Benchmark Dataset for Localized Exposure to a Realistic Source at 10$-$90 GHz
Authors:
Ante Kapetanovic,
Dragan Poljak,
Kun Li
Abstract:
The lack of freely available standardized datasets is an aggravating factor in the development and performance testing of novel computational techniques in exposure assessment and dosimetry research. This hinders progress, as researchers are required to generate numerical data (field, power and temperature distributions) anew using simulation software for each exposure scenario. Besides being time-consuming, this approach is highly susceptible to errors that occur during the configuration of the electromagnetic model. To address this issue, in this paper, the limited available data on the incident power density and the resultant maximum temperature rise on the skin surface, considering various steady-state exposure scenarios at 10$-$90 GHz, have been statistically modeled. The synthetic data have been sampled from the fitted multivariate statistical distribution subject to predetermined dosimetric constraints. We thus present a comprehensive and open-source dataset composed of high-fidelity numerical data considering various exposures to a realistic source. Furthermore, different surrogate models for predicting the maximum temperature rise on the skin surface were fitted to the synthetic dataset. All surrogate models were tested on the originally available data, on which they demonstrated satisfactory predictive performance. A simple technique of combining quadratic polynomial and tensor-product spline surrogates, each operating on its own cluster of data, achieved the lowest mean absolute error of 0.058 °C. Overall, the experimental results indicate the validity of the proposed synthetic dataset.
Submitted 3 May, 2023;
originally announced May 2023.
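The sampling step described in the abstract above (fit a multivariate distribution to limited data, then draw synthetic samples that respect predetermined constraints) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not name the distribution family, so a multivariate normal is swapped in; the constraints are modeled as simple per-column bounds enforced by rejection sampling; and the function name `sample_constrained`, the bounds, and the toy input values are hypothetical placeholders, not values from the paper.

```python
import numpy as np


def sample_constrained(data, n_samples, lower, upper, seed=0, max_rounds=1000):
    """Fit a multivariate normal to `data` and draw samples within box bounds.

    Assumption: a Gaussian fit with rejection sampling stands in for the
    unspecified fitted distribution and dosimetric constraints.
    """
    rng = np.random.default_rng(seed)
    mean = data.mean(axis=0)            # per-column sample mean
    cov = np.cov(data, rowvar=False)    # sample covariance (columns are variables)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    kept, total = [], 0
    for _ in range(max_rounds):
        draws = rng.multivariate_normal(mean, cov, size=n_samples)
        # Keep only the rows in which every column lies inside its bounds.
        inside = np.all((draws >= lower) & (draws <= upper), axis=1)
        kept.append(draws[inside])
        total += int(inside.sum())
        if total >= n_samples:
            return np.vstack(kept)[:n_samples]
    raise RuntimeError("constraints rejected too many samples")


# Toy placeholder input, NOT data from the paper: column 0 mimics an incident
# power density, column 1 a temperature rise with an arbitrary linear trend.
toy_rng = np.random.default_rng(42)
ipd = toy_rng.uniform(5.0, 50.0, size=40)
temp = 0.02 * ipd + toy_rng.normal(0.0, 0.05, size=40)
toy = np.column_stack([ipd, temp])

synthetic = sample_constrained(
    toy, n_samples=200, lower=[0.0, 0.0], upper=[60.0, 2.0]
)
```

Rejection sampling is chosen here only because it is simple to read; it becomes inefficient when the constraints exclude most of the fitted distribution, in which case a truncated-distribution sampler would be a more suitable choice.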
-
IoT Wallet: Machine Learning-based Sensor Portfolio Application
Authors:
Petar Šolić,
Ante Lojić Kapetanović,
Tomislav Županović,
Ivo Kovačević,
Toni Perković,
Petar Popovski
Abstract:
In this paper, an application for building a sensor wallet is presented. Currently, the system collects sensor data from The Things Network (TTN) cloud system, stores the data in the Influx database, and presents the processed data on the user dashboard. Depending on the type of user, a user can only view the data, can also control it, or, as the top user, can register sensors in the system. Moreover, the system can notify users based on rules that can be adjusted through the user interface. The special feature of the system is a machine learning service that can be used in various scenarios. It is demonstrated through a case study that presents a novel approach to estimating soil moisture from the signal strength of an underground LoRa beacon node.
Submitted 13 November, 2020;
originally announced November 2020.