Statistics > Methodology
[Submitted on 21 Aug 2025]
Title: Clustering-based aggregate value regression
Abstract: In many practical settings, the main interest is in forecasting aggregate values rather than individual ones. For instance, electricity companies need to forecast the total electricity demand in a specific region to ensure reliable grid operation and resource allocation. However, to our knowledge, statistical learning aimed specifically at forecasting aggregate values is not yet well established. In particular, the relationship between forecast error and the number of clusters has not been well studied, because clustering is usually treated as unsupervised learning. This study introduces a novel forecasting method that focuses on aggregate values in the linear regression model. We call it Aggregate Value Regression (AVR); it is constructed by combining all regression models into a single model. When the number of regression models to be combined is large, AVR must estimate a huge number of parameters, which results in overparameterization. To address this issue, we introduce a hierarchical clustering technique, referred to as AVR-C (C stands for clustering). In this approach, several clusters of regression models are constructed, and AVR is performed within each cluster. AVR-C gives rise to a novel bias-variance trade-off theory under the assumption of a misspecified model, in which the number of clusters characterizes model complexity. Monte Carlo simulations are conducted to investigate the behavior of the training and test errors of the proposed clustering technique. The bias-variance trade-off theory is also demonstrated through an analysis of electricity demand forecasting.
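The cluster-then-regress idea described in the abstract can be illustrated with a minimal sketch. This is only one possible reading, not the paper's estimator: it assumes that the series in a cluster share a coefficient vector, so their predictors and responses are summed before a single least-squares fit, and that cluster labels are supplied by the caller (in practice they might come from a hierarchical clustering routine). The function names `fit_avr_c` and `predict_aggregate`, the array layout, and the simulated data are all hypothetical.

```python
import numpy as np


def fit_avr_c(X, y, labels):
    """Fit one least-squares model per cluster on summed data.

    X: (K, n, p) predictors for K individual series.
    y: (K, n) responses for the same series.
    labels: (K,) integer cluster id of each series.
    ASSUMPTION: series in a cluster share coefficients, so the cluster
    aggregate is regressed on the sum of the members' predictors.
    """
    coefs = {}
    for g in np.unique(labels):
        members = labels == g
        Xg = X[members].sum(axis=0)  # (n, p) summed predictors
        yg = y[members].sum(axis=0)  # (n,) cluster aggregate
        beta, *_ = np.linalg.lstsq(Xg, yg, rcond=None)
        coefs[int(g)] = beta
    return coefs


def predict_aggregate(X, labels, coefs):
    """Forecast the total over all series by summing cluster forecasts."""
    total = np.zeros(X.shape[1])
    for g, beta in coefs.items():
        total += X[labels == g].sum(axis=0) @ beta
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, n_train, n_test, p = 12, 30, 500, 3
    true_labels = np.repeat([0, 1, 2], 4)   # hypothetical ground truth
    true_betas = rng.normal(size=(3, p))

    def simulate(n):
        X = rng.normal(size=(K, n, p))
        signal = np.einsum("knp,kp->kn", X, true_betas[true_labels])
        return X, signal + rng.normal(size=(K, n))

    X_tr, y_tr = simulate(n_train)
    X_te, y_te = simulate(n_test)

    # The number of clusters acts as the complexity knob.
    candidates = {
        "1 cluster": np.zeros(K, dtype=int),
        "3 clusters": true_labels,
        "12 clusters": np.arange(K),
    }
    for name, labels in candidates.items():
        coefs = fit_avr_c(X_tr, y_tr, labels)
        pred = predict_aggregate(X_te, labels, coefs)
        mse = np.mean((pred - y_te.sum(axis=0)) ** 2)
        print(f"{name}: test MSE of the aggregate = {mse:.3f}")
```

Under these assumptions, the number of fitted parameters grows with the number of clusters (p per cluster), so comparing the printed test errors across the three label choices gives a rough feel for how cluster count could trade bias against variance.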