How Complex is your classification problem? A survey on measuring classification complexity

Lorena, Ana C.; Garcia, Luís P. F.; Lehmann, Jens; Souto, Marcilio C. P.; Ho, Tin K.

arXiv:1808.03591 (cs)

[Submitted on 10 Aug 2018 (v1), last revised 30 Dec 2020 (this version, v3)]

Abstract: Characteristics extracted from the training datasets of classification problems have proven to be effective predictors in a number of meta-analyses. Among them, measures of classification complexity can be used to estimate the difficulty of separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision boundary are among the known measures for this characterization. This information can support the formulation of new data-driven pre-processing and pattern recognition techniques, which can in turn be focused on the challenges highlighted by such characteristics of the problems. This paper surveys and analyzes measures that can be extracted from the training datasets in order to characterize the complexity of the respective classification problems. Their use in recent literature is also reviewed and discussed, which helps identify opportunities for future work in the area. Finally, the paper describes an R package named Extended Complexity Library (ECoL), which implements a set of complexity measures and is publicly available.

Comments:	Survey paper
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1808.03591 [cs.LG]
	(or arXiv:1808.03591v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1808.03591

Submission history

From: Luís Paulo Faina Garcia
[v1] Fri, 10 Aug 2018 15:38:14 UTC (482 KB)
[v2] Wed, 24 Jul 2019 19:43:41 UTC (443 KB)
[v3] Wed, 30 Dec 2020 20:33:52 UTC (1,332 KB)
