Cluster-Based Random Forest Visualization and Interpretation

Sondag, Max; Meinecke, Christofer; Collaris, Dennis; von Landesberger, Tatiana; Elzen, Stef van den

arXiv:2507.22665 (cs)

[Submitted on 30 Jul 2025]

Abstract: Random forests are a machine learning method for automatic classification that combines a multitude of decision trees. While random forests often perform and generalize better than a single decision tree, they are also harder to interpret. This paper presents a visualization method and system to increase the interpretability of random forests. We cluster similar trees, which enables users to interpret how the model performs in general without needing to analyze each individual decision tree in detail or to rely on an oversimplified summary of the full forest. To meaningfully cluster the decision trees, we introduce a new distance metric that takes into account both the decision rules and the predictions of a pair of decision trees. We also propose two new visualization methods that visualize both clustered and individual decision trees: (1) the Feature Plot, which visualizes the topological position of features in the decision trees, and (2) the Rule Plot, which visualizes the decision rules of the decision trees. We demonstrate the efficacy of our approach through a case study on the "Glass" dataset, which is a relatively complex standard machine learning dataset, as well as a small user study.
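To make the idea of clustering trees by a combined distance more concrete, the following is a minimal Python sketch. It is not the metric defined in the paper, whose details the abstract does not give. As stand-ins, it assumes that rule similarity can be approximated by the Jaccard distance between the sets of features two trees split on, that prediction similarity is the fraction of samples on which two trees disagree, and that the two are mixed with a hypothetical weight `alpha`. The function names, the `threshold` parameter, the single-linkage grouping, and the toy data are all illustrative assumptions.

```python
from itertools import combinations


def rule_distance(features_a, features_b):
    """Jaccard distance between the feature sets two trees split on (assumed proxy for rule similarity)."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def prediction_distance(preds_a, preds_b):
    """Fraction of samples on which two trees predict different classes."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction vectors must have the same length")
    if not preds_a:
        return 0.0
    return sum(p != q for p, q in zip(preds_a, preds_b)) / len(preds_a)


def tree_distance(tree_a, tree_b, alpha=0.5):
    """Weighted mix of rule and prediction distance; alpha is a hypothetical weight."""
    rules = rule_distance(tree_a["features"], tree_b["features"])
    preds = prediction_distance(tree_a["preds"], tree_b["preds"])
    return alpha * rules + (1.0 - alpha) * preds


def cluster_trees(trees, threshold=0.3, alpha=0.5):
    """Single-linkage grouping: trees within `threshold` of each other share a cluster."""
    parent = list(range(len(trees)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(trees)), 2):
        if tree_distance(trees[i], trees[j], alpha) <= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(trees)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy forest: each tree is summarized by the features it splits on and
# its predictions for four samples (made-up values for illustration only).
trees = [
    {"features": ["Na", "Mg"], "preds": [0, 1, 1, 0]},
    {"features": ["Na", "Mg"], "preds": [0, 1, 1, 1]},
    {"features": ["Ca", "Ba"], "preds": [1, 0, 0, 1]},
]

print(cluster_trees(trees))
```

In this toy forest the first two trees split on the same features and disagree on one of four samples, so they land in one cluster, while the third tree forms its own. A real system would derive the feature sets and predictions from a trained forest and would likely compare the thresholds in the decision rules as well, not only which features are used.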

Subjects:	Machine Learning (cs.LG); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2507.22665 [cs.LG]
	(or arXiv:2507.22665v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2507.22665

Submission history

From: Max Sondag
[v1] Wed, 30 Jul 2025 13:22:28 UTC (22,991 KB)
