Showing 1–1 of 1 results for author: Banerji, C R S

  1. arXiv:2504.14094  [pdf, other]

    cs.LG cs.AI stat.ML

    Leakage and Interpretability in Concept-Based Models

    Authors: Enrico Parisini, Tapabrata Chakraborti, Chris Harbron, Ben D. MacArthur, Christopher R. S. Banerji

    Abstract: Concept Bottleneck Models aim to improve interpretability by predicting high-level intermediate concepts, representing a promising approach for deployment in high-risk scenarios. However, they are known to suffer from information leakage, whereby models exploit unintended information encoded within the learned concepts. We introduce an information-theoretic framework to rigorously characterise and…

    Submitted 19 May, 2025; v1 submitted 18 April, 2025; originally announced April 2025.

    Comments: 35 pages, 24 figures