-
On the Cryptomorphism between Davis' Subset Lattices, Atomic Lattices, and Closure Systems under T1 Separation Axiom
Authors:
Dmitry I. Ignatov
Abstract:
In this paper we count set closure systems (also known as Moore families) for the case when all single-element sets are closed. In particular, we give the numbers of such strict (empty set included) and non-strict families for the base set of size $n=6$. We also provide the number of such inequivalent Moore families with respect to all permutations of the base set up to $n=6$. A search of the OEIS and the existing literature revealed that the numbers found coincide with the entry for D. M. Davis' set union lattice (OEIS A235604, up to $n=5$) and with $|\mathcal L_n|$, the number of atomic lattices on $n$ atoms, obtained by S. Mapes (up to $n=6$), respectively. Thus we study all those cases, establish one-to-one correspondences between them via Galois adjunctions and Formal Concept Analysis, and provide the reader with two of our enumerative algorithms as well as with the results of these algorithms used for additional tests. Other results include the largest size of intersection-free families for $n=6$ plus our conjecture for $n=7$, an upper bound for the number of atomic lattices $\mathcal L_n$, and some structural properties of $\mathcal L_n$ based on the theory of extremal lattices.
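The objects counted in this abstract can be illustrated with a naive brute-force sketch. It is not one of the paper's two enumerative algorithms, which are not reproduced in this listing. It assumes a Moore family on $\{0,\dots,n-1\}$ is a family of subsets that contains the full set and is closed under intersection, encodes subsets as bitmasks, and counts labeled families in which every singleton is closed. The function names are hypothetical, and the approach is only feasible for very small $n$.

```python
from itertools import combinations


def is_intersection_closed(family):
    """True if the pairwise intersection of any two members is also a member."""
    return all((a & b) in family for a, b in combinations(family, 2))


def count_t1_moore_families(n):
    """Count families on {0..n-1} that contain the full set and every
    singleton and are closed under intersection (brute force, tiny n only)."""
    full = (1 << n) - 1
    # Sets that must be present: the base set and all singletons.
    required = {full} | {1 << i for i in range(n)}
    # Every other subset (including the empty set) may or may not be present.
    optional = [s for s in range(full + 1) if s not in required]
    count = 0
    for mask in range(1 << len(optional)):
        chosen = {s for j, s in enumerate(optional) if (mask >> j) & 1}
        if is_intersection_closed(required | chosen):
            count += 1
    return count


for n in range(2, 5):
    print(n, count_t1_moore_families(n))
```

For $n=3$ this prints 8 (any subset of the three 2-element sets may be added) and for $n=4$ it prints 545. The number of candidate families doubles with every optional subset, so this sketch does not reach the $n=6$ values discussed in the abstract.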
Submitted 25 September, 2022;
originally announced September 2022.
-
On Interpretability and Similarity in Concept-Based Machine Learning
Authors:
Léonard Kwuida,
Dmitry I. Ignatov
Abstract:
Machine Learning (ML) provides important techniques for classification and prediction. Most of these are black-box models for users and do not provide decision-makers with an explanation. For the sake of transparency or greater validity of decisions, the need to develop explainable/interpretable ML methods is becoming increasingly important. Certain questions need to be addressed:
How does an ML procedure derive the class for a particular entity? Why does a particular clustering emerge from a particular unsupervised ML procedure? What can we do if the number of attributes is very large? What are the possible reasons for the mistakes for concrete cases and models?
For binary attributes, Formal Concept Analysis (FCA) offers techniques in terms of intents of formal concepts, and thus provides plausible reasons for model prediction. However, from the interpretable machine learning viewpoint, we still need to provide decision-makers with the importance of individual attributes to the classification of a particular object, which may facilitate explanations by experts in various domains with high-cost errors like medicine or finance.
We discuss how notions from cooperative game theory can be used to assess the contribution of individual attributes in classification and clustering processes in concept-based machine learning. To address the third question, we present some ideas on how to reduce the number of attributes using similarities in large contexts.
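As a rough illustration of how a cooperative-game index can score individual attributes, the sketch below computes exact Shapley values by averaging marginal contributions over all orderings of the players. The abstract does not say which index or which value function the paper uses, so both are assumptions here: the Shapley value is one standard choice, and `toy_value` is a hypothetical game in which a coalition of attributes is worth 1 only if it contains both `a` and `b`.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution of each player
    over all orderings (exponential cost, small player sets only)."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    return {p: t / len(orders) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical value function: worth 1 iff attributes a and b are both present."""
    return 1 if {"a", "b"} <= coalition else 0


scores = shapley_values(["a", "b", "c"], toy_value)
print(scores)
```

In this toy game `a` and `b` each receive 1/2 and `c` receives 0, which matches the intuition that `c` never changes the outcome.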
Submitted 25 February, 2021;
originally announced February 2021.
-
Mixed Integer Programming for Searching Maximum Quasi-Bicliques
Authors:
Dmitry I. Ignatov,
Polina Ivanova,
Albina Zamaletdinova
Abstract:
This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its "almost" complete subgraph. The relaxation of completeness can be understood in various ways; here, we assume that the subgraph is a $γ$-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least $γ\in (0,1]$. For a bigraph and fixed $γ$, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal for a given graph. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using a least-squares criterion similar to the one exploited by the triclustering method TriBox.
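The density condition in this abstract can be made concrete with a small sketch. It is an exhaustive search on a toy graph, not any of the MIP models from the paper. Two assumptions are made: density is taken as the number of edges present between the two vertex subsets divided by the number of possible edges, and "size" is measured as the total number of vertices. All function names and the example graph are hypothetical.

```python
from itertools import combinations


def density(edges, left, right):
    """Fraction of the |left| * |right| possible edges that are present."""
    if not left or not right:
        return 0.0
    present = sum((u, v) in edges for u in left for v in right)
    return present / (len(left) * len(right))


def is_quasi_biclique(edges, left, right, gamma):
    """True if the induced bipartite subgraph has density at least gamma."""
    return density(edges, left, right) >= gamma


def max_quasi_biclique(edges, left_nodes, right_nodes, gamma):
    """Exhaustive search for a gamma-quasi-biclique with the most vertices."""
    best = (0, (), ())
    for i in range(1, len(left_nodes) + 1):
        for xs in combinations(left_nodes, i):
            for j in range(1, len(right_nodes) + 1):
                for ys in combinations(right_nodes, j):
                    if i + j > best[0] and is_quasi_biclique(edges, xs, ys, gamma):
                        best = (i + j, xs, ys)
    return best


# Toy bigraph with 6 of the 9 possible edges.
edges = {("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("c", 1)}
for gamma in (1.0, 0.8, 0.6):
    print(gamma, max_quasi_biclique(edges, ["a", "b", "c"], [1, 2, 3], gamma))
```

On this toy graph the largest solution has 4 vertices for $γ=1$ (an ordinary biclique), 5 vertices for $γ=0.8$, and all 6 vertices for $γ=0.6$, showing how relaxing $γ$ admits larger subgraphs.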
Submitted 23 February, 2020;
originally announced February 2020.