-
Formal Models of Active Learning from Contrastive Examples
Authors:
Farnam Mansouri,
Hans U. Simon,
Adish Singla,
Yuxin Chen,
Sandra Zilles
Abstract:
Machine learning can greatly benefit from providing learning algorithms with pairs of contrastive training examples -- typically pairs of instances that differ only slightly, yet have different class labels. Intuitively, the difference in the instances helps explain the difference in the class labels. This paper proposes a theoretical framework in which the effect of various types of contrastive examples on active learners is studied formally. The focus is on the sample complexity of learning concept classes and how it is influenced by the choice of contrastive examples. We illustrate our results with geometric concept classes and classes of Boolean functions. Interestingly, we reveal a connection between learning from contrastive examples and the classical model of self-directed learning.
Submitted 18 June, 2025;
originally announced June 2025.
-
The Word Problem for Products of Symmetric Groups
Authors:
Hans U. Simon
Abstract:
The word problem for products of symmetric groups (WPPSG) is a well-known NP-complete problem. An input instance of this problem consists of "specification sets" $X_1,\ldots,X_m \subseteq \{1,\ldots,n\}$ and a permutation $\tau$ on $\{1,\ldots,n\}$. The sets $X_1,\ldots,X_m$ specify a subset of the symmetric group $\mathcal{S}_n$, and the question is whether the given permutation $\tau$ is a member of this subset. We discuss three subproblems of WPPSG and show that they can be solved efficiently. The subproblem WPPSG$_0$ is the restriction of WPPSG to specification sets all of which are sets of consecutive integers. The subproblem WPPSG$_1$ is the restriction of WPPSG to specification sets which have the Consecutive Ones Property. The subproblem WPPSG$_2$ is the restriction of WPPSG to specification sets which have what we call the Weak Consecutive Ones Property. WPPSG$_1$ is more general than WPPSG$_0$, and WPPSG$_2$ is more general than WPPSG$_1$. But the efficient algorithms that we use for solving WPPSG$_1$ and WPPSG$_2$ have, as a subroutine, the efficient algorithm for solving WPPSG$_0$.
Submitted 16 June, 2025;
originally announced June 2025.
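The membership question in the abstract above can be illustrated with a brute-force sketch. It rests on assumptions that the abstract does not spell out: the subset specified by $X_1,\ldots,X_m$ is taken to be the product $S(X_1) \cdot S(X_2) \cdots S(X_m)$, where $S(X_i)$ is the group of permutations that fix every point outside $X_i$; factors are applied left to right; and points are numbered from 0 rather than 1. The function names `sym_on` and `in_product` are hypothetical. The search is exponential and only meant for tiny inputs; it is not one of the efficient algorithms the paper describes.

```python
from itertools import permutations


def sym_on(subset, n):
    """All permutations of range(n) that fix every point outside `subset`."""
    pts = sorted(subset)
    result = []
    for image in permutations(pts):
        p = list(range(n))
        for src, dst in zip(pts, image):
            p[src] = dst
        result.append(tuple(p))
    return result


def in_product(spec_sets, tau, n):
    """Brute-force test whether tau lies in S(X_1) * ... * S(X_m).

    Assumed convention: the factor taken from X_1 is applied first.
    """
    reachable = {tuple(range(n))}  # start from the identity
    for subset in spec_sets:
        factor = sym_on(subset, n)
        # Extend every permutation reached so far by one more factor.
        reachable = {
            tuple(s[p[i]] for i in range(n))
            for p in reachable
            for s in factor
        }
    return tuple(tau) in reachable


if __name__ == "__main__":
    print(in_product([{0, 1}, {1, 2}], (2, 0, 1), 3))
    print(in_product([{0, 1}, {1, 2}], (2, 1, 0), 3))
```

With only the two specification sets {0, 1} and {1, 2}, the product contains four of the six permutations of three points, so some permutations are rejected; repeating {0, 1} as a third set makes every permutation reachable.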
-
The Hierarchy of Saturating Matching Numbers
Authors:
Hans U. Simon,
Jan Arne Telle
Abstract:
In this paper, we study three matching problems, all of which came up quite recently in the field of machine teaching. The cost of a matching is defined in such a way that, for some formal model of teaching, it equals (or bounds) the number of labeled examples needed to solve a given teaching task. We show how the cost parameters associated with these problems depend on each other and how they are related to other well-known combinatorial parameters (such as the VC-dimension).
Submitted 18 March, 2025;
originally announced March 2025.
-
RTD-Conjecture and Concept Classes Induced by Graphs
Authors:
Hans U. Simon
Abstract:
It is conjectured that the recursive teaching dimension of any finite concept class is upper-bounded by the VC-dimension of this class times a universal constant. In this paper, we confirm this conjecture for two rich families of concept classes where each class is induced by some graph $G$. For each $G$, we consider the class whose concepts represent stars in $G$ as well as the class whose concepts represent connected sets in $G$. We show that, for concept classes of this kind, the recursive teaching dimension either equals the VC-dimension or is less by $1$.
Submitted 13 February, 2025;
originally announced February 2025.
-
A Note on the Subcubes of the $n$-Cube
Authors:
Hans Ulrich Simon
Abstract:
In 1990, Béla Bollobás, Imre Leader and Andrew Radcliffe considered the following combinatorial problem: given three parameters k, n and q, find a set of k vertices in the binary n-cube which contains a maximal number of q-dimensional subcubes. It was shown that an optimal solution is given by the k vertices which coincide with the binary representations of the numbers 0, 1, ..., k-1. Two proofs were presented. The proof given by Bollobás and Leader is particularly elegant and short. Here we show that the other proof, the one given by Bollobás and Radcliffe, also becomes quite simple and short when it is combined with a lemma of Graham whose publication dates back to 1970. As a second application of Graham's lemma, we solve a recursive equation (related to the optimization problem discussed above) that might be considered interesting in its own right.
Submitted 4 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
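The optimization problem in the abstract above can be made concrete with a small counting sketch. It assumes that vertices of the n-cube are encoded as integers in range(2**n) and that a q-dimensional subcube is obtained by choosing q free bit positions and fixing the remaining ones. The names `submasks` and `count_subcubes` are hypothetical, and the code only counts subcubes by brute force; it does not reproduce either proof.

```python
from itertools import combinations


def submasks(mask):
    """Yield every bit pattern that uses only bits set in `mask`."""
    sub = mask
    while True:
        yield sub
        if sub == 0:
            break
        sub = (sub - 1) & mask


def count_subcubes(vertices, n, q):
    """Count the q-dimensional subcubes of the n-cube lying inside `vertices`."""
    vs = set(vertices)
    total = 0
    for free in combinations(range(n), q):
        mask = sum(1 << b for b in free)
        for v in vs:
            # v is the corner of the subcube with all free bits cleared.
            if v & mask == 0 and all((v | s) in vs for s in submasks(mask)):
                total += 1
    return total


if __name__ == "__main__":
    # The initial segment 0, 1, 2, 3 inside the 3-cube is a 2-dimensional face.
    print(count_subcubes(range(4), 3, 1))  # edges inside the face
    print(count_subcubes(range(4), 3, 2))  # the face itself
```

For small n, one can compare `count_subcubes(range(k), n, q)` against the maximum over all k-element vertex sets to see the stated optimality of the initial segment on examples.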
-
Greedy Matchings in Bipartite Graphs with Ordered Vertex Sets
Authors:
Hans U. Simon
Abstract:
We define and study greedy matchings in vertex-ordered bipartite graphs. It is shown that each vertex-ordered bipartite graph has a unique greedy matching. The proof uses (a weak form of) Newman's lemma. The vertex ordering is called a preference relation. Given a vertex-ordered bipartite graph, the goal is to match every vertex of one vertex class but to leave unmatched as many vertices of low preference as possible in the other concept class. We investigate how well greedy algorithms perform in this setting. It is shown that they have optimal performance provided that the vertex ordering is cleverly chosen. The study of greedy matchings is motivated by problems in learning theory such as illustrating or teaching concepts by means of labeled examples.
Submitted 9 February, 2024;
originally announced February 2024.
-
MAP- and MLE-Based Teaching
Authors:
Hans Ulrich Simon,
Jan Arne Telle
Abstract:
Imagine a learner L who tries to infer a hidden concept from a collection of observations. Building on the work [4] of Ferri et al., we assume the learner to be parameterized by priors P(c) and by c-conditional likelihoods P(z|c), where c ranges over all concepts in a given class C and z ranges over all observations in an observation set Z. L is called a MAP-learner (resp. an MLE-learner) if it thinks of a collection S of observations as a random sample and returns the concept with the maximum a-posteriori probability (resp. the concept which maximizes the c-conditional likelihood of S). Depending on whether L assumes that S is obtained from ordered or unordered sampling, and from sampling with or without replacement, we can distinguish four different sampling modes. Given a target concept c in C, a teacher for a MAP-learner L aims at finding a smallest collection of observations that causes L to return c. This approach leads in a natural manner to various notions of a MAP- or MLE-teaching dimension of a concept class C. Our main results are as follows. We show that this teaching model has some desirable monotonicity properties. We clarify how the four sampling modes are related to each other. For the important special case where concepts are subsets of a domain and observations are 0,1-labeled examples, we obtain some additional results. First of all, we give a graph-theoretic characterization of the MAP- and MLE-teaching dimension associated with an optimally parameterized MAP-learner. From this central result, some other ones are easy to derive. It is shown, for instance, that the MLE-teaching dimension is either equal to the MAP-teaching dimension or exceeds the latter by 1. It is shown furthermore that these dimensions can be bounded from above by the so-called antichain number, the VC-dimension and related combinatorial parameters. Moreover, they can be computed in polynomial time.
Submitted 11 July, 2023;
originally announced July 2023.
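The MAP- and MLE-learners described in the abstract above can be sketched as follows for a single sampling mode, namely ordered sampling with replacement, where the likelihood of a sample is the product of the individual likelihoods P(z|c). The dictionary layout, the function names `map_learner` and `mle_learner`, and the tie-breaking rule (the first maximizer in iteration order wins) are assumptions made for illustration and are not taken from the paper.

```python
from math import prod


def map_learner(priors, likelihoods, sample):
    """Return the concept c maximizing P(c) * product of P(z|c) over the sample."""
    return max(
        priors,
        key=lambda c: priors[c] * prod(likelihoods[c][z] for z in sample),
    )


def mle_learner(likelihoods, sample):
    """Return the concept c maximizing the product of P(z|c) over the sample."""
    return max(
        likelihoods,
        key=lambda c: prod(likelihoods[c][z] for z in sample),
    )


if __name__ == "__main__":
    # Hypothetical toy parameters: two concepts, two possible observations.
    priors = {"c1": 0.9, "c2": 0.1}
    likelihoods = {
        "c1": {"a": 0.5, "b": 0.5},
        "c2": {"a": 0.9, "b": 0.1},
    }
    print(map_learner(priors, likelihoods, ["a"]))
    print(mle_learner(likelihoods, ["a"]))
    print(map_learner(priors, likelihoods, ["a", "a", "a", "a"]))
```

In this toy setting a single observation "a" is enough to make the MLE-learner return c2, whereas the MAP-learner needs several copies of "a" before the likelihood outweighs the prior of c1.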
-
Minimum Tournaments with the Strong $S_k$-Property and Implications for Teaching
Authors:
Hans Ulrich Simon
Abstract:
A tournament is said to have the $S_k$-property if, for any set of $k$ players, there is another player who beats them all. Minimum tournaments having this property were explored thoroughly in the 1960s and the early 1970s. In this paper, we define a strengthening of the $S_k$-property that we name "strong $S_k$-property". We show, first, that several basic results on the weaker notion remain valid for the stronger notion (and the corresponding modification of the proofs requires only little extra effort). Second, it is demonstrated that the stronger notion has applications in the area of teaching. Specifically, we present an infinite family of concept classes all of which can be taught with a single example in the No-Clash model of teaching while, in order to teach a class $\mathcal{C}$ of this family in the recursive model of teaching, on the order of $\log|\mathcal{C}|$ examples are required. This is the first paper that presents a concrete and easily constructible family of concept classes which separates the No-Clash from the recursive model of teaching by more than a constant factor. The separation by a logarithmic factor is remarkable because the recursive teaching dimension is known to be bounded by $\log|\mathcal{C}|$ for any concept class $\mathcal{C}$.
Submitted 17 May, 2022;
originally announced May 2022.
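The (ordinary) $S_k$-property defined in the abstract above is easy to check by brute force on small tournaments. The sketch below assumes a tournament is given as a Boolean matrix `beats`, where `beats[i][j]` is true when player i beats player j. The helper names `has_sk_property` and `cyclic_tournament` are hypothetical. The example uses the 7-player tournament built from the quadratic residues 1, 2, 4 modulo 7; the strong $S_k$-property introduced in the paper is not covered here, since the abstract does not state its definition.

```python
from itertools import combinations


def has_sk_property(beats, k):
    """True if every set of k players is beaten, as a whole, by some other player."""
    players = range(len(beats))
    return all(
        any(
            all(beats[w][p] for p in group)
            for w in players
            if w not in group
        )
        for group in combinations(players, k)
    )


def cyclic_tournament(q, steps):
    """Player i beats player j exactly when (j - i) mod q lies in `steps`."""
    return [[(j - i) % q in steps for j in range(q)] for i in range(q)]


if __name__ == "__main__":
    # Quadratic-residue tournament on 7 players (residues 1, 2, 4 mod 7).
    t7 = cyclic_tournament(7, {1, 2, 4})
    print(has_sk_property(t7, 2))
    print(has_sk_property(t7, 3))
```

Each player of this 7-player tournament beats exactly three others, so at most 7 of the 35 three-player sets can have a common dominator, which is why the check for k = 3 fails while the check for k = 2 succeeds.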
-
Tournaments, Johnson Graphs, and NC-Teaching
Authors:
Hans U. Simon
Abstract:
Quite recently, a teaching model called "No-Clash Teaching" or simply "NC-Teaching" has been suggested that is provably optimal in the following strong sense. First, it satisfies Goldman and Mathias' collusion-freeness condition. Second, the NC-teaching dimension (= NCTD) is smaller than or equal to the teaching dimension with respect to any other collusion-free teaching model. It has also been shown that any concept class which has NC-teaching dimension $d$ and is defined over a domain of size $n$ can have at most $2^d \binom{n}{d}$ concepts. The main results in this paper are as follows. First, we characterize the maximum concept classes of NC-teaching dimension $1$ as classes which are induced by tournaments (= complete oriented graphs) in a very natural way. Second, we show that there exists a family $(\mathcal{C}_n)_{n\ge1}$ of concept classes such that the well-known recursive teaching dimension (= RTD) of $\mathcal{C}_n$ grows logarithmically in $n = |\mathcal{C}_n|$ while, for every $n\ge1$, the NC-teaching dimension of $\mathcal{C}_n$ equals $1$. Since the recursive teaching dimension of a finite concept class $\mathcal{C}$ is generally bounded by $\log|\mathcal{C}|$, the family $(\mathcal{C}_n)_{n\ge1}$ separates RTD from NCTD in the most striking way. The proof of existence of the family $(\mathcal{C}_n)_{n\ge1}$ makes use of the probabilistic method and random tournaments. Third, we improve the aforementioned upper bound $2^d\binom{n}{d}$ by a factor of order $\sqrt{d}$. The verification of the superior bound makes use of Johnson graphs and maximum subgraphs not containing large narrow cliques.
Submitted 5 May, 2022;
originally announced May 2022.
-
Optimal Collusion-Free Teaching
Authors:
David Kirkpatrick,
Hans U. Simon,
Sandra Zilles
Abstract:
Formal models of learning from teachers need to respect certain criteria to avoid collusion. The most commonly accepted notion of collusion-freeness was proposed by Goldman and Mathias (1996), and various teaching models obeying their criterion have been studied. For each model $M$ and each concept class $\mathcal{C}$, a parameter $M$-$\mathrm{TD}(\mathcal{C})$ refers to the teaching dimension of concept class $\mathcal{C}$ in model $M$---defined to be the number of examples required for teaching a concept, in the worst case over all concepts in $\mathcal{C}$.
This paper introduces a new model of teaching, called no-clash teaching, together with the corresponding parameter $\mathrm{NCTD}(\mathcal{C})$. No-clash teaching is provably optimal in the strong sense that, given any concept class $\mathcal{C}$ and any model $M$ obeying Goldman and Mathias's collusion-freeness criterion, one obtains $\mathrm{NCTD}(\mathcal{C})\le M$-$\mathrm{TD}(\mathcal{C})$. We also study a corresponding notion $\mathrm{NCTD}^+$ for the case of learning from positive data only, establish useful bounds on $\mathrm{NCTD}$ and $\mathrm{NCTD}^+$, and discuss relations of these parameters to the VC-dimension and to sample compression.
In addition to formulating an optimal model of collusion-free teaching, our main results are on the computational complexity of deciding whether $\mathrm{NCTD}^+(\mathcal{C})=k$ (or $\mathrm{NCTD}(\mathcal{C})=k$) for given $\mathcal{C}$ and $k$. We show some such decision problems to be equivalent to the existence question for certain constrained matchings in bipartite graphs. Our NP-hardness results for the latter are of independent interest in the study of constrained graph matchings.
Submitted 10 March, 2019;
originally announced March 2019.
-
On the Containment Problem for Linear Sets
Authors:
Hans U. Simon
Abstract:
It is well known that the containment problem (as well as the equivalence problem) for semilinear sets is $\log$-complete in $\Pi_2^p$. It was shown quite recently that the containment problem for multi-dimensional linear sets is already $\log$-complete in $\Pi_2^p$ (where hardness even holds for a unary encoding of the numerical input parameters). In this paper, we show that even the containment problem for $1$-dimensional linear sets (with binary encoding of the numerical input parameters) is $\log$-hard (and therefore also $\log$-complete) in $\Pi_2^p$. However, when both restrictions are combined (dimension $1$ and unary encoding), the problem becomes solvable in polynomial time.
Submitted 20 February, 2018; v1 submitted 12 October, 2017;
originally announced October 2017.
-
Preference-based Teaching
Authors:
Ziyuan Gao,
Christoph Ries,
Hans Ulrich Simon,
Sandra Zilles
Abstract:
We introduce a new model of teaching named "preference-based teaching" and a corresponding complexity parameter---the preference-based teaching dimension (PBTD)---representing the worst-case number of examples needed to teach any concept in a given concept class. Although the PBTD coincides with the well-known recursive teaching dimension (RTD) on finite classes, it is radically different on infinite ones: the RTD becomes infinite already for trivial infinite classes (such as half-intervals) whereas the PBTD evaluates to reasonably small values for a wide collection of infinite classes including classes consisting of so-called closed sets w.r.t. a given closure operator, including various classes related to linear sets over $\mathbb{N}_0$ (whose RTD had been studied quite recently) and including the class of Euclidean half-spaces. On top of presenting these concrete results, we provide the reader with a theoretical framework (of a combinatorial flavor) which helps to derive bounds on the PBTD.
Submitted 8 February, 2017; v1 submitted 6 February, 2017;
originally announced February 2017.