-
Attribute Dependencies for Data with Grades
Authors:
Radim Belohlavek,
Vilem Vychodil
Abstract:
This paper examines attribute dependencies in data that involve grades, such as a grade to which an object is red or a grade to which two objects are similar. We thus extend the classical agenda by allowing graded, or fuzzy, attributes instead of Boolean attributes in the case of attribute implications, and by allowing approximate match based on degrees of similarity instead of exact match in the case of functional dependencies. In a sense, we move from bivalence, inherently present in the now-available theories of dependencies, to a more flexible setting that involves grades. Such a shift has far-reaching consequences. We argue that a reasonable theory of dependencies may be developed by making use of mathematical fuzzy logic. Namely, the theory of dependencies is then based on a solid logical calculus in the same way that the classical dependencies are based on classical logic. For instance, rather than handling degrees of similarity in an ad hoc manner, we consistently treat them as truth values, the same way as true (match) and false (mismatch) are treated in classical theories. In addition, several notions intuitively embraced in the presence of grades, such as a degree of validity of a particular dependence or a degree of entailment, naturally emerge and receive a conceptually clean treatment in the presented approach. In the paper, we discuss motivations, provide basic notions of syntax and semantics, and develop basic results which include entailment of dependencies, associated closure structures, a logic of dependencies with two versions of the completeness theorem, results and algorithms regarding complete non-redundant sets of dependencies, the relationship to and a possible reductionist interface to classical dependencies, and the relationship to functional dependencies over domains with similarity.
Submitted 10 February, 2014;
originally announced February 2014.
-
From-Below Approximations in Boolean Matrix Factorization: Geometry and New Algorithm
Authors:
Radim Belohlavek,
Martin Trnecka
Abstract:
We present new results on Boolean matrix factorization and a new algorithm based on these results. The results emphasize the significance of factorizations that provide from-below approximations of the input matrix. While the previously proposed algorithms do not consider the possibly different significance of different matrix entries, our results help measure such significance and suggest where to focus when computing factors. An experimental evaluation of the new algorithm on both synthetic and real data demonstrates good performance, both in the coverage achieved by the first k factors and in the small number of factors needed for exact decomposition, and indicates that the algorithm outperforms the available ones in these terms. We also propose future research topics.
Submitted 20 June, 2013;
originally announced June 2013.
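To illustrate the terms used in the abstract above, here is a minimal sketch of the Boolean matrix product, of a from-below approximation, and of coverage by the first k factors. It is not the algorithm from the paper: the function names (`boolean_product`, `is_from_below`, `coverage`) and the small example matrices are assumptions made for illustration only.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is OR over l of (A[i, l] AND B[l, j])."""
    return (A.astype(int) @ B.astype(int)) > 0

def is_from_below(I, A, B):
    """True if A o B has no 1 in a position where the input matrix I has a 0."""
    return not np.any(boolean_product(A, B) & ~I)

def coverage(I, A, B):
    """Fraction of the 1s of I that are reproduced by A o B."""
    return np.sum(boolean_product(A, B) & I) / np.sum(I)

# Hypothetical 3x3 input matrix and a two-factor decomposition of it.
I = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=bool)
A = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=bool)   # object-factor matrix
B = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=bool)  # factor-attribute matrix

# Using only the first k = 1 factor gives a from-below approximation of I.
k = 1
print(is_from_below(I, A[:, :k], B[:k, :]))
print(coverage(I, A[:, :k], B[:k, :]))
# Using both factors reconstructs I exactly.
print(np.array_equal(boolean_product(A, B), I))
```

In this sketch each factor is a rectangle of 1s contained in the input matrix, so adding factors can only increase coverage and never introduces a 1 where the input has a 0.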
-
Discovery of factors in matrices with grades
Authors:
Radim Belohlavek,
Vilem Vychodil
Abstract:
We present an approach to decomposition and factor analysis of matrices with ordinal data. The matrix entries are grades to which objects represented by rows satisfy attributes represented by columns, e.g. grades to which an image is red, a product has a given feature, or a person performs well in a test. We assume that the grades form a bounded scale equipped with certain aggregation operators and conform to the structure of a complete residuated lattice. We present a greedy approximation algorithm for the problem of decomposing such a matrix into a product of two matrices with grades under the restriction that the number of factors be small. Our algorithm is based on a geometric insight provided by a theorem identifying particular rectangular-shaped submatrices as optimal factors for the decompositions. These factors correspond to formal concepts of the input data and allow an easy interpretation of the decomposition. We present illustrative examples and an experimental evaluation.
Submitted 6 March, 2013;
originally announced March 2013.
-
Sensitivity Analysis for Declarative Relational Query Languages with Ordinal Ranks
Authors:
Radim Belohlavek,
Lucie Urbanova,
Vilem Vychodil
Abstract:
We present a sensitivity analysis for results of query executions in a relational model of data extended by ordinal ranks. The underlying model of data results from Codd's ordinary model of data, in which we consider ordinal ranks of tuples in data tables expressing degrees to which tuples match queries. In this setting, we show that ranks assigned to tuples are insensitive to small changes, i.e., small changes in the input data do not yield large changes in the results of queries.
Submitted 28 September, 2011;
originally announced September 2011.