-
A Chase-based Approach to Consistent Answers of Analytic Queries in Star Schemas
Authors:
Dominique Laurent,
Nicolas Spyratos
Abstract:
We present an approach to computing consistent answers to analytic queries in data warehouses operating under a star schema and possibly containing missing values and inconsistent data. Our approach is based on earlier work concerning consistent query answering for standard, non-analytic queries in multi-table databases. In that work we presented polynomial algorithms for computing either the exact consistent answer to a standard, non-analytic query or bounds on the exact answer, depending on whether the query involves a selection condition or not.
We extend this approach to computing exact consistent answers to analytic queries over star schemas, provided that the selection condition in the query involves no keys and satisfies the property of independency (i.e., the condition can be expressed as a conjunction of conditions each involving a single attribute). The main contributions of this paper are: (a) a polynomial algorithm for computing the exact consistent answer to a usual projection-selection-join query over a star schema under the above restrictions on the selection condition, and (b) a proof that, under the same restrictions, the exact consistent answer to an analytic query over a star schema can be computed in time polynomial in the size of the data warehouse.
Submitted 22 May, 2025;
originally announced May 2025.
-
The Context Model: A Graph Database Model
Authors:
Nicolas Spyratos
Abstract:
We propose a novel database model whose basic structure is a labeled, directed, acyclic graph with a single root, in which the nodes represent the data sets of an application and the edges represent functional relationships among the data sets. We call such a graph an application context or simply context. The query language of a context consists of two types of queries, traversal queries and analytic queries. Both types of queries are defined using a simple functional algebra whose operations are functional restriction, composition of functions, pairing of functions and Cartesian product of sets. Roughly speaking, traversal queries parallel relational algebra queries, whereas analytic queries parallel SQL Group-by queries. In other words, in our model, traversal queries and analytic queries are both defined within the same formal framework, in contrast to the relational model, where analytic queries are defined outside the relational algebra. A distinctive feature of our model is therefore that it supports data management and data analytics within the same formal framework.
We demonstrate the expressive power of our model by showing: (a) how a relational database can be defined as a view over a context, with the context playing the role of an underlying semantic layer; (b) how an analytic query over a context can be rewritten at two orthogonal levels: at the level of the traversal queries that do the grouping and measuring, and at the level of the analytic query itself; and (c) how a context can be used as a user-friendly interface for querying relations and analysing relational data.
Submitted 28 June, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
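The four operations named in the abstract above (functional restriction, composition, pairing, Cartesian product) can be pictured with a minimal Python sketch in which a function between data sets is modelled as a dictionary. This is only an illustration under that assumption, not the formal definitions of the paper; the data set names (deliveries, branches, regions) and all helper names are hypothetical.

```python
from itertools import product


def restrict(f, subset):
    """Functional restriction: keep f only on arguments that lie in subset."""
    return {x: y for x, y in f.items() if x in subset}


def compose(g, f):
    """Composition g after f, defined wherever f(x) is in the domain of g."""
    return {x: g[y] for x, y in f.items() if y in g}


def pair(f, g):
    """Pairing: maps x to (f(x), g(x)) on the common domain of f and g."""
    return {x: (f[x], g[x]) for x in f if x in g}


def cartesian(a, b):
    """Cartesian product of two sets."""
    return set(product(a, b))


# Hypothetical edges of a context: delivery -> branch -> region, delivery -> quantity.
delivery_branch = {"d1": "b1", "d2": "b1", "d3": "b2"}
branch_region = {"b1": "north", "b2": "south"}
quantity = {"d1": 10, "d2": 5, "d3": 7}

# Traversal: follow two edges to reach the region of each delivery.
region_of = compose(branch_region, delivery_branch)

# Group-by analogue: pair the grouping function with the measuring function,
# then sum the measure within each group.
totals = {}
for region, qty in pair(region_of, quantity).values():
    totals[region] = totals.get(region, 0) + qty
```

Here `totals` plays the role of a "total quantity by region" Group-by query: the grouping and the measuring are both ordinary traversals, and only the final summation is specific to the analytic query.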
-
Consistent Query Answering without Repairs in Tables with Nulls and Functional Dependencies
Authors:
Dominique Laurent,
Nicolas Spyratos
Abstract:
In this paper, we study consistent query answering in tables with nulls and functional dependencies. Given such a table T, we consider the set Tuples of all tuples that can be built up from constants appearing in T, and we use set theoretic semantics for tuples and functional dependencies to characterize the tuples of Tuples in two orthogonal ways: first as true or false tuples, and then as consistent or inconsistent tuples. Queries are issued against T and evaluated in Tuples. In this setting, we consider a query Q: select X from T where Condition over T and define its consistent answer to be the set of tuples x in Tuples such that: x is a true and consistent tuple with schema X and there exists a true super-tuple t of x in Tuples satisfying the condition. We show that, depending on the status that the super-tuple t has in Tuples, there are different types of consistent answer to Q. The main contributions of the paper are: (a) a novel approach to consistent query answering not using table repairs; (b) polynomial algorithms for computing the sets of true-false tuples and the sets of consistent-inconsistent tuples of Tuples; (c) polynomial algorithms in the size of T for computing different types of consistent answer for both conjunctive and disjunctive queries; and (d) a detailed discussion of the differences between our approach and the approaches using table repairs.
Submitted 15 February, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
System Network Analytics: Evolution and Stable Rules of a State Series
Authors:
Animesh Chaturvedi,
Aruna Tiwari,
Nicolas Spyratos
Abstract:
System Evolution Analytics on an evolving system is challenging because such a system produces a State Series SS = {S1, S2, ..., SN} (i.e., a set of states ordered by time) with several inter-connected entities changing over time. We present stability characteristics of interesting evolution rules occurring in multiple states. We define the stability of an evolution rule as the fraction of states in which the rule is interesting. By extension, we define a stable rule as an evolution rule whose stability exceeds a given minimum stability threshold (minStab). We also define a persistence metric, a quantitative measure of persistent entity-connections. We explain this with an approach and algorithm for System Network Analytics (SysNet-Analytics), which uses minStab to retrieve Network Evolution Rules (NERs) and Stable NERs (SNERs). The retrieved information is used to calculate a proposed System Network Persistence (SNP) metric. This work is automated as a SysNet-Analytics Tool to demonstrate its application to real-world systems, including a software system, a natural-language system, a retail market system, and an IMDb system. We quantify the stability and persistence of entity-connections in a system state series. This yields evolution information, which helps in system evolution analytics based on knowledge discovery and data mining.
Submitted 28 October, 2022;
originally announced October 2022.
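The stability definition in the abstract above (the fraction of states in which a rule is interesting, compared against minStab) can be sketched in a few lines of Python. This is a minimal illustration, not the SysNet-Analytics algorithm: it assumes that the set of interesting rules has already been mined for each state, and the function names and rule strings are hypothetical.

```python
def stability(rule, interesting_per_state):
    """Fraction of states in which `rule` is among the interesting rules."""
    if not interesting_per_state:
        return 0.0
    hits = sum(1 for rules in interesting_per_state if rule in rules)
    return hits / len(interesting_per_state)


def stable_rules(interesting_per_state, min_stab):
    """Rules whose stability strictly exceeds the threshold min_stab."""
    all_rules = set().union(*interesting_per_state)
    return {
        rule
        for rule in all_rules
        if stability(rule, interesting_per_state) > min_stab
    }


# Hypothetical state series of four states, each with its interesting rules.
states = [
    {"A->B", "B->C"},
    {"A->B"},
    {"A->B", "C->D"},
    {"B->C"},
]
```

With this data, `A->B` is interesting in three of the four states and `B->C` in two, so only `A->B` survives a threshold of 0.5. The strict comparison follows the word "exceeds" in the abstract; whether the tool uses a strict or non-strict test is not stated there.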
-
Four-Valued Semantics for Deductive Databases
Authors:
Dominique Laurent,
Nicolas Spyratos
Abstract:
In this paper, we introduce a novel approach to deductive databases meant to take into account the needs of current applications in the area of data integration. To this end, we extend the formalism of standard deductive databases to the context of Four-valued logic so as to account for unknown, inconsistent, true or false information under the open world assumption. In our approach, a database is a pair (E,R) where E is the extension and R the set of rules. The extension is a set of pairs of the form (f, v) where f is a fact and v is a value that can be true, inconsistent or false - but not unknown (that is, unknown facts are not stored in the database). The rules follow the form of standard Datalog{neg} rules but, contrary to standard rules, their head may be a negative atom. Our main contributions are as follows: (i) we give an expression of first-degree entailment in terms of other connectors and exhibit a functionally complete set of basic connectors not involving first-degree entailment, (ii) we define a new operator for handling our new type of rules and show that this operator is monotonic and continuous, thus providing an effective way for defining and computing database semantics, and (iii) we argue that our framework allows for the definition of a new type of updates that can be used in most standard data integration applications.
Submitted 5 August, 2021;
originally announced August 2021.
-
Handling Inconsistencies in Tables with Nulls and Functional Dependencies
Authors:
Dominique Laurent,
Nicolas Spyratos
Abstract:
In this paper we address the problem of handling inconsistencies in tables with missing values (also called nulls) and functional dependencies. Although the traditional view is that table instances must respect all functional dependencies imposed on them, it is nevertheless relevant to develop theories about how to handle instances that violate some dependencies. Regarding missing values, we make no assumptions on their existence: a missing value exists only if it is inferred from the functional dependencies of the table.
We propose a formal framework in which each tuple of a table is associated with a truth value among the following: true, false, inconsistent or unknown; and we show that our framework can be used to study important problems such as consistent query answering, table merging, and data quality measures - to mention just a few. In this paper, however, we focus mainly on consistent query answering, a problem that has received considerable attention during the last decades.
The main contributions of the paper are the following: (a) we introduce a new approach to handle inconsistencies in a table with nulls and functional dependencies, (b) we give algorithms for computing all true, inconsistent and false tuples, (c) we investigate the relationship between our approach and Four-valued logic in the context of data merging, and (d) we give a novel solution to the consistent query answering problem and compare our solution to that of table repairs.
Submitted 27 November, 2021; v1 submitted 5 August, 2021;
originally announced August 2021.
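To make the notion of an instance that violates a functional dependency concrete, the following Python sketch flags the left-hand-side values for which a table holds conflicting right-hand-side values. It is a deliberately simplified illustration, not the algorithms of the paper: it assumes that nulls are represented by `None` and simply skips rows with a null on either side of the dependency, whereas the paper infers missing values from the dependencies. The table, attribute names and function name are hypothetical.

```python
from collections import defaultdict


def fd_conflicts(rows, lhs, rhs):
    """Return {lhs-values: set of rhs-values} for every violation of lhs -> rhs.

    Rows are dicts; a missing key or a None value stands for a null.
    Rows with a null in lhs or rhs are skipped (a simplifying assumption).
    """
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row.get(attr) for attr in lhs)
        value = row.get(rhs)
        if None in key or value is None:
            continue
        groups[key].add(value)
    # A violation is an lhs value associated with more than one rhs value.
    return {key: values for key, values in groups.items() if len(values) > 1}


# Hypothetical table with nulls and the dependency Emp -> Dept.
table = [
    {"Emp": "e1", "Dept": "d1", "Mgr": None},
    {"Emp": "e1", "Dept": "d2"},
    {"Emp": "e2", "Dept": "d1"},
    {"Emp": "e3", "Dept": None},
]
conflicts = fd_conflicts(table, ["Emp"], "Dept")
```

In this table employee `e1` is associated with two departments, so the rows for `e1` are the ones a consistent answer would have to treat with care, while `e2` is unaffected.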
-
Hypotheses Founded Semantics of Logic Programs for Information Integration in Multi-Valued Logics
Authors:
Yann Loyer,
Nicolas Spyratos,
Daniel Stamate
Abstract:
We address the problem of integrating information coming from different sources. The information consists of facts that a central server collects and tries to combine using (a) a set of logical rules, i.e. a logic program, and (b) a hypothesis representing the server's own estimates. In such a setting, incomplete information from a source or contradictory information from different sources necessitates the use of many-valued logics in which programs can be evaluated and hypotheses can be tested. To carry out such activities we propose a formal framework based on bilattices such as Belnap's four-valued logic. In this framework we work with the class of programs defined by Fitting and we develop a theory for information integration.
We also establish an intuitively appealing connection between our hypothesis testing mechanism on the one hand, and the well-founded semantics and Kripke-Kleene semantics of Datalog programs with negation, on the other hand.
Submitted 27 November, 2001;
originally announced November 2001.
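Belnap's four-valued logic, mentioned in the abstract above as an example of a bilattice, can be sketched in Python by encoding each truth value as a pair (evidence for, evidence against). This encoding is a standard way of presenting the logic and is used here only as an illustration; the class and function names are hypothetical and the sketch does not cover the program semantics developed in the paper.

```python
from typing import NamedTuple


class Four(NamedTuple):
    """A Belnap truth value: is there evidence for, and evidence against?"""
    supports: bool
    refutes: bool


TRUE = Four(True, False)
FALSE = Four(False, True)
BOTH = Four(True, True)     # contradictory information
NONE = Four(False, False)   # no information (unknown)


def neg(a):
    """Negation swaps the evidence for and the evidence against."""
    return Four(a.refutes, a.supports)


def conj(a, b):
    """Meet in the truth ordering (logical and)."""
    return Four(a.supports and b.supports, a.refutes or b.refutes)


def disj(a, b):
    """Join in the truth ordering (logical or)."""
    return Four(a.supports or b.supports, a.refutes and b.refutes)


def gather(a, b):
    """Join in the knowledge ordering: accumulate evidence from two sources."""
    return Four(a.supports or b.supports, a.refutes or b.refutes)


def consensus(a, b):
    """Meet in the knowledge ordering: keep only what both sources agree on."""
    return Four(a.supports and b.supports, a.refutes and b.refutes)
```

For example, `gather(TRUE, FALSE)` yields `BOTH`, which models two sources contradicting each other, while `consensus(TRUE, FALSE)` yields `NONE`, since the sources share no evidence.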
-
Computing and Comparing Semantics of Programs in Multi-valued Logics
Authors:
Y. Loyer,
N. Spyratos,
D. Stamate
Abstract:
The different semantics that can be assigned to a logic program correspond to different assumptions made concerning the atoms whose logical values cannot be inferred from the rules. Thus, the well-founded semantics corresponds to the assumption that every such atom is false, while the Kripke-Kleene semantics corresponds to the assumption that every such atom is unknown. In this paper, we propose to unify and extend this assumption-based approach by introducing parameterized semantics for logic programs. The parameter holds the value that one assumes for all atoms whose logical values cannot be inferred from the rules. We work within multi-valued logic with a bilattice structure, and we consider the class of logic programs defined by Fitting.
Following Fitting's approach, we define a simple operator that allows us to compute the parameterized semantics, and to compare and combine semantics obtained for different values of the parameter. The semantics proposed by Fitting corresponds to the value false. We also show that our approach captures and extends the usual semantics of conventional logic programs thereby unifying their computation.
Submitted 18 February, 2000;
originally announced February 2000.