A Chase-based Approach to Consistent Answers of Analytic Queries in Star Schemas

Authors: Dominique Laurent, Nicolas Spyratos

Abstract: We present an approach to computing consistent answers to analytic queries in data warehouses operating under a star schema and possibly containing missing values and inconsistent data. Our approach is based on earlier work concerning consistent query answering for standard, non-analytic queries in multi-table databases. In that work we presented polynomial algorithms for computing either the exact consistent answer to a standard, non-analytic query or bounds of the exact answer, depending on whether the query involves a selection condition or not. We extend this approach to computing exact consistent answers of analytic queries over star schemas, provided that the selection condition in the query involves no keys and satisfies the property of independency (i.e., the condition can be expressed as a conjunction of conditions each involving a single attribute). The main contributions of this paper are: (a) a polynomial algorithm for computing the exact consistent answer to a usual projection-selection-join query over a star schema under the above restrictions on the selection condition, and (b) showing that, under the same restrictions, the exact consistent answer to an analytic query over a star schema can be computed in time polynomial in the size of the data warehouse.

Submitted 22 May, 2025; originally announced May 2025.

Comments: Technical report, 34 pages

arXiv:2305.13895 [pdf, other]

The Context Model: A Graph Database Model

Authors: Nicolas Spyratos

Abstract: We propose a novel database model whose basic structure is a labeled, directed, acyclic graph with a single root, in which the nodes represent the data sets of an application and the edges represent functional relationships among the data sets. We call such a graph an application context or simply context. The query language of a context consists of two types of queries, traversal queries and analytic queries. Both types of queries are defined using a simple functional algebra whose operations are functional restriction, composition of functions, pairing of functions and Cartesian product of sets. Roughly speaking, traversal queries parallel relational algebra queries, whereas analytic queries parallel SQL Group-by queries. In other words, in our model, traversal queries and analytic queries are both defined within the same formal framework - in contrast to the relational model, where analytic queries are defined outside the relational algebra. Therefore a distinctive feature of our model is that it supports data management and data analytics within the same formal framework.
We demonstrate the expressive power of our model by showing: (a) how a relational database can be defined as a view over a context, with the context playing the role of an underlying semantic layer; (b) how an analytic query over a context can be rewritten at two orthogonal levels: at the level of the traversal queries that do the grouping and measuring, and at the level of the analytic query itself; and (c) how a context can be used as a user-friendly interface for querying relations and analysing relational data.

Submitted 28 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

arXiv:2301.03668 [pdf, other]

Consistent Query Answering without Repairs in Tables with Nulls and Functional Dependencies

Authors: Dominique Laurent, Nicolas Spyratos

Abstract: In this paper, we study consistent query answering in tables with nulls and functional dependencies. Given such a table T, we consider the set Tuples of all tuples that can be built up from constants appearing in T, and we use set theoretic semantics for tuples and functional dependencies to characterize the tuples of Tuples in two orthogonal ways: first as true or false tuples, and then as consistent or inconsistent tuples. Queries are issued against T and evaluated in Tuples. In this setting, we consider a query Q: select X from T where Condition over T and define its consistent answer to be the set of tuples x in Tuples such that: x is a true and consistent tuple with schema X and there exists a true super-tuple t of x in Tuples satisfying the condition. We show that, depending on the status that the super-tuple t has in Tuples, there are different types of consistent answer to Q. The main contributions of the paper are: (a) a novel approach to consistent query answering not using table repairs; (b) polynomial algorithms for computing the sets of true-false tuples and the sets of consistent-inconsistent tuples of Tuples; (c) polynomial algorithms in the size of T for computing different types of consistent answer for both conjunctive and disjunctive queries; and (d) a detailed discussion of the differences between our approach and the approaches using table repairs.

Submitted 15 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: 42 pages

arXiv:2210.15965 [pdf]

System Network Analytics: Evolution and Stable Rules of a State Series

Authors: Animesh Chaturvedi, Aruna Tiwari, Nicolas Spyratos

Abstract: System Evolution Analytics on a system that evolves is a challenge because it makes a State Series SS = {S1, S2... SN} (i.e., a set of states ordered by time) with several inter-connected entities changing over time. We present stability characteristics of interesting evolution rules occurring in multiple states. We defined an evolution rule with its stability as the fraction of states in which the rule is interesting. By extension, we defined a stable rule as an evolution rule whose stability exceeds a given minimum stability threshold (minStab). We also defined a persistence metric, a quantitative measure of persistent entity-connections. We explain this with an approach and algorithm for System Network Analytics (SysNet-Analytics), which uses minStab to retrieve Network Evolution Rules (NERs) and Stable NERs (SNERs). The retrieved information is used to calculate a proposed System Network Persistence (SNP) metric. This work is automated as a SysNet-Analytics Tool to demonstrate application on real world systems including: software system, natural-language system, retail market system, and IMDb system. We quantified stability and persistence of entity-connections in a system state series. This results in evolution information, which helps in system evolution analytics based on knowledge discovery and data mining.

Submitted 28 October, 2022; originally announced October 2022.

Comments: Accepted at IEEE DSAA. Video presentation: https://www.youtube.com/watch?v=ohOeTXoI-IY&list=PLtvWi5o3JBnF3yxcjGdT4KCDLxRBIpsyR

Journal ref: IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA), October 13-16, 2022, Shenzhen, China. IEEE, 2022. (Core A)

arXiv:2108.02587 [pdf, ps, other]

Four-Valued Semantics for Deductive Databases

Authors: Dominique Laurent, Nicolas Spyratos

Abstract: In this paper, we introduce a novel approach to deductive databases meant to take into account the needs of current applications in the area of data integration. To this end, we extend the formalism of standard deductive databases to the context of Four-valued logic so as to account for unknown, inconsistent, true or false information under the open world assumption. In our approach, a database is a pair (E,R) where E is the extension and R the set of rules. The extension is a set of pairs of the form (f, v) where f is a fact and v is a value that can be true, inconsistent or false - but not unknown (that is, unknown facts are not stored in the database). The rules follow the form of standard Datalog{neg} rules but, contrary to standard rules, their head may be a negative atom. Our main contributions are as follows: (i) we give an expression of first-degree entailment in terms of other connectors and exhibit a functionally complete set of basic connectors not involving first-degree entailment, (ii) we define a new operator for handling our new type of rules and show that this operator is monotonic and continuous, thus providing an effective way for defining and computing database semantics, and (iii) we argue that our framework allows for the definition of a new type of updates that can be used in most standard data integration applications.

Submitted 5 August, 2021; originally announced August 2021.

Comments: Unpublished research report

ACM Class: H.0

arXiv:2108.02581 [pdf, ps, other]

Handling Inconsistencies in Tables with Nulls and Functional Dependencies

Authors: Dominique Laurent, Nicolas Spyratos

Abstract: In this paper we address the problem of handling inconsistencies in tables with missing values (also called nulls) and functional dependencies. Although the traditional view is that table instances must respect all functional dependencies imposed on them, it is nevertheless relevant to develop theories about how to handle instances that violate some dependencies. Regarding missing values, we make no assumptions on their existence: a missing value exists only if it is inferred from the functional dependencies of the table. We propose a formal framework in which each tuple of a table is associated with a truth value among the following: true, false, inconsistent or unknown; and we show that our framework can be used to study important problems such as consistent query answering, table merging, and data quality measures - to mention just a few. In this paper, however, we focus mainly on consistent query answering, a problem that has received considerable attention during the last decades. The main contributions of the paper are the following: (a) we introduce a new approach to handle inconsistencies in a table with nulls and functional dependencies, (b) we give algorithms for computing all true, inconsistent and false tuples, (c) we investigate the relationship between our approach and Four-valued logic in the context of data merging, and (d) we give a novel solution to the consistent query answering problem and compare our solution to that of table repairs.

Submitted 27 November, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

Comments: In the present version a few changes have been made with respect to the previous version: 1/ The proofs of Lemmas 1, 2, 3 and of Proposition 2 have been rewritten. 2/ A new definition of consistent answer is given and compared with existing approaches based on repairs

ACM Class: H.2.1

arXiv:cs/0111059 [pdf, ps, other]

Hypotheses Founded Semantics of Logic Programs for Information Integration in Multi-Valued Logics

Authors: Yann Loyer, Nicolas Spyratos, Daniel Stamate

Abstract: We address the problem of integrating information coming from different sources. The information consists of facts that a central server collects and tries to combine using (a) a set of logical rules, i.e. a logic program, and (b) a hypothesis representing the server's own estimates. In such a setting incomplete information from a source or contradictory information from different sources necessitate the use of many-valued logics in which programs can be evaluated and hypotheses can be tested. To carry out such activities we propose a formal framework based on bilattices such as Belnap's four-valued logics. In this framework we work with the class of programs defined by Fitting and we develop a theory for information integration. We also establish an intuitively appealing connection between our hypothesis testing mechanism on the one hand, and the well-founded semantics and Kripke-Kleene semantics of Datalog programs with negation, on the other hand.

Submitted 27 November, 2001; originally announced November 2001.

Comments: 27 pages, 1 figure

ACM Class: I.2.3; I.2.11; H.1.1

arXiv:cs/0002013 [pdf, ps, other]

Computing and Comparing Semantics of Programs in Multi-valued Logics

Authors: Y. Loyer, N. Spyratos, D. Stamate

Abstract: The different semantics that can be assigned to a logic program correspond to different assumptions made concerning the atoms whose logical values cannot be inferred from the rules. Thus, the well founded semantics corresponds to the assumption that every such atom is false, while the Kripke-Kleene semantics corresponds to the assumption that every such atom is unknown. In this paper, we propose to unify and extend this assumption-based approach by introducing parameterized semantics for logic programs. The parameter holds the value that one assumes for all atoms whose logical values cannot be inferred from the rules. We work within multi-valued logic with bilattice structure, and we consider the class of logic programs defined by Fitting. Following Fitting's approach, we define a simple operator that allows us to compute the parameterized semantics, and to compare and combine semantics obtained for different values of the parameter. The semantics proposed by Fitting corresponds to the value false. We also show that our approach captures and extends the usual semantics of conventional logic programs thereby unifying their computation.

Submitted 18 February, 2000; originally announced February 2000.

Comments: 10 pages, 1 figure, A preliminary version of this paper appeared in the form of an extended abstract in the conference Mathematical Foundations of Computer Science (MFCS'99)

ACM Class: H.0; I.2.3

Showing 1–8 of 8 results for author: Spyratos, N