-
Repairing Databases over Metric Spaces with Coincidence Constraints
Authors:
Youri Kaminsky,
Benny Kimelfeld,
Ester Livshits,
Felix Naumann,
David Wajc
Abstract:
Datasets often contain values that naturally reside in a metric space: numbers, strings, geographical locations, machine-learned embeddings in a Euclidean space, and so on. We study the computational complexity of repairing inconsistent databases that violate integrity constraints, where the database values belong to an underlying metric space. The goal is to update the database values to retain consistency while minimizing the total distance between the original values and the repaired ones. We consider what we refer to as \emph{coincidence constraints}, which include key constraints, inclusion, foreign keys, and generally any restriction on the relationship between the numbers of cells of different labels (attributes) coinciding in a single value, for a fixed attribute set.
We begin by showing that the problem is APX-hard for general metric spaces. We then present an algorithm solving the problem optimally for tree metrics, which generalize both the line metric (i.e., where repaired values are numbers) and the discrete metric (i.e., where we simply count the number of changed values). Combining our algorithm for tree metrics and a classic result on probabilistic tree embeddings, we design a (high probability) logarithmic-ratio approximation for general metrics. We also study the variant of the problem where each individual value's allowed change is limited. In this variant, it is already NP-complete to decide the existence of any legal repair for a general metric, and we present a polynomial-time repairing algorithm for the case of a line metric.
Submitted 25 September, 2024;
originally announced September 2024.
-
The Shapley Value in Database Management
Authors:
Leopoldo Bertossi,
Benny Kimelfeld,
Ester Livshits,
Mikaël Monet
Abstract:
Attribution scores can be applied in data management to quantify the contribution of individual items to conclusions from the data, as part of the explanation of what led to these conclusions. In Artificial Intelligence, Machine Learning, and Data Management, some of the common scores are deployments of the Shapley value, a formula for profit sharing in cooperative game theory. Since its invention in the 1950s, the Shapley value has been used for contribution measurement in many fields, from economics to law, with its latest researched applications in modern machine learning. Recent studies investigated the application of the Shapley value to database management. This article gives an overview of recent results on the computational complexity of the Shapley value for measuring the contribution of tuples to query answers and to the extent of inconsistency with respect to integrity constraints. More specifically, the article highlights lower and upper bounds on the complexity of calculating the Shapley value, either exactly or approximately, as well as solutions for realizing the calculation in practice.
Submitted 11 January, 2024;
originally announced January 2024.
-
Combined Approximations for Uniform Operational Consistent Query Answering
Authors:
Marco Calautti,
Ester Livshits,
Andreas Pieris,
Markus Schneider
Abstract:
Operational consistent query answering (CQA) is a recent framework for CQA based on revised definitions of repairs, which are built by applying a sequence of operations (e.g., fact deletions) starting from an inconsistent database until we reach a database that is consistent w.r.t. the given set of constraints. It has been recently shown that there are efficient approximations for computing the percentage of repairs, as well as of sequences of operations leading to repairs, that entail a given query when we focus on primary keys, conjunctive queries, and assuming the query is fixed (i.e., in data complexity). However, it has been left open whether such approximations exist when the query is part of the input (i.e., in combined complexity). We show that this is the case when we focus on self-join-free conjunctive queries of bounded generalized hypertreewidth. We also show that it is unlikely that efficient approximation schemes exist once we give up one of the adopted syntactic restrictions, i.e., self-join-freeness or bounding the generalized hypertreewidth. Towards the desired approximation schemes, we introduce a novel counting complexity class, called SpanTL, show that each problem in SpanTL admits an efficient approximation scheme by using a recent approximability result in the context of tree automata, and then place the problems of interest in SpanTL.
Submitted 13 December, 2023;
originally announced December 2023.
-
The Complexity of Why-Provenance for Datalog Queries
Authors:
Marco Calautti,
Ester Livshits,
Andreas Pieris,
Markus Schneider
Abstract:
Explaining why a database query result is obtained is an essential task towards the goal of Explainable AI, especially nowadays where expressive database query languages such as Datalog play a critical role in the development of ontology-based applications. A standard way of explaining a query result is the so-called why-provenance, which essentially provides information about the witnesses to a query result in the form of subsets of the input database that are sufficient to derive that result. To our surprise, despite the fact that the notion of why-provenance for Datalog queries has been around for decades and intensively studied, its computational complexity remains unexplored. The goal of this work is to fill this apparent gap in the why-provenance literature. Towards this end, we pinpoint the data complexity of why-provenance for Datalog queries and key subclasses thereof. The takeaway of our work is that why-provenance for recursive queries, even if the recursion is limited to be linear, is an intractable problem, whereas for non-recursive queries it is highly tractable. Having said that, we experimentally confirm, by exploiting SAT solvers, that making why-provenance for (recursive) Datalog queries work in practice is not an unrealistic goal.
Submitted 22 March, 2023;
originally announced March 2023.
-
Uniform Operational Consistent Query Answering
Authors:
Marco Calautti,
Ester Livshits,
Andreas Pieris,
Markus Schneider
Abstract:
Operational consistent query answering (CQA) is a recent framework for CQA, based on revised definitions of repairs and consistent answers, which opens up the possibility of efficient approximations with explicit error guarantees. The main idea is to iteratively apply operations (e.g., fact deletions), starting from an inconsistent database, until we reach a database that is consistent w.r.t. the given set of constraints. This gives us the flexibility of choosing the probability with which we apply an operation, which in turn allows us to calculate the probability of an operational repair, and thus, the probability with which a consistent answer is entailed. A natural way of assigning probabilities to operations is by targeting the uniform probability distribution over a reasonable space such as the set of operational repairs, the set of sequences of operations that lead to an operational repair, and the set of available operations at a certain step of the repairing process. This leads to what we generally call uniform operational CQA. The goal of this work is to perform a data complexity analysis of both exact and approximate uniform operational CQA, focusing on functional dependencies (and subclasses thereof), and conjunctive queries. The main outcome of our analysis (among other positive and negative results), is that uniform operational CQA pushes the efficiency boundaries further by ensuring the existence of efficient approximation schemes in scenarios that go beyond the simple case of primary keys, which seems to be the limit of the classical approach to CQA.
Submitted 22 April, 2022;
originally announced April 2022.
-
Exact and Approximate Counting of Database Repairs
Authors:
Marco Calautti,
Ester Livshits,
Andreas Pieris,
Markus Schneider
Abstract:
A key task in the context of consistent query answering is to count the number of repairs that entail the query, with the ultimate goal being a precise data complexity classification. This has been achieved in the case of primary keys and self-join-free conjunctive queries (CQs) via an FP/#P-complete dichotomy. We lift this result to the more general case of functional dependencies (FDs). Another important task in this context is, whenever the counting problem in question is intractable, to classify it as approximable, i.e., the target value can be efficiently approximated with error guarantees via a fully polynomial-time randomized approximation scheme (FPRAS), or as inapproximable. Although for primary keys and CQs (even with self-joins) the problem is always approximable, we prove that this is not the case for FDs. We show, however, that the class of FDs with a left-hand side chain forms an island of approximability. We see these results, apart from being interesting in their own right, as crucial steps towards a complete classification of approximate counting of repairs in the case of FDs and self-join-free CQs.
Submitted 9 April, 2024; v1 submitted 17 December, 2021;
originally announced December 2021.
-
Database Repairing with Soft Functional Dependencies
Authors:
Nofar Carmeli,
Martin Grohe,
Benny Kimelfeld,
Ester Livshits,
Muhammad Tibi
Abstract:
A common interpretation of soft constraints penalizes the database for every violation of every constraint, where the penalty is the cost (weight) of the constraint. A computational challenge is that of finding an optimal subset: a collection of database tuples that minimizes the total penalty when each tuple has a cost of being excluded. When the constraints are strict (i.e., have an infinite cost), this subset is a "cardinality repair" of an inconsistent database; in soft interpretations, this subset corresponds to a "most probable world" of a probabilistic database, a "most likely intention" of a probabilistic unclean database, and so on. Within the class of functional dependencies, the complexity of finding a cardinality repair is thoroughly understood. Yet, very little is known about the complexity of this problem in the more general soft semantics. This paper makes significant progress in this direction. In addition to general insights about the hardness and approximability of the problem, we present algorithms for two special cases: a single functional dependency, and a bipartite matching. The latter is the problem of finding an optimal "almost matching" of a bipartite graph where a penalty is paid for every lost edge and every violation of monogamy.
Submitted 29 September, 2020;
originally announced September 2020.
-
The Shapley Value of Inconsistency Measures for Functional Dependencies
Authors:
Ester Livshits,
Benny Kimelfeld
Abstract:
Quantifying the inconsistency of a database is motivated by various goals including reliability estimation for new datasets and progress indication in data cleaning. Another goal is to attribute to individual tuples a level of responsibility to the overall inconsistency, and thereby prioritize tuples in the explanation or inspection of dirt. Therefore, inconsistency quantification and attribution have been a subject of much research in Knowledge Representation and, more recently, in Databases. As in many other fields, a conventional responsibility sharing mechanism is the Shapley value from cooperative game theory. In this paper, we carry out a systematic investigation of the complexity of the Shapley value in common inconsistency measures for functional-dependency (FD) violations. For several measures we establish a full classification of the FD sets into tractable and intractable classes with respect to Shapley-value computation. We also study the complexity of approximation in intractable cases.
Submitted 14 June, 2022; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Approximate Denial Constraints
Authors:
Ester Livshits,
Alireza Heidari,
Ihab F. Ilyas,
Benny Kimelfeld
Abstract:
The problem of mining integrity constraints from data has been extensively studied over the past two decades for commonly used types of constraints including the classic Functional Dependencies (FDs) and the more general Denial Constraints (DCs). In this paper, we investigate the problem of mining approximate DCs (i.e., DCs that are "almost" satisfied) from data. Considering approximate constraints allows us to discover more accurate constraints in inconsistent databases, detect rules that are generally correct but may have a few exceptions, as well as avoid overfitting and obtain more general and less contrived constraints. We introduce the algorithm ADCMiner for mining approximate DCs. An important feature of this algorithm is that it does not assume any specific definition of an approximate DC, but takes the semantics as input. Since there is more than one way to define an approximate DC and different definitions may produce very different results, we do not focus on one definition, but rather on a general family of approximation functions that satisfies some natural axioms defined in this paper and captures commonly used definitions of approximate constraints. We also show how our algorithm can be combined with sampling to return results with high accuracy while significantly reducing the running time.
Submitted 18 May, 2020;
originally announced May 2020.
-
The Impact of Negation on the Complexity of the Shapley Value in Conjunctive Queries
Authors:
Alon Reshef,
Benny Kimelfeld,
Ester Livshits
Abstract:
The Shapley value is a conventional and well-studied function for determining the contribution of a player to the coalition in a cooperative game. Among its applications in a plethora of domains, it has recently been proposed to use the Shapley value for quantifying the contribution of a tuple to the result of a database query. In particular, we have a thorough understanding of the tractability frontier for the class of Conjunctive Queries (CQs) and aggregate functions over CQs. It has also been established that a tractable (randomized) multiplicative approximation exists for every union of CQs. Nevertheless, all of these results are based on the monotonicity of CQs. In this work, we investigate the implication of negation on the complexity of Shapley computation, in both the exact and approximate senses. We generalize a known dichotomy to account for negated atoms. We also show that negation fundamentally changes the complexity of approximation. We do so by drawing a connection to the problem of deciding whether a tuple is "relevant" to a query, and by analyzing its complexity.
Submitted 29 December, 2019;
originally announced December 2019.
-
The Shapley Value of Tuples in Query Answering
Authors:
Ester Livshits,
Leopoldo Bertossi,
Benny Kimelfeld,
Moshe Sebag
Abstract:
We investigate the application of the Shapley value to quantifying the contribution of a tuple to a query answer. The Shapley value is a widely known numerical measure in cooperative game theory and in many applications of game theory for assessing the contribution of a player to a coalition game. It was established in the 1950s and is theoretically justified as the unique wealth-distribution measure that satisfies some natural axioms. While this value has been investigated in several areas, it has received little attention in data management. We study this measure in the context of conjunctive and aggregate queries by defining corresponding coalition games. We provide algorithmic and complexity-theoretic results on the computation of Shapley-based contributions to query answers, and for the hard cases we present approximation algorithms.
Submitted 1 September, 2021; v1 submitted 18 April, 2019;
originally announced April 2019.
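As a rough illustration of the coalition game described in the abstract above, the following sketch computes the exact Shapley value of a fact by averaging its marginal contribution over all orderings of the facts. It is not the authors' algorithm: the function names, the toy facts, and the Boolean query q() :- R(x), S(x) are hypothetical, all facts are treated as players, and the enumeration takes factorial time, so it only suits tiny inputs.

```python
from fractions import Fraction
from itertools import permutations


def shapley(players, value, target):
    """Exact Shapley value of `target`: its marginal contribution to
    `value`, averaged over every ordering of `players` (brute force)."""
    total = Fraction(0)
    count = 0
    for order in permutations(players):
        # Players that arrive before `target` in this ordering.
        before = frozenset(order[:order.index(target)])
        total += value(before | {target}) - value(before)
        count += 1
    return total / count


# Hypothetical toy database: each fact is a (relation, value) pair.
FACTS = [("R", 1), ("R", 2), ("S", 1)]


def query(facts):
    """Boolean query q() :- R(x), S(x): 1 if some x occurs in both R and S."""
    r = {v for rel, v in facts if rel == "R"}
    s = {v for rel, v in facts if rel == "S"}
    return 1 if r & s else 0


for fact in FACTS:
    print(fact, shapley(FACTS, query, fact))
```

In this toy instance the query holds only when both R(1) and S(1) are present, so those two facts share the credit equally and R(2) receives none.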
-
Properties of Inconsistency Measures for Databases
Authors:
Ester Livshits,
Rina Kochirgan,
Segev Tsur,
Ihab F. Ilyas,
Benny Kimelfeld,
Sudeepa Roy
Abstract:
How should we quantify the inconsistency of a database that violates integrity constraints? Proper measures are important for various tasks, such as progress indication and action prioritization in cleaning systems, and reliability estimation for new datasets. To choose an appropriate inconsistency measure, it is important to identify the desired properties in the application and understand which of these is guaranteed or at least expected in practice. For example, in some use cases the inconsistency should reduce if constraints are eliminated; in others it should be stable and avoid jitters and jumps in reaction to small changes in the database. We embark on a systematic investigation of properties for database inconsistency measures. We investigate a collection of basic measures that have been proposed in the past in both the Knowledge Representation and Database communities, analyze their theoretical properties, and empirically observe their behaviour in an experimental study. We also demonstrate how the framework can lead to new inconsistency measures by introducing a new measure that, in contrast to the rest, satisfies all of the properties we consider and can be computed in polynomial time.
Submitted 1 April, 2021; v1 submitted 13 April, 2019;
originally announced April 2019.
-
Computing Optimal Repairs for Functional Dependencies
Authors:
Ester Livshits,
Benny Kimelfeld,
Sudeepa Roy
Abstract:
We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair) that is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair) that is obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets of FDs and fails on others. We prove the following about the algorithm. When it succeeds, it can also incorporate weighted tuples and duplicate tuples. When it fails, the problem is NP-hard, and in fact, APX-complete (hence, cannot be approximated better than some constant). Thus, we establish a dichotomy in the complexity of computing an optimal S-repair. We present general analysis techniques for the complexity of computing an optimal U-repair, some based on the dichotomy for S-repairs. We also draw a connection to a past dichotomy in the complexity of finding a "most probable database" that satisfies a set of FDs with a single attribute on the left hand side; the case of general FDs was left open, and we show how our dichotomy provides the missing generalization and thereby settles the open problem.
Submitted 20 December, 2017;
originally announced December 2017.
-
The Complexity of Computing a Cardinality Repair for Functional Dependencies
Authors:
Ester Livshits,
Benny Kimelfeld
Abstract:
For a relation that violates a set of functional dependencies, we consider the task of finding a maximum number of pairwise-consistent tuples, or what is known as a "cardinality repair." We present a polynomial-time algorithm that, for certain fixed relation schemas (with functional dependencies), computes a cardinality repair. Moreover, we prove that on any of the schemas not covered by the algorithm, finding a cardinality repair is, in fact, an NP-hard problem. In particular, we establish a dichotomy in the complexity of computing a cardinality repair, and we present an efficient algorithm to determine whether a given schema belongs to the positive side or the negative side of the dichotomy.
Submitted 30 August, 2017;
originally announced August 2017.
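To make the notion of a cardinality repair in the abstract above concrete, here is a minimal sketch that finds a maximum set of pairwise-consistent tuples by exhaustive search. It is not the polynomial-time algorithm of the paper: it runs in exponential time, and the function names, the dict-based tuple representation, and the example relation with the FD emp -> dept are assumptions made for illustration.

```python
from itertools import combinations


def violates(t1, t2, fds):
    """True if tuples t1 and t2 (dicts) jointly violate some FD.
    Each FD is a pair (lhs, rhs) of attribute-name lists."""
    for lhs, rhs in fds:
        agree_on_lhs = all(t1[a] == t2[a] for a in lhs)
        differ_on_rhs = any(t1[a] != t2[a] for a in rhs)
        if agree_on_lhs and differ_on_rhs:
            return True
    return False


def cardinality_repair(tuples, fds):
    """Largest subset of pairwise-consistent tuples, found by trying
    subsets from largest to smallest (exponential time)."""
    for size in range(len(tuples), -1, -1):
        for subset in combinations(tuples, size):
            if not any(violates(a, b, fds) for a, b in combinations(subset, 2)):
                return list(subset)
    return []


# Hypothetical relation: the first two tuples conflict on emp -> dept.
RELATION = [
    {"emp": "ann", "dept": "hr"},
    {"emp": "ann", "dept": "it"},
    {"emp": "bob", "dept": "it"},
]
FDS = [(["emp"], ["dept"])]

repair = cardinality_repair(RELATION, FDS)
print(len(repair), repair)
```

Here one of the two conflicting tuples for "ann" must be dropped, so any cardinality repair keeps two of the three tuples.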
-
Unambiguous Prioritized Repairing of Databases
Authors:
Benny Kimelfeld,
Ester Livshits,
Liat Peterfreund
Abstract:
In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a "minimal way". Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense. Three notions of lifting (and optimal repairs) have been proposed: Pareto, global, and completion.
In this paper we investigate the complexity of deciding whether the priority relation suffices to clean the database unambiguously, or in other words, whether there is exactly one optimal repair. We show that the different lifting semantics entail highly different complexities. Under Pareto optimality, the problem is coNP-complete, in data complexity, for every set of functional dependencies (FDs), except for the tractable case of (equivalence to) one FD per relation. Under global optimality, one FD per relation is still tractable, but we establish $Π^{p}_{2}$-completeness for a relation with two FDs. In contrast, under completion optimality the problem is solvable in polynomial time for every set of FDs. In fact, we present a polynomial-time algorithm for arbitrary conflict hypergraphs. We further show that under a general assumption of transitivity, this algorithm solves the problem even for global optimality. The algorithm is extremely simple, but its proof of correctness is quite intricate.
Submitted 6 March, 2016;
originally announced March 2016.
-
Flow with Nonlinear Potential in General Networks -- Simulation, Optimization, Control, Risk and Stability Analysis
Authors:
Emmanuel M. Livshits,
Leonid A. Ostromuhov
Abstract:
The aim of this paper is to give a short survey of models and methods developed by the authors. These models and methods are used to optimize general networks with nonlinear non-convex restrictions and objectives possessing mixed continuous-discrete optimization variables. We discuss the problem formulations and solution methods for simulation, optimization, sensitivity and stability analysis for flow with nonlinear potential in general networks. These problems and the developed methods and programs have industrial applications, e.g., in gas networks.
Submitted 18 March, 2011;
originally announced March 2011.
-
A density functional theory for symmetric radical cations from bonding to dissociation
Authors:
Ester Livshits,
Roi Baer
Abstract:
It has been known for quite some time that approximate density functional (ADF) theories fail disastrously when describing the dissociative symmetric radical cations R2+. Considering this dissociation limit, previous work has shown that Hartree-Fock (HF) theory favors the R+1--R0 charge distribution while DF approximations favor the R+0.5--R+0.5 distribution. Yet, general quantum mechanical principles indicate that both these (as well as all intermediate) average charge distributions are asymptotically energy degenerate. Thus HF and ADF theories mistakenly break the symmetry, but in contradicting ways. In this letter we show how to construct system-dependent long-range corrected (LC) density functionals that can successfully treat this class of molecules, avoiding the spurious symmetry breaking. Examples and comparisons to experimental data are given for R=H, He and Ne, and it is shown that the new LC theory considerably improves the theoretical description of the R2+ bond properties, the long-range form of the asymptotic potential curve, as well as the atomic polarizability. The broader impact of this finding is discussed as well, and it is argued that the widespread semi-empirical approach which advocates treating the LC parameter as a system-independent parameter is in fact inappropriate under general circumstances.
Submitted 27 April, 2008; v1 submitted 19 April, 2008;
originally announced April 2008.
-
A well-tempered density functional theory of electrons in molecules
Authors:
Ester Livshits,
Roi Baer
Abstract:
We report extensions of a recently developed approach to density functional theory with correct long-range behavior (Phys. Rev. Lett. 94, 043002 (2005)). The central quantities are a splitting functional γ[n] and a complementary exchange-correlation functional. We give a practical method for determining the value of γ in molecules, assuming an approximation for the XC energy is given. The resulting theory shows good ability to reproduce the ionization potentials for various molecules. However, it is not of sufficient accuracy for forming a satisfactory framework for studying molecular properties. A somewhat different approach is then adopted, which depends on a density-independent γ and an additional parameter w eliminating part of the local exchange functional. The values of these two parameters are obtained by best-fitting to experimental atomization energies and bond lengths of the molecules in the G2(1) database. The optimized values are γ = 0.5 a_0^{-1} and w = 0.1. We then examine the performance of this slightly semi-empirical functional for a variety of molecular properties, comparing to related works and to experiment. We show that this approach can be used for describing in a satisfactory manner a broad range of molecular properties, be they static or dynamic. Most satisfactory is the ability to describe valence, Rydberg and inter-molecular charge-transfer excitations.
Submitted 20 January, 2007;
originally announced January 2007.