Search | arXiv e-print repository

Repairing Databases over Metric Spaces with Coincidence Constraints

Authors: Youri Kaminsky, Benny Kimelfeld, Ester Livshits, Felix Naumann, David Wajc

Abstract: Datasets often contain values that naturally reside in a metric space: numbers, strings, geographical locations, machine-learned embeddings in a Euclidean space, and so on. We study the computational complexity of repairing inconsistent databases that violate integrity constraints, where the database values belong to an underlying metric space. The goal is to update the database values to retain consistency while minimizing the total distance between the original values and the repaired ones. We consider what we refer to as coincidence constraints, which include key constraints, inclusion, foreign keys, and generally any restriction on the relationship between the numbers of cells of different labels (attributes) coinciding in a single value, for a fixed attribute set. We begin by showing that the problem is APX-hard for general metric spaces. We then present an algorithm solving the problem optimally for tree metrics, which generalize both the line metric (i.e., where repaired values are numbers) and the discrete metric (i.e., where we simply count the number of changed values). Combining our algorithm for tree metrics and a classic result on probabilistic tree embeddings, we design a (high probability) logarithmic-ratio approximation for general metrics. We also study the variant of the problem where each individual value's allowed change is limited. In this variant, it is already NP-complete to decide the existence of any legal repair for a general metric, and we present a polynomial-time repairing algorithm for the case of a line metric.

Submitted 25 September, 2024; originally announced September 2024.

arXiv:2401.06234 [pdf, other]

doi 10.1145/3615952.3615954

The Shapley Value in Database Management

Authors: Leopoldo Bertossi, Benny Kimelfeld, Ester Livshits, Mikaël Monet

Abstract: Attribution scores can be applied in data management to quantify the contribution of individual items to conclusions from the data, as part of the explanation of what led to these conclusions. In Artificial Intelligence, Machine Learning, and Data Management, some of the common scores are deployments of the Shapley value, a formula for profit sharing in cooperative game theory. Since its invention in the 1950s, the Shapley value has been used for contribution measurement in many fields, from economics to law, with its latest researched applications in modern machine learning. Recent studies investigated the application of the Shapley value to database management. This article gives an overview of recent results on the computational complexity of the Shapley value for measuring the contribution of tuples to query answers and to the extent of inconsistency with respect to integrity constraints. More specifically, the article highlights lower and upper bounds on the complexity of calculating the Shapley value, either exactly or approximately, as well as solutions for realizing the calculation in practice.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 12 pages, including references. This is the authors' version of the corresponding SIGMOD Record article

Journal ref: SIGMOD Rec. 52(2): 6-17 (2023)

arXiv:2312.08038 [pdf, other]

Combined Approximations for Uniform Operational Consistent Query Answering

Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Abstract: Operational consistent query answering (CQA) is a recent framework for CQA based on revised definitions of repairs, which are built by applying a sequence of operations (e.g., fact deletions) starting from an inconsistent database until we reach a database that is consistent w.r.t. the given set of constraints. It has been recently shown that there are efficient approximations for computing the percentage of repairs, as well as of sequences of operations leading to repairs, that entail a given query when we focus on primary keys, conjunctive queries, and assuming the query is fixed (i.e., in data complexity). However, it has been left open whether such approximations exist when the query is part of the input (i.e., in combined complexity). We show that this is the case when we focus on self-join-free conjunctive queries of bounded generalized hypertreewidth. We also show that it is unlikely that efficient approximation schemes exist once we give up one of the adopted syntactic restrictions, i.e., self-join-freeness or bounding the generalized hypertreewidth. Towards the desired approximation schemes, we introduce a novel counting complexity class, called SpanTL, show that each problem in SpanTL admits an efficient approximation scheme by using a recent approximability result in the context of tree automata, and then place the problems of interest in SpanTL.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2303.12773 [pdf, other]

The Complexity of Why-Provenance for Datalog Queries

Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Abstract: Explaining why a database query result is obtained is an essential task towards the goal of Explainable AI, especially nowadays where expressive database query languages such as Datalog play a critical role in the development of ontology-based applications. A standard way of explaining a query result is the so-called why-provenance, which essentially provides information about the witnesses to a query result in the form of subsets of the input database that are sufficient to derive that result. To our surprise, despite the fact that the notion of why-provenance for Datalog queries has been around for decades and intensively studied, its computational complexity remains unexplored. The goal of this work is to fill this apparent gap in the why-provenance literature. Towards this end, we pinpoint the data complexity of why-provenance for Datalog queries and key subclasses thereof. The takeaway of our work is that why-provenance for recursive queries, even if the recursion is limited to be linear, is an intractable problem, whereas for non-recursive queries it is highly tractable. Having said that, we experimentally confirm, by exploiting SAT solvers, that making why-provenance for (recursive) Datalog queries work in practice is not an unrealistic goal.

Submitted 22 March, 2023; originally announced March 2023.

arXiv:2204.10592 [pdf, other]

Uniform Operational Consistent Query Answering

Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Abstract: Operational consistent query answering (CQA) is a recent framework for CQA, based on revised definitions of repairs and consistent answers, which opens up the possibility of efficient approximations with explicit error guarantees. The main idea is to iteratively apply operations (e.g., fact deletions), starting from an inconsistent database, until we reach a database that is consistent w.r.t. the given set of constraints. This gives us the flexibility of choosing the probability with which we apply an operation, which in turn allows us to calculate the probability of an operational repair, and thus, the probability with which a consistent answer is entailed. A natural way of assigning probabilities to operations is by targeting the uniform probability distribution over a reasonable space such as the set of operational repairs, the set of sequences of operations that lead to an operational repair, and the set of available operations at a certain step of the repairing process. This leads to what we generally call uniform operational CQA. The goal of this work is to perform a data complexity analysis of both exact and approximate uniform operational CQA, focusing on functional dependencies (and subclasses thereof), and conjunctive queries.
The main outcome of our analysis (among other positive and negative results) is that uniform operational CQA pushes the efficiency boundaries further by ensuring the existence of efficient approximation schemes in scenarios that go beyond the simple case of primary keys, which seems to be the limit of the classical approach to CQA.

Submitted 22 April, 2022; originally announced April 2022.

arXiv:2112.09617 [pdf, ps, other]

Exact and Approximate Counting of Database Repairs

Authors: Marco Calautti, Ester Livshits, Andreas Pieris, Markus Schneider

Abstract: A key task in the context of consistent query answering is to count the number of repairs that entail the query, with the ultimate goal being a precise data complexity classification. This has been achieved in the case of primary keys and self-join-free conjunctive queries (CQs) via an FP/#P-complete dichotomy. We lift this result to the more general case of functional dependencies (FDs). Another important task in this context is, whenever the counting problem in question is intractable, to classify it as approximable, i.e., the target value can be efficiently approximated with error guarantees via a fully polynomial-time randomized approximation scheme (FPRAS), or as inapproximable. Although for primary keys and CQs (even with self-joins) the problem is always approximable, we prove that this is not the case for FDs. We show, however, that the class of FDs with a left-hand side chain forms an island of approximability. We see these results, apart from being interesting in their own right, as crucial steps towards a complete classification of approximate counting of repairs in the case of FDs and self-join-free CQs.

Submitted 9 April, 2024; v1 submitted 17 December, 2021; originally announced December 2021.

ACM Class: H.2

arXiv:2009.13821 [pdf, other]

Database Repairing with Soft Functional Dependencies

Authors: Nofar Carmeli, Martin Grohe, Benny Kimelfeld, Ester Livshits, Muhammad Tibi

Abstract: A common interpretation of soft constraints penalizes the database for every violation of every constraint, where the penalty is the cost (weight) of the constraint. A computational challenge is that of finding an optimal subset: a collection of database tuples that minimizes the total penalty when each tuple has a cost of being excluded. When the constraints are strict (i.e., have an infinite cost), this subset is a "cardinality repair" of an inconsistent database; in soft interpretations, this subset corresponds to a "most probable world" of a probabilistic database, a "most likely intention" of a probabilistic unclean database, and so on. Within the class of functional dependencies, the complexity of finding a cardinality repair is thoroughly understood. Yet, very little is known about the complexity of this problem in the more general soft semantics. This paper makes significant progress in this direction. In addition to general insights about the hardness and approximability of the problem, we present algorithms for two special cases: a single functional dependency, and a bipartite matching. The latter is the problem of finding an optimal "almost matching" of a bipartite graph where a penalty is paid for every lost edge and every violation of monogamy.

Submitted 29 September, 2020; originally announced September 2020.

arXiv:2009.13819 [pdf, other]

doi 10.46298/lmcs-18(2:20)2022

The Shapley Value of Inconsistency Measures for Functional Dependencies

Authors: Ester Livshits, Benny Kimelfeld

Abstract: Quantifying the inconsistency of a database is motivated by various goals including reliability estimation for new datasets and progress indication in data cleaning. Another goal is to attribute to individual tuples a level of responsibility to the overall inconsistency, and thereby prioritize tuples in the explanation or inspection of dirt. Therefore, inconsistency quantification and attribution have been a subject of much research in Knowledge Representation and, more recently, in Databases. As in many other fields, a conventional responsibility sharing mechanism is the Shapley value from cooperative game theory. In this paper, we carry out a systematic investigation of the complexity of the Shapley value in common inconsistency measures for functional-dependency (FD) violations. For several measures we establish a full classification of the FD sets into tractable and intractable classes with respect to Shapley-value computation. We also study the complexity of approximation in intractable cases.

Submitted 14 June, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

Journal ref: Logical Methods in Computer Science, Volume 18, Issue 2 (June 15, 2022) lmcs:8618

arXiv:2005.08540 [pdf, ps, other]

Approximate Denial Constraints

Authors: Ester Livshits, Alireza Heidari, Ihab F. Ilyas, Benny Kimelfeld

Abstract: The problem of mining integrity constraints from data has been extensively studied over the past two decades for commonly used types of constraints including the classic Functional Dependencies (FDs) and the more general Denial Constraints (DCs). In this paper, we investigate the problem of mining approximate DCs (i.e., DCs that are "almost" satisfied) from data. Considering approximate constraints allows us to discover more accurate constraints in inconsistent databases, detect rules that are generally correct but may have a few exceptions, as well as avoid overfitting and obtain more general and less contrived constraints. We introduce the algorithm ADCMiner for mining approximate DCs. An important feature of this algorithm is that it does not assume any specific definition of an approximate DC, but takes the semantics as input. Since there is more than one way to define an approximate DC and different definitions may produce very different results, we do not focus on one definition, but rather on a general family of approximation functions that satisfies some natural axioms defined in this paper and captures commonly used definitions of approximate constraints. We also show how our algorithm can be combined with sampling to return results with high accuracy while significantly reducing the running time.

Submitted 18 May, 2020; originally announced May 2020.

arXiv:1912.12610 [pdf, ps, other]

The Impact of Negation on the Complexity of the Shapley Value in Conjunctive Queries

Authors: Alon Reshef, Benny Kimelfeld, Ester Livshits

Abstract: The Shapley value is a conventional and well-studied function for determining the contribution of a player to the coalition in a cooperative game. Among its applications in a plethora of domains, it has recently been proposed to use the Shapley value for quantifying the contribution of a tuple to the result of a database query. In particular, we have a thorough understanding of the tractability frontier for the class of Conjunctive Queries (CQs) and aggregate functions over CQs. It has also been established that a tractable (randomized) multiplicative approximation exists for every union of CQs. Nevertheless, all of these results are based on the monotonicity of CQs. In this work, we investigate the implication of negation on the complexity of Shapley computation, in both the exact and approximate senses. We generalize a known dichotomy to account for negated atoms. We also show that negation fundamentally changes the complexity of approximation. We do so by drawing a connection to the problem of deciding whether a tuple is "relevant" to a query, and by analyzing its complexity.

Submitted 29 December, 2019; originally announced December 2019.

arXiv:1904.08679 [pdf, other]

doi 10.46298/lmcs-17(3:22)2021

The Shapley Value of Tuples in Query Answering

Authors: Ester Livshits, Leopoldo Bertossi, Benny Kimelfeld, Moshe Sebag

Abstract: We investigate the application of the Shapley value to quantifying the contribution of a tuple to a query answer. The Shapley value is a widely known numerical measure in cooperative game theory and in many applications of game theory for assessing the contribution of a player to a coalition game. It was established in the 1950s, and is theoretically justified by being the very single wealth-distribution measure that satisfies some natural axioms. While this value has been investigated in several areas, it has received little attention in data management. We study this measure in the context of conjunctive and aggregate queries by defining corresponding coalition games. We provide algorithmic and complexity-theoretic results on the computation of Shapley-based contributions to query answers; and for the hard cases we present approximation algorithms.

Submitted 1 September, 2021; v1 submitted 18 April, 2019; originally announced April 2019.

Journal ref: Logical Methods in Computer Science, Volume 17, Issue 3 (September 2, 2021) lmcs:6942
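Illustrative note on the entry above: the Shapley value of a tuple can be read as its average marginal contribution to the query answer over all orderings of the tuples. The following is a minimal brute-force sketch of that textbook definition, not the paper's algorithms. The toy facts, the Boolean query q() :- R(x), S(x), and the names `shapley_values` and `query_holds` are all hypothetical, and every fact is assumed to be a player in the game.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    Exponential in the number of players; intended only for tiny examples.
    """
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            after = value(frozenset(coalition))
            totals[p] += after - before
    return {p: totals[p] / len(orderings) for p in players}


# Hypothetical toy database with relations R and S (one attribute each).
facts = [("R", 1), ("S", 1), ("R", 2)]


def query_holds(subset):
    """Boolean query q() :- R(x), S(x): 1 if some x appears in both R and S."""
    r_values = {x for (rel, x) in subset if rel == "R"}
    s_values = {x for (rel, x) in subset if rel == "S"}
    return 1 if r_values & s_values else 0


scores = shapley_values(facts, query_holds)
for fact, score in scores.items():
    print(fact, score)
```

In this toy instance the fact R(2) never changes the answer, so it receives score 0, while R(1) and S(1) share the full value equally.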

arXiv:1904.06492 [pdf, other]

Properties of Inconsistency Measures for Databases

Authors: Ester Livshits, Rina Kochirgan, Segev Tsur, Ihab F. Ilyas, Benny Kimelfeld, Sudeepa Roy

Abstract: How should we quantify the inconsistency of a database that violates integrity constraints? Proper measures are important for various tasks, such as progress indication and action prioritization in cleaning systems, and reliability estimation for new datasets. To choose an appropriate inconsistency measure, it is important to identify the desired properties in the application and understand which of these is guaranteed or at least expected in practice. For example, in some use cases the inconsistency should reduce if constraints are eliminated; in others it should be stable and avoid jitters and jumps in reaction to small changes in the database. We embark on a systematic investigation of properties for database inconsistency measures. We investigate a collection of basic measures that have been proposed in the past in both the Knowledge Representation and Database communities, analyze their theoretical properties, and empirically observe their behaviour in an experimental study. We also demonstrate how the framework can lead to new inconsistency measures by introducing a new measure that, in contrast to the rest, satisfies all of the properties we consider and can be computed in polynomial time.

Submitted 1 April, 2021; v1 submitted 13 April, 2019; originally announced April 2019.

arXiv:1712.07705 [pdf, other]

Computing Optimal Repairs for Functional Dependencies

Authors: Ester Livshits, Benny Kimelfeld, Sudeepa Roy

Abstract: We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair) that is obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair) that is obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets of FDs and fails on others. We prove the following about the algorithm. When it succeeds, it can also incorporate weighted tuples and duplicate tuples. When it fails, the problem is NP-hard, and in fact, APX-complete (hence, cannot be approximated better than some constant). Thus, we establish a dichotomy in the complexity of computing an optimal S-repair. We present general analysis techniques for the complexity of computing an optimal U-repair, some based on the dichotomy for S-repairs. We also draw a connection to a past dichotomy in the complexity of finding a "most probable database" that satisfies a set of FDs with a single attribute on the left hand side; the case of general FDs was left open, and we show how our dichotomy provides the missing generalization and thereby settles the open problem.

Submitted 20 December, 2017; originally announced December 2017.

arXiv:1708.09140 [pdf, other]

The Complexity of Computing a Cardinality Repair for Functional Dependencies

Authors: Ester Livshits, Benny Kimelfeld

Abstract: For a relation that violates a set of functional dependencies, we consider the task of finding a maximum number of pairwise-consistent tuples, or what is known as a "cardinality repair." We present a polynomial-time algorithm that, for certain fixed relation schemas (with functional dependencies), computes a cardinality repair. Moreover, we prove that on any of the schemas not covered by the algorithm, finding a cardinality repair is, in fact, an NP-hard problem. In particular, we establish a dichotomy in the complexity of computing a cardinality repair, and we present an efficient algorithm to determine whether a given schema belongs to the positive side or the negative side of the dichotomy.

Submitted 30 August, 2017; originally announced August 2017.
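Illustrative note on the entry above: for the simplest schema, a single functional dependency A → B, a cardinality repair can be found greedily, because tuples conflict only when they agree on A and differ on B. The sketch below covers only that special case and is not the paper's general algorithm. The function name `cardinality_repair_single_fd`, the positional attribute indices, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict


def cardinality_repair_single_fd(tuples, lhs, rhs):
    """Largest consistent subset of `tuples` under one FD: attribute lhs -> attribute rhs.

    Tuples are grouped by their lhs value; within each group, the tuples that
    share the most frequent rhs value are kept and the rest are dropped.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for t in tuples:
        groups[t[lhs]][t[rhs]].append(t)
    repair = []
    for by_rhs in groups.values():
        # Keep the largest block of tuples that agree on the rhs value.
        repair.extend(max(by_rhs.values(), key=len))
    return repair


# Hypothetical relation with attributes (A, B, C) and the FD A -> B.
rows = [("a", 1, "x"), ("a", 1, "y"), ("a", 2, "z"), ("b", 3, "w")]
repair = cardinality_repair_single_fd(rows, lhs=0, rhs=1)
print(repair)
```

Here the tuple ("a", 2, "z") is dropped because it disagrees on B with the two other tuples that have A = "a".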

arXiv:1603.01820 [pdf, other]

Unambiguous Prioritized Repairing of Databases

Authors: Benny Kimelfeld, Ester Livshits, Liat Peterfreund

Abstract: In its traditional definition, a repair of an inconsistent database is a consistent database that differs from the inconsistent one in a "minimal way". Often, repairs are not equally legitimate, as it is desired to prefer one over another; for example, one fact is regarded more reliable than another, or a more recent fact should be preferred to an earlier one. Motivated by these considerations, researchers have introduced and investigated the framework of preferred repairs, in the context of denial constraints and subset repairs. There, a priority relation between facts is lifted towards a priority relation between consistent databases, and repairs are restricted to the ones that are optimal in the lifted sense. Three notions of lifting (and optimal repairs) have been proposed: Pareto, global, and completion. In this paper we investigate the complexity of deciding whether the priority relation suffices to clean the database unambiguously, or in other words, whether there is exactly one optimal repair. We show that the different lifting semantics entail highly different complexities. Under Pareto optimality, the problem is coNP-complete, in data complexity, for every set of functional dependencies (FDs), except for the tractable case of (equivalence to) one FD per relation. Under global optimality, one FD per relation is still tractable, but we establish $Π^{p}_{2}$-completeness for a relation with two FDs. In contrast, under completion optimality the problem is solvable in polynomial time for every set of FDs.
In fact, we present a polynomial-time algorithm for arbitrary conflict hypergraphs. We further show that under a general assumption of transitivity, this algorithm solves the problem even for global optimality. The algorithm is extremely simple, but its proof of correctness is quite intricate.

Submitted 6 March, 2016; originally announced March 2016.

arXiv:1103.3726 [pdf, ps, other]

Flow with Nonlinear Potential in General Networks -- Simulation, Optimization, Control, Risk and Stability Analysis

Authors: Emmanuel M. Livshits, Leonid A. Ostromuhov

Abstract: The aim of this paper is a short survey of models and methods developed by the authors. These models and methods are used to optimize general networks with nonlinear non-convex restrictions and objectives possessing mixed continuous-discrete optimization variables. We discuss the problem formulations and solution methods for simulation, optimization, sensitivity and stability analysis for flow with nonlinear potential in general networks. These problems and the developed methods and programs have industrial applications, e.g., in gas networks.

Submitted 18 March, 2011; originally announced March 2011.

Comments: 10 pages. The paper was presented at the 16th IMACS World Congress 2000 on Scientific Computation, Applied Mathematics and Simulation in Lausanne, Switzerland

ACM Class: J.2; G.1; G.2; I.6.8; J.2

arXiv:0804.3145 [pdf]

doi 10.1021/jp803606n

A density functional theory for symmetric radical cations from bonding to dissociation

Authors: Ester Livshits, Roi Baer

Abstract: It has been known for quite some time that approximate density functional (ADF) theories fail disastrously when describing the dissociative symmetric radical cations R2+. Considering this dissociation limit, previous work has shown that Hartree-Fock (HF) theory favors the R+1--R0 charge distribution while DF approximations favor the R+0.5 -- R+0.5. Yet, general quantum mechanical principles indicate that both these (as well as all intermediate) average charge distributions are asymptotically energy degenerate. Thus HF and ADF theories mistakenly break the symmetry but in a contradicting way. In this letter we show how to construct system-dependent long-range corrected (LC) density functionals that can successfully treat this class of molecules, avoiding the spurious symmetry breaking. Examples and comparisons to experimental data are given for R=H, He and Ne and it is shown that the new LC theory improves considerably the theoretical description of the R2+ bond properties, the long range form of the asymptotic potential curve as well as the atomic polarizability. The broader impact of this finding is discussed as well and it is argued that the widespread semi-empirical approach which advocates treating the LC parameter as a system-independent parameter is in fact inappropriate under general circumstances.

Submitted 27 April, 2008; v1 submitted 19 April, 2008; originally announced April 2008.

arXiv:cond-mat/0701493 [pdf]

doi 10.1039/B617919C

A well-tempered density functional theory of electrons in molecules

Authors: Ester Livshits, Roi Baer

Abstract: We report extensions of a recently developed approach to density functional theory with correct long-range behavior (Phys. Rev. Lett. 94, 043002 (2005)). The central quantities are a splitting functional γ[n] and a complementary exchange-correlation functional. We give a practical method for determining the value of γ in molecules, assuming an approximation for XC energy is given. The resulting theory shows good ability to reproduce the ionization potentials for various molecules. However, it is not of sufficient accuracy for forming a satisfactory framework for studying molecular properties. A somewhat different approach is then adopted, which depends on a density-independent γ and an additional parameter w eliminating part of the local exchange functional. The values of these two parameters are obtained by best-fitting to experimental atomization energies and bond-lengths of the molecules in the G2(1) database. The optimized values are γ=0.5 a_0^{-1} and w=0.1. We then examine the performance of this slightly semi-empirical functional for a variety of molecular properties, comparing to related works and to experiment. We show that this approach can be used for describing in a satisfactory manner a broad range of molecular properties, be they static or dynamic. Most satisfactory is the ability to describe valence, Rydberg and inter-molecular charge-transfer excitations.

Submitted 20 January, 2007; originally announced January 2007.

Showing 1–18 of 18 results for author: Livshits, E