Showing 1–9 of 9 results for author: Afrati, F N

  1. arXiv:2208.09671  [pdf, other]

    cs.DB

    Safe Subjoins in Acyclic Joins

    Authors: Foto N. Afrati

    Abstract: It is expensive to compute joins, often due to large intermediate relations. For acyclic joins, monotone join expressions are guaranteed to produce intermediate relations not larger than the size of the output of the join when it is computed on a fully reduced database. Any subexpression of an acyclic join does not offer this guarantee, as it is easy to prove. In this paper, we consider joins with…

    Submitted 20 August, 2022; originally announced August 2022.

  2. arXiv:2102.06563  [pdf, other]

    cs.DB

    Querying collections of tree-structured records in the presence of within-record referential constraints

    Authors: Foto N. Afrati, Matthew Damigos

    Abstract: In this paper, we consider a tree-structured data model used in many commercial databases like Dremel, F1, JSON stores. We define identity and referential constraints within each tree-structured record. The query language is a variant of SQL and flattening is used as an evaluation mechanism. We investigate querying in the presence of these constraints, and point out the challenges that arise from…

    Submitted 30 August, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

  3. arXiv:2008.10986  [pdf, other]

    cs.DB

    On the complexity of query containment and computing certain answers in the presence of ACs

    Authors: Foto N. Afrati, Matthew Damigos

    Abstract: We often add arithmetic to extend the expressiveness of query languages and study the complexity of problems such as testing query containment and finding certain answers in the framework of answering queries using views. When adding arithmetic comparisons, the complexity of such problems is higher than the complexity of their counterparts without them. It has been observed that we can achieve low…

    Submitted 18 November, 2020; v1 submitted 25 August, 2020; originally announced August 2020.

  4. arXiv:1504.03247  [pdf, other]

    cs.DB

    Handling Skew in Multiway Joins in Parallel Processing

    Authors: Foto N. Afrati, Jeffrey D. Ullman, Angelos Vasilakopoulos

    Abstract: Handling skew is one of the major challenges in query processing. In distributed computational environments such as MapReduce, uneven distribution of the data to the servers is not desired. One of the dominant measures that we want to optimize in distributed environments is communication cost. In a MapReduce job this is the amount of data that is transferred from the mappers to the reducers. In th…

    Submitted 13 April, 2015; originally announced April 2015.

    Comments: 4 pages

  5. arXiv:1503.00650  [pdf, other]

    cs.DB

    Consistent Answers of Conjunctive Queries on Graphs

    Authors: Foto N. Afrati, Phokion G. Kolaitis, Angelos Vasilakopoulos

    Abstract: During the past decade, there has been an extensive investigation of the computational complexity of the consistent answers of Boolean conjunctive queries under primary key constraints. Much of this investigation has focused on self-join-free Boolean conjunctive queries. In this paper, we study the consistent answers of Boolean conjunctive queries involving a single binary relation, i.e., we consi…

    Submitted 2 March, 2015; originally announced March 2015.

  6. arXiv:1312.2990  [pdf, ps, other]

    cs.DB

    Efficient Lineage for SUM Aggregate Queries

    Authors: Foto N. Afrati, Dimitris Fotakis, Angelos Vasilakopoulos

    Abstract: AI systems typically make decisions and find patterns in data based on the computation of aggregate and specifically sum functions, expressed as queries, on data's attributes. This computation can become costly or even inefficient when these queries concern the whole or big parts of the data and especially when we are dealing with big data. New types of intelligent analytics require also the expla…

    Submitted 9 June, 2014; v1 submitted 10 December, 2013; originally announced December 2013.

  7. arXiv:1208.0615  [pdf, ps, other]

    cs.DC

    Enumerating Subgraph Instances Using Map-Reduce

    Authors: Foto N. Afrati, Dimitris Fotakis, Jeffrey D. Ullman

    Abstract: The theme of this paper is how to find all instances of a given "sample" graph in a larger "data graph," using a single round of map-reduce. For the simplest sample graph, the triangle, we improve upon the best known such algorithm. We then examine the general case, considering both the communication cost between mappers and reducers and the total computation cost at the reducers. To minimize comm…

    Submitted 21 November, 2012; v1 submitted 2 August, 2012; originally announced August 2012.

    Comments: 37 pages

  8. arXiv:1206.4377  [pdf, other]

    cs.DC cs.DS

    Upper and Lower Bounds on the Cost of a Map-Reduce Computation

    Authors: Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey D. Ullman

    Abstract: In this paper we study the tradeoff between parallelism and communication cost in a map-reduce computation. For any problem that is not "embarrassingly parallel," the finer we partition the work of the reducers so that more parallelism can be extracted, the greater will be the total communication between mappers and reducers. We introduce a model of problems that can be solved in a single round of…

    Submitted 19 June, 2012; originally announced June 2012.

    Comments: 14 pages

  9. arXiv:1204.1754  [pdf, other]

    cs.DB cs.DC

    Vision Paper: Towards an Understanding of the Limits of Map-Reduce Computation

    Authors: Foto N. Afrati, Anish Das Sarma, Semih Salihoglu, Jeffrey D. Ullman

    Abstract: A significant amount of recent research work has addressed the problem of solving various data management problems in the cloud. The major algorithmic challenges in map-reduce computations involve balancing a multitude of factors such as the number of machines available for mappers/reducers, their memory requirements, and communication cost (total amount of data sent from mappers to reducers). Mos…

    Submitted 8 April, 2012; originally announced April 2012.

    Comments: 5 pages