Showing 1–10 of 10 results for author: Tsotras, V J

  1. arXiv:2412.14519  [pdf, other]

    cs.DB

    Optimizing Big Active Data Management Systems

    Authors: Shahrzad Haji Amin Shirazi, Xikui Wang, Michael J. Carey, Vassilis J. Tsotras

    Abstract: Within the dynamic world of Big Data, traditional systems typically operate in a passive mode, processing and responding to user queries by returning the requested data. However, this methodology falls short of meeting the evolving demands of users who not only wish to analyze data but also to receive proactive updates on topics of interest. To bridge this gap, Big Active Data (BAD) frameworks hav…

    Submitted 20 December, 2024; v1 submitted 18 December, 2024; originally announced December 2024.

  2. EBV: Electronic Bee-Veterinarian for Principled Mining and Forecasting of Honeybee Time Series

    Authors: Mst. Shamima Hossain, Christos Faloutsos, Boris Baer, Hyoseung Kim, Vassilis J. Tsotras

    Abstract: Honeybees are vital for pollination and food production. Among many factors, extreme temperature (e.g., due to climate change) is particularly dangerous for bee health. Anticipating such extremities would allow beekeepers to take early preventive action. Thus, given sensor (temperature) time series data from beehives, how can we find patterns and do forecasting? Forecasting is crucial as it helps…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 9 pages, 7 figures, Accepted at 2024 SIAM International Conference on Data Mining (SDM'24)

  3. arXiv:2105.08312  [pdf, ps, other]

    cs.DB

    Reachability and Top-k Reachability Queries with Transfer Decay

    Authors: Elena V. Strzheletska, Vassilis J. Tsotras

    Abstract: The prevalence of location tracking systems has resulted in large volumes of spatiotemporal data generated every day. Addressing reachability queries on such datasets is important for a wide range of applications (surveillance, public health, social networks, etc.). A spatiotemporal reachability query identifies whether a physical item (or information etc.) could have been transferred from the sour…

    Submitted 18 May, 2021; originally announced May 2021.

    Comments: 10 pages

  4. arXiv:2101.01852  [pdf, other]

    cs.DB

    Bridging BAD Islands: Declarative Data Sharing at Scale

    Authors: Xikui Wang, Michael J. Carey, Vassilis J. Tsotras

    Abstract: In many Big Data applications today, information needs to be actively shared between systems managed by different organizations. To enable sharing Big Data at scale, developers would have to create dedicated server programs and glue together multiple Big Data systems for scalability. Developing and managing such glued data sharing services requires a significant amount of work from developers. In…

    Submitted 5 January, 2021; originally announced January 2021.

    Comments: 10 pages, 34 figures, to appear in IEEE Big Data - Workshop on Scalable Cloud Data Management

  5. arXiv:2010.00728  [pdf, other]

    cs.DB

    Revisiting Runtime Dynamic Optimization for Join Queries in Big Data Management Systems

    Authors: Christina Pavlopoulou, Michael J. Carey, Vassilis J. Tsotras

    Abstract: Query Optimization remains an open problem for Big Data Management Systems. Traditional optimizers are cost-based and use statistical estimates of intermediate result cardinalities to assign costs and pick the best plan. However, such estimates tend to become less accurate because of filtering conditions caused either from undetected correlations between multiple predicates local to a single datas…

    Submitted 5 October, 2020; v1 submitted 1 October, 2020; originally announced October 2020.

  6. arXiv:2009.04611  [pdf, other]

    cs.DB cs.DC

    Subscribing to Big Data at Scale

    Authors: Xikui Wang, Michael J. Carey, Vassilis J. Tsotras

    Abstract: Today, data is being actively generated by a variety of devices, services, and applications. Such data is important not only for the information that it contains, but also for its relationships to other data and to interested users. Most existing Big Data systems focus on passively answering queries from users, rather than actively collecting data, processing it, and serving it to users. To satisf…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: 36 pages, 47 figures, submitted to TOCS

  7. BAD to the Bone: Big Active Data at its Core

    Authors: Steven Jacobs, Xikui Wang, Michael J. Carey, Vassilis J. Tsotras, Md Yusuf Sarwar Uddin

    Abstract: Virtually all of today's Big Data systems are passive in nature, responding to queries posted by their users. Instead, we are working to shift Big Data platforms from passive to active. In our view, a Big Active Data (BAD) system should continuously and reliably capture Big Data while enabling timely and automatic delivery of relevant information to a large pool of interested users, as well as sup…

    Submitted 23 May, 2020; v1 submitted 22 February, 2020; originally announced February 2020.

    Comments: 30 pages. Accepted by VLDBJ

  8. arXiv:1504.00331  [pdf, other]

    cs.DB

    Apache VXQuery: A Scalable XQuery Implementation

    Authors: E. Preston Carman Jr., Till Westmann, Vinayak R. Borkar, Michael J. Carey, Vassilis J. Tsotras

    Abstract: The wide use of XML for document management and data exchange has created the need to query large repositories of XML data. To efficiently query such large data collections and take advantage of parallelism, we have implemented Apache VXQuery, an open-source scalable XQuery processor. The system builds upon two other open-source frameworks -- Hyracks, a parallel execution engine, and Algebricks, a…

    Submitted 1 April, 2015; originally announced April 2015.

  9. arXiv:1311.0059  [pdf, other]

    cs.DB

    Revisiting Aggregation for Data Intensive Applications: A Performance Study

    Authors: Jian Wen, Vinayak R. Borkar, Michael J. Carey, Vassilis J. Tsotras

    Abstract: Aggregation has been an important operation since the early days of relational databases. Today's Big Data applications bring further challenges when processing aggregation queries, demanding adaptive aggregation algorithms that can process large volumes of data relative to a potentially limited memory budget (especially in multiuser settings). Despite its importance, the design and evaluation of…

    Submitted 31 October, 2013; originally announced November 2013.

    Comments: 25 Pages

  10. arXiv:1205.6695  [pdf, other]

    cs.DB

    On The Spatiotemporal Burstiness of Terms

    Authors: Theodoros Lappas, Marcos R. Vieira, Dimitrios Gunopulos, Vassilis J. Tsotras

    Abstract: Thousands of documents are made available to the users via the web on a daily basis. One of the most extensively studied problems in the context of such document streams is burst identification. Given a term t, a burst is generally exhibited when an unusually high frequency is observed for t. While spatial and temporal burstiness have been studied individually in the past, our work is the first to…

    Submitted 30 May, 2012; originally announced May 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 9, pp. 836-847 (2012)