Showing 1–20 of 20 results for author: Ferhatosmanoğlu, H

  1. arXiv:2502.01528  [pdf, other]

    cs.DC cs.DB

    SQUASH: Serverless and Distributed Quantization-based Attributed Vector Similarity Search

    Authors: Joe Oakley, Hakan Ferhatosmanoglu

    Abstract: Vector similarity search presents significant challenges in terms of scalability for large and high-dimensional datasets, as well as in providing native support for hybrid queries. Serverless computing and cloud functions offer attractive benefits such as elasticity and cost-effectiveness, but are difficult to apply to data-intensive workloads. Jointly addressing these two main challenges, we pres…

    Submitted 3 February, 2025; originally announced February 2025.

  2. arXiv:2409.09079  [pdf, other]

    cs.DC cs.AI cs.LG

    D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks

    Authors: Rustam Guliyev, Aparajita Haldar, Hakan Ferhatosmanoglu

    Abstract: Graph Neural Network (GNN) models on streaming graphs entail algorithmic challenges to continuously capture its dynamic state, as well as systems challenges to optimize latency, memory, and throughput during both inference and training. We present D3-GNN, the first distributed, hybrid-parallel, streaming GNN system designed to handle real-time graph updates under online query setting. Our system a…

    Submitted 10 September, 2024; originally announced September 2024.

    Comments: 14 pages, 7 figures, published at VLDB'24

    Journal ref: Proc. VLDB Endow. 17, 11 (2024), 2764-2777

  3. arXiv:2403.15195  [pdf, other]

    cs.DC cs.AI cs.LG

    FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication

    Authors: Joe Oakley, Hakan Ferhatosmanoglu

    Abstract: Serverless computing offers attractive scalability, elasticity and cost-effectiveness. However, constraints on memory, CPU and function runtime have hindered its adoption for data-intensive applications and machine learning (ML) workloads. Traditional 'server-ful' platforms enable distributed computation via fast networks and well-established inter-process communication (IPC) mechanisms such as MP…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: In Proceedings of 2024 IEEE 40th International Conference on Data Engineering (ICDE) (to appear)

  4. Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation

    Authors: Shuang Wang, Bahaeddin Eravci, Rustam Guliyev, Hakan Ferhatosmanoglu

    Abstract: Graph Neural Network (GNN) training and inference involve significant challenges of scalability with respect to both model sizes and number of layers, resulting in degradation of efficiency and accuracy for large and deep GNNs. We present an end-to-end solution that aims to address these challenges for efficient GNNs in resource constrained environments while avoiding the oversmoothing problem in…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: To appear in CIKM2023

    MSC Class: 68T07; ACM Class: I.m

    Journal ref: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23), October 21–25, 2023, Birmingham, United Kingdom

  5. arXiv:2212.05009  [pdf, other]

    cs.LG cs.AI cs.DC

    Scalable Graph Convolutional Network Training on Distributed-Memory Systems

    Authors: Gunduz Vehbi Demirci, Aparajita Haldar, Hakan Ferhatosmanoglu

    Abstract: Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed memory systems necessary. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses uni…

    Submitted 13 December, 2022; v1 submitted 9 December, 2022; originally announced December 2022.

    Comments: To appear in PVLDB'22

  6. RAGUEL: Recourse-Aware Group Unfairness Elimination

    Authors: Aparajita Haldar, Teddy Cunningham, Hakan Ferhatosmanoglu

    Abstract: While machine learning and ranking-based systems are in widespread use for sensitive decision-making processes (e.g., determining job candidates, assigning credit scores), they are rife with concerns over unintended biases in their outcomes, which makes algorithmic fairness (e.g., demographic parity, equal opportunity) an objective of interest. 'Algorithmic recourse' offers feasible recovery actio…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: to be published in CIKM'22

  7. arXiv:2205.08886  [pdf, other]

    cs.LG cs.AI cs.CR cs.DB

    GeoPointGAN: Synthetic Spatial Data with Local Label Differential Privacy

    Authors: Teddy Cunningham, Konstantin Klemmer, Hongkai Wen, Hakan Ferhatosmanoglu

    Abstract: Synthetic data generation is a fundamental task for many data management and data science applications. Spatial data is of particular interest, and its sensitive nature often leads to privacy concerns. We introduce GeoPointGAN, a novel GAN-based solution for generating synthetic spatial point datasets with high utility and strong individual level privacy guarantees. GeoPointGAN's architecture incl…

    Submitted 18 May, 2022; originally announced May 2022.

  8. arXiv:2203.14925  [pdf, other]

    cs.SI cs.CE

    Temporal Cascade Model for Analyzing Spread in Evolving Networks with Disease Monitoring Applications

    Authors: Aparajita Haldar, Shuang Wang, Gunduz Demirci, Joe Oakley, Hakan Ferhatosmanoglu

    Abstract: Current approaches for modeling propagation in networks (e.g., spread of disease) are unable to adequately capture temporal properties of the data such as order and duration of evolving connections or dynamic likelihoods of propagation along these connections. Temporal models in evolving networks are crucial in many applications that need to analyze dynamic spread. For example, a disease-spreading…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to ACM Transactions on Spatial Algorithms and Systems (TSAS) Journal, September 2022. For code and data, see https://github.com/publiccoderepo/T-IC-model

  9. Collective Shortest Paths for Minimizing Congestion on Temporal Load-Aware Road Networks

    Authors: Chris Conlan, Teddy Cunningham, Gunduz Vehbi Demirci, Hakan Ferhatosmanoglu

    Abstract: Shortest path queries over graphs are usually considered as isolated tasks, where the goal is to return the shortest path for each individual query. In practice, however, such queries are typically part of a system (e.g., a road network) and their execution dynamically affects other queries and network parameters, such as the loads on edges, which in turn affects the shortest paths. We study the p…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: 10 pages, to appear at the IWCTS Workshop at SIGSPATIAL 2021

  10. Privacy-Preserving Synthetic Location Data in the Real World

    Authors: Teddy Cunningham, Graham Cormode, Hakan Ferhatosmanoglu

    Abstract: Sharing sensitive data is vital in enabling many modern data analysis and machine learning tasks. However, current methods for data release are insufficiently accurate or granular to provide meaningful utility, and they carry a high risk of deanonymization or membership inference attacks. In this paper, we propose a differentially private synthetic data generation solution with a focus on the comp…

    Submitted 4 August, 2021; originally announced August 2021.

    Journal ref: 17th International Symposium on Spatial and Temporal Databases (SSTD '21), 2021

  11. Real-World Trajectory Sharing with Local Differential Privacy

    Authors: Teddy Cunningham, Graham Cormode, Hakan Ferhatosmanoglu, Divesh Srivastava

    Abstract: Sharing trajectories is beneficial for many real-world applications, such as managing disease spread through contact tracing and tailoring public services to a population's travel patterns. However, public concern over privacy and data protection has limited the extent to which this data is shared. Local differential privacy enables data sharing in which users share a perturbed version of their da…

    Submitted 4 August, 2021; originally announced August 2021.

    Journal ref: PVLDB, 14(11): 2283-2295, 2021

  12. arXiv:2104.11805  [pdf, other]

    cs.LG cs.AI cs.DC

    Partitioning sparse deep neural networks for scalable training and inference

    Authors: Gunduz Vehbi Demirci, Hakan Ferhatosmanoglu

    Abstract: The state-of-the-art deep neural networks (DNNs) have significant computational and data management requirements. The size of both training data and models continue to increase. Sparsification and pruning methods are shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges to further improve the computational efficiency of t…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: Gunduz Vehbi Demirci and Hakan Ferhatosmanoglu. 2021. Partitioning Sparse Deep Neural Networks for Scalable Training and Inference. In 2021 International Conference on Supercomputing (ICS '21), June 14-17, 2021, Virtual Event, USA. ACM, New York, NY, USA, 12 pages

  13. PPQ-Trajectory: Spatio-temporal Quantization for Querying in Large Trajectory Repositories

    Authors: Shuang Wang, Hakan Ferhatosmanoglu

    Abstract: We present PPQ-trajectory, a spatio-temporal quantization based solution for querying large dynamic trajectory data. PPQ-trajectory includes a partition-wise predictive quantizer (PPQ) that generates an error-bounded codebook with autocorrelation and spatial proximity-based partitions. The codebook is indexed to run approximate and exact spatio-temporal queries over compressed trajectories. PPQ-tr…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: To appear at VLDB 2021

  14. arXiv:2004.05951  [pdf, other]

    cs.DB

    SLIM: Scalable Linkage of Mobility Data

    Authors: Fuat Basık, Hakan Ferhatosmanoğlu, Buğra Gedik

    Abstract: We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for servic…

    Submitted 13 April, 2020; originally announced April 2020.

    Comments: To appear in SIGMOD 2020

  15. arXiv:1909.00741  [pdf, other]

    cs.MM cs.CV cs.IR

    VISIR: Visual and Semantic Image Label Refinement

    Authors: Sreyasi Nag Chowdhury, Niket Tandon, Hakan Ferhatosmanoglu, Gerhard Weikum

    Abstract: The social media explosion has populated the Internet with a wealth of images. There are two existing paradigms for image retrieval: 1) content-based image retrieval (CBIR), which has traditionally used visual features for similarity search (e.g., SIFT features), and 2) tag-based image retrieval (TBIR), which has relied on user tagging (e.g., Flickr tags). CBIR now gains semantic expressiveness by…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: Published in WSDM 2018

    Journal ref: ACM ISBN 978-1-4503-5581-0/18/02 2018

  16. arXiv:1904.04866  [pdf, other]

    cs.CL cs.LG

    Characterizing the impact of geometric properties of word embeddings on task performance

    Authors: Brendan Whitaker, Denis Newman-Griffis, Aparajita Haldar, Hakan Ferhatosmanoglu, Eric Fosler-Lussier

    Abstract: Analysis of word embedding properties to inform their use in downstream NLP tasks has largely been studied by assessing nearest neighbors. However, geometric properties of the continuous feature space contribute directly to the use of embedding features in downstream models, and are largely unexplored. We consider four properties of word embedding geometry, namely: position relative to the origin,…

    Submitted 9 April, 2019; originally announced April 2019.

    Comments: Appearing in the Third Workshop on Evaluating Vector Space Representations for NLP (RepEval 2019). 7 pages + references

  17. Fair Task Allocation in Crowdsourced Delivery

    Authors: Fuat Basik, Bugra Gedik, Hakan Ferhatosmanoglu, Kun-Lung Wu

    Abstract: Faster and more cost-efficient, crowdsourced delivery is needed to meet the growing customer demands of many industries, including online shopping, on-demand local delivery, and on-demand transportation. The power of crowdsourced delivery stems from the large number of workers potentially available to provide services and reduce costs. It has been shown in social psychology literature that fairnes…

    Submitted 9 July, 2018; originally announced July 2018.

    Comments: To Appear in IEEE Transactions on Services Computing

  18. Spatio-Temporal Linkage over Location Enhanced Services

    Authors: Fuat Basık, Buğra Gedik, Çağrı Etemoğlu, Hakan Ferhatosmanoğlu

    Abstract: We are witnessing an enormous growth in the volume of data generated by various online services. An important portion of this data contains geographic references, since many of these services are location-enhanced and thus produce spatio-temporal records of their usage. We postulate that the spatio-temporal usage records belonging to the same real-world entity can be matched across records…

    Submitted 12 January, 2018; originally announced January 2018.

    Comments: IEEE Transactions on Mobile Computing (Volume: 17, Issue: 2, Feb. 1 2018) http://ieeexplore.ieee.org/document/7937913/

    Journal ref: F. Basık, B. Gedik, Ç. Etemoğlu and H. Ferhatosmanoğlu, "Spatio-Temporal Linkage over Location-Enhanced Services," in IEEE Transactions on Mobile Computing, vol. 17, no. 2, pp. 447-460, Feb. 1 2018

  19. arXiv:1801.02198  [pdf, other]

    cs.SI physics.soc-ph

    Topic-Based Influence Computation in Social Networks under Resource Constraints

    Authors: Kaan Bingöl, Bahaeddin Eravcı, Çağrı Özgenç Etemoğlu, Hakan Ferhatosmanoğlu, Buğra Gedik

    Abstract: As social networks are constantly changing and evolving, methods to analyze dynamic social networks are becoming more important in understanding social trends. However, due to the restrictions imposed by the social network service providers, the resources available to fetch the entire contents of a social network are typically very limited. As a result, analysis of dynamic social network data requ…

    Submitted 7 January, 2018; originally announced January 2018.

  20. Privacy-Preserving Aggregate Queries for Optimal Location Selection

    Authors: Emre Yilmaz, Hakan Ferhatosmanoglu, Erman Ayday, Remzi Can Aksoy

    Abstract: Today, vast amounts of location data are collected by various service providers. These location data owners have a good idea of where their users are most of the time. Other businesses also want to use this information for location analytics, such as finding the optimal location for a new branch. However, location data owners cannot share their data with other businesses, mainly due to privacy and…

    Submitted 6 January, 2018; originally announced January 2018.

    Comments: IEEE Transactions on Dependable and Secure Computing, 2017

    Journal ref: IEEE Transactions on Dependable and Secure Computing, 16(2), 329-343, 2019