-
Memento Filter: A Fast, Dynamic, and Robust Range Filter
Authors:
Navid Eslami,
Niv Dayan
Abstract:
Range filters are probabilistic data structures that answer approximate range emptiness queries. They help avoid processing empty range queries and have use cases in many application domains, such as key-value stores and social web analytics. However, current range filter designs do not support dynamically changing and growing datasets. Moreover, several of these designs exhibit impractically high false positive rates under correlated workloads, which are common in practice. These impediments restrict the applicability of range filters across a wide range of use cases. We introduce Memento filter, the first range filter to offer dynamicity, fast operations, and a robust false positive rate guarantee for any workload. Memento filter partitions the key universe and clusters its keys according to this partitioning. For each cluster, it stores a fingerprint and a list of key suffixes contiguously. The encoding of these lists makes them amenable to existing dynamic filter structures. Due to the well-defined one-to-one mapping from keys to suffixes, Memento filter supports inserts and deletes and can even expand to accommodate a growing dataset. We implement Memento filter on top of a Rank-and-Select Quotient filter and InfiniFilter and demonstrate that it achieves false positive rates and performance competitive with the state of the art while also providing dynamicity. Due to its dynamicity, Memento filter is the first range filter applicable to B-Trees. We showcase this by integrating Memento filter into WiredTiger, a B-Tree-based key-value store. Memento filter doubles WiredTiger's range query throughput when 50% of the queries are empty while leaving all other cost metrics unaffected.
Submitted 27 October, 2024; v1 submitted 10 August, 2024;
originally announced August 2024.
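The partition-and-suffix idea in the Memento filter abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the class name `ToyRangeFilter`, the parameters `suffix_bits` and `fingerprint_bits`, the multiplicative hash, and the use of a Python dict in place of a quotient filter. Keys are assumed to be non-negative integers.

```python
from bisect import bisect_left, insort


class ToyRangeFilter:
    """Toy model only: keys are split into a prefix (the partition) and a suffix.

    Each partition is represented by a short fingerprint of its prefix plus a
    sorted list of the suffixes stored in it. Fingerprint collisions merge
    partitions, which is the source of false positives in this toy.
    """

    def __init__(self, suffix_bits=5, fingerprint_bits=16):
        self.suffix_bits = suffix_bits            # hypothetical partition width
        self.fingerprint_bits = fingerprint_bits  # hypothetical fingerprint width
        self.clusters = {}                        # fingerprint -> sorted suffixes

    def _split(self, key):
        # One-to-one mapping from a key to its (prefix, suffix) pair.
        return key >> self.suffix_bits, key & ((1 << self.suffix_bits) - 1)

    def _fingerprint(self, prefix):
        # Deterministic multiplicative hash, chosen only for illustration.
        mixed = (prefix * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
        return mixed >> (64 - self.fingerprint_bits)

    def insert(self, key):
        prefix, suffix = self._split(key)
        insort(self.clusters.setdefault(self._fingerprint(prefix), []), suffix)

    def delete(self, key):
        # Only valid for a key that was previously inserted.
        prefix, suffix = self._split(key)
        fp = self._fingerprint(prefix)
        suffixes = self.clusters[fp]
        suffixes.pop(bisect_left(suffixes, suffix))
        if not suffixes:
            del self.clusters[fp]

    def may_contain_range(self, lo, hi):
        """Return False only if no inserted key lies in the inclusive range [lo, hi]."""
        lo_prefix, lo_suffix = self._split(lo)
        hi_prefix, hi_suffix = self._split(hi)
        max_suffix = (1 << self.suffix_bits) - 1
        for prefix in range(lo_prefix, hi_prefix + 1):
            suffixes = self.clusters.get(self._fingerprint(prefix))
            if not suffixes:
                continue
            # Restrict the suffix interval at the two boundary partitions.
            first = lo_suffix if prefix == lo_prefix else 0
            last = hi_suffix if prefix == hi_prefix else max_suffix
            i = bisect_left(suffixes, first)
            if i < len(suffixes) and suffixes[i] <= last:
                return True
        return False
```

Because every key maps to exactly one (prefix, suffix) pair, a delete can remove precisely what an insert added, which is the property the abstract credits for dynamicity. The toy scans every partition overlapping the query and makes no claim about the encoding, expansion mechanism, or performance of the actual filter.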
-
Rethinking RAFT for Efficient Optical Flow
Authors:
Navid Eslami,
Farnoosh Arefi,
Amir M. Mansourian,
Shohreh Kasaei
Abstract:
Despite significant progress in deep learning-based optical flow methods, accurately estimating large displacements and repetitive patterns remains a challenge. The limitations of local features and similarity search patterns used in these algorithms contribute to this issue. Additionally, some existing methods suffer from slow runtime and excessive GPU memory consumption. To address these problems, this paper proposes a novel approach based on the RAFT framework. The proposed Attention-based Feature Localization (AFL) approach incorporates the attention mechanism to handle global feature extraction and address repetitive patterns. It introduces an operator for matching pixels with corresponding counterparts in the second frame and assigning accurate flow values. Furthermore, an Amorphous Lookup Operator (ALO) is proposed to enhance convergence speed and improve RAFT's ability to handle large displacements by reducing data redundancy in its search operator and expanding the search space for similarity extraction. The proposed method, Efficient RAFT (Ef-RAFT), achieves significant improvements of 10% on the Sintel dataset and 5% on the KITTI dataset over RAFT. Remarkably, these enhancements are attained with a modest 33% reduction in speed and a mere 13% increase in memory usage. The code is available at: https://github.com/n3slami/Ef-RAFT
Submitted 1 January, 2024;
originally announced January 2024.
-
Blacksmith: Fast Adversarial Training of Vision Transformers via a Mixture of Single-step and Multi-step Methods
Authors:
Mahdi Salmani,
Alireza Dehghanpour Farashah,
Mohammad Azizmalayeri,
Mahdi Amiri,
Navid Eslami,
Mohammad Taghi Manzuri,
Mohammad Hossein Rohban
Abstract:
Despite the remarkable success achieved by deep learning algorithms in various domains, such as computer vision, they remain vulnerable to adversarial perturbations. Adversarial Training (AT) stands out as one of the most effective solutions to address this issue; however, single-step AT can lead to Catastrophic Overfitting (CO). This scenario occurs when the adversarially trained network suddenly loses robustness against multi-step attacks like Projected Gradient Descent (PGD). Although several approaches have been proposed to address this problem in Convolutional Neural Networks (CNNs), we found that they do not perform well when applied to Vision Transformers (ViTs). In this paper, we propose Blacksmith, a novel training strategy to overcome the CO problem, specifically in ViTs. Our approach randomly applies either PGD-2 or the Fast Gradient Sign Method (FGSM) in a mini-batch during the adversarial training of the neural network. This increases the diversity of our training attacks, which could potentially mitigate the CO issue. To manage the increased training time resulting from this combination, we craft the PGD-2 attack based on only the first half of the layers, while FGSM is applied end-to-end. Through our experiments, we demonstrate that our novel method effectively prevents CO, achieves PGD-2 level performance, and outperforms other existing techniques, including N-FGSM, which is the state-of-the-art method in fast training for CNNs.
Submitted 29 October, 2023;
originally announced October 2023.
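The random choice between a single-step and a two-step attack described in the Blacksmith abstract can be sketched on a toy model. The sketch below makes several assumptions that are not in the abstract: a linear classifier with logistic loss stands in for the ViT, one attack is drawn per mini-batch with a hypothetical probability `p_pgd`, the PGD step size defaults to `eps / steps`, and the function names are invented for illustration. The abstract's restriction of PGD-2 to the first half of the layers has no counterpart in a linear model and is omitted.

```python
import math
import random


def loss_grad_wrt_input(w, x, y):
    """Gradient of log(1 + exp(-y * w.x)) with respect to x, for y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    scale = -y / (1.0 + math.exp(margin))
    return [scale * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, x, y, eps):
    """Single-step attack: one signed-gradient step of size eps."""
    g = loss_grad_wrt_input(w, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def pgd(w, x, y, eps, steps=2, alpha=None):
    """Multi-step attack: signed-gradient steps projected onto the eps-ball around x."""
    alpha = eps / steps if alpha is None else alpha  # assumed step size
    adv = list(x)
    for _ in range(steps):
        g = loss_grad_wrt_input(w, adv, y)
        adv = [a + alpha * sign(gi) for a, gi in zip(adv, g)]
        # Project back so every coordinate stays within eps of the clean input.
        adv = [min(max(a, xi - eps), xi + eps) for a, xi in zip(adv, x)]
    return adv


def mixed_attack(w, batch, eps, rng, p_pgd=0.5):
    """Draw one attack for the whole mini-batch and apply it to every example."""
    attack = pgd if rng.random() < p_pgd else fgsm
    return attack.__name__, [attack(w, x, y, eps) for x, y in batch]
```

A training loop would call `mixed_attack` once per mini-batch and compute the loss on the returned adversarial examples. Whether the actual method draws the attack per batch or per example is not specified in the abstract, so the per-batch choice here is only one possible reading.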
-
Locality in online, dynamic, sequential, and distributed graph algorithms
Authors:
Amirreza Akbari,
Navid Eslami,
Henrik Lievonen,
Darya Melnyk,
Joona Särkijärvi,
Jukka Suomela
Abstract:
In this work, we give a unifying view of locality in four settings: distributed algorithms, sequential greedy algorithms, dynamic algorithms, and online algorithms. We introduce a new model of computing, called the online-LOCAL model: the adversary reveals the nodes of the input graph one by one, in the same way as in classical online algorithms, but for each new node we get to see its radius-$T$ neighborhood before choosing the output. We compare the online-LOCAL model with three other models: the LOCAL model of distributed computing, where each node produces its output based on its radius-$T$ neighborhood, its sequential counterpart SLOCAL, and the dynamic-LOCAL model, where changes in the dynamic input graph only influence the radius-$T$ neighborhood of the point of change. The SLOCAL and dynamic-LOCAL models are sandwiched between the LOCAL and online-LOCAL models, with LOCAL being the weakest and online-LOCAL the strongest model. In general, all models are distinct, but we study in particular locally checkable labeling problems (LCLs), a family of graph problems studied in the context of distributed graph algorithms. We prove that for LCL problems in paths, cycles, and rooted trees, all models are roughly equivalent: the locality of any LCL problem falls in the same broad class ($O(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$) in all four models. In particular, this result enables one to generalize prior lower-bound results from the LOCAL model to all four models, and it also allows one to simulate e.g. dynamic-LOCAL algorithms efficiently in the LOCAL model. We also show that this equivalence does not hold in general bipartite graphs. We provide an online-LOCAL algorithm with locality $O(\log n)$ for the $3$-coloring problem in bipartite graphs; this is a problem with locality $\Omega(n^{1/2})$ in the LOCAL model and $\Omega(n^{1/10})$ in the SLOCAL model.
Submitted 12 November, 2022; v1 submitted 14 September, 2021;
originally announced September 2021.
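The model hierarchy and the classification stated in the locality abstract can be written compactly. The notation is an assumption introduced here for illustration: $T_M(\Pi)$ denotes the locality of a problem $\Pi$ in model $M$, and $\lesssim$ means "at most, up to constant factors"; "weaker model" is read as "needs at least as much locality".

```latex
% Sandwich: LOCAL is the weakest model, online-LOCAL the strongest.
\[
T_{\text{online-LOCAL}}(\Pi)
\;\lesssim\;
T_{\text{SLOCAL}}(\Pi),\; T_{\text{dynamic-LOCAL}}(\Pi)
\;\lesssim\;
T_{\text{LOCAL}}(\Pi)
\]

% LCL problems on paths, cycles, and rooted trees: one class shared by all four models.
\[
T_M(\Pi) \in \bigl\{\, O(\log^* n),\; \Theta(\log n),\; n^{\Theta(1)} \,\bigr\}
\quad \text{for every } M \in \{\text{LOCAL},\ \text{SLOCAL},\ \text{dynamic-LOCAL},\ \text{online-LOCAL}\}
\]

% Separation on bipartite graphs for 3-coloring.
\[
T_{\text{online-LOCAL}} = O(\log n), \qquad
T_{\text{SLOCAL}} = \Omega(n^{1/10}), \qquad
T_{\text{LOCAL}} = \Omega(n^{1/2})
\]
```

The third line shows why the equivalence fails in general: for 3-coloring bipartite graphs, the online-LOCAL upper bound falls in a strictly smaller class than the LOCAL and SLOCAL lower bounds.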