-
Membrane: Accelerating Database Analytics with Bank-Level DRAM-PIM Filtering
Authors:
Akhil Shekar,
Kevin Gaffney,
Martin Prammer,
Khyati Kiyawat,
Lingxi Wu,
Helena Caminal,
Zhenxing Fan,
Yimin Gao,
Ashish Venkat,
José F. Martínez,
Jignesh Patel,
Kevin Skadron
Abstract:
In-memory database query processing frequently involves substantial data transfers between the CPU and memory, leading to inefficiencies due to the von Neumann bottleneck. Processing-in-Memory (PIM) architectures offer a viable solution to alleviate this bottleneck. In our study, we employ a commonly used software approach that streamlines JOIN operations into simpler selection or filtering tasks using pre-join denormalization, which makes the query processing workload more amenable to PIM acceleration. This research explores the DRAM design landscape to evaluate how efficiently these filtering tasks can be executed across the DRAM hierarchy and their effect on overall application speedup. We also find that operations such as aggregates are more suitably executed on the CPU rather than PIM. Thus, we propose a cooperative query processing framework that capitalizes on both CPU and PIM strengths, where (i) the DRAM-based PIM block, with its massive parallelism, supports scan operations while (ii) the CPU, with its flexible architecture, supports the rest of query execution. This allows us to utilize both PIM and CPU where appropriate and prevent dramatic changes to the overall system architecture.
With these minimal modifications, our methodology enables us to faithfully perform end-to-end performance evaluations using established analytics benchmarks such as TPC-H and the Star Schema Benchmark (SSB). Our findings show that this novel mapping approach improves performance, delivering a 5.92x/6.5x speedup compared to a traditional schema and a 3.03-4.05x speedup compared to a denormalized schema with 9-17% memory overhead, depending on the degree of partial denormalization. Further, we provide insights into query selectivity, memory overheads, and software optimizations in the context of PIM-based filtering, which better explain the behavior and performance of these systems across the benchmarks.
Submitted 8 April, 2025;
originally announced April 2025.
-
Jina CLIP: Your CLIP Model Is Also Your Text Retriever
Authors:
Andreas Koukounas,
Georgios Mastrapas,
Michael Günther,
Bo Wang,
Scott Martens,
Isabelle Mohr,
Saba Sturua,
Mohammad Kalim Akram,
Joan Fontanals Martínez,
Saahil Ognawala,
Susana Guzman,
Maximilian Werk,
Nan Wang,
Han Xiao
Abstract:
Contrastive Language-Image Pretraining (CLIP) is widely used to train models to align images and texts in a common embedding space by mapping them to fixed-size vectors. These models are key to multimodal information retrieval and related tasks. However, CLIP models generally underperform in text-only tasks compared to specialized text models. This creates inefficiencies for information retrieval systems that keep separate embeddings and models for text-only and multimodal tasks. We propose a novel, multi-task contrastive training method to address this issue, which we use to train the jina-clip-v1 model to achieve state-of-the-art performance on both text-image and text-text retrieval tasks.
Submitted 26 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings
Authors:
Isabelle Mohr,
Markus Krimmel,
Saba Sturua,
Mohammad Kalim Akram,
Andreas Koukounas,
Michael Günther,
Georgios Mastrapas,
Vinit Ravishankar,
Joan Fontanals Martínez,
Feng Wang,
Qi Liu,
Ziniu Yu,
Jie Fu,
Saahil Ognawala,
Susana Guzman,
Bo Wang,
Maximilian Werk,
Nan Wang,
Han Xiao
Abstract:
We introduce a novel suite of state-of-the-art bilingual text embedding models that are designed to support English and another target language. These models are capable of processing lengthy text inputs with up to 8192 tokens, making them highly versatile for a range of natural language processing tasks such as text retrieval, clustering, and semantic textual similarity (STS) calculations.
By focusing on bilingual models and introducing a unique multi-task learning objective, we have significantly improved model performance on STS tasks, outperforming existing multilingual models in both target language understanding and cross-lingual evaluation tasks. Moreover, our bilingual models are more efficient, requiring fewer parameters and less memory due to their smaller vocabulary needs. Furthermore, we have expanded the Massive Text Embedding Benchmark (MTEB) to include benchmarks for German and Spanish embedding models. This integration aims to stimulate further research and advancement in text embedding technologies for these languages.
Submitted 26 February, 2024;
originally announced February 2024.
-
FODT: Fast, Online, Distributed and Temporary Failure Recovery Approach for MEC
Authors:
Xin Yuan,
Ning Li,
Jose Fernan Martinez
Abstract:
Mobile edge computing (MEC) can successfully reduce the latency of cloud computing. However, an edge server may fail due to hardware or software issues. When an edge server fails, the users who offload tasks to that server are affected. Recovering service for these affected users quickly and effectively is challenging. Moreover, because server failures are continuous and temporary, and a failed server can be repaired, previous works cannot handle this problem effectively. Therefore, in this paper, we propose the fast, online, distributed, and temporary failure recovery algorithm (FODT) for MEC. In FODT, when an edge server fails, only the affected access points (APs) recalculate their user-server allocation strategies; the other APs do not change their strategies. For the affected APs, the strategies before server failure are reused to reduce complexity and latency. When the failed server is repaired, the influenced APs reuse the strategies before server failure to offload tasks to this server. Based on this approach, FODT can achieve better performance than previous works. To the best of our knowledge, FODT is the first failure recovery algorithm, and when compared with previous research, it has higher failure recovery efficiency and lower complexity with an acceptable approximation ratio.
Submitted 16 April, 2025; v1 submitted 25 December, 2023;
originally announced December 2023.
-
CoNet: Borderless and decentralized server cooperation in edge computing
Authors:
Ning Li,
Xin Yuan,
Zhaoxin Zhang,
Jose Fernan Martinez
Abstract:
In edge computing (EC), offloading tasks to an edge server or the remote cloud can greatly improve system performance. However, since the traffic distribution in EC is heterogeneous and dynamic, it is difficult for an individual edge server to provide satisfactory computation service anytime and anywhere. This issue has motivated researchers to study cooperation between edge servers. Previous server cooperation algorithms have disadvantages because the cooperation region is limited to one hop. However, the performance of EC can be improved further by lifting this restriction on the cooperation region. Although some works have extended the cooperation region to multiple hops, they fail to support task offloading, which is one of the core issues of edge computing. Therefore, we propose a new decentralized and borderless server cooperation algorithm for edge computing that takes the task offloading strategy into account, named CoNet. In CoNet, the cooperation region is not limited. Each server forms its own basic cooperation unit (BCU) and calculates its announced capability based on the BCU. The server's capability, the processing delay, and the forwarding delay of tasks and calculation results are considered during the calculation. The task division strategy is based on the real capability of the host server and the announced capability of the cooperation servers. This cooperation process is recursive and terminates once the termination condition is satisfied. The simulation results demonstrate the advantages of CoNet over previous works.
Submitted 28 July, 2022;
originally announced July 2022.
-
Learning language variations in news corpora through differential embeddings
Authors:
Carlos Selmo,
Julian F. Martinez,
Mariano G. Beiró,
J. Ignacio Alvarez-Hamelin
Abstract:
There is an increasing interest in the NLP community in capturing variations in the usage of language, either through time (i.e., semantic drift), across regions (as dialects or variants) or in different social contexts (i.e., professional or media technolects). Several successful dynamical embeddings have been proposed that can track semantic change through time. Here we show that a model with a central word representation and a slice-dependent contribution can learn word embeddings from different corpora simultaneously. This model is based on a star-like representation of the slices. We apply it to The New York Times and The Guardian newspapers, and we show that it can capture both temporal dynamics in the yearly slices of each corpus, and language variations between US and UK English in a curated multi-source corpus. We provide an extensive evaluation of this methodology.
Submitted 13 November, 2020;
originally announced November 2020.
-
Game Theory based Joint Task Offloading and Resources Allocation Algorithm for Mobile Edge Computing
Authors:
Jianen Yan,
Ning Li,
Zhaoxin Zhang,
Alex X. Liu,
Jose Fernan Martinez,
Xin Yuan
Abstract:
Mobile edge computing (MEC) has emerged as a way to reduce energy consumption and latency by allowing mobile users to offload computationally intensive tasks to the MEC server. Due to spectrum reuse in small-cell networks, inter-cell interference has a great effect on MEC performance. In this paper, to reduce the energy consumption and latency of MEC, we propose a game-theory-based approach that jointly handles task offloading decisions and resource allocation in the MEC system. In this algorithm, the offloading decision, the CPU capacity adjustment, the transmission power control, and the network interference management of mobile users are regarded as a game. In this game, based on the best response strategy, each mobile user maximizes their own utility rather than the utility of the whole system. We prove that this game is an exact potential game and that a Nash equilibrium (NE) of this game exists. To reach the NE, the best response approach is applied. We calculate the best response of these three variables. Moreover, we investigate the properties of this algorithm, including its convergence, its computational complexity, and the price of anarchy (PoA). The theoretical analysis shows that inter-cell interference greatly affects the performance of MEC. The NE of this game is Pareto efficient. Finally, we evaluate the performance of this algorithm by simulation. The simulation results illustrate that this algorithm is effective in improving the performance of the multi-user MEC system.
Submitted 16 December, 2019;
originally announced December 2019.