Showing 1–50 of 68 results for author: Jacobsen, H

  1. arXiv:2507.01902  [pdf, ps, other]

    quant-ph cs.DC physics.chem-ph

    Analyzing Common Electronic Structure Theory Algorithms for Distributed Quantum Computing

    Authors: Grier M. Jones, Hans-Arno Jacobsen

    Abstract: To move towards the utility era of quantum computing, many corporations have posed distributed quantum computing (DQC) as a framework for scaling the current generation of devices for practical applications. One of these applications is quantum chemistry, also known as electronic structure theory, which has been poised as a "killer application" of quantum computing. To this end, we analyze five el…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2505.19947  [pdf, ps, other]

    cs.LG cs.AI eess.SY

    Dynamically Learned Test-Time Model Routing in Language Model Zoos with Service Level Guarantees

    Authors: Herbert Woisetschläger, Ryan Zhang, Shiqiang Wang, Hans-Arno Jacobsen

    Abstract: Open-weight LLM zoos provide access to numerous high-quality models, but selecting the appropriate model for specific tasks remains challenging and requires technical expertise. Most users simply want factually correct, safe, and satisfying responses without concerning themselves with model technicalities, while inference service providers prioritize minimizing operating costs. These competing int…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: Preprint. Under review

    ACM Class: I.2; I.2.7; I.2.8

  3. arXiv:2504.18980  [pdf, other]

    cs.DB

    Beyond Performance: Measuring the Environmental Impact of Analytical Databases

    Authors: Michail Bachras, Hans-Arno Jacobsen

    Abstract: The exponential growth of data is making query processing increasingly critical for modern computing infrastructure, yet the environmental impact of database operations remains poorly understood and largely overlooked. This paper presents ATLAS, a comprehensive methodology for measuring and quantifying the environmental footprint of analytical database systems, considering both operational impacts…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: 14 pages, 15 figures

  4. Decentralization in PoS Blockchain Consensus: Quantification and Advancement

    Authors: Shashank Motepalli, Hans-Arno Jacobsen

    Abstract: Decentralization is a foundational principle of permissionless blockchains, with consensus mechanisms serving a critical role in its realization. This study quantifies the decentralization of consensus mechanisms in proof-of-stake (PoS) blockchains using a comprehensive set of metrics, including Nakamoto coefficients, Gini, Herfindahl-Hirschman Index (HHI), Shapley values, and Zipf's coefficient. O…

    Submitted 19 April, 2025; originally announced April 2025.

    Journal ref: IEEE Transactions on Network and Service Management (2025)

  5. arXiv:2504.02191  [pdf, other]

    cs.CE cs.LG

    A User-Tunable Machine Learning Framework for Step-Wise Synthesis Planning

    Authors: Shivesh Prakash, Hans-Arno Jacobsen, Viki Kumar Prasad

    Abstract: We introduce MHNpath, a machine learning-driven retrosynthetic tool designed for computer-aided synthesis planning. Leveraging modern Hopfield networks and novel comparative metrics, MHNpath efficiently prioritizes reaction templates, improving the scalability and accuracy of retrosynthetic predictions. The tool incorporates a tunable scoring system that allows users to prioritize pathways based o…

    Submitted 3 April, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

  6. arXiv:2503.08914  [pdf, other]

    cs.DC

    Cabinet: Dynamically Weighted Consensus Made Fast

    Authors: Gengrui Zhang, Shiquan Zhang, Michail Bachras, Yuqiu Zhang, Hans-Arno Jacobsen

    Abstract: Conventional consensus algorithms, such as Paxos and Raft, encounter inefficiencies when applied to large-scale distributed systems due to the requirement of waiting for replies from a majority of nodes. To address these challenges, we propose Cabinet, a novel consensus algorithm that introduces dynamically weighted consensus, allocating distinct weights to nodes based on any given failure thresho…

    Submitted 11 March, 2025; originally announced March 2025.

  7. arXiv:2503.01854  [pdf, ps, other]

    cs.CL cs.AI

    A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models

    Authors: Jiahui Geng, Qing Li, Herbert Woisetschlaeger, Zongxiong Chen, Fengyu Cai, Yuxia Wang, Preslav Nakov, Hans-Arno Jacobsen, Fakhri Karray

    Abstract: This study investigates the machine unlearning techniques within the context of large language models (LLMs), referred to as LLM unlearning. LLM unlearning offers a principled approach to removing the influence of undesirable data (e.g., sensitive or illegal information) from LLMs, while preserving their overall utility without requiring full retraining. Despite growing research interest,…

    Submitted 31 May, 2025; v1 submitted 22 February, 2025; originally announced March 2025.

  8. arXiv:2502.20403  [pdf, other]

    cs.ET cs.AI cs.CR cs.LG quant-ph

    Adversarial Robustness of Partitioned Quantum Classifiers

    Authors: Pouya Kananian, Hans-Arno Jacobsen

    Abstract: Adversarial robustness in quantum classifiers is a critical area of study, providing insights into their performance compared to classical models and uncovering potential advantages inherent to quantum machine learning. In the NISQ era of quantum computing, circuit cutting is a notable technique for simulating circuits that exceed the qubit limitations of current devices, enabling the distribution…

    Submitted 28 January, 2025; originally announced February 2025.

  9. arXiv:2502.19986  [pdf, other]

    cs.LG

    WaveGAS: Waveform Relaxation for Scaling Graph Neural Networks

    Authors: Jana Vatter, Mykhaylo Zayats, Marcos Martínez Galindo, Vanessa López, Ruben Mayer, Hans-Arno Jacobsen, Hoang Thanh Lam

    Abstract: With the ever-growing size of real-world graphs, numerous techniques to overcome resource limitations when training Graph Neural Networks (GNNs) have been developed. One such approach, GNNAutoScale (GAS), uses graph partitioning to enable training under constrained GPU memory. GAS also stores historical embedding vectors, which are retrieved from one-hop neighbors in other partitions, ensuring cri…

    Submitted 27 February, 2025; originally announced February 2025.

  10. arXiv:2502.06733  [pdf, other]

    cs.LG cs.AI

    Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining

    Authors: Daouda Sow, Herbert Woisetschläger, Saikiran Bulusu, Shiqiang Wang, Hans-Arno Jacobsen, Yingbin Liang

    Abstract: Pretraining large language models (LLMs) on vast and heterogeneous datasets is crucial for achieving state-of-the-art performance across diverse downstream tasks. However, current training paradigms treat all samples equally, overlooking the importance or relevance of individual samples throughout the training process. Existing reweighting strategies, which primarily focus on group-level data impo…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: Accepted for publication at ICLR 2025. Code base available: https://github.com/sowmaster/Sample-Level-Loss-Reweighting-ICLR-2025

  11. arXiv:2411.00889  [pdf, other]

    cs.LG eess.SY

    MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees

    Authors: Ryan Zhang, Herbert Woisetschläger, Shiqiang Wang, Hans Arno Jacobsen

    Abstract: Open-weight large language model (LLM) zoos allow users to quickly integrate state-of-the-art models into systems. Despite increasing availability, selecting the most appropriate model for a given task still largely relies on public benchmark leaderboards and educated guesses. This can be unsatisfactory for both inference service providers and end users, where the providers usually prioritize cost…

    Submitted 31 October, 2024; originally announced November 2024.

    Comments: Accepted at the 2024 Workshop on Adaptive Foundation Models in conjunction with NeurIPS 2024

  12. arXiv:2409.11129  [pdf, other]

    cs.LG cs.DB cs.PF

    Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study

    Authors: Nikolai Merkel, Pierre Toussing, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Graph neural networks (GNNs) are a type of neural network capable of learning on graph-structured data. However, training GNNs on large-scale graphs is challenging due to iterative aggregations of high-dimensional features from neighboring vertices within sparse graph structures combined with neural network operations. The sparsity of graphs frequently results in suboptimal memory access patterns…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: To be published in proceedings of the 51st International Conference on Very Large Data Bases (VLDB), September 1-5, 2025

  13. arXiv:2407.08105  [pdf, ps, other]

    cs.AI

    Federated Learning and AI Regulation in the European Union: Who is Responsible? -- An Interdisciplinary Analysis

    Authors: Herbert Woisetschläger, Simon Mertel, Christoph Krönke, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: The European Union Artificial Intelligence Act mandates clear stakeholder responsibilities in developing and deploying machine learning applications to avoid substantial fines, prioritizing private and secure data processing with data remaining at its origin. Federated Learning (FL) enables the training of generative AI Models across data siloes, sharing only model parameters while improving data…

    Submitted 12 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted at the GenLaw'24 workshop in conjunction with ICML'24

    ACM Class: K.5; I.2.11; C.2.4; D.2.1

  14. arXiv:2406.16968  [pdf, other]

    cs.LG cs.AI

    Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition

    Authors: Kai Shao, Rui Wang, Yixue Hao, Long Hu, Min Chen, Hans Arno Jacobsen

    Abstract: Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress. However, most existing studies ignore the complementarity and semantic consistency of multimodal physiological signals under the same stimulation task in complex spatio-temporal patterns. In this paper, we introduce a multimodal…

    Submitted 25 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

  15. arXiv:2406.06318  [pdf]

    cs.DC cs.CR

    Should my Blockchain Learn to Drive? A Study of Hyperledger Fabric

    Authors: Jeeta Ann Chacko, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Similar to other transaction processing frameworks, blockchain systems need to be dynamically reconfigured to adapt to varying workloads and changes in network conditions. However, achieving optimal reconfiguration is particularly challenging due to the complexity of the blockchain stack, which has diverse configurable parameters. This paper explores the concept of self-driving blockchains, which…

    Submitted 10 June, 2024; originally announced June 2024.

  16. arXiv:2404.02779  [pdf, other]

    cs.LG

    Federated Computing -- Survey on Building Blocks, Extensions and Systems

    Authors: René Schwermer, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: In response to the increasing volume and sensitivity of data, traditional centralized computing models face challenges, such as data security breaches and regulatory hurdles. Federated Computing (FC) addresses these concerns by enabling collaborative processing without compromising individual data privacy. This is achieved through a decentralized network of devices, each retaining control over its…

    Submitted 3 April, 2024; originally announced April 2024.

  17. arXiv:2402.05968  [pdf, other]

    cs.LG cs.AI cs.CY cs.DC

    Federated Learning Priorities Under the European Union Artificial Intelligence Act

    Authors: Herbert Woisetschläger, Alexander Erben, Bill Marino, Shiqiang Wang, Nicholas D. Lane, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: The age of AI regulation is upon us, with the European Union Artificial Intelligence Act (AI Act) leading the way. Our key inquiry is how this will affect Federated Learning (FL), whose starting point of prioritizing data privacy while performing ML fundamentally differs from that of centralized learning. We believe the AI Act and future regulations could be the missing catalyst that pushes FL tow…

    Submitted 5 February, 2024; originally announced February 2024.

    ACM Class: I.2; I.2.11; K.5

  18. arXiv:2402.04874  [pdf, other]

    cs.AI cs.LG

    Choosing a Classical Planner with Graph Neural Networks

    Authors: Jana Vatter, Ruben Mayer, Hans-Arno Jacobsen, Horst Samulowitz, Michael Katz

    Abstract: Online planner selection is the task of choosing a solver out of a predefined set for a given planning problem. As planning is computationally hard, the performance of solvers varies greatly on planning problems. Thus, the ability to predict their performance on a given problem is of great importance. While a variety of learning methods have been employed, for classical cost-optimal planning the p…

    Submitted 25 January, 2024; originally announced February 2024.

  19. arXiv:2401.04472  [pdf, other]

    cs.LG cs.AI cs.DC

    A Survey on Efficient Federated Learning Methods for Foundation Model Training

    Authors: Herbert Woisetschläger, Alexander Isenko, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients. However, new approaches to FL often discuss their contributions involving small deep-learning models only and focus on training full models on clients. In the wake of Foundation Models (FM), the reality is different for many deep learning applications.…

    Submitted 5 September, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted for publication at IJCAI 2024. Please cite the published paper via https://doi.org/10.24963/ijcai.2024/919

    ACM Class: I.2.11; C.2

  20. arXiv:2312.13938  [pdf, other]

    cs.DC cs.CY

    How Does Stake Distribution Influence Consensus? Analyzing Blockchain Decentralization

    Authors: Shashank Motepalli, Hans-Arno Jacobsen

    Abstract: In the PoS blockchain landscape, the challenge of achieving full decentralization is often hindered by a disproportionate concentration of staked tokens among a few validators. This study analyses this challenge by first formalizing decentralization metrics for weighted consensus mechanisms. An empirical analysis across ten permissionless blockchains uncovers significant weight concentration among…

    Submitted 20 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: To appear in ICBC 2024

  21. An End-to-End Performance Comparison of Seven Permissioned Blockchain Systems

    Authors: Frank Christian Geyer, Hans-Arno Jacobsen, Ruben Mayer, Peter Mandl

    Abstract: The emergence of more and more blockchain solutions with innovative approaches to optimising performance, scalability, privacy and governance complicates performance analysis. Reasons for the difficulty of benchmarking blockchains include, for example, the high number of system parameters to configure and the effort to deploy a blockchain network. In addition, performance data, which mostly comes…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 14 pages, 5 figures, 20 tables, Middleware Conference

  22. FabricCRDT: A Conflict-Free Replicated Datatypes Approach to Permissioned Blockchains

    Authors: Pezhman Nasirifard, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: With the increased adoption of blockchain technologies, permissioned blockchains such as Hyperledger Fabric provide a robust ecosystem for developing production-grade decentralized applications. However, the additional latency between executing and committing transactions, due to Fabric's three-phase transaction lifecycle of Execute-Order-Validate (EOV), is a potential scalability bottleneck. The…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: In Proceedings of the 20th International Middleware Conference (Middleware '19). ACM 2019

  23. arXiv:2310.03150  [pdf, other]

    cs.LG cs.DC cs.PF

    Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly

    Authors: Herbert Woisetschläger, Alexander Isenko, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Large Language Models (LLM) and foundation models are popular as they offer new opportunities for individuals and businesses to improve natural language processing, interact with data, and retrieve information faster. However, training or fine-tuning LLMs requires a vast amount of data, which can be challenging to access due to legal or technical restrictions and may require private computing reso…

    Submitted 2 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: Camera-ready version for DEEM'24. Please cite the official ACM paper via https://doi.org/10.1145/3650203.3663331

    ACM Class: I.2.11; C.2.4; C.4; D.2.8

  24. arXiv:2309.16962  [pdf, other]

    cs.DC

    Lifting the Fog of Uncertainties: Dynamic Resource Orchestration for the Containerized Cloud

    Authors: Yuqiu Zhang, Tongkun Zhang, Gengrui Zhang, Hans-Arno Jacobsen

    Abstract: The advances in virtualization technologies have sparked a growing transition from virtual machine (VM)-based to container-based infrastructure for cloud computing. From the resource orchestration perspective, containers' lightweight and highly configurable nature not only enables opportunities for more optimized strategies, but also poses greater challenges due to additional uncertainties and a l…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: To appear at ACM SoCC '23

  25. An Experimental Comparison of Partitioning Strategies for Distributed Graph Neural Network Training

    Authors: Nikolai Merkel, Daniel Stoll, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Recently, graph neural networks (GNNs) have gained much attention as a growing area of deep learning capable of learning on graph-structured data. However, the computational and memory requirements for training GNNs on large-scale graphs make it necessary to distribute the training. A prerequisite for distributed GNN training is to partition the input graph into smaller parts that are distributed…

    Submitted 12 August, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: To be published in Proceedings of the 28th International Conference on Extending Database Technology (EDBT), 25th March-28th March, 2025

  26. arXiv:2307.08154  [pdf, other]

    cs.DC

    PrestigeBFT: Revolutionizing View Changes in BFT Consensus Algorithms with Reputation Mechanisms

    Authors: Gengrui Zhang, Fei Pan, Sofia Tijanic, Hans-Arno Jacobsen

    Abstract: This paper proposes PrestigeBFT, a novel leader-based BFT consensus algorithm that addresses the weaknesses of passive view-change protocols. Passive protocols blindly rotate leadership among servers on a predefined schedule, potentially selecting unavailable or slow servers as leaders. PrestigeBFT proposes an active view-change protocol using reputation mechanisms that calculate a server's potent…

    Submitted 16 July, 2023; originally announced July 2023.

  27. FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems

    Authors: Herbert Woisetschläger, Alexander Erben, Ruben Mayer, Shiqiang Wang, Hans-Arno Jacobsen

    Abstract: Federated Learning (FL) has become a viable technique for realizing privacy-enhancing distributed deep learning on the network edge. Heterogeneous hardware, unreliable client devices, and energy constraints often characterize edge computing systems. In this paper, we propose FLEdge, which complements existing FL benchmarks by enabling a systematic evaluation of client capabilities. We focus on com…

    Submitted 4 November, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Paper accepted for publication at the ACM/IFIP Middleware Conference 2024. Please cite the published version via https://doi.org/10.1145/3652892.3700751 (will be available after the conference in December 2024)

    ACM Class: I.2.11; C.2.4; C.4; D.2.8

  28. arXiv:2306.03163  [pdf, other]

    cs.LG cs.DC cs.NI cs.PF

    How Can We Train Deep Learning Models Across Clouds and Continents? An Experimental Study

    Authors: Alexander Erben, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: This paper aims to answer the question: Can deep learning models be cost-efficiently trained on a global market of spot VMs spanning different data centers and cloud providers? To provide guidance, we extensively evaluate the cost and throughput implications of training in different zones, continents, and clouds for representative CV, NLP, and ASR models. To expand the current training options fur…

    Submitted 2 June, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Published at VLDB 2024. Artifacts and Code: https://github.com/cirquit/hivemind-multi-cloud

    ACM Class: I.2.11; C.2.4; C.4; D.2.8

  29. arXiv:2305.17771  [pdf, other]

    cs.DC cs.CY

    Analyzing Geospatial Distribution in Blockchains

    Authors: Shashank Motepalli, Hans-Arno Jacobsen

    Abstract: Blockchains are decentralized; are they genuinely? We analyze blockchain decentralization's often-overlooked but quantifiable dimension: geospatial distribution of transaction processing. Blockchains bring with them the potential for geospatially distributed transaction processing. They enable validators from geospatially distant locations to partake in consensus protocols; we refer to them as min…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: To appear in IEEE DAPPS 2023

  30. arXiv:2305.13854  [pdf, other]

    cs.DC cs.LG

    The Evolution of Distributed Systems for Graph Neural Networks and their Origin in Graph Processing and Deep Learning: A Survey

    Authors: Jana Vatter, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Graph Neural Networks (GNNs) are an emerging research field. This specialized Deep Neural Network (DNN) architecture is capable of processing graph structured data and bridges the gap between graph processing and Deep Learning (DL). As graphs are everywhere, GNNs can be applied to various domains including recommendation systems, computer vision, natural language processing, biology and chemistry.…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at ACM Computing Surveys

  31. arXiv:2304.04976  [pdf, other]

    cs.DC

    Partitioner Selection with EASE to Optimize Distributed Graph Processing

    Authors: Nikolai Merkel, Ruben Mayer, Tawkir Ahmed Fakir, Hans-Arno Jacobsen

    Abstract: For distributed graph processing on massive graphs, a graph is partitioned into multiple equally-sized parts which are distributed among machines in a compute cluster. In the last decade, many partitioning algorithms have been developed which differ from each other with respect to the partitioning quality, the run-time of the partitioning and the type of graph for which they work best. The plethor…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: To appear at IEEE International Conference on Data Engineering (ICDE 2023)

  32. Diba: A Re-configurable Stream Processor

    Authors: Mohammadreza Najafi, Thamir M. Qadah, Mohammad Sadoghi, Hans-Arno Jacobsen

    Abstract: Stream processing acceleration is driven by the continuously increasing volume and velocity of data generated on the Web and the limitations of storage, computation, and power consumption. Hardware solutions provide better performance and power consumption, but they are hindered by the high research and development costs and the long time to market. In this work, we propose our re-configurable str…

    Submitted 27 August, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Journal ref: in IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 9, pp. 4550-4566, Sept. 2024

  33. arXiv:2301.06210  [pdf, other]

    cs.DC

    V-Guard: An Efficient Permissioned Blockchain for Achieving Consensus under Dynamic Memberships in V2X

    Authors: Gengrui Zhang, Yunhao Mao, Shiquan Zhang, Shashank Motepalli, Fei Pan, Hans-Arno Jacobsen

    Abstract: This paper presents V-Guard, a new permissioned blockchain that achieves consensus for vehicular data under changing memberships, targeting the problem in V2X networks where vehicles are often intermittently connected on the roads. To achieve this goal, V-Guard integrates membership management into the consensus process for agreeing on data entries. It binds a data entry with a membership configur…

    Submitted 3 April, 2023; v1 submitted 15 January, 2023; originally announced January 2023.

  34. arXiv:2301.04719  [pdf, other]

    cs.DC

    How To Optimize My Blockchain? A Multi-Level Recommendation Approach

    Authors: Jeeta Ann Chacko, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Aside from the conception of new blockchain architectures, existing blockchain optimizations in the literature primarily focus on system or data-oriented optimizations within prevailing blockchains. However, since blockchains handle multiple aspects ranging from organizational governance to smart contract design, a holistic approach that encompasses all the different layers of a given blockchain s…

    Submitted 11 January, 2023; originally announced January 2023.

    Comments: This is a preprint of an upcoming publication at ACM SIGMOD 2023. Please cite the original SIGMOD version

  35. arXiv:2210.07897  [pdf, other]

    cs.DC

    A Serverless Publish/Subscribe System

    Authors: Pezhman Nasirifard, Hans-Arno Jacobsen

    Abstract: Operating a scalable and reliable server application, such as publish/subscribe (pub/sub) systems, requires tremendous development efforts and resources. The emerging serverless paradigm simplifies the development and deployment of highly available applications by delegating most operational concerns to the cloud providers. The serverless paradigm describes a programming model where the developers…

    Submitted 14 October, 2022; originally announced October 2022.

  36. arXiv:2210.07789  [pdf, other]

    cs.DC eess.SP eess.SY

    i13DR: A Real-Time Demand Response Infrastructure for Integrating Renewable Energy Resources

    Authors: Pezhman Nasirifard, Hans-Arno Jacobsen

    Abstract: With the ongoing integration of Renewable Energy Sources (RES), the complexity of power grids is increasing. Due to the fluctuating nature of RES, ensuring the reliability of power grids can be challenging. One possible approach for addressing these challenges is Demand Response (DR) which is described as matching the demand for electrical energy according to the changes and the availability of su…

    Submitted 31 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

  37. OrderlessChain: Do Permissioned Blockchains Need Total Global Order of Transactions?

    Authors: Pezhman Nasirifard, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Existing permissioned blockchains often rely on coordination-based consensus protocols to ensure the safe execution of applications in a Byzantine environment. Furthermore, these protocols serialize the transactions by ordering them into a total global order. The serializability preserves the correctness of the application's state stored on the blockchain. However, using coordination-based protoco…

    Submitted 24 October, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

  38. arXiv:2209.03759  [pdf, other]

    eess.SP cs.AI cs.LG

    Representation Learning for Appliance Recognition: A Comparison to Classical Machine Learning

    Authors: Matthias Kahl, Daniel Jorde, Hans-Arno Jacobsen

    Abstract: Non-intrusive load monitoring (NILM) aims at energy consumption and appliance state information retrieval from aggregated consumption measurements, with the help of signal processing and machine learning algorithms. Representation learning with deep neural networks is successfully applied to several related disciplines. The main advantage of representation learning lies in replacing an expert-driv…

    Submitted 26 August, 2022; originally announced September 2022.

  39. arXiv:2206.13237  [pdf, other]

    q-fin.ST cs.DC cs.LG

    The DEBS 2022 Grand Challenge: Detecting Trading Trends in Financial Tick Data

    Authors: Sebastian Frischbier, Jawad Tahir, Christoph Doblander, Arne Hormann, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: The DEBS Grand Challenge (GC) is an annual programming competition open to practitioners from both academia and industry. The GC 2022 edition focuses on real-time complex event processing of high-volume tick data provided by Infront Financial Technology GmbH. The goal of the challenge is to efficiently compute specific trend indicators and detect patterns in these indicators like those used by rea…

    Submitted 23 June, 2022; originally announced June 2022.

    Comments: Author's version of the work, definitive Version of Record published in the proceedings of The 16th ACM International Conference on Distributed and Event-based Systems (DEBS '22); 7 pages, 7 figures

  40. arXiv:2204.05582  [pdf]

    cs.NI

    Towards Data-Driven Precision Agriculture using Open Data and Open Source Software

    Authors: Jacob Høxbroe Jeppesen, Rune Hylsberg Jacobsen, Rasmus Nyholm Jørgensen, Thomas Skjødeberg Toftegaard

    Abstract: Information and communications technology (ICT) within the agricultural sector is characterized by a widespread use of proprietary data formats, a strong lack of interoperability standards, and a tight connection to specific hardware implementations resulting from vendor lock-in. This partly explains why ICT has not yet had its full impact within the domain. By utilizing the vast amount of publicl…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: 6 pages, 6 figures

  41. arXiv:2204.03181  [pdf, other]

    cs.DC

    Reaching Consensus in the Byzantine Empire: A Comprehensive Review of BFT Consensus Algorithms

    Authors: Gengrui Zhang, Fei Pan, Yunhao Mao, Sofia Tijanic, Michael Dang'ana, Shashank Motepalli, Shiquan Zhang, Hans-Arno Jacobsen

    Abstract: Byzantine fault-tolerant (BFT) consensus algorithms are at the core of providing safety and liveness guarantees for distributed systems that must operate in the presence of arbitrary failures. Recently, numerous new BFT algorithms have been proposed, not least due to the traction blockchain technologies have garnered in the search for consensus solutions that offer high throughput, low latency, an…

    Submitted 5 December, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

  42. arXiv:2203.12721  [pdf, other]

    cs.DC

    Out-of-Core Edge Partitioning at Linear Run-Time

    Authors: Ruben Mayer, Kamil Orujzade, Hans-Arno Jacobsen

    Abstract: Graph edge partitioning is an important preprocessing step to optimize distributed computing jobs on graph-structured data. The edge set of a given graph is split into $k$ equally-sized partitions, such that the replication of vertices across partitions is minimized. Out-of-core edge partitioning algorithms are able to tackle the problem with low memory overhead. Existing out-of-core algorithms m…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: To appear at IEEE International Conference on Data Engineering (ICDE 2022)

  43. arXiv:2203.09714  [pdf, other]

    cs.MA cs.DC

    Decentralizing Permissioned Blockchain with Delay Towers

    Authors: Shashank Motepalli, Hans-Arno Jacobsen

    Abstract: Growing excitement around permissionless blockchains is uncovering its latent scalability concerns. Permissioned blockchains offer high transactional throughput and low latencies while compromising decentralization. In the quest for a decentralized, scalable blockchain fabric, i.e., to offer the scalability of permissioned blockchain in a permissionless setting, we present L4L to encourage decentr…

    Submitted 17 March, 2022; originally announced March 2022.

  44. arXiv:2202.09434  [pdf, other]

    cs.DC

    ESCAPE to Precaution against Leader Failures

    Authors: Gengrui Zhang, Hans-Arno Jacobsen

    Abstract: Leader-based consensus protocols must undergo a view-change phase to elect a new leader when the current leader fails. The new leader is often decided upon a candidate server that collects votes from a quorum of servers. However, voting-based election mechanisms intrinsically cause competition in leadership candidacy when each candidate collects only partial votes. This split-vote scenario can res…

    Submitted 18 February, 2022; originally announced February 2022.

  45. arXiv:2202.08679  [pdf, other]

    cs.LG cs.DC cs.PF

    Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines

    Authors: Alexander Isenko, Ruben Mayer, Jeffrey Jedele, Hans-Arno Jacobsen

    Abstract: Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy. Maximizing resource utilization is becoming more challenging as the throughput of training processes increases with hardware innovations (e.g., faster GPUs, TPUs, and inter-connects) and advanced parallelization techniques that yield better scalability. At the same time, the amou…

    Submitted 25 March, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: To be published in SIGMOD, June 12-17, 2022, Philadelphia, PA, USA. Repository: https://github.com/cirquit/presto

    ACM Class: I.4.0; I.4.2; I.2.0; B.4.4; C.4; D.2.8

  46. A Review on Communication Protocols for Autonomous Unmanned Aerial Vehicles for Inspection Application

    Authors: Liping Shi, Néstor J. Hernández Marcano, Rune Hylsberg Jacobsen

    Abstract: The communication system is a critical part of the system design for the autonomous UAV. It has to address different considerations, including efficiency, reliability and mobility of the UAV. In addition, a multi-UAV system requires a communication system to assist information sharing, task allocation and collaboration in a team of UAVs. In this paper, we review communication solutions for support…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: 28 pages

  47. arXiv:2104.05849  [pdf, other]

    cs.GT cs.CE cs.MA

    Reward Mechanism for Blockchains Using Evolutionary Game Theory

    Authors: Shashank Motepalli, Hans-Arno Jacobsen

    Abstract: Blockchains have witnessed widespread adoption in the past decade in various fields. The growing demand makes their scalability and sustainability challenges more evident than ever. As a result, more and more blockchains have begun to adopt proof-of-stake (PoS) consensus protocols to address those challenges. One of the fundamental characteristics of any blockchain technology is its crypto-economi…

    Submitted 8 July, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Cite: @inproceedings{motepalli2021reward, title={Reward Mechanism for Blockchains Using Evolutionary Game Theory}, author={Motepalli, Shashank and Jacobsen, Hans-Arno}, booktitle={2021 3rd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS)}, year={2021} }

  48. Hybrid Edge Partitioner: Partitioning Large Power-Law Graphs under Memory Constraints

    Authors: Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Distributed systems that manage and process graph-structured data internally solve a graph partitioning problem to minimize their communication overhead and query run-time. Besides computational complexity -- optimal graph partitioning is NP-hard -- another important consideration is the memory overhead. Real-world graphs often have an immense size, such that loading the complete graph into memory…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: SIGMOD 2021, 14 pages

  49. arXiv:2103.04681  [pdf, other]

    cs.DC cs.DB

    Why Do My Blockchain Transactions Fail? A Study of Hyperledger Fabric (Extended version)*

    Authors: Jeeta Ann Chacko, Ruben Mayer, Hans-Arno Jacobsen

    Abstract: Permissioned blockchain systems promise to provide both decentralized trust and privacy. Hyperledger Fabric is currently one of the most wide-spread permissioned blockchain systems and is heavily promoted both in industry and academia. Due to its optimistic concurrency model, the transaction failure rates in Fabric can become a bottleneck. While there is active research to reduce failures, there i…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: This is an extended version of an upcoming publication at ACM SIGMOD 2021. Please cite the original SIGMOD version

  50. arXiv:2001.07086  [pdf, other]

    cs.DC

    2PS: High-Quality Edge Partitioning with Two-Phase Streaming

    Authors: Ruben Mayer, Kamil Orujzade, Hans-Arno Jacobsen

    Abstract: Graph partitioning is an important preprocessing step to distributed graph processing. In edge partitioning, the edge set of a given graph is split into $k$ equally-sized partitions, such that the replication of vertices across partitions is minimized. Streaming is a viable approach to partition graphs that exceed the memory capacities of a single server. The graph is ingested as a stream of edges…

    Submitted 20 January, 2020; originally announced January 2020.

    Comments: In submission