-
PDSP-Bench: A Benchmarking System for Parallel and Distributed Stream Processing
Authors:
Pratyush Agnihotri,
Boris Koldehofe,
Roman Heinrich,
Carsten Binnig,
Manisha Luthra
Abstract:
The paper introduces PDSP-Bench, a novel benchmarking system designed for a systematic understanding of performance of parallel stream processing in a distributed environment. Such an understanding is essential for determining how Stream Processing Systems (SPS) use operator parallelism and the available resources to process massive workloads of modern applications. Existing benchmarking systems focus on analyzing SPS using queries with sequential operator pipelines within a homogeneous centralized environment. Quite differently, PDSP-Bench emphasizes the aspects of parallel stream processing in a distributed heterogeneous environment and simultaneously allows the integration of machine learning models for SPS workloads. In our results, we benchmark a well-known SPS, Apache Flink, using parallel query structures derived from real-world applications and synthetic queries to show the capabilities of PDSP-Bench towards parallel stream processing. Moreover, we compare different learned cost models using generated SPS workloads on PDSP-Bench by showcasing their evaluations on model and training efficiency. We present key observations from our experiments using PDSP-Bench that highlight interesting trends given different query workloads, such as non-linearity and paradoxical effects of parallelism on the performance.
Submitted 14 April, 2025;
originally announced April 2025.
-
A Tale of Two Scales: Reconciling Horizontal and Vertical Scaling for Inference Serving Systems
Authors:
Kamran Razavi,
Mehran Salmani,
Max Mühlhäuser,
Boris Koldehofe,
Lin Wang
Abstract:
Inference serving is of great importance in deploying machine learning models in real-world applications, ensuring efficient processing and quick responses to inference requests. However, managing resources in these systems poses significant challenges, particularly in maintaining performance under varying and unpredictable workloads. Two primary scaling strategies, horizontal and vertical scaling, offer different advantages and limitations. Horizontal scaling adds more instances to handle increased loads but can suffer from cold start issues and increased management complexity. Vertical scaling boosts the capacity of existing instances, allowing for quicker responses but is limited by hardware and model parallelization capabilities.
This paper introduces Themis, a system designed to leverage the benefits of both horizontal and vertical scaling in inference serving systems. Themis employs a two-stage autoscaling strategy: initially using in-place vertical scaling to handle workload surges and then switching to horizontal scaling to optimize resource efficiency once the workload stabilizes. The system profiles the processing latency of deep learning models, calculates queuing delays, and employs different dynamic programming algorithms to solve the joint horizontal and vertical scaling problem optimally based on the workload situation. Extensive evaluations with real-world workload traces demonstrate over $10\times$ SLO violation reduction compared to the state-of-the-art horizontal or vertical autoscaling approaches while maintaining resource efficiency when the workload is stable.
Submitted 20 July, 2024;
originally announced July 2024.
-
Differential Privacy for Protecting Private Patterns in Data Streams
Authors:
He Gu,
Thomas Plagemann,
Maik Benndorf,
Vera Goebel,
Boris Koldehofe
Abstract:
Complex event processing (CEP) is a powerful and increasingly important tool to analyse data streams for Internet of Things (IoT) applications. These data streams often contain private information that requires proper protection. However, privacy protection in CEP systems is still in its infancy, and most existing privacy-preserving mechanisms (PPMs) are adopted from those designed for data streams. Such approaches undermine the quality of the entire data stream and limit the performance of IoT applications. In this paper, we attempt to break the limitation and establish a new foundation for PPMs of CEP by proposing a novel pattern-level differential privacy (DP) guarantee. We introduce two PPMs that guarantee pattern-level DP. They operate only on data that correlate with private patterns rather than on the entire data stream, leading to higher data quality. One of the PPMs provides adaptive privacy protection and brings more granularity and generalization. We evaluate the performance of the proposed PPMs with two experiments on a real-world dataset and on a synthetic dataset. The results of the experiments indicate that our proposed privacy guarantee and its PPMs can deliver better data quality under equally strong privacy guarantees, compared to multiple well-known PPMs designed for data streams.
Submitted 11 May, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Towards Privacy Engineering for Real-Time Analytics in the Human-Centered Internet of Things
Authors:
Thomas Plagemann,
Vera Goebel,
Matthias Hollick,
Boris Koldehofe
Abstract:
Big data applications offer smart solutions to many urgent societal challenges, such as health care, traffic coordination, and energy management. The basic premise for these applications is "the more data the better". The focus often lies on sensing infrastructures in the public realm that produce an ever-increasing amount of data. Yet, any smartphone and smartwatch owner could be a continuous source of valuable data and contribute to many useful big data applications. However, such data can reveal a lot of sensitive information, like the current location or the heart rate of the owner of such devices. Protection of personal data is important in our society and is, for example, manifested in the EU General Data Protection Regulation (GDPR). However, privacy protection and useful big data applications are hard to bring together, particularly in the human-centered IoT. Implementing proper privacy protection requires skills that are typically not in the focus of data analysts and big data developers. Thus, many individuals tend to share none of their data if in doubt whether it will be properly protected. There exist excellent privacy solutions between the two extremes of the "all or nothing" approach. For example, instead of continuously publishing the current location of individuals, one might aggregate this data and only publish information on how many individuals are in a certain area of the city. Thus, personal data is not revealed, while useful information for certain applications like traffic coordination is retained. The goal of the Parrot project is to provide tools for real-time data analysis applications that leverage this "middle ground". Data analysts should only be required to specify their data needs, and end-users can select the privacy requirements for their data as well as the applications and end-users they want to share their data with.
Submitted 28 October, 2022;
originally announced October 2022.
-
TCEP: Transitions in Operator Placement to Adapt to Dynamic Network Environments
Authors:
Manisha Luthra,
Boris Koldehofe,
Niels Danger,
Pascal Weisenburger,
Guido Salvaneschi,
Ioannis Stavrakakis
Abstract:
Distributed Complex Event Processing (DCEP) is a commonly used paradigm to detect and act on situational changes of many applications, including the Internet of Things (IoT). DCEP achieves this using a simple specification of analytical tasks on data streams called operators and their distributed execution on a set of infrastructure. The adaptivity of DCEP to the dynamics of IoT applications is essential and very challenging in the face of changing demands concerning Quality of Service. In our previous work, we addressed this issue by enabling transitions, which allow for the adaptive use of multiple operator placement mechanisms. In this article, we extend the transition methodology by optimizing the costs of transition and analyzing the behaviour using multiple operator placement mechanisms. Furthermore, we provide an extensive evaluation on the costs of transition imposed by operator migrations and learning, as it can inflict overhead on the performance if operated uncoordinatedly.
Submitted 23 June, 2021;
originally announced June 2021.
-
INetCEP: In-Network Complex Event Processing for Information-Centric Networking
Authors:
Manisha Luthra,
Boris Koldehofe,
Jonas Höchst,
Patrick Lampe,
Ali Haider Rizvi,
Ralf Kundel,
Bernd Freisleben
Abstract:
Emerging network architectures like Information-centric Networking (ICN) offer simplicity in the data plane by addressing named data. Such flexibility opens up the possibility to move data processing inside network elements for high-performance computation, known as in-network processing. However, existing ICN architectures are limited in terms of data plane programmability due to the lack of (i) in-network processing and (ii) data plane programming abstractions. Such architectures can benefit from Complex Event Processing (CEP), an in-network processing paradigm to efficiently process data inside the data plane. Yet, it is extremely challenging to integrate CEP because the current communication model of ICN is limited to consumer-initiated interaction that comes with significant overhead in a number of requests to process continuous data streams. In contrast, a change to producer-initiated interaction, as favored by CEP, imposes severe limitations for request-reply interactions. In this paper, we propose an in-network CEP architecture, INetCEP that supports unified interaction patterns (consumer- and producer-initiated). In addition, we provide a CEP query language and facilitate CEP operations while increasing the range of applications that can be supported by ICN. We provide an open-source implementation and evaluation of INetCEP over an ICN architecture, Named Function Networking, and two applications: energy forecasting in smart homes and a disaster scenario.
Submitted 14 December, 2020; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Efficient Complex Event Processing in Information-centric Networking at the Edge
Authors:
Manisha Luthra,
Johannes Pfannmüller,
Boris Koldehofe,
Jonas Höchst,
Artur Sterz,
Rhaban Hark,
Bernd Freisleben
Abstract:
Information-centric Networking (ICN) is an emerging Internet architecture that offers promising features, such as in-network caching and named data addressing, to support the edge computing paradigm, in particular Internet-of-Things (IoT) applications. ICN can benefit from Complex Event Processing (CEP), which is an in-network processing paradigm to specify and perform efficient query operations on data streams. However, integrating CEP into ICN is a challenging task for the following reasons: (1) typical ICN architectures do not provide support for forwarding and processing continuous data streams; (2) IoT applications often need short response times and require robust event detection, both of which are hard to accomplish using existing CEP systems.
In this article, we present a novel network architecture, called INetCEP, for efficient CEP-based in-network processing as part of ICN. INetCEP enables efficient data processing in ICN by means of (1) a unified communication model that supports continuous data streams, (2) a meta query language for CEP to specify data processing operations in the data plane, and (3) query processing algorithms to resolve the specified operations. Our experimental results for two IoT use cases and datasets show that INetCEP offers very short response times of up to 73 μs under high workload and is more than 15X faster in terms of forwarding events than the state-of-the-art CEP system Flink. Furthermore, the delivery and processing of complex queries is around 32X faster than Flink and more than 100X faster than a naive pull-based reference approach, while maintaining 100% accuracy.
Submitted 13 December, 2020; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Operator as a Service: Stateful Serverless Complex Event Processing
Authors:
Manisha Luthra,
Sebastian Hennig,
Kamran Razavi,
Lin Wang,
Boris Koldehofe
Abstract:
Complex Event Processing (CEP) is a powerful paradigm for scalable data management that is employed in many real-world scenarios such as detecting credit card fraud in banks. The so-called complex events are expressed using a specification language that is typically implemented and executed on a specific runtime system. While the tight coupling of these two components has been regarded as the key for supporting CEP at high performance, such dependencies pose several inherent challenges as follows. (1) Application development atop a CEP system requires extensive knowledge of how the runtime system operates, which is typically highly complex in nature. (2) The dependence on a specification language requires domain experts and steepens the learning curve for application developers. In this paper, we propose CEPLESS, a scalable data management system that decouples the specification from the runtime system by building on the principles of serverless computing. CEPLESS provides operator as a service and offers flexibility by enabling the development of CEP applications in any specification language while abstracting away the complexity of the CEP runtime system. As part of CEPLESS, we designed and evaluated novel mechanisms for in-memory processing and batching that enable the stateful processing of CEP operators even under high rates of ingested events. Our evaluation demonstrates that CEPLESS can be easily integrated into existing CEP systems like Apache Flink while attaining similar throughput under a high scale of events (up to 100K events per second) and dynamic operator update in up to 238 ms.
Submitted 28 June, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
P4-CoDel: Experiences on Programmable Data Plane Hardware
Authors:
Ralf Kundel,
Amr Rizk,
Jeremias Blendin,
Boris Koldehofe,
Rhaban Hark,
Ralf Steinmetz
Abstract:
Fixed buffer sizing in computer networks, especially the Internet, is a compromise between latency and bandwidth. A decision in favor of high bandwidth, implying larger buffers, subordinates the latency as a consequence of constantly filled buffers. This phenomenon is called Bufferbloat. Active Queue Management (AQM) algorithms such as CoDel or PIE, designed for the use on software based hosts, offer a flow agnostic remedy to Bufferbloat by controlling the queue filling and hence the latency through subtle packet drops. In previous work, we have shown that the data plane programming language P4 is powerful enough to implement the CoDel algorithm. While legacy software algorithms can be easily compiled onto almost any processing architecture, this is not generally true for AQM on programmable data plane hardware, i.e., programmable packet processors. In this work, we highlight corresponding challenges, demonstrate how to tackle them, and provide techniques enabling the implementation of such AQM algorithms on different high speed P4-programmable data plane hardware targets. In addition, we provide measurement results created on different P4-programmable data plane targets. The resulting latency measurements reveal the feasibility and the constraints to be considered to perform Active Queue Management within these devices. Finally, we release the source code and instructions to reproduce the results in this paper as open source to the research community.
Submitted 7 July, 2021; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Dissecting Apple's Meta-CDN during an iOS Update
Authors:
Jeremias Blendin,
Fabrice Bendfeldt,
Ingmar Poese,
Boris Koldehofe,
Oliver Hohlfeld
Abstract:
Content delivery networks (CDNs) contribute more than 50% of today's Internet traffic. Meta-CDNs, an evolution of centrally controlled CDNs, promise increased flexibility by multihoming content. So far, efforts to understand the characteristics of Meta-CDNs focus mainly on third-party Meta-CDN services. A common, but unexplored, use case for Meta-CDNs is to use the CDN's mapping infrastructure to form self-operated Meta-CDNs integrating third-party CDNs. These CDNs assist in the build-up phase of a CDN's infrastructure or mitigate capacity shortages by offloading traffic. This paper investigates the Apple CDN as a prominent example of self-operated Meta-CDNs. We describe the involved CDNs, the request-mapping mechanism, and show the cache locations of the Apple CDN using measurements of more than 800 RIPE Atlas probes worldwide. We further measure its load-sharing behavior by observing a major iOS update in Sep. 2017, a significant event potentially reaching up to an estimated 1 billion iOS devices. Furthermore, by analyzing data from a European Eyeball ISP, we quantify third-party traffic offloading effects and find third-party CDNs increase their traffic by 438% while saturating seemingly unrelated links.
Submitted 6 October, 2018;
originally announced October 2018.
-
Don't Repeat Yourself: Seamless Execution and Analysis of Extensive Network Experiments
Authors:
Alexander Frömmgen,
Denny Stohr,
Boris Koldehofe,
Amr Rizk
Abstract:
This paper presents MACI, the first bespoke framework for the management, the scalable execution, and the interactive analysis of a large number of network experiments. Driven by the desire to avoid repetitive implementation of just a few scripts for the execution and analysis of experiments, MACI emerged as a generic framework for network experiments that significantly increases efficiency and ensures reproducibility. To this end, MACI incorporates and integrates established simulators and analysis tools to foster rapid but systematic network experiments.
We found MACI indispensable in all phases of the research and development process of various communication systems, such as i) an extensive DASH video streaming study, ii) the systematic development and improvement of Multipath TCP schedulers, and iii) research on a distributed topology graph pattern matching algorithm. With this work, we make MACI publicly available to the research community to advance efficient and reproducible network experiments.
Submitted 9 February, 2018;
originally announced February 2018.