
dreaMLearning: Data Compression Assisted Machine Learning

Authors: Xiaobo Zhao, Aaron Hurst, Panagiotis Karras, Daniel E. Lucani

Abstract: Despite rapid advancements, machine learning, particularly deep learning, is hindered by the need for large amounts of labeled data to learn meaningful patterns without overfitting and immense demands for computation and storage, which motivate research into architectures that can achieve good performance with fewer resources. This paper introduces dreaMLearning, a novel framework that enables learning from compressed data without decompression, built upon Entropy-based Generalized Deduplication (EntroGeDe), an entropy-driven lossless compression method that consolidates information into a compact set of representative samples. DreaMLearning accommodates a wide range of data types, tasks, and model architectures. Extensive experiments on regression and classification tasks with tabular and image data demonstrate that dreaMLearning accelerates training by up to 8.8x, reduces memory usage by 10x, and cuts storage by 42%, with a minimal impact on model performance. These advancements enhance diverse ML applications, including distributed and federated learning, and tinyML on resource-constrained edge devices, unlocking new possibilities for efficient and scalable learning.

Submitted 27 June, 2025; originally announced June 2025.

Comments: 18 pages, 11 figures

arXiv:2506.09186 [pdf, ps, other]

Not all those who drift are lost: Drift correction and calibration scheduling for the IoT

Authors: Aaron Hurst, Andrey V. Kalinichev, Klaus Koren, Daniel E. Lucani

Abstract: Sensors provide a vital source of data that link digital systems with the physical world. However, as sensors age, the relationship between what they measure and what they output changes. This is known as sensor drift and poses a significant challenge that, combined with limited opportunity for re-calibration, can severely limit data quality over time. Previous approaches to drift correction typically require large volumes of ground truth data and do not consider measurement or prediction uncertainty. In this paper, we propose a probabilistic sensor drift correction method that takes a fundamental approach to modelling the sensor response using Gaussian Process Regression. Tested using dissolved oxygen sensors, our method delivers mean squared error (MSE) reductions of up to 90% and more than 20% on average. We also propose a novel uncertainty-driven calibration schedule optimisation approach that builds on top of drift correction and further reduces MSE by up to 15.7%.

Submitted 10 June, 2025; originally announced June 2025.

arXiv:2408.14473 [pdf, other]

doi 10.1109/JIOT.2024.3476922

Precision on Demand: Propositional Logic for Event-Trigger Threshold Regulation

Authors: Valdemar Tang, Claudio Gomes, Daniel Lucani

Abstract: We introduce a novel event-trigger threshold (ETT) regulation mechanism based on the quantitative semantics of propositional logic (PL). We exploit the expressiveness of the PL vocabulary to deliver a precise and flexible specification of ETT regulation based on system requirements and properties. Additionally, we present a modified ETT regulation mechanism that provides formal guarantees for satisfaction/violation detection of arbitrary PL properties. To validate our proposed method, we consider a convoy of vehicles in an adaptive cruise control scenario. In this scenario, the PL operators are used to encode safety properties and the ETTs are regulated accordingly, e.g., if our safety metric is high, a higher ETT can be used, while a smaller ETT is used when the system is approaching unsafe conditions. Under ideal ETT regulation conditions in this safety scenario, we show that reductions of 41.8% to 96.3% in the number of triggered events are possible compared to using a constant ETT while maintaining similar safety conditions.

Submitted 11 February, 2025; v1 submitted 26 August, 2024; originally announced August 2024.

Comments: 17 pages, 7 figures

Journal ref: IEEE Internet of Things Journal, vol. 12, no. 3, pp. 2674-2689, 1 Feb. 2025

arXiv:2402.05974 [pdf, other]

RAGE for the Machine: Image Compression with Low-Cost Random Access for Embedded Applications

Authors: Christian D. Rask, Daniel E. Lucani

Abstract: We introduce RAGE, an image compression framework that achieves four generally conflicting objectives: 1) good compression for a wide variety of color images, 2) computationally efficient, fast decompression, 3) fast random access of images with pixel-level granularity without the need to decompress the entire image, 4) support for both lossless and lossy compression. To achieve these, we rely on the recent concept of generalized deduplication (GD), which is known to provide efficient lossless (de)compression and fast random access in time-series data, and deliver key expansions suitable for image compression, both lossless and lossy. Using nine different datasets, including graphics, logos, and natural images, we show that RAGE has compression ratios similar to or better than those of state-of-the-art lossless image compressors, while delivering pixel-level random access capabilities. Tests on an ARM Cortex-M33 platform show seek times between 9.9 and 40.6 ns and average decoding time per pixel between 274 and 1226 ns. Our measurements also show that RAGE's lossy variant, RAGE-Q, outperforms JPEG by several fold in terms of distortion in embedded graphics and has reasonable compression and distortion for natural images.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 7 pages, 10 figures, submitted to IEEE International Conference on Image Processing (ICIP)

arXiv:2401.12018 [pdf, other]

doi 10.14778/3648160.3648181

PairwiseHist: Fast, Accurate and Space-Efficient Approximate Query Processing with Data Compression

Authors: Aaron Hurst, Daniel E. Lucani, Qi Zhang

Abstract: Exponential growth in data collection is creating significant challenges for data storage and analytics latency. Approximate Query Processing (AQP) has long been touted as a solution for accelerating analytics on large datasets; however, there is still room for improvement across all key performance criteria. In this paper, we propose a novel histogram-based data synopsis called PairwiseHist that uses recursive hypothesis testing to ensure accurate histograms and can be built on top of data compressed using Generalized Deduplication (GD). We thus show that GD data compression can contribute to AQP. Compared to state-of-the-art AQP approaches, PairwiseHist achieves better performance across all key metrics, including 2.6× higher accuracy, 3.5× lower latency, 24× smaller synopses and 1.5-4× faster construction time.

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2308.14627 [pdf, other]

Zip to Zip-it: Compression to Achieve Local Differential Privacy

Authors: Francesco Taurone, Daniel Lucani, Qi Zhang

Abstract: Local differential privacy techniques for numerical data typically transform a dataset to ensure a bound on the likelihood that, given a query, a malicious user could infer information on the original samples. Queries are often solely based on users and their requirements, limiting the design of the perturbation to processes that, while privatizing the results, do not jeopardize their usefulness. In this paper, we propose a privatization technique called Zeal, where perturbator and aggregator are designed as a unit, resulting in a locally differentially private mechanism that, by design, improves the compressibility of the perturbed dataset compared to the original, saves on transmitted bits for data collection and protects against privacy vulnerabilities due to floating point arithmetic that affect other state-of-the-art schemes. We prove that the utility error on querying the average is invariant to the bias introduced by Zeal in a wide range of conditions, and that under the same circumstances, Zeal also guarantees protection against the aforementioned vulnerability. Our numerical results show up to 94% improvements in compression and up to 95% more efficient data transmissions, while keeping utility errors within 2%.

Submitted 28 August, 2023; originally announced August 2023.

Journal ref: 2023 IEEE Global Communications Conference: Selected Areas in Communications: Cloud/edge Computing, Networking, and Data Storage (Globecom2023 SAC CLOUD)

arXiv:2308.03623 [pdf, other]

doi 10.1007/978-3-031-38318-2

Lossless preprocessing of floating point data to enhance compression

Authors: Francesco Taurone, Daniel E. Lucani, Marcell Fehér, Qi Zhang

Abstract: Data compression algorithms typically rely on identifying repeated sequences of symbols from the original data to provide a compact representation of the same information, while maintaining the ability to recover the original data from the compressed sequence. Using data transformations prior to the compression process has the potential to enhance the compression capabilities, being lossless as long as the transformation is invertible. Floating point data presents unique challenges to generate invertible transformations with high compression potential. This paper identifies key conditions for basic operations of floating point data that guarantee lossless transformations. Then, we show four methods that make use of these observations to deliver lossless compression of real datasets, where we improve compression rates by up to 40%.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2304.07240 [pdf, other]

GreedyGD: Enhanced Generalized Deduplication for Direct Analytics in IoT

Authors: Aaron Hurst, Daniel E. Lucani, Qi Zhang

Abstract: Exponential growth in the amount of data generated by the Internet of Things currently poses significant challenges for data communication, storage and analytics and leads to high costs for organisations hoping to leverage their data. Novel techniques are therefore needed to holistically improve the efficiency of data storage and analytics in IoT systems. The emerging compression technique Generalized Deduplication (GD) has been shown to deliver high compression and enable direct compressed data analytics with low storage and memory requirements. In this paper, we propose a new GD-based data compression algorithm called GreedyGD that is designed for analytics. Compared to existing versions of GD, GreedyGD enables more reliable analytics with less data, while running 11.2x faster and delivering even better compression.

Submitted 14 April, 2023; originally announced April 2023.

ACM Class: E.2

arXiv:2303.04478 [pdf, other]

Change a Bit to save Bytes: Compression for Floating Point Time-Series Data

Authors: Francesco Taurone, Daniel E. Lucani, Marcell Fehér, Qi Zhang

Abstract: The number of IoT devices is expected to continue its dramatic growth in the coming years and, with it, a growth in the amount of data to be transmitted, processed and stored. Compression techniques that support analytics directly on the compressed data could pave the way for systems to scale efficiently to these growing demands. This paper proposes two novel methods for preprocessing a stream of floating point data to improve the compression capabilities of various IoT data compressors. In particular, these techniques are shown to be helpful with recent compressors that allow for random access and analytics while maintaining good compression. Our techniques improve compression with reductions of up to 80% when allowing for at most 1% of recovery error.

Submitted 8 March, 2023; originally announced March 2023.

Journal ref: 2023 IEEE International Conference on Communications (ICC): SAC Cloud Computing, Networking and Storage Track (IEEE ICC'23 - SAC-02 CCNS Track)

arXiv:2209.02334 [pdf, other]

An Adaptive Column Compression Family for Self-Driving Databases

Authors: Marcell Fehér, Daniel E. Lucani, Ioannis Chatzigeorgiou

Abstract: Modern in-memory databases are typically used for high-performance workloads; therefore, they have to be optimized for small memory footprint and high query speed at the same time. Data compression has the potential to reduce memory requirements but often reduces query speed too. In this paper, we propose a novel, adaptive compressor that offers a new trade-off point of these dimensions, achieving better compression than LZ4 while reaching query speeds close to the fastest existing segment encoders. We evaluate our compressor both with synthetic data in isolation and on the TPC-H and Join Order Benchmarks, integrated into a modern relational column store, Hyrise.

Submitted 6 September, 2022; originally announced September 2022.

Comments: Appeared in the Thirteenth International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures (ADMS'22), a workshop of VLDB22

arXiv:2202.13925 [pdf, ps, other]

Bonsai: A Generalized Look at Dual Deduplication

Authors: Hadi Sehat, Anders Lindskov Kloborg, Christian Mørup, Elena Pagnin, Daniel E. Lucani

Abstract: Cloud Service Providers (CSPs) offer a vast amount of storage space at competitive prices to cope with the growing demand for digital data storage. Dual deduplication is a recent framework designed to improve data compression on the CSP while keeping clients' data private from the CSP. To achieve this, clients perform lightweight information-theoretic transformations to their data prior to upload. We investigate the effectiveness of dual deduplication, and propose an improvement for the existing state-of-the-art method. We name our proposal Bonsai as it aims at reducing storage fingerprint and improving scalability. In detail, Bonsai achieves (1) a significant reduction in client storage, (2) a reduction in total required storage (client + CSP), and (3) a reduction in the deduplication time on the CSP. Our experiments show that Bonsai achieves compression rates of 68% on the cloud and 5% on the client, while allowing the cloud to identify deduplications in a time-efficient manner. We also show that combining our method with universal compressors in the cloud, e.g., Brotli, can yield better overall compression on the data compared to only applying the universal compressor or plain Bonsai. Finally, we show that Bonsai and its variants provide sufficient privacy against an honest-but-curious CSP that knows the distribution of the clients' original data.

Submitted 29 March, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

arXiv:2201.10839 [pdf, other]

Bifrost: Secure, Scalable and Efficient File Sharing System Using Dual Deduplication

Authors: Hadi Sehat, Elena Pagnin, Daniel E. Lucani

Abstract: We consider the problem of sharing sensitive or valuable files across users while partially relying on a common, untrusted third-party, e.g., a Cloud Storage Provider (CSP). Although users can rely on a secure peer-to-peer (P2P) channel for file sharing, this introduces potential delay on the data transfer and requires the sender to remain active and connected while the transfer process occurs. Instead of using the P2P channel for the entire file, users can upload information about the file on a common CSP and share only the essential information that enables the receiver to download and recover the original file. This paper introduces Bifrost, an innovative file sharing system inspired by recent results on dual deduplication. Bifrost achieves the desired functionality and simultaneously guarantees that (1) the CSP can efficiently compress outsourced data; (2) the secure P2P channel is used only to transmit short, but crucial information; (3) users can check for data integrity, i.e., detect if the CSP alters the outsourced data; and (4) only the sender (data owner) and the intended receiver can access the file after sharing, i.e., neither the cloud nor a malicious adversary can infer useful information about the shared file. We analyze compression and bandwidth performance using a proof-of-concept implementation. Our experiments show that secure file sharing can be achieved by sending only 650 bits on the P2P channel, irrespective of file size, while the CSP that aids the sharing can enjoy a compression rate of 86.9%.

Submitted 26 January, 2022; originally announced January 2022.

arXiv:2107.08868 [pdf, other]

Energy Efficient Data Recovery from Corrupted LoRa Frames

Authors: Niloofar Yazdani, Nikolaos Kouvelas, R Venkatesha Prasad, Daniel E. Lucani

Abstract: High frame-corruption is widely observed in Long Range Wide Area Networks (LoRaWAN) due to the coexistence with other networks in ISM bands and an Aloha-like MAC layer. LoRa's Forward Error Correction (FEC) mechanism is often insufficient to retrieve corrupted data. In fact, real-life measurements show that at least one-fourth of received transmissions are corrupted. When more frames are dropped, LoRa nodes usually switch over to higher spreading factors (SF), thus increasing transmission times and increasing the required energy. This paper introduces ReDCoS, a novel coding technique at the application layer that improves recovery of corrupted LoRa frames, thus reducing the overall transmission time and energy invested by LoRa nodes by several-fold. ReDCoS utilizes lightweight coding techniques to pre-encode the transmitted data. Therefore, the inbuilt Cyclic Redundancy Check (CRC) that follows is computed based on already encoded data. At the receiver, we use both the CRC and the coded data to recover data from a corrupted frame beyond the built-in Error Correcting Code (ECC). We compare the performance of ReDCoS to (i) the standard FEC of vanilla-LoRaWAN, and to (ii) RS coding applied as ECC to the data of LoRaWAN. The results indicated a 54x and 13.5x improvement of decoding ratio, respectively, when 20 data symbols were sent. Furthermore, we evaluated ReDCoS on-field using LoRa SX1261 transceivers showing that it outperformed RS-coding by a factor of at least 2x (and up to 6x) in terms of the decoding ratio while consuming 38.5% less energy per correctly received transmission.

Submitted 19 July, 2021; originally announced July 2021.

Comments: 6 pages

arXiv:2105.00721 [pdf, other]

Stream Compression of DLMS Smart Meter Readings

Authors: Marcell Fehér, Daniel E. Lucani, Morten Tranberg Hansen, Flemming Enevold Vester

Abstract: Smart electricity meters typically upload readings a few times a day. Utility providers aim to increase the upload frequency in order to access consumption information in near real time, but the legacy compressors fail to provide sufficient savings on the low-bandwidth, high-cost data connection. We propose a new compression method and data format for DLMS smart meter readings, which is significantly better with frequent uploads and enables reporting every reading in near real time with the same or lower data sizes than the currently available compressors in the DLMS protocol.

Submitted 3 May, 2021; originally announced May 2021.

Comments: 6 pages, 7 figures, IEEE conference format, submitted to Globecom'21

arXiv:2104.15094 [pdf, other]

QoS-Aware Placement of Deep Learning Services on the Edge with Multiple Service Implementations

Authors: Nathaniel Hudson, Hana Khamfroush, Daniel E. Lucani

Abstract: Mobile edge computing pushes computationally-intensive services closer to the user to provide reduced delay due to physical proximity. This has led many to consider deploying deep learning models on the edge -- commonly known as edge intelligence (EI). EI services can have many model implementations that provide different QoS. For instance, one model can perform inference faster than another (thus reducing latency) while achieving less accuracy when evaluated. In this paper, we study joint service placement and model scheduling of EI services with the goal to maximize Quality-of-Service (QoS) for end users where EI services have multiple implementations to serve user requests, each with varying costs and QoS benefits. We cast the problem as an integer linear program and prove that it is NP-hard. We then prove the objective is equivalent to maximizing a monotone increasing, submodular set function and thus can be solved greedily while maintaining a (1-1/e)-approximation guarantee. We then propose two greedy algorithms: one that theoretically guarantees this approximation and another that empirically matches its performance with greater efficiency. Finally, we thoroughly evaluate the proposed algorithm for making placement and scheduling decisions in both synthetic and real-world scenarios against the optimal solution and some baselines. In the real-world case, we consider real machine learning models using the ImageNet 2012 data-set for requests. Our numerical experiments empirically show that our more efficient greedy algorithm is able to approximate the optimal solution with a 0.904 approximation on average, while the next closest baseline achieves a 0.607 approximation on average.

Submitted 30 April, 2021; originally announced April 2021.

Comments: Accepted for publication through the 30th International Conference on Computer Communications and Networks (ICCCN 2021). This manuscript contains a complete proof of a theorem referenced in the ICCCN manuscript

arXiv:2101.05323 [pdf, other]

doi 10.1145/3386367.3431302

ZipLine: In-Network Compression at Line Speed

Authors: Sébastien Vaucher, Niloofar Yazdani, Pascal Felber, Daniel E. Lucani, Valerio Schiavoni

Abstract: Network appliances continue to offer novel opportunities to offload processing from computing nodes directly into the data plane. One popular concern of network operators and their customers is to move data increasingly faster. A common technique to increase data throughput is to compress it before its transmission. However, this requires compression of the data -- a time and energy demanding pre-processing phase -- and decompression upon reception -- a similarly resource consuming operation. Moreover, if multiple nodes transfer similar data chunks across the network hop (e.g., a given pair of switches), each node effectively wastes resources by executing similar steps. This paper proposes ZipLine, an approach to design and implement (de)compression at line speed leveraging the Tofino hardware platform which is programmable using the P4_16 language. We report on lessons learned while building the system and show throughput, latency and compression measurements on synthetic and real-world traces, showcasing the benefits and trade-offs of our design.

Submitted 13 January, 2021; originally announced January 2021.

Journal ref: 2020. Proceedings of the 16th International Conference on emerging Networking EXperiments and Technologies. Association for Computing Machinery, New York, NY, USA, 399-405

arXiv:2011.08381 [pdf, other]

Optimal Accuracy-Time Trade-off for Deep Learning Services in Edge Computing Systems

Authors: Minoo Hosseinzadeh, Andrew Wachal, Hana Khamfroush, Daniel E. Lucani

Abstract: With the increasing demand for computationally intensive services like deep learning tasks, emerging distributed computing platforms such as edge computing (EC) systems are becoming more popular. Edge computing systems have shown promising results in terms of latency reduction compared to the traditional cloud systems. However, their limited processing capacity imposes a trade-off between the potential latency reduction and the achieved accuracy in computationally-intensive services such as deep learning-based services. In this paper, we focus on finding the optimal accuracy-time trade-off for running deep learning services in a three-tier EC platform where several deep learning models with different accuracy levels are available. Specifically, we cast the problem as an Integer Linear Program, where optimal task scheduling decisions are made to maximize overall user satisfaction in terms of accuracy-time trade-off. We prove that our problem is NP-hard and then provide a polynomial constant-time greedy algorithm, called GUS, that is shown to attain near-optimal results. Finally, upon vetting our algorithmic solution through numerical experiments and comparison with a set of heuristics, we deploy it on a test-bed implemented to measure real-world results. The results of both numerical analysis and real-world implementation show that GUS can outperform the baseline heuristics in terms of the average percentage of satisfied users by a factor of at least 50%.

Submitted 16 November, 2020; originally announced November 2020.

arXiv:2007.11403 [pdf, ps, other]

Yggdrasil: Privacy-aware Dual Deduplication in Multi Client Settings

Authors: Hadi Sehat, Elena Pagnin, Daniel E. Lucani

Abstract: This paper proposes Yggdrasil, a protocol for privacy-aware dual data deduplication in multi client settings. Yggdrasil is designed to reduce the cloud storage space while safeguarding the privacy of the client's outsourced data. Yggdrasil combines three innovative tools to achieve this goal. First, generalized deduplication, an emerging technique to reduce data footprint. Second, non-deterministic transformations that are described compactly and improve the degree of data compression in the Cloud (across users). Third, data preprocessing in the clients in the form of lightweight, privacy-driven transformations prior to upload. This guarantees that an honest-but-curious Cloud service trying to retrieve the client's actual data will face a high degree of uncertainty as to what the original data is. We provide a mathematical analysis of the measure of uncertainty as well as the compression potential of our protocol. Our experiments with an HDFS log data set show that 49% overall compression can be achieved, with clients storing only 12% for privacy and the Cloud storing the rest. This is achieved while ensuring that each fragment uploaded to the Cloud would have 10^296 possible original strings from the client. Higher uncertainty is possible, with some reduction of compression potential.

Submitted 22 July, 2020; originally announced July 2020.

arXiv:2007.10064 [pdf, other]

Memory-aware Online Compression of CAN Bus Data for Future Vehicular Systems

Authors: Niloofar Yazdani, Lars Nielsen, Daniel E. Lucani

Abstract: Vehicles generate a large amount of data from their internal sensors. This data is not only useful for a vehicle's proper operation, but it provides car manufacturers with the ability to optimize performance of individual vehicles and companies with fleets of vehicles (e.g., trucks, taxis, tractors) to optimize their operations to reduce fuel costs and plan repairs. This paper proposes algorithms to compress CAN bus data, specifically, packaged as MDF4 files. In particular, we propose lightweight, online and configurable compression algorithms that allow limited devices to choose the amount of RAM and Flash allocated to them. We show that our proposals can outperform LZW for the same RAM footprint, and can even deliver comparable or better performance to DEFLATE under the same RAM limitations.

Submitted 20 July, 2020; originally announced July 2020.

Comments: 6 pages, 7 figures

arXiv:2005.11158 [pdf, other]

doi 10.1145/3401025.3404098

Hermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication

Authors: Christian Göttel, Lars Nielsen, Niloofar Yazdani, Pascal Felber, Daniel E. Lucani, Valerio Schiavoni

Abstract: With the advent of the Internet of Things (IoT), the ever growing number of connected devices observed in recent years and foreseen for the next decade suggests that more and more data will have to be transmitted over a network, before being processed and stored in data centers. Generalized deduplication (GD) is a novel technique that effectively reduces the data storage cost by identifying similar data chunks, and that can gradually reduce the pressure on the network infrastructure by limiting the data that needs to be transmitted. This paper presents Hermes, an application-level protocol for the data-plane that can operate over generalized deduplication, as well as over classic deduplication. Hermes significantly reduces the data transmission traffic while effectively decreasing the energy footprint, a relevant matter to consider in the context of IoT deployments. We fully implemented Hermes and evaluated its performance using consumer-grade IoT devices (e.g., Raspberry Pi 4B models). Our results highlight several trade-offs that must be taken into account when considering real-world workloads.

Submitted 20 July, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

Comments: This work was partially financed by the SCALE-IoT Project (Grant No. 7026-00042B) granted by the Independent Research Fund Denmark, by the Aarhus Universitets Forskningsfond (AUFF) Starting Grant Project AUFF-2017-FLS-7-1, and Aarhus University's DIGIT Centre. European Commission Project: LEGaTO - Low Energy Toolset for Heterogeneous Computing (EC-H2020-780681)

Journal ref: DEBS'20: Proceedings of the 14th ACM International Conference on Distributed and Event-Based Systems (2020) 133-136

arXiv:1901.02720 [pdf, other]

doi 10.1109/GLOBECOM38437.2019.9014012

Generalized Deduplication: Bounds, Convergence, and Asymptotic Properties

Authors: Rasmus Vestergaard, Qi Zhang, Daniel E. Lucani

Abstract: We study a generalization of deduplication, which enables lossless deduplication of highly similar data and show that standard deduplication with fixed chunk length is a special case. We provide bounds on the expected length of coded sequences for generalized deduplication and show that the coding has asymptotic near-entropy cost under the proposed source model. More importantly, we show that generalized deduplication allows for multiple orders of magnitude faster convergence than standard deduplication. This means that generalized deduplication can provide compression benefits much earlier than standard deduplication, which is key in practical systems. Numerical examples demonstrate our results, showing that our lower bounds are achievable, and illustrating the potential gain of using the generalization over standard deduplication. In fact, we show that even for a simple case of generalized deduplication, the gain in convergence speed is linear with the size of the data chunks.

Submitted 7 August, 2019; v1 submitted 9 January, 2019; originally announced January 2019.

Comments: 15 pages, 4 figures. This is the full version of a paper accepted for GLOBECOM 2019

arXiv:1804.07190 [pdf, other]

Reliable IoT Storage: Minimizing Bandwidth Use in Storage Without Newcomer Nodes

Authors: Xiaobo Zhao, Daniel E. Lucani, Xiaohong Shen, Haiyan Wang

Abstract: This letter characterizes the optimal policies for bandwidth use and storage for the problem of distributed storage in Internet of Things (IoT) scenarios, where lost nodes cannot be replaced by new nodes as is typically assumed in Data Center and Cloud scenarios. We develop an information flow model that captures the overall process of data transmission between IoT devices, from the initial preparation stage (generating redundancy from the original data) to the different repair stages with fewer and fewer devices. Our numerical results show that in a system with 10 nodes, the proposed optimal scheme can save as much as 10.3% of bandwidth use, and as much as 44% storage use with respect to the closest suboptimal approach.

Submitted 19 April, 2018; originally announced April 2018.

Comments: 4 pages, 3 figures, accepted for publication in IEEE Communications Letters

arXiv:1511.05892 [pdf, ps, other]

doi 10.1109/TCOMM.2015.2503398

Analysis and Optimization of Sparse Random Linear Network Coding for Reliable Multicast Services

Authors: Andrea Tassi, Ioannis Chatzigeorgiou, Daniel E. Lucani

Abstract: Point-to-multipoint communications are expected to play a pivotal role in next-generation networks. This paper refers to a cellular system transmitting layered multicast services to a multicast group of users. Reliability of communications is ensured via different Random Linear Network Coding (RLNC) techniques. We deal with a fundamental problem: the computational complexity of the RLNC decoder. The higher the number of decoding operations is, the more the user's computational overhead grows and, consequently, the faster the battery of mobile devices drains. By referring to several sparse RLNC techniques, and without any assumption on the implementation of the RLNC decoder in use, we provide an efficient way to characterize the performance of users targeted by ultra-reliable layered multicast services. The proposed modeling makes it possible to efficiently derive the average number of coded packet transmissions needed to recover one or more service layers. We design a convex resource allocation framework that minimizes the complexity of the RLNC decoder by jointly optimizing the transmission parameters and the sparsity of the code. The designed optimization framework also ensures service guarantees to predetermined fractions of users. The performance of the proposed optimization framework is then investigated in an LTE-A eMBMS network multicasting H.264/SVC video services.

Submitted 19 November, 2015; v1 submitted 18 November, 2015; originally announced November 2015.

Comments: To appear in IEEE Transactions on Communications

arXiv:1404.6620 [pdf, other]

Fulcrum Network Codes: A Code for Fluid Allocation of Complexity

Authors: Daniel E. Lucani, Morten V. Pedersen, Diego Ruano, Chres W. Sørensen, Frank H. P. Fitzek, Janus Heide, Olav Geil

Abstract: This paper proposes Fulcrum network codes, a network coding framework that achieves three seemingly conflicting objectives: (i) to reduce the coding coefficient overhead to almost n bits per packet in a generation of n packets; (ii) to operate the network using only GF(2) operations at intermediate nodes if necessary, dramatically reducing complexity in the network; (iii) to deliver an end-to-end performance that is close to that of a high-field network coding system for high-end receivers while simultaneously catering to low-end receivers that decode in GF(2). As a consequence of (ii) and (iii), Fulcrum codes have a unique trait missing so far in the network coding literature: they provide the network with the flexibility to spread computational complexity over different devices depending on their current load, network conditions, or even energy targets in a decentralized way. At the core of our framework lies the idea of precoding at the sources using an expansion field GF(2^h) to increase the number of dimensions seen by the network using a linear mapping. Fulcrum codes can use any high-field linear code for precoding, e.g., Reed-Solomon, with the structure of the precode determining some of the key features of the resulting code. For example, a systematic structure provides the ability to manage heterogeneous receivers while using the same data stream. Our analysis shows that the number of additional dimensions created during precoding controls the trade-off between delay, overhead, and complexity. Our implementation and measurements show that Fulcrum achieves a decoding probability similar to that of high-field Random Linear Network Coding (RLNC) approaches but with encoders/decoders that are an order of magnitude faster.

Submitted 18 November, 2015; v1 submitted 26 April, 2014; originally announced April 2014.

Comments: 30 pages, 12 figures, Submitted to the IEEE Transactions on Communications

arXiv:1204.0034 [pdf, ps, other]

doi 10.1109/ICC.2013.6655057

Systematic Network Coding with the Aid of a Full-Duplex Relay

Authors: Giuliano Giacaglia, Xiaomeng Shi, MinJi Kim, Daniel E. Lucani, Muriel Medard

Abstract: A characterization of systematic network coding over multi-hop wireless networks is key towards understanding the trade-off between complexity and delay performance of networks that preserve the systematic structure. This paper studies the case of a relay channel, where the source's objective is to deliver a given number of data packets to a receiver with the aid of a relay. The source broadcasts to both the receiver and the relay using one frequency, while the relay uses another frequency for transmissions to the receiver, allowing for a full-duplex operation of the relay. We analyze the decoding complexity and delay performance of two types of relays: one that preserves the systematic structure of the code from the source; another that does not. A systematic relay forwards uncoded packets upon reception, but transmits coded packets to the receiver after receiving the first coded packet from the source. On the other hand, a non-systematic relay always transmits linear combinations of previously received packets. We compare the performance of these two alternatives by analytically characterizing the expected transmission completion time as well as the number of uncoded packets forwarded by the relay. Our numerical results show that, for a poor channel between the source and the receiver, preserving the systematic structure at the relay (i) allows a significant increase in the number of uncoded packets received by the receiver, thus reducing the decoding complexity, and (ii) preserves close to optimal delay performance.

Submitted 30 March, 2012; originally announced April 2012.

Comments: 6 pages, 5 figures, submitted to IEEE Globecom

arXiv:1109.2613 [pdf, ps, other]

Whether and Where to Code in the Wireless Relay Channel

Authors: Xiaomeng Shi, Muriel Medard, Daniel E. Lucani

Abstract: The throughput benefits of random linear network codes have been studied extensively for wirelined and wireless erasure networks. It is often assumed that all nodes within a network perform coding operations. In energy-constrained systems, however, coding subgraphs should be chosen to control the number of coding nodes while maintaining throughput. In this paper, we explore the strategic use of network coding in the wireless packet erasure relay channel according to both throughput and energy metrics. In the relay channel, a single source communicates to a single sink through the aid of a half-duplex relay. The fluid flow model is used to describe the case where both the source and the relay are coding, and Markov chain models are proposed to describe packet evolution if only the source or only the relay is coding. In addition to transmission energy, we take into account coding and reception energies. We show that coding at the relay alone while operating in a rateless fashion is neither throughput nor energy efficient. Given a set of system parameters, our analysis determines the optimal amount of time the relay should participate in the transmission, and where coding should be performed.

Submitted 1 June, 2012; v1 submitted 12 September, 2011; originally announced September 2011.

Comments: 11 pages, 12 figures, to be published in the IEEE JSAC Special Issue on Theories and Methods for Advanced Wireless Relays

arXiv:1105.6176 [pdf, other]

doi 10.1109/ISIT.2011.6034244

Energy-Delay Considerations in Coded Packet Flows

Authors: Daniel E. Lucani, Joerg Kliewer

Abstract: We consider a line of terminals which is connected by packet erasure channels and where random linear network coding is carried out at each node prior to transmission. In particular, we address an online approach in which each terminal has local information to be conveyed to the base station at the end of the line and provide a queueing theoretic analysis of this scenario. First, a genie-aided scenario is considered and the average delay and average transmission energy depending on the link erasure probabilities and the Poisson arrival rates at each node are analyzed. We then assume that all nodes cannot send and receive at the same time. The transmitting nodes in the network send coded data packets before stopping to wait for the receiving nodes to acknowledge the number of degrees of freedom, if any, that are required to decode correctly the information. We analyze this problem for an infinite queue size at the terminals and show that there is an optimal number of coded data packets at each node, in terms of average completion time or transmission energy, to be sent before stopping to listen.

Submitted 31 May, 2011; originally announced May 2011.

Comments: 5 pages, 3 figures. Accepted, IEEE ISIT 2011, Saint Petersburg, Russia

arXiv:1103.0266 [pdf, ps, other]

On the Order Optimality of Large-scale Underwater Networks

Authors: Won-Yong Shin, Daniel E. Lucani, Muriel Medard, Milica Stojanovic, Vahid Tarokh

Abstract: Capacity scaling laws are analyzed in an underwater acoustic network with $n$ regularly located nodes on a square, in which both bandwidth and received signal power can be limited significantly. A narrow-band model is assumed where the carrier frequency is allowed to scale as a function of $n$. In the network, we characterize an attenuation parameter that depends on the frequency scaling as well as the transmission distance. Cut-set upper bounds on the throughput scaling are then derived in both extended and dense networks having unit node density and unit area, respectively. It is first analyzed that under extended networks, the upper bound is inversely proportional to the attenuation parameter, thus resulting in a highly power-limited network. Interestingly, it is seen that the upper bound for extended networks is intrinsically related to the attenuation parameter but not the spreading factor. On the other hand, in dense networks, we show that there exists either a bandwidth or power limitation, or both, according to the path-loss attenuation regimes, thus yielding the upper bound that has three fundamentally different operating regimes. Furthermore, we describe an achievable scheme based on the simple nearest-neighbor multi-hop (MH) transmission. We show that under extended networks, the MH scheme is order-optimal for all the operating regimes. An achievability result is also presented in dense networks, where the operating regimes that guarantee the order optimality are identified. It thus turns out that frequency scaling is instrumental towards achieving the order optimality in the regimes. Finally, these scaling results are extended to a random network realization. As a result, vital information for fundamental limits of a variety of underwater network scenarios is provided by showing capacity scaling laws.

Submitted 28 March, 2011; v1 submitted 1 March, 2011; originally announced March 2011.

Comments: 25 pages, 6 figures, Submitted to IEEE Transactions on Information Theory (part of this work was submitted to the 2011 IEEE International Symposium on Information Theory 2011)

arXiv:1008.0143 [pdf, ps, other]

When Both Transmitting and Receiving Energies Matter: An Application of Network Coding in Wireless Body Area Networks

Authors: Xiaomeng Shi, Muriel Medard, Daniel Lucani

Abstract: A network coding scheme for practical implementations of wireless body area networks is presented, with the objective of providing reliability under low-energy constraints. We propose a simple network layer protocol for star networks, adapting redundancy based on both transmission and reception energies for data and control packets, as well as channel conditions. Our numerical results show that even for small networks, the amount of energy reduction achievable can range from 29% to 87%, as the receiving energy per control packet increases from equal to much larger than the transmitting energy per data packet. The achievable gains increase as a) more nodes are added to the network, and/or b) the channels seen by different sensor nodes become more asymmetric.

Submitted 25 February, 2011; v1 submitted 31 July, 2010; originally announced August 2010.

Comments: 10 pages, 7 figures, submitted to the NC-Pro Workshop at IFIP Networking Conference 2011, and to appear in the conference proceedings, published by Springer-Verlag, in the Lecture Notes in Computer Science (LNCS) series

arXiv:1005.0855 [pdf, ps, other]

On Capacity Scaling of Underwater Networks: An Information-Theoretic Perspective

Authors: Won-Yong Shin, Daniel E. Lucani, Muriel Medard, Milica Stojanovic, Vahid Tarokh

Abstract: Capacity scaling laws are analyzed in an underwater acoustic network with $n$ regularly located nodes on a square. A narrow-band model is assumed where the carrier frequency is allowed to scale as a function of $n$. In the network, we characterize an attenuation parameter that depends on the frequency scaling as well as the transmission distance. A cut-set upper bound on the throughput scaling is then derived in extended networks. Our result indicates that the upper bound is inversely proportional to the attenuation parameter, thus resulting in a highly power-limited network. Interestingly, it is seen that unlike the case of wireless radio networks, our upper bound is intrinsically related to the attenuation parameter but not the spreading factor. Furthermore, we describe an achievable scheme based on the simple nearest neighbor multi-hop (MH) transmission. It is shown under extended networks that the MH scheme is order-optimal as the attenuation parameter scales exponentially with $\sqrt{n}$ (or faster). Finally, these scaling results are extended to a random network realization.

Submitted 7 May, 2010; v1 submitted 5 May, 2010; originally announced May 2010.

Comments: 16 pages, 4 figures, Submitted to IEEE Transactions on Information Theory

arXiv:0908.0497 [pdf, ps, other]

doi 10.1109/INFCOM.2010.5462001

Network Coding for Multi-Resolution Multicast

Authors: MinJi Kim, Daniel Lucani, Xiaomeng Shi, Fang Zhao, Muriel Medard

Abstract: Multi-resolution codes enable multicast at different rates to different receivers, a setup that is often desirable for graphics or video streaming. We propose a simple, distributed, two-stage message passing algorithm to generate network codes for single-source multicast of multi-resolution codes. The goal of this "pushback algorithm" is to maximize the total rate achieved by all receivers, while guaranteeing decodability of the base layer at each receiver. By conducting pushback and code generation stages, this algorithm takes advantage of inter-layer as well as intra-layer coding. Numerical simulations show that in terms of total rate achieved, the pushback algorithm outperforms routing and intra-layer coding schemes, even with codeword sizes as small as 10 bits. In addition, the performance gap widens as the number of receivers and the number of nodes in the network increase. We also observe that naive inter-layer coding schemes may perform worse than intra-layer schemes under certain network conditions.

Submitted 4 August, 2009; originally announced August 2009.

Comments: 9 pages, 16 figures, submitted to IEEE INFOCOM 2010

arXiv:0903.4443 [pdf, other]

Broadcasting in Time-Division Duplexing: A Random Linear Network Coding Approach

Authors: Daniel E. Lucani, Muriel Médard, Milica Stojanovic

Abstract: We study random linear network coding for broadcasting in time division duplexing channels. We assume a packet erasure channel with nodes that cannot transmit and receive information simultaneously. The sender transmits coded data packets back-to-back before stopping to wait for the receivers to acknowledge the number of degrees of freedom, if any, that are required to decode correctly the information. We study the mean time to complete the transmission of a block of packets to all receivers. We also present a bound on the number of stops to wait for acknowledgement in order to complete transmission with probability at least $1-ε$, for any $ε>0$. We present analysis and numerical results showing that our scheme outperforms optimal scheduling policies for broadcast, in terms of the mean completion time. We provide a simple heuristic to compute the number of coded packets to be sent before stopping that achieves close to optimal performance with the advantage of a considerable reduction in the search time.

Submitted 25 March, 2009; originally announced March 2009.

Comments: 6 pages, 5 figures, Submitted to Workshop on Network Coding, Theory, and Applications (NetCod 2009)

arXiv:0903.4434 [pdf, other]

Random Linear Network Coding for Time-Division Duplexing: Queueing Analysis

Authors: Daniel E. Lucani, Muriel Médard, Milica Stojanovic

Abstract: We study the performance of random linear network coding for time division duplexing channels with Poisson arrivals. We model the system as a bulk-service queue with variable bulk size. A full characterization for random linear network coding is provided for time division duplexing channels [1] by means of the moment generating function. We present numerical results for the mean number of packets in the queue and consider the effect of the range of allowable bulk sizes. We show that there exists an optimal choice of this range that minimizes the mean number of data packets in the queue.

Submitted 25 March, 2009; originally announced March 2009.

Comments: 5 pages, 5 figures, 2 tables, Submitted to ISIT 2009

arXiv:0903.4426 [pdf, other]

Capacity Scaling Laws for Underwater Networks

Authors: Daniel E. Lucani, Muriel Médard, Milica Stojanovic

Abstract: The underwater acoustic channel is characterized by a path loss that depends not only on the transmission distance, but also on the signal frequency. Signals transmitted from one user to another over a distance $l$ are subject to a power loss of $l^{-α}{a(f)}^{-l}$. Although a terrestrial radio channel can be modeled similarly, the underwater acoustic channel has different characteristics. The spreading factor $α$, related to the geometry of propagation, has values in the range $1 \leq α\leq 2$. The absorption coefficient $a(f)$ is a rapidly increasing function of frequency: it is three orders of magnitude greater at 100 kHz than at a few Hz. Existing results for capacity of wireless networks correspond to scenarios for which $a(f) = 1$, or a constant greater than one, and $α\geq 2$. These results cannot be applied to underwater acoustic networks in which the attenuation varies over the system bandwidth. We use a water-filling argument to assess the minimum transmission power and optimum transmission band as functions of the link distance and desired data rate, and study the capacity scaling laws under this model.

Submitted 25 March, 2009; originally announced March 2009.

Comments: 5 pages, 2 figures, to Appear in Proceedings of Asilomar Conference on Signals, Systems, and Computers, 2008
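
The path-loss model quoted in the abstract above, a power loss of $l^{-α}{a(f)}^{-l}$ over a distance $l$, can be evaluated numerically. The following is a minimal Python sketch and is not code from the paper: the function names are hypothetical, and the absorption coefficient $a(f)$ is passed in as a plain number because the abstract gives no formula for it. The distance is assumed to be expressed in whatever normalized unit $a(f)$ is defined per.

```python
import math


def received_power_fraction(distance, spreading_factor, absorption):
    """Fraction of power remaining after `distance`: l^(-alpha) * a(f)^(-l).

    `absorption` is the value of a(f) at the frequency of interest. It is an
    input of this sketch (an assumption), not derived from a frequency.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    if absorption <= 0:
        raise ValueError("a(f) must be positive")
    return distance ** (-spreading_factor) * absorption ** (-distance)


def path_loss_db(distance, spreading_factor, absorption):
    """The same loss in decibels: 10*alpha*log10(l) + 10*l*log10(a(f))."""
    fraction = received_power_fraction(distance, spreading_factor, absorption)
    return -10.0 * math.log10(fraction)
```

Setting `absorption` to 1 leaves only the spreading term $l^{-α}$, which is the frequency-independent case the abstract attributes to existing wireless capacity results.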

arXiv:0901.0269 [pdf, other]

Random Linear Network Coding For Time Division Duplexing: Energy Analysis

Authors: Daniel E. Lucani, Milica Stojanovic, Muriel Médard

Abstract: We study the energy performance of random linear network coding for time division duplexing channels. We assume a packet erasure channel with nodes that cannot transmit and receive information simultaneously. The sender transmits coded data packets back-to-back before stopping to wait for the receiver to acknowledge the number of degrees of freedom, if any, that are required to decode correctly the information. Our analysis shows that, in terms of mean energy consumed, there is an optimal number of coded data packets to send before stopping to listen. This number depends on the energy needed to transmit each coded packet and the acknowledgment (ACK), probabilities of packet and ACK erasure, and the number of degrees of freedom that the receiver requires to decode the data. We show that its energy performance is superior to that of a full-duplex system. We also study the performance of our scheme when the number of coded packets is chosen to minimize the mean time to complete transmission as in [1]. Energy performance under this optimization criterion is found to be close to optimal, thus providing a good trade-off between energy and time required to complete transmissions.

Submitted 2 January, 2009; originally announced January 2009.

Comments: 5 pages, 6 figures, Accepted to ICC 2009

arXiv:0809.2350 [pdf, other]

Random Linear Network Coding For Time Division Duplexing: When To Stop Talking And Start Listening

Authors: Daniel E. Lucani, Milica Stojanovic, Muriel Médard

Abstract: A new random linear network coding scheme for reliable communications for time division duplexing channels is proposed. The setup assumes a packet erasure channel and that nodes cannot transmit and receive information simultaneously. The sender transmits coded data packets back-to-back before stopping to wait for the receiver to acknowledge (ACK) the number of degrees of freedom, if any, that are required to decode correctly the information. We provide an analysis of this problem to show that there is an optimal number of coded data packets, in terms of mean completion time, to be sent before stopping to listen. This number depends on the latency, probabilities of packet erasure and ACK erasure, and the number of degrees of freedom that the receiver requires to decode the data. This scheme is optimal in terms of the mean time to complete the transmission of a fixed number of data packets. We show that its performance is very close to that of a full duplex system, while transmitting a different number of coded packets can cause large degradation in performance, especially if latency is high. Also, we study the throughput performance of our scheme and compare it to existing half-duplex Go-back-N and Selective Repeat ARQ schemes. Numerical results, obtained for different latencies, show that our scheme has similar performance to the Selective Repeat in most cases and considerable performance gain when latency and packet error probability is high.

Submitted 13 September, 2008; originally announced September 2008.

Comments: 9 pages, 9 figures, Submitted to INFOCOM'09
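
The trade-off described in the abstract above (how many coded packets to send before stopping to listen) can be illustrated with a small dynamic program. This is a simplified sketch, not the paper's analysis, and it rests on assumptions that are not in the listing: every received coded packet is taken to be innovative (large field size), ACKs are never erased, each round costs `n * t_packet + t_wait`, and the names `optimal_batches`, `t_packet`, `t_wait`, and `n_max` are hypothetical.

```python
from math import comb

def binom_pmf(n, k, q):
    """Probability of exactly k successes in n trials with success probability q."""
    return comb(n, k) * q**k * (1 - q) ** (n - k)

def optimal_batches(M, p, t_packet, t_wait, n_max=None):
    """Return (T, N) for packet erasure probability p < 1.

    T[i] is the minimal mean completion time when i degrees of freedom are
    still missing, and N[i] is the batch size that achieves it.
    """
    n_max = n_max or 4 * M + 10          # assumed search bound for the batch size
    T = [0.0] * (M + 1)
    N = [0] * (M + 1)
    for i in range(1, M + 1):
        best_time, best_n = float("inf"), 0
        for n in range(1, n_max + 1):
            stay = p**n                   # all n packets erased: no progress
            cost = n * t_packet + t_wait  # time spent in this round
            # k packets received with 1 <= k < i leaves i - k still missing
            for k in range(1, min(n, i - 1) + 1):
                cost += binom_pmf(n, k, 1 - p) * T[i - k]
            t = cost / (1 - stay)         # solve T[i] = cost + stay * T[i]
            if t < best_time:
                best_time, best_n = t, n
        T[i], N[i] = best_time, best_n
    return T, N

T, N = optimal_batches(M=3, p=0.2, t_packet=1.0, t_wait=10.0)
print(N[1:], [round(t, 2) for t in T[1:]])
```

Under these assumptions, a long waiting time relative to the packet time makes it worthwhile to send more coded packets than the number of missing degrees of freedom, which matches the qualitative behavior the abstract describes for high latency.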

arXiv:0809.0070 [pdf, ps, other]

Underwater Acoustic Networks: Channel Models and Network Coding based Lower Bound to Transmission Power for Multicast

Authors: Daniel E. Lucani, Muriel Médard, Milica Stojanovic

Abstract: The goal of this paper is two-fold. First, to establish a tractable model for the underwater acoustic channel useful for network optimization in terms of convexity. Second, to propose a network coding based lower bound for transmission power in underwater acoustic networks, and compare this bound to the performance of several network layer schemes. The underwater acoustic channel is characterized by a path loss that depends strongly on transmission distance and signal frequency. The exact relationship among power, transmission band, distance and capacity for the Gaussian noise scenario is a complicated one. We provide a closed-form approximate model for 1) transmission power and 2) optimal frequency band to use, as functions of distance and capacity. The model is obtained through numerical evaluation of analytical results that take into account physical models of acoustic propagation loss and ambient noise. Network coding is applied to determine a lower bound to transmission power for a multicast scenario, for a variety of multicast data rates and transmission distances of interest for practical systems, exploiting physical properties of the underwater acoustic channel. The results quantify the performance gap in transmission power between a variety of routing and network coding schemes and the network coding based lower bound. We illustrate results numerically for different network scenarios.

Submitted 30 August, 2008; originally announced September 2008.

Comments: 12 pages, 10 figures, 2 Tables, Accepted to Journal on Selected Areas in Communications (Underwater Communications and Wireless Networks)

arXiv:0801.0426 [pdf, other]

doi 10.1109/OCEANSKOBE.2008.4531073

On the Relationship between Transmission Power and Capacity of an Underwater Acoustic Communication Channel

Authors: Daniel E. Lucani, Milica Stojanovic, Muriel Médard

Abstract: The underwater acoustic channel is characterized by a path loss that depends not only on the transmission distance, but also on the signal frequency. As a consequence, transmission bandwidth depends on the transmission distance, a feature that distinguishes an underwater acoustic system from a terrestrial radio system. The exact relationship between power, transmission band, distance and capacity for the Gaussian noise scenario is a complicated one. This work provides a closed-form approximate model for 1) power consumption, 2) band-edge frequency and 3) bandwidth as functions of distance and capacity required for a data link. This approximate model is obtained by numerical evaluation of analytical results which takes into account physical models of acoustic propagation loss and ambient noise. The closed-form approximations may become useful tools in the design and analysis of underwater acoustic networks.

Submitted 2 January, 2008; originally announced January 2008.

Comments: 6 pages, 9 Figures, Awaiting acceptance to IEEE Oceans 08 (Conference), Kobe, Japan

Showing 1–38 of 38 results for author: Lucani, D