Utilizing Transaction Prioritization to Enhance Confirmation Speed in the IOTA Network

Authors: Seyyed Ali Aghamiri, Reza Sharifnia, Ahmad Khonsari

Abstract: With the rapid advancement of blockchain technology, a significant trend is the adoption of Directed Acyclic Graphs (DAGs) as an alternative to traditional chain-based architectures for organizing ledger records. Systems like IOTA, which are specifically designed for the Internet of Things (IoT), leverage DAG-based architectures to achieve greater scalability by enabling multiple attachment points in the ledger for new transactions while allowing these transactions to be added to the network without incurring any fees. To determine these attachment points, many tip selection algorithms employ specific strategies on the DAG ledger. However, the IOTA network does not consider transaction prioritization, which becomes especially important when network bandwidth is limited. In this paper, we propose an optimization framework designed to integrate a priority level for critical or high-priority IoT transactions within the IOTA network. We evaluate our system using an implementation based entirely on the official IOTA GitHub repository, which employs the currently operational IOTA node software (Hornet version) as part of the Chrysalis update (1.5). The experimental results show that higher-priority transactions in the proposed algorithm reach final confirmation in less time compared to the original IOTA system.

Submitted 14 February, 2025; v1 submitted 28 January, 2025; originally announced January 2025.

Comments: 5 pages

arXiv:2403.00725 [pdf, ps, other]

Cost-Effective Activity Control of Asymptomatic Carriers in Layered Temporal Social Networks

Authors: Masoumeh Moradian, Aresh Dadlani, Rasul Kairgeldin, Ahmad Khonsari

Abstract: The robustness of human social networks against epidemic propagation relies on the propensity for physical contact adaptation. During the early phase of infection, asymptomatic carriers exhibit the same activity level as susceptible individuals, which presents challenges for incorporating control measures in epidemic projection models. This paper focuses on modeling and cost-efficient activity control of susceptible and carrier individuals in the context of the susceptible-carrier-infected-removed (SCIR) epidemic model over a two-layer contact network. In this model, individuals switch from a static contact layer to create new links in a temporal layer based on state-dependent activation rates. We derive conditions for the infection to die out or persist in a homogeneous network. Considering the significant costs associated with reducing the activity of susceptible and carrier individuals, we formulate an optimization problem to minimize the disease decay rate while constrained by a limited budget. We propose the use of successive geometric programming (SGP) approximation for this optimization task. Through simulation experiments on Poisson random graphs, we assess the impact of different parameters on disease prevalence. The results demonstrate that our SGP framework achieves a cost reduction of nearly 33% compared to conventional methods based on degree and closeness centrality.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2401.07288 [pdf, other]

Hybrid Coded-Uncoded Caching in Multi-Access Networks with Non-uniform Demands

Authors: Abdollah Ghaffari Sheshjavani, Ahmad Khonsari, Masoumeh Moradian, Seyed Pooya Shariatpanahi, Seyedeh Bahereh Hassanpour

Abstract: To address the massive growth of data traffic over cellular networks, increasing spatial reuse of the frequency spectrum by the deployment of small base stations (SBSs) has been considered. For rapid deployment of SBSs in the networks, caching popular content along with new coded caching schemes has been proposed. To maximize the cellular network's capacity, densifying it with small base stations is inevitable. In ultra-dense cellular networks, coverage of SBSs may overlap. For this reason, the multi-access caching system, where users can potentially access multiple cache nodes simultaneously, has attracted more attention in recent years. Most previous works on multi-access coded caching consider only specific conditions such as cyclic wrap-around network topologies. In this paper, we investigate caching in ultra-dense cellular networks, where different users can access different numbers of caches under non-uniform content popularity distribution, and propose Multi-Access Hybrid coded-uncoded Caching (MAHC). We formulate the optimization problem of the proposed scheme for general network topologies and evaluate it for 2-SBS network scenarios. The numerical and simulation results show that the proposed MAHC scheme outperforms optimal conventional uncoded and previous multi-access coded caching (MACC) schemes.

Submitted 14 January, 2024; originally announced January 2024.

Comments: 10 pages

arXiv:2401.01424 [pdf, ps, other]

Age-Aware Dynamic Frame Slotted ALOHA for Machine-Type Communications

Authors: Masoumeh Moradian, Aresh Dadlani, Ahmad Khonsari, Hina Tabassum

Abstract: Information aging has gained prominence in characterizing communication protocols for timely remote estimation and control applications. This work proposes an Age of Information (AoI)-aware threshold-based dynamic frame slotted ALOHA (T-DFSA) for contention resolution in random access machine-type communication networks. Unlike conventional DFSA that maximizes the throughput in each frame, the frame length and age-gain threshold in T-DFSA are determined to minimize the normalized average AoI reduction of the network in each frame. At the start of each frame in the proposed protocol, the common Access Point (AP) stores an estimate of the age-gain distribution of a typical node. Depending on the observed status of the slots, age-gains of successful nodes, and maximum available AoI, the AP adjusts its estimation in each frame. The maximum available AoI is exploited to derive the maximum possible age-gain at each frame and thus, to avoid overestimating the age-gain threshold, which may render T-DFSA unstable. Numerical results validate our theoretical analysis and demonstrate the effectiveness of the proposed T-DFSA compared to the existing optimal frame slotted ALOHA, threshold-ALOHA, and age-based thinning protocols in a considerable range of update generation rates.

Submitted 2 January, 2024; originally announced January 2024.
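
As background for the "conventional DFSA that maximizes the throughput in each frame" mentioned in the abstract above, the following is a minimal sketch of the textbook frame slotted ALOHA calculation. It is not the T-DFSA protocol from the paper. It assumes that each of `num_nodes` contending nodes picks one slot of the frame uniformly at random, and the function names and the `max_length` search bound are hypothetical choices made for illustration.

```python
def slot_efficiency(num_nodes: int, frame_length: int) -> float:
    """Expected number of successful transmissions per slot.

    Assumption: each node independently picks one of `frame_length` slots
    uniformly at random, and a slot succeeds only if exactly one node picks it.
    """
    p = 1.0 / frame_length
    # A given node succeeds when none of the other nodes chose its slot.
    expected_successes = num_nodes * (1.0 - p) ** (num_nodes - 1)
    return expected_successes / frame_length


def best_frame_length(num_nodes: int, max_length: int = 256) -> int:
    """Frame length in 1..max_length that maximizes per-slot efficiency."""
    return max(
        range(1, max_length + 1),
        key=lambda length: slot_efficiency(num_nodes, length),
    )


if __name__ == "__main__":
    for n in (2, 10, 50):
        length = best_frame_length(n)
        print(n, length, round(slot_efficiency(n, length), 4))
```

Under this model the per-slot efficiency peaks when the frame length equals the number of contending nodes, which is why throughput-oriented schemes try to track the backlog size from frame to frame.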

arXiv:2208.01619 [pdf, ps, other]

On the Age of Status Updates in Unreliable Multi-Source M/G/1 Queueing Systems

Authors: Muthukrishnan Senthil Kumar, Aresh Dadlani, Masoumeh Moradian, Ahmad Khonsari, Theodoros A. Tsiftsis

Abstract: The timeliness of status message delivery in communications networks is subject to time-varying wireless channel transmissions. In this paper, we investigate the age of information (AoI) of each source in a multi-source M/G/1 queueing update system with active server failures. In particular, we adopt the method of supplementary variables to derive a closed-form expression for the average AoI in terms of system parameters, where the server repair time follows a general distribution and the service time of packets generated by independent sources is a general random variable. Numerical results are provided to validate the effectiveness of the proposed packet serving policy under different parametric settings.

Submitted 31 October, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

Comments: 5 pages, 6 figures, journal
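
To make the "average AoI" quantity in the abstract above concrete, here is a small sketch that computes the time-average age from a recorded sample path. It uses only the standard definition of AoI (current time minus the generation time of the freshest delivered update) and is not the closed-form expression derived in the paper. The function name, the `(generation_time, delivery_time)` tuple format, and the `initial_age` parameter are assumptions made for this illustration.

```python
def average_aoi(updates, horizon, initial_age=0.0):
    """Time-average age of information over [0, horizon].

    `updates` is a list of (generation_time, delivery_time) pairs sorted by
    delivery time (hypothetical input format). Between deliveries the age
    grows with slope 1; at a delivery it drops to delivery - generation,
    provided that update is fresher than what the monitor already holds.
    """
    area = 0.0
    t_prev = 0.0
    age = initial_age
    for generated, delivered in updates:
        if delivered > horizon:
            break
        dt = delivered - t_prev
        # Area under a line segment that starts at `age` and rises with slope 1.
        area += age * dt + 0.5 * dt * dt
        age = min(age + dt, delivered - generated)
        t_prev = delivered
    dt = horizon - t_prev
    area += age * dt + 0.5 * dt * dt
    return area / horizon


if __name__ == "__main__":
    # Two updates, each delivered one time unit after it was generated.
    print(average_aoi([(0.0, 1.0), (2.0, 3.0)], horizon=4.0))
```

A sample-path estimate like this is one way to sanity-check an analytical AoI formula against simulation output.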

arXiv:2208.00297 [pdf, other]

doi 10.1016/j.comnet.2023.109654

Privacy-Preserving Edge Caching: A Probabilistic Approach

Authors: Seyedeh Bahereh Hassanpour, Ahmad Khonsari, Masoumeh Moradian, Seyed Pooya Shariatpanahi

Abstract: Edge caching (EC) decreases the average access delay of the end-users by caching popular content at the edge network; however, it increases the leakage probability of valuable information such as users' preferences. Most of the existing privacy-preserving approaches focus on adding layers of encryption, which confronts the network with more challenges such as energy and computation limitations. We employ a chunk-based joint probabilistic caching (JPC) approach to mislead an adversary eavesdropping on the communication inside an EC and maximize the adversary's error in estimating the requested file and requesting cache. In JPC, we optimize the probability of each cache placement to minimize the communication cost while guaranteeing the desired privacy, and then formulate the optimization problem as a linear programming (LP) problem. Since JPC inherits the curse of dimensionality, we also propose scalable JPC (SPC), which reduces the number of feasible cache placements by dividing files into non-overlapping subsets. We also compare the JPC and SPC approaches against an existing probabilistic method, referred to as disjoint probabilistic caching (DPC), and a random dummy-based approach (RDA). Results obtained through extensive numerical evaluations confirm the validity of the analytical approach and the superiority of JPC and SPC over DPC and RDA.

Submitted 30 July, 2022; originally announced August 2022.

arXiv:2108.01292 [pdf, ps, other]

Scaling Power Management in Cloud Data Centers: A Multi-Level Continuous-Time MDP Approach

Authors: Behzad Chitsaz, Ahmad Khonsari, Masoumeh Moradian, Aresh Dadlani, Mohammad Sadegh Talebi

Abstract: Power management in multi-server data centers, especially at scale, is a vital issue of increasing importance in the cloud computing paradigm. Existing studies mostly consider thresholds on the number of idle servers to switch the servers on or off and suffer from scalability issues. As a natural approach in view of the Markovian assumption, we present a multi-level continuous-time Markov decision process (CTMDP) model based on state aggregation of multi-server data centers with setup times that overcomes the inherent intractability of traditional MDP approaches due to their colossal state-action space. The beauty of the presented model is that, while it remains faithful to the Markovian behavior, it approximates the calculation of the transition probabilities in a way that keeps the accuracy of the results at a desirable level. Moreover, near-optimal performance is attained at the expense of the increased state-space dimensionality by tuning the number of levels in the multi-level approach. The simulation results are promising and confirm that in many scenarios of interest, the proposed approach attains noticeable improvements, namely a near 50% reduction in the size of CTMDP while yielding better rewards as compared to existing fixed threshold-based policies and aggregation methods.

Submitted 19 July, 2023; v1 submitted 3 August, 2021; originally announced August 2021.

Comments: 12 pages, 7 figures

arXiv:2107.13437 [pdf, ps, other]

Co-evolution of Viral Processes and Structural Stability in Signed Social Networks

Authors: Temirlan Kalimzhanov, Amir Haji Ali Khamseh'i, Aresh Dadlani, Muthukrishnan Senthil Kumar, Ahmad Khonsari

Abstract: Prediction and control of spreading processes in social networks (SNs) are closely tied to the underlying connectivity patterns. Contrary to most existing efforts that exclusively focus on positive social user interactions, the impact of contagion processes on the temporal evolution of signed SNs (SSNs) with distinctive friendly (positive) and hostile (negative) relationships remains largely unexplored. In this paper, we study the interplay between social link polarity and propagation of viral phenomena coupled with user alertness. In particular, we propose a novel energy model built on Heider's balance theory that relates the stochastic susceptible-alert-infected-susceptible epidemic dynamical model with the structural balance of SSNs to substantiate the trade-off between social tension and epidemic spread. Moreover, the role of hostile social links in the formation of disjoint friendly clusters of alerted and infected users is analyzed. Using three real-world SSN datasets, we further present a time-efficient algorithm to expedite the energy computation in our Monte-Carlo simulation method and show compelling insights on the effectiveness and rationality of user awareness and initial network settings in reaching structurally balanced local and global network energy states.

Submitted 26 July, 2022; v1 submitted 28 July, 2021; originally announced July 2021.

Comments: 6 pages, 9 figures, under review

arXiv:2105.03220 [pdf, other]

doi 10.1016/j.comnet.2021.108454

Content Caching for Shared Medium Networks Under Heterogeneous Users' Behaviours

Authors: Abdollah Ghaffari Sheshjavani, Ahmad Khonsari, Seyed Pooya Shariatpanahi, Masoumeh Moradian

Abstract: Content caching is a widely studied technique aimed at reducing the network load imposed by data transmission during peak time while ensuring users' quality of experience. It has been shown that when there is a common link between caches and the server, delivering contents via the coded caching scheme can significantly improve performance over conventional caching. However, finding the optimal content placement is a challenge in the case of heterogeneous users' behaviours. In this paper we consider a heterogeneous number of demands and non-uniform content popularity distribution in the case of homogeneous and heterogeneous user preferences. We propose a hybrid coded-uncoded caching scheme to trade off between popularity and diversity. We derive explicit closed-form expressions of the server load for the proposed hybrid scheme and formulate the corresponding optimization problem. Results show that the proposed hybrid caching scheme can reduce the server load significantly and outperforms the baseline pure coded and pure uncoded schemes, as well as previous works in the literature, for both homogeneous and heterogeneous user preferences.

Submitted 7 May, 2021; originally announced May 2021.

Comments: 12 pages

arXiv:2010.05197 [pdf, other]

doi 10.1109/ISCAS45731.2020.9181001

TaxoNN: A Light-Weight Accelerator for Deep Neural Network Training

Authors: Reza Hojabr, Kamyar Givaki, Kossar Pourahmadi, Parsa Nooralinejad, Ahmad Khonsari, Dara Rahmati, M. Hassan Najafi

Abstract: Emerging intelligent embedded devices rely on Deep Neural Networks (DNNs) to be able to interact with the real-world environment. This interaction comes with the ability to retrain DNNs, since environmental conditions change continuously in time. Stochastic Gradient Descent (SGD) is a widely used algorithm to train DNNs by optimizing the parameters over the training data iteratively. In this work, we first present a novel approach to add the training ability to a baseline DNN accelerator (inference only) by splitting the SGD algorithm into simple computational elements. Then, based on this heuristic approach, we propose TaxoNN, a light-weight accelerator for DNN training. TaxoNN can easily tune the DNN weights by reusing the hardware resources used in the inference process using a time-multiplexing approach and low-bitwidth units. Our experimental results show that TaxoNN delivers, on average, a 0.97% higher misclassification rate compared to a full-precision implementation. Moreover, TaxoNN provides 2.1$\times$ power saving and 1.65$\times$ area reduction over the state-of-the-art DNN training accelerator.

Submitted 11 October, 2020; originally announced October 2020.

Comments: Accepted to ISCAS 2020. 5 pages, 5 figures

Journal ref: 2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020, pp. 1-5

arXiv:2003.14348 [pdf, ps, other]

UniformAugment: A Search-free Probabilistic Data Augmentation Approach

Authors: Tom Ching LingChen, Ava Khonsari, Amirreza Lashkari, Mina Rafi Nazari, Jaspreet Singh Sambee, Mario A. Nascimento

Abstract: Augmenting training datasets has been shown to improve the learning effectiveness for several computer vision tasks. A good augmentation produces an augmented dataset that adds variability while retaining the statistical properties of the original dataset. Some techniques, such as AutoAugment and Fast AutoAugment, have introduced a search phase to find a set of suitable augmentation policies for a given model and dataset. This comes at the cost of great computational overhead, adding up to several thousand GPU hours. More recently, RandAugment was proposed to substantially speed up the search phase by approximating the search space by a couple of hyperparameters, but it still incurs a non-negligible cost for tuning those. In this paper we show that, under the assumption that the augmentation space is approximately distribution invariant, a uniform sampling over the continuous space of augmentation transformations is sufficient to train highly effective models. Based on that result we propose UniformAugment, an automated data augmentation approach that completely avoids a search phase. In addition to discussing the theoretical underpinning supporting our approach, we also use the standard datasets, as well as established models for image classification, to show that UniformAugment's effectiveness is comparable to the aforementioned methods, while still being highly efficient by virtue of not requiring any search.

Submitted 31 March, 2020; originally announced March 2020.
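
The search-free idea in the UniformAugment abstract above, uniform sampling over the space of augmentation transformations, can be sketched in a few lines. This is an illustrative toy and not the authors' implementation: the three transformations act on a plain list of numbers instead of images, and `num_ops=2`, the 0.5 application probability, and all function names are assumptions introduced here.

```python
import random


# Toy transformations on a 1-D signal; each takes a magnitude in [0, 1].
def shift(signal, magnitude):
    return [value + magnitude for value in signal]


def scale(signal, magnitude):
    return [value * (1.0 + magnitude) for value in signal]


def reverse(signal, magnitude):
    # The magnitude is ignored for this transformation.
    return signal[::-1]


OPS = [shift, scale, reverse]


def uniform_augment(signal, num_ops=2, rng=None):
    """Apply up to `num_ops` transformations chosen with no search phase.

    Both the transformation and its magnitude are drawn uniformly at random;
    each drawn transformation is applied with probability 0.5 (an assumed
    value for this sketch).
    """
    if rng is None:
        rng = random.Random()
    result = list(signal)
    for _ in range(num_ops):
        op = rng.choice(OPS)
        magnitude = rng.uniform(0.0, 1.0)
        if rng.random() < 0.5:
            result = op(result, magnitude)
    return result


if __name__ == "__main__":
    print(uniform_augment([1.0, 2.0, 3.0], rng=random.Random(0)))
```

Because nothing is tuned per model or dataset, the only cost is the sampling itself, which is the efficiency argument the abstract makes against search-based policies.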

arXiv:2001.00053 [pdf, other]

On the Resilience of Deep Learning for Reduced-voltage FPGAs

Authors: Kamyar Givaki, Behzad Salami, Reza Hojabr, S. M. Reza Tayaranian, Ahmad Khonsari, Dara Rahmati, Saeid Gorgin, Adrian Cristal, Osman S. Unsal

Abstract: Deep Neural Networks (DNNs) are inherently computation-intensive and also power-hungry. Hardware accelerators such as Field Programmable Gate Arrays (FPGAs) are a promising solution that can satisfy these requirements for both embedded and High-Performance Computing (HPC) systems. In FPGAs, as well as CPUs and GPUs, aggressive voltage scaling below the nominal level is an effective technique for power dissipation minimization. Unfortunately, bit-flip faults start to appear as the voltage is scaled down closer to the transistor threshold due to timing issues, thus creating a resilience issue. This paper experimentally evaluates the resilience of the training phase of DNNs in the presence of voltage underscaling related faults of FPGAs, especially in on-chip memories. Toward this goal, we have experimentally evaluated the resilience of LeNet-5 and also a specially designed network for the CIFAR-10 dataset with different activation functions of Rectified Linear Unit (ReLU) and Hyperbolic Tangent (Tanh). We have found that modern FPGAs are robust enough in extremely low-voltage levels and that low-voltage related faults can be automatically masked within the training iterations, so there is no need for costly software- or hardware-oriented fault mitigation techniques like ECC. Approximately 10% more training iterations are needed to fill the gap in the accuracy. This observation is the result of the relatively low rate of undervolting faults, i.e., <0.1%, measured on real FPGA fabrics. We have also increased the fault rate significantly for the LeNet-5 network by randomly generated fault injection campaigns and observed that the training accuracy starts to degrade. When the fault rate increases, the network with Tanh activation function outperforms the one with ReLU in terms of accuracy, e.g., when the fault rate is 30% the accuracy difference is 4.92%.

Submitted 26 December, 2019; originally announced January 2020.

arXiv:1911.01296 [pdf, other]

Serverless Computing: A Survey of Opportunities, Challenges and Applications

Authors: Hossein Shafiei, Ahmad Khonsari, Payam Mousavi

Abstract: The topic of serverless computing has proved to be a controversial subject within both academic and industrial communities. Many have praised the approach as a platform for a new era of computing, and some have argued that it is in fact a step backward. However, both sides agree that there exist challenges that must be addressed in order to better utilize its potential. This paper surveys existing challenges toward vast adoption of serverless services and also explores some of the challenges that have not been thoroughly discussed in the previous studies. Each challenge is discussed thoroughly and a number of possible directions for future studies are proposed. Moreover, the paper reviews some of the unique opportunities and potential that serverless computing presents.

Submitted 4 June, 2021; v1 submitted 4 November, 2019; originally announced November 2019.

Comments: 27 pages, 3 figures

arXiv:1801.00840 [pdf]

High Performance Architecture for Flow-Table Lookup in SDN on FPGA

Authors: Rashid Hatamia, Hossein Bahramgiria, Ahmad Khonsari

Abstract: We propose Range-based Ternary Search Tree (RTST), a tree-based approach for flow-table lookup in SDN networks. RTST builds upon flow-tables in SDN switches to provide a fast lookup among flows. We present a parallel multi-pipeline architecture for implementing RTST that benefits from high throughput and low latency. The proposed RTST and architecture achieve a memory efficiency of 1 byte of memory for each byte of flow. We also present a set of techniques to support dynamic updates. Experimental results show that RTST can be used to improve the performance of flow-lookup. It achieves a throughput of 670 Million Packets Per Second (MPPS), for a 1 K 15-tuple flow-table, on a state-of-the-art FPGA.

Submitted 2 January, 2018; originally announced January 2018.

arXiv:1606.00637 [pdf, other]

Maximum-Quality Tree Construction for Deadline-Constrained Aggregation in WSNs

Authors: Bahram Alinia, Mohammad H. Hajiesmaili, Ahmad Khonsari, Noel Crespi

Abstract: In deadline-constrained wireless sensor networks (WSNs), quality of aggregation (QoA) is determined by the number of participating nodes in the data aggregation process. Previous studies have attempted to propose optimal scheduling algorithms to obtain the maximum QoA assuming a fixed underlying aggregation tree. However, there exists no prior work to address the issue of constructing an optimal aggregation tree in deadline-constrained WSNs. The structure of the underlying aggregation tree is important since our analysis demonstrates that the ratio between the maximum achievable QoAs of different trees could be as large as O(2^D), where D is the deadline. This paper casts a combinatorial optimization problem to address optimal tree construction for deadline-constrained data aggregation in WSNs. While the problem is proved to be NP-hard, we employ the recently proposed Markov approximation framework and devise two distributed algorithms with different computation overheads to find close-to-optimal solutions with bounded approximation gap. To further improve the convergence of the proposed Markov-based algorithms, we devise another initial tree construction algorithm with low computational complexity. Our extensive experiments for a set of randomly generated scenarios demonstrate that the proposed algorithms outperform the existing alternative methods by obtaining better quality of aggregation.

Submitted 2 June, 2016; originally announced June 2016.

Comments: 31 pages. arXiv admin note: substantial text overlap with arXiv:1405.0597

arXiv:1512.04106 [pdf, other]

A Fair Admission Control Mechanism for Efficient Utilization of Resources in On-chip Nanophotonic Crossbars

Authors: Seyed Hessam Mirsadeghi, Ahmad Khonsari, Mohammad Sadegh Talebi, Behnam Khodabandeloo

Abstract: Advances in CMOS-compatible photonic elements have made it plausible to exploit nanophotonic communications to overcome the limitations of traditional NoCs. Amongst various proposed nanophotonic architectures, optical crossbars have been shown to provide high performance in terms of bandwidth and latency. In general, optical crossbars provide a vast volume of network resources that are shared among all the cores within the chip. In this paper, we present a fair and efficient admission control mechanism for shared wavelengths and buffer space in optical crossbars. We model buffer management and wavelength assignment as a utility-based convex optimization problem, whose solution determines the admission control policy. Thanks to efficient convex optimization techniques, we obtain the globally optimal solution of the admission control optimization problem by using simple yet efficient iterative algorithms. We cast our solution procedure as an iterative algorithm to be implemented by a central admission controller. Our experimental results corroborate the gain that can be obtained by using such an admission controller to manage the shared resources of the system. Furthermore, they confirm that the proposed admission control algorithm works well for various traffic patterns and parameters, and exhibits tractable scalability as the number of cores of the crossbar increases.

Submitted 22 September, 2016; v1 submitted 13 December, 2015; originally announced December 2015.

Comments: submitted

arXiv:1509.03374 [pdf, other]

Utility-Optimal Dynamic Rate Allocation under Average End-to-End Delay Requirements

Authors: Mohammad H. Hajiesmaili, Mohammad Sadegh Talebi, Ahmad Khonsari

Abstract: QoS-aware networking applications such as real-time streaming and video surveillance systems require nearly fixed average end-to-end delay over long periods to communicate efficiently, although they may tolerate some delay variations in short periods. This variability exhibits complex dynamics that make rate control of such applications a formidable task. This paper addresses rate allocation for heterogeneous QoS-aware applications that preserves the long-term end-to-end delay constraint while, similar to Dynamic Network Utility Maximization (DNUM), striving to achieve the maximum network utility aggregated over a fixed time interval. Since capturing temporal dynamics in QoS requirements of sources is allowed in our system model, we incorporate a novel time-coupling constraint in which delay-sensitivity of sources is considered such that a certain end-to-end average delay for each source over a pre-specified time interval is satisfied. We propose the DA-DNUM algorithm, a dual-based solution, which allocates source rates for the next time interval in a distributed fashion, given the knowledge of network parameters in advance. Through numerical experiments, we show that DA-DNUM gains higher average link utilization and a wider range of feasible scenarios in comparison with the best, to our knowledge, rate control schemes that may guarantee such constraints on delay.

Submitted 30 October, 2015; v1 submitted 10 September, 2015; originally announced September 2015.

arXiv:1506.03551 [pdf, other]

On the Feasibility of Wireless Interconnects for High-throughput Data Centers

Authors: Ahmad Khonsari, Seyed Pooya Shariatpanahi, Abolfazl Diyanat, Hossein Shafiei

Abstract: Data Centers (DCs) are required to be scalable to large data sets so as to accommodate ever increasing demands of resource-limited embedded and mobile devices. Thanks to the availability of recent high data rate millimeter-wave frequency spectrum such as 60GHz and due to the favorable attributes of this technology, wireless DC (WDC) exhibits the potentials of being a promising solution especially for small to medium scale DCs. This paper investigates the problem of throughput scalability of WDCs using the established theory of the asymptotic throughput of wireless multi-hop networks that are primarily proposed for homogeneous traffic conditions. The rate-heterogeneous traffic distribution of a data center however, requires the asymptotic heterogeneous throughput knowledge of a wireless network in order to study the performance and feasibility of WDCs for practical purposes. To answer these questions this paper presents a lower bound for the throughput scalability of a multi-hop rate-heterogeneous network when traffic generation rates of all nodes are similar, except one node. We demonstrate that the throughput scalability of conventional multi-hopping and the spatial reuse of the above bi-rate network is inefficient and henceforth develop a speculative 2-partitioning scheme that improves the network throughput scaling potentials. A better lower bound of the throughput is then obtained. Finally, we obtain the throughput scaling of an i.i.d. rate-heterogeneous network and obtain its lower bound. Again we propose a speculative 2-partitioning scheme to achieve a network with higher throughput in terms of improved lower bound. All of the obtained results have been verified using simulation experiments.

Submitted 11 June, 2015; originally announced June 2015.

arXiv:1405.0597

On the Construction of Maximum-Quality Aggregation Trees in Deadline-Constrained WSNs

Authors: Bahram Alinia, Mohammad H. Hajiesmaili, Ahmad Khonsari

Abstract: In deadline-constrained data aggregation in wireless sensor networks (WSNs), the imposed sink deadline along with the interference constraint hinders participation of all sensor nodes in data aggregation. Thus, exploiting the wisdom of the crowd paradigm, the total number of participant nodes in data aggregation determines the quality of aggregation ($QoA$). Although the previous studies have proposed optimal algorithms to maximize $QoA$ under an imposed deadline and a given aggregation tree, there is no work on constructing optimal tree in this context. In this paper, we cast an optimization problem to address optimal tree construction for deadline-constrained data aggregation in WSNs. We demonstrate that the ratio between the maximum achievable $QoA$s of the optimal and the worst aggregation trees is as large as $O(2^D)$, where $D$ is the sink deadline and thus makes devising efficient solution of the problem an issue of paramount value. However, the problem is challenging to solve since we prove that it is NP-hard. We apply the recently-proposed Markov approximation framework to devise two distributed algorithms with different computation overheads that converge to a bounded neighborhood of the optimal solution. Extensive simulations in a set of representative randomly-generated scenarios show that the proposed algorithms significantly improve $QoA$ by %101 and %93 in average compared to the best, to our knowledge, existing alternative methods.

Submitted 31 July, 2014; v1 submitted 3 May, 2014; originally announced May 2014.

Comments: This paper has been withdrawn by the author due to a crucial sign error in equation 1

arXiv:1302.1506 [pdf, other]

Rate-Privacy in Wireless Sensor Networks

Authors: H. Shafiei, A. Khonsari, H. Derakhshi, P. Mousavi

Abstract: This paper introduces the concept of rate privacy in the context of wireless sensor networks. Our discussion reveals that the concept indeed is of a great importance for the privacy preservation of such networks. As a result, we propose a buffering scheme to protect the rate from adversaries. Simulation results verify the applicability of our approach.

Submitted 6 February, 2013; originally announced February 2013.

arXiv:1208.2374 [pdf]

Dynamic Warp Resizing in High-Performance SIMT

Authors: Ahmad Lashgar, Amirali Baniasadi, Ahmad Khonsari

Abstract: Modern GPUs synchronize threads grouped in a warp at every instruction. These results in improving SIMD efficiency and makes sharing fetch and decode resources possible. The number of threads included in each warp (or warp size) affects divergence, synchronization overhead and the efficiency of memory access coalescing. Small warps reduce the performance penalty associated with branch and memory divergence at the expense of a reduction in memory coalescing. Large warps enhance memory coalescing significantly but also increase branch and memory divergence. Dynamic workload behavior, including branch/memory divergence and coalescing, is an important factor in determining the warp size returning best performance. Optimal warp size can vary from one workload to another or from one program phase to the next. Based on this observation, we propose Dynamic Warp Resizing (DWR). DWR takes innovative microarchitectural steps to adjust warp size during runtime and according to program characteristics. DWR outperforms static warp size decisions, up to 1.7X to 2.28X, while imposing less than 1% area overhead. We investigate various alternative configurations and show that DWR performs better for narrower SIMD and larger caches.

Submitted 3 November, 2012; v1 submitted 11 August, 2012; originally announced August 2012.

Comments: 9 pages, 5 Figures, 3 Lists, 1 Table, The extended version of ICCD 2012 poster paper

arXiv:1205.4967 [pdf]

Investigating Warp Size Impact in GPUs

Authors: Ahmad Lashgar, Amirali Baniasadi, Ahmad Khonsari

Abstract: There are a number of design decisions that impact a GPU's performance. Among such decisions deciding the right warp size can deeply influence the rest of the design. Small warps reduce the performance penalty associated with branch divergence at the expense of a reduction in memory coalescing. Large warps enhance memory coalescing significantly but also increase branch divergence. This leaves designers with two choices: use a small warps and invest in finding new solutions to enhance coalescing or use large warps and address branch divergence employing effective control-flow solutions. In this work our goal is to investigate the answer to this question. We analyze warp size impact on memory coalescing and branch divergence. We use our findings to study two machines: a GPU using small warps but equipped with excellent memory coalescing (SW+) and a GPU using large warps but employing an MIMD engine immune from control-flow costs (LW+). Our evaluations show that building coalescing-enhanced small warp GPUs is a better approach compared to pursuing a control-flow enhanced large warp GPU.

Submitted 22 May, 2012; originally announced May 2012.

Comments: 7 pages, 7 figures, 2 tables, Technical Report

arXiv:1109.6851

Content-Aware Rate Control for Video Transmission with Buffer Constraints in Multipath Networks

Authors: Mohammad Hassan Hajiesmaili, Ali Sehati, Ahmad Khonsari, Mohammad Sadegh Talebi

Abstract: Being an integral part of the network traffic, nowadays it's vital to design robust mechanisms to provide QoS for multimedia applications. The main goal of this paper is to provide an efficient solution to support content-aware video transmission mechanism with buffer underflow avoidance at the receiver in multipath networks. Towards this, we introduce a content-aware time-varying utility function, where the quality impacts of video content is incorporated into its definition. Using the proposed utility function, we formulate a multipath Dynamic Network Utility Maximization (DNUM) problem for the rate allocation of video streams, where it takes into account QoS demand of video streams in terms of buffer underflow avoidance. Finally, using primal-dual method, we propose a distributed solution that optimally allocates the shared bandwidth to video streams. The numerical examples demonstrate the efficacy of the proposed content-aware rate allocation algorithm for video sources in both single and multiple path network models.

Submitted 17 August, 2012; v1 submitted 30 September, 2011; originally announced September 2011.

Comments: This paper has been withdrawn by the last author since there is a minor change in the list of authors. In the new version, the last author is not included in the paper any longer

arXiv:1109.6809 [pdf, ps, other]

NUM-Based Rate Allocation for Streaming Traffic via Sequential Convex Programming

Authors: Ali Sehati, Mohammad Sadegh Talebi, Ahmad Khonsari

Abstract: In recent years, there has been an increasing demand for ubiquitous streaming like applications in data networks. In this paper, we concentrate on NUM-based rate allocation for streaming applications with the so-called S-curve utility functions. Due to non-concavity of such utility functions, the underlying NUM problem would be non-convex for which dual methods might become quite useless. To tackle the non-convex problem, using elementary techniques we make the utility of the network concave, however this results in reverse-convex constraints which make the problem non-convex. To deal with such a transformed NUM, we leverage Sequential Convex Programming (SCP) approach to approximate the non-convex problem by a series of convex ones. Based on this approach, we propose a distributed rate allocation algorithm and demonstrate that under mild conditions, it converges to a locally optimal solution of the original NUM. Numerical results validate the effectiveness, in terms of tractable convergence of the proposed rate allocation algorithm.

Submitted 30 September, 2011; originally announced September 2011.

Comments: 6 pages, conference submission

arXiv:1102.2604 [pdf, other]

Quasi-Optimal Network Utility Maximization for Scalable Video Streaming

Authors: Mohammad Sadegh Talebi, Ahmad Khonsari, Mohammad Hassan Hajiesmaili, Sina Jafarpour

Abstract: This paper addresses rate control for transmission of scalable video streams via Network Utility Maximization (NUM) formulation. Due to stringent QoS requirements of video streams and specific characterization of utility experienced by end-users, one has to solve nonconvex and even nonsmooth NUM formulation for such streams, where dual methods often prove incompetent. Convexification plays an important role in this work as it permits the use of existing dual methods to solve an approximate to the NUM problem iteratively and distributively. Hence, to tackle the nonsmoothness and nonconvexity, we aim at reformulating the NUM problem through approximation and transformation of the ideal discretely adaptive utility function for scalable video streams. The reformulated problem is shown to be a D.C. (Difference of Convex) problem. We leveraged Sequential Convex Programming (SCP) approach to replace the nonconvex D.C. problem by a sequence of convex problems that aim to approximate the original D.C. problem. We then solve each convex problem produced by SCP approach using existing dual methods. This procedure is the essence of two distributed iterative rate control algorithms proposed in this paper, for which one can show the convergence to a locally optimal point of the nonconvex D.C. problem and equivalently to a locally optimal point of an approximate to the original nonconvex problem. Our experimental results show that the proposed rate control algorithms converge with tractable convergence behavior.

Submitted 17 August, 2012; v1 submitted 13 February, 2011; originally announced February 2011.

Comments: This work has been submitted to the IEEE for possible publication

Showing 1–25 of 25 results for author: Khonsari, A