Showing 1–33 of 33 results for author: Nasir, M

Searching in archive cs.
  1. arXiv:2507.03558  [pdf, ps, other]

    cs.CV cs.AI

    An Advanced Deep Learning Framework for Ischemic and Hemorrhagic Brain Stroke Diagnosis Using Computed Tomography (CT) Images

    Authors: Md. Sabbir Hossen, Eshat Ahmed Shuvo, Shibbir Ahmed Arif, Pabon Shaha, Md. Saiduzzaman, Mostofa Kamal Nasir

    Abstract: Brain stroke is one of the leading causes of mortality and long-term disability worldwide, highlighting the need for precise and fast prediction techniques. Computed Tomography (CT) scan is considered one of the most effective methods for diagnosing brain strokes. The majority of stroke classification techniques rely on a single slice-level prediction mechanism, allowing the radiologist to manuall…

    Submitted 4 July, 2025; originally announced July 2025.

    Comments: Preprint version. Submitted for peer review

  2. arXiv:2506.06524  [pdf, ps, other]

    cs.AI cs.HC

    ScriptDoctor: Automatic Generation of PuzzleScript Games via Large Language Models and Tree Search

    Authors: Sam Earle, Ahmed Khalifa, Muhammad Umair Nasir, Zehua Jiang, Graham Todd, Andrzej Banburski-Fahey, Julian Togelius

    Abstract: There is much interest in using large pre-trained models in Automatic Game Design (AGD), whether via the generation of code, assets, or more abstract conceptualization of design ideas. But so far this interest largely stems from the ad hoc use of such generative models under persistent human supervision. Much work remains to show how these tools can be integrated into longer-time-horizon AGD pipel…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: 5 pages, 3 figures, 3 tables, submitted to IEEE Conference on Games as a Short Paper

  3. arXiv:2505.18645  [pdf]

    cs.AI

    Riverine Flood Prediction and Early Warning in Mountainous Regions using Artificial Intelligence

    Authors: Haleema Bibi, Sadia Saleem, Zakia Jalil, Muhammad Nasir, Tahani Alsubait

    Abstract: Flooding is the most devastating phenomenon occurring globally, particularly in mountainous regions, risk dramatically increases due to complex terrains and extreme climate changes. These situations are damaging livelihoods, agriculture, infrastructure, and human lives. This study uses the Kabul River between Pakistan and Afghanistan as a case study to reflect the complications of flood forecastin…

    Submitted 24 May, 2025; originally announced May 2025.

    Comments: 26 pages, 6 figures

  4. arXiv:2503.16536  [pdf, other]

    cs.CL

    Word2Minecraft: Generating 3D Game Levels through Large Language Models

    Authors: Shuo Huang, Muhammad Umair Nasir, Steven James, Julian Togelius

    Abstract: We present Word2Minecraft, a system that leverages large language models to generate playable game levels in Minecraft based on structured stories. The system transforms narrative elements-such as protagonist goals, antagonist challenges, and environmental settings-into game levels with both spatial and gameplay constraints. We introduce a flexible framework that allows for the customization of st…

    Submitted 18 March, 2025; originally announced March 2025.

  5. arXiv:2410.07765  [pdf, other]

    cs.CL cs.AI

    GameTraversalBenchmark: Evaluating Planning Abilities Of Large Language Models Through Traversing 2D Game Maps

    Authors: Muhammad Umair Nasir, Steven James, Julian Togelius

    Abstract: Large language models (LLMs) have recently demonstrated great success in generating and understanding natural language. While they have also shown potential beyond the domain of natural language, it remains an open question as to what extent and in which way these LLMs can plan. We investigate their planning capabilities by proposing GameTraversalBenchmark (GTB), a benchmark consisting of diverse…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Accepted at 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks

  6. arXiv:2408.06955  [pdf]

    cs.SE cs.HC

    Crowdsourcing: A Framework for Usability Evaluation

    Authors: Muhammad Nasir

    Abstract: Objective: This research explores using crowdsourcing for software usability evaluation. Background: Usability studies are essential for designing user-friendly software, but traditional methods are often costly and time-consuming. Crowdsourcing offers a quicker, cost-effective alternative for remote usability evaluation, though ensuring quality feedback remains a challenge. Method: A systemat…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: This thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computing at Riphah International University, Islamabad, Pakistan, 28th June 2022

  7. arXiv:2405.06686  [pdf, other]

    cs.CL cs.AI

    Word2World: Generating Stories and Worlds through Large Language Models

    Authors: Muhammad U. Nasir, Steven James, Julian Togelius

    Abstract: Large Language Models (LLMs) have proven their worth across a diverse spectrum of disciplines. LLMs have shown great potential in Procedural Content Generation (PCG) as well, but directly generating a level through a pre-trained LLM is still challenging. This work introduces Word2World, a system that enables LLMs to procedurally design playable games through stories, without any task-specific fine…

    Submitted 6 May, 2024; originally announced May 2024.

  8. arXiv:2306.01102  [pdf, other]

    cs.NE cs.AI cs.CL

    LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization

    Authors: Muhammad U. Nasir, Sam Earle, Christopher Cleghorn, Steven James, Julian Togelius

    Abstract: Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. Here, we propose using the coding abilities of LLMs to introduce meaningful variations to code defining neural networks. Meanwhile, Quality-Diversity (QD) algo…

    Submitted 12 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to The Genetic and Evolutionary Computation Conference 2024

  9. arXiv:2305.18243  [pdf, other]

    cs.CL cs.AI

    Practical PCG Through Large Language Models

    Authors: Muhammad U Nasir, Julian Togelius

    Abstract: Large Language Models (LLMs) have proven to be useful tools in various domains outside of the field of their inception, which was natural language processing. In this study, we provide practical directions on how to use LLMs to generate 2D-game rooms for an under-development game, named Metavoidal. Our technique can harness the power of GPT-3 by Human-in-the-loop fine-tuning which allows our metho…

    Submitted 2 July, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Published at 2023 IEEE Conference on Games

  10. arXiv:2304.04839  [pdf]

    cs.LG cs.CY

    MHfit: Mobile Health Data for Predicting Athletics Fitness Using Machine Learning

    Authors: Jonayet Miah, Muntasir Mamun, Md Minhazur Rahman, Md Ishtyaq Mahmud, Sabbir Ahmed, Md Hasan Bin Nasir

    Abstract: Mobile phones and other electronic gadgets or devices have aided in collecting data without the need for data entry. This paper will specifically focus on Mobile health data. Mobile health data use mobile devices to gather clinical health data and track patient vitals in real-time. Our study is aimed to give decisions for small or big sports teams on whether one athlete good fit or not for a parti…

    Submitted 26 April, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: 6, Accepted by 2nd International Seminar on Machine Learning, Optimization, and Data Science (ISMODE)

  11. arXiv:2302.05817  [pdf, other]

    cs.AI cs.CL cs.NE

    Level Generation Through Large Language Models

    Authors: Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius

    Abstract: Large Language Models (LLMs) are powerful tools, capable of leveraging their training on natural language to write stories, generate code, and answer questions. But can they generate functional video game levels? Game levels, with their complex functional constraints and spatial relationships in more than one dimension, are very different from the kinds of data an LLM typically sees during trainin…

    Submitted 1 June, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Journal ref: FDG 2023: Proceedings of the 18th International Conference on the Foundations of Digital Games

  12. arXiv:2302.01561  [pdf, other]

    cs.AI

    Hierarchically Composing Level Generators for the Creation of Complex Structures

    Authors: Michael Beukman, Manuel Fokam, Marcel Kruger, Guy Axelrod, Muhammad Nasir, Branden Ingram, Benjamin Rosman, Steven James

    Abstract: Procedural content generation (PCG) is a growing field, with numerous applications in the video game industry and great potential to help create better games at a fraction of the cost of manual creation. However, much of the work in PCG is focused on generating relatively straightforward levels in simple games, as it is challenging to design an optimisable objective function for complex settings.…

    Submitted 19 July, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Code is available at https://github.com/Michael-Beukman/MCHAMR. This work has been accepted to IEEE Transactions on Games, with copyright transferred to the IEEE

  13. arXiv:2211.11636  [pdf, other]

    cs.LG cs.AI cs.CV

    Dwelling Type Classification for Disaster Risk Assessment Using Satellite Imagery

    Authors: Md Nasir, Tina Sederholm, Anshu Sharma, Sundeep Reddy Mallu, Sumedh Ranjan Ghatage, Rahul Dodhia, Juan Lavista Ferres

    Abstract: Vulnerability and risk assessment of neighborhoods is essential for effective disaster preparedness. Existing traditional systems, due to dependency on time-consuming and cost-intensive field surveying, do not provide a scalable way to decipher warnings and assess the precise extent of the risk at a hyper-local level. In this work, machine learning was used to automate the process of identifying d…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted for presentation in AI+HADR workshop, Neurips 2022

  14. arXiv:2211.03279  [pdf, other]

    eess.AS cs.SD

    A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations

    Authors: Rimita Lahiri, Md Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: Vocal entrainment is a social adaptation mechanism in human interaction, knowledge of which can offer useful insights to an individual's cognitive-behavioral characteristics. We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. We use conformers (a combination of convolutional network and transformer) for capturing both short-term and long-term conversational…

    Submitted 6 November, 2022; originally announced November 2022.

  15. arXiv:2210.11442  [pdf, other]

    cs.AI cs.NE

    Augmentative Topology Agents For Open-Ended Learning

    Authors: Muhammad Umair Nasir, Michael Beukman, Steven James, Christopher Wesley Cleghorn

    Abstract: In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments. Unlike previous open-ended approaches that optimize agents using a fixed neural network topology, we hypothesize that generalization can be improved by allowing agents' controllers to become more complex as they encounter more difficult en…

    Submitted 11 October, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted to The Proceedings of Genetic and Evolutionary Computation Conference (GECCO) 2023

  16. arXiv:2205.08621  [pdf, other]

    cs.CL cs.AI

    Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation

    Authors: Muhammad Umair Nasir, Innocent Amos Mchechesi

    Abstract: Stemming from the limited availability of datasets and textual resources for low-resource languages such as isiZulu, there is a significant need to be able to harness knowledge from pre-trained models to improve low resource machine translation. Moreover, a lack of techniques to handle the complexities of morphologically rich languages has compounded the unequal development of translation models,…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022 Workshop MIA

  17. arXiv:2205.02022  [pdf, other]

    cs.CL

    A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation

    Authors: David Ifeoluwa Adelani, Jesujoba Oluwadara Alabi, Angela Fan, Julia Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P. Dossou, Chris Chinenye Emezue, Colin Leong, Michael Beukman, Shamsuddeen Hassan Muhammad, Guyo Dub Jarso, Oreen Yousuf, Andre Niyongabo Rubungo, Gilles Hacheme, Eric Peter Wairagala, Muhammad Umair Nasir, Benjamin Ayoade Ajibade, Tunde Oluwaseyi Ajayi, et al. (20 additional authors not shown)

    Abstract: Recent advances in the pre-training of language models leverage large-scale datasets to create multilingual models. However, low-resource languages are mostly left out in these datasets. This is primarily because many widely spoken languages are not well represented on the web and therefore excluded from the large-scale crawls used to create datasets. Furthermore, downstream users of these models…

    Submitted 22 August, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted to NAACL 2022 (added evaluation data for amh, kin, nya, sna, xho)

  18. Usability Inspection: Novice Crowd Inspectors versus Expert

    Authors: Muhammad Nasir, Naveed Ikram, Zakia Jalil

    Abstract: Objective: This research study aims to investigate the use of novice crowd inspectors for usability inspection with respect to time spent and the cost incurred. This study compares the results of the novice crowd usability inspection guided by a single expert's heuristic usability inspection (novice crowd usability inspection henceforth) with the expert heuristic usability inspection. Background:…

    Submitted 27 October, 2021; originally announced October 2021.

  19. arXiv:2106.06720  [pdf, other]

    cs.IR cs.CY

    BIOPAK Flasher: Epidemic disease monitoring and detection in Pakistan using text mining

    Authors: Muhammad Nasir, Maheen Bakhtyar, Junaid Baber, Sadia Lakho, Bilal Ahmed, Waheed Noor

    Abstract: Infectious disease outbreak has a significant impact on morbidity, mortality and can cause economic instability of many countries. As global trade is growing, goods and individuals are expected to travel across the border, an infected epidemic area carrier can pose a great danger to his hostile. If a disease outbreak is recognized promptly, then commercial products and travelers (traders/visitors)…

    Submitted 12 June, 2021; originally announced June 2021.

    Comments: Paper is accepted in SOFTA 2020

  20. arXiv:2104.11757  [pdf, ps, other]

    cs.CY

    Becoming Good at AI for Good

    Authors: Meghana Kshirsagar, Caleb Robinson, Siyu Yang, Shahrzad Gholami, Ivan Klyuzhin, Sumit Mukherjee, Md Nasir, Anthony Ortiz, Felipe Oviedo, Darren Tanner, Anusua Trivedi, Yixi Xu, Ming Zhong, Bistra Dilkina, Rahul Dodhia, Juan M. Lavista Ferres

    Abstract: AI for good (AI4G) projects involve developing and applying artificial intelligence (AI) based solutions to further goals in areas such as sustainability, health, humanitarian aid, and social justice. Developing and deploying such solutions must be done in collaboration with partners who are experts in the domain in question and who already have experience in making progress towards such goals. Ba…

    Submitted 3 May, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: Accepted to AIES-2021

  21. DistB-SDoIndustry: Enhancing Security in Industry 4.0 Services based on Distributed Blockchain through Software Defined Networking-IoT Enabled Architecture

    Authors: Anichur Rahman, Umme Sara, Dipanjali Kundu, Saiful Islam, Md. Jahidul Islam, Mahedi Hasan, Ziaur Rahman, Mostofa Kamal Nasir

    Abstract: The concept of Industry 4.0 is a newly emerging focus of research throughout the world. However, it has lots of challenges to control data, and it can be addressed with various technologies like Internet of Things (IoT), Big Data, Artificial Intelligence (AI), Software Defined Networking (SDN), and Blockchain (BC) for managing data securely. Further, the complexity of sensors, appliances, sensor n…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 8 Pages, 6 Figures

    ACM Class: J.7

    Journal ref: IJACSA, 11(9), 2020

  22. DistB-Condo: Distributed Blockchain-based IoT-SDN Model for Smart Condominium

    Authors: Anichur Rahman, Md. Jahidul Islam, Ziaur Rahman, Md. Mahfuz Reza, Adnan Anwar, M. A. Parvez Mahmud, Mostofa Kamal Nasir, Rafidah Md Noor

    Abstract: Condominium network refers to intra-organization networks, where smart buildings or apartments are connected and share resources over the network. Secured communication platform or channel has been highlighted as a key requirement for a reliable condominium which can be ensured by the utilization of the advanced techniques and platforms like Software-Defined Network (SDN), Network Function Virtual…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 17 Pages, 12 Tables, 17 Figures

    ACM Class: H.1.1

    Journal ref: IEEE Access, vol. 8, pp. 209594-209609, 2020

  23. arXiv:1904.06002  [pdf, other]

    cs.CL

    Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance

    Authors: Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan

    Abstract: Linguistic coordination is a well-established phenomenon in spoken conversations and often associated with positive social behaviors and outcomes. While there have been many attempts to measure lexical coordination or entrainment in literature, only a few have explored coordination in syntactic or semantic space. In this work, we attempt to combine these different aspects of coordination into a si…

    Submitted 11 April, 2019; originally announced April 2019.

  24. arXiv:1809.00394  [pdf, other]

    cs.DS

    Mining Frequent Patterns in Evolving Graphs

    Authors: Cigdem Aslay, Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Aristides Gionis

    Abstract: Given a labeled graph, the frequent-subgraph mining (FSM) problem asks to find all the $k$-vertex subgraphs that appear with frequency greater than a given threshold. FSM has numerous applications ranging from biology to network science, as it provides a compact summary of the characteristics of the graph. However, the task is challenging, even more so for evolving graphs due to the streaming natu…

    Submitted 10 September, 2018; v1 submitted 2 September, 2018; originally announced September 2018.

    Comments: 10 pages, accepted at CIKM 2018

  25. Towards an Unsupervised Entrainment Distance in Conversational Speech using Deep Neural Networks

    Authors: Md Nasir, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

    Abstract: Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics. Understanding how interlocutors tend to adapt to each other's speaking style through entrainment involves measuring a range of acoustic features and comparing those via multiple signal comparison methods. In this work, we present a turn-level distance measure obt…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: submitted to Interspeech 2018

  26. arXiv:1707.05272  [pdf, other]

    cs.SE

    Learn More, Pay Less! Lessons Learned from Applying the Wizard-of-Oz Technique for Exploring Mobile App Requirements

    Authors: Zahra Shakeri Hossein Abad, Shane D. V. Sims, Abdullah Cheema, Montasir B. Nasir, Payal Harisinghani

    Abstract: Mobile apps have exploded in popularity, encouraging developers to provide content to the massive user base of the main app stores. Although there exist automated techniques that can classify user comments into various topics with high levels of precision, recent studies have shown that the top apps in the app stores do not have customer ratings that directly correlate with the app's success. This…

    Submitted 17 July, 2017; originally announced July 2017.

    Comments: 8 pages, IEEE International Requirements Engineering Conference Workshops (REW'17)

  27. arXiv:1705.09073  [pdf, other]

    cs.DC

    Load Balancing for Skewed Streams on Heterogeneous Cluster

    Authors: Muhammad Anis Uddin Nasir, Hiroshi Horii, Marco Serafini, Nicolas Kourtellis, Rudy Raymond, Sarunas Girdzijauskas, Takayuki Osogami

    Abstract: Streaming applications frequently encounter skewed workloads and execute on heterogeneous clusters. Optimal resource utilization in such adverse conditions becomes a challenge, as it requires inferring the resource capacities and input distribution at run time. In this paper, we tackle the aforementioned challenges by modeling them as a load balancing problem. We propose a novel partitioning strat…

    Submitted 1 October, 2017; v1 submitted 25 May, 2017; originally announced May 2017.

    Comments: 12 pages, under submission

  28. Fully Dynamic Algorithm for Top-$k$ Densest Subgraphs

    Authors: Muhammad Anis Uddin Nasir, Aristides Gionis, Gianmarco De Francisci Morales, Sarunas Girdzijauskas

    Abstract: Given a large graph, the densest-subgraph problem asks to find a subgraph with maximum average degree. When considering the top-$k$ version of this problem, a naïve solution is to iteratively find the densest subgraph and remove it in each iteration. However, such a solution is impractical due to high processing cost. The problem is further complicated when dealing with dynamic graphs, since addin…

    Submitted 29 August, 2017; v1 submitted 19 October, 2016; originally announced October 2016.

    Comments: 10 pages, 8 figures, accepted at CIKM 2017

  29. arXiv:1605.00928   

    cs.DC

    Fault Tolerance for Stream Processing Engines

    Authors: Muhammad Anis Uddin Nasir

    Abstract: Distributed Stream Processing Engines (DSPEs) target applications related to continuous computation, online machine learning and real-time query processing. DSPEs operate on high volume of data by applying lightweight operations on real-time and continuous streams. Such systems require clusters of hundreds of machine for their deployment. Streaming applications come with various requirements, i.e.…

    Submitted 5 May, 2020; v1 submitted 3 May, 2016; originally announced May 2016.

    Comments: The survey is not complete and requires major updates

  30. arXiv:1510.07623  [pdf, other]

    cs.DC

    Partial Key Grouping: Load-Balanced Partitioning of Distributed Streams

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano, Nicolas Kourtellis, Marco Serafini

    Abstract: We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical "power of two choices" to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load b…

    Submitted 26 October, 2015; originally announced October 2015.

    Comments: 14 pages. arXiv admin note: substantial text overlap with arXiv:1504.00788

  31. arXiv:1510.05714  [pdf, other]

    cs.DC

    When Two Choices Are not Enough: Balancing at Scale in Distributed Stream Processing

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, Nicolas Kourtellis, Marco Serafini

    Abstract: Carefully balancing load in distributed stream processing systems has a fundamental impact on execution latency and throughput. Load balancing is challenging because real-world workloads are skewed: some tuples in the stream are associated to keys which are significantly more frequent than others. Skew is remarkably more problematic in large deployments: more workers implies fewer keys per worker,…

    Submitted 27 January, 2016; v1 submitted 19 October, 2015; originally announced October 2015.

    Comments: 12 pages, 14 Figures, this paper is accepted and will be published at ICDE 2016

  32. arXiv:1508.05591  [pdf, other]

    cs.DC cs.SI

    Socially-Aware Distributed Hash Tables for Decentralized Online Social Networks

    Authors: Muhammad Anis Uddin Nasir, Sarunas Girdzijauskas, Nicolas Kourtellis

    Abstract: Many decentralized online social networks (DOSNs) have been proposed due to an increase in awareness related to privacy and scalability issues in centralized social networks. Such decentralized networks transfer processing and storage functionalities from the service providers towards the end users. DOSNs require individualistic implementation for services, (i.e., search, information dissemination…

    Submitted 23 September, 2015; v1 submitted 23 August, 2015; originally announced August 2015.

    Comments: 10 pages, p2p 2015 conference

  33. The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

    Authors: Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David García-Soriano, Nicolas Kourtellis, Marco Serafini

    Abstract: We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical "power of two choices" to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load b…

    Submitted 3 April, 2015; originally announced April 2015.

    Comments: 31st IEEE International Conference on Data Engineering (ICDE), 2015