-
PlaceFM: A Training-free Geospatial Foundation Model of Places
Authors:
Mohammad Hashemi,
Hossein Amiri,
Andreas Zufle
Abstract:
Spatial structure is central to effective geospatial intelligence systems. While foundation models have shown promise, they often lack the flexibility to reason about places, which are context-rich regions spanning different spatial granularities. We propose PlaceFM, a spatial foundation model that captures place representations using a training-free graph condensation method. PlaceFM condenses a…
▽ More
Spatial structure is central to effective geospatial intelligence systems. While foundation models have shown promise, they often lack the flexibility to reason about places, which are context-rich regions spanning different spatial granularities. We propose PlaceFM, a spatial foundation model that captures place representations using a training-free graph condensation method. PlaceFM condenses a nationwide POI graph built from integrated Foursquare and OpenStreetMap data in the U.S., generating general-purpose embeddings of places. These embeddings can be seamlessly integrated into geolocation data pipelines to support a wide range of downstream tasks. Without requiring pretraining, PlaceFM offers a scalable and adaptable solution for multi-scale geospatial analysis.
△ Less
Submitted 25 June, 2025;
originally announced July 2025.
-
Privacy-Preserving Quantized Federated Learning with Diverse Precision
Authors:
Dang Qua Nguyen,
Morteza Hashemi,
Erik Perrins,
Sergiy A. Vorobyov,
David J. Love,
Taejoon Kim
Abstract:
Federated learning (FL) has emerged as a promising paradigm for distributed machine learning, enabling collaborative training of a global model across multiple local devices without requiring them to share raw data. Despite its advancements, FL is limited by factors such as: (i) privacy risks arising from the unprotected transmission of local model updates to the fusion center (FC) and (ii) decrea…
▽ More
Federated learning (FL) has emerged as a promising paradigm for distributed machine learning, enabling collaborative training of a global model across multiple local devices without requiring them to share raw data. Despite its advancements, FL is limited by factors such as: (i) privacy risks arising from the unprotected transmission of local model updates to the fusion center (FC) and (ii) decreased learning utility caused by heterogeneity in model quantization resolution across participating devices. Prior work typically addresses only one of these challenges because maintaining learning utility under both privacy risks and quantization heterogeneity is a non-trivial task. In this paper, our aim is therefore to improve the learning utility of a privacy-preserving FL that allows clusters of devices with different quantization resolutions to participate in each FL round. Specifically, we introduce a novel stochastic quantizer (SQ) that is designed to simultaneously achieve differential privacy (DP) and minimum quantization error. Notably, the proposed SQ guarantees bounded distortion, unlike other DP approaches. To address quantization heterogeneity, we introduce a cluster size optimization technique combined with a linear fusion approach to enhance model aggregation accuracy. Numerical simulations validate the benefits of our approach in terms of privacy protection and learning utility compared to the conventional LaplaceSQ-FL algorithm.
△ Less
Submitted 2 July, 2025; v1 submitted 1 July, 2025;
originally announced July 2025.
-
From Points to Places: Towards Human Mobility-Driven Spatiotemporal Foundation Models via Understanding Places
Authors:
Mohammad Hashemi,
Andreas Zufle
Abstract:
Capturing human mobility is essential for modeling how people interact with and move through physical spaces, reflecting social behavior, access to resources, and dynamic spatial patterns. To support scalable and transferable analysis across diverse geographies and contexts, there is a need for a generalizable foundation model for spatiotemporal data. While foundation models have transformed langu…
▽ More
Capturing human mobility is essential for modeling how people interact with and move through physical spaces, reflecting social behavior, access to resources, and dynamic spatial patterns. To support scalable and transferable analysis across diverse geographies and contexts, there is a need for a generalizable foundation model for spatiotemporal data. While foundation models have transformed language and vision, they remain limited in handling the unique challenges posed by the spatial, temporal, and semantic complexity of mobility data. This vision paper advocates for a new class of spatial foundation models that integrate geolocation semantics with human mobility across multiple scales. Central to our vision is a shift from modeling discrete points of interest to understanding places: dynamic, context-rich regions shaped by human behavior and mobility that may comprise many places of interest. We identify key gaps in adaptability, scalability, and multi-granular reasoning, and propose research directions focused on modeling places and enabling efficient learning. Our goal is to guide the development of scalable, context-aware models for next-generation geospatial intelligence. These models unlock powerful applications ranging from personalized place discovery and logistics optimization to urban planning, ultimately enabling smarter and more responsive spatial decision-making.
△ Less
Submitted 17 June, 2025;
originally announced June 2025.
-
Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA
Authors:
Rishabh Maheshwary,
Masoud Hashemi,
Khyati Mahajan,
Shiva Krishna Reddy Malay,
Sai Rajeswar,
Sathwik Tejaswi Madhusudhan,
Spandana Gella,
Vikas Yadav
Abstract:
Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This hinders a model's capacity to process and reason over retrieved content and limits performance. While recent methods focus on compressing retrieved information, they are either restricted to single-round RAG, require finetuning or lack scalability in iterative RAG.…
▽ More
Iterative RAG for multi-hop question answering faces challenges with lengthy contexts and the buildup of irrelevant information. This hinders a model's capacity to process and reason over retrieved content and limits performance. While recent methods focus on compressing retrieved information, they are either restricted to single-round RAG, require finetuning or lack scalability in iterative RAG. To address these challenges, we propose Notes Writing, a method that generates concise and relevant notes from retrieved documents at each step, thereby reducing noise and retaining only essential information. This indirectly increases the effective context length of Large Language Models (LLMs), enabling them to reason and plan more effectively while processing larger volumes of input text. Notes Writing is framework agnostic and can be integrated with different iterative RAG methods. We demonstrate its effectiveness with three iterative RAG methods, across two models and four evaluation datasets. Notes writing yields an average improvement of 15.6 percentage points overall, with minimal increase in output tokens.
△ Less
Submitted 22 May, 2025;
originally announced May 2025.
-
Neural-Enhanced Rate Adaptation and Computation Distribution for Emerging mmWave Multi-User 3D Video Streaming Systems
Authors:
Babak Badnava,
Jacob Chakareski,
Morteza Hashemi
Abstract:
We investigate multitask edge-user communication-computation resource allocation for $360^\circ$ video streaming in an edge-computing enabled millimeter wave (mmWave) multi-user virtual reality system. To balance the communication-computation trade-offs that arise herein, we formulate a video quality maximization problem that integrates interdependent multitask/multi-user action spaces and rebuffe…
▽ More
We investigate multitask edge-user communication-computation resource allocation for $360^\circ$ video streaming in an edge-computing enabled millimeter wave (mmWave) multi-user virtual reality system. To balance the communication-computation trade-offs that arise herein, we formulate a video quality maximization problem that integrates interdependent multitask/multi-user action spaces and rebuffering time/quality variation constraints. We formulate a deep reinforcement learning framework for \underline{m}ulti-\underline{t}ask \underline{r}ate adaptation and \underline{c}omputation distribution (MTRC) to solve the problem of interest. Our solution does not rely on a priori knowledge about the environment and uses only prior video streaming statistics (e.g., throughput, decoding time, and transmission delay), and content information, to adjust the assigned video bitrates and computation distribution, as it observes the induced streaming performance online. Moreover, to capture the task interdependence in the environment, we leverage neural network cascades to extend our MTRC method to two novel variants denoted as R1C2 and C1R2. We train all three methods with real-world mmWave network traces and $360^\circ$ video datasets to evaluate their performance in terms of expected quality of experience (QoE), viewport peak signal-to-noise ratio (PSNR), rebuffering time, and quality variation. We outperform state-of-the-art rate adaptation algorithms, with C1R2 showing best results and achieving $5.21-6.06$ dB PSNR gains, $2.18-2.70$x rebuffering time reduction, and $4.14-4.50$ dB quality variation reduction.
△ Less
Submitted 19 May, 2025;
originally announced May 2025.
-
Learning Driven Elastic Task Multi-Connectivity Immersive Computing Systems
Authors:
Babak Badnava,
Jacob Chakareski,
Morteza Hashemi
Abstract:
In virtual reality (VR) environments, computational tasks exhibit an elastic nature, meaning they can dynamically adjust based on various user and system constraints. This elasticity is essential for maintaining immersive experiences; however, it also introduces challenges for communication and computing in VR systems. In this paper, we investigate elastic task offloading for multi-user edge-compu…
▽ More
In virtual reality (VR) environments, computational tasks exhibit an elastic nature, meaning they can dynamically adjust based on various user and system constraints. This elasticity is essential for maintaining immersive experiences; however, it also introduces challenges for communication and computing in VR systems. In this paper, we investigate elastic task offloading for multi-user edge-computing-enabled VR systems with multi-connectivity, aiming to maximize the computational energy-efficiency (computational throughput per unit of energy consumed). To balance the induced communication, computation, energy consumption, and quality of experience trade-offs due to the elasticity of VR tasks, we formulate a constrained stochastic computational energy-efficiency optimization problem that integrates the multi-connectivity/multi-user action space and the elastic nature of VR computational tasks. We formulate a centralized phasic policy gradient (CPPG) framework to solve the problem of interest online, using only prior elastic task offloading statistics (energy consumption, response time, and transmission time), and task information (i.e., task size and computational intensity), while observing the induced system performance (energy consumption and latency). We further extend our approach to decentralized learning by formulating an independent phasic policy gradient (IPPG) method and a decentralized shared multi-armed bandit (DSMAB) method. We train our methods with real-world 4G, 5G, and WiGig network traces and 360 video datasets to evaluate their performance in terms of response time, energy efficiency, scalability, and delivered quality of experience. We also provide a comprehensive analysis of task size and its effect on offloading policy and system performance. In particular, we show that CPPG reduces latency by 28% and energy consumption by 78% compared to IPPG.
△ Less
Submitted 19 May, 2025;
originally announced May 2025.
-
FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models
Authors:
Mehrnoush Shamsfard,
Zahra Saaberi,
Mostafa Karimi manesh,
Seyed Mohammad Hossein Hashemi,
Zahra Vatankhah,
Motahareh Ramezani,
Niki Pourazin,
Tara Zare,
Maryam Azimi,
Sarina Chitsaz,
Sama Khoraminejad,
Morteza Mahdavi Mortazavi,
Mohammad Mahdi Chizari,
Sahar Maleki,
Seyed Soroush Majd,
Mostafa Masumi,
Sayed Ali Musavi Khoeini,
Amir Mohseni,
Sogol Alipour
Abstract:
Research on evaluating and analyzing large language models (LLMs) has been extensive for resource-rich languages such as English, yet their performance in languages such as Persian has received considerably less attention. This paper introduces FarsEval-PKBETS benchmark, a subset of FarsEval project for evaluating large language models in Persian. This benchmark consists of 4000 questions and answ…
▽ More
Research on evaluating and analyzing large language models (LLMs) has been extensive for resource-rich languages such as English, yet their performance in languages such as Persian has received considerably less attention. This paper introduces FarsEval-PKBETS benchmark, a subset of FarsEval project for evaluating large language models in Persian. This benchmark consists of 4000 questions and answers in various formats, including multiple choice, short answer and descriptive responses. It covers a wide range of domains and tasks,including medicine, law, religion, Persian language, encyclopedic knowledge, human preferences, social knowledge, ethics and bias, text generation, and respecting others' rights. This bechmark incorporates linguistics, cultural, and local considerations relevant to the Persian language and Iran. To ensure the questions are challenging for current LLMs, three models -- Llama3-70B, PersianMind, and Dorna -- were evaluated using this benchmark. Their average accuracy was below 50%, meaning they provided fully correct answers to fewer than half of the questions. These results indicate that current language models are still far from being able to solve this benchmark
△ Less
Submitted 20 April, 2025;
originally announced April 2025.
-
DNR Bench: Benchmarking Over-Reasoning in Reasoning LLMs
Authors:
Masoud Hashemi,
Oluwanifemi Bamgbose,
Sathwik Tejaswi Madhusudhan,
Jishnu Sethumadhavan Nair,
Aman Tiwari,
Vikas Yadav
Abstract:
Test-time scaling has significantly improved large language model performance, enabling deeper reasoning to solve complex problems. However, this increased reasoning capability also leads to excessive token generation and unnecessary problem-solving attempts. We introduce Dont Reason Bench (DNR Bench), a new benchmark designed to evaluate LLMs ability to robustly understand the tricky reasoning tr…
▽ More
Test-time scaling has significantly improved large language model performance, enabling deeper reasoning to solve complex problems. However, this increased reasoning capability also leads to excessive token generation and unnecessary problem-solving attempts. We introduce Dont Reason Bench (DNR Bench), a new benchmark designed to evaluate LLMs ability to robustly understand the tricky reasoning triggers and avoiding unnecessary generation. DNR Bench consists of 150 adversarially designed prompts that are easy for humans to understand and respond to, but surprisingly not for many of the recent prominent LLMs. DNR Bench tests models abilities across different capabilities, such as instruction adherence, hallucination avoidance, redundancy filtering, and unanswerable question recognition. We evaluate reasoning LLMs (RLMs), including DeepSeek-R1, OpenAI O3-mini, Claude-3.7-sonnet and compare them against a powerful non-reasoning model, e.g., GPT-4o. Our experiments reveal that RLMs generate up to 70x more tokens than necessary, often failing at tasks that simpler non-reasoning models handle efficiently with higher accuracy. Our findings underscore the need for more effective training and inference strategies in RLMs.
△ Less
Submitted 17 April, 2025; v1 submitted 19 March, 2025;
originally announced March 2025.
-
ECO: An LLM-Driven Efficient Code Optimizer for Warehouse Scale Computers
Authors:
Hannah Lin,
Martin Maas,
Maximilian Roquemore,
Arman Hasanzadeh,
Fred Lewis,
Yusuf Simonson,
Tzu-Wei Yang,
Amir Yazdanbakhsh,
Deniz Altinbüken,
Florin Papa,
Maggie Nolan Edmonds,
Aditya Patil,
Don Schwarz,
Satish Chandra,
Chris Kennelly,
Milad Hashemi,
Parthasarathy Ranganathan
Abstract:
With the end of Moore's Law, optimizing code for performance has become paramount for meeting ever-increasing compute demands, particularly in hyperscale data centers where even small efficiency gains translate to significant resource and energy savings. Traditionally, this process requires significant programmer effort to identify optimization opportunities, modify the code to implement the optim…
▽ More
With the end of Moore's Law, optimizing code for performance has become paramount for meeting ever-increasing compute demands, particularly in hyperscale data centers where even small efficiency gains translate to significant resource and energy savings. Traditionally, this process requires significant programmer effort to identify optimization opportunities, modify the code to implement the optimization, and carefully deploy and measure the optimization's impact. Despite a significant amount of work on automating program edits and promising results in small-scale settings, such performance optimizations have remained elusive in large real-world production environments, due to the scale, high degree of complexity, and reliability required.
This paper introduces ECO (Efficient Code Optimizer), a system that automatically refactors source code to improve performance at scale. To achieve these performance gains, ECO searches through historical commits at scale to create a dictionary of performance anti-patterns that these commits addressed. These anti-patterns are used to search for similar patterns in a code base of billions of lines of code, pinpointing other code segments with similar potential optimization opportunities. Using a fine-tuned LLM, ECO then automatically refactors the code to generate and apply similar edits. Next, ECO verifies the transformed code, submits it for code review, and measures the impact of the optimization in production.
Currently deployed on Google's hyperscale production fleet, this system has driven >25k changed lines of production code, across over 6.4k submitted commits, with a >99.5% production success rate. Over the past year, ECO has consistently resulted in significant performance savings every quarter. On average, the savings produced per quarter are equivalent to over 500k normalized CPU cores.
△ Less
Submitted 19 March, 2025;
originally announced March 2025.
-
Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance
Authors:
Bryan Etzine,
Masoud Hashemi,
Nishanth Madhusudhan,
Sagar Davasam,
Roshnee Sharma,
Sathwik Tejaswi Madhusudhan,
Vikas Yadav
Abstract:
Existing benchmarks are becoming saturated and struggle to separate model performances due to factors like data contamination and advancing LLM capabilities. This paper introduces EMDM (Enhanced Model Differentiation Metric), a novel weighted metric that revitalizes benchmarks by enhancing model separation. EMDM integrates final answer and Chain-of-Thought (CoT) reasoning correctness, assigning we…
▽ More
Existing benchmarks are becoming saturated and struggle to separate model performances due to factors like data contamination and advancing LLM capabilities. This paper introduces EMDM (Enhanced Model Differentiation Metric), a novel weighted metric that revitalizes benchmarks by enhancing model separation. EMDM integrates final answer and Chain-of-Thought (CoT) reasoning correctness, assigning weights based on the complexity and reasoning depth required to solve a given sample in the evaluation data. Using a baseline LLM in two setups-Unguided, where the model has no prior exposure to test samples, and Guided, where the model has prior knowledge of the desired answer-EMDM distinguishes instances of varying difficulty. The CoT and answer correctness from these setups inform an optimization objective for weight assignment, resulting in a more nuanced evaluation of model performance. Compared to the exact match (EM) metric, which achieves 17% separation on ARC-Challenge, EMDM achieves 46%, demonstrating its effectiveness in differentiating models based on reasoning and knowledge requirements.
△ Less
Submitted 7 March, 2025;
originally announced March 2025.
-
Revolutionizing Traffic Management with AI-Powered Machine Vision: A Step Toward Smart Cities
Authors:
Seyed Hossein Hosseini DolatAbadi,
Sayyed Mohammad Hossein Hashemi,
Mohammad Hosseini,
Moein-Aldin AliHosseini
Abstract:
The rapid urbanization of cities and increasing vehicular congestion have posed significant challenges to traffic management and safety. This study explores the transformative potential of artificial intelligence (AI) and machine vision technologies in revolutionizing traffic systems. By leveraging advanced surveillance cameras and deep learning algorithms, this research proposes a system for real…
▽ More
The rapid urbanization of cities and increasing vehicular congestion have posed significant challenges to traffic management and safety. This study explores the transformative potential of artificial intelligence (AI) and machine vision technologies in revolutionizing traffic systems. By leveraging advanced surveillance cameras and deep learning algorithms, this research proposes a system for real-time detection of vehicles, traffic anomalies, and driver behaviors. The system integrates geospatial and weather data to adapt dynamically to environmental conditions, ensuring robust performance in diverse scenarios. Using YOLOv8 and YOLOv11 models, the study achieves high accuracy in vehicle detection and anomaly recognition, optimizing traffic flow and enhancing road safety. These findings contribute to the development of intelligent traffic management solutions and align with the vision of creating smart cities with sustainable and efficient urban infrastructure.
△ Less
Submitted 4 March, 2025;
originally announced March 2025.
-
Scalable Graph Condensation with Evolving Capabilities
Authors:
Shengbo Gong,
Mohammad Hashemi,
Juntong Ni,
Carl Yang,
Wei Jin
Abstract:
Graph data has become a pivotal modality due to its unique ability to model relational datasets. However, real-world graph data continues to grow exponentially, resulting in a quadratic increase in the complexity of most graph algorithms as graph sizes expand. Although graph condensation (GC) methods have been proposed to address these scalability issues, existing approaches often treat the traini…
▽ More
Graph data has become a pivotal modality due to its unique ability to model relational datasets. However, real-world graph data continues to grow exponentially, resulting in a quadratic increase in the complexity of most graph algorithms as graph sizes expand. Although graph condensation (GC) methods have been proposed to address these scalability issues, existing approaches often treat the training set as static, overlooking the evolving nature of real-world graph data. This limitation leads to inefficiencies when condensing growing training sets. In this paper, we introduce GECC (Graph Evolving Clustering Condensation), a scalable graph condensation method designed to handle large-scale and evolving graph data. GECC employs a traceable and efficient approach by performing class-wise clustering on aggregated features. Furthermore, it can inherits previous condensation results as clustering centroids when the condensed graph expands, thereby attaining an evolving capability. This methodology is supported by robust theoretical foundations and demonstrates superior empirical performance. Comprehensive experiments show that GECC achieves better performance than most state-of-the-art graph condensation methods while delivering an around 1,000x speedup on large datasets.
△ Less
Submitted 24 February, 2025;
originally announced February 2025.
-
pFedWN: A Personalized Federated Learning Framework for D2D Wireless Networks with Heterogeneous Data
Authors:
Zhou Ni,
Masoud Ghazikor,
Morteza Hashemi
Abstract:
Traditional Federated Learning (FL) approaches often struggle with data heterogeneity across clients, leading to suboptimal model performance for individual clients. To address this issue, Personalized Federated Learning (PFL) emerges as a solution to the challenges posed by non-independent and identically distributed (non-IID) and unbalanced data across clients. Furthermore, in most existing dece…
▽ More
Traditional Federated Learning (FL) approaches often struggle with data heterogeneity across clients, leading to suboptimal model performance for individual clients. To address this issue, Personalized Federated Learning (PFL) emerges as a solution to the challenges posed by non-independent and identically distributed (non-IID) and unbalanced data across clients. Furthermore, in most existing decentralized machine learning works, a perfect communication channel is considered for model parameter transmission between clients and servers. However, decentralized PFL over wireless links introduces new challenges, such as resource allocation and interference management. To overcome these challenges, we formulate a joint optimization problem that incorporates the underlying device-to-device (D2D) wireless channel conditions into a server-free PFL approach. The proposed method, dubbed pFedWN, optimizes the learning performance for each client while accounting for the variability in D2D wireless channels. To tackle the formulated problem, we divide it into two sub-problems: PFL neighbor selection and PFL weight assignment. The PFL neighbor selection is addressed through channel-aware neighbor selection within unlicensed spectrum bands such as ISM bands. Next, to assign PFL weights, we utilize the Expectation-Maximization (EM) method to evaluate the similarity between clients' data and obtain optimal weight distribution among the chosen PFL neighbors. Empirical results show that pFedWN provides efficient and personalized learning performance with non-IID and unbalanced datasets. Furthermore, it outperforms the existing FL and PFL methods in terms of learning efficacy and robustness, particularly under dynamic and unpredictable wireless channel conditions.
△ Less
Submitted 16 January, 2025;
originally announced January 2025.
-
FarExStance: Explainable Stance Detection for Farsi
Authors:
Majid Zarharan,
Maryam Hashemi,
Malika Behroozrazegh,
Sauleh Eetemadi,
Mohammad Taher Pilehvar,
Jennifer Foster
Abstract:
We introduce FarExStance, a new dataset for explainable stance detection in Farsi. Each instance in this dataset contains a claim, the stance of an article or social media post towards that claim, and an extractive explanation which provides evidence for the stance label. We compare the performance of a fine-tuned multilingual RoBERTa model to several large language models in zero-shot, few-shot,…
▽ More
We introduce FarExStance, a new dataset for explainable stance detection in Farsi. Each instance in this dataset contains a claim, the stance of an article or social media post towards that claim, and an extractive explanation which provides evidence for the stance label. We compare the performance of a fine-tuned multilingual RoBERTa model to several large language models in zero-shot, few-shot, and parameter-efficient fine-tuned settings on our new dataset. On stance detection, the most accurate models are the fine-tuned RoBERTa model, the LLM Aya-23-8B which has been fine-tuned using parameter-efficient fine-tuning, and few-shot Claude-3.5-Sonnet. Regarding the quality of the explanations, our automatic evaluation metrics indicate that few-shot GPT-4o generates the most coherent explanations, while our human evaluation reveals that the best Overall Explanation Score (OES) belongs to few-shot Claude-3.5-Sonnet. The fine-tuned Aya-32-8B model produced explanations most closely aligned with the reference explanations.
△ Less
Submitted 18 December, 2024;
originally announced December 2024.
-
Auto-Cypher: Improving LLMs on Cypher generation via LLM-supervised generation-verification framework
Authors:
Aman Tiwari,
Shiva Krishna Reddy Malay,
Vikas Yadav,
Masoud Hashemi,
Sathwik Tejaswi Madhusudhan
Abstract:
Graph databases like Neo4j are gaining popularity for handling complex, interconnected data, over traditional relational databases in modeling and querying relationships. While translating natural language into SQL queries is well-researched, generating Cypher queries for Neo4j remains relatively underexplored. In this work, we present an automated, LLM-Supervised, pipeline to generate high-qualit…
▽ More
Graph databases like Neo4j are gaining popularity for handling complex, interconnected data, over traditional relational databases in modeling and querying relationships. While translating natural language into SQL queries is well-researched, generating Cypher queries for Neo4j remains relatively underexplored. In this work, we present an automated, LLM-Supervised, pipeline to generate high-quality synthetic data for Text2Cypher. Our Cypher data generation pipeline introduces LLM-As-Database-Filler, a novel strategy for ensuring Cypher query correctness, thus resulting in high quality generations. Using our pipeline, we generate high quality Text2Cypher data - SynthCypher containing 29.8k instances across various domains and queries with varying complexities. Training open-source LLMs like LLaMa-3.1-8B, Mistral-7B, and QWEN-7B on SynthCypher results in performance gains of up to 40% on the Text2Cypher test split and 30% on the SPIDER benchmark, adapted for graph databases.
△ Less
Submitted 24 January, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
EaCO: Resource Sharing Dynamics and Its Impact on Energy Efficiency for DNN Training
Authors:
Kawsar Haghshenas,
Mona Hashemi
Abstract:
Deep Learning Training (DLT) is a growing workload in shared GPU/CPU clusters due to its high computational cost and increasing number of jobs. This contributes to significant energy consumption in GPU clusters, further exacerbated by GPU under-utilization, as shown in production cluster logs. Addressing this challenge requires workload scheduling and resource allocation policies for efficient GPU…
▽ More
Deep Learning Training (DLT) is a growing workload in shared GPU/CPU clusters due to its high computational cost and increasing number of jobs. This contributes to significant energy consumption in GPU clusters, further exacerbated by GPU under-utilization, as shown in production cluster logs. Addressing this challenge requires workload scheduling and resource allocation policies for efficient GPU sharing to improve resource and energy efficiency while maintaining performance. However, previous works primarily optimize for performance, often overlooking or even sacrificing energy efficiency.
In this paper, we present EaCO, the first energy-aware scheduling algorithm designed specifically for DLT workloads in GPU clusters. EaCO leverages hardware-supported context switching to enable GPU sharing across multiple DLT jobs, improving resource and energy utilization. GPU sharing can increase Job Completion Time (JCT) and may lead to contention if not employed carefully. To address this, EaCO integrates experiment and historical-based predictions as well as early-stage observations, ensuring performance expectations are met while optimizing energy efficiency.
We begin by experimentally exploring the dynamics of co-locating DLTs, investigating its impact on energy and resource utilization. Our results show that co-location improves energy efficiency by up to 44% for individual jobs, and increases average GPU utilization to as high as 97%. Additionally, evaluations on large-scale clusters using production traces demonstrate that EaCO reduces total energy by up to 39% compared to existing algorithms, which comes with a minimal increase in job runtime-less than 3.2% in our simulations.
△ Less
Submitted 11 December, 2024;
originally announced December 2024.
-
Optimizing Location Allocation in Urban Management: A Brief Review
Authors:
Aref Ayati,
Mohammad Mahdi Hashemi,
Mohsen Saffar,
Hamid Reza Naji
Abstract:
Regarding the concepts of urban management, digital transformation, and smart cities, various issues are presented. Currently, we like to attend to location allocation problems that can be a new part of digital transformation in urban management (such as locating and placing facilities, locating and arranging centers such as aid and rescue centers, or even postal hubs, telecommunications, electron…
▽ More
Regarding the concepts of urban management, digital transformation, and smart cities, various issues are presented. Currently, we like to attend to location allocation problems that can be a new part of digital transformation in urban management (such as locating and placing facilities, locating and arranging centers such as aid and rescue centers, or even postal hubs, telecommunications, electronic equipment, and data centers, and routing in transportation optimization). These issues, which are seemingly simple but in practice complex, are important in urban environments, and the issue of accurate location allocation based on existing criteria directly impacts cost management, profit, efficiency, and citizen satisfaction. In recent years, researchers have used or presented various models and methods for location allocation problems, some of which will be mentioned in this article. Given the nature of these problems, which are optimization problems, this article will also examine existing research from an optimization perspective in summary. Finally, a brief conclusion will be made of the existing methods and their weaknesses, and suggestions will be made for continuing the path and improving scientific and practical research in this field.
△ Less
Submitted 8 December, 2024;
originally announced December 2024.
-
Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages
Authors:
Hoang H Nguyen,
Khyati Mahajan,
Vikas Yadav,
Julian Salazar,
Philip S. Yu,
Masoud Hashemi,
Rishabh Maheshwary
Abstract:
Although multilingual LLMs have achieved remarkable performance across benchmarks, we find they continue to underperform on non-Latin script languages across contemporary LLM families. This discrepancy arises from the fact that LLMs are pretrained with orthographic scripts, which are dominated by Latin characters that obscure their shared phonology with non-Latin scripts. We propose leveraging pho…
▽ More
Although multilingual LLMs have achieved remarkable performance across benchmarks, we find they continue to underperform on non-Latin script languages across contemporary LLM families. This discrepancy arises from the fact that LLMs are pretrained with orthographic scripts, which are dominated by Latin characters that obscure their shared phonology with non-Latin scripts. We propose leveraging phonemic transcriptions as complementary signals to induce script-invariant representations. Our study demonstrates that integrating phonemic signals improves performance across both non-Latin and Latin script languages, with a particularly significant impact on closing the performance gap between the two. Through detailed experiments, we show that phonemic and orthographic scripts retrieve distinct examples for in-context learning (ICL). This motivates our proposed Mixed-ICL retrieval strategy, where further aggregation from both leads to our significant performance improvements for both Latin script languages (up to 12.6%) and non-Latin script languages (up to 15.1%) compared to randomized ICL retrieval.
△ Less
Submitted 26 June, 2025; v1 submitted 4 November, 2024;
originally announced November 2024.
-
Optimizing NOMA Transmissions to Advance Federated Learning in Vehicular Networks
Authors:
Ziru Chen,
Zhou Ni,
Peiyuan Guan,
Lu Wang,
Lin X. Cai,
Morteza Hashemi,
Zongzhi Li
Abstract:
Diverse critical data, such as location information and driving patterns, can be collected by IoT devices in vehicular networks to improve driving experiences and road safety. However, drivers are often reluctant to share their data due to privacy concerns. The Federated Vehicular Network (FVN) is a promising technology that tackles these concerns by transmitting model parameters instead of raw da…
▽ More
Diverse critical data, such as location information and driving patterns, can be collected by IoT devices in vehicular networks to improve driving experiences and road safety. However, drivers are often reluctant to share their data due to privacy concerns. The Federated Vehicular Network (FVN) is a promising technology that tackles these concerns by transmitting model parameters instead of raw data, thereby protecting the privacy of drivers. Nevertheless, the performance of Federated Learning (FL) in a vehicular network depends on the joining ratio, which is restricted by the limited available wireless resources. To address these challenges, this paper proposes to apply Non-Orthogonal Multiple Access (NOMA) to improve the joining ratio in a FVN. Specifically, a vehicle selection and transmission power control algorithm is developed to exploit the power domain differences in the received signal to ensure the maximum number of vehicles capable of joining the FVN. Our simulation results demonstrate that the proposed NOMA-based strategy increases the joining ratio and significantly enhances the performance of the FVN.
△ Less
Submitted 6 August, 2024;
originally announced August 2024.
-
Channel-Aware Distributed Transmission Control and Video Streaming in UAV Networks
Authors:
Masoud Ghazikor,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we study the problem of distributed transmission control and video streaming optimization for unmanned aerial vehicles (UAVs) operating in unlicensed spectrum bands. We develop a rigorous cross-layer analysis framework that jointly considers three inter-dependent factors: (i) in-band interference introduced by ground/aerial nodes at the physical (PHY) layer, (ii) limited-size queues…
▽ More
In this paper, we study the problem of distributed transmission control and video streaming optimization for unmanned aerial vehicles (UAVs) operating in unlicensed spectrum bands. We develop a rigorous cross-layer analysis framework that jointly considers three inter-dependent factors: (i) in-band interference introduced by ground/aerial nodes at the physical (PHY) layer, (ii) limited-size queues with delay-constrained packet arrival at the medium access control (MAC) layer, and (iii) video encoding rate at the application layer. First, we formulate an optimization problem to maximize the throughput by optimizing the fading threshold. To this end, we jointly analyze the queue-related packet loss probabilities (i.e., buffer overflow and time threshold event) as well as the outage probability due to the low signal-to-interference-plus-noise ratio (SINR). We introduce the Distributed Transmission Control (DTC) algorithm that maximizes the throughput by adjusting transmission policies to balance the trade-offs between packet drop from queues vs. transmission errors due to low SINRs. Second, we incorporate the video distortion model to develop distributed peak signal-to-noise ratio (PSNR) optimization for video streaming. The formulated optimization incorporates two cross-layer parameters, specifically the fading threshold and video encoding rate. To tackle this problem, we develop the Joint Distributed Video Transmission and Encoder Control (JDVT-EC) algorithm that enhances the PSNR for all nodes by fine-tuning transmission policies and video encoding rates to balance the trade-offs between packet loss and lossy video compression distortions. Through extensive numerical analysis, we thoroughly examine the proposed algorithms and demonstrate that they are able to find the optimal transmission policies and video encoding rates under various scenarios.
△ Less
Submitted 3 August, 2024;
originally announced August 2024.
-
Vision-Language Model Based Handwriting Verification
Authors:
Mihir Chauhan,
Abhishek Satbhai,
Mohammad Abuzar Hashemi,
Mir Basheer Ali,
Bina Ramamurthy,
Mingchen Gao,
Siwei Lyu,
Sargur Srihari
Abstract:
Handwriting Verification is a critical in document forensics. Deep learning based approaches often face skepticism from forensic document examiners due to their lack of explainability and reliance on extensive training data and handcrafted features. This paper explores using Vision Language Models (VLMs), such as OpenAI's GPT-4o and Google's PaliGemma, to address these challenges. By leveraging th…
▽ More
Handwriting Verification is a critical in document forensics. Deep learning based approaches often face skepticism from forensic document examiners due to their lack of explainability and reliance on extensive training data and handcrafted features. This paper explores using Vision Language Models (VLMs), such as OpenAI's GPT-4o and Google's PaliGemma, to address these challenges. By leveraging their Visual Question Answering capabilities and 0-shot Chain-of-Thought (CoT) reasoning, our goal is to provide clear, human-understandable explanations for model decisions. Our experiments on the CEDAR handwriting dataset demonstrate that VLMs offer enhanced interpretability, reduce the need for large training datasets, and adapt better to diverse handwriting styles. However, results show that the CNN-based ResNet-18 architecture outperforms the 0-shot CoT prompt engineering approach with GPT-4o (Accuracy: 70%) and supervised fine-tuned PaliGemma (Accuracy: 71%), achieving an accuracy of 84% on the CEDAR AND dataset. These findings highlight the potential of VLMs in generating human-interpretable decisions while underscoring the need for further advancements to match the performance of specialized deep learning models.
△ Less
Submitted 31 July, 2024;
originally announced July 2024.
-
Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models
Authors:
Nishanth Madhusudhan,
Sathwik Tejaswi Madhusudhan,
Vikas Yadav,
Masoud Hashemi
Abstract:
Abstention Ability (AA) is a critical aspect of Large Language Model (LLM) reliability, referring to an LLM's capability to withhold responses when uncertain or lacking a definitive answer, without compromising performance. Although previous studies have attempted to improve AA, they lack a standardised evaluation method and remain unsuitable for black-box models where token prediction probabiliti…
▽ More
Abstention Ability (AA) is a critical aspect of Large Language Model (LLM) reliability, referring to an LLM's capability to withhold responses when uncertain or lacking a definitive answer, without compromising performance. Although previous studies have attempted to improve AA, they lack a standardised evaluation method and remain unsuitable for black-box models where token prediction probabilities are inaccessible. This makes comparative analysis challenging, especially for state-of-the-art closed-source commercial LLMs. This paper bridges this gap by introducing a black-box evaluation approach and a new dataset, Abstain-QA, crafted to rigorously assess AA across varied question types (answerable and unanswerable), domains (well-represented and under-represented), and task types (fact centric and reasoning). We also propose a new confusion matrix, the ''Answerable-Unanswerable Confusion Matrix'' (AUCM) which serves as the basis for evaluating AA, by offering a structured and precise approach for assessment. Finally, we explore the impact of three prompting strategies-Strict Prompting, Verbal Confidence Thresholding, and Chain-of-Thought (CoT)-on improving AA. Our results indicate that even powerful models like GPT-4, Mixtral 8x22b encounter difficulties with abstention; however, strategic approaches such as Strict prompting and CoT can enhance this capability.
△ Less
Submitted 24 September, 2024; v1 submitted 23 July, 2024;
originally announced July 2024.
-
Multi-Task Decision-Making for Multi-User 360 Video Processing over Wireless Networks
Authors:
Babak Badnava,
Jacob Chakareski,
Morteza Hashemi
Abstract:
We study a multi-task decision-making problem for 360 video processing in a wireless multi-user virtual reality (VR) system that includes an edge computing unit (ECU) to deliver 360 videos to VR users and offer computing assistance for decoding/rendering of video frames. However, this comes at the expense of increased data volume and required bandwidth. To balance this trade-off, we formulate a co…
▽ More
We study a multi-task decision-making problem for 360 video processing in a wireless multi-user virtual reality (VR) system that includes an edge computing unit (ECU) to deliver 360 videos to VR users and offer computing assistance for decoding/rendering of video frames. However, this comes at the expense of increased data volume and required bandwidth. To balance this trade-off, we formulate a constrained quality of experience (QoE) maximization problem in which the rebuffering time and quality variation between video frames are bounded by user and video requirements. To solve the formulated multi-user QoE maximization, we leverage deep reinforcement learning (DRL) for multi-task rate adaptation and computation distribution (MTRC). The proposed MTRC approach does not rely on any predefined assumption about the environment and relies on video playback statistics (i.e., past throughput, decoding time, transmission time, etc.), video information, and the resulting performance to adjust the video bitrate and computation distribution. We train MTRC with real-world wireless network traces and 360 video datasets to obtain evaluation results in terms of the average QoE, peak signal-to-noise ratio (PSNR), rebuffering time, and quality variation. Our results indicate that the MTRC improves the users' QoE compared to state-of-the-art rate adaptation algorithm. Specifically, we show a 5.97 dB to 6.44 dB improvement in PSNR, a 1.66X to 4.23X improvement in rebuffering time, and a 4.21 dB to 4.35 dB improvement in quality variation.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
Federated Learning-based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems
Authors:
Sravan Reddy Chintareddy,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users (SUs) to opportunistically utilize detected "spectrum holes". Our overall framework consists of three main stages. Firstly, in the model training stage, we explore dataset generation in a multi-cell environment…
▽ More
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users (SUs) to opportunistically utilize detected "spectrum holes". Our overall framework consists of three main stages. Firstly, in the model training stage, we explore dataset generation in a multi-cell environment and training a machine learning (ML) model using the federated learning (FL) architecture. Unlike the existing studies on FL for wireless that presume datasets are readily available for training, we propose a novel architecture that directly integrates wireless dataset generation, which involves capturing I/Q samples from over-the-air signals in a multi-cell environment, into the FL training process. Secondly, in the collaborative spectrum inference stage, we propose a collaborative spectrum fusion strategy that is compatible with the unmanned aircraft system traffic management (UTM) ecosystem. Finally, in the spectrum scheduling stage, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users. To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using MATLAB LTE toolbox by incorporating base-station~(BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary users channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Self-Supervised Learning Based Handwriting Verification
Authors:
Mihir Chauhan,
Mohammad Abuzar Hashemi,
Abhishek Satbhai,
Mir Basheer Ali,
Bina Ramamurthy,
Mingchen Gao,
Siwei Lyu,
Sargur Srihari
Abstract:
We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND data…
▽ More
We present SSL-HV: Self-Supervised Learning approaches applied to the task of Handwriting Verification. This task involves determining whether a given pair of handwritten images originate from the same or different writer distribution. We have compared the performance of multiple generative, contrastive SSL approaches against handcrafted feature extractors and supervised learning on CEDAR AND dataset. We show that ResNet based Variational Auto-Encoder (VAE) outperforms other generative approaches achieving 76.3% accuracy, while ResNet-18 fine-tuned using Variance-Invariance-Covariance Regularization (VICReg) outperforms other contrastive approaches achieving 78% accuracy. Using a pre-trained VAE and VICReg for the downstream task of writer verification we observed a relative improvement in accuracy of 6.7% and 9% over ResNet-18 supervised baseline with 10% writer labels.
△ Less
Submitted 1 August, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
Centralized vs. Decentralized Multi-Agent Reinforcement Learning for Enhanced Control of Electric Vehicle Charging Networks
Authors:
Amin Shojaeighadikolaei,
Zsolt Talata,
Morteza Hashemi
Abstract:
The widespread adoption of electric vehicles (EVs) poses several challenges to power distribution networks and smart grid infrastructure due to the possibility of significantly increasing electricity demands, especially during peak hours. Furthermore, when EVs participate in demand-side management programs, charging expenses can be reduced by using optimal charging control policies that fully util…
▽ More
The widespread adoption of electric vehicles (EVs) poses several challenges to power distribution networks and smart grid infrastructure due to the possibility of significantly increasing electricity demands, especially during peak hours. Furthermore, when EVs participate in demand-side management programs, charging expenses can be reduced by using optimal charging control policies that fully utilize real-time pricing schemes. However, devising optimal charging methods and control strategies for EVs is challenging due to various stochastic and uncertain environmental factors. Currently, most EV charging controllers operate based on a centralized model. In this paper, we introduce a novel approach for distributed and cooperative charging strategy using a Multi-Agent Reinforcement Learning (MARL) framework. Our method is built upon the Deep Deterministic Policy Gradient (DDPG) algorithm for a group of EVs in a residential community, where all EVs are connected to a shared transformer. This method, referred to as CTDE-DDPG, adopts a Centralized Training Decentralized Execution (CTDE) approach to establish cooperation between agents during the training phase, while ensuring a distributed and privacy-preserving operation during execution. We theoretically examine the performance of centralized and decentralized critics for the DDPG-based MARL implementation and demonstrate their trade-offs. Furthermore, we numerically explore the efficiency, scalability, and performance of centralized and decentralized critics. Our theoretical and numerical results indicate that, despite higher policy gradient variances and training complexity, the CTDE-DDPG framework significantly improves charging efficiency by reducing total variation by approximately %36 and charging cost by around %9.1 on average...
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Model-free Resilient Controller Design based on Incentive Feedback Stackelberg Game and Q-learning
Authors:
Jiajun Shen,
Fengjun Li,
Morteza Hashemi,
Huazhen Fang
Abstract:
In the swift evolution of Cyber-Physical Systems (CPSs) within intelligent environments, especially in the industrial domain shaped by Industry 4.0, the surge in development brings forth unprecedented security challenges. This paper explores the intricate security issues of Industrial CPSs (ICPSs), with a specific focus on the unique threats presented by intelligent attackers capable of directly c…
▽ More
In the swift evolution of Cyber-Physical Systems (CPSs) within intelligent environments, especially in the industrial domain shaped by Industry 4.0, the surge in development brings forth unprecedented security challenges. This paper explores the intricate security issues of Industrial CPSs (ICPSs), with a specific focus on the unique threats presented by intelligent attackers capable of directly compromising the controller, thereby posing a direct risk to physical security. Within the framework of hierarchical control and incentive feedback Stackelberg game, we design a resilient leading controller (leader) that is adaptive to a compromised following controller (follower) such that the compromised follower acts cooperatively with the leader, aligning its strategies with the leader's objective to achieve a team-optimal solution. First, we provide sufficient conditions for the existence of an incentive Stackelberg solution when system dynamics are known. Then, we propose a Q-learning-based Approximate Dynamic Programming (ADP) approach, and corresponding algorithms for the online resolution of the incentive Stackelberg solution without requiring prior knowledge of system dynamics. Last but not least, we prove the convergence of our approach to the optimum.
△ Less
Submitted 13 March, 2024;
originally announced March 2024.
-
A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation
Authors:
Mohammad Hashemi,
Shengbo Gong,
Juntong Ni,
Wenqi Fan,
B. Aditya Prakash,
Wei Jin
Abstract:
Many real-world datasets can be naturally represented as graphs, spanning a wide range of domains. However, the increasing complexity and size of graph datasets present significant challenges for analysis and computation. In response, graph reduction, or graph summarization, has gained prominence for simplifying large graphs while preserving essential properties. In this survey, we aim to provide…
▽ More
Many real-world datasets can be naturally represented as graphs, spanning a wide range of domains. However, the increasing complexity and size of graph datasets present significant challenges for analysis and computation. In response, graph reduction, or graph summarization, has gained prominence for simplifying large graphs while preserving essential properties. In this survey, we aim to provide a comprehensive understanding of graph reduction methods, including graph sparsification, graph coarsening, and graph condensation. Specifically, we establish a unified definition for these methods and introduce a hierarchical taxonomy to categorize the challenges they address. Our survey then systematically reviews the technical details of these methods and emphasizes their practical applications across diverse scenarios. Furthermore, we outline critical research directions to ensure the continued effectiveness of graph reduction techniques, as well as provide a comprehensive paper list at \url{https://github.com/Emory-Melody/awesome-graph-reduction}. We hope this survey will bridge literature gaps and propel the advancement of this promising field.
△ Less
Submitted 29 June, 2024; v1 submitted 28 January, 2024;
originally announced February 2024.
-
Semantic-Aware and Goal-Oriented Communications for Object Detection in Wireless End-to-End Image Transmission
Authors:
Fatemeh Zahra Safaeipour,
Morteza Hashemi
Abstract:
Semantic communication is focused on optimizing the exchange of information by transmitting only the most relevant data required to convey the intended message to the receiver and achieve the desired communication goal. For example, if we consider images as the information and the goal of the communication is object detection at the receiver side, the semantic of information would be the objects i…
▽ More
Semantic communication is focused on optimizing the exchange of information by transmitting only the most relevant data required to convey the intended message to the receiver and achieve the desired communication goal. For example, if we consider images as the information and the goal of the communication is object detection at the receiver side, the semantic of information would be the objects in each image. Therefore, by only transferring the semantics of images we can achieve the communication goal. In this paper, we propose a design framework for implementing semantic-aware and goal-oriented communication of images. To achieve this, we first define the baseline problem as a set of mathematical problems that can be optimized to improve the efficiency and effectiveness of the communication system. We consider two scenarios in which either the data rate or the error at the receiver is the limiting constraint. Our proposed system model and solution is inspired by the concept of auto-encoders, where the encoder and the decoder are respectively implemented at the transmitter and receiver to extract semantic information for specific object detection goals. Our numerical results validate the proposed design framework to achieve low error or near-optimal in a goal-oriented communication system while reducing the amount of data transfers.
△ Less
Submitted 1 February, 2024;
originally announced February 2024.
-
Interference-Aware Queuing Analysis for Distributed Transmission Control in UAV Networks
Authors:
Masoud Ghazikor,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we investigate the problem of distributed transmission control for unmanned aerial vehicles (UAVs) operating in unlicensed spectrum bands. We develop a rigorous interference-aware queuing analysis framework that jointly considers two inter-dependent factors: (i) limited-size queues with delay-constrained packet arrival, and (ii) in-band interference introduced by other ground/aerial…
▽ More
In this paper, we investigate the problem of distributed transmission control for unmanned aerial vehicles (UAVs) operating in unlicensed spectrum bands. We develop a rigorous interference-aware queuing analysis framework that jointly considers two inter-dependent factors: (i) limited-size queues with delay-constrained packet arrival, and (ii) in-band interference introduced by other ground/aerial users. We aim to optimize the expected throughput by jointly analyzing these factors. In the queuing analysis, we explore two packet loss probabilities including, buffer overflow model and time threshold model. For interference analysis, we investigate the outage probability and packet losses due to low signal-to-interference-plus-noise ratio (SINR). We introduce two algorithms namely, Interference-Aware Transmission Control (IA-TC), and Interference-Aware Distributed Transmission Control (IA-DTC). These algorithms maximize the expected throughput by adjusting transmission policies to balance the trade-offs between packet drop from queues vs. transmission errors due to low SINRs. We implement the proposed algorithms and demonstrate that the optimal transmission policy under various scenarios is found.
△ Less
Submitted 19 January, 2024;
originally announced January 2024.
-
Realism in Action: Anomaly-Aware Diagnosis of Brain Tumors from Medical Images Using YOLOv8 and DeiT
Authors:
Seyed Mohammad Hossein Hashemi,
Leila Safari,
Mohsen Hooshmand,
Amirhossein Dadashzadeh Taromi
Abstract:
Reliable diagnosis of brain tumors remains challenging due to low clinical incidence rates of such cases. However, this low rate is neglected in most of proposed methods. We propose a clinically inspired framework for anomaly-resilient tumor detection and classification. Detection leverages YOLOv8n fine-tuned on a realistically imbalanced dataset (1:9 tumor-to-normal ratio; 30,000 MRI slices from…
▽ More
Reliable diagnosis of brain tumors remains challenging due to low clinical incidence rates of such cases. However, this low rate is neglected in most of proposed methods. We propose a clinically inspired framework for anomaly-resilient tumor detection and classification. Detection leverages YOLOv8n fine-tuned on a realistically imbalanced dataset (1:9 tumor-to-normal ratio; 30,000 MRI slices from 81 patients). In addition, we propose a novel Patient-to-Patient (PTP) metric that evaluates diagnostic reliability at the patient level. Classification employs knowledge distillation: a Data Efficient Image Transformer (DeiT) student model is distilled from a ResNet152 teacher. The distilled ViT achieves an F1-score of 0.92 within 20 epochs, matching near teacher performance (F1=0.97) with significantly reduced computational resources. This end-to-end framework demonstrates high robustness in clinically representative anomaly-distributed data, offering a viable tool that adheres to realistic situations in clinics.
△ Less
Submitted 1 July, 2025; v1 submitted 6 January, 2024;
originally announced January 2024.
-
Predicting Bone Degradation Using Vision Transformer and Synthetic Cellular Microstructures Dataset
Authors:
Mohammad Saber Hashemi,
Azadeh Sheidaei
Abstract:
Bone degradation, especially for astronauts in microgravity conditions, is crucial for space exploration missions since the lower applied external forces accelerate the diminution in bone stiffness and strength substantially. Although existing computational models help us understand this phenomenon and possibly restrict its effect in the future, they are time-consuming to simulate the changes in t…
▽ More
Bone degradation, especially for astronauts in microgravity conditions, is crucial for space exploration missions since the lower applied external forces accelerate the diminution in bone stiffness and strength substantially. Although existing computational models help us understand this phenomenon and possibly restrict its effect in the future, they are time-consuming to simulate the changes in the bones, not just the bone microstructures, of each individual in detail. In this study, a robust yet fast computational method to predict and visualize bone degradation has been developed. Our deep-learning method, TransVNet, can take in different 3D voxelized images and predict their evolution throughout months utilizing a hybrid 3D-CNN-VisionTransformer autoencoder architecture. Because of limited available experimental data and challenges of obtaining new samples, a digital twin dataset of diverse and initial bone-like microstructures was generated to train our TransVNet on the evolution of the 3D images through a previously developed degradation model for microgravity.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Efficient Cluster Selection for Personalized Federated Learning: A Multi-Armed Bandit Approach
Authors:
Zhou Ni,
Morteza Hashemi
Abstract:
Federated learning (FL) offers a decentralized training approach for machine learning models, prioritizing data privacy. However, the inherent heterogeneity in FL networks, arising from variations in data distribution, size, and device capabilities, poses challenges in user federation. Recognizing this, Personalized Federated Learning (PFL) emphasizes tailoring learning processes to individual dat…
▽ More
Federated learning (FL) offers a decentralized training approach for machine learning models, prioritizing data privacy. However, the inherent heterogeneity in FL networks, arising from variations in data distribution, size, and device capabilities, poses challenges in user federation. Recognizing this, Personalized Federated Learning (PFL) emphasizes tailoring learning processes to individual data profiles. In this paper, we address the complexity of clustering users in PFL, especially in dynamic networks, by introducing a dynamic Upper Confidence Bound (dUCB) algorithm inspired by the multi-armed bandit (MAB) approach. The dUCB algorithm ensures that new users can effectively find the best cluster for their data distribution by balancing exploration and exploitation. The performance of our algorithm is evaluated in various cases, showing its effectiveness in handling dynamic federated learning scenarios.
△ Less
Submitted 29 October, 2023;
originally announced October 2023.
-
LXMERT Model Compression for Visual Question Answering
Authors:
Maryam Hashemi,
Ghazaleh Mahmoudi,
Sara Kodeiri,
Hadi Sheikhi,
Sauleh Eetemadi
Abstract:
Large-scale pretrained models such as LXMERT are becoming popular for learning cross-modal representations on text-image pairs for vision-language tasks. According to the lottery ticket hypothesis, NLP and computer vision models contain smaller subnetworks capable of being trained in isolation to full performance. In this paper, we combine these observations to evaluate whether such trainable subn…
▽ More
Large-scale pretrained models such as LXMERT are becoming popular for learning cross-modal representations on text-image pairs for vision-language tasks. According to the lottery ticket hypothesis, NLP and computer vision models contain smaller subnetworks capable of being trained in isolation to full performance. In this paper, we combine these observations to evaluate whether such trainable subnetworks exist in LXMERT when fine-tuned on the VQA task. In addition, we perform a model size cost-benefit analysis by investigating how much pruning can be done without significant loss in accuracy. Our experiment results demonstrate that LXMERT can be effectively pruned by 40%-60% in size with 3% loss in accuracy.
△ Less
Submitted 23 October, 2023;
originally announced October 2023.
-
PCGPT: Procedural Content Generation via Transformers
Authors:
Sajad Mohaghegh,
Mohammad Amin Ramezan Dehnavi,
Golnoosh Abdollahinejad,
Matin Hashemi
Abstract:
The paper presents the PCGPT framework, an innovative approach to procedural content generation (PCG) using offline reinforcement learning and transformer networks. PCGPT utilizes an autoregressive model based on transformers to generate game levels iteratively, addressing the challenges of traditional PCG methods such as repetitive, predictable, or inconsistent content. The framework models traje…
▽ More
The paper presents the PCGPT framework, an innovative approach to procedural content generation (PCG) using offline reinforcement learning and transformer networks. PCGPT utilizes an autoregressive model based on transformers to generate game levels iteratively, addressing the challenges of traditional PCG methods such as repetitive, predictable, or inconsistent content. The framework models trajectories of actions, states, and rewards, leveraging the transformer's self-attention mechanism to capture temporal dependencies and causal relationships. The approach is evaluated in the Sokoban puzzle game, where the model predicts items that are needed with their corresponding locations. Experimental results on the game Sokoban demonstrate that PCGPT generates more complex and diverse game content. Interestingly, it achieves these results in significantly fewer steps compared to existing methods, showcasing its potential for enhancing game design and online content generation. Our model represents a new PCG paradigm which outperforms previous methods.
△ Less
Submitted 3 October, 2023;
originally announced October 2023.
-
An Efficient Distributed Multi-Agent Reinforcement Learning for EV Charging Network Control
Authors:
Amin Shojaeighadikolaei,
Morteza Hashemi
Abstract:
The increasing trend in adopting electric vehicles (EVs) will significantly impact the residential electricity demand, which results in an increased risk of transformer overload in the distribution grid. To mitigate such risks, there are urgent needs to develop effective EV charging controllers. Currently, the majority of the EV charge controllers are based on a centralized approach for managing i…
▽ More
The increasing trend in adopting electric vehicles (EVs) will significantly impact the residential electricity demand, which results in an increased risk of transformer overload in the distribution grid. To mitigate such risks, there are urgent needs to develop effective EV charging controllers. Currently, the majority of the EV charge controllers are based on a centralized approach for managing individual EVs or a group of EVs. In this paper, we introduce a decentralized Multi-agent Reinforcement Learning (MARL) charging framework that prioritizes the preservation of privacy for EV owners. We employ the Centralized Training Decentralized Execution-Deep Deterministic Policy Gradient (CTDE-DDPG) scheme, which provides valuable information to users during training while maintaining privacy during execution. Our results demonstrate that the CTDE framework improves the performance of the charging network by reducing the network costs. Moreover, we show that the Peak-to-Average Ratio (PAR) of the total demand is reduced, which, in turn, reduces the risk of transformer overload during the peak hours.
△ Less
Submitted 24 August, 2023;
originally announced August 2023.
-
Energy-Efficient Deadline-Aware Edge Computing: Bandit Learning with Partial Observations in Multi-Channel Systems
Authors:
Babak Badnava,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi,
Ness B Shroff
Abstract:
In this paper, we consider a task offloading problem in a multi-access edge computing (MEC) network, in which edge users can either use their local processing unit to compute their tasks or offload their tasks to a nearby edge server through multiple communication channels each with different characteristics. The main objective is to maximize the energy efficiency of the edge users while meeting c…
▽ More
In this paper, we consider a task offloading problem in a multi-access edge computing (MEC) network, in which edge users can either use their local processing unit to compute their tasks or offload their tasks to a nearby edge server through multiple communication channels each with different characteristics. The main objective is to maximize the energy efficiency of the edge users while meeting computing tasks deadlines. In the multi-user multi-channel offloading scenario, users are distributed with partial observations of the system states. We formulate this problem as a stochastic optimization problem and leverage \emph{contextual neural multi-armed bandit} models to develop an energy-efficient deadline-aware solution, dubbed E2DA. The proposed E2DA framework only relies on partial state information (i.e., computation task features) to make offloading decisions. Through extensive numerical analysis, we demonstrate that the E2DA algorithm can efficiently learn an offloading policy and achieve close-to-optimal performance in comparison with several baseline policies that optimize energy consumption and/or response time. Furthermore, we provide a comprehensive set of results on the MEC system performance for various applications such as augmented reality (AR) and virtual reality (VR).
△ Less
Submitted 12 August, 2023;
originally announced August 2023.
-
Exploring the Interplay of Interference and Queues in Unlicensed Spectrum Bands for UAV Networks
Authors:
Masoud Ghazikor,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we present an analytical framework to explore the interplay of signal interference and transmission queue management, and their impacts on the performance of unmanned aerial vehicles (UAVs) when operating in the unlicensed spectrum bands. In particular, we develop a comprehensive framework to investigate the impact of other interference links on the UAV as it communicates with the g…
▽ More
In this paper, we present an analytical framework to explore the interplay of signal interference and transmission queue management, and their impacts on the performance of unmanned aerial vehicles (UAVs) when operating in the unlicensed spectrum bands. In particular, we develop a comprehensive framework to investigate the impact of other interference links on the UAV as it communicates with the ground users. To this end, we provide closed-form expressions for packet drop probabilities in the queue due to buffer overflow or large queuing delay, which are expressed in terms of a transmission policy as a function of the channel fading threshold $β$. The overall packet loss caused either by interference signals or queuing packet drop is obtained, which, in turn, yields in obtaining the expected throughput performance. Through extensive numerical results, we investigate the impact of the channel fading threshold $β$, which plays an important role in balancing the trade-offs between packet loss due to queue drop or transmission error due to large interference levels.
△ Less
Submitted 24 November, 2023; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems
Authors:
Sravan Reddy Chintareddy,
Keenan Roach,
Kenny Cheung,
Morteza Hashemi
Abstract:
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users to opportunistically utilize detected spectrum holes. To this end, we propose a multi-class classification problem for wideband spectrum sensing to detect vacant spectrum spots based on collected I/Q samples. To…
▽ More
In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as the secondary users to opportunistically utilize detected spectrum holes. To this end, we propose a multi-class classification problem for wideband spectrum sensing to detect vacant spectrum spots based on collected I/Q samples. To enhance the accuracy of the spectrum sensing module, the outputs from the multi-class classification by each individual UAV are fused at a server in the unmanned aircraft system traffic management (UTM) ecosystem. In the spectrum scheduling phase, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users (i.e., UAVs). To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using MATLAB LTE toolbox by incorporating base-station~(BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary users channel usage in terms of I/Q samples. This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.
△ Less
Submitted 9 August, 2023;
originally announced August 2023.
-
Detecting individual-level infections using sparse group-testing through graph-coupled hidden Markov models
Authors:
Zahra Gholamalian,
Zeinab Maleki,
MasoudReza Hashemi,
Pouria Ramazi
Abstract:
Identifying the infection status of each individual during infectious diseases informs public health management. However, performing frequent individual-level tests may not be feasible. Instead, sparse and sometimes group-level tests are performed. Determining the infection status of individuals using sparse group-level tests remains an open problem. We have tackled this problem by extending graph…
▽ More
Identifying the infection status of each individual during infectious diseases informs public health management. However, performing frequent individual-level tests may not be feasible. Instead, sparse and sometimes group-level tests are performed. Determining the infection status of individuals using sparse group-level tests remains an open problem. We have tackled this problem by extending graph-coupled hidden Markov models with individuals infection statuses as the hidden states and the group test results as the observations. We fitted the model to simulation datasets using the Gibbs sampling method. The model performed about 0.55 AUC for low testing frequencies and increased to 0.80 AUC in the case where the groups were tested every day. The model was separately tested on a daily basis case to predict the statuses over time and after 15 days of the beginning of the spread, which resulted in 0.98 AUC at day 16 and remained above 0.80 AUC until day 128. Therefore, although dealing with sparse tests remains unsolved, the results open the possibility of using initial group screenings during pandemics to accurately estimate individuals infection statuses.
△ Less
Submitted 4 June, 2023;
originally announced June 2023.
-
Combating Uncertainties in Wind and Distributed PV Energy Sources Using Integrated Reinforcement Learning and Time-Series Forecasting
Authors:
Arman Ghasemi,
Amin Shojaeighadikolaei,
Morteza Hashemi
Abstract:
Renewable energy sources, such as wind and solar power, are increasingly being integrated into smart grid systems. However, when compared to traditional energy resources, the unpredictability of renewable energy generation poses significant challenges for both electricity providers and utility companies. Furthermore, the large-scale integration of distributed energy resources (such as PV systems)…
▽ More
Renewable energy sources, such as wind and solar power, are increasingly being integrated into smart grid systems. However, when compared to traditional energy resources, the unpredictability of renewable energy generation poses significant challenges for both electricity providers and utility companies. Furthermore, the large-scale integration of distributed energy resources (such as PV systems) creates new challenges for energy management in microgrids. To tackle these issues, we propose a novel framework with two objectives: (i) combating uncertainty of renewable energy in smart grid by leveraging time-series forecasting with Long-Short Term Memory (LSTM) solutions, and (ii) establishing distributed and dynamic decision-making framework with multi-agent reinforcement learning using Deep Deterministic Policy Gradient (DDPG) algorithm. The proposed framework considers both objectives concurrently to fully integrate them, while considering both wholesale and retail markets, thereby enabling efficient energy management in the presence of uncertain and distributed renewable energy sources. Through extensive numerical simulations, we demonstrate that the proposed solution significantly improves the profit of load serving entities (LSE) by providing a more accurate wind generation forecast. Furthermore, our results demonstrate that households with PV and battery installations can increase their profits by using intelligent battery charge/discharge actions determined by the DDPG agents.
△ Less
Submitted 27 February, 2023;
originally announced February 2023.
-
Learning Performance-Improving Code Edits
Authors:
Alexander Shypula,
Aman Madaan,
Yimeng Zeng,
Uri Alon,
Jacob Gardner,
Milad Hashemi,
Graham Neubig,
Parthasarathy Ranganathan,
Osbert Bastani,
Amir Yazdanbakhsh
Abstract:
With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the semantics of code. Simultaneously, pretrained large language models (LLMs) have demonstrated strong capabilities at solving a wide range of programming tasks. To t…
▽ More
With the decline of Moore's law, optimizing program performance has become a major focus of software research. However, high-level optimizations such as API and algorithm changes remain elusive due to the difficulty of understanding the semantics of code. Simultaneously, pretrained large language models (LLMs) have demonstrated strong capabilities at solving a wide range of programming tasks. To that end, we introduce a framework for adapting LLMs to high-level program optimization. First, we curate a dataset of performance-improving edits made by human programmers of over 77,000 competitive C++ programming submission pairs, accompanied by extensive unit tests. A major challenge is the significant variability of measuring performance on commodity hardware, which can lead to spurious "improvements." To isolate and reliably evaluate the impact of program optimizations, we design an environment based on the gem5 full system simulator, the de facto simulator used in academia and industry. Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play. A combination of these techniques achieves a mean speedup of 6.86 with eight generations, higher than average optimizations from individual programmers (3.66). Using our model's fastest generations, we set a new upper limit on the fastest speedup possible for our dataset at 9.64 compared to using the fastest human submissions available (9.56).
△ Less
Submitted 26 April, 2024; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Understanding User Preferences in Explainable Artificial Intelligence: A Survey and a Mapping Function Proposal
Authors:
Maryam Hashemi,
Ali Darejeh,
Francisco Cruz
Abstract:
The increasing complexity of AI systems has led to the growth of the field of Explainable Artificial Intelligence (XAI), which aims to provide explanations and justifications for the outputs of AI algorithms. While there is considerable demand for XAI, there remains a scarcity of studies aimed at comprehensively understanding the practical distinctions among different methods and effectively align…
▽ More
The increasing complexity of AI systems has led to the growth of the field of Explainable Artificial Intelligence (XAI), which aims to provide explanations and justifications for the outputs of AI algorithms. While there is considerable demand for XAI, there remains a scarcity of studies aimed at comprehensively understanding the practical distinctions among different methods and effectively aligning each method with users individual needs, and ideally, offer a mapping function which can map each user with its specific needs to a method of explainability. This study endeavors to bridge this gap by conducting a thorough review of extant research in XAI, with a specific focus on Explainable Machine Learning (XML), and a keen eye on user needs. Our main objective is to offer a classification of XAI methods within the realm of XML, categorizing current works into three distinct domains: philosophy, theory, and practice, and providing a critical review for each category. Moreover, our study seeks to facilitate the connection between XAI users and the most suitable methods for them and tailor explanations to meet their specific needs by proposing a mapping function that take to account users and their desired properties and suggest an XAI method to them. This entails an examination of prevalent XAI approaches and an evaluation of their properties. The primary outcome of this study is the formulation of a clear and concise strategy for selecting the optimal XAI method to achieve a given goal, all while delivering personalized explanations tailored to individual users.
△ Less
Submitted 19 June, 2024; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Distributed Energy Management and Demand Response in Smart Grids: A Multi-Agent Deep Reinforcement Learning Framework
Authors:
Amin Shojaeighadikolaei,
Arman Ghasemi,
Kailani Jones,
Yousif Dafalla,
Alexandru G. Bardas,
Reza Ahmadi,
Morteza Haashemi
Abstract:
This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems. In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users. DR has a widely recognized potential for improving power grid stability and re…
▽ More
This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems. In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users. DR has a widely recognized potential for improving power grid stability and reliability, while at the same time reducing end-users energy bills. However, the conventional DR techniques come with several shortcomings, such as the inability to handle operational uncertainties while incurring end-user disutility, which prevents widespread adoption in real-world applications. The proposed framework addresses these shortcomings by implementing DR and DEM based on real-time pricing strategy that is achieved using deep reinforcement learning. Furthermore, this framework enables the power grid service provider to leverage distributed energy resources (i.e., PV rooftop panels and battery storage) as dispatchable assets to support the smart grid during peak hours, thus achieving management of distributed energy resources. Simulation results based on the Deep Q-Network (DQN) demonstrate significant improvements of the 24-hour accumulative profit for both prosumers and the power grid service provider, as well as major reductions in the utilization of the power grid reserve generators.
△ Less
Submitted 28 November, 2022;
originally announced November 2022.
-
Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks
Authors:
Sadegh Mahdavi,
Kevin Swersky,
Thomas Kipf,
Milad Hashemi,
Christos Thrampoulidis,
Renjie Liao
Abstract:
In this paper, we study the OOD generalization of neural algorithmic reasoning tasks, where the goal is to learn an algorithm (e.g., sorting, breadth-first search, and depth-first search) from input-output pairs using deep neural networks. First, we argue that OOD generalization in this setting is significantly different than common OOD settings. For example, some phenomena in OOD generalization o…
▽ More
In this paper, we study the OOD generalization of neural algorithmic reasoning tasks, where the goal is to learn an algorithm (e.g., sorting, breadth-first search, and depth-first search) from input-output pairs using deep neural networks. First, we argue that OOD generalization in this setting is significantly different than common OOD settings. For example, some phenomena in OOD generalization of image classifications such as \emph{accuracy on the line} are not observed here, and techniques such as data augmentation methods do not help as assumptions underlying many augmentation techniques are often violated. Second, we analyze the main challenges (e.g., input distribution shift, non-representative data generation, and uninformative validation metrics) of the current leading benchmark, i.e., CLRS \citep{deepmind2021clrs}, which contains 30 algorithmic reasoning tasks. We propose several solutions, including a simple-yet-effective fix to the input distribution shift and improved data generation. Finally, we propose an attention-based 2WL-graph neural network (GNN) processor which complements message-passing GNNs so their combination outperforms the state-of-the-art model by a 3% margin averaged over all algorithms. Our code is available at: \url{https://github.com/smahdavi4/clrs}.
△ Less
Submitted 18 March, 2023; v1 submitted 1 November, 2022;
originally announced November 2022.
-
Connective Reconstruction-based Novelty Detection
Authors:
Seyyed Morteza Hashemi,
Parvaneh Aliniya,
Parvin Razzaghi
Abstract:
Detection of out-of-distribution samples is one of the critical tasks for real-world applications of computer vision. The advancement of deep learning has enabled us to analyze real-world data which contain unexplained samples, accentuating the need to detect out-of-distribution instances more than before. GAN-based approaches have been widely used to address this problem due to their ability to p…
▽ More
Detection of out-of-distribution samples is one of the critical tasks for real-world applications of computer vision. The advancement of deep learning has enabled us to analyze real-world data which contain unexplained samples, accentuating the need to detect out-of-distribution instances more than before. GAN-based approaches have been widely used to address this problem due to their ability to perform distribution fitting; however, they are accompanied by training instability and mode collapse. We propose a simple yet efficient reconstruction-based method that avoids adding complexities to compensate for the limitations of GAN models while outperforming them. Unlike previous reconstruction-based works that only utilize reconstruction error or generated samples, our proposed method simultaneously incorporates both of them in the detection task. Our model, which we call "Connective Novelty Detection" has two subnetworks, an autoencoder, and a binary classifier. The autoencoder learns the representation of the positive class by reconstructing them. Then, the model creates negative and connected positive examples using real and generated samples. Negative instances are generated via manipulating the real data, so their distribution is close to the positive class to achieve a more accurate boundary for the classifier. To boost the robustness of the detection to reconstruction error, connected positive samples are created by combining the real and generated samples. Finally, the binary classifier is trained using connected positive and negative examples. We demonstrate a considerable improvement in novelty detection over state-of-the-art methods on MNIST and Caltech-256 datasets.
△ Less
Submitted 25 October, 2022;
originally announced October 2022.
-
CUF: Continuous Upsampling Filters
Authors:
Cristina Vasconcelos,
Cengiz Oztireli,
Mark Matthews,
Milad Hashemi,
Kevin Swersky,
Andrea Tagliasacchi
Abstract:
Neural fields have rapidly been adopted for representing 3D signals, but their application to more classical 2D image-processing has been relatively limited. In this paper, we consider one of the most important operations in image processing: upsampling. In deep learning, learnable upsampling layers have extensively been used for single image super-resolution. We propose to parameterize upsampling…
▽ More
Neural fields have rapidly been adopted for representing 3D signals, but their application to more classical 2D image-processing has been relatively limited. In this paper, we consider one of the most important operations in image processing: upsampling. In deep learning, learnable upsampling layers have extensively been used for single image super-resolution. We propose to parameterize upsampling kernels as neural fields. This parameterization leads to a compact architecture that obtains a 40-fold reduction in the number of parameters when compared with competing arbitrary-scale super-resolution architectures. When upsampling images of size 256x256 we show that our architecture is 2x-10x more efficient than competing arbitrary-scale super-resolution architectures, and more efficient than sub-pixel convolutions when instantiated to a single-scale model. In the general setting, these gains grow polynomially with the square of the target scale. We validate our method on standard benchmarks showing such efficiency gains can be achieved without sacrifices in super-resolution performance.
△ Less
Submitted 20 October, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Learning to Improve Code Efficiency
Authors:
Binghong Chen,
Daniel Tarlow,
Kevin Swersky,
Martin Maas,
Pablo Heiber,
Ashish Naik,
Milad Hashemi,
Parthasarathy Ranganathan
Abstract:
Improvements in the performance of computing systems, driven by Moore's Law, have transformed society. As such hardware-driven gains slow down, it becomes even more important for software developers to focus on performance and efficiency during development. While several studies have demonstrated the potential from such improved code efficiency (e.g., 2x better generational improvements compared t…
▽ More
Improvements in the performance of computing systems, driven by Moore's Law, have transformed society. As such hardware-driven gains slow down, it becomes even more important for software developers to focus on performance and efficiency during development. While several studies have demonstrated the potential from such improved code efficiency (e.g., 2x better generational improvements compared to hardware), unlocking these gains in practice has been challenging. Reasoning about algorithmic complexity and the interaction of coding patterns on hardware can be challenging for the average programmer, especially when combined with pragmatic constraints around development velocity and multi-person development.
This paper seeks to address this problem. We analyze a large competitive programming dataset from the Google Code Jam competition and find that efficient code is indeed rare, with a 2x runtime difference between the median and the 90th percentile of solutions. We propose using machine learning to automatically provide prescriptive feedback in the form of hints, to guide programmers towards writing high-performance code. To automatically learn these hints from the dataset, we propose a novel discrete variational auto-encoder, where each discrete latent variable represents a different learned category of code-edit that increases performance. We show that this method represents the multi-modal space of code efficiency edits better than a sequence-to-sequence baseline and generates a distribution of more efficient solutions.
△ Less
Submitted 8 August, 2022;
originally announced August 2022.
-
Linking Properties to Microstructure in Liquid Metal Embedded Elastomers via Machine Learning
Authors:
Abhijith Thoopul Anantharanga,
Mohammad Saber Hashemi,
Azadeh Sheidaei
Abstract:
Liquid metals (LM) are embedded in an elastomer matrix to obtain soft composites with unique thermal, dielectric, and mechanical properties. They have applications in soft robotics, biomedical engineering, and wearable electronics. By linking the structure to the properties of these materials, it is possible to perform material design rationally. Liquid-metal embedded elastomers (LMEEs) have been…
▽ More
Liquid metals (LM) are embedded in an elastomer matrix to obtain soft composites with unique thermal, dielectric, and mechanical properties. They have applications in soft robotics, biomedical engineering, and wearable electronics. By linking the structure to the properties of these materials, it is possible to perform material design rationally. Liquid-metal embedded elastomers (LMEEs) have been designed for targeted electro-thermo-mechanical properties by semi-supervised learning of structure-property (SP) links in a variational autoencoder network (VAE). The design parameters are the microstructural descriptors that are physically meaningful and have affine relationships with the synthetization of the studied particulate composite. The machine learning (ML) model is trained on a generated dataset of microstructural descriptors with their multifunctional property quantities as their labels. Sobol sequence is used for in-silico Design of Experiment (DoE) by sampling the design space to generate a comprehensive dataset of 3D microstructure realizations via a packing algorithm. The mechanical responses of the generated microstructures are simulated using a previously developed Finite Element (FE) model, considering the surface tension induced by LM inclusions, while the linear thermal and dielectric constants are homogenized with the help of our in-house Fast Fourier Transform (FFT) package. Following the training by minimization of an appropriate loss function, the VAE encoder acts as the surrogate of numerical solvers of the multifunctional homogenizations, and its decoder is used for the material design. Our results indicate the satisfactory performance of the surrogate model and the inverse calculator with respect to high-fidelity numerical simulations validated with LMEE experimental results.
△ Less
Submitted 24 July, 2022;
originally announced August 2022.
-
Garbled EDA: Privacy Preserving Electronic Design Automation
Authors:
Mohammad Hashemi,
Steffi Roy,
Fatemeh Ganji,
Domenic Forte
Abstract:
The complexity of modern integrated circuits (ICs) necessitates collaboration between multiple distrusting parties, including thirdparty intellectual property (3PIP) vendors, design houses, CAD/EDA tool vendors, and foundries, which jeopardizes confidentiality and integrity of each party's IP. IP protection standards and the existing techniques proposed by researchers are ad hoc and vulnerable to…
▽ More
The complexity of modern integrated circuits (ICs) necessitates collaboration between multiple distrusting parties, including thirdparty intellectual property (3PIP) vendors, design houses, CAD/EDA tool vendors, and foundries, which jeopardizes confidentiality and integrity of each party's IP. IP protection standards and the existing techniques proposed by researchers are ad hoc and vulnerable to numerous structural, functional, and/or side-channel attacks. Our framework, Garbled EDA, proposes an alternative direction through formulating the problem in a secure multi-party computation setting, where the privacy of IPs, CAD tools, and process design kits (PDKs) is maintained. As a proof-of-concept, Garbled EDA is evaluated in the context of simulation, where multiple IP description formats (Verilog, C, S) are supported. Our results demonstrate a reasonable logical-resource cost and negligible memory overhead. To further reduce the overhead, we present another efficient implementation methodology, feasible when the resource utilization is a bottleneck, but the communication between two parties is not restricted. Interestingly, this implementation is private and secure even in the presence of malicious adversaries attempting to, e.g., gain access to PDKs or in-house IPs of the CAD tool providers.
△ Less
Submitted 7 August, 2022;
originally announced August 2022.