-
Forecasting Empty Container availability for Vehicle Booking System Application
Authors:
Arthur Cartel Foahom Gouabou,
Mohammed Al-Kharaz,
Faouzi Hakimi,
Tarek Khaled,
Kenza Amzil
Abstract:
Container terminals, pivotal nodes in the network of empty container movement, hold significant potential for enhancing operational efficiency within terminal depots through effective collaboration between transporters and terminal operators. This collaboration is crucial for achieving optimization, leading to streamlined operations and reduced congestion, thereby benefiting both parties. Consequently, there is a pressing need to develop the most suitable forecasting approaches to address this challenge. This study focuses on developing and evaluating a data-driven approach for forecasting empty container availability at container terminal depots within a Vehicle Booking System (VBS) framework. It addresses the gap in research concerning optimizing empty container dwell time and aims to enhance operational efficiencies in container terminal operations. Four forecasting models (Naive, ARIMA, Prophet, and LSTM) are comprehensively analyzed for their predictive capabilities, with LSTM emerging as the top performer due to its ability to capture complex time series patterns. The research underscores the significance of selecting appropriate forecasting techniques tailored to the specific requirements of container terminal operations, contributing to improved operational planning and management in maritime logistics.
Submitted 14 March, 2025;
originally announced March 2025.
-
Comparative Study of the Median Based Unit Rayleigh and its Generalized Form the Generalized Odd Median Based Unit Rayleigh
Authors:
Iman Mohammed Attia
Abstract:
In the present paper, the author discusses the Generalized Odd Median Based Unit Rayleigh (GOMBUR) in relation to the Median Based Unit Rayleigh (MBUR) to evaluate the added value of the new shape parameter for the estimation process with regard to validity indices, goodness-of-fit statistics, estimated variances of the estimated parameters, and their standard errors. This evaluation is conducted on real datasets. Each dataset is analyzed by fitting different competitor distributions in addition to the MBUR and GOMBUR distributions. Parameter estimation is carried out by maximum likelihood estimation (MLE) using the Nelder-Mead optimizer.
Submitted 12 March, 2025;
originally announced March 2025.
-
The New Generalized Odd Median Based Unit Rayleigh with a New Shape Oscillating Hazard Rate Function
Authors:
Iman Attia
Abstract:
In this paper, the author presents the generalized form of the Median-Based Unit Rayleigh (MBUR) distribution, a novel statistical distribution that is defined on the interval (0, 1) and exhibits an oscillating hazard rate function. This generalization adds a new parameter to the MBUR distribution that addresses the unique characteristics of data represented as ratios and proportions, which are commonly encountered in various fields of research. The generalization aims to deepen our understanding of these phenomena by providing a robust framework for analysis. The paper offers a thorough derivation of the probability density function (PDF) for the MBUR distribution, setting out each phase of the process.
Submitted 24 February, 2025;
originally announced March 2025.
-
mobilityDCAT-AP: a Metadata Specification for Enhanced Cross-border Mobility Data Sharing
Authors:
Mario Scrocca,
Lina Molinas Comet,
Benjamin Witsch,
Daham Mohammed Mustafa,
Christoph Lange,
Marco Comerio,
Peter Lubrich
Abstract:
Integrated and efficient mobility requires data sharing among the involved stakeholders. In this direction, regulators and transport authorities have been defining policies to foster the digitalisation and online publication of mobility data. However, the creation of several heterogeneous data portals for mobility data resulted in a fragmented ecosystem that challenges data accessibility. In this context, metadata is a key enabler to foster the findability and reusability of relevant datasets, but their interoperability across different data portals should be ensured. Moreover, each domain presents specificities on the relevant information that should be encoded through metadata. To solve these issues within the mobility domain, we present mobilityDCAT-AP, a reference metadata specification for mobility data portals specified by putting together domain experts and the Semantic Web community. We report on the work done to develop the metadata model behind mobilityDCAT-AP and the best practices followed in its implementation and publication. Finally, we describe the available educational resources and the activities performed to ensure broader adoption of mobilityDCAT-AP across mobility data portals. We present success stories from early adopters and discuss the challenges they encountered in implementing a metadata specification based on Semantic Web technologies.
Submitted 14 March, 2025;
originally announced March 2025.
-
Step-by-Step Data Cleaning Recommendations to Improve ML Prediction Accuracy
Authors:
Sedir Mohammed,
Felix Naumann,
Hazar Harmouch
Abstract:
Data quality is crucial in machine learning (ML) applications, as errors in the data can significantly impact the prediction accuracy of the underlying ML model. Therefore, data cleaning is an integral component of any ML pipeline. However, in practical scenarios, data cleaning incurs significant costs, as it often involves domain experts for configuring and executing the cleaning process. Thus, efficient resource allocation during data cleaning can enhance ML prediction accuracy while controlling expenses.
This paper presents COMET, a system designed to optimize data cleaning efforts for ML tasks. COMET gives step-by-step recommendations on which feature to clean next, maximizing the efficiency of data cleaning under resource constraints. We evaluated COMET across various datasets, ML algorithms, and data error types, demonstrating its robustness and adaptability. Our results show that COMET consistently outperforms feature importance-based, random, and another well-known cleaning method, achieving up to 52 and on average 5 percentage points higher ML prediction accuracy than the proposed baselines.
Submitted 14 March, 2025;
originally announced March 2025.
-
Safe Control of Second-Order Systems with Linear Constraints
Authors:
Mohammed Alyaseen,
Nikolay Atanasov,
Jorge Cortes
Abstract:
Control barrier functions (CBFs) offer a powerful tool for enforcing safety specifications in control synthesis. This paper deals with the problem of constructing valid CBFs. Given a second-order system and any desired safety set with linear boundaries in the position space, we construct a provably control-invariant subset of this desired safety set. The constructed subset does not sacrifice any positions allowed by the desired safety set, which can be nonconvex. We show how our construction can also meet safety specification on the velocity. We then demonstrate that if the system satisfies standard Euler-Lagrange systems properties then our construction can also handle constraints on the allowable control inputs. We finally show the efficacy of the proposed method in a numerical example of keeping a 2D robot arm safe from collision.
Submitted 13 March, 2025;
originally announced March 2025.
-
Better Together: Unified Motion Capture and 3D Avatar Reconstruction
Authors:
Arthur Moreau,
Mohammed Brahimi,
Richard Shaw,
Athanasios Papaioannou,
Thomas Tanay,
Zhensong Zhang,
Eduardo Pérez-Pellitero
Abstract:
We present Better Together, a method that simultaneously solves the human pose estimation problem while reconstructing a photorealistic 3D human avatar from multi-view videos. While prior art usually solves these problems separately, we argue that joint optimization of skeletal motion with a 3D renderable body model brings synergistic effects, i.e. yields more precise motion capture and improved visual quality of real-time rendering of avatars. To achieve this, we introduce a novel animatable avatar with 3D Gaussians rigged on a personalized mesh and propose to optimize the motion sequence with time-dependent MLPs that provide accurate and temporally consistent pose estimates. We first evaluate our method on highly challenging yoga poses and demonstrate state-of-the-art accuracy on multi-view human pose estimation, reducing error by 35% on body joints and 45% on hand joints compared to keypoint-based methods. At the same time, our method significantly boosts the visual quality of animatable avatars (+2dB PSNR on novel view synthesis) on diverse challenging subjects.
Submitted 12 March, 2025;
originally announced March 2025.
-
DG16M: A Large-Scale Dataset for Dual-Arm Grasping with Force-Optimized Grasps
Authors:
Md Faizal Karim,
Mohammed Saad Hashmi,
Shreya Bollimuntha,
Mahesh Reddy Tapeti,
Gaurav Singh,
Nagamanikandan Govindan,
K Madhava Krishna
Abstract:
Dual-arm robotic grasping is crucial for handling large objects that require stable and coordinated manipulation. While single-arm grasping has been extensively studied, datasets tailored for dual-arm settings remain scarce. We introduce a large-scale dataset of 16 million dual-arm grasps, evaluated under improved force-closure constraints. Additionally, we develop a benchmark dataset containing 300 objects with approximately 30,000 grasps, evaluated in a physics simulation environment, providing a better grasp quality assessment for dual-arm grasp synthesis methods. Finally, we demonstrate the effectiveness of our dataset by training a Dual-Arm Grasp Classifier network that outperforms the state-of-the-art methods by 15%, achieving higher grasp success rates and improved generalization across objects.
Submitted 30 June, 2025; v1 submitted 11 March, 2025;
originally announced March 2025.
-
Non-homogeneous problem for the fractional wave equation with irregular coefficients and data
Authors:
Manel Bouguenna,
Mohammed Elamine Sebih
Abstract:
In this paper, we consider the Cauchy problem for a non-homogeneous wave equation generated by the fractional Laplacian and involving different kinds of lower order terms. We allow the equation coefficients and data to be of distributional type or less regular, having in mind the Dirac delta function and its powers, and we prove that the problem is well-posed in the sense of the concept of very weak solutions. Moreover, we prove the uniqueness in an appropriate sense and the coherence of the very weak solution concept with classical theory.
Submitted 12 March, 2025; v1 submitted 10 March, 2025;
originally announced March 2025.
-
On the Importance of Clearsky Model in Short-Term Solar Radiation Forecasting
Authors:
Cyril Voyant,
Milan Despotovic,
Gilles Notton,
Yves-Marie Saint-Drenan,
Mohammed Asloune,
Luis Garcia-Gutierrez
Abstract:
Clearsky models are widely used in solar energy for many applications such as quality control, resource assessment, satellite-based irradiance estimation and forecasting. However, their use in forecasting and nowcasting is associated with a number of challenges. Synchronization errors, reliance on the Clearsky index (ratio of the global horizontal irradiance to its cloud-free counterpart) and high sensitivity of the clearsky model to errors in aerosol optical depth at low solar elevation limit their added value in real-time applications. This paper explores the feasibility of short-term forecasting without relying on a clearsky model. We propose a Clearsky-Free forecasting approach using Extreme Learning Machine (ELM) models. ELM learns daily periodicity and local variability directly from raw Global Horizontal Irradiance (GHI) data. It eliminates the need for Clearsky normalization, simplifying the forecasting process and improving scalability. Our approach is a non-linear adaptive statistical method that implicitly learns the irradiance in cloud-free conditions, removing the need for a clearsky model and the related operational issues. Deterministic and probabilistic results are compared to traditional benchmarks, including ARMA with McClear-generated Clearsky data and quantile regression for probabilistic forecasts. ELM matches or outperforms these methods, providing accurate predictions and robust uncertainty quantification. This approach offers a simple, efficient solution for real-time solar forecasting. By overcoming the limitations of the stationarization process based on the usual multiplicative Clearsky scheme, it provides a flexible and reliable framework for modern energy systems.
Submitted 6 March, 2025;
originally announced March 2025.
-
From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper
Authors:
Sargam Yadav,
Asifa Mehmood Qureshi,
Abhishek Kaushik,
Shubham Sharma,
Roisin Loughran,
Subramaniam Kazhuparambil,
Andrew Shaw,
Mohammed Sabry,
Niamh St John Lynch,
Nikhil Singh,
Padraic O'Hara,
Pranay Jaiswal,
Roshan Chandru,
David Lillis
Abstract:
The introduction of transformer architecture was a turning point in Natural Language Processing (NLP). Models based on the transformer architecture such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformer (GPT) have gained widespread popularity in various applications such as software development and education. The availability of Large Language Models (LLMs) such as ChatGPT and Bard to the general public has showcased the tremendous potential of these models and encouraged their integration into various domains such as software development for tasks such as code generation, debugging, and documentation generation. In this study, opinions from 11 experts regarding their experience with LLMs for software development have been gathered and analysed to draw insights that can guide successful and responsible integration. The overall opinion of the experts is positive, with the experts identifying advantages such as increase in productivity and reduced coding time. Potential concerns and challenges such as risk of over-dependence and ethical considerations have also been highlighted.
Submitted 13 June, 2025; v1 submitted 10 March, 2025;
originally announced March 2025.
-
Learning Decision Trees as Amortized Structure Inference
Authors:
Mohammed Mahfoud,
Ghait Boukachab,
Michał Koziarski,
Alex Hernandez-Garcia,
Stefan Bauer,
Yoshua Bengio,
Nikolay Malkin
Abstract:
Building predictive models for tabular data presents fundamental challenges, notably in scaling consistently, i.e., more resources translating to better performance, and generalizing systematically beyond the training data distribution. Designing decision tree models remains especially challenging given the intractably large search space, and most existing methods rely on greedy heuristics, while deep learning inductive biases expect a temporal or spatial structure not naturally present in tabular data. We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data, formulating decision tree construction as a sequential planning problem. We train a deep reinforcement learning (GFlowNet) policy to solve this problem, yielding a generative model that samples decision trees from the Bayesian posterior. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks derived from real-world data, robustness to distribution shifts, and anomaly detection, all while yielding interpretable models with shorter description lengths. Samples from the trained DT-GFN model can be ensembled to construct a random forest, and we further show that performance scales consistently with ensemble size, yielding ensembles of predictors that continue to generalize systematically.
Submitted 10 March, 2025;
originally announced March 2025.
-
TPU-Gen: LLM-Driven Custom Tensor Processing Unit Generator
Authors:
Deepak Vungarala,
Mohammed E. Elbtity,
Sumiya Syed,
Sakila Alam,
Kartik Pandit,
Arnob Ghosh,
Ramtin Zand,
Shaahin Angizi
Abstract:
The increasing complexity and scale of Deep Neural Networks (DNNs) necessitate specialized tensor accelerators, such as Tensor Processing Units (TPUs), to meet various computational and energy efficiency requirements. Nevertheless, designing an optimal TPU remains challenging due to the high level of domain expertise required, considerable manual design time, and lack of high-quality, domain-specific datasets. This paper introduces TPU-Gen, the first Large Language Model (LLM) based framework designed to automate the exact and approximate TPU generation process, focusing on systolic array architectures. TPU-Gen is supported by a meticulously curated, comprehensive, and open-source dataset that covers a wide range of spatial array designs and approximate multiply-and-accumulate units, enabling design reuse, adaptation, and customization for different DNN workloads. The proposed framework leverages Retrieval-Augmented Generation (RAG) as an effective solution for building LLMs in a data-scarce hardware domain, addressing the most intriguing issue, hallucinations. TPU-Gen transforms high-level architectural specifications into optimized low-level implementations through an effective hardware generation pipeline. Our extensive experimental evaluations demonstrate superior performance, power, and area efficiency, with an average reduction in area and power of 92% and 96% from the manual optimization reference values. These results set new standards for driving advancements in next-generation design automation tools powered by LLMs.
Submitted 7 March, 2025;
originally announced March 2025.
-
The Impact of Building-Induced Visibility Restrictions on Intersection Accidents
Authors:
Hanlin Tian,
Yuxiang Feng,
Wei Zhou,
Anupriya,
Mohammed Quddus,
Yiannis Demiris,
Panagiotis Angeloudis
Abstract:
Traffic accidents, especially at intersections, are a major road safety concern. Previous research has extensively studied intersection-related accidents, but the effect of building-induced visibility restrictions at intersections on accident rates has been under-explored, particularly in urban contexts. Using OpenStreetMap data, the UK's geographic and accident datasets, and the UK Traffic Count Dataset, we formulated a novel approach to estimate accident risk at intersections. This method factors in the area visible to drivers, accounting for views blocked by buildings - a distinctive aspect in traffic accident analysis. Our findings reveal a notable correlation between the road visible percentage and accident frequency. In the model, the coefficient for "road visible percentage" is 1.7450, implying a strong positive relationship. Incorporating this visibility factor enhances the model's explanatory power, with increased R-square values and reduced AIC and BIC, indicating a better data fit. This study underscores the essential role of architectural layouts in road safety and suggests that urban planning strategies should consider building-induced visibility restrictions. Such consideration could be an effective approach to mitigate accident rates at intersections. This research opens up new avenues for innovative, data-driven urban planning and traffic management strategies, highlighting the importance of visibility enhancements for safer roads.
Submitted 13 February, 2025;
originally announced March 2025.
-
Cognitive Bias Detection Using Advanced Prompt Engineering
Authors:
Frederic Lemieux,
Aisha Behr,
Clara Kellermann-Bryant,
Zaki Mohammed
Abstract:
Cognitive biases, systematic deviations from rationality in judgment, pose significant challenges in generating objective content. This paper introduces a novel approach for real-time cognitive bias detection in user-generated text using large language models (LLMs) and advanced prompt engineering techniques. The proposed system analyzes textual data to identify common cognitive biases such as confirmation bias, circular reasoning, and hidden assumption. By designing tailored prompts, the system effectively leverages LLMs' capabilities to both recognize and mitigate these biases, improving the quality of human-generated content (e.g., news, media, reports). Experimental results demonstrate the high accuracy of our approach in identifying cognitive biases, offering a valuable tool for enhancing content objectivity and reducing the risks of biased decision-making.
Submitted 7 March, 2025;
originally announced March 2025.
-
Characterizing the positive inertia index of connected signed graphs in terms of girth
Authors:
Suliman Khan,
Sakander Hayat,
Mohammed J. F. Alenazi
Abstract:
Let $G^σ=(G,σ)$ be a connected signed graph and $A(G^σ)$ be its adjacency matrix. The positive inertia index of $G^σ$, denoted by $p^{+}(G^σ)$, is defined as the number of positive eigenvalues of $A(G^σ)$. Assume that $G^σ$ contains at least one cycle, and let $g_{r}$ be its girth. In this paper, we prove $p^{+}(G^σ) \geq \lceil \frac {g_{r}}{2} \rceil-1$ for a signed graph $G^σ$. The extremal signed graphs corresponding to $p^{+}(G^σ) = \lceil \frac {g_{r}}{2} \rceil-1$ and $p^{+}(G^σ) =\lceil \frac {g_{r}}{2} \rceil$ are characterized, respectively. The results presented in this article extend the recent work on ordinary graphs by Duan and Yang (Linear Algebra Appl., 2024) to the context of signed graphs.
Submitted 6 March, 2025;
originally announced March 2025.
-
Magnetic Phase Transitions and Mixed Spin in Double Perovskite $Sr_{2}FeMoO_{6}$
Authors:
Said Khaireddine,
Redouane Assad,
Mohammed El Falaki,
Rachid Ahl Lamara,
Lalla Btissam Drissi
Abstract:
The magnetic properties of the double perovskite oxide $Sr_{2}FeMoO_{6}$ are analyzed using a mixed-spin Ising model with spins $\left( \frac{1}{2},\frac{5}{2}\right)$ in the presence of a random crystal field $Δ$ and exchange interactions $J$ on a three-dimensional (3D) cubic lattice. The study employs both the Mean-Field Approximation (MFA) based on the Bogoliubov inequality for Gibbs free energy and Monte Carlo (MC) simulations using the Metropolis algorithm to provide a comprehensive analysis of the system's phase transitions and magnetization behavior, with a focus on the role of the Fe and Mo sublattices. We establish the ground-state phase diagram, identifying multiple stable magnetic configurations and first-order transitions at low temperatures. The compensation temperature $T_{comp}$ is also indicated. This work provides deeper insight into the thermodynamic, statistical-physics, and magnetic properties of $Sr_{2}FeMoO_{6}$, with implications for future applications in spintronics and magnetic storage technologies.
Submitted 6 March, 2025;
originally announced March 2025.
-
Large Language Models in Healthcare
Authors:
Mohammed Al-Garadi,
Tushar Mungle,
Abdulaziz Ahmed,
Abeed Sarker,
Zhuqi Miao,
Michael E. Matheny
Abstract:
Large language models (LLMs) hold promise for transforming healthcare, from streamlining administrative and clinical workflows to enriching patient engagement and advancing clinical decision-making. However, their successful integration requires rigorous development, adaptation, and evaluation strategies tailored to clinical needs. In this Review, we highlight recent advancements, explore emerging opportunities for LLM-driven innovation, and propose a framework for their responsible implementation in healthcare settings. We examine strategies for adapting LLMs to domain-specific healthcare tasks, such as fine-tuning, prompt engineering, and multimodal integration with electronic health records. We also summarize various evaluation metrics tailored to healthcare, addressing clinical accuracy, fairness, robustness, and patient outcomes. Furthermore, we discuss the challenges associated with deploying LLMs in healthcare--including data privacy, bias mitigation, regulatory compliance, and computational sustainability--and underscore the need for interdisciplinary collaboration. Finally, these challenges present promising future research directions for advancing LLM implementation in clinical settings and healthcare.
Submitted 2 April, 2025; v1 submitted 6 February, 2025;
originally announced March 2025.
-
LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
Authors:
Sambal Shikhar,
Mohammed Irfan Kurpath,
Sahal Shaji Mullappilly,
Jean Lahoud,
Fahad Khan,
Rao Muhammad Anwer,
Salman Khan,
Hisham Cholakkal
Abstract:
Recent advancements in speech-to-speech dialogue systems leverage LLMs for multimodal interactions, yet they remain hindered by fine-tuning requirements, high computational overhead, and text-speech misalignment. Existing speech-enabled LLMs often degrade conversational quality by modifying the LLM, thereby compromising its linguistic capabilities. In contrast, we propose LLMVoX, a lightweight 30M-parameter, LLM-agnostic, autoregressive streaming TTS system that generates high-quality speech with low latency, while fully preserving the capabilities of the base LLM. Our approach achieves a significantly lower Word Error Rate compared to speech-enabled LLMs, while operating at comparable latency and UTMOS score. By decoupling speech synthesis from LLM processing via a multi-queue token streaming system, LLMVoX supports seamless, infinite-length dialogues. Its plug-and-play design also facilitates extension to various tasks with different backbones. Furthermore, LLMVoX generalizes to new languages with only dataset adaptation, attaining a low Character Error Rate on an Arabic speech task. Additionally, we have integrated LLMVoX with a Vision-Language Model to create an omni-model with speech, text, and vision capabilities, without requiring additional multimodal training. Our code base and project page are available at https://mbzuai-oryx.github.io/LLMVoX .
Submitted 6 March, 2025;
originally announced March 2025.
-
Kantorovich duality for optimal transport on completely regular Hausdorff spaces
Authors:
Mohammed Bachir
Abstract:
We introduce a new intermediate optimization problem situated between Kantorovich's primal and dual formulations. This new problem extends Kantorovich's duality to separable Baire measures, which are strictly more general than tight (or Radon) measures in completely regular Hausdorff spaces. In the special case where the measures are Radon, our intermediate problem aligns with Kantorovich's classical primal problem. Existence of solutions for all three formulations is also established within this comprehensive framework.
Submitted 19 June, 2025; v1 submitted 5 March, 2025;
originally announced March 2025.
-
Multi-Agent DRL for Queue-Aware Task Offloading in Hierarchical MEC-Enabled Air-Ground Networks
Authors:
Muhammet Hevesli,
Abegaz Mohammed Seid,
Aiman Erbad,
Mohamed Abdallah
Abstract:
Mobile edge computing (MEC)-enabled air-ground networks are a key component of 6G, employing aerial base stations (ABSs) such as unmanned aerial vehicles (UAVs) and high-altitude platform stations (HAPS) to provide dynamic services to ground IoT devices (IoTDs). These IoTDs support real-time applications (e.g., multimedia and Metaverse services) that demand high computational resources and strict quality of service (QoS) guarantees in terms of latency and task queue management. Given their limited energy and processing capabilities, IoTDs rely on UAVs and HAPS to offload tasks for distributed processing, forming a multi-tier MEC system. This paper tackles the overall energy minimization problem in MEC-enabled air-ground integrated networks (MAGIN) by jointly optimizing UAV trajectories, computing resource allocation, and queue-aware task offloading decisions. The optimization is challenging due to the nonconvex, nonlinear nature of this hierarchical system, which renders traditional methods ineffective. We reformulate the problem as a multi-agent Markov decision process (MDP) with continuous action spaces and heterogeneous agents, and propose a novel variant of multi-agent proximal policy optimization with a Beta distribution (MAPPO-BD) to solve it. Extensive simulations show that MAPPO-BD outperforms baseline schemes, achieving superior energy savings and efficient resource management in MAGIN while meeting queue delay and edge computing constraints.
Submitted 5 March, 2025;
originally announced March 2025.
-
Dynamic Neural Surfaces for Elastic 4D Shape Representation and Analysis
Authors:
Awais Nizamani,
Hamid Laga,
Guanjin Wang,
Farid Boussaid,
Mohammed Bennamoun,
Anuj Srivastava
Abstract:
We propose a novel framework for the statistical analysis of genus-zero 4D surfaces, i.e., 3D surfaces that deform and evolve over time. This problem is particularly challenging due to the arbitrary parameterizations of these surfaces and their varying deformation speeds, necessitating effective spatiotemporal registration. Traditionally, 4D surfaces are discretized, in space and time, before computing their spatiotemporal registrations, geodesics, and statistics. However, this approach may result in suboptimal solutions and, as we demonstrate in this paper, is not necessary. In contrast, we treat 4D surfaces as continuous functions in both space and time. We introduce Dynamic Spherical Neural Surfaces (D-SNS), an efficient smooth and continuous spatiotemporal representation for genus-0 4D surfaces. We then demonstrate how to perform core 4D shape analysis tasks such as spatiotemporal registration, geodesics computation, and mean 4D shape estimation, directly on these continuous representations without upfront discretization and meshing. By integrating neural representations with classical Riemannian geometry and statistical shape analysis techniques, we provide the building blocks for enabling full functional shape analysis. We demonstrate the efficiency of the framework on 4D human and face datasets. The source code and additional results are available at https://4d-dsns.github.io/DSNS/.
Submitted 4 March, 2025;
originally announced March 2025.
-
Privacy-Preserving Fair Synthetic Tabular Data
Authors:
Fatima J. Sarmin,
Atiquer R. Rahman,
Christopher J. Henry,
Noman Mohammed
Abstract:
Sharing of tabular data containing valuable but private information is limited due to legal and ethical issues. Synthetic data could be an alternative solution to this sharing problem, as it is artificially generated by machine learning algorithms and tries to capture the underlying data distribution. However, machine learning models are not free from memorization and may introduce biases, as they rely on training data. Producing synthetic data that preserves privacy and fairness while maintaining utility close to the real data is a challenging task. This research simultaneously addresses both the privacy and fairness aspects of synthetic data, an area not explored by other studies. In this work, we present PF-WGAN, a privacy-preserving, fair synthetic tabular data generator based on the WGAN-GP model. We have modified the original WGAN-GP by adding privacy and fairness constraints, forcing it to produce privacy-preserving fair data. This approach will enable the publication of datasets that protect individuals' privacy and remain unbiased toward any particular group. We compared the results with three state-of-the-art synthetic data generator models in terms of utility, privacy, and fairness across four different datasets. We found that the proposed model exhibits a more balanced trade-off among utility, privacy, and fairness.
Submitted 4 March, 2025;
originally announced March 2025.
-
PileUp Mitigation at the HL-LHC Using Attention for Event-Wide Context
Authors:
Luke Vaughan,
Mohammed Rakib,
Shivang Patel,
Flera Rizatdinova,
Alexander Khanov,
Arunkumar Bagavathi
Abstract:
The Large Hadron Collider, LHC, collides bunches of protons resulting in multiple interactions that occur practically simultaneously. This creates a pileup effect that distorts physics measurements due to the products of pileup collisions. In order to improve the discovery potential of the LHC, it is necessary to mitigate the effect of pileup interactions on the processes of interest. In this paper, we suggest a novel AI-based method, PUMiNet, to tackle the problem of pileup at the current LHC and future High Luminosity LHC conditions. PUMiNet is an attention-based algorithm that mitigates pileup effects using a regression task on jets in the context of an entire event. At $\left\langle μ\right\rangle=200$, PUMiNet is able to predict the hard scatter energy and mass fractions of jets with $R^2=0.912$ and $R^2=0.720$, respectively. These predictions enable the reconstruction of the Higgs boson mass in the HL-LHC environment.
Submitted 4 March, 2025;
originally announced March 2025.
-
Branching fraction measurement of the decay $B^+ \to ψ(2S) φ(1020) K^+$
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis,
L. An
, et al. (1128 additional authors not shown)
Abstract:
The branching fraction of the decay $B^+\to ψ(2S)φ(1020)K^+$, relative to the topologically similar decay $B^+\to J/ψφ(1020) K^+$, is measured using proton-proton collision data collected by the LHCb experiment at center-of-mass energies of 7, 8, and 13 TeV, corresponding to an integrated luminosity of $9\,\mathrm{fb}^{-1}$. The ratio is found to be $0.061 \pm 0.004 \pm 0.009$, where the first uncertainty is statistical and the second systematic. Using the world-average branching fraction for $B^+ \to J/ψφ(1020) K^+$, the branching fraction for the decay $B^+\to ψ(2S) φ(1020) K^+$ is found to be $ (3.0 \pm 0.2 \pm 0.5 \pm 0.2) \times 10^{-6}$, where the first uncertainty is statistical, the second systematic, and the third is due to the branching fraction of the normalization channel.
Submitted 14 May, 2025; v1 submitted 4 March, 2025;
originally announced March 2025.
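The arithmetic behind the quoted result can be sketched as follows. This is an illustrative reading of the abstract's numbers, not the collaboration's own derivation: the symbol $R$ for the measured ratio is an assumed name, and the normalization branching fraction is not stated in the abstract (back-computing $3.0 \times 10^{-6} / 0.061$ from the rounded values suggests it is roughly $4.9 \times 10^{-5}$).

```latex
\begin{align}
R &= \frac{\mathcal{B}(B^+ \to \psi(2S)\,\phi(1020)\,K^+)}
          {\mathcal{B}(B^+ \to J/\psi\,\phi(1020)\,K^+)}
   = 0.061 \pm 0.004 \pm 0.009, \\
\mathcal{B}(B^+ \to \psi(2S)\,\phi(1020)\,K^+)
  &= R \times \mathcal{B}(B^+ \to J/\psi\,\phi(1020)\,K^+)
   = (3.0 \pm 0.2 \pm 0.5 \pm 0.2) \times 10^{-6}, \\
\sigma_{\mathrm{stat}}
  &\approx \frac{0.004}{0.061} \times 3.0 \times 10^{-6}
   \approx 0.2 \times 10^{-6}.
\end{align}
```

The last line shows that the relative statistical uncertainty of the ratio carries over to the absolute branching fraction; the third quoted uncertainty comes from the normalization channel alone.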
-
Remote Sensing Image Classification Using Convolutional Neural Network (CNN) and Transfer Learning Techniques
Authors:
Mustafa Majeed Abd Zaid,
Ahmed Abed Mohammed,
Putra Sumari
Abstract:
This study investigates the classification of aerial images depicting transmission towers, forests, farmland, and mountains. To perform the classification, features are extracted from input images using a Convolutional Neural Network (CNN) architecture, and the images are then classified using Softmax. To test the model, we ran it for ten epochs using a batch size of 90, the Adam optimizer, and a learning rate of 0.001. Both training and assessment are conducted using a dataset that blends self-collected pictures from Google satellite imagery with the MLRNet dataset. The comprehensive dataset comprises 10,400 images. Our study shows that transfer learning models, and MobileNetV2 in particular, work well for landscape categorization. These models are good options for practical use because they strike a good balance between precision and efficiency; our approach achieves an overall accuracy of 87% with the custom-built CNN model. Furthermore, we reach even higher accuracies by utilizing the pretrained VGG16 and MobileNetV2 models as a starting point for transfer learning. Specifically, VGG16 achieves an accuracy of 90% and a test loss of 0.298, while MobileNetV2 outperforms both models with an accuracy of 96% and a test loss of 0.119; the results demonstrate the effectiveness of employing transfer learning with MobileNetV2 for classifying transmission towers, forests, farmland, and mountains.
Submitted 4 March, 2025;
originally announced March 2025.
-
Tabby: Tabular Data Synthesis with Language Models
Authors:
Sonia Cromp,
Satya Sai Srinath Namburi GNVV,
Mohammed Alkhudhayri,
Catherine Cao,
Samuel Guo,
Nicholas Roberts,
Frederic Sala
Abstract:
While advances in large language models (LLMs) have greatly improved the quality of synthetic text data in recent years, synthesizing tabular data has received relatively less attention. We address this disparity with Tabby, a simple but powerful post-training modification to the standard Transformer language model architecture, enabling its use for tabular dataset synthesis. Tabby enables the representation of differences across columns using Gated Mixture-of-Experts, with column-specific sets of parameters. Empirically, Tabby results in data quality near or equal to that of real data. By pairing our novel LLM table training technique, Plain, with Tabby, we observe up to a 44% improvement in quality over previous methods. We also show that Tabby extends beyond tables to more general structured data, reaching parity with real data on a nested JSON dataset as well.
Submitted 3 March, 2025;
originally announced March 2025.
-
PhishVQC: Optimizing Phishing URL Detection with Correlation Based Feature Selection and Variational Quantum Classifier
Authors:
Md. Farhan Shahriyar,
Gazi Tanbhir,
Abdullah Md Raihan Chy,
Mohammed Abdul Al Arafat Tanzin,
Md. Jisan Mashrafi
Abstract:
Phishing URL detection is crucial in cybersecurity as malicious websites disguise themselves to steal sensitive information. Traditional machine learning techniques struggle to perform well in complex real-world scenarios due to large datasets and intricate patterns. Motivated by quantum computing, this paper proposes using Variational Quantum Classifiers (VQC) to enhance phishing URL detection. We present PhishVQC, a quantum model that combines quantum feature maps and variational ansatzes such as RealAmplitude and EfficientSU2. The model is evaluated across two experimental setups with varying dataset sizes and feature map repetitions. PhishVQC achieves a maximum macro average F1-score of 0.89, showing a 22% improvement over prior studies. This highlights the potential of quantum machine learning to improve phishing detection accuracy. The study also notes computational challenges, with execution wall times increasing as dataset size grows.
Submitted 3 March, 2025;
originally announced March 2025.
-
CAPS: Context-Aware Priority Sampling for Enhanced Imitation Learning in Autonomous Driving
Authors:
Hamidreza Mirkhani,
Behzad Khamidehi,
Ehsan Ahmadi,
Fazel Arasteh,
Mohammed Elmahgiubi,
Weize Zhang,
Umar Rajguru,
Kasra Rezaee
Abstract:
In this paper, we introduce CAPS (Context-Aware Priority Sampling), a novel method designed to enhance data efficiency in learning-based autonomous driving systems. CAPS addresses the challenge of imbalanced training datasets in imitation learning by leveraging Vector Quantized Variational Autoencoders (VQ-VAEs). The use of VQ-VAE provides a structured and interpretable data representation, which helps reveal meaningful patterns in the data. These patterns are used to group the data into clusters, with each sample being assigned a cluster ID. The cluster IDs are then used to re-balance the dataset, ensuring that rare yet valuable samples receive higher priority during training. By ensuring a more diverse and informative training set, CAPS improves the generalization of the trained planner across a wide range of driving scenarios. We evaluate our method through closed-loop simulations in the CARLA environment. The results on Bench2Drive scenarios demonstrate that our framework outperforms state-of-the-art methods, leading to notable improvements in model performance.
Submitted 3 March, 2025;
originally announced March 2025.
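The re-balancing step described in the CAPS abstract can be illustrated with a minimal sketch. The abstract does not give the weighting rule, so inverse cluster frequency is used here purely as an assumption; `inverse_frequency_weights` and the toy `cluster_ids` are hypothetical names and data, and the cluster IDs would in practice come from the VQ-VAE rather than being hard-coded.

```python
import random
from collections import Counter


def inverse_frequency_weights(cluster_ids):
    """Return one sampling weight per sample, normalized to sum to 1.

    Assumption (not from the paper): each sample is weighted by the inverse
    of its cluster's size, so every cluster gets the same total probability
    and samples from rare clusters are drawn more often.
    """
    counts = Counter(cluster_ids)
    raw = [1.0 / counts[c] for c in cluster_ids]
    total = sum(raw)
    return [w / total for w in raw]


# Toy cluster assignment: cluster 0 is common, cluster 2 is rare.
cluster_ids = [0, 0, 0, 0, 0, 0, 1, 1, 2]
weights = inverse_frequency_weights(cluster_ids)

# Draw a training batch of sample indices with replacement.
rng = random.Random(0)
batch = rng.choices(range(len(cluster_ids)), weights=weights, k=6)
```

With this rule the single sample in cluster 2 is as likely to be drawn as all six samples of cluster 0 combined; other priority schemes would fit the abstract's description equally well.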
-
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Authors:
Fajri Koto,
Rituraj Joshi,
Nurdaulet Mukhituly,
Yuxia Wang,
Zhuohan Xie,
Rahul Pal,
Daniil Orel,
Parvez Mullah,
Diana Turmakhan,
Maiya Goloburda,
Mohammed Kamran,
Samujjwal Ghosh,
Bokang Jia,
Jonibek Mansurov,
Mukhammed Togmanov,
Debopriyo Banerjee,
Nurkhan Laiyk,
Akhmed Sakip,
Xudong Han,
Ekaterina Kochmar,
Alham Fikri Aji,
Aaryamonvikram Singh,
Alok Anil Jadhav,
Satheesh Katipomu,
Samta Kamboj
, et al. (10 additional authors not shown)
Abstract:
Llama-3.1-Sherkala-8B-Chat, or Sherkala-Chat (8B) for short, is a state-of-the-art instruction-tuned open generative large language model (LLM) designed for Kazakh. Sherkala-Chat (8B) aims to enhance the inclusivity of LLM advancements for Kazakh speakers. Adapted from the LLaMA-3.1-8B model, Sherkala-Chat (8B) is trained on 45.3B tokens across Kazakh, English, Russian, and Turkish. With 8 billion parameters, it demonstrates strong knowledge and reasoning abilities in Kazakh, significantly outperforming existing open Kazakh and multilingual models of similar scale while achieving competitive performance in English. We release Sherkala-Chat (8B) as an open-weight instruction-tuned model and provide a detailed overview of its training, fine-tuning, safety alignment, and evaluation, aiming to advance research and support diverse real-world applications.
Submitted 3 March, 2025;
originally announced March 2025.
-
Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs
Authors:
Fakhraddin Alwajih,
Abdellah El Mekki,
Samar Mohamed Magdy,
Abdelrahim A. Elmadany,
Omer Nacar,
El Moatez Billah Nagoudi,
Reem Abdel-Salam,
Hanin Atwany,
Youssef Nafea,
Abdulfattah Mohammed Yahya,
Rahaf Alhamouri,
Hamzah A. Alsayadi,
Hiba Zayed,
Sara Shatnawi,
Serry Sibaee,
Yasir Ech-Chammakhy,
Walid Al-Dhabyani,
Marwa Mohamed Ali,
Imen Jarraya,
Ahmed Oumar El-Shangiti,
Aisha Alraeesi,
Mohammed Anwar Al-Ghrawi,
Abdulrahman S. Al-Batati,
Elgizouli Mohamed,
Noha Taha Elgindi
, et al. (19 additional authors not shown)
Abstract:
As large language models (LLMs) become increasingly integrated into daily life, ensuring their cultural sensitivity and inclusivity is paramount. We introduce our dataset, a year-long community-driven project covering all 22 Arab countries. The dataset includes instructions (input, response pairs) in both Modern Standard Arabic (MSA) and dialectal Arabic (DA), spanning 20 diverse topics. Built by a team of 44 researchers across the Arab world, all of whom are authors of this paper, our dataset offers a broad, inclusive perspective. We use our dataset to evaluate the cultural and dialectal capabilities of several frontier LLMs, revealing notable limitations. For instance, while closed-source LLMs generally exhibit strong performance, they are not without flaws, and smaller open-source models face greater challenges. Moreover, certain countries (e.g., Egypt, the UAE) appear better represented than others (e.g., Iraq, Mauritania, Yemen). Our annotation guidelines, code, and data for reproducibility are publicly available.
Submitted 28 February, 2025;
originally announced March 2025.
-
Visual Reasoning at Urban Intersections: FineTuning GPT-4o for Traffic Conflict Detection
Authors:
Sari Masri,
Huthaifa I. Ashqar,
Mohammed Elhenawy
Abstract:
Traffic control in unsignalized urban intersections presents significant challenges due to their complexity, frequent conflicts, and blind spots. This study explores the capability of leveraging Multimodal Large Language Models (MLLMs), such as GPT-4o, to provide logical and visual reasoning by directly using bird's-eye-view videos of four-legged intersections. In this proposed method, GPT-4o acts as an intelligent system to detect conflicts and provide explanations and recommendations for the drivers. The fine-tuned model achieved an accuracy of 77.14%, while manual evaluation of the correct predictions of the fine-tuned GPT-4o showed 89.9% accuracy for model-generated explanations and 92.3% for the recommended next actions. These results highlight the feasibility of using MLLMs for real-time traffic management using videos as inputs, offering scalable and actionable insights into intersection traffic management and operation. Code used in this study is available at https://github.com/sarimasri3/Traffic-Intersection-Conflict-Detection-using-images.git.
Submitted 27 February, 2025;
originally announced February 2025.
-
HazardNet: A Small-Scale Vision Language Model for Real-Time Traffic Safety Detection at Edge Devices
Authors:
Mohammad Abu Tami,
Mohammed Elhenawy,
Huthaifa I. Ashqar
Abstract:
Traffic safety remains a vital concern in contemporary urban settings, intensified by the increase in vehicles and the complicated nature of road networks. Traditional safety-critical event detection systems predominantly rely on sensor-based approaches and conventional machine learning algorithms, necessitating extensive data collection and complex training processes to adhere to traffic safety regulations. This paper introduces HazardNet, a small-scale Vision Language Model designed to enhance traffic safety by leveraging the reasoning capabilities of advanced language and vision models. We built HazardNet by fine-tuning the pre-trained Qwen2-VL-2B model, chosen for its superior performance among open-source alternatives and its compact size of two billion parameters. This facilitates deployment on edge devices with efficient inference throughput. In addition, we present HazardQA, a novel Vision Question Answering (VQA) dataset constructed specifically for training HazardNet on real-world scenarios involving safety-critical events. Our experimental results show that the fine-tuned HazardNet outperforms the base model by up to 89% in F1-Score and achieves results comparable to larger models such as GPT-4o, with improvements of up to 6% in some cases. These advancements underscore the potential of HazardNet in providing real-time, reliable traffic safety event detection, thereby contributing to reduced accidents and improved traffic management in urban environments. Both the HazardNet model and the HazardQA dataset are available at https://huggingface.co/Tami3/HazardNet and https://huggingface.co/datasets/Tami3/HazardQA, respectively.
Submitted 27 February, 2025;
originally announced February 2025.
-
From Retrieval to Generation: Comparing Different Approaches
Authors:
Abdelrahman Abdallah,
Jamshid Mozafari,
Bhawna Piryani,
Mohammed Ali,
Adam Jatowt
Abstract:
Knowledge-intensive tasks, particularly open-domain question answering (ODQA), document reranking, and retrieval-augmented language modeling, require a balance between retrieval accuracy and generative flexibility. Traditional retrieval models, such as BM25 and Dense Passage Retrieval (DPR), efficiently retrieve from large corpora but often lack semantic depth. Generative models like GPT-4-o provide richer contextual understanding but face challenges in maintaining factual consistency. In this work, we conduct a systematic evaluation of retrieval-based, generation-based, and hybrid models, with a primary focus on their performance in ODQA and related retrieval-augmented tasks. Our results show that dense retrievers, particularly DPR, achieve strong performance in ODQA with a top-1 accuracy of 50.17% on NQ, while hybrid models improve nDCG@10 scores on BEIR from 43.42 (BM25) to 52.59, demonstrating their strength in document reranking. Additionally, we analyze language modeling tasks using WikiText-103, showing that retrieval-based approaches like BM25 achieve lower perplexity compared to generative and hybrid methods, highlighting their utility in retrieval-augmented generation. By providing detailed comparisons and practical insights into the conditions where each approach excels, we aim to facilitate future optimizations in retrieval, reranking, and generative models for ODQA and related knowledge-intensive applications.
Submitted 27 February, 2025;
originally announced February 2025.
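For readers unfamiliar with the nDCG@10 metric cited in this abstract, the following is a minimal sketch of one common formulation (linear gain with a log2 rank discount). It is not the authors' evaluation code; the function names `dcg_at_k` and `ndcg_at_k` and the toy relevance labels are assumptions made for illustration. Scores such as 43.42 correspond to this value multiplied by 100 and averaged over queries.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k relevance grades."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, judged_relevances, k=10):
    """nDCG@k for one query.

    ranked_relevances: grades of the retrieved documents, in ranked order.
    judged_relevances: grades of all judged documents for the query,
        used to build the ideal ordering.
    """
    ideal = dcg_at_k(sorted(judged_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal


# A ranking that already matches the ideal order scores 1.0.
perfect = ndcg_at_k([3, 2, 1], [3, 2, 1])

# Placing the only relevant document second lowers the score.
late_hit = ndcg_at_k([0, 1], [1])
```

A reranker improves nDCG@10 when it moves highly relevant documents toward the top positions, where the logarithmic discount is smallest.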
-
Modified FOX Optimizer for Solving optimization problems
Authors:
Dler O. Hasan,
Hardi M. Mohammed,
Zrar Khalid Abdul
Abstract:
The FOX optimizer, inspired by red fox hunting behavior, is a powerful algorithm for solving real-world and engineering problems. However, despite balancing exploration and exploitation, it can prematurely converge to local optima, as agent positions are updated solely based on the current best-known position, causing all agents to converge on one location. This study proposes the modified FOX optimizer (mFOX) to enhance exploration and balance exploration and exploitation in three steps. First, the Oppositional-Based Learning (OBL) strategy is used to improve the initial population. Second, control parameters are refined to achieve a better balance between exploration and exploitation. Third, a new update equation is introduced, allowing agents to adjust their positions relative to one another rather than relying solely on the best-known position. This approach improves exploration efficiency without adding complexity. The mFOX algorithm's performance is evaluated against 12 well-known algorithms on 23 classical benchmark functions, 10 CEC2019 functions, and 12 CEC2022 functions. It outperforms competitors in 74% of the classical benchmarks, 60% of the CEC2019 benchmarks, and 58% of the CEC2022 benchmarks. Additionally, mFOX effectively addresses four engineering problems. These results demonstrate mFOX's strong competitiveness in solving complex optimization tasks, including unimodal, constrained, and high-dimensional problems.
Submitted 27 February, 2025;
originally announced February 2025.
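The first mFOX step, improving the initial population with Oppositional-Based Learning, can be illustrated with a short sketch of the standard OBL rule, in which the opposite of a point x in [lower, upper] is lower + upper - x. This is a generic illustration under that assumption, not the authors' implementation; `obl_init`, `sphere`, and the chosen bounds are hypothetical.

```python
import random


def obl_init(objective, lower, upper, n_agents, seed=0):
    """Build an initial population using opposition-based learning.

    Draws n_agents random points, forms the opposite of each one
    (lower + upper - x per dimension), and keeps the n_agents candidates
    with the lowest objective value (minimization is assumed).
    """
    rng = random.Random(seed)
    dim = len(lower)
    population = [[rng.uniform(lower[d], upper[d]) for d in range(dim)]
                  for _ in range(n_agents)]
    opposites = [[lower[d] + upper[d] - x[d] for d in range(dim)]
                 for x in population]
    candidates = population + opposites
    candidates.sort(key=objective)
    return candidates[:n_agents]


def sphere(x):
    """Toy benchmark: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


# Asymmetric bounds, so a point and its opposite differ in fitness.
agents = obl_init(sphere, [-2.0, -2.0], [10.0, 10.0], n_agents=5)
```

Evaluating both a point and its opposite doubles the cost of initialization but gives the search a better starting population; the later mFOX steps (refined control parameters and the new update equation) are not covered by this sketch.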
-
An Improved 3D Skeletons UP-Fall Dataset: Enhancing Data Quality for Efficient Impact Fall Detection
Authors:
Tresor Y. Koffi,
Youssef Mourchid,
Mohammed Hindawi,
Yohan Dupuis
Abstract:
Detecting impact, the moment an individual makes contact with the ground within a fall event, is crucial in fall detection systems, particularly for elderly care where prompt intervention can prevent serious injuries. The UP-Fall dataset, a key resource in fall detection research, has proven valuable but suffers from limitations in data accuracy and comprehensiveness. These limitations cause confusion in distinguishing between non-impact events, such as sliding, and real falls with impact, where the person actually hits the ground. This confusion compromises the effectiveness of current fall detection systems. This study presents enhancements to the UP-Fall dataset that aim to improve it for impact fall detection by incorporating 3D skeleton data. Our preprocessing techniques ensure high data accuracy and comprehensiveness, enabling more reliable impact fall detection. Extensive experiments were conducted using various machine learning and deep learning algorithms to benchmark the improved 3D skeletons dataset. The results demonstrate substantial improvements in the performance of fall detection models trained on the enhanced dataset. This contribution aims to enhance the safety and well-being of the elderly population at risk. To support further research on more reliable impact fall detection systems, we have made the improved 3D skeletons UP-Fall dataset publicly available at https://zenodo.org/records/12773013.
Submitted 26 February, 2025;
originally announced February 2025.
-
A Multi-Agent DRL-Based Framework for Optimal Resource Allocation and Twin Migration in the Multi-Tier Vehicular Metaverse
Authors:
Nahom Abishu Hayla,
A. Mohammed Seid,
Aiman Erbad,
Tilahun M. Getu,
Ala Al-Fuqaha,
Mohsen Guizani
Abstract:
Although the multi-tier vehicular Metaverse promises to transform vehicles into essential nodes -- within an interconnected digital ecosystem -- using efficient resource allocation and seamless vehicular twin (VT) migration, this can hardly be achieved by existing techniques operating in a highly dynamic vehicular environment, since they struggle to balance multi-objective optimization problems such as latency reduction, resource utilization, and user experience (UX). To address these challenges, we introduce a novel multi-tier resource allocation and VT migration framework that integrates Graph Convolutional Networks (GCNs), a hierarchical Stackelberg game-based incentive mechanism, and Multi-Agent Deep Reinforcement Learning (MADRL). The GCN-based model captures both spatial and temporal dependencies within the vehicular network; the Stackelberg game-based incentive mechanism fosters cooperation between vehicles and infrastructure; and the MADRL algorithm jointly optimizes resource allocation and VT migration in real time. By modeling this dynamic and multi-tier vehicular Metaverse as a Markov Decision Process (MDP), we develop a MADRL-based algorithm dubbed the Multi-Objective Multi-Agent Deep Deterministic Policy Gradient (MO-MADDPG), which can effectively balance the various conflicting objectives. Extensive simulations validate the effectiveness of this algorithm, which is shown to enhance scalability, reliability, and efficiency while considerably improving latency, resource utilization, migration cost, and overall UX by 12.8%, 9.7%, 14.2%, and 16.1%, respectively.
Submitted 26 February, 2025;
originally announced February 2025.
-
Observation of a new charmed baryon decaying to $Ξ_c^+ π^- π^+$
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1135 additional authors not shown)
Abstract:
The $Ξ_c^+ π^- π^+$ spectrum is investigated using proton-proton collisions at a center-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 5.4 fb$^{-1}$, collected by the LHCb experiment during 2016--2018. Four states are observed with high significance, and their masses and widths are measured to be \begin{align*}
m[Ξ_c(2815)^{+}] &= 2816.65 \pm 0.03 \pm 0.03 \pm 0.23~\text{MeV},\\
Γ[Ξ_c(2815)^{+}] &= 2.07 \pm 0.08 \pm 0.12~\text{MeV},\\[5pt]
m[Ξ_c(2923)^{+}] &= 2922.8 \pm 0.3 \pm 0.5 \pm 0.2~\text{MeV},\\
Γ[Ξ_c(2923)^{+}] &= 5.3 \pm 0.9 \pm 1.4~\text{MeV},\\[5pt]
m[Ξ_c(2970)^{+}] &= 2968.6 \pm 0.5 \pm 0.5 \pm 0.2~\text{MeV},\\
Γ[Ξ_c(2970)^{+}] &= 31.7 \pm 1.7 \pm 1.9~\text{MeV},\\[5pt]
m[Ξ_c(3080)^{+}] &= 3076.8 \pm 0.7 \pm 1.3 \pm 0.2~\text{MeV},\\
Γ[Ξ_c(3080)^{+}] &= 6.8 \pm 2.3 \pm 0.9~\text{MeV}, \end{align*} where the uncertainties are statistical, systematic, and due to the limited precision on the $Ξ_c^+$ mass, respectively. The $Ξ_c(2923)^{+}$ baryon is observed for the first time, and is consistent with being the isospin partner of the previously observed $Ξ_c(2923)^{0}$ state. Most of the measured parameters are more precise than existing world averages.
Submitted 26 February, 2025;
originally announced February 2025.
-
Learning multi-phase flow and transport in fractured porous media with auto-regressive and recurrent graph neural networks
Authors:
Mohammed Al Kobaisi,
Wenjuan Zhang,
Waleed Diab,
Hadi Hajibeygi
Abstract:
In the past three decades, a wide array of computational methodologies and simulation frameworks has emerged to address the complexities of modeling multi-phase flow and transport processes in fractured porous media. Conformal mesh approaches, which explicitly align the computational grid with fracture surfaces, are considered by many to be the most accurate. However, such methods require excessive fine-scale meshing, rendering them impractical for large or complex fracture networks. In this work, we propose to learn the complex multi-phase flow and transport dynamics in fractured porous media with graph neural networks (GNNs). GNNs are well suited for this task due to the unstructured topology of the computational grid resulting from the Embedded Discrete Fracture Model (EDFM) discretization. We propose two deep learning architectures, a GNN and a recurrent GNN. Both networks follow a two-stage training strategy: an autoregressive one-step rollout, followed by a fine-tuning step where the model is supervised using the whole ground-truth sequence. We demonstrate that the two-stage training approach is effective in mitigating error accumulation during autoregressive model rollouts in the testing phase. Our findings indicate that both GNNs generalize well to unseen fracture realizations, with comparable performance in forecasting saturation sequences, and slightly better performance for the recurrent GNN in predicting pressure sequences. While the second stage of training proved to be beneficial for the GNN model, its impact on the recurrent GNN model was less pronounced. Finally, the performance of both GNNs for temporal extrapolation is tested. The recurrent GNN significantly outperformed the GNN in terms of accuracy, thereby underscoring its superior capability in predicting long sequences.
Submitted 22 February, 2025;
originally announced February 2025.
-
Forecasting Rare Language Model Behaviors
Authors:
Erik Jones,
Meg Tong,
Jesse Mu,
Mohammed Mahfoud,
Jan Leike,
Roger Grosse,
Jared Kaplan,
William Fithian,
Ethan Perez,
Mrinank Sharma
Abstract:
Standard language model evaluations can fail to capture risks that emerge only at deployment scale. For example, a model may produce safe responses during a small-scale beta test, yet reveal dangerous information when processing billions of requests at deployment. To remedy this, we introduce a method to forecast potential risks across orders of magnitude more queries than we test during evaluation. We make forecasts by studying each query's elicitation probability -- the probability the query produces a target behavior -- and demonstrate that the largest observed elicitation probabilities predictably scale with the number of queries. We find that our forecasts can predict the emergence of diverse undesirable behaviors -- such as assisting users with dangerous chemical synthesis or taking power-seeking actions -- across up to three orders of magnitude of query volume. Our work enables model developers to proactively anticipate and patch rare failures before they manifest during large-scale deployments.
Submitted 23 February, 2025;
originally announced February 2025.
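To see why risks that look negligible in a small evaluation can surface at deployment scale, as the abstract above describes, consider the elementary relation between per-query elicitation probabilities and the chance that a behavior appears at least once. The sketch below assumes that each query is sampled once and that outcomes are independent; it is not the paper's forecasting method, and the function name and example numbers are hypothetical.

```python
import math


def prob_any_elicited(elicitation_probs):
    """Probability that at least one query produces the target behavior.

    Assumes independent single samples per query, so the result is
    1 - prod(1 - p_i). Computed in log space to stay accurate when
    the individual probabilities are tiny.
    """
    if any(p >= 1.0 for p in elicitation_probs):
        return 1.0
    log_none = sum(math.log1p(-p) for p in elicitation_probs)
    return -math.expm1(log_none)


# Hypothetical numbers: a one-in-a-million behavior at two query volumes.
small_eval = prob_any_elicited([1e-6] * 1_000)
deployment = prob_any_elicited([1e-6] * 10_000_000)
print(f"1e3 queries: {small_eval:.6f}; 1e7 queries: {deployment:.6f}")
```

Under these assumptions the same per-query probability is rarely observed across a thousand queries yet is almost certain to appear across ten million.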
-
Enhancing Collaboration for Software Engineers through Matching
Authors:
Nayaab Azim,
Sadath Ullah Khan Mohammed,
Evan Phaup,
Adeyemi Aina
Abstract:
In recent years, the field of software engineering has experienced a considerable increase in demand for competent experts and, with it, a growing need for platforms that connect software engineers and facilitate collaboration. In response, this paper presents a project that addresses the lack of a proper one-stop connection platform for software engineers and promotes collaborative learning and upskilling. The idea of the project is to develop a web-based application (NEXAS) that would facilitate connection and collaboration among software engineers. The application would perform algorithmic matching to suggest user connections based on their technical profiles and interests. Users can filter profiles, discover open projects, and form collaboration groups. Using this application will enable users to connect with peers who have similar interests, thereby creating a community network tailored exclusively for software engineers.
Submitted 22 February, 2025;
originally announced February 2025.
-
Open-Source Retrieval Augmented Generation Framework for Retrieving Accurate Medication Insights from Formularies for African Healthcare Workers
Authors:
Axum AI,
J. Owoyemi,
S. Abubakar,
A. Owoyemi,
T. O. Togunwa,
F. C. Madubuko,
S. Oyatoye,
Z. Oyetolu,
K. Akyea,
A. O. Mohammed,
A. Adebakin
Abstract:
Accessing accurate medication insights is vital for enhancing patient safety, minimizing errors, and supporting clinical decision-making. However, healthcare professionals in Africa often rely on manual and time-consuming processes to retrieve drug information, exacerbated by limited access to pharmacists due to brain drain and healthcare disparities. This paper presents "Drug Insights," an open-source Retrieval-Augmented Generation (RAG) chatbot designed to streamline medication lookup for healthcare workers in Africa. By leveraging a corpus of Nigerian pharmaceutical data and advanced AI technologies, including Pinecone databases and GPT models, the system delivers accurate, context-specific responses with minimal hallucination. The chatbot integrates prompt engineering and S-BERT evaluation to optimize retrieval and response generation. Preliminary tests, including pharmacist feedback, affirm the tool's potential to improve drug information access while highlighting areas for enhancement, such as UI/UX refinement and extended corpus integration.
Submitted 27 January, 2025;
originally announced February 2025.
-
Developing an Artificial Intelligence Tool for Personalized Breast Cancer Treatment Plans based on the NCCN Guidelines
Authors:
Abdul M. Mohammed,
Iqtidar Mansoor,
Sarah Blythe,
Dennis Trujillo
Abstract:
Cancer treatments require personalized approaches based on a patient's clinical condition, medical history, and evidence-based guidelines. The National Comprehensive Cancer Network (NCCN) provides frequently updated, complex guidelines through visuals like flowcharts and diagrams, making it time-consuming for oncologists to stay current with treatment protocols. This study presents an AI (Artificial Intelligence)-driven methodology to accurately automate treatment regimens following NCCN guidelines for breast cancer patients.
We proposed two AI-driven methods: Agentic-RAG (Retrieval-Augmented Generation) and Graph-RAG. Agentic-RAG used a three-step Large Language Model (LLM) process to select clinical titles from NCCN guidelines, retrieve matching JSON content, and iteratively refine recommendations based on insufficiency checks. Graph-RAG followed a Microsoft-developed framework with proprietary prompts, where JSON data was converted to text via an LLM, summarized, and mapped into graph structures representing key treatment relationships. Final recommendations were generated by querying relevant graph summaries. Both were evaluated using a set of patient descriptions, each with four associated questions.
As shown in Table 1, Agentic-RAG achieved 100% adherence (24/24) with no hallucinations or incorrect treatments. Graph-RAG had 95.8% adherence (23/24) with one incorrect treatment and no hallucinations. Chat GPT-4 showed 91.6% adherence (22/24) with two incorrect treatments and no hallucinations. Both Agentic-RAG and Graph-RAG provided detailed treatment recommendations with accurate references to relevant NCCN document page numbers.
Submitted 5 January, 2025;
originally announced February 2025.
-
On-demand generation of entangled photons pairs in the telecom O-band from nanowire quantum dots
Authors:
Mohammed K. Alqedra,
Chiao-Tzu Huang,
Edith Yeung,
Wen-Hao Chang,
Sofiane Haffouz,
Philip J. Poole,
Dan Dalacu,
Ali W. Elshaari,
Val Zwiller
Abstract:
On-demand entangled photon pairs at telecom wavelengths are crucial for quantum communication, distributed quantum computing, and quantum-enhanced sensing and metrology. The O-band is particularly advantageous because of its minimal chromatic dispersion and transmission loss in optical fibers, making it well-suited for long-distance quantum networks. Site-controlled nanowire quantum dots have emerged as a promising platform for the on-demand generation of single and entangled photons, offering high extraction efficiency and the potential for scalable fabrication in large uniform arrays. However, their operation has been largely restricted to the visible and first near-infrared (NIR-I) windows. Here, we demonstrate an on-demand bright source of entangled photon pairs with high fidelity in the telecom O-band based on site-controlled nanowire quantum dots. We measure a fine-structure splitting of 4.6 $μ$eV, verifying the suitability of the quantum dot for generating high-fidelity polarization-entangled photon pairs. Full quantum state tomography of the two-photon state generated by the biexciton-exciton cascade reveals a maximum fidelity of $85.8\% \pm 1.1\%$ to the $Φ^+$ Bell state, and a maximum concurrence of $75.1\% \pm 2.1\%$. We estimate the source efficiency at the first lens to be $12.5\%$. This bright, scalable, and deterministic source of entangled photons in the telecom range represents a valuable step forward in advancing practical quantum applications at telecom wavelengths.
Submitted 19 February, 2025;
originally announced February 2025.
-
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions
Authors:
Nathanaël Carraz Rakotonirina,
Mohammed Hamdy,
Jon Ander Campos,
Lucas Weber,
Alberto Testoni,
Marzieh Fadaee,
Sandro Pezzelle,
Marco Del Tredici
Abstract:
Large Language Models (LLMs) are increasingly used in working environments for a wide range of tasks, excelling at solving individual problems in isolation. However, are they also able to effectively collaborate over long-term interactions? To investigate this, we introduce MemoryCode, a synthetic multi-session dataset designed to test LLMs' ability to track and execute simple coding instructions amid irrelevant information, simulating a realistic setting. While all the models we tested handle isolated instructions well, even the performance of state-of-the-art models like GPT-4o deteriorates when instructions are spread across sessions. Our analysis suggests this is due to their failure to retrieve and integrate information over long instruction chains. Our results highlight a fundamental limitation of current LLMs, restricting their ability to collaborate effectively in long interactions.
Submitted 6 June, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
MMTEB: Massive Multilingual Text Embedding Benchmark
Authors:
Kenneth Enevoldsen,
Isaac Chung,
Imene Kerboua,
Márton Kardos,
Ashwin Mathur,
David Stap,
Jay Gala,
Wissam Siblini,
Dominik Krzemiński,
Genta Indra Winata,
Saba Sturua,
Saiteja Utpala,
Mathieu Ciancone,
Marion Schaeffer,
Gabriel Sequeira,
Diganta Misra,
Shreeya Dhakal,
Jonathan Rystrøm,
Roman Solomatin,
Ömer Çağatan,
Akash Kundu,
Martin Bernstorff,
Shitao Xiao,
Akshita Sukhlecha,
Bhavish Pahwa
, et al. (61 additional authors not shown)
Abstract:
Text embeddings are typically evaluated on a limited set of tasks, which are constrained by language, domain, and task diversity. To address these limitations and provide a more comprehensive evaluation, we introduce the Massive Multilingual Text Embedding Benchmark (MMTEB) - a large-scale, community-driven expansion of MTEB, covering over 500 quality-controlled evaluation tasks across 250+ languages. MMTEB includes a diverse set of challenging, novel tasks such as instruction following, long-document retrieval, and code retrieval, representing the largest multilingual collection of evaluation tasks for embedding models to date. Using this collection, we develop several highly multilingual benchmarks, which we use to evaluate a representative set of models. We find that while large language models (LLMs) with billions of parameters can achieve state-of-the-art performance on certain language subsets and task categories, the best-performing publicly available model is multilingual-e5-large-instruct with only 560 million parameters. To facilitate accessibility and reduce computational cost, we introduce a novel downsampling method based on inter-task correlation, ensuring a diverse selection while preserving relative model rankings. Furthermore, we optimize tasks such as retrieval by sampling hard negatives, creating smaller but effective splits. These optimizations allow us to introduce benchmarks that drastically reduce computational demands. For instance, our newly introduced zero-shot English benchmark maintains a ranking order similar to the full-scale version but at a fraction of the computational cost.
Submitted 8 June, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
Towards a Robust Quality Assurance Framework for Cloud Computing Environments
Authors:
Mohammed Alharbi,
RJ Qureshi
Abstract:
Trends such as cloud computing raise issues regarding stable and uniform quality assurance and validation of software requirements. Current QA frameworks are poorly defined, often not automated, and lack the flexibility needed for on-demand, cloud-based environments. These gaps lead to inconsistencies in service delivery, challenges in scaling organizational capacity, and internal and external inefficiencies that affect the reliability and effectiveness of cloud services. This paper presents a detailed framework for QA in cloud computing systems and advocates for standardized, automated, and adaptable systems to address these challenges. It aims to establish generic QA policies, incorporate intelligent techniques to enhance extendibility, and create adaptive solutions to manage the inherent attributes of cloud computing environments. The proposed framework is evaluated through survey questionnaires from industry practitioners, and descriptive statistics summarize the results. The study demonstrates the promise, effectiveness, and potential applicability of integrating a single QA framework to enhance software functionality, dependability, and future adaptability in cloud computing systems.
Submitted 19 February, 2025;
originally announced February 2025.
-
HyperGCL: Multi-Modal Graph Contrastive Learning via Learnable Hypergraph Views
Authors:
Khaled Mohammed Saifuddin,
Shihao Ji,
Esra Akbas
Abstract:
Recent advancements in Graph Contrastive Learning (GCL) have demonstrated remarkable effectiveness in improving graph representations. However, relying on predefined augmentations (e.g., node dropping, edge perturbation, attribute masking) may result in the loss of task-relevant information and a lack of adaptability to diverse input data. Furthermore, the selection of negative samples remains rarely explored. In this paper, we introduce HyperGCL, a novel multimodal GCL framework from a hypergraph perspective. HyperGCL constructs three distinct hypergraph views by jointly utilizing the input graph's structure and attributes, enabling a comprehensive integration of multiple modalities in contrastive learning. A learnable adaptive topology augmentation technique enhances these views by preserving important relations and filtering out noise. View-specific encoders capture essential characteristics from each view, while a network-aware contrastive loss leverages the underlying topology to define positive and negative samples effectively. Extensive experiments on benchmark datasets demonstrate that HyperGCL achieves state-of-the-art node classification performance.
Submitted 26 February, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
Effect of laser field and magnetic flux on scattering in graphene quantum dots
Authors:
Mohammed El Azar,
Ahmed Bouhlal,
Hocine Bahlouli,
Ahmed Jellal
Abstract:
We show how Dirac electrons interact with graphene quantum dots (GQDs) when exposed to both a magnetic flux and circularly polarized light. After obtaining the solutions of the energy spectrum, we compute the scattering coefficients. These allow us to show how efficiently the electrons diffuse and how their probability density is distributed in space. Our results show that light polarization is key in controlling electron scattering. It affects electron localization near the GQDs and the strength of the scattering coefficients. We also investigate how light intensity and magnetic flux affect the formation of quasi-bound states. In addition, the electrostatic potential reduces the density of scattering states and fine-tunes the interaction between electrons and the quantum dot. This research improves our understanding of electron behavior in graphene nanostructures and suggests new ways to control electronic states at the quantum level.
Submitted 18 February, 2025;
originally announced February 2025.
-
An Empirical Evaluation of Encoder Architectures for Fast Real-Time Long Conversational Understanding
Authors:
Annamalai Senthilnathan,
Kristjan Arumae,
Mohammed Khalilia,
Zhengzheng Xing,
Aaron R. Colak
Abstract:
Analyzing long text data such as customer call transcripts is a cost-intensive and tedious task. Machine learning methods, namely Transformers, are leveraged to model agent-customer interactions. Unfortunately, Transformers adhere to fixed-length architectures and their self-attention mechanism scales quadratically with input length. Such limitations make it challenging to leverage traditional Transformers for long sequence tasks, such as conversational understanding, especially in real-time use cases. In this paper we explore and evaluate recently proposed efficient Transformer variants (e.g. Performer, Reformer) and a CNN-based architecture for real-time and near real-time long conversational understanding tasks. We show that CNN-based models are dynamic, ~2.6x faster to train, ~80% faster at inference, and ~72% more memory efficient compared to Transformers on average. Additionally, we evaluate the CNN model using the Long Range Arena benchmark to demonstrate competitiveness in general long document analysis.
Submitted 17 February, 2025;
originally announced February 2025.
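The quadratic scaling of self-attention mentioned in the abstract above follows from the fact that full self-attention scores every token against every other token. A minimal sketch of that count is given below; the function name and the sequence lengths are illustrative assumptions, and the sketch ignores the efficient variants (Performer, Reformer) that the paper evaluates.

```python
def attention_score_count(seq_len, num_heads=1):
    """Number of pairwise attention scores in one full self-attention layer.

    Each head builds a seq_len x seq_len score matrix, so the count
    grows with the square of the input length.
    """
    return num_heads * seq_len * seq_len


# Doubling the input length quadruples the number of scores.
for n in (512, 1024, 2048):
    print(n, attention_score_count(n))
# 512 262144
# 1024 1048576
# 2048 4194304
```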