Showing 1–17 of 17 results for author: Heakl, A

  1. arXiv:2506.14606  [pdf, ps, other]

    cs.CL cs.AR cs.LG cs.PL cs.SE

    Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees

    Authors: Ahmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud

    Abstract: The hardware ecosystem is rapidly evolving, with increasing interest in translating low-level programs across different instruction set architectures (ISAs) in a quick, flexible, and correct way to enhance the portability and longevity of existing code. A particularly challenging class of this transpilation problem is translating between complex- (CISC) and reduced- (RISC) hardware architectures,…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Project page: https://ahmedheakl.github.io/Guaranteed-Guess/

  2. arXiv:2506.05336  [pdf, ps, other]

    cs.CV

    VideoMolmo: Spatio-Temporal Grounding Meets Pointing

    Authors: Ghazi Shazan Ahmad, Ahmed Heakl, Hanan Gani, Abdelrahman Shaker, Zhiqiang Shen, Ranjay Krishna, Fahad Shahbaz Khan, Salman Khan

    Abstract: Spatio-temporal localization is vital for precise interactions across diverse domains, from biological research to autonomous navigation and interactive interfaces. Current video-based approaches, while proficient in tracking, lack the sophisticated reasoning capabilities of large language models, limiting their contextual understanding and generalization. We introduce VideoMolmo, a large multimod…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: 20 pages, 13 figures

  3. arXiv:2505.21887  [pdf, ps, other]

    cs.AI cs.CE cs.LG

    SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

    Authors: Ahmed Heakl, Yahia Salaheldin Shaaban, Martin Takac, Salem Lahlou, Zangir Iklassov

    Abstract: Robust routing under uncertainty is central to real-world logistics, yet most benchmarks assume static, idealized settings. We present SVRPBench, the first open benchmark to capture high-fidelity stochastic dynamics in vehicle routing at urban scale. Spanning more than 500 instances with up to 1000 customers, it simulates realistic delivery conditions: time-dependent congestion, log-normal delays,…

    Submitted 29 May, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

    Comments: 18 pages, 14 figures, 11 tables

  4. arXiv:2505.16968  [pdf, ps, other]

    cs.AR cs.AI cs.CL cs.LG cs.PL

    CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

    Authors: Ahmed Heakl, Sarim Hashmi, Gustavo Bertolo Stahl, Seung Hun Eddie Han, Salman Khan, Abdulrahman Mahmoud

    Abstract: We introduce CASS, the first large-scale dataset and model suite for cross-architecture GPU code transpilation, targeting both source-level (CUDA <--> HIP) and assembly-level (Nvidia SASS <--> AMD RDNA3) translation. The dataset comprises 70k verified code pairs across host and device, addressing a critical gap in low-level GPU code portability. Leveraging this resource, we train the CASS family o…

    Submitted 29 May, 2025; v1 submitted 22 May, 2025; originally announced May 2025.

    Comments: 20 pages, 11 figures, 5 tables

  5. arXiv:2502.14949  [pdf, other]

    cs.CV cs.AI cs.CL cs.HC cs.LG

    KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding

    Authors: Ahmed Heakl, Abdullah Sohail, Mukul Ranjan, Rania Hossam, Ghazi Ahmed, Mohamed El-Geish, Omar Maher, Zhiqiang Shen, Fahad Khan, Salman Khan

    Abstract: With the growing adoption of Retrieval-Augmented Generation (RAG) in document processing, robust text recognition has become increasingly critical for knowledge extraction. While OCR (Optical Character Recognition) for English and other languages benefits from large datasets and well-established benchmarks, Arabic OCR faces unique challenges due to its cursive script, right-to-left text flow, and…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: 17 pages, 5 figures, ACL 2025

  6. arXiv:2502.00094  [pdf, other]

    cs.CV cs.AI cs.CL cs.HC cs.LG

    AIN: The Arabic INclusive Large Multimodal Model

    Authors: Ahmed Heakl, Sara Ghaboura, Omkar Thawakar, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan

    Abstract: Amid the swift progress of large language models (LLMs) and their evolution into large multimodal models (LMMs), significant strides have been made in high-resource languages such as English and Chinese. While Arabic LLMs have seen notable progress, Arabic LMMs remain largely unexplored, often narrowly focusing on a few specific aspects of the language and visual understanding. To bridge this gap,…

    Submitted 4 February, 2025; v1 submitted 31 January, 2025; originally announced February 2025.

    Comments: 20 pages, 16 figures, ACL

  7. arXiv:2501.06186  [pdf, other]

    cs.CV

    LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

    Authors: Omkar Thawakar, Dinura Dissanayake, Ketan More, Ritesh Thawkar, Ahmed Heakl, Noor Ahsan, Yuhao Li, Mohammed Zumri, Jean Lahoud, Rao Muhammad Anwer, Hisham Cholakkal, Ivan Laptev, Mubarak Shah, Fahad Shahbaz Khan, Salman Khan

    Abstract: Reasoning is a fundamental capability for solving complex multi-step problems, particularly in visual contexts where sequential step-wise understanding is essential. Existing approaches lack a comprehensive framework for evaluating visual reasoning and do not emphasize step-wise problem-solving. To this end, we propose a comprehensive framework for advancing step-by-step visual reasoning in large…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: 15 pages, 5 Figures

  8. arXiv:2411.16341  [pdf, other]

    cs.PL cs.AR

    From CISC to RISC: language-model guided assembly transpilation

    Authors: Ahmed Heakl, Chaimaa Abi, Rania Hossam, Abdulrahman Mahmoud

    Abstract: The transition from x86 to ARM architecture is becoming increasingly common across various domains, primarily driven by ARM's energy efficiency and improved performance across traditional sectors. However, this ISA shift poses significant challenges, mainly due to the extensive legacy ecosystem of x86 software and lack of portability across proprietary ecosystems and software stacks. This paper in…

    Submitted 25 November, 2024; originally announced November 2024.

  9. arXiv:2410.18976  [pdf, other]

    cs.CV cs.AI cs.CL cs.CY cs.LG

    CAMEL-Bench: A Comprehensive Arabic LMM Benchmark

    Authors: Sara Ghaboura, Ahmed Heakl, Omkar Thawakar, Ali Alharthi, Ines Riahi, Abduljalil Saif, Jorma Laaksonen, Fahad S. Khan, Salman Khan, Rao M. Anwer

    Abstract: Recent years have witnessed a significant interest in developing large multimodal models (LMMs) capable of performing various visual reasoning and understanding tasks. This has led to the introduction of multiple LMM benchmarks to evaluate LMMs on different tasks. However, most existing LMM evaluation benchmarks are predominantly English-centric. In this work, we develop a comprehensive LMM evalua…

    Submitted 24 October, 2024; originally announced October 2024.

    Comments: 10 pages, 5 figures, NAACL

  10. arXiv:2409.08695  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO eess.SY

    Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding

    Authors: Rania Hossam, Ahmed Heakl, Walid Gomaa

    Abstract: Traditional fish farming practices often lead to inefficient feeding, resulting in environmental issues and reduced productivity. We developed an innovative system combining computer vision and IoT technologies for precise Tilapia feeding. Our solution uses real-time IoT sensors to monitor water quality parameters and computer vision algorithms to analyze fish size and count, determining optimal f…

    Submitted 24 September, 2024; v1 submitted 13 September, 2024; originally announced September 2024.

    Comments: 8 pages, 6 figures, 3 tables, 21st International Conference on Informatics in Control, Automation, and Robotics

  11. arXiv:2406.18125  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models

    Authors: Ahmed Heakl, Youssef Mohamed, Noran Mohamed, Aly Elsharkawy, Ahmed Zaky

    Abstract: The increasing reliance on online recruitment platforms coupled with the adoption of AI technologies has highlighted the critical need for efficient resume classification methods. However, challenges such as small datasets, lack of standardized resume templates, and privacy concerns hinder the accuracy and effectiveness of existing classification models. In this work, we address these challenges b…

    Submitted 12 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures, 1 table, 6th International Conference on AI in Computational Linguistics

  12. arXiv:2406.18120  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs

    Authors: Ahmed Heakl, Youssef Zaghloul, Mennatullah Ali, Rania Hossam, Walid Gomaa

    Abstract: Motivated by the widespread increase in the phenomenon of code-switching between Egyptian Arabic and English in recent times, this paper explores the intricacies of machine translation (MT) and automatic speech recognition (ASR) systems, focusing on translating code-switched Egyptian Arabic-English to either English or Egyptian Arabic. Our goal is to present the methodologies employed in developin…

    Submitted 12 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures, 5 tables, 6th International Conference on AI in Computational Linguistics

  13. arXiv:2406.00447  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG cs.RO

    DroneVis: Versatile Computer Vision Library for Drones

    Authors: Ahmed Heakl, Fatma Youssef, Victor Parque, Walid Gomaa

    Abstract: This paper introduces DroneVis, a novel library designed to automate computer vision algorithms on Parrot drones. DroneVis offers a versatile set of features and provides a diverse range of computer vision tasks along with a variety of models to choose from. Implemented in Python, the library adheres to high-quality code standards, facilitating effortless customization and feature expansion accord…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 23 pages, 15 figures, 2 tables

  14. arXiv:2406.00409  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM cs.NE

    Arabic Handwritten Text for Person Biometric Identification: A Deep Learning Approach

    Authors: Mazen Balat, Youssef Mohamed, Ahmed Heakl, Ahmed Zaky

    Abstract: This study thoroughly investigates how well deep learning models can recognize Arabic handwritten text for person biometric identification. It compares three advanced architectures -- ResNet50, MobileNetV2, and EfficientNetB7 -- using three widely recognized datasets: AHAWP, Khatt, and LAMIS-MSHD. Results show that EfficientNetB7 outperforms the others, achieving test accuracies of 98.57\%, 99.15\…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 6 pages, 11 figures, 4 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications (IMSA)

  15. arXiv:2406.00135  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG cs.MM

    Advancing Ear Biometrics: Enhancing Accuracy and Robustness through Deep Learning

    Authors: Youssef Mohamed, Zeyad Youssef, Ahmed Heakl, Ahmed Zaky

    Abstract: Biometric identification is a reliable method to verify individuals based on their unique physical or behavioral traits, offering a secure alternative to traditional methods like passwords or PINs. This study focuses on ear biometric identification, exploiting its distinctive features for enhanced accuracy, reliability, and usability. While past studies typically investigate face recognition and f…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 6 pages, 8 figures, 3 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications

  16. arXiv:2402.07448  [pdf]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    AraSpider: Democratizing Arabic-to-SQL

    Authors: Ahmed Heakl, Youssef Mohamed, Ahmed B. Zaky

    Abstract: This study presents AraSpider, the first Arabic version of the Spider dataset, aimed at improving natural language processing (NLP) in the Arabic-speaking community. Four multilingual translation models were tested for their effectiveness in translating English to Arabic. Additionally, two models were assessed for their ability to generate SQL queries from Arabic text. The results showed that usin…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 11 pages, 4 figures

  17. arXiv:2208.12086  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS eess.SP

    A Study on Broadcast Networks for Music Genre Classification

    Authors: Ahmed Heakl, Abdelrahman Abdelgawad, Victor Parque

    Abstract: Due to the increased demand for music streaming/recommender services and the recent developments of music information retrieval frameworks, Music Genre Classification (MGC) has attracted the community's attention. However, convolutional-based approaches are known to lack the ability to efficiently encode and localize temporal features. In this paper, we study the broadcast-based neural networks ai…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: accepted for oral presentation at the World Congress on Computational Intelligence (WCCI 2022) - International Joint Conference on Neural Networks (IJCNN 2022)