Showing 1–6 of 6 results for author: Al-Sabahi, K

  1. arXiv:2504.19444

    cs.SE cs.CL

    Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks

    Authors: Kang Yang, Xinjun Mao, Shangwen Wang, Yanlin Wang, Tanghaoran Zhang, Bo Lin, Yihao Qin, Zhang Zhang, Yao Lu, Kamal Al-Sabahi

    Abstract: Pre-trained code models rely heavily on high-quality pre-training data, particularly human-written reference comments that bridge code and natural language. However, these comments often become outdated as software evolves, degrading model performance. Large language models (LLMs) excel at generating high-quality code comments. We investigate whether replacing human-written comments with LLM-gener…

    Submitted 27 April, 2025; originally announced April 2025.

    Comments: Awarded the ACM SIGSOFT Distinguished Paper Award in ICPC 2025

  2. Multi-head Sequence Tagging Model for Grammatical Error Correction

    Authors: Kamal Al-Sabahi, Kang Yang, Wangwang Liu, Guanyu Jiang, Xian Li, Ming Yang

    Abstract: To solve the Grammatical Error Correction (GEC) problem, a mapping between a source sequence and a target one is needed, where the two differ only in a few spans. For this reason, attention has shifted to non-autoregressive or sequence tagging models, in which GEC is simplified from Seq2Seq to labeling the input tokens with edit commands chosen from a large edit space. Due t…

    Submitted 21 October, 2024; originally announced October 2024.

    Journal ref: Engineering Applications of Artificial Intelligence, Volume 133, Part D, July 2024, 108314

  3. arXiv:1809.06662

    cs.CL

    Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization

    Authors: Kamal Al-Sabahi, Zhang Zuping, Yang Kang

    Abstract: Sequence generative models with RNN variants, such as LSTM, GRU, show promising performance on abstractive document summarization. However, they still have some issues that limit their performance, especially while dealing with long sequences. One of the issues is that, to the best of our knowledge, all current models employ a unidirectional decoder, which reasons only about the past and still li…

    Submitted 18 September, 2018; originally announced September 2018.

    Comments: Preprint

  4. An Enhanced Latent Semantic Analysis Approach for Arabic Document Summarization

    Authors: Kamal Al-Sabahi, Zuping Zhang, Jun Long, Khaled Alwesabi

    Abstract: The fast-growing amount of information on the Internet makes the research in automatic document summarization very urgent. It is an effective solution for information overload. Many approaches have been proposed based on different strategies, such as latent semantic analysis (LSA). However, LSA, when applied to document summarization, has some limitations which diminish its performance. In this wo…

    Submitted 30 July, 2018; originally announced July 2018.

    Comments: This is a pre-print of an article published in Arabian Journal for Science and Engineering. The final authenticated version is available online at: https://doi.org/10.1007/s13369-018-3286-z

    Journal ref: K. Al-Sabahi, Z. Zhang, J. Long, and K. Alwesabi, "An Enhanced Latent Semantic Analysis Approach for Arabic Document Summarization," Arabian Journal for Science and Engineering, journal article May 05 2018

  5. Latent Semantic Analysis Approach for Document Summarization Based on Word Embeddings

    Authors: Kamal Al-Sabahi, Zhang Zuping, Yang Kang

    Abstract: Since the amount of information on the internet is growing rapidly, it is not easy for a user to find relevant information for his/her query. To tackle this issue, much attention has been paid to Automatic Document Summarization. The key point in any successful document summarizer is a good document representation. The traditional approaches based on word overlapping mostly fail to produce that ki…

    Submitted 27 October, 2018; v1 submitted 7 July, 2018; originally announced July 2018.

    Comments: 20 pages, One-column, 4 figures

    Journal ref: KSII Transactions on Internet and Information Systems, 2019, Vol. 13, No.1

  6. A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS)

    Authors: Kamal Al-Sabahi, Zhang Zuping, Mohammed Nadher

    Abstract: Recent advances in neural network architectures and training algorithms have shown the effectiveness of representation learning. Neural network-based models generate better representations than traditional ones. They have the ability to automatically learn the distributed representation for sentences and documents. To this end, we proposed a novel model that addresses several issues that…

    Submitted 20 May, 2018; originally announced May 2018.

    Comments: 8 pages, 4 figures, 2 tables, IEEE Access, pp. 1-1, 2018