Showing 1–7 of 7 results for author: Lin, H Y

  1. arXiv:2503.19444  [pdf, other]

    cs.SE

    AI Safety in the Eyes of the Downstream Developer: A First Look at Concerns, Practices, and Challenges

    Authors: Haoyu Gao, Mansooreh Zahedi, Wenxin Jiang, Hong Yi Lin, James Davis, Christoph Treude

    Abstract: Pre-trained models (PTMs) have become a cornerstone of AI-based software, allowing for rapid integration and development with minimal training overhead. However, their adoption also introduces unique safety challenges, such as data leakage and biased outputs, that demand rigorous handling by downstream developers. While previous research has proposed taxonomies of AI safety concerns and various mi…

    Submitted 25 March, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

  2. arXiv:2503.16167  [pdf, ps, other]

    cs.SE cs.CL

    CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models

    Authors: Hong Yi Lin, Chunhua Liu, Haoyu Gao, Patanamon Thongtanunam, Christoph Treude

    Abstract: State-of-the-art large language models (LLMs) have demonstrated impressive code generation capabilities but struggle with real-world software engineering tasks, such as revising source code to address code reviews, hindering their practical use. Code review comments are often implicit, ambiguous, and colloquial, requiring models to grasp both code and human intent. This challenge calls for evaluat…

    Submitted 31 May, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

    Comments: The paper is published in Findings of the Association for Computational Linguistics (ACL 2025)

  3. arXiv:2502.03806  [pdf, other]

    cs.SE cs.LG

    Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks

    Authors: Kyi Shin Khant, Hong Yi Lin, Patanamon Thongtanunam

    Abstract: Learning-based techniques, especially advanced pre-trained models for code, have demonstrated capabilities in code understanding and generation, solving diverse software engineering (SE) tasks. Despite the promising results, current training approaches may not fully optimize model performance, as they typically involve learning from randomly shuffled training data. Recent work shows that Curriculum…

    Submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted by the 22nd International Conference on Mining Software Repositories (MSR 25)

  4. arXiv:2502.02757  [pdf, other]

    cs.SE

    Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation

    Authors: Chunhua Liu, Hong Yi Lin, Patanamon Thongtanunam

    Abstract: Code review is an important practice in software development, yet it is time-consuming and requires substantial effort. While open-source datasets have been used to train neural models for automating code review tasks, including review comment generation, these datasets contain a significant amount of noisy comments (e.g., vague or non-actionable feedback) that persist despite cleaning methods usi…

    Submitted 5 February, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

    Comments: The paper is published at the International Conference on Mining Software Repositories (MSR2025)

  5. arXiv:2409.10959  [pdf, other]

    cs.SE cs.LG

    Leveraging Reviewer Experience in Code Review Comment Generation

    Authors: Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Michael W. Godfrey, Chunhua Liu, Wachiraphan Charoenwet

    Abstract: Modern code review is a ubiquitous software quality assurance process aimed at identifying potential issues within newly written code. Despite its effectiveness, the process demands large amounts of effort from the human reviewers involved. To help alleviate this workload, researchers have trained deep learning models to imitate human reviewers in providing natural language code reviews. Formally,…

    Submitted 17 September, 2024; originally announced September 2024.

  6. Improving Automated Code Reviews: Learning from Experience

    Authors: Hong Yi Lin, Patanamon Thongtanunam, Christoph Treude, Wachiraphan Charoenwet

    Abstract: Modern code review is a critical quality assurance process that is widely adopted in both industry and open source software environments. This process can help newcomers learn from the feedback of experienced reviewers; however, it often brings a large workload and stress to reviewers. To alleviate this burden, the field of automated code reviews aims to automate the process, teaching large langua…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted by the 21st International Conference on Mining Software Repositories (MSR 24)

  7. arXiv:0911.3455  [pdf]

    cond-mat.mtrl-sci cond-mat.other

    Oxygen vacancies in N doped TiO2: Experiment and first principle calculations

    Authors: Abdul K. Rumaiz, J. C. Woicik, E. Cockayne, H. Y. Lin, G. Hassnain Jaffari, S. Ismat Shah

    Abstract: We have determined the electronic and atomic structure of N doped TiO2 using a combination of hard x-ray photoelectron spectroscopy (HAXPES) and first-principles density functional theory calculations. Our results reveal that N doping of TiO2 leads to the formation of oxygen vacancies and the combination of both N impurity and oxygen vacancies accounts for the observed visible light catalytic b…

    Submitted 3 December, 2009; v1 submitted 17 November, 2009; originally announced November 2009.

    Comments: 13 pages with 5 figures