
Showing 1–18 of 18 results for author: Koh, Y S

  1. arXiv:2505.23803

    cs.CR cs.AI

    MultiPhishGuard: An LLM-based Multi-Agent System for Phishing Email Detection

    Authors: Yinuo Xue, Eric Spero, Yun Sing Koh, Giovanni Russello

    Abstract: Phishing email detection faces critical challenges from evolving adversarial tactics and heterogeneous attack patterns. Traditional detection methods, such as rule-based filters and denylists, often struggle to keep pace with these evolving tactics, leading to false negatives and compromised security. While machine learning approaches have improved detection accuracy, they still face challenges ad…

    Submitted 26 May, 2025; originally announced May 2025.

  2. arXiv:2504.00349

    cs.LG

    Reducing Smoothness with Expressive Memory Enhanced Hierarchical Graph Neural Networks

    Authors: Thomas Bailie, Yun Sing Koh, S. Karthik Mukkavilli, Varvara Vetrova

    Abstract: Graphical forecasting models learn the structure of time series data via projecting onto a graph, with recent techniques capturing spatial-temporal associations between variables via edge weights. Hierarchical variants offer a distinct advantage by analysing the time series across multiple resolutions, making them particularly effective in tasks like global weather forecasting, where low-resolutio…

    Submitted 2 April, 2025; v1 submitted 31 March, 2025; originally announced April 2025.

  3. arXiv:2502.13495

    physics.ao-ph cs.LG stat.AP

    A Study on Monthly Marine Heatwave Forecasts in New Zealand: An Investigation of Imbalanced Regression Loss Functions with Neural Network Models

    Authors: Ding Ning, Varvara Vetrova, Sébastien Delaux, Rachael Tappenden, Karin R. Bryan, Yun Sing Koh

    Abstract: Marine heatwaves (MHWs) are extreme ocean-temperature events with significant impacts on marine ecosystems and related industries. Accurate forecasts (one to six months ahead) of MHWs would aid in mitigating these impacts. However, forecasting MHWs presents a challenging imbalanced regression task due to the rarity of extreme temperature anomalies in comparison to more frequent moderate conditions…

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: The paper contains 32 pages for the main text

  4. arXiv:2502.07432

    cs.LG

    CapyMOA: Efficient Machine Learning for Data Streams in Python

    Authors: Heitor Murilo Gomes, Anton Lee, Nuwan Gunasekara, Yibin Sun, Guilherme Weigert Cassales, Justin Liu, Marco Heyden, Vitor Cerqueira, Maroua Bahri, Yun Sing Koh, Bernhard Pfahringer, Albert Bifet

    Abstract: CapyMOA is an open-source library designed for efficient machine learning on streaming data. It provides a structured framework for real-time learning and evaluation, featuring a flexible data representation. CapyMOA includes an extensible architecture that allows integration with external frameworks such as MOA and PyTorch, facilitating hybrid learning approaches that combine traditional online a…

    Submitted 11 February, 2025; originally announced February 2025.

  5. arXiv:2502.01679

    cs.CY cs.CL cs.LG

    LIBRA: Measuring Bias of Large Language Model from a Local Context

    Authors: Bo Pang, Tingrui Qiao, Caroline Walker, Chris Cunningham, Yun Sing Koh

    Abstract: Large Language Models (LLMs) have significantly advanced natural language processing applications, yet their widespread use raises concerns regarding inherent biases that may reduce utility or harm for particular social groups. Despite the advancement in addressing LLM bias, existing research has two major limitations. First, existing LLM bias evaluation focuses on the U.S. cultural context, makin…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: Paper accepted by ECIR 2025

  6. arXiv:2501.13368

    cs.CV cs.LG

    Meta-Feature Adapter: Integrating Environmental Metadata for Enhanced Animal Re-identification

    Authors: Yuzhuo Li, Di Zhao, Yihao Wu, Yun Sing Koh

    Abstract: Identifying individual animals within large wildlife populations is essential for effective wildlife monitoring and conservation efforts. Recent advancements in computer vision have shown promise in animal re-identification (Animal ReID) by leveraging data from camera traps. However, existing methods rely exclusively on visual data, neglecting environmental metadata that ecologists have identified…

    Submitted 22 January, 2025; originally announced January 2025.

    Comments: 10 pages, 6 figures

  7. arXiv:2501.05731

    cs.LG physics.ao-ph stat.AP

    Diving Deep: Forecasting Sea Surface Temperatures and Anomalies

    Authors: Ding Ning, Varvara Vetrova, Karin R. Bryan, Yun Sing Koh, Andreas Voskou, N'Dah Jean Kouagou, Arnab Sharma

    Abstract: This overview paper details the findings from the Diving Deep: Forecasting Sea Surface Temperatures and Anomalies Challenge at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2024. The challenge focused on the data-driven predictability of global sea surface temperatures (SSTs), a key factor in climate forecasting, ecosystem m…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: The paper contains 9 pages for the main text and 10 pages including References. 5 figures. Discovery Track, European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2024

  8. arXiv:2412.04475

    physics.ao-ph cs.LG

    Advancing Marine Heatwave Forecasts: An Integrated Deep Learning Approach

    Authors: Ding Ning, Varvara Vetrova, Yun Sing Koh, Karin R. Bryan

    Abstract: Marine heatwaves (MHWs), an extreme climate phenomenon, pose significant challenges to marine ecosystems and industries, with their frequency and intensity increasing due to climate change. This study introduces an integrated deep learning approach to forecast short-to-long-term MHWs on a global scale. The approach combines graph representation for modeling spatial properties in climate data, imba…

    Submitted 19 November, 2024; originally announced December 2024.

    Comments: The paper contains 7 pages for the main text, 9 pages including References, and 17 pages including the Appendix. 3 figures

  9. arXiv:2411.12052

    cs.LG

    Higher Order Graph Attention Probabilistic Walk Networks

    Authors: Thomas Bailie, Yun Sing Koh, Karthik Mukkavilli

    Abstract: Graphs inherently capture dependencies between nodes or variables through their topological structure, with paths between any two nodes indicating a sequential dependency on the nodes traversed. Message Passing Neural Networks (MPNNs) leverage these latent relationships embedded in graph structures, and have become widely adopted across diverse applications. However, many existing methods predomin…

    Submitted 18 November, 2024; originally announced November 2024.

  10. arXiv:2410.22927

    cs.CV cs.LG

    An Individual Identity-Driven Framework for Animal Re-Identification

    Authors: Yihao Wu, Di Zhao, Jingfeng Zhang, Yun Sing Koh

    Abstract: Reliable re-identification of individuals within large wildlife populations is crucial for biological studies, ecological research, and wildlife conservation. Classic computer vision techniques offer a promising direction for Animal Re-identification (Animal ReID), but their backbones' close-set nature limits their applicability and generalizability. Despite the demonstrated effectiveness of visio…

    Submitted 30 October, 2024; originally announced October 2024.

    Comments: 10 pages

    MSC Class: 68T45

  11. arXiv:2410.15875

    cs.LG

    Enabling Asymmetric Knowledge Transfer in Multi-Task Learning with Self-Auxiliaries

    Authors: Olivier Graffeuille, Yun Sing Koh, Joerg Wicker, Moritz Lehmann

    Abstract: Knowledge transfer in multi-task learning is typically viewed as a dichotomy; positive transfer, which improves the performance of all tasks, or negative transfer, which hinders the performance of all tasks. In this paper, we investigate the understudied problem of asymmetric task relationships, where knowledge transfer aids the learning of certain tasks while hindering the learning of others. We…

    Submitted 21 October, 2024; originally announced October 2024.

  12. A Probabilistic Framework for Adapting to Changing and Recurring Concepts in Data Streams

    Authors: Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet

    Abstract: The distribution of streaming data often changes over time as conditions change, a phenomenon known as concept drift. Only a subset of previous experience, collected in similar conditions, is relevant to learning an accurate classifier for current data. Learning from irrelevant experience describing a different concept can degrade performance. A system learning from streaming data must identify wh…

    Submitted 17 August, 2024; originally announced August 2024.

  13. arXiv:2402.11989

    cs.LG cs.CR cs.CV

    Privacy-Preserving Low-Rank Adaptation against Membership Inference Attacks for Latent Diffusion Models

    Authors: Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang

    Abstract: Low-rank adaptation (LoRA) is an efficient strategy for adapting latent diffusion models (LDMs) on a private dataset to generate specific images by minimizing the adaptation loss. However, the LoRA-adapted LDMs are vulnerable to membership inference (MI) attacks that can judge whether a particular data point belongs to the private dataset, thus leading to the privacy leakage. To defend against MI…

    Submitted 15 December, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: AAAI 2025 Accept

  14. arXiv:2305.00645

    cs.CR

    GTree: GPU-Friendly Privacy-preserving Decision Tree Training and Inference

    Authors: Qifan Wang, Shujie Cui, Lei Zhou, Ye Dong, Jianli Bai, Yun Sing Koh, Giovanni Russello

    Abstract: Decision tree (DT) is a widely used machine learning model due to its versatility, speed, and interpretability. However, for privacy-sensitive applications, outsourcing DT training and inference to cloud platforms raise concerns about data privacy. Researchers have developed privacy-preserving approaches for DT training and inference using cryptographic primitives, such as Secure Multi-Party Compu…

    Submitted 1 April, 2025; v1 submitted 30 April, 2023; originally announced May 2023.

  15. arXiv:2304.00664

    cs.HC cs.CR

    What You See is Not What You Get: The Role of Email Presentation in Phishing Susceptibility

    Authors: Sijie Zhuo, Robert Biddle, Lucas Betts, Nalin Asanka Gamagedara Arachchilage, Yun Sing Koh, Danielle Lottridge, Giovanni Russello

    Abstract: Phishing is one of the most prevalent social engineering attacks that targets both organizations and individuals. It is crucial to understand how email presentation impacts users' reactions to phishing attacks. We speculated that the device and email presentation may play a role, and, in particular, that how links are shown might influence susceptibility. Collaborating with the IT Services unit of…

    Submitted 2 April, 2023; originally announced April 2023.

    Comments: 12 pages, 3 figures

  16. arXiv:2202.07905

    cs.CR cs.CY cs.HC

    SoK: Human-Centered Phishing Susceptibility

    Authors: Sijie Zhuo, Robert Biddle, Yun Sing Koh, Danielle Lottridge, Giovanni Russello

    Abstract: Phishing is recognised as a serious threat to organisations and individuals. While there have been significant technical advances in blocking phishing attacks, people remain the last line of defence after phishing emails reach their email client. Most of the existing literature on this subject has focused on the technical aspects related to phishing. However, the factors that cause humans to be su…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: 13 pages of content, 2 figures, 18 pages in total

  17. arXiv:2112.13497

    cs.SE

    Evaluating Software User Feedback Classifiers on Unseen Apps, Datasets, and Metadata

    Authors: Peter Devine, Yun Sing Koh, Kelly Blincoe

    Abstract: Listening to user's requirements is crucial to building and maintaining high quality software. Online software user feedback has been shown to contain large amounts of information useful to requirements engineering (RE). Previous studies have created machine learning classifiers for parsing this feedback for development insight. While these classifiers report generally good performance when evalua…

    Submitted 26 December, 2021; originally announced December 2021.

  18. arXiv:1905.08848

    cs.LG stat.ML

    Recurring Concept Meta-learning for Evolving Data Streams

    Authors: Robert Anderson, Yun Sing Koh, Gillian Dobbie, Albert Bifet

    Abstract: When concept drift is detected during classification in a data stream, a common remedy is to retrain a framework's classifier. However, this loses useful information if the classifier has learnt the current concept well, and this concept will recur again in the future. Some frameworks retain and reuse classifiers, but it can be time-consuming to select an appropriate classifier to reuse. These fra…

    Submitted 21 May, 2019; originally announced May 2019.