Showing 1–50 of 286 results for author: Sugiyama, M

Searching in archive cs.
  1. arXiv:2506.10616  [pdf, ps, other]

    cs.LG

    Non-stationary Online Learning for Curved Losses: Improved Dynamic Regret via Mixability

    Authors: Yu-Jie Zhang, Peng Zhao, Masashi Sugiyama

    Abstract: Non-stationary online learning has drawn much attention in recent years. Despite considerable progress, dynamic regret minimization has primarily focused on convex functions, leaving the functions with stronger curvature (e.g., squared or logistic loss) underexplored. In this work, we address this gap by showing that the regret can be substantially improved by leveraging the concept of mixability,…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  2. arXiv:2505.24709  [pdf, ps, other]

    cs.LG cs.AI

    On Symmetric Losses for Robust Policy Optimization with Noisy Preferences

    Authors: Soichiro Nishimori, Yu-Jie Zhang, Thanawat Lodkaew, Masashi Sugiyama

    Abstract: Optimizing policies based on human preferences is key to aligning language models with human intent. This work focuses on reward modeling, a core component in reinforcement learning from human feedback (RLHF), and offline preference optimization, such as direct preference optimization. Conventional approaches typically assume accurate annotations. However, real-world preference data often contains…

    Submitted 30 May, 2025; originally announced May 2025.

  3. arXiv:2505.20761  [pdf, ps, other]

    cs.LG stat.ML

    Practical estimation of the optimal classification error with soft labels and calibration

    Authors: Ryota Ushio, Takashi Ishida, Masashi Sugiyama

    Abstract: While the performance of machine learning systems has experienced significant improvement in recent years, relatively little attention has been paid to the fundamental question: to what extent can we improve our models? This paper provides a means of answering this question in the setting of binary classification, which is practical and theoretically supported. We extend a previous work that utili…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 36 pages, 24 figures; GitHub: https://github.com/RyotaUshio/bayes-error-estimation

  4. arXiv:2505.13900  [pdf, ps, other]

    cs.LG

    New Evidence of the Two-Phase Learning Dynamics of Neural Networks

    Authors: Zhanpeng Zhou, Yongyi Yang, Mahito Sugiyama, Junchi Yan

    Abstract: Understanding how deep neural networks learn remains a fundamental challenge in modern machine learning. A growing body of evidence suggests that training dynamics undergo a distinct phase transition, yet our understanding of this transition is still incomplete. In this paper, we introduce an interval-wise perspective that compares network states across a time window, revealing two new phenomena t…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: This work extends the workshop paper, On the Cone Effect in the Learning Dynamics, accepted by ICLR 2025 Workshop DeLTa

  5. arXiv:2505.09045  [pdf, ps, other]

    math.OC cs.CC cs.DC

    The Adaptive Complexity of Finding a Stationary Point

    Authors: Huanjian Zhou, Andi Han, Akiko Takeda, Masashi Sugiyama

    Abstract: In large-scale applications, such as machine learning, it is desirable to design non-convex optimization algorithms with a high degree of parallelization. In this work, we study the adaptive complexity of finding a stationary point, which is the minimal number of sequential rounds required to achieve stationarity given polynomially many queries executed in parallel at each round. For the high-di…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: Accepted to COLT2025

  6. arXiv:2504.21334  [pdf, other]

    cs.CV

    Simple Visual Artifact Detection in Sora-Generated Videos

    Authors: Misora Sugiyama, Hirokatsu Kataoka

    Abstract: The December 2024 release of OpenAI's Sora, a powerful video generation model driven by natural language prompts, highlights a growing convergence between large language models (LLMs) and video synthesis. As these multimodal systems evolve into video-enabled LLMs (VidLLMs), capable of interpreting, generating, and interacting with visual content, understanding their limitations and ensuring their…

    Submitted 30 April, 2025; originally announced April 2025.

  7. arXiv:2504.08234  [pdf, other]

    cs.SE cs.LG

    Bringing Structure to Naturalness: On the Naturalness of ASTs

    Authors: Profir-Petru Pârţachi, Mahito Sugiyama

    Abstract: Source code comes in different shapes and forms. Previous research has already shown code to be more predictable than natural language as well as highlighted its statistical predictability at the token level: source code can be natural. More recently, the structure of code -- control flow, syntax graphs, abstract syntax trees etc. -- has been successfully used to improve the state-of-the-art on nu…

    Submitted 10 April, 2025; originally announced April 2025.

  8. arXiv:2503.16316  [pdf, other]

    cs.LG

    On the Cone Effect in the Learning Dynamics

    Authors: Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan

    Abstract: Understanding the learning dynamics of neural networks is a central topic in the deep learning community. In this paper, we take an empirical perspective to study the learning dynamics of neural networks in real-world settings. Specifically, we investigate the evolution process of the empirical Neural Tangent Kernel (eNTK) during training. Our key findings reveal a two-phase learning process: i) i…

    Submitted 13 April, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

    Comments: Accepted by ICLR 2025 workshop DeLTa

  9. arXiv:2503.10669  [pdf, other]

    cs.CL cs.AI

    UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality

    Authors: Zelei Cheng, Xin-Qiang Cai, Yuting Tang, Pushi Zhang, Boming Yang, Masashi Sugiyama, Xinyu Xing

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone for aligning large language models (LLMs) with human values. However, existing approaches struggle to capture the multi-dimensional, distributional nuances of human preferences. Methods such as RiC that directly inject raw reward values into prompts face significant numerical sensitivity issues--for instance, LLMs may fail…

    Submitted 18 May, 2025; v1 submitted 10 March, 2025; originally announced March 2025.

    Comments: Language Modeling, Machine Learning for NLP, Distributional Pareto-Optimal

  10. arXiv:2503.08155  [pdf, other]

    cs.LG

    Domain Adaptation and Entanglement: an Optimal Transport Perspective

    Authors: Okan Koç, Alexander Soen, Chao-Kai Chiang, Masashi Sugiyama

    Abstract: Current machine learning systems are brittle in the face of distribution shifts (DS), where the target distribution that the system is tested on differs from the source distribution used to train the system. This problem of robustness to DS has been studied extensively in the field of domain adaptation. For deep neural networks, a popular framework for unsupervised domain adaptation (UDA) is domai…

    Submitted 11 March, 2025; originally announced March 2025.

    Comments: Accepted for publication in AISTATS'25

  11. arXiv:2503.04151  [pdf, other]

    cs.CV cs.AI cs.LG

    Robust Multi-View Learning via Representation Fusion of Sample-Level Attention and Alignment of Simulated Perturbation

    Authors: Jie Xu, Na Zhao, Gang Niu, Masashi Sugiyama, Xiaofeng Zhu

    Abstract: Recently, multi-view learning (MVL) has garnered significant attention due to its ability to fuse discriminative information from multiple views. However, real-world multi-view datasets are often heterogeneous and imperfect, which usually makes MVL methods designed for specific combinations of views lack application potential and limits their effectiveness. To address this issue, we propose a nove…

    Submitted 6 March, 2025; originally announced March 2025.

  12. arXiv:2502.14205  [pdf, other]

    cs.LG cs.AI

    Accurate Forgetting for Heterogeneous Federated Continual Learning

    Authors: Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang, Kunda Yan, Bo Han, Gang Niu, Lei Fang, Changshui Zhang, Masashi Sugiyama

    Abstract: Recent years have witnessed a burgeoning interest in federated learning (FL). However, the contexts in which clients engage in sequential learning remain under-explored. Bridging FL and continual learning (CL) gives rise to a challenging practical problem: federated continual learning (FCL). Existing research in FCL primarily focuses on mitigating the catastrophic forgetting issue of continual lea…

    Submitted 19 February, 2025; originally announced February 2025.

    Comments: published in ICLR 2024

  13. arXiv:2502.10184  [pdf, other]

    cs.LG

    Realistic Evaluation of Deep Partial-Label Learning Algorithms

    Authors: Wei Wang, Dong-Dong Wu, Jindong Wang, Gang Niu, Min-Ling Zhang, Masashi Sugiyama

    Abstract: Partial-label learning (PLL) is a weakly supervised learning problem in which each example is associated with multiple candidate labels and only one is the true label. In recent years, many deep PLL algorithms have been developed to improve model performance. However, we find that some early developed algorithms are often underestimated and can outperform many later algorithms with complicated des…

    Submitted 14 February, 2025; originally announced February 2025.

    Comments: ICLR 2025 Spotlight

  14. arXiv:2502.05206  [pdf, ps, other]

    cs.CR cs.AI cs.CL cs.CV

    Safety at Scale: A Comprehensive Survey of Large Model Safety

    Authors: Xingjun Ma, Yifeng Gao, Yixu Wang, Ruofan Wang, Xin Wang, Ye Sun, Yifan Ding, Hengyuan Xu, Yunhao Chen, Yunhan Zhao, Hanxun Huang, Yige Li, Jiaming Zhang, Xiang Zheng, Yang Bai, Zuxuan Wu, Xipeng Qiu, Jingfeng Zhang, Yiming Li, Xudong Han, Haonan Li, Jun Sun, Cong Wang, Jindong Gu, Baoyuan Wu, et al. (22 additional authors not shown)

    Abstract: The rapid advancement of large models, driven by their exceptional abilities in learning and generalization through large-scale pre-training, has reshaped the landscape of Artificial Intelligence (AI). These models are now foundational to a wide range of applications, including conversational AI, recommendation systems, autonomous driving, content generation, medical diagnostics, and scientific di…

    Submitted 2 June, 2025; v1 submitted 2 February, 2025; originally announced February 2025.

    Comments: 47 pages, 3 figures, 11 tables; GitHub: https://github.com/xingjunm/Awesome-Large-Model-Safety

  15. arXiv:2502.01170  [pdf, other]

    cs.LG

    Label Distribution Learning with Biased Annotations by Learning Multi-Label Representation

    Authors: Zhiqiang Kou, Si Qin, Hailin Wang, Mingkun Xie, Shuo Chen, Yuheng Jia, Tongliang Liu, Masashi Sugiyama, Xin Geng

    Abstract: Multi-label learning (MLL) has gained attention for its ability to represent real-world data. Label Distribution Learning (LDL), an extension of MLL to learning from label distributions, faces challenges in collecting accurate label distributions. To address the issue of biased annotations, based on the low-rank assumption, existing works recover true distributions from biased observations by expl…

    Submitted 3 February, 2025; originally announced February 2025.

  16. arXiv:2502.00473  [pdf, other]

    cs.LG cs.CV

    Weak-to-Strong Diffusion with Reflection

    Authors: Lichen Bai, Masashi Sugiyama, Zeke Xie

    Abstract: The goal of diffusion generative models is to align the learned distribution with the real data distribution through gradient score matching. However, inherent limitations in training data quality, modeling strategies, and architectural design lead to an inevitable gap between generated outputs and real data. To reduce this gap, we propose Weak-to-Strong Diffusion (W2SD), a novel framework that utili…

    Submitted 24 April, 2025; v1 submitted 1 February, 2025; originally announced February 2025.

    Comments: 23 pages, 23 figures, 15 tables

  17. arXiv:2412.21205  [pdf, other]

    cs.CV cs.AI cs.LG

    Action-Agnostic Point-Level Supervision for Temporal Action Detection

    Authors: Shuhei M. Yoshida, Takashi Shibata, Makoto Terao, Takayuki Okatani, Masashi Sugiyama

    Abstract: We propose action-agnostic point-level (AAPL) supervision for temporal action detection to achieve accurate action instance detection with a lightly annotated dataset. In the proposed scheme, a small portion of video frames is sampled in an unsupervised manner and presented to human annotators, who then label the frames with action categories. Unlike point-level supervision, which requires annotat…

    Submitted 30 December, 2024; originally announced December 2024.

    Comments: AAAI-25. Technical appendices included. 15 pages, 3 figures, 11 tables

  18. arXiv:2412.07435  [pdf, other]

    cs.DS cs.DC cs.LG math.NA

    Parallel simulation for sampling under isoperimetry and score-based diffusion models

    Authors: Huanjian Zhou, Masashi Sugiyama

    Abstract: In recent years, there has been a surge of interest in proving discretization bounds for sampling under isoperimetry and for diffusion models. As data size grows, reducing the iteration cost becomes an important goal. Inspired by the great success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks. Rigorous theor…

    Submitted 12 December, 2024; v1 submitted 10 December, 2024; originally announced December 2024.

  19. arXiv:2410.20176  [pdf, other]

    cs.LG

    Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning

    Authors: Yuting Tang, Xin-Qiang Cai, Jing-Cheng Pang, Qiyu Wu, Yao-Xiang Ding, Masashi Sugiyama

    Abstract: Reinforcement Learning (RL) empowers agents to acquire various skills by learning from reward signals. Unfortunately, designing high-quality instance-level rewards often demands significant effort. An emerging alternative, RL with delayed reward, focuses on learning from rewards presented periodically, which can be obtained from human evaluators assessing the agent's performance over sequences of…

    Submitted 26 October, 2024; originally announced October 2024.

  20. arXiv:2410.12457  [pdf, other]

    cs.LG cs.AI

    Sharpness-Aware Black-Box Optimization

    Authors: Feiyang Ye, Yueming Lyu, Xuehao Wang, Masashi Sugiyama, Yu Zhang, Ivor Tsang

    Abstract: Black-box optimization algorithms have been widely used in various machine learning problems, including reinforcement learning and prompt fine-tuning. However, directly optimizing the training loss value, as commonly done in existing black-box optimization methods, could lead to suboptimal model quality and generalization performance. To address those problems in black-box optimization, we propose…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 27 pages, 5 figures

  21. arXiv:2410.11964  [pdf, other]

    cs.LG stat.ML

    A Complete Decomposition of KL Error using Refined Information and Mode Interaction Selection

    Authors: James Enouen, Mahito Sugiyama

    Abstract: The log-linear model has received a significant amount of theoretical attention in previous decades and remains the fundamental tool used for learning probability distributions over discrete variables. Despite its large popularity in statistical mechanics and high-dimensional statistics, the vast majority of such energy-based modeling approaches only focus on the two-variable relationships, such a…

    Submitted 15 October, 2024; originally announced October 2024.

  22. arXiv:2410.03124  [pdf, other]

    cs.CL cs.LG

    In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement

    Authors: Zhen-Yu Zhang, Jiandong Zhang, Huaxiu Yao, Gang Niu, Masashi Sugiyama

    Abstract: Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. Most existing methods rely on human supervision or parameter retraining, both of which are costly in terms of data collection and computational resources. To handle these challenges, a direct solution is to generate "high-confidence" data from…

    Submitted 26 May, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

  23. arXiv:2410.00718  [pdf, other]

    cs.LG

    Pseudo-Non-Linear Data Augmentation via Energy Minimization

    Authors: Pingbang Hu, Mahito Sugiyama

    Abstract: We propose a novel and interpretable data augmentation method based on energy-based modeling and principles from information geometry. Unlike black-box generative models, which rely on deep neural networks, our approach replaces these non-interpretable transformations with explicit, theoretically grounded ones, ensuring interpretability and strong guarantees such as energy minimization. Central to…

    Submitted 1 October, 2024; originally announced October 2024.

  24. arXiv:2409.16718  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO

    Vision-Language Model Fine-Tuning via Simple Parameter-Efficient Modification

    Authors: Ming Li, Jike Zhong, Chenxin Li, Liuzhuozheng Li, Nie Lin, Masashi Sugiyama

    Abstract: Recent advances in fine-tuning Vision-Language Models (VLMs) have witnessed the success of prompt tuning and adapter tuning, while the classic model fine-tuning on inherent parameters seems to be overlooked. It is believed that fine-tuning the parameters of VLMs with few-shot samples corrupts the pre-trained knowledge since fine-tuning the CLIP model even degrades performance. In this paper, we re…

    Submitted 19 November, 2024; v1 submitted 25 September, 2024; originally announced September 2024.

    Comments: EMNLP 2024 Main Conference

  25. arXiv:2408.13045  [pdf, other]

    cs.DS

    The adaptive complexity of parallelized log-concave sampling

    Authors: Huanjian Zhou, Baoxiang Wang, Masashi Sugiyama

    Abstract: In large-data applications, such as the inference process of diffusion models, it is desirable to design sampling algorithms with a high degree of parallelization. In this work, we study the adaptive complexity of sampling, which is the minimum number of sequential rounds required to achieve sampling given polynomially many queries executed in parallel at each round. For unconstrained sampling, we…

    Submitted 19 May, 2025; v1 submitted 23 August, 2024; originally announced August 2024.

  26. arXiv:2407.18624  [pdf, other]

    cs.LG

    Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning

    Authors: Jia-Hao Xiao, Ming-Kun Xie, Heng-Bo Fan, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang

    Abstract: Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance. To solve this problem, the mainstream method developed an effective t…

    Submitted 26 December, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

    Comments: Published in ECCV 2024

  27. arXiv:2406.09179  [pdf, other]

    cs.LG

    Towards Effective Evaluations and Comparisons for LLM Unlearning Methods

    Authors: Qizhou Wang, Bo Han, Puning Yang, Jianing Zhu, Tongliang Liu, Masashi Sugiyama

    Abstract: The imperative to eliminate undesirable data memorization underscores the significance of machine unlearning for large language models (LLMs). Recent research has introduced a series of promising unlearning methods, notably boosting the practical significance of the field. Nevertheless, adopting a proper evaluation framework to reflect the true unlearning efficacy is also essential yet has not rec…

    Submitted 24 February, 2025; v1 submitted 13 June, 2024; originally announced June 2024.

  28. arXiv:2406.08288  [pdf, other]

    cs.LG

    Decoupling the Class Label and the Target Concept in Machine Unlearning

    Authors: Jianing Zhu, Bo Han, Jiangchao Yao, Jianliang Xu, Gang Niu, Masashi Sugiyama

    Abstract: Machine unlearning, an emerging research topic for data regulations, aims to adjust a trained model to approximate a retrained one that excludes a portion of training data. Previous studies showed that class-wise unlearning is successful in forgetting the knowledge of a target class, through gradient ascent on the forgetting data or fine-tuning with the remaining data. However, while these metho…

    Submitted 16 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  29. arXiv:2405.20494  [pdf, other]

    cs.CV cs.AI cs.LG

    Slight Corruption in Pre-training Data Makes Better Diffusion Models

    Authors: Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj

    Abstract: Diffusion models (DMs) have shown remarkable capabilities in generating realistic high-quality images, audios, and videos. They benefit significantly from extensive pre-training on large-scale datasets, including web-crawled data with paired data and conditions, such as image-text and image-class pairs. Despite rigorous filtering, these pre-training datasets often inevitably contain corrupted pair…

    Submitted 30 October, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024 Spotlight

  30. arXiv:2405.18890  [pdf, other]

    cs.LG cs.DC

    Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization

    Authors: Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya Zhang, Masashi Sugiyama, Yanfeng Wang

    Abstract: In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima, degenerating the performance of the resulting global model. Prevalent federated approaches incorporate sharpness-aware minimization (SAM) into local training to mitigate this problem. However, the local loss landscapes may not accurately reflect the flatness of…

    Submitted 29 May, 2024; originally announced May 2024.

  31. arXiv:2405.16168  [pdf, other]

    cs.LG stat.ML

    Multi-Player Approaches for Dueling Bandits

    Authors: Or Raveh, Junya Honda, Masashi Sugiyama

    Abstract: Various approaches have emerged for multi-armed bandits in distributed systems. The multiplayer dueling bandit problem, common in scenarios with only preference-based information like human feedback, introduces challenges related to controlling collaborative exploration of non-informative arm pairs, but has received little attention. To fill this gap, we demonstrate that the direct use of a Follow…

    Submitted 23 April, 2025; v1 submitted 25 May, 2024; originally announced May 2024.

  32. arXiv:2405.14596  [pdf, other]

    cs.LG

    Linear Mode Connectivity in Differentiable Tree Ensembles

    Authors: Ryuichi Kanoh, Mahito Sugiyama

    Abstract: Linear Mode Connectivity (LMC) refers to the phenomenon that performance remains consistent for linearly interpolated models in the parameter space. For independently optimized model pairs from different random initializations, achieving LMC is considered crucial for understanding the stable success of the non-convex optimization in modern machine learning models and for facilitating practical par…

    Submitted 14 February, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to ICLR 2025

  33. arXiv:2405.14114  [pdf, other]

    cs.LG cs.AI

    Offline Reinforcement Learning from Datasets with Structured Non-Stationarity

    Authors: Johannes Ackermann, Takayuki Osa, Masashi Sugiyama

    Abstract: Current Reinforcement Learning (RL) is often limited by the large amount of data needed to learn a successful policy. Offline RL aims to solve this issue by using transitions collected by a different behavior policy. We address a novel Offline RL problem setting in which, while collecting the dataset, the transition and reward functions gradually change between episodes but stay constant within ea…

    Submitted 27 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted for Reinforcement Learning Conference (RLC) 2024

  34. arXiv:2405.09892  [pdf, other]

    cs.LG cs.DC

    Balancing Similarity and Complementarity for Federated Learning

    Authors: Kunda Yan, Sen Cui, Abudukelimu Wuerkaixi, Jingfeng Zhang, Bo Han, Gang Niu, Masashi Sugiyama, Changshui Zhang

    Abstract: In mobile and IoT systems, Federated Learning (FL) is increasingly important for effectively using data while maintaining user privacy. One key challenge in FL is managing statistical heterogeneity, such as non-i.i.d. data, arising from numerous clients and diverse data sources. This requires strategic cooperation, often with clients having similar characteristics. However, we are interested in a…

    Submitted 16 May, 2024; originally announced May 2024.

  35. arXiv:2404.07465  [pdf, other]

    cs.LG

    Offline Reinforcement Learning with Domain-Unlabeled Data

    Authors: Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, Masashi Sugiyama

    Abstract: Offline reinforcement learning (RL) is vital in areas where active data collection is expensive or infeasible, such as robotics or healthcare. In the real world, offline datasets often involve multiple domains that share the same state and action spaces but have distinct dynamics, and only a small fraction of samples are clearly labeled as belonging to the target domain we are interested in. For e…

    Submitted 28 February, 2025; v1 submitted 11 April, 2024; originally announced April 2024.

  36. arXiv:2404.06287  [pdf, other]

    cs.CV cs.LG

    Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training

    Authors: Ming-Kun Xie, Jia-Hao Xiao, Pei Peng, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang

    Abstract: The key to multi-label image classification (MLC) is to improve model performance by leveraging label correlations. Unfortunately, it has been shown that overemphasizing co-occurrence relationships can cause the overfitting issue of the model, ultimately leading to performance degradation. In this paper, we provide a causal inference framework to show that the correlative features caused by the ta…

    Submitted 12 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  37. arXiv:2403.10855  [pdf, other]

    cs.LG cs.RO

    Reinforcement Learning with Options and State Representation

    Authors: Ayoub Ghriss, Masashi Sugiyama, Alessandro Lazaric

    Abstract: The current thesis aims to explore the reinforcement learning field and build on existing methods to produce improved ones to tackle the problem of learning in high-dimensional and complex environments. It addresses such goals by decomposing learning tasks in a hierarchical fashion known as Hierarchical Reinforcement Learning. We start in the first chapter by getting familiar with the Markov Dec…

    Submitted 25 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

    Comments: Master Thesis 2018, MVA ENS Paris-Saclay, Tokyo RIKEN AIP

  38. arXiv:2403.06869  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Impact of Noisy Supervision in Foundation Model Learning

    Authors: Hao Chen, Zihan Wang, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj, Jindong Wang

    Abstract: Foundation models are usually pre-trained on large-scale datasets and then adapted to downstream tasks through tuning. However, the large-scale pre-training datasets, often inaccessible or too expensive to handle, can contain label noise that may adversely affect the generalization of the model and pose unexpected risks. This paper stands out as the first work to comprehensively understand and ana…

    Submitted 4 May, 2025; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: 18 pages, 10 figures, 6 tables, preprint. arXiv admin note: substantial text overlap with arXiv:2309.17002

  39. arXiv:2402.19287  [pdf, other]

    cs.LG

    StiefelGen: A Simple, Model Agnostic Approach for Time Series Data Augmentation over Riemannian Manifolds

    Authors: Prasad Cheema, Mahito Sugiyama

    Abstract: Data augmentation is an area of research which has seen active development in many machine learning fields, such as in image-based learning models, reinforcement learning for self driving vehicles, and general noise injection for point cloud data. However, convincing methods for general time series data augmentation still leave much to be desired, especially since the methods developed for these…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 61 pages, 41 figures

  40. arXiv:2402.18805  [pdf, other]

    cs.SI stat.ML

    VEC-SBM: Optimal Community Detection with Vectorial Edges Covariates

    Authors: Guillaume Braun, Masashi Sugiyama

    Abstract: Social networks are often associated with rich side information, such as texts and images. While numerous methods have been developed to identify communities from pairwise interactions, they usually ignore such side information. In this work, we study an extension of the Stochastic Block Model (SBM), a widely used statistical framework for community detection, that integrates vectorial edges covar…

    Submitted 28 February, 2024; originally announced February 2024.

  41. arXiv:2402.06918  [pdf, other]

    cs.LG cs.AI cs.CL

    Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought

    Authors: Zhen-Yu Zhang, Siwei Han, Huaxiu Yao, Gang Niu, Masashi Sugiyama

    Abstract: To improve the ability of large language models (LLMs) to tackle complex reasoning problems, chain-of-thoughts (CoT) methods were proposed to guide LLMs to reason step-by-step, enabling problem solving from simple to complex. State-of-the-art methods for generating such a chain involve interactive collaboration, where the learner generates candidate intermediate thoughts, evaluated by the LLM,…

    Submitted 26 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  42. arXiv:2402.03771  [pdf, other]

    cs.LG

    Reinforcement Learning from Bagged Reward

    Authors: Yuting Tang, Xin-Qiang Cai, Yao-Xiang Ding, Qiyu Wu, Guoqing Liu, Masashi Sugiyama

    Abstract: In Reinforcement Learning (RL), it is commonly assumed that an immediate reward signal is generated for each action taken by the agent, helping the agent maximize cumulative rewards to obtain the optimal policy. However, in many real-world scenarios, designing immediate reward signals is difficult; instead, agents receive a single reward that is contingent upon a partial sequence or a complete tra…

    Submitted 26 October, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  43. arXiv:2402.01922  [pdf, other]

    cs.LG cs.AI

    A General Framework for Learning from Weak Supervision

    Authors: Hao Chen, Jindong Wang, Lei Feng, Xiang Li, Yidong Wang, Xing Xie, Masashi Sugiyama, Rita Singh, Bhiksha Raj

    Abstract: Weakly supervised learning generally faces challenges in applicability to various scenarios with diverse weak supervision and in scalability due to the complexity of existing algorithms, thereby hindering the practical deployment. This paper introduces a general framework for learning from weak supervision (GLWS) with a novel algorithm. Central to GLWS is an Expectation-Maximization (EM) formulati…

    Submitted 5 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 24 pages, 20 tables, 9 figures

  44. arXiv:2401.06826  [pdf, other]

    cs.LG cs.AI cs.CV

    Direct Distillation between Different Domains

    Authors: Jialiang Tang, Shuo Chen, Gang Niu, Hongyuan Zhu, Joey Tianyi Zhou, Chen Gong, Masashi Sugiyama

    Abstract: Knowledge Distillation (KD) aims to learn a compact student network using knowledge from a large pre-trained teacher network, where both networks are trained on data from the same distribution. However, in practical applications, the student network may be required to perform in a new scenario (i.e., the target domain), which usually exhibits significant differences from the known scenario of the…

    Submitted 11 January, 2024; originally announced January 2024.

  45. arXiv:2311.15502  [pdf, other]

    cs.LG

    Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical

    Authors: Wei Wang, Takashi Ishida, Yu-Jie Zhang, Gang Niu, Masashi Sugiyama

    Abstract: Complementary-label learning is a weakly supervised learning problem in which each training example is associated with one or multiple complementary labels indicating the classes to which it does not belong. Existing consistent approaches have relied on the uniform distribution assumption to model the generation of complementary labels, or on an ordinary-label training set to estimate the transiti…

    Submitted 11 October, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: ICML 2024

  46. arXiv:2310.15681  [pdf, other]

    cs.LG

    Fixed-Budget Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit

    Authors: Shintaro Nakamura, Masashi Sugiyama

    Abstract: We study the real-valued combinatorial pure exploration of the multi-armed bandit in the fixed-budget setting. We first introduce the Combinatorial Successive Assign (CSA) algorithm, which is the first algorithm that can identify the best action even when the size of the action class is exponentially large with respect to the number of arms. We show that the upper bound of the probability of error…

    Submitted 15 November, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

  47. arXiv:2310.13923  [pdf, other]

    cs.LG

    Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation

    Authors: Jianing Zhu, Geng Yu, Jiangchao Yao, Tongliang Liu, Gang Niu, Masashi Sugiyama, Bo Han

    Abstract: Out-of-distribution (OOD) detection is important for deploying reliable machine learning models in real-world applications. Recent advances in outlier exposure have shown promising results on OOD detection via fine-tuning the model with informatively sampled auxiliary outliers. However, previous methods assume that the collected outliers can be sufficiently large and representative to cover the bounda…

    Submitted 26 October, 2023; v1 submitted 21 October, 2023; originally announced October 2023.

    Comments: accepted by NeurIPS 2023

  48. arXiv:2310.07351  [pdf, other]

    cs.LG

    Atom-Motif Contrastive Transformer for Molecular Property Prediction

    Authors: Wentao Yu, Shuo Chen, Chen Gong, Gang Niu, Masashi Sugiyama

    Abstract: Recently, Graph Transformer (GT) models have been widely used in the task of Molecular Property Prediction (MPP) due to their high reliability in characterizing the latent relationship among graph nodes (i.e., the atoms in a molecule). However, most existing GT-based methods usually explore the basic interactions between pairwise atoms, and thus they fail to consider the important interactions amo…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: submitted to AAAI-24

  49. arXiv:2310.05632  [pdf, other]

    cs.LG

    Binary Classification with Confidence Difference

    Authors: Wei Wang, Lei Feng, Yuchen Jiang, Gang Niu, Min-Ling Zhang, Masashi Sugiyama

    Abstract: Recently, learning with soft labels has been shown to achieve better performance than learning with hard labels in terms of model generalization, calibration, and robustness. However, collecting pointwise labeling confidence for all training examples can be challenging and time-consuming in real-world scenarios. This paper delves into a novel weakly supervised binary classification problem called…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  50. arXiv:2310.00539  [pdf, other]

    stat.ML cs.LG

    Thompson Exploration with Best Challenger Rule in Best Arm Identification

    Authors: Jongyeong Lee, Junya Honda, Masashi Sugiyama

    Abstract: This paper studies the fixed-confidence best arm identification (BAI) problem in the bandit framework in the canonical single-parameter exponential models. For this problem, many policies have been proposed, but most of them require solving an optimization problem at every round and/or are forced to explore an arm at least a certain number of times except those restricted to the Gaussian model. To…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: TBA ACML 2023, 49 pages