
Showing 1–1 of 1 results for author: Guanglu, W

  1. arXiv:2506.07165  [pdf, ps, other]

    cs.LG cs.AI

    AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models

    Authors: Qi Liu, Jingqing Ruan, Hao Li, Haodong Zhao, Desheng Wang, Jiansong Chen, Wan Guanglu, Xunliang Cai, Zhi Zheng, Tong Xu

    Abstract: Existing multi-objective preference alignment methods for large language models (LLMs) face limitations: (1) the inability to effectively balance various preference dimensions, and (2) reliance on auxiliary reward/reference models introduces computational complexity. To address these challenges, we propose Adaptive Multi-objective Preference Optimization (AMoPO), a novel framework that achieves dy…

    Submitted 8 June, 2025; originally announced June 2025.

    Comments: Accepted by ACL 2025