Showing 1–50 of 125 results for author: Cui, P

Searching in archive cs.
  1. arXiv:2507.05110  [pdf, ps, other]

    cs.AI

    Rule Learning for Knowledge Graph Reasoning under Agnostic Distribution Shift

    Authors: Shixuan Liu, Yue He, Yunfei Wang, Hao Zou, Haoxiang Cheng, Wenjing Yang, Peng Cui, Zhong Liu

    Abstract: Knowledge graph (KG) reasoning remains a critical research area focused on inferring missing knowledge by analyzing relationships among observed facts. Despite its success, a key limitation of existing KG reasoning methods is their dependence on the I.I.D assumption. This assumption can easily be violated due to unknown sample selection bias during training or agnostic distribution shifts during t…

    Submitted 7 July, 2025; originally announced July 2025.

  2. Data Heterogeneity Modeling for Trustworthy Machine Learning

    Authors: Jiashuo Liu, Peng Cui

    Abstract: Data heterogeneity plays a pivotal role in determining the performance of machine learning (ML) systems. Traditional algorithms, which are typically designed to optimize average performance, often overlook the intrinsic diversity within datasets. This oversight can lead to a myriad of issues, including unreliable decision-making, inadequate generalization across different domains, unfair outcomes,…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Survey paper for tutorial "Data Heterogeneity Modeling for Trustworthy Machine Learning" in KDD'25

  3. arXiv:2504.10158  [pdf, other]

    cs.CV cs.AI

    COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts

    Authors: Jiansheng Li, Xingxuan Zhang, Hao Zou, Yige Guo, Renzhe Xu, Yilong Liu, Chuzhao Zhu, Yue He, Peng Cui

    Abstract: Current object detectors often suffer significant performance degradation in real-world applications when encountering distributional shifts. Consequently, the out-of-distribution (OOD) generalization capability of object detectors has garnered increasing attention from researchers. Despite this growing interest, there remains a lack of a large-scale, comprehensive dataset and evaluation benchmar…

    Submitted 14 April, 2025; originally announced April 2025.

  4. arXiv:2503.15579  [pdf, other]

    cs.LG

    Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study

    Authors: Xingxuan Zhang, Haoran Wang, Jiansheng Li, Yuan Xue, Shikai Guan, Renzhe Xu, Hao Zou, Han Yu, Peng Cui

    Abstract: Large language models (LLMs) like GPT-4 and LLaMA-3 utilize the powerful in-context learning (ICL) capability of Transformer architecture to learn on the fly from limited examples. While ICL underpins many LLM applications, its full potential remains hindered by a limited understanding of its generalization boundaries and vulnerabilities. We present a systematic investigation of transformers' gene…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 32 pages

  5. arXiv:2503.02887  [pdf, other]

    cs.SI

    Dynamics and Inequalities in Digital Social Networks: A Computational and Sociological Review

    Authors: Pengjia Cui

    Abstract: Digital networks have profoundly transformed the ways in which individuals interact, exchange information, and establish connections, leading to the emergence of phenomena such as virality, misinformation cascades, and online polarization. This review conducts a thorough examination of the micro-macro linkages within digital social networks, analyzing how individual actions like liking, sharing, a…

    Submitted 12 February, 2025; originally announced March 2025.

  6. arXiv:2502.07544  [pdf, other]

    cs.CL

    Grammar Control in Dialogue Response Generation for Language Learning Chatbots

    Authors: Dominik Glandorf, Peng Cui, Detmar Meurers, Mrinmaya Sachan

    Abstract: Chatbots based on large language models offer cheap conversation practice opportunities for language learners. However, they are hard to control for linguistic forms that correspond to learners' current needs, such as grammar. We control grammar in chatbot conversation practice by grounding a dialogue response generation model in a pedagogical repository of grammar skills. We also explore how this…

    Submitted 11 February, 2025; originally announced February 2025.

    Comments: Accepted to NAACL 2025

  7. arXiv:2502.07414  [pdf, other]

    cs.LG

    Sample Weight Averaging for Stable Prediction

    Authors: Han Yu, Yue He, Renzhe Xu, Dongbai Li, Jiayin Zhang, Wenchao Zou, Peng Cui

    Abstract: The challenge of Out-of-Distribution (OOD) generalization poses a foundational concern for the application of machine learning algorithms to risk-sensitive areas. Inspired by traditional importance weighting and propensity weighting methods, prior approaches employ an independence-based sample reweighting procedure. They aim at decorrelating covariates to counteract the bias introduced by spurious…

    Submitted 11 February, 2025; originally announced February 2025.

  8. arXiv:2502.07412  [pdf, other]

    cs.SI

    Mapping the Intellectual Structure of Social Network Research: A Comparative Bibliometric Analysis

    Authors: Pengjia Cui, Yawen Dong

    Abstract: Network science is an interdisciplinary field that transcends traditional academic boundaries, offering profound insights into complex systems across disciplines. This study conducts a bibliometric analysis of three leading journals, Social Networks, Network Science, and the Journal of Complex Networks, each representing a distinct yet interconnected perspective within the field. Social Networks f…

    Submitted 11 February, 2025; originally announced February 2025.

  9. arXiv:2502.06990  [pdf, other]

    cs.CL

    Investigating the Zone of Proximal Development of Language Models for In-Context Learning

    Authors: Peng Cui, Mrinmaya Sachan

    Abstract: In this paper, we introduce a learning analytics framework to analyze the in-context learning (ICL) behavior of large language models (LLMs) through the lens of the Zone of Proximal Development (ZPD), an established theory in educational psychology. ZPD delineates the space between what a learner is capable of doing unsupported and what the learner cannot do even with support. We adapt this concep…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: NAACL 2025 findings

  10. arXiv:2501.19032  [pdf, other]

    cs.LG

    Error Slice Discovery via Manifold Compactness

    Authors: Han Yu, Jiashuo Liu, Hao Zou, Renzhe Xu, Yue He, Xingxuan Zhang, Peng Cui

    Abstract: Despite the great performance of deep learning models in many areas, they still make mistakes and underperform on certain subsets of data, i.e. error slices. Given a trained model, it is important to identify its semantically coherent error slices that are easy to interpret, which is referred to as the error slice discovery problem. However, there is no proper metric of slice coherence without rel…

    Submitted 31 January, 2025; originally announced January 2025.

  11. arXiv:2501.18251  [pdf, ps, other]

    cs.CL

    How to Select Datapoints for Efficient Human Evaluation of NLG Models?

    Authors: Vilém Zouhar, Peng Cui, Mrinmaya Sachan

    Abstract: Human evaluation is the gold standard for evaluating text generation models. However, it is expensive. In order to fit budgetary constraints, a random subset of the test data is often chosen in practice for human evaluation. However, randomly selected data may not accurately represent test performance, making this approach economically inefficient for model comparison. Thus, in this work, we devel…

    Submitted 31 May, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

  12. arXiv:2411.17767  [pdf, other]

    cs.CV cs.LG

    Exploring Aleatoric Uncertainty in Object Detection via Vision Foundation Models

    Authors: Peng Cui, Guande He, Dan Zhang, Zhijie Deng, Yinpeng Dong, Jun Zhu

    Abstract: Datasets collected from the open world unavoidably suffer from various forms of randomness or noiseness, leading to the ubiquity of aleatoric (data) uncertainty. Quantifying such uncertainty is particularly pivotal for object detection, where images contain multi-scale objects with occlusion, obscureness, and even noisy annotations, in contrast to images with centric and similar-scale objects in c…

    Submitted 26 November, 2024; originally announced November 2024.

  13. arXiv:2411.16095  [pdf, other]

    cs.LG

    LDACP: Long-Delayed Ad Conversions Prediction Model for Bidding Strategy

    Authors: Peng Cui, Yiming Yang, Fusheng Jin, Siyuan Tang, Yunli Wang, Fukang Yang, Yalong Jia, Qingpeng Cai, Fei Pan, Changcheng Li, Peng Jiang

    Abstract: In online advertising, once an ad campaign is deployed, the automated bidding system dynamically adjusts the bidding strategy to optimize Cost Per Action (CPA) based on the number of ad conversions. For ads with a long conversion delay, relying solely on the real-time tracked conversion number as a signal for bidding strategy can significantly overestimate the current CPA, leading to conservative…

    Submitted 25 November, 2024; originally announced November 2024.

    Comments: 10 pages, 8 figures, 6 tables

  14. arXiv:2407.14309  [pdf, other]

    cs.CL cs.AI

    How to Engage Your Readers? Generating Guiding Questions to Promote Active Reading

    Authors: Peng Cui, Vilém Zouhar, Xiaoyu Zhang, Mrinmaya Sachan

    Abstract: Using questions in written text is an effective strategy to enhance readability. However, what makes an active reading question good, what the linguistic role of these questions is, and what is their impact on human reading remains understudied. We introduce GuidingQ, a dataset of 10K in-text questions from textbooks and scientific articles. By analyzing the dataset, we present a comprehensive und…

    Submitted 28 July, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: ACL 2024

  15. arXiv:2406.01066  [pdf, other]

    cs.LG

    Topology-Aware Dynamic Reweighting for Distribution Shifts on Graph

    Authors: Weihuang Zheng, Jiashuo Liu, Jiaxing Li, Jiayun Wu, Peng Cui, Youyong Kong

    Abstract: Graph Neural Networks (GNNs) are widely used for node classification tasks but often fail to generalize when training and test nodes come from different distributions, limiting their practicality. To overcome this, recent approaches adopt invariant learning techniques from the out-of-distribution (OOD) generalization field, which seek to establish stable prediction methods across environments. How…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2406.00661  [pdf, other]

    cs.LG cs.AI

    Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift

    Authors: Jiayun Wu, Jiashuo Liu, Peng Cui, Zhiwei Steven Wu

    Abstract: We establish a new model-agnostic optimization framework for out-of-distribution generalization via multicalibration, a criterion that ensures a predictor is calibrated across a family of overlapping groups. Multicalibration is shown to be associated with robustness of statistical inference under covariate shift. We further establish a link between multicalibration and robustness for prediction ta…

    Submitted 2 June, 2024; originally announced June 2024.

  17. arXiv:2405.19656  [pdf, other]

    cs.AI

    Accurate and Reliable Predictions with Mutual-Transport Ensemble

    Authors: Han Liu, Peng Cui, Bingning Wang, Jun Zhu, Xiaolin Hu

    Abstract: Deep Neural Networks (DNNs) have achieved remarkable success in a variety of tasks, especially when it comes to prediction accuracy. However, in complex real-world scenarios, particularly in safety-critical applications, high accuracy alone is not enough. Reliable uncertainty estimates are crucial. Modern DNNs, often trained with cross-entropy loss, tend to be overconfident, especially with ambigu…

    Submitted 29 May, 2024; originally announced May 2024.

  18. arXiv:2405.03198  [pdf, other]

    stat.ML cs.LG math.OC

    Stability Evaluation via Distributional Perturbation Analysis

    Authors: Jose Blanchet, Peng Cui, Jiajin Li, Jiashuo Liu

    Abstract: The performance of learning models often deteriorates when deployed in out-of-sample environments. To ensure reliable deployment, we propose a stability evaluation criterion based on distributional perturbations. Conceptually, our stability evaluation criterion is defined as the minimal perturbation required on our observed dataset to induce a prescribed deterioration in risk evaluation. In this p…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  19. arXiv:2404.19596  [pdf, other]

    cs.IR cs.LG

    Debiased Collaborative Filtering with Kernel-Based Causal Balancing

    Authors: Haoxuan Li, Chunyuan Zheng, Yanghao Xiao, Peng Wu, Zhi Geng, Xu Chen, Peng Cui

    Abstract: Debiased collaborative filtering aims to learn an unbiased prediction model by removing different biases in observational datasets. To solve this problem, one of the simple and effective methods is based on the propensity score, which adjusts the observational sample distribution to the target one by reweighting observed instances. Ideally, propensity scores should be learned with causal balancing…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: ICLR 24 Spotlight

  20. arXiv:2403.15524  [pdf, other]

    cs.GT cs.LG

    PPA-Game: Characterizing and Learning Competitive Dynamics Among Online Content Creators

    Authors: Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

    Abstract: We introduce the Proportional Payoff Allocation Game (PPA-Game) to model how agents, akin to content creators on platforms like YouTube and TikTok, compete for divisible resources and consumers' attention. Payoffs are allocated to agents based on heterogeneous weights, reflecting the diversity in content quality among creators. Our analysis reveals that although a pure Nash equilibrium (PNE) is no…

    Submitted 22 March, 2024; originally announced March 2024.

  21. arXiv:2403.01874  [pdf, other]

    cs.LG

    A Survey on Evaluation of Out-of-Distribution Generalization

    Authors: Han Yu, Jiashuo Liu, Xingxuan Zhang, Jiayun Wu, Peng Cui

    Abstract: Machine learning models, while progressively advanced, rely heavily on the IID assumption, which is often unfulfilled in practice due to inevitable distribution shifts. This renders them susceptible and untrustworthy for deployment in risk-sensitive applications. Such a significant problem has consequently spawned various branches of works dedicated to developing algorithms capable of Out-of-Distr…

    Submitted 4 March, 2024; originally announced March 2024.

  22. arXiv:2402.18294  [pdf, other]

    cs.RO

    Whole-body Humanoid Robot Locomotion with Human Reference

    Authors: Qiang Zhang, Peter Cui, David Yan, Jingkai Sun, Yiqun Duan, Gang Han, Wen Zhao, Weining Zhang, Yijie Guo, Arthur Zhang, Renjing Xu

    Abstract: Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL), however, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterati…

    Submitted 26 August, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures

  23. arXiv:2402.06599  [pdf, other]

    cs.CV cs.AI

    On the Out-Of-Distribution Generalization of Multimodal Large Language Models

    Authors: Xingxuan Zhang, Jiansheng Li, Wenjing Chu, Junjia Hai, Renzhe Xu, Yuqing Yang, Shikai Guan, Jiazheng Xu, Peng Cui

    Abstract: We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs) via comprehensive evaluation under out-of-distribution scenarios and domain-specific tasks. We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery. Empirical results indicate that MLLMs struggle w…

    Submitted 9 February, 2024; originally announced February 2024.

  24. arXiv:2312.16815  [pdf, other]

    physics.soc-ph cs.AI nlin.AO

    Emergence and Causality in Complex Systems: A Survey on Causal Emergence and Related Quantitative Studies

    Authors: Bing Yuan, Zhang Jiang, Aobo Lyu, Jiayun Wu, Zhipeng Wang, Mingzhe Yang, Kaiwei Liu, Muyun Mou, Peng Cui

    Abstract: Emergence and causality are two fundamental concepts for understanding complex systems. They are interconnected. On one hand, emergence refers to the phenomenon where macroscopic properties cannot be solely attributed to the cause of individual properties. On the other hand, causality can exhibit emergence, meaning that new causal laws may arise as we increase the level of abstraction. Causal emer…

    Submitted 25 February, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 57 pages, 17 figures, 1 table

    MSC Class: 68P30; ACM Class: K.3.2

  25. arXiv:2312.01294  [pdf, other]

    cs.LG stat.ML

    Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation for Time Series

    Authors: Ying Liu, Peng Cui, Wenbo Hu, Richang Hong

    Abstract: Real-world time series data frequently have significant amounts of missing values, posing challenges for advanced analysis. A common approach to address this issue is imputation, where the primary challenge lies in determining the appropriate values to fill in. While previous deep learning methods have proven effective for time series imputation, they often produce overconfident imputations, which…

    Submitted 23 September, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: fix typo

  26. arXiv:2311.05054  [pdf, other]

    cs.LG cs.AI

    Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

    Authors: Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui

    Abstract: Machine learning algorithms minimizing average risk are susceptible to distributional shifts. Distributionally Robust Optimization (DRO) addresses this issue by optimizing the worst-case risk within an uncertainty set. However, DRO suffers from over-pessimism, leading to low-confidence predictions, poor parameter estimations as well as poor generalization. In this work, we conduct a theoretical an…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Short version appears at 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Workshop on Distribution Shifts (DistShift)

  27. arXiv:2310.11732  [pdf, other]

    cs.LG cs.CL

    Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting

    Authors: Guande He, Peng Cui, Jianfei Chen, Wenbo Hu, Jun Zhu

    Abstract: Despite the significant progress made in practical applications of aligned language models (LMs), they tend to be overconfident in output answers compared to the corresponding pre-trained LMs. In this work, we systematically evaluate the impact of the alignment process on logit-based uncertainty calibration of LMs under the multiple-choice setting. We first conduct a thoughtful empirical study on…

    Submitted 19 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  28. arXiv:2309.07870  [pdf, other]

    cs.CL

    Agents: An Open-source Framework for Autonomous Language Agents

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Long Li, Jialong Wu, Tiannan Wang, Shi Qiu, Jintian Zhang, Jing Chen, Ruipu Wu, Shuai Wang, Shiding Zhu, Jiyu Chen, Wentao Zhang, Xiangru Tang, Ningyu Zhang, Huajun Chen, Peng Cui, Mrinmaya Sachan

    Abstract: Recent advances on large language models (LLMs) enable researchers and developers to build autonomous language agents that can automatically solve various tasks and interact with environments, humans, and other agents using natural language interfaces. We consider language agents as a promising direction towards artificial general intelligence and release Agents, an open-source library with the go…

    Submitted 11 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Code available at https://github.com/aiwaves-cn/agents

  29. arXiv:2308.15364  [pdf, other]

    cs.LG stat.ML

    Heterogeneous Multi-Task Gaussian Cox Processes

    Authors: Feng Zhou, Quyu Kong, Zhijie Deng, Fengxiang He, Peng Cui, Jun Zhu

    Abstract: This paper presents a novel extension of multi-task Gaussian Cox processes for modeling multiple heterogeneous correlated tasks jointly, e.g., classification and regression, via multi-output Gaussian processes (MOGP). A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks…

    Submitted 29 August, 2023; originally announced August 2023.

  30. arXiv:2308.10544  [pdf, other]

    cs.LG

    Towards Accelerated Model Training via Bayesian Data Selection

    Authors: Zhijie Deng, Peng Cui, Jun Zhu

    Abstract: Mislabeled, duplicated, or biased data in real-world scenarios can lead to prolonged training and even hinder model convergence. Traditional solutions prioritizing easy or hard samples lack the flexibility to handle such a variety simultaneously. Recent work has proposed a more reasonable data selection principle by examining the data's impact on the model's generalization loss. However, its pract…

    Submitted 7 November, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: NeurIPS 2023

  31. arXiv:2307.05284  [pdf, other]

    cs.LG cs.AI

    Rethinking Distribution Shifts: Empirical Analysis and Inductive Modeling for Tabular Data

    Authors: Jiashuo Liu, Tianyu Wang, Peng Cui, Hongseok Namkoong

    Abstract: Different distribution shifts require different interventions, and algorithms must be grounded in the specific shifts they address. However, methodological development for robust algorithms typically relies on structural assumptions that lack empirical validation. Advocating for an empirically grounded data-driven approach to research, we build an empirical testbed comprising natural shifts across…

    Submitted 13 November, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: Conference version appeared in NeurIPS 2023, previously titled "On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets"

  32. Inductive Meta-path Learning for Schema-complex Heterogeneous Information Networks

    Authors: Shixuan Liu, Changjun Fan, Kewei Cheng, Yunfei Wang, Peng Cui, Yizhou Sun, Zhong Liu

    Abstract: Heterogeneous Information Networks (HINs) are information networks with multiple types of nodes and edges. The concept of meta-path, i.e., a sequence of entity types and relation types connecting two entities, is proposed to provide the meta-level explainable semantics for various HIN tasks. Traditionally, meta-paths are primarily used for schema-simple HINs, e.g., bibliographic networks with only…

    Submitted 3 December, 2024; v1 submitted 8 July, 2023; originally announced July 2023.

  33. arXiv:2306.02457  [pdf, other]

    cs.CL cs.AI

    Adaptive and Personalized Exercise Generation for Online Language Learning

    Authors: Peng Cui, Mrinmaya Sachan

    Abstract: Adaptive learning aims to provide customized educational activities (e.g., exercises) to address individual learning needs. However, manual construction and delivery of such activities is a laborious process. Thus, in this paper, we study a novel task of adaptive and personalized exercise generation for online language learning. To this end, we combine a knowledge tracing model that estimates each…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: To appear at ACL 2023

  34. arXiv:2305.19158  [pdf, other]

    cs.LG cs.CY cs.GT cs.MA

    Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

    Authors: Renzhe Xu, Haotian Wang, Xingxuan Zhang, Bo Li, Peng Cui

    Abstract: Competitions for shareable and limited resources have long been studied with strategic agents. In reality, agents often have to learn and maximize the rewards of the resources at the same time. To design an individualized competing policy, we model the competition between agents in a novel multi-player multi-armed bandit (MPMAB) setting where players are selfish and aim to maximize their own rewar…

    Submitted 4 August, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  35. arXiv:2305.15644  [pdf, other]

    cs.LG cs.AI cs.CV

    Meta Adaptive Task Sampling for Few-Domain Generalization

    Authors: Zheyan Shen, Han Yu, Peng Cui, Jiashuo Liu, Xingxuan Zhang, Linjun Zhou, Furui Liu

    Abstract: To ensure the out-of-distribution (OOD) generalization performance, traditional domain generalization (DG) methods resort to training on data from multiple sources with different underlying distributions. And the success of those DG methods largely depends on the fact that there are diverse training distributions. However, it usually needs great efforts to obtain enough heterogeneous data due to t…

    Submitted 24 May, 2023; originally announced May 2023.

  36. arXiv:2305.15431  [pdf, other]

    cs.IR cs.LG

    Exploring and Exploiting Data Heterogeneity in Recommendation

    Authors: Zimu Wang, Jiashuo Liu, Hao Zou, Xingxuan Zhang, Yue He, Dongxu Liang, Peng Cui

    Abstract: Massive amounts of data are the foundation of data-driven recommendation models. As an inherent nature of big data, data heterogeneity widely exists in real-world recommendation systems. It reflects the differences in the properties among sub-populations. Ignoring the heterogeneity in recommendation data could limit the performance of recommendation models, hurt the sub-populational robustness, an…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: 14 pages, 14 figures

  37. arXiv:2305.15253  [pdf, other]

    cs.LG cs.AI cs.CV

    Rethinking the Evaluation Protocol of Domain Generalization

    Authors: Han Yu, Xingxuan Zhang, Renzhe Xu, Jiashuo Liu, Yue He, Peng Cui

    Abstract: Domain generalization aims to solve the challenge of Out-of-Distribution (OOD) generalization by leveraging common knowledge learned from multiple training domains to generalize to unseen test domains. To accurately evaluate the OOD generalization ability, it is required that test data information is unavailable. However, the current domain generalization protocol may still have potential test dat…

    Submitted 23 March, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

  38. arXiv:2305.13304  [pdf, other]

    cs.CL cs.LG

    RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

    Authors: Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan

    Abstract: The fixed-size context of Transformer makes GPT models incapable of generating arbitrarily long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs. RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the Long Short-Term Memory mechanism in an LSTM. At each timestep, RecurrentGPT g…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Under review

  39. arXiv:2304.10127  [pdf, other]

    cs.LG

    Learning Sample Difficulty from Pre-trained Models for Reliable Prediction

    Authors: Peng Cui, Dan Zhang, Zhijie Deng, Yinpeng Dong, Jun Zhu

    Abstract: Large-scale pre-trained models have achieved remarkable success in many applications, but how to leverage them to improve the prediction reliability of downstream models is undesirably under-explored. Moreover, modern neural networks have been found to be poorly calibrated and make overconfident predictions regardless of inherent sample difficulty and data uncertainty. To address this issue, we pr…

    Submitted 30 October, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023

  40. arXiv:2304.00305  [pdf, other]

    cs.LG cs.AI cs.IT

    Predictive Heterogeneity: Measures and Applications

    Authors: Jiashuo Liu, Jiayun Wu, Bo Li, Peng Cui

    Abstract: As an intrinsic and fundamental property of big data, data heterogeneity exists in a variety of real-world applications, such as precision medicine, autonomous driving, financial applications, etc. For machine learning algorithms, the ignorance of data heterogeneity will greatly hurt the generalization performance and the algorithmic fairness, since the prediction mechanisms among different sub-po…

    Submitted 1 April, 2023; originally announced April 2023.

    Comments: 35 pages. Short version accepted at ICLR 2023

  41. arXiv:2303.03108  [pdf, other]

    cs.LG cs.CV

    Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization

    Authors: Xingxuan Zhang, Renzhe Xu, Han Yu, Hao Zou, Peng Cui

    Abstract: Recently, flat minima are proven to be effective for improving generalization and sharpness-aware minimization (SAM) achieves state-of-the-art performance. Yet the current definition of flatness discussed in SAM and its follow-ups are limited to the zeroth-order flatness (i.e., the worst-case loss within a perturbation radius). We show that the zeroth-order flatness can be insufficient to discrimi…

    Submitted 4 July, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: CVPR2023 highlight paper

  42. arXiv:2302.05098  [pdf, other]

    cs.CV cs.LG

    Confidence-based Reliable Learning under Dual Noises

    Authors: Peng Cui, Yang Yue, Zhijie Deng, Jun Zhu

    Abstract: Deep neural networks (DNNs) have achieved remarkable success in a variety of computer vision tasks, where massive labeled images are routinely required for model optimization. Yet, the data collected from the open world are unavoidably polluted by noise, which may significantly undermine the efficacy of the learned models. Various attempts have been made to reliably train DNNs under data noise, bu…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2022

  43. arXiv:2301.09819  [pdf, other]

    cs.LG

    Model Agnostic Sample Reweighting for Out-of-Distribution Learning

    Authors: Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang

    Abstract: Distributionally robust optimization (DRO) and invariant risk minimization (IRM) are two popular methods proposed to improve out-of-distribution (OOD) generalization performance of machine learning models. While effective for small models, it has been observed that these methods can be vulnerable to overfitting with large overparameterized models. This work proposes a principled method, \textbf{M}…

    Submitted 24 January, 2023; originally announced January 2023.

  44. arXiv:2212.00992  [pdf, other]

    cs.LG stat.ML

    Stable Learning via Sparse Variable Independence

    Authors: Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan Zhang

    Abstract: The problem of covariate-shift generalization has attracted intensive research attention. Previous stable learning algorithms employ sample reweighting schemes to decorrelate the covariates when there is no explicit domain information about training data. However, with finite samples, it is difficult to achieve the desirable weights that ensure perfect independence to get rid of the unstable varia…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  45. arXiv:2211.01664  [pdf, other]

    cs.CV

    PointSee: Image Enhances Point Cloud

    Authors: Lipeng Gu, Xuefeng Yan, Peng Cui, Lina Gong, Haoran Xie, Fu Lee Wang, Jin Qin, Mingqiang Wei

    Abstract: There is a trend to fuse multi-modal information for 3D object detection (3OD). However, the challenging problems of low lightweightness, poor flexibility of plug-and-play, and inaccurate alignment of features are still not well-solved when designing multi-modal fusion networks. We propose PointSee, a lightweight, flexible and effective multi-modal fusion solution to facilitate various 3OD networ…

    Submitted 3 November, 2022; originally announced November 2022.

  46. arXiv:2210.16057  [pdf, other]

    cs.CV

    Semi-UFormer: Semi-supervised Uncertainty-aware Transformer for Image Dehazing

    Authors: Ming Tong, Yongzhen Wang, Peng Cui, Xuefeng Yan, Mingqiang Wei

    Abstract: Image dehazing is fundamental yet not well-solved in computer vision. Most cutting-edge models are trained on synthetic data, leading to poor performance in real-world hazy scenarios. Besides, they commonly give deterministic dehazed images while neglecting to mine their uncertainty. To bridge the domain gap and enhance the dehazing performance, we propose a novel semi-supervised uncertainty-a…

    Submitted 28 October, 2022; originally announced October 2022.

  47. arXiv:2210.12637  [pdf, other]

    cs.LG

    Neural Eigenfunctions Are Structured Representation Learners

    Authors: Zhijie Deng, Jiaxin Shi, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu

    Abstract: This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap. Unlike prior spectral methods such as Laplacian Eigenmap that operate in a nonparametric manner, Neural Eigenmap leverages NeuralEF to parametrically model eigenfunctions using a neural network. We show that, when the eigenfunction is derived from positive relations in a data augmentation setup, applyin…

    Submitted 8 December, 2023; v1 submitted 23 October, 2022; originally announced October 2022.

  48. arXiv:2210.08268  [pdf, other]

    cs.LG

    Product Ranking for Revenue Maximization with Multiple Purchases

    Authors: Renzhe Xu, Xingxuan Zhang, Bo Li, Yafeng Zhang, Xiaolong Chen, Peng Cui

    Abstract: Product ranking is the core problem for revenue-maximizing online retailers. To design proper product ranking algorithms, various consumer choice models are proposed to characterize the consumers' behaviors when they are provided with a list of products. However, existing works assume that each consumer purchases at most one product or will keep viewing the product list after purchasing a product,…

    Submitted 2 January, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  49. arXiv:2206.02990  [pdf, other]

    cs.LG

    Enhancing Distributional Stability among Sub-populations

    Authors: Jiashuo Liu, Jiayun Wu, Jie Peng, Xiaoyu Wu, Yang Zheng, Bo Li, Peng Cui

    Abstract: Enhancing the stability of machine learning algorithms under distributional shifts is at the heart of the Out-of-Distribution (OOD) Generalization problem. Derived from causal learning, recent works on invariant learning pursue strict invariance with multiple training environments. Although intuitively reasonable, strong assumptions on the availability and quality of environments are made to learn…

    Submitted 13 February, 2024; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted at International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  50. arXiv:2205.10014  [pdf, other]

    cs.LG cs.AI

    A Survey of Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection

    Authors: Bingzhe Wu, Jintang Li, Junchi Yu, Yatao Bian, Hengtong Zhang, Chaochao Chen, Chengbin Hou, Guoji Fu, Liang Chen, Tingyang Xu, Yu Rong, Xiaolin Zheng, Junzhou Huang, Ran He, Baoyuan Wu, Guangyu Sun, Peng Cui, Zibin Zheng, Zhe Liu, Peilin Zhao

    Abstract: Deep graph learning has achieved remarkable progress in both business and scientific areas ranging from finance and e-commerce, to drug and advanced material discovery. Despite this progress, how to ensure various deep graph learning algorithms behave in a socially responsible manner and meet regulatory compliance requirements becomes an emerging problem, especially in risk-sensitive domains.…

    Submitted 23 May, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Preprint; Work in progress. arXiv admin note: substantial text overlap with arXiv:2202.07114