Showing 1–50 of 66 results for author: Lyu, F

Searching in archive cs.
  1. arXiv:2504.06319  [pdf, other]

    cs.LG cs.AI

    Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching

    Authors: Yanhao Dong, Yubo Miao, Weinan Li, Xiao Zheng, Chao Wang, Feng Lyu

    Abstract: Large Language Models (LLMs) exhibit pronounced memory-bound characteristics during inference due to High Bandwidth Memory (HBM) bandwidth constraints. In this paper, we propose an L2 Cache-oriented asynchronous KV Cache prefetching method to break through the memory bandwidth bottleneck in LLM inference through computation-load overlap. By strategically scheduling idle memory bandwidth during act…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: 8 pages, 5 figures

  2. arXiv:2503.24235  [pdf, other]

    cs.CL cs.AI

    A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?

    Authors: Qiyuan Zhang, Fuyuan Lyu, Zexu Sun, Lei Wang, Weixu Zhang, Wenyue Hua, Haolun Wu, Zhihan Guo, Yufei Wang, Niklas Muennighoff, Irwin King, Xue Liu, Chen Ma

    Abstract: As enthusiasm for scaling computation (data and parameters) in the pretraining era gradually diminished, test-time scaling (TTS), also referred to as "test-time computing", has emerged as a prominent research focus. Recent studies demonstrate that TTS can further elicit the problem-solving capabilities of large language models (LLMs), enabling significant breakthroughs not only in specialized rea…

    Submitted 4 May, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

    Comments: v3: Expand Agentic and SFT Chapters. Build Website for better visualization

  3. arXiv:2503.22136  [pdf, other]

    cs.CV

    Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation

    Authors: Hongmei Yin, Tingliang Feng, Fan Lyu, Fanhua Shang, Hongying Liu, Wei Feng, Liang Wan

    Abstract: In this work, we focus on continual semantic segmentation (CSS), where segmentation networks are required to continuously learn new classes without erasing knowledge of previously learned ones. Although storing images of old classes and directly incorporating them into the training of new models has proven effective in mitigating catastrophic forgetting in classification tasks, this strategy prese…

    Submitted 28 March, 2025; originally announced March 2025.

  4. arXiv:2503.10699  [pdf, other]

    cs.CV cs.AI cs.LG

    Test-Time Discovery via Hashing Memory

    Authors: Fan Lyu, Tianle Liu, Zhang Zhang, Fuyuan Hu, Liang Wang

    Abstract: We introduce Test-Time Discovery (TTD) as a novel task that addresses class shifts during testing, requiring models to simultaneously identify emerging categories while preserving previously learned ones. A key challenge in TTD is distinguishing newly discovered classes from those already identified. To address this, we propose a training-free, hash-based memory mechanism that enhances class disco…

    Submitted 12 March, 2025; originally announced March 2025.

  5. arXiv:2503.03165  [pdf, other]

    cs.CE cs.IR cs.LG

    A Predict-Then-Optimize Customer Allocation Framework for Online Fund Recommendation

    Authors: Xing Tang, Yunpeng Weng, Fuyuan Lyu, Dugang Liu, Xiuqiang He

    Abstract: With the rapid growth of online investment platforms, funds can be distributed to individual customers online. The central issue is to match funds with potential customers under constraints. Most mainstream platforms adopt the recommendation formulation to tackle the problem. However, the traditional recommendation regime has its inherent drawbacks when applying the fund-matching problem with mult…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: Accepted by DASFAA 2025

  6. arXiv:2503.02112  [pdf, other]

    cs.LG astro-ph.IM

    Building Machine Learning Challenges for Anomaly Detection in Science

    Authors: Elizabeth G. Campolongo, Yuan-Tang Chou, Ekaterina Govorkova, Wahid Bhimji, Wei-Lun Chao, Chris Harris, Shih-Chieh Hsu, Hilmar Lapp, Mark S. Neubauer, Josephine Namayanja, Aneesh Subramanian, Philip Harris, Advaith Anand, David E. Carlyn, Subhankar Ghosh, Christopher Lawrence, Eric Moreno, Ryan Raikman, Jiaman Wu, Ziheng Zhang, Bayu Adhi, Mohammad Ahmadi Gharehtoragh, Saúl Alonso Monsalve, Marta Babicz, Furqan Baig, et al. (125 additional authors not shown)

    Abstract: Scientific discoveries are often made by finding a pattern or object that was not predicted by the known rules of science. Oftentimes, these anomalous events or objects that do not conform to the norms are an indication that the rules of science governing the data are incomplete, and something new needs to be present to explain these unexpected outliers. The challenge of finding anomalies can be c…

    Submitted 29 March, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: 17 pages, 6 figures; to be submitted to Nature Communications

  7. arXiv:2502.12501  [pdf, other]

    cs.CL

    Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge

    Authors: Qiyuan Zhang, Yufei Wang, Yuxin Jiang, Liangyou Li, Chuhan Wu, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma

    Abstract: LLM-as-a-Judge, which generates chain-of-thought (CoT) judgments, has become a widely adopted auto-evaluation method. However, its reliability is compromised by the CoT reasoning's inability to capture comprehensive and deeper details, often leading to incomplete outcomes. Existing methods mainly rely on majority voting or criteria expansion, which is insufficient to address the limitation in CoT.…

    Submitted 7 April, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

  8. arXiv:2502.02998  [pdf, other]

    cs.LG

    Conformal Uncertainty Indicator for Continual Test-Time Adaptation

    Authors: Fan Lyu, Hanyu Zhao, Ziqi Shi, Ye Liu, Fuyuan Hu, Zhang Zhang, Liang Wang

    Abstract: Continual Test-Time Adaptation (CTTA) aims to adapt models to sequentially changing domains during testing, relying on pseudo-labels for self-adaptation. However, incorrect pseudo-labels can accumulate, leading to performance degradation. To address this, we propose a Conformal Uncertainty Indicator (CUI) for CTTA, leveraging Conformal Prediction (CP) to generate prediction sets that include the t…

    Submitted 5 February, 2025; originally announced February 2025.

  9. arXiv:2412.12654  [pdf, other]

    cs.CV

    CALA: A Class-Aware Logit Adapter for Few-Shot Class-Incremental Learning

    Authors: Chengyan Liu, Linglan Zhao, Fan Lyu, Kaile Du, Fuyuan Hu, Tao Zhou

    Abstract: Few-Shot Class-Incremental Learning (FSCIL) defines a practical but challenging task where models are required to continuously learn novel concepts with only a few training samples. Due to data scarcity, existing FSCIL methods resort to training a backbone with abundant base data and then keeping it frozen afterward. However, the above operation often causes the backbone to overfit to base classes…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: 10 pages

  10. arXiv:2411.15731  [pdf, other]

    cs.IR cs.AI

    Fusion Matters: Learning Fusion in Deep Click-through Rate Prediction Models

    Authors: Kexin Zhang, Fuyuan Lyu, Xing Tang, Dugang Liu, Chen Ma, Kaize Ding, Xiuqiang He, Xue Liu

    Abstract: The evolution of previous Click-Through Rate (CTR) models has mainly been driven by proposing complex components, whether shallow or deep, that are adept at modeling feature interactions. However, there has been less focus on improving fusion design. Instead, two naive solutions, stacked and parallel fusion, are commonly used. Both solutions rely on pre-determined fusion connections and fixed fusi…

    Submitted 24 November, 2024; originally announced November 2024.

    Comments: Accepted by WSDM 2025

  11. Comprehending Knowledge Graphs with Large Language Models for Recommender Systems

    Authors: Ziqiang Cui, Yunpeng Weng, Xing Tang, Fuyuan Lyu, Dugang Liu, Xiuqiang He, Chen Ma

    Abstract: In recent years, the introduction of knowledge graphs (KGs) has significantly advanced recommender systems by facilitating the discovery of potential associations between items. However, existing methods still face several limitations. First, most KGs suffer from missing facts or limited scopes. Second, existing methods convert textual information in KGs into IDs, resulting in the loss of natural…

    Submitted 17 April, 2025; v1 submitted 16 October, 2024; originally announced October 2024.

    Comments: Accepted as a full paper by SIGIR'25

  12. arXiv:2410.05193  [pdf, other]

    cs.CL

    RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

    Authors: Qiyuan Zhang, Yufei Wang, Tiezheng YU, Yuxin Jiang, Chuhan Wu, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma

    Abstract: With significant efforts in recent studies, LLM-as-a-Judge has become a cost-effective alternative to human evaluation for assessing text generation quality in a wide range of tasks. However, there still remains a reliability gap between LLM-as-a-Judge and human evaluation. One important reason is the lack of guided oracles in the evaluation process. Motivated by the role of reference pervasively…

    Submitted 7 April, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

  13. arXiv:2409.14874  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Towards Ground-truth-free Evaluation of Any Segmentation in Medical Images

    Authors: Ahjol Senbi, Tianyu Huang, Fei Lyu, Qing Li, Yuhui Tao, Wei Shao, Qiang Chen, Chengyan Wang, Shuo Wang, Tao Zhou, Yizhe Zhang

    Abstract: We explore the feasibility and potential of building a ground-truth-free evaluation model to assess the quality of segmentations generated by the Segment Anything Model (SAM) and its variants in medical imaging. This evaluation model estimates segmentation quality scores by analyzing the coherence and consistency between the input images and their corresponding segmentation predictions. Based on p…

    Submitted 24 September, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: 17 pages, 15 figures

  14. arXiv:2409.09072  [pdf, other]

    cs.DC cs.AI cs.LG

    Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services

    Authors: Shuangwei Gao, Peng Yang, Yuxin Kong, Feng Lyu, Ning Zhang

    Abstract: Artificial Intelligence Generated Content (AIGC) services can efficiently satisfy user-specified content creation demands, but the high computational requirements pose various challenges to supporting mobile users at scale. In this paper, we present our design of an edge-enabled AIGC service provisioning system to properly assign computing tasks of generative models to edge servers, thereby improv…

    Submitted 8 September, 2024; originally announced September 2024.

  15. arXiv:2409.05303  [pdf, other]

    cs.LG cs.AI

    Resource-Efficient Generative AI Model Deployment in Mobile Edge Networks

    Authors: Yuxin Liang, Peng Yang, Yuanyuan He, Feng Lyu

    Abstract: The surging development of Artificial Intelligence-Generated Content (AIGC) marks a transformative era of content creation and production. Edge servers promise attractive benefits, e.g., reduced service delay and backhaul traffic load, for hosting AIGC services compared to cloud-based solutions. However, the scarcity of available resources on the edge poses significant challenges in deploying g…

    Submitted 8 September, 2024; originally announced September 2024.

  16. arXiv:2408.12161  [pdf, other]

    cs.CV

    Rebalancing Multi-Label Class-Incremental Learning

    Authors: Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Junzhou Xie, Yixi Shen, Fuyuan Hu, Guangcan Liu

    Abstract: Multi-label class-incremental learning (MLCIL) is essential for real-world multi-label applications, allowing models to learn new labels while retaining previously learned knowledge continuously. However, recent MLCIL approaches can only achieve suboptimal performance due to the oversight of the positive-negative imbalance problem, which manifests at both the label and loss levels because of the t…

    Submitted 22 August, 2024; originally announced August 2024.

  17. arXiv:2408.08585  [pdf, other]

    cs.IR cs.LG

    OptDist: Learning Optimal Distribution for Customer Lifetime Value Prediction

    Authors: Yunpeng Weng, Xing Tang, Zhenhao Xu, Fuyuan Lyu, Dugang Liu, Zexu Sun, Xiuqiang He

    Abstract: Customer Lifetime Value (CLTV) prediction is a critical task in business applications. Accurately predicting CLTV is challenging in real-world business scenarios, as the distribution of CLTV is complex and mutable. Firstly, there is a large number of users without any consumption consisting of a long-tailed part that is too complex to fit. Secondly, the small set of high-value users spent orders o…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: CIKM 2024

  18. arXiv:2407.18526  [pdf, other]

    cs.LG

    Constructing Enhanced Mutual Information for Online Class-Incremental Learning

    Authors: Huan Zhang, Fan Lyu, Shenghua Fan, Yujin Zheng, Dingwen Wang

    Abstract: Online Class-Incremental continual Learning (OCIL) addresses the challenge of continuously learning from a single-channel data stream, adapting to new tasks while mitigating catastrophic forgetting. Recently, Mutual Information (MI)-based methods have shown promising performance in OCIL. However, existing MI-based methods treat various knowledge components in isolation, ignoring the knowledge conf…

    Submitted 26 July, 2024; originally announced July 2024.

  19. arXiv:2407.08214  [pdf, other]

    cs.LG cs.AI

    Towards stable training of parallel continual learning

    Authors: Li Yuepan, Fan Lyu, Yuyang Li, Wei Feng, Guangcan Liu, Fanhua Shang

    Abstract: Parallel Continual Learning (PCL) tasks investigate the training methods for continual learning with multi-source input, where data from different tasks are learned as they arrive. PCL offers high training efficiency and is well-suited for complex multi-source data systems, such as autonomous vehicles equipped with multiple sensors. However, at any time, multiple tasks need to be trained simultane…

    Submitted 11 July, 2024; originally announced July 2024.

  20. arXiv:2407.02253  [pdf, other]

    cs.LG cs.CV

    Parameter-Selective Continual Test-Time Adaptation

    Authors: Jiaxu Tian, Fan Lyu

    Abstract: Continual Test-Time Adaptation (CTTA) aims to adapt a pretrained model to ever-changing environments during the test time under continuous domain shifts. Most existing CTTA approaches are based on the Mean Teacher (MT) structure, which contains a student and a teacher model, where the student is updated using the pseudo-labels from the teacher model, and the teacher is then updated by exponential…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 17 pages, 4 figures

  21. arXiv:2407.01300  [pdf, other]

    cs.CL cs.AI cs.LG

    Collaborative Performance Prediction for Large Language Models

    Authors: Qiyuan Zhang, Fuyuan Lyu, Xue Liu, Chen Ma

    Abstract: Comprehensively understanding and accurately predicting the performance of large language models across diverse downstream tasks has emerged as a pivotal challenge in NLP research. The pioneering scaling law on downstream works demonstrated intrinsic similarities within model families and utilized such similarities for performance prediction. However, they tend to overlook the similarities between…

    Submitted 2 October, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: In Proceedings of EMNLP 2024 Main Track

  22. arXiv:2406.02609  [pdf, other]

    cs.LG cs.AI

    Less is More: Pseudo-Label Filtering for Continual Test-Time Adaptation

    Authors: Jiayao Tan, Fan Lyu, Chenggong Ni, Tingliang Feng, Fuyuan Hu, Zhang Zhang, Shaochuang Zhao, Liang Wang

    Abstract: Continual Test-Time Adaptation (CTTA) aims to adapt a pre-trained model to a sequence of target domains during the test phase without accessing the source data. To adapt to unlabeled data from unknown domains, existing methods rely on constructing pseudo-labels for all samples and updating the model through self-training. However, these pseudo-labels often involve noise, leading to insufficient ad…

    Submitted 12 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.03335 by other authors

  23. arXiv:2405.17054  [pdf, other]

    cs.LG

    Improving Data-aware and Parameter-aware Robustness for Continual Learning

    Authors: Hanxi Xiao, Fan Lyu

    Abstract: The goal of Continual Learning (CL) task is to continuously learn multiple new tasks sequentially while achieving a balance between the plasticity and stability of new and old knowledge. This paper analyzes that this insufficiency arises from the ineffective handling of outliers, leading to abnormal gradients and unexpected model updates. To address this issue, we enhance the data-aware and parame…

    Submitted 27 May, 2024; originally announced May 2024.

  24. arXiv:2405.14602  [pdf, other]

    cs.LG

    Controllable Continual Test-Time Adaptation

    Authors: Ziqi Shi, Fan Lyu, Ye Liu, Fanhua Shang, Fuyuan Hu, Wei Feng, Zhang Zhang, Liang Wang

    Abstract: Continual Test-Time Adaptation (CTTA) is an emerging and challenging task where a model trained in a source domain must adapt to continuously changing conditions during testing, without access to the original source data. CTTA is prone to error accumulation due to uncontrollable domain shifts, leading to blurred decision boundaries between categories. Existing CTTA methods primarily focus on suppr…

    Submitted 28 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  25. arXiv:2405.09133  [pdf, other]

    cs.LG

    Overcoming Domain Drift in Online Continual Learning

    Authors: Fan Lyu, Daofeng Liu, Linglan Zhao, Zhang Zhang, Fanhua Shang, Fuyuan Hu, Wei Feng, Liang Wang

    Abstract: Online Continual Learning (OCL) empowers machine learning models to acquire new knowledge online across a sequence of tasks. However, OCL faces a significant challenge: catastrophic forgetting, wherein the model learned in previous tasks is substantially overwritten upon encountering new tasks, leading to a biased forgetting of prior knowledge. Moreover, the continual domain drift in sequential lea…

    Submitted 15 May, 2024; originally announced May 2024.

  26. arXiv:2404.07200  [pdf, other]

    cs.LG

    Toward a Better Understanding of Fourier Neural Operators from a Spectral Perspective

    Authors: Shaoxiang Qin, Fuyuan Lyu, Wenhui Peng, Dingyang Geng, Ju Wang, Xing Tang, Sylvie Leroyer, Naiping Gao, Xue Liu, Liangzhu Leon Wang

    Abstract: In solving partial differential equations (PDEs), Fourier Neural Operators (FNOs) have exhibited notable effectiveness. However, FNO is observed to be ineffective with large Fourier kernels that parameterize more frequencies. Current solutions rely on setting small kernels, restricting FNO's ability to capture complex PDE data in real-world applications. This paper offers empirical insights into F…

    Submitted 9 October, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  27. arXiv:2403.17442  [pdf, other]

    cs.IR

    Touch the Core: Exploring Task Dependence Among Hybrid Targets for Recommendation

    Authors: Xing Tang, Yang Qiao, Fuyuan Lyu, Dugang Liu, Xiuqiang He

    Abstract: As user behaviors become complicated on business platforms, online recommendations focus more on how to touch the core conversions, which are highly related to the interests of platforms. These core conversions are usually continuous targets, such as watch time, revenue, and so on, whose predictions can be enhanced by previous discrete conversion actions. Therefore, multi-task le…

    Submitted 20 August, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by RecSys 2024

  28. arXiv:2403.12559  [pdf, other]

    cs.CV cs.LG

    Confidence Self-Calibration for Multi-Label Class-Incremental Learning

    Authors: Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Chen Lu, Guangcan Liu

    Abstract: The partial label challenge in Multi-Label Class-Incremental Learning (MLCIL) arises when only the new classes are labeled during training, while past and future labels remain unavailable. This issue leads to a proliferation of false-positive errors due to erroneously high confidence multi-label predictions, exacerbating catastrophic forgetting within the disjoint label space. In this paper, we ai…

    Submitted 12 August, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted at the European Conference on Computer Vision (ECCV) 2024

  29. arXiv:2402.18609  [pdf, other]

    cs.LG cs.AI

    ICE-SEARCH: A Language Model-Driven Feature Selection Approach

    Authors: Tianze Yang, Tianyi Yang, Fuyuan Lyu, Shaoshan Liu, Xue Liu

    Abstract: This study unveils the In-Context Evolutionary Search (ICE-SEARCH) method, which is among the first works that melds large language models (LLMs) with evolutionary algorithms for feature selection (FS) tasks and demonstrates its effectiveness in Medical Predictive Analytics (MPA) applications. ICE-SEARCH harnesses the crossover and mutation capabilities inherent in LLMs within an evolutionary fram…

    Submitted 8 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  30. arXiv:2402.08182  [pdf, other]

    cs.LG stat.ML

    Variational Continual Test-Time Adaptation

    Authors: Fan Lyu, Kaile Du, Yuyang Li, Hanyu Zhao, Zhang Zhang, Guangcan Liu, Liang Wang

    Abstract: The prior drift is crucial in Continual Test-Time Adaptation (CTTA) methods that only use unlabeled test data, as it can cause significant error propagation. In this paper, we introduce VCoTTA, a variational Bayesian approach to measure uncertainties in CTTA. At the source stage, we transform a pre-trained deterministic model into a Bayesian Neural Network (BNN) via a variational warm-up strategy,…

    Submitted 12 February, 2024; originally announced February 2024.

  31. arXiv:2401.01054  [pdf, other]

    cs.LG cs.AI

    Elastic Multi-Gradient Descent for Parallel Continual Learning

    Authors: Fan Lyu, Wei Feng, Yuepan Li, Qing Sun, Fanhua Shang, Liang Wan, Liang Wang

    Abstract: The goal of Continual Learning (CL) is to continuously learn from new data streams and accomplish the corresponding tasks. Previously studied CL assumes that data are given in sequence nose-to-tail for different tasks, thus indeed belonging to Serial Continual Learning (SCL). This paper studies the novel paradigm of Parallel Continual Learning (PCL) in dynamic multi-task scenarios, where a diverse…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Submitted to IEEE TPAMI

  32. arXiv:2311.03526  [pdf, other]

    cs.IR

    Towards Automated Negative Sampling in Implicit Recommendation

    Authors: Fuyuan Lyu, Yaochen Hu, Xing Tang, Yingxue Zhang, Ruiming Tang, Xue Liu

    Abstract: Negative sampling methods are vital in implicit recommendation models as they allow us to obtain negative instances from massive unlabeled data. Most existing approaches focus on sampling hard negative samples in various ways. These studies are orthogonal to the recommendation model and implicit datasets. However, such an idea contradicts the common belief in AutoML that the model and dataset shou…

    Submitted 6 November, 2023; originally announced November 2023.

  33. arXiv:2310.20490  [pdf, other]

    cs.CV cs.LG

    Long-Tailed Learning as Multi-Objective Optimization

    Authors: Weiqi Li, Fan Lyu, Fanhua Shang, Liang Wan, Wei Feng

    Abstract: Real-world data is extremely imbalanced and presents a long-tailed distribution, resulting in models that are biased towards classes with sufficient samples and perform poorly on rare classes. Recent methods propose to rebalance classes but they undertake the seesaw dilemma (what is increasing performance on tail classes may decrease that of head classes, and vice versa). In this paper, we argue t…

    Submitted 1 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: In submission

  34. arXiv:2310.20268  [pdf, other]

    cs.CV cs.AI

    Constructing Sample-to-Class Graph for Few-Shot Class-Incremental Learning

    Authors: Fuyuan Hu, Jian Zhang, Fan Lyu, Linyan Li, Fenglei Xu

    Abstract: Few-shot class-incremental learning (FSCIL) aims to build machine learning models that can continually learn new concepts from a few data samples, without forgetting knowledge of old classes. The challenge of FSCIL lies in the limited data of new classes, which not only leads to significant overfitting issues but also exacerbates the notorious catastrophic forgetting problems. As proved in early…

    Submitted 31 October, 2023; originally announced October 2023.

  35. arXiv:2310.19113  [pdf, other]

    cs.CV cs.AI eess.SP

    Dynamic V2X Autonomous Perception from Road-to-Vehicle Vision

    Authors: Jiayao Tan, Fan Lyu, Linyan Li, Fuyuan Hu, Tingliang Feng, Fenglei Xu, Rui Yao

    Abstract: Vehicle-to-everything (V2X) perception is an innovative technology that enhances vehicle perception accuracy, thereby elevating the security and reliability of autonomous systems. However, existing V2X perception methods focus on static scenes from mainly vehicle-based vision, which is constrained by sensor capabilities and communication loads. To adapt V2X perception models to dynamic scenes, we…

    Submitted 29 October, 2023; originally announced October 2023.

  36. arXiv:2310.15342  [pdf, other]

    cs.LG cs.IR

    Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network

    Authors: Fuyuan Lyu, Xing Tang, Dugang Liu, Chen Ma, Weihong Luo, Liang Chen, Xiuqiang He, Xue Liu

    Abstract: Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, with which feature interaction selection is a critical component. While previous methods primarily focus on how to search feature interaction in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-gra…

    Submitted 30 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 poster

  37. arXiv:2306.13382  [pdf, other]

    cs.IR

    OptMSM: Optimizing Multi-Scenario Modeling for Click-Through Rate Prediction

    Authors: Xing Tang, Yang Qiao, Yuwen Fu, Fuyuan Lyu, Dugang Liu, Xiuqiang He

    Abstract: A large-scale industrial recommendation platform typically consists of multiple associated scenarios, requiring a unified click-through rate (CTR) prediction model to serve them simultaneously. Existing approaches for multi-scenario CTR prediction generally consist of two main modules: i) a scenario-aware learning module that learns a set of multi-functional representations with scenario-shared an…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted by ECML-PKDD 2023 Applied Data Science Track

  38. arXiv:2306.00315  [pdf, other]

    cs.LG cs.IR

    Explicit Feature Interaction-aware Uplift Network for Online Marketing

    Authors: Dugang Liu, Xing Tang, Han Gao, Fuyuan Lyu, Xiuqiang He

    Abstract: As a key component in online marketing, uplift modeling aims to accurately capture the degree to which different treatments motivate different users, such as coupons or discounts, also known as the estimation of individual treatment effect (ITE). In an actual business scenario, the options for treatment may be numerous and complex, and there may be correlations between different treatments. In add…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted by SIGKDD 2023 Applied Data Science Track

  39. arXiv:2303.13862  [pdf, other]

    cs.CV

    Two-level Graph Network for Few-Shot Class-Incremental Learning

    Authors: Hao Chen, Linyan Li, Fan Lyu, Fuyuan Hu, Zhenping Xia, Fenglei Xu

    Abstract: Few-shot class-incremental learning (FSCIL) aims to design machine learning algorithms that can continually learn new concepts from a few data points, without forgetting knowledge of old classes. The difficulty lies in that limited data from new classes not only lead to significant overfitting issues but also exacerbate the notorious catastrophic forgetting problems. However, existing FSCIL metho…

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: text overlap with arXiv:2203.06953 by other authors

  40. arXiv:2303.02954  [pdf, other]

    cs.LG cs.CV

    Centroid Distance Distillation for Effective Rehearsal in Continual Learning

    Authors: Daofeng Liu, Fan Lyu, Linyan Li, Zhenping Xia, Fuyuan Hu

    Abstract: Rehearsal, retraining on a stored small data subset of old tasks, has been proven effective in solving catastrophic forgetting in continual learning. However, because the sampled data may have a large bias towards the original dataset, retraining on them is susceptible to driving continual domain drift of old tasks in feature space, resulting in forgetting. In this paper, we focus on tackling the cont…

    Submitted 6 March, 2023; originally announced March 2023.

  41. arXiv:2302.02241  [pdf, other]

    cs.IR

    Feature Representation Learning for Click-through Rate Prediction: A Review and New Perspectives

    Authors: Fuyuan Lyu, Xing Tang, Dugang Liu, Haolun Wu, Chen Ma, Xiuqiang He, Xue Liu

    Abstract: Representation learning has been a critical topic in machine learning. In Click-through Rate Prediction, most features are represented as embedding vectors and learned simultaneously with other parameters in the model. With the development of CTR models, feature representation learning has become a trending topic and has been extensively studied by both industrial and academic researchers in recen…

    Submitted 4 February, 2023; originally announced February 2023.

    Comments: Submitted to IJCAI 2023 Survey Track

  42. arXiv:2301.10909  [pdf, other]

    cs.IR

    Optimizing Feature Set for Click-Through Rate Prediction

    Authors: Fuyuan Lyu, Xing Tang, Dugang Liu, Liang Chen, Xiuqiang He, Xue Liu

    Abstract: Click-through rate (CTR) prediction models transform features into latent vectors and enumerate possible feature interactions to improve performance based on the input feature set. Therefore, when selecting an optimal feature set, we should consider the influence of both features and their interactions. However, most previous works focus on either feature field selection or only select feature interaction…

    Submitted 26 March, 2024; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted by WWW 2023 Research Tracks

  43. arXiv:2212.14464  [pdf, other]

    cs.IR

    Result Diversification in Search and Recommendation: A Survey

    Authors: Haolun Wu, Yansen Zhang, Chen Ma, Fuyuan Lyu, Bowei He, Bhaskar Mitra, Xue Liu

    Abstract: Diversifying return results is an important research topic in retrieval systems in order to satisfy both the various interests of customers and the equal market exposure of providers. There has been growing attention on diversity-aware research during recent years, accompanied by a proliferation of literature on methods to promote diversity in search and recommendation. However, diversity-aware st…

    Submitted 18 February, 2024; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 20 pages

  44. arXiv:2211.14763  [pdf, other]

    cs.CV cs.AI

    Multi-Label Continual Learning using Augmented Graph Convolutional Network

    Authors: Kaile Du, Fan Lyu, Linyan Li, Fuyuan Hu, Wei Feng, Fenglei Xu, Xuefeng Xi, Hanjing Cheng

    Abstract: Multi-Label Continual Learning (MLCL) builds a class-incremental framework in a sequential multi-label image recognition data stream. The critical challenges of MLCL are the construction of label relationships on past-missing and future-missing partial labels of training data and the catastrophic forgetting on old classes, resulting in poor generalization. To solve the problems, the study proposes…

    Submitted 27 November, 2022; originally announced November 2022.

  45. arXiv:2210.10581  [pdf, other]

    cs.CL cs.CR

    CEntRE: A paragraph-level Chinese dataset for Relation Extraction among Enterprises

    Authors: Peipei Liu, Hong Li, Zhiyu Wang, Yimo Ren, Jie Liu, Fei Lyu, Hongsong Zhu, Limin Sun

    Abstract: Enterprise relation extraction aims to detect pairs of enterprise entities and identify the business relations between them from unstructured or semi-structured text data, and it is crucial for several real-world applications such as risk analysis, rating research and supply chain security. However, previous work mainly focuses on getting attribute information about enterprises like personnel and…

    Submitted 19 October, 2022; originally announced October 2022.

  46. arXiv:2209.12241  [pdf, other]

    cs.LG

    Exploring Example Influence in Continual Learning

    Authors: Qing Sun, Fan Lyu, Fanhua Shang, Wei Feng, Liang Wan

    Abstract: Continual Learning (CL) sequentially learns new tasks like human beings, with the goal to achieve better Stability (S, remembering past tasks) and Plasticity (P, adapting to new tasks). Due to the fact that past training data is not available, it is valuable to explore the influence difference on S and P among training examples, which may improve the learning pattern towards better SP. Inspired by…

    Submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted at NeurIPS 2022

  47. OptEmbed: Learning Optimal Embedding Table for Click-through Rate Prediction

    Authors: Fuyuan Lyu, Xing Tang, Hong Zhu, Huifeng Guo, Yingxue Zhang, Ruiming Tang, Xue Liu

    Abstract: Learning embedding table plays a fundamental role in Click-through rate(CTR) prediction from the view of the model performance and memory usage. The embedding table is a two-dimensional tensor, with its axes indicating the number of feature values and the embedding dimension, respectively. To learn an efficient and effective embedding table, recent works either assign various embedding dimensions…

    Submitted 6 September, 2022; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: Accepted by CIKM 2022 Research Track

  48. arXiv:2207.07840  [pdf, other]

    cs.LG cs.AI

    Class-Incremental Lifelong Learning in Multi-Label Classification

    Authors: Kaile Du, Linyan Li, Fan Lyu, Fuyuan Hu, Zhenping Xia, Fenglei Xu

    Abstract: Existing class-incremental lifelong learning studies only the data is with single-label, which limits its adaptation to multi-label data. This paper studies Lifelong Multi-Label (LML) classification, which builds an online class-incremental classifier in a sequential multi-label classification data stream. Training on the data with Partial Labels in LML classification may result in more serious Ca…

    Submitted 16 July, 2022; originally announced July 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2203.05534

  49. arXiv:2203.10480  [pdf, other]

    cs.LG

    Encoder-Decoder Architecture for Supervised Dynamic Graph Learning: A Survey

    Authors: Yuecai Zhu, Fuyuan Lyu, Chengming Hu, Xi Chen, Xue Liu

    Abstract: In recent years, the prevalent online services generate a sheer volume of user activity data. Service providers collect these data in order to perform client behavior analysis, and offer better and more customized services. Majority of these data can be modeled and stored as graph, such as the social graph in Facebook, user-video interaction graph in Youtube. These graphs need to evolve over time…

    Submitted 27 March, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: Optimize title for better visibility

  50. arXiv:2203.05534  [pdf, other]

    cs.CV cs.AI

    AGCN: Augmented Graph Convolutional Network for Lifelong Multi-label Image Recognition

    Authors: Kaile Du, Fan Lyu, Fuyuan Hu, Linyan Li, Wei Feng, Fenglei Xu, Qiming Fu

    Abstract: The Lifelong Multi-Label (LML) image recognition builds an online class-incremental classifier in a sequential multi-label image recognition data stream. The key challenges of LML image recognition are the construction of label relationships on Partial Labels of training data and the Catastrophic Forgetting on old classes, resulting in poor generalization. To solve the problems, the study proposes…

    Submitted 10 March, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Accepted in ICME 2022