Showing 1–50 of 70 results for author: Lai, R

  1. arXiv:2507.04511  [pdf, ps, other]

    cs.CV

    FA: Forced Prompt Learning of Vision-Language Models for Out-of-Distribution Detection

    Authors: Xinhua Lu, Runhe Lai, Yanqi Wu, Kanghao Chen, Wei-Shi Zheng, Ruixuan Wang

    Abstract: Pre-trained vision-language models (VLMs) have advanced out-of-distribution (OOD) detection recently. However, existing CLIP-based methods often focus on learning OOD-related knowledge to improve OOD detection, showing limited generalization or reliance on external large-scale auxiliary datasets. In this study, instead of delving into the intricate OOD-related knowledge, we propose an innovative C…

    Submitted 6 July, 2025; originally announced July 2025.

  2. arXiv:2506.21873  [pdf, ps, other]

    cs.CV cs.AI

    Grounding-Aware Token Pruning: Recovering from Drastic Performance Drops in Visual Grounding Caused by Pruning

    Authors: Tzu-Chun Chien, Chieh-Kai Lin, Shiang-Feng Tsai, Ruei-Chi Lai, Hung-Jen Chen, Min Sun

    Abstract: Recent Multimodal Large Language Models (MLLMs) have demonstrated strong performance in visual grounding, establishing themselves as a general interface for various vision-language applications. This progress has driven the development of token pruning methods to mitigate the high computational costs associated with processing numerous visual tokens. However, we observe that pruning significantly…

    Submitted 26 June, 2025; originally announced June 2025.

  3. arXiv:2506.10959  [pdf, ps, other]

    cs.LG cs.AI math.ST

    Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods

    Authors: Zhaiming Shen, Alexander Hsu, Rongjie Lai, Wenjing Liao

    Abstract: While in-context learning (ICL) has achieved remarkable success in natural language and vision domains, its theoretical understanding--particularly in the context of structured geometric data--remains unexplored. In this work, we initiate a theoretical study of ICL for regression of Hölder functions on manifolds. By establishing a novel connection between the attention mechanism and classical kern…

    Submitted 12 June, 2025; originally announced June 2025.

  4. arXiv:2505.03205  [pdf, ps, other]

    cs.LG math.NA math.ST

    Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights

    Authors: Zhaiming Shen, Alex Havrilla, Rongjie Lai, Alexander Cloninger, Wenjing Liao

    Abstract: Transformers serve as the foundational architecture for large language and video generation models, such as GPT, BERT, SORA and their successors. Empirical studies have demonstrated that real-world data and learning tasks exhibit low-dimensional structures, along with some noise or measurement error. The performance of transformers tends to depend on the intrinsic dimension of the data/tasks, thou…

    Submitted 13 June, 2025; v1 submitted 6 May, 2025; originally announced May 2025.

  5. arXiv:2502.20175  [pdf, ps, other]

    cs.AI cs.CL

    An Extensive Evaluation of PDDL Capabilities in off-the-shelf LLMs

    Authors: Kaustubh Vyas, Damien Graux, Sébastien Montella, Pavlos Vougiouklis, Ruofei Lai, Keshuang Li, Yang Ren, Jeff Z. Pan

    Abstract: In recent advancements, large language models (LLMs) have exhibited proficiency in code generation and chain-of-thought reasoning, laying the groundwork for tackling automatic formal planning tasks. This study evaluates the potential of LLMs to understand and generate Planning Domain Definition Language (PDDL), an essential representation in artificial intelligence planning. We conduct an extensiv…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: Under review

  6. arXiv:2501.01005  [pdf, other]

    cs.DC cs.AI cs.LG

    FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

    Authors: Zihao Ye, Lequn Chen, Ruihang Lai, Wuwei Lin, Yineng Zhang, Stephanie Wang, Tianqi Chen, Baris Kasikci, Vinod Grover, Arvind Krishnamurthy, Luis Ceze

    Abstract: Transformers, driven by attention mechanisms, form the foundation of large language models (LLMs). As these models scale up, efficient GPU attention kernels become essential for high-throughput and low-latency inference. Diverse LLM applications demand flexible and high-performance attention solutions. We present FlashInfer: a customizable and efficient attention engine for LLM serving. FlashInfer…

    Submitted 21 April, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

    Comments: Accepted by MLSys 2025, code available at http://github.com/flashinfer-ai/flashinfer

  7. arXiv:2412.18431  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    GeAR: Graph-enhanced Agent for Retrieval-augmented Generation

    Authors: Zhili Shen, Chenxin Diao, Pavlos Vougiouklis, Pascual Merita, Shriram Piramanayagam, Enting Chen, Damien Graux, Andre Melo, Ruofei Lai, Zeren Jiang, Zhongyang Li, YE QI, Yang Ren, Dandan Tu, Jeff Z. Pan

    Abstract: Retrieval-augmented Generation (RAG) relies on effective retrieval capabilities, yet traditional sparse and dense retrievers inherently struggle with multi-hop retrieval scenarios. In this paper, we introduce GeAR, a system that advances RAG performance through two key innovations: (i) an efficient graph expansion mechanism that augments any conventional base retriever, such as BM25, and (ii) an a…

    Submitted 22 June, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

    Comments: ACL 2025 Findings

  8. arXiv:2412.15803  [pdf, other]

    cs.LG cs.AI

    WebLLM: A High-Performance In-Browser LLM Inference Engine

    Authors: Charlie F. Ruan, Yucheng Qin, Xun Zhou, Ruihang Lai, Hongyi Jin, Yixin Dong, Bohan Hou, Meng-Shiun Yu, Yiyan Zhai, Sudeep Agarwal, Hangrui Cao, Siyuan Feng, Tianqi Chen

    Abstract: Advancements in large language models (LLMs) have unlocked remarkable capabilities. While deploying these models typically requires server-grade GPUs and cloud-based inference, the recent emergence of smaller open-source models and increasingly powerful consumer devices have made on-device deployment practical. The web browser as a platform for on-device deployment is universally accessible, provi…

    Submitted 20 December, 2024; originally announced December 2024.

  9. arXiv:2412.12839  [pdf, other]

    cs.AI

    From An LLM Swarm To A PDDL-Empowered HIVE: Planning Self-Executed Instructions In A Multi-Modal Jungle

    Authors: Kaustubh Vyas, Damien Graux, Yijun Yang, Sébastien Montella, Chenxin Diao, Wendi Zhou, Pavlos Vougiouklis, Ruofei Lai, Yang Ren, Keshuang Li, Jeff Z. Pan

    Abstract: In response to the call for agent-based solutions that leverage the ever-increasing capabilities of the deep models' ecosystem, we introduce Hive -- a comprehensive solution for selecting appropriate models and subsequently planning a set of atomic actions to satisfy the end-users' instructions. Hive operates over sets of models and, upon receiving natural language instructions (i.e. user queries)…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: Under review

  10. arXiv:2412.12488  [pdf, other]

    cs.DC

    A System for Microserving of LLMs

    Authors: Hongyi Jin, Ruihang Lai, Charlie F. Ruan, Yingcheng Wang, Todd C. Mowry, Xupeng Miao, Zhihao Jia, Tianqi Chen

    Abstract: The recent advances in LLMs bring a strong demand for efficient system support to improve overall serving efficiency. As LLM inference scales towards multiple GPUs and even multiple compute nodes, various coordination patterns, such as prefill-decode disaggregation and context migration, arise in serving systems. Most inference services today expose a coarse-grained request-level API with a pre-co…

    Submitted 16 December, 2024; originally announced December 2024.

  11. arXiv:2411.15100  [pdf, other]

    cs.CL cs.AI cs.PL

    XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models

    Authors: Yixin Dong, Charlie F. Ruan, Yaxing Cai, Ruihang Lai, Ziyi Xu, Yilong Zhao, Tianqi Chen

    Abstract: The applications of LLM Agents are becoming increasingly complex and diverse, leading to a high demand for structured outputs that can be parsed into code, structured function calls, and embodied agent commands. These developments bring significant demands for structured generation in LLM inference. Context-free grammar is a flexible approach to enable structured generation via constrained decodin…

    Submitted 12 May, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

    Comments: MLSys '25

  12. arXiv:2410.03386  [pdf, ps, other]

    cs.CY

    Chronic Disease Diagnoses Using Behavioral Data

    Authors: Di Wang, Yidan Hu, Eng Sing Lee, Hui Hwang Teong, Ray Tian Rui Lai, Wai Han Hoi, Chunyan Miao

    Abstract: Early detection of chronic diseases is beneficial to healthcare by providing a golden opportunity for timely interventions. Although numerous prior studies have successfully used machine learning (ML) models for disease diagnoses, they highly rely on medical data, which are scarce for most patients in the early stage of the chronic diseases. In this paper, we aim to diagnose hyperglycemia (diabete…

    Submitted 4 October, 2024; originally announced October 2024.

  13. arXiv:2408.14728  [pdf, other]

    cs.LG cs.AI cs.CR

    TART: Boosting Clean Accuracy Through Tangent Direction Guided Adversarial Training

    Authors: Bongsoo Yi, Rongjie Lai, Yao Li

    Abstract: Adversarial training has been shown to be successful in enhancing the robustness of deep neural networks against adversarial attacks. However, this robustness is accompanied by a significant decline in accuracy on clean data. In this paper, we propose a novel method, called Tangent Direction Guided Adversarial Training (TART), that leverages the tangent space of the data manifold to ameliorate the…

    Submitted 26 August, 2024; originally announced August 2024.

  14. arXiv:2408.11049  [pdf, other]

    cs.CL

    MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

    Authors: Ranajoy Sadhukhan, Jian Chen, Zhuoming Chen, Vashisth Tiwari, Ruihang Lai, Jinyuan Shi, Ian En-Hsu Yen, Avner May, Tianqi Chen, Beidi Chen

    Abstract: Large Language Models (LLMs) have become more prevalent in long-context applications such as interactive chatbots, document analysis, and agent workflows, but it is challenging to serve long-context requests with low latency and high throughput. Speculative decoding (SD) is a widely used technique to reduce latency losslessly, but the conventional wisdom suggests that its efficacy is limited to sm…

    Submitted 1 April, 2025; v1 submitted 20 August, 2024; originally announced August 2024.

  15. arXiv:2406.07432  [pdf, other]

    cs.IR

    Matryoshka Representation Learning for Recommendation

    Authors: Riwei Lai, Li Chen, Weixin Chen, Rui Chen

    Abstract: Representation learning is essential for deep-neural-network-based recommender systems to capture user preferences and item features within fixed-dimensional user and item vectors. Unlike existing representation learning methods that either treat each user preference and item feature uniformly or categorize them into discrete clusters, we argue that in the real world, user preferences and item fea…

    Submitted 11 June, 2024; originally announced June 2024.

  16. arXiv:2404.09151  [pdf, other]

    cs.SE cs.LG

    Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging

    Authors: Siyuan Feng, Jiawei Liu, Ruihang Lai, Charlie F. Ruan, Yong Yu, Lingming Zhang, Tianqi Chen

    Abstract: While existing machine learning (ML) frameworks focus on established platforms, like running CUDA on server-grade GPUs, there have been growing demands to enable emerging AI applications in a broader set of scenarios, such as running Large Language Models (LLMs) within browsers and mobile phones. However, deploying emerging models on new platforms (such as Metal and WebGPU) presents significant so…

    Submitted 3 April, 2025; v1 submitted 14 April, 2024; originally announced April 2024.

  17. arXiv:2403.08010  [pdf, other]

    cs.CL

    Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM

    Authors: Jingcong Liang, Rong Ye, Meng Han, Ruofei Lai, Xinyu Zhang, Xuanjing Huang, Zhongyu Wei

    Abstract: How can we construct an automated debate judge to evaluate an extensive, vibrant, multi-turn debate? This task is challenging, as judging a debate involves grappling with lengthy texts, intricate argument relationships, and multi-dimensional assessments. At the same time, current research mainly focuses on short dialogues, rarely touching upon the evaluation of an entire debate. In this paper, by…

    Submitted 19 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  18. On Defeating Graph Analysis of Anonymous Transactions

    Authors: Christoph Egger, Russell W. F. Lai, Viktoria Ronge, Ivy K. Y. Woo, Hoover H. F. Yin

    Abstract: In a ring-signature-based anonymous cryptocurrency, signers of a transaction are hidden among a set of potential signers, called a ring, whose size is much smaller than the number of all users. The ring-membership relations specified by the sets of transactions thus induce bipartite transaction graphs, whose distribution is in turn induced by the ring sampler underlying the cryptocurrency. Since…

    Submitted 28 February, 2024; originally announced February 2024.

    Journal ref: Proceedings on Privacy Enhancing Technologies (PoPETs), Vol. 2022, Issue 3, Pages 538-557

  19. arXiv:2401.17878  [pdf, other]

    cs.IR

    A Survey on Data-Centric Recommender Systems

    Authors: Riwei Lai, Rui Chen, Chi Zhang

    Abstract: Recommender systems (RSs) have become an essential tool for mitigating information overload in a range of real-world applications. Recent trends in RSs have revealed a major paradigm shift, moving the spotlight from model-centric innovations to data-centric efforts (e.g., improving data quality and quantity). This evolution has given rise to the concept of data-centric recommender systems (Data-Ce…

    Submitted 28 May, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures

  20. arXiv:2401.15482  [pdf, other]

    cs.LG cs.GT math.OC

    Unsupervised Solution Operator Learning for Mean-Field Games via Sampling-Invariant Parametrizations

    Authors: Han Huang, Rongjie Lai

    Abstract: Recent advances in deep learning have witnessed many innovative frameworks that solve high dimensional mean-field games (MFG) accurately and efficiently. These methods, however, are restricted to solving single-instance MFG and demand extensive computational time per instance, limiting practicality. To overcome this, we develop a novel framework to learn the MFG solution operator. Our model takes…

    Submitted 23 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  21. arXiv:2401.15045  [pdf]

    cs.NE

    Emulating Complex Synapses Using Interlinked Proton Conductors

    Authors: Lifu Zhang, Ji-An Li, Yang Hu, Jie Jiang, Rongjie Lai, Marcus K. Benna, Jian Shi

    Abstract: In terms of energy efficiency and computational speed, neuromorphic electronics based on non-volatile memory devices is expected to be one of the most promising hardware candidates for future artificial intelligence (AI). However, catastrophic forgetting, networks rapidly overwriting previously learned weights when learning new tasks, remains a pivotal hurdle in either digital or analog AI chips fo…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 6 figures

  22. arXiv:2401.10490  [pdf, other]

    cs.LG

    Generalization Error Guaranteed Auto-Encoder-Based Nonlinear Model Reduction for Operator Learning

    Authors: Hao Liu, Biraj Dahal, Rongjie Lai, Wenjing Liao

    Abstract: Many physical processes in science and engineering are naturally represented by operators between infinite-dimensional function spaces. The problem of operator learning, in this context, seeks to extract these physical processes from empirical data, which is challenging due to the infinite or high dimensionality of data. An integral component in addressing this challenge is model reduction, which…

    Submitted 19 January, 2024; originally announced January 2024.

  23. arXiv:2401.05191  [pdf, other]

    cs.IR

    Adaptive Hardness Negative Sampling for Collaborative Filtering

    Authors: Riwei Lai, Rui Chen, Qilong Han, Chi Zhang, Li Chen

    Abstract: Negative sampling is essential for implicit collaborative filtering to provide proper negative training signals so as to achieve desirable performance. We experimentally unveil a common limitation of all existing negative sampling methods that they can only select negative samples of a fixed hardness level, leading to the false positive problem (FPP) and false negative problem (FNP). We then propo…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  24. arXiv:2312.13608  [pdf, other]

    cs.CL

    Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation

    Authors: Jiayu Lin, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei

    Abstract: Counter-argument generation -- a captivating area in computational linguistics -- seeks to craft statements that offer opposing views. While most research has ventured into paragraph-level generation, sentence-level counter-argument generation beckons with its unique constraints and brevity-focused challenges. Furthermore, the diverse nature of counter-arguments poses challenges for evaluating mod…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: EMNLP 2023, main conference

  25. arXiv:2312.00874  [pdf, other]

    cs.CL

    Hi-ArG: Exploring the Integration of Hierarchical Argumentation Graphs in Language Pretraining

    Authors: Jingcong Liang, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei

    Abstract: The knowledge graph is a structure to store and represent knowledge, and recent studies have discussed its capability to assist language models for various applications. Some variations of knowledge graphs aim to record arguments and their relations for computational argumentation tasks. However, many must simplify semantic types to fit specific schemas, thus losing flexibility and expression abil…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: to be published in EMNLP 2023

  26. arXiv:2311.18397  [pdf, other]

    cs.CL

    IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions

    Authors: Zhebin Zhang, Xinyu Zhang, Yuanhang Ren, Saijiang Shi, Meng Han, Yongkang Wu, Ruofei Lai, Zhao Cao

    Abstract: Retrieval-Augmented Generation (RAG), by incorporating external knowledge with parametric memory of language models, has become the state-of-the-art architecture for open-domain QA tasks. However, common knowledge bases are inherently constrained by limited coverage and noisy information, making retrieval-based approaches inadequate to answer implicit reasoning questions. In this paper, we propose…

    Submitted 30 November, 2023; originally announced November 2023.

  27. arXiv:2311.02103  [pdf, other]

    cs.LG cs.AI cs.PL

    Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

    Authors: Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen

    Abstract: Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven the demand for their universal deployment across a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces a cros…

    Submitted 6 February, 2025; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: To appear at ASPLOS 2025 (16 pages, 20 figures)

  28. arXiv:2308.15711  [pdf, other]

    cs.CL cs.AI

    Optimizing Factual Accuracy in Text Generation through Dynamic Knowledge Selection

    Authors: Hongjin Qian, Zhicheng Dou, Jiejun Tan, Haonan Chen, Haoqi Gu, Ruofei Lai, Xinyu Zhang, Zhao Cao, Ji-Rong Wen

    Abstract: Language models (LMs) have revolutionized the way we interact with information, but they often generate nonfactual text, raising concerns about their reliability. Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up (e.g., entity mismatch) of irrelevant references. Besides, as the length of the output text grows,…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: 15 pages

  29. Augmented Negative Sampling for Collaborative Filtering

    Authors: Yuhan Zhao, Rui Chen, Riwei Lai, Qilong Han, Hongtao Song, Li Chen

    Abstract: Negative sampling is essential for implicit-feedback-based collaborative filtering, which is used to constitute negative signals from massive unlabeled data to guide supervised learning. The state-of-the-art idea is to utilize hard negative samples that carry more useful information to form a better decision boundary. To balance efficiency and effectiveness, the vast majority of existing methods f…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 11 pages, 16 figures

    MSC Class: 68T07 ACM Class: H.3.3

  30. arXiv:2304.04358  [pdf, other]

    cs.CL cs.AI

    WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

    Authors: Hongjing Qian, Yutao Zhu, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, Ji-Rong Wen

    Abstract: In this paper, we introduce a new NLP task -- generating short factual articles with references for queries by mining supporting evidence from the Web. In this task, called WebBrain, the ultimate goal is to generate a fluent, informative, and factually-correct short article (e.g., a Wikipedia article) for a factual query unseen in Wikipedia. To enable experiments on WebBrain, we construct a large-…

    Submitted 9 April, 2023; originally announced April 2023.

    Comments: Codes in https://github.com/qhjqhj00/WebBrain

  31. arXiv:2303.09863  [pdf, other]

    stat.ML cs.LG

    Deep Nonparametric Estimation of Intrinsic Data Structures by Chart Autoencoders: Generalization Error and Robustness

    Authors: Hao Liu, Alex Havrilla, Rongjie Lai, Wenjing Liao

    Abstract: Autoencoders have demonstrated remarkable success in learning low-dimensional latent features of high-dimensional data across various applications. Assuming that data are sampled near a low-dimensional manifold, we employ chart autoencoders, which encode data into low-dimensional latent features on a collection of charts, preserving the topology and geometry of the data manifold. Our paper establi…

    Submitted 25 October, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

  32. arXiv:2212.12842  [pdf, other]

    cs.DC

    More is Different: Prototyping and Analyzing a New Form of Edge Server with Massive Mobile SoCs

    Authors: Li Zhang, Zhe Fu, Boqing Shi, Xiang Li, Rujin Lai, Chenyang Yang, Ao Zhou, Xiao Ma, Shangguang Wang, Mengwei Xu

    Abstract: Huge energy consumption poses a significant challenge for edge clouds. In response to this, we introduce a new type of edge server, namely SoC Cluster, that orchestrates multiple low-power mobile system-on-chips (SoCs) through an on-chip network. For the first time, we have developed a concrete SoC Cluster consisting of 60 Qualcomm Snapdragon 865 SoCs housed in a 2U rack, which has been successful…

    Submitted 16 July, 2024; v1 submitted 24 December, 2022; originally announced December 2022.

    Comments: Accepted at USENIX ATC 2024

  33. arXiv:2209.06583  [pdf, other]

    cs.IR cs.AI cs.CL

    Pre-training for Information Retrieval: Are Hyperlinks Fully Explored?

    Authors: Jiawen Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Zikai Guo, Zhaoye Fei, Ruofei Lai, Yongkang Wu, Zhao Cao, Zhicheng Dou

    Abstract: Recent years have witnessed great progress on applying pre-trained language models, e.g., BERT, to information retrieval (IR) tasks. Hyperlinks, which are commonly used in Web pages, have been leveraged for designing pre-training objectives. For example, anchor texts of the hyperlinks have been used for simulating queries, thus constructing tremendous query-document pairs for pre-training. However…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: work in progress

  34. arXiv:2208.10570  [pdf, other]

    cs.LG math.NA

    Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders

    Authors: Stefan C. Schonsheck, Scott Mahan, Timo Klock, Alexander Cloninger, Rongjie Lai

    Abstract: Autoencoding is a popular method in representation learning. Conventional autoencoders employ symmetric encoding-decoding procedures and a simple Euclidean latent space to detect hidden low-dimensional structures in an unsupervised way. Some modern approaches to novel data generation such as generative adversarial networks askew this symmetry, but still employ a pair of massive networks--one to ge…

    Submitted 4 October, 2024; v1 submitted 22 August, 2022; originally announced August 2022.

  35. arXiv:2208.09129  [pdf, other]

    cs.CL

    Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding

    Authors: Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu

    Abstract: Generalized text representations are the foundation of many natural language understanding tasks. To fully utilize the different corpus, it is inevitable that models need to understand the relevance among them. However, many methods ignore the relevance and adopt a single-channel model (a coarse paradigm) directly for all tasks, which lacks enough rationality and interpretation. In addition, some…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted by COLING 2022

  36. arXiv:2207.04663  [pdf, other]

    eess.IV cs.AI cs.LG

    An Ultra-low Power TinyML System for Real-time Visual Processing at Edge

    Authors: Kunran Xu, Huawei Zhang, Yishi Li, Yuhao Zhang, Rui Lai, Yi Liu

    Abstract: Tiny machine learning (TinyML), executing AI workloads on resource and power strictly restricted systems, is an important and challenging topic. This brief firstly presents an extremely tiny backbone to construct high efficiency CNN models for various visual tasks. Then, a specially designed neural co-processor (NCP) is interconnected with MCU to build an ultra-low power TinyML system, which store…

    Submitted 1 June, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 5 pages, 5 figures

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, 2023

  37. arXiv:2207.04606  [pdf, other]

    cs.LG cs.AI cs.PL

    SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

    Authors: Zihao Ye, Ruihang Lai, Junru Shao, Tianqi Chen, Luis Ceze

    Abstract: Sparse tensors are rapidly becoming critical components of modern deep learning workloads. However, developing high-performance sparse operators can be difficult and tedious, and existing vendor libraries cannot satisfy the escalating demands from new operators. Sparse tensor compilers simplify the development of operators, but efficient sparse compilation for deep learning remains challenging bec…

    Submitted 21 February, 2023; v1 submitted 10 July, 2022; originally announced July 2022.

    Comments: To appear at ASPLOS 2023 (19 pages, 23 figures), source code available at https://github.com/uwsampl/sparsetir, artifact available at https://github.com/uwsampl/sparsetir-artifact

  38. arXiv:2207.04296  [pdf, other]

    cs.LG cs.AI cs.PL

    TensorIR: An Abstraction for Automatic Tensorized Program Optimization

    Authors: Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen

    Abstract: Deploying deep learning models on various devices has become an important topic. The wave of hardware specialization brings a diverse set of acceleration primitives for multi-dimensional tensor computations. These new acceleration primitives, along with the emerging machine learning models, bring tremendous engineering challenges. In this paper, we present TensorIR, a compiler abstraction for opti…

    Submitted 27 October, 2022; v1 submitted 9 July, 2022; originally announced July 2022.

    Comments: Accepted to ASPLOS 2023

  39. arXiv:2206.14990  [pdf, other]

    math.OC cs.LG cs.MA

    Bridging Mean-Field Games and Normalizing Flows with Trajectory Regularization

    Authors: Han Huang, Jiajia Yu, Jie Chen, Rongjie Lai

    Abstract: Mean-field games (MFGs) are a modeling framework for systems with a large number of interacting agents. They have applications in economics, finance, and game theory. Normalizing flows (NFs) are a family of deep generative models that compute data likelihoods by using an invertible mapping, which is typically parameterized by using neural networks. They are useful for density modeling and data gen…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: 36 pages, 22 figures, 9 tables

    MSC Class: 49M41; 49M25

  40. arXiv:2205.13603  [pdf, other]

    cs.LG

    Tensor Program Optimization with Probabilistic Programs

    Authors: Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen

    Abstract: Automatic optimization for tensor programs becomes increasingly important as we deploy deep learning in various environments, and efficient optimization relies on a rich search space and effective search. Most existing efforts adopt a search space which lacks the ability to efficiently enable domain experts to grow the search space. This paper introduces MetaSchedule, a domain-specific probabilist…

    Submitted 9 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted to NeurIPS 2022

  41. arXiv:2205.10423  [pdf, other]

    cs.LG q-bio.BM

    Learning Geometrically Disentangled Representations of Protein Folding Simulations

    Authors: N. Joseph Tatro, Payel Das, Pin-Yu Chen, Vijil Chenthamarakshan, Rongjie Lai

    Abstract: Massive molecular simulations of drug-target proteins have been used as a tool to understand disease mechanism and develop therapeutics. This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein, e.g. SARS-CoV-2 Spike protein, obtained from computationally expensive molecular simulations. Model tasks involve characterizing the distinct structural f…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: 13 pages, appeared at SimDL ICLR Workshop 2021

    MSC Class: 68T07

  42. arXiv:2205.05122  [pdf, ps, other]

    cs.IT

    Multichannel Optimal Tree-Decodable Codes are Not Always Optimal Prefix Codes

    Authors: Hoover H. F. Yin, Harry W. H. Wong, Mehrdad Tahernia, Russell W. F. Lai

    Abstract: The theory of multichannel prefix codes aims to generalize the classical theory of prefix codes. Although single- and two-channel prefix codes always have decoding trees, the same cannot be said when there are more than two channels. One question is of theoretical interest: Do there exist optimal tree-decodable codes that are not optimal prefix codes? Existing literature, which focused on generali…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Full version of the conference version in ISIT'22

  43. arXiv:2201.05279  [pdf, other]

    cs.LG cs.CV

    Manifoldron: Direct Space Partition via Manifold Discovery

    Authors: Dayang Wang, Feng-Lei Fan, Bo-Jian Hou, Hao Zhang, Zhen Jia, Boce Zhou, Rongjie Lai, Hengyong Yu, Fei Wang

    Abstract: A neural network with the widely-used ReLU activation has been shown to partition the sample space into many convex polytopes for prediction. However, the parameterized way a neural network and other machine learning models use to partition the space has imperfections, e.g., the compromised interpretability for complex models, the inflexibility in decision boundary construction d…

    Submitted 2 May, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

  44. arXiv:2112.07513  [pdf, other]

    cs.CV cs.AI cs.MM

    CORE-Text: Improving Scene Text Detection with Contrastive Relational Reasoning

    Authors: Jingyang Lin, Yingwei Pan, Rongfeng Lai, Xuehang Yang, Hongyang Chao, Ting Yao

    Abstract: Localizing text instances in natural scenes is regarded as a fundamental challenge in computer vision. Nevertheless, owing to the extremely varied aspect ratios and scales of text instances in real scenes, most conventional text detectors suffer from the sub-text problem that only localizes the fragments of text instance (i.e., sub-texts). In this work, we quantitatively analyze the sub-text probl…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: ICME 2021 (Oral); Code is publicly available at: https://github.com/jylins/CORE-Text

  45. arXiv:2110.07431  [pdf, other]

    cs.CL

    Towards More Effective and Economic Sparsely-Activated Model

    Authors: Hao Jiang, Ke Zhan, Jianwei Qu, Yongkang Wu, Zhaoye Fei, Xinyu Zhang, Lei Chen, Zhicheng Dou, Xipeng Qiu, Zikai Guo, Ruofei Lai, Jiawen Wu, Enrui Hu, Yinxia Zhang, Yantao Jia, Fan Yu, Zhao Cao

    Abstract: The sparsely-activated models have achieved great success in natural language processing through large-scale parameters and relatively low computational cost, and gradually become a feasible technique for training and implementing extremely large models. Due to the limit of communication cost, activating multiple experts is hardly affordable during training and inference. Therefore, previous work…

    Submitted 14 October, 2021; originally announced October 2021.

  46. arXiv:2110.06081  [pdf, other]

    cs.LG cs.CV cs.NE

    On Expressivity and Trainability of Quadratic Networks

    Authors: Feng-Lei Fan, Mengzhou Li, Fei Wang, Rongjie Lai, Ge Wang

    Abstract: Inspired by the diversity of biological neurons, quadratic artificial neurons can play an important role in deep learning models. The type of quadratic neurons of our interest replaces the inner-product operation in the conventional neuron with a quadratic function. Despite promising results so far achieved by networks of quadratic neurons, there are important issues not well addressed. Theoretica…

    Submitted 8 September, 2023; v1 submitted 12 October, 2021; originally announced October 2021.

  47. arXiv:2109.06436  [pdf, other]

    cs.IR cs.CL

    YES SIR! Optimizing Semantic Space of Negatives with Self-Involvement Ranker

    Authors: Ruizhi Pu, Xinyu Zhang, Ruofei Lai, Zikai Guo, Yinxia Zhang, Hao Jiang, Yongkang Wu, Yantao Jia, Zhicheng Dou, Zhao Cao

    Abstract: Pre-trained model such as BERT has been proved to be an effective tool for dealing with Information Retrieval (IR) problems. Due to its inspiring performance, it has been widely used to tackle with real-world IR problems such as document ranking. Recently, researchers have found that selecting "hard" rather than "random" negative samples would be beneficial for fine-tuning pre-trained models on ra…

    Submitted 14 September, 2021; originally announced September 2021.

  48. arXiv:2105.03606  [pdf, ps, other]

    cs.IT

    On Multi-Channel Huffman Codes for Asymmetric-Alphabet Channels

    Authors: Hoover H. F. Yin, Xishi Wang, Ka Hei Ng, Russell W. F. Lai, Lucien K. L. Ng, Jack P. K. Ma

    Abstract: Zero-error single-channel source coding has been studied extensively over the past decades. Its natural multi-channel generalization is however not well investigated. While the special case with multiple symmetric-alphabet channels was studied a decade ago, codes in such setting have no advantage over single-channel codes in data compression, making them worthless in most applications. With essent…

    Submitted 8 May, 2021; originally announced May 2021.

    Comments: full version of the ISIT 2021 paper

  49. arXiv:2009.02439  [pdf, other]

    cs.LG math.OC stat.ML

    Optimizing Mode Connectivity via Neuron Alignment

    Authors: N. Joseph Tatro, Pin-Yu Chen, Payel Das, Igor Melnyk, Prasanna Sattigeri, Rongjie Lai

    Abstract: The loss landscapes of deep neural networks are not well understood due to their high nonconvexity. Empirically, the local minima of these loss functions can be connected by a learned curve in model space, along which the loss remains nearly constant; a feature known as mode connectivity. Yet, current curve finding algorithms do not consider the influence of symmetry in the loss surface created by…

    Submitted 2 November, 2020; v1 submitted 4 September, 2020; originally announced September 2020.

    Comments: Accepted to NeurIPS 2020, 24 pages, 9 figures, code available at https://github.com/IBM/NeuronAlignment

    Journal ref: Advances in Neural Information Processing Systems, Volume 33, 2020

  50. arXiv:2007.13049  [pdf, other]

    cs.CV

    A Dual Iterative Refinement Method for Non-rigid Shape Matching

    Authors: Rui Xiang, Rongjie Lai, Hongkai Zhao

    Abstract: In this work, a simple and efficient dual iterative refinement (DIR) method is proposed for dense correspondence between two nearly isometric shapes. The key idea is to use dual information, such as spatial and spectral, or local and global features, in a complementary and effective way, and extract more accurate information from current iteration to use for the next iteration. In each DIR iterati…

    Submitted 19 November, 2020; v1 submitted 25 July, 2020; originally announced July 2020.

    Comments: 9 pages, 9 figures and 1 table