Showing 1–50 of 91 results for author: Bai, Q

Searching in archive cs.

  1. arXiv:2506.24123  [pdf, ps, other]

    cs.CV

    Calligrapher: Freestyle Text Image Customization

    Authors: Yue Ma, Qingyan Bai, Hao Ouyang, Ka Leong Cheng, Qiuyu Wang, Hongyu Liu, Zichen Liu, Haofan Wang, Jingye Chen, Yujun Shen, Qifeng Chen

    Abstract: We introduce Calligrapher, a novel diffusion-based framework that innovatively integrates advanced text customization with artistic typography for digital calligraphy and design applications. Addressing the challenges of precise style control and data dependency in typographic customization, our framework incorporates three key technical contributions. First, we develop a self-distillation mechani…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: Project page: https://calligrapher2025.github.io/Calligrapher Code: https://github.com/Calligrapher2025/Calligrapher

  2. arXiv:2506.09513  [pdf, ps, other]

    cs.CL cs.AI cs.MA

    ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

    Authors: Yu Sun, Xingyu Qian, Weiwen Xu, Hao Zhang, Chenghao Xiao, Long Li, Yu Rong, Wenbing Huang, Qifeng Bai, Tingyang Xu

    Abstract: Though reasoning-based large language models (LLMs) have excelled in mathematics and programming, their capabilities in knowledge-intensive medical question answering remain underexplored. To address this, we introduce ReasonMed, the largest medical reasoning dataset, comprising 370k high-quality examples distilled from 1.7 million initial reasoning paths generated by various LLMs. ReasonMed is co…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 24 pages, 6 figures, 7 tables

  3. arXiv:2505.15138  [pdf, ps, other]

    cs.LG cs.AI

    Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm

    Authors: Yang Xu, Swetha Ganesh, Washim Uddin Mondal, Qinbo Bai, Vaneet Aggarwal

    Abstract: This paper investigates infinite-horizon average reward Constrained Markov Decision Processes (CMDPs) with general parametrization. We propose a Primal-Dual Natural Actor-Critic algorithm that adeptly manages constraints while ensuring a high convergence rate. In particular, our algorithm achieves global convergence and constraint violation rates of $\tilde{\mathcal{O}}(1/\sqrt{T})$ over a horizon…

    Submitted 21 May, 2025; originally announced May 2025.

  4. arXiv:2503.06518  [pdf, other]

    cs.LG cs.AI

    Towards Superior Quantization Accuracy: A Layer-sensitive Approach

    Authors: Feng Zhang, Yanbin Liu, Weihua Li, Jie Lv, Xiaodan Wang, Quan Bai

    Abstract: Large Vision and Language Models have exhibited remarkable human-like intelligence in tasks such as natural language comprehension, problem-solving, logical reasoning, and knowledge retrieval. However, training and serving these models require substantial computational resources, posing a significant barrier to their widespread application and further research. To mitigate this challenge, various…

    Submitted 9 March, 2025; originally announced March 2025.

  5. arXiv:2503.00413  [pdf, other]

    cs.CV cs.LG

    CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

    Authors: Tianyu Huai, Jie Zhou, Xingjiao Wu, Qin Chen, Qingchun Bai, Ze Zhou, Liang He

    Abstract: Multimodal large language models (MLLMs) have garnered widespread attention from researchers due to their remarkable understanding and generation capabilities in visual language tasks (e.g., visual question answering). However, the rapid pace of knowledge updates in the real world makes offline training of MLLMs costly, and when faced with non-stationary data streams, MLLMs suffer from catastrophi…

    Submitted 1 March, 2025; originally announced March 2025.

    Comments: 10 pages, 4 figures, accepted by CVPR 2025

  6. arXiv:2502.02945  [pdf, other]

    cs.CL cs.AI

    LLM-KT: Aligning Large Language Models with Knowledge Tracing using a Plug-and-Play Instruction

    Authors: Ziwei Wang, Jie Zhou, Qin Chen, Min Zhang, Bo Jiang, Aimin Zhou, Qinchun Bai, Liang He

    Abstract: The knowledge tracing (KT) problem is an extremely important topic in personalized education, which aims to predict whether students can correctly answer the next question based on their past question-answer records. Prior work on this task mainly focused on learning the sequence of behaviors based on the IDs or textual information. However, these studies usually fail to capture students' sufficie…

    Submitted 5 February, 2025; originally announced February 2025.

  7. arXiv:2501.13394  [pdf, ps, other]

    cs.LG cs.AI

    Concurrent Learning with Aggregated States via Randomized Least Squares Value Iteration

    Authors: Yan Chen, Qinxun Bai, Yiteng Zhang, Shi Dong, Maria Dimakopoulou, Qi Sun, Zhengyuan Zhou

    Abstract: Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions on a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents…

    Submitted 15 June, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

  8. arXiv:2501.09905  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    SLIM: Sim-to-Real Legged Instructive Manipulation via Long-Horizon Visuomotor Learning

    Authors: Haichao Zhang, Haonan Yu, Le Zhao, Andrew Choi, Qinxun Bai, Break Yang, Wei Xu

    Abstract: We present a low-cost legged mobile manipulation system that solves long-horizon real-world tasks, trained by reinforcement learning purely in simulation. This system is made possible by 1) a hierarchical design of a high-level policy for visual-mobile manipulation following task instructions, and a low-level quadruped locomotion policy, 2) a teacher and student training pipeline for the high leve…

    Submitted 29 January, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

  9. arXiv:2412.21079  [pdf, other]

    cs.CV

    Edicho: Consistent Image Editing in the Wild

    Authors: Qingyan Bai, Hao Ouyang, Yinghao Xu, Qiuyu Wang, Ceyuan Yang, Ka Leong Cheng, Yujun Shen, Qifeng Chen

    Abstract: As a verified need, consistent editing across in-the-wild images remains a technical challenge arising from various unmanageable factors, like object poses, lighting conditions, and photography environments. Edicho steps in with a training-free solution based on diffusion models, featuring a fundamental design principle of using explicit image correspondence to direct editing. Specifically, the ke…

    Submitted 14 January, 2025; v1 submitted 30 December, 2024; originally announced December 2024.

    Comments: Project page: https://ant-research.github.io/edicho/

  10. arXiv:2411.08307  [pdf, other]

    cs.AI cs.MM cs.SD eess.AS

    PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation

    Authors: Yungang Yi, Weihua Li, Matthew Kuo, Quan Bai

    Abstract: AI-based music generation has progressed significantly in recent years. However, creating symbolic music that is both long-structured and expressive remains a considerable challenge. In this paper, we propose PerceiverS (Segmentation and Scale), a novel architecture designed to address this issue by leveraging both Effective Segmentation and Multi-Scale attention mechanisms. Our approach enhances…

    Submitted 4 December, 2024; v1 submitted 12 November, 2024; originally announced November 2024.

  11. arXiv:2411.00259  [pdf, other]

    cs.LG

    Enhancing Diversity in Bayesian Deep Learning via Hyperspherical Energy Minimization of CKA

    Authors: David Smerkous, Qinxun Bai, Fuxin Li

    Abstract: Particle-based Bayesian deep learning often requires a similarity metric to compare two networks. However, naive similarity metrics lack permutation invariance and are inappropriate for comparing networks. Centered Kernel Alignment (CKA) on feature kernels has been proposed to compare deep networks but has not been used as an optimization objective in Bayesian deep learning. In this paper, we expl…

    Submitted 31 October, 2024; originally announced November 2024.

    Comments: NeurIPS 2024

  12. arXiv:2410.08345  [pdf, other]

    cs.AI

    Large Legislative Models: Towards Efficient AI Policymaking in Economic Simulations

    Authors: Henry Gasztowtt, Benjamin Smith, Vincent Zhu, Qinxun Bai, Edwin Zhang

    Abstract: The improvement of economic policymaking presents an opportunity for broad societal benefit, a notion that has inspired research towards AI-driven policymaking tools. AI policymaking holds the potential to surpass human performance through the ability to process data quickly at scale. However, existing RL-based methods exhibit sample inefficiency, and are further limited by an inability to flexibl…

    Submitted 10 October, 2024; originally announced October 2024.

  13. arXiv:2408.11408  [pdf, other]

    cs.CV

    Latent Feature and Attention Dual Erasure Attack against Multi-View Diffusion Models for 3D Assets Protection

    Authors: Jingwei Sun, Xuchong Zhang, Changfeng Sun, Qicheng Bai, Hongbin Sun

    Abstract: Multi-View Diffusion Models (MVDMs) enable remarkable improvements in the field of 3D geometric reconstruction, but the issue regarding intellectual property has received increasing attention due to unauthorized imitation. Recently, some works have utilized adversarial attacks to protect copyright. However, all these works focus on single-image generation tasks which only need to consider the inne…

    Submitted 7 April, 2025; v1 submitted 21 August, 2024; originally announced August 2024.

    Comments: This paper has been accepted by ICME 2025

  14. arXiv:2407.15415  [pdf, other]

    cs.CL

    LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

    Authors: Xi Chen, Songyang Zhang, Qibing Bai, Kai Chen, Satoshi Nakamura

    Abstract: We introduce LLaST, a framework for building high-performance Large Language model based Speech-to-text Translation systems. We address the limitations of end-to-end speech translation (E2E ST) models by exploring model architecture design and optimization techniques tailored for LLMs. Our approach includes LLM-based speech translation architecture design, ASR-augmented training, multilingual data…

    Submitted 22 July, 2024; originally announced July 2024.

  15. arXiv:2407.15233  [pdf, other]

    cs.CV

    LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer

    Authors: Yu Li, Yifan Chen, Gongye Liu, Fei Yin, Qingyan Bai, Jie Wu, Hongfa Wang, Ruihang Chu, Yujiu Yang

    Abstract: Layout generation is a foundation task of graphic design, which requires the integration of visual aesthetics and harmonious expression of content delivery. However, existing methods still face challenges in generating precise and visually appealing layouts, including blocking, overlapping, small-sized, or spatial misalignment. We found that these methods overlook the crucial balance between learn…

    Submitted 22 November, 2024; v1 submitted 21 July, 2024; originally announced July 2024.

  16. Imbalanced Graph-Level Anomaly Detection via Counterfactual Augmentation and Feature Learning

    Authors: Zitong Wang, Xuexiong Luo, Enfeng Song, Qiuqing Bai, Fu Lin

    Abstract: Graph-level anomaly detection (GLAD) has already gained significant importance and has become a popular field of study, attracting considerable attention across numerous downstream works. The core focus of this domain is to capture and highlight the anomalous information within given graph datasets. In most existing studies, anomalies are often the instances of few. The stark imbalance misleads cu…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: 12 pages, 4 figures, SSDBM2024

  17. arXiv:2406.11481  [pdf, other]

    cs.LG cs.AI

    Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms

    Authors: Vaneet Aggarwal, Washim Uddin Mondal, Qinbo Bai

    Abstract: Reinforcement Learning (RL) serves as a versatile framework for sequential decision-making, finding applications across diverse domains such as robotics, autonomous driving, recommendation systems, supply chain optimization, biology, mechanics, and finance. The primary objective in these applications is to maximize the average reward. Real-world scenarios often necessitate adherence to specific co…

    Submitted 17 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.02042

    Journal ref: Foundations and Trends in Optimization: Vol. 6: No. 4, pp 193-298, 2024

  18. arXiv:2406.10367  [pdf, other]

    cs.LG

    Disentangled Hyperbolic Representation Learning for Heterogeneous Graphs

    Authors: Qijie Bai, Changli Nie, Haiwei Zhang, Zhicheng Dou, Xiaojie Yuan

    Abstract: Heterogeneous graphs have attracted a lot of research interest recently due to their success in representing complex real-world systems. However, existing methods have two pain points in embedding them into low-dimensional spaces: the mixing of structural and semantic information, and the distributional mismatch between data and embedding spaces. These two challenges require representation methods…

    Submitted 14 June, 2024; originally announced June 2024.

  19. arXiv:2406.05551  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Autoregressive Diffusion Transformer for Text-to-Speech Synthesis

    Authors: Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li

    Abstract: Audio language models have recently emerged as a promising approach for various audio generation tasks, relying on audio tokenizers to encode waveforms into sequences of discrete symbols. Audio tokenization often poses a necessary compromise between code bitrate and reconstruction accuracy. When dealing with low-bitrate audio codes, language models are constrained to process only a subset of the i…

    Submitted 8 June, 2024; originally announced June 2024.

  20. arXiv:2406.04679  [pdf, other]

    eess.IV cs.CV

    XctDiff: Reconstruction of CT Images with Consistent Anatomical Structures from a Single Radiographic Projection Image

    Authors: Qingze Bai, Tiange Liu, Zhi Liu, Yubing Tong, Drew Torigian, Jayaram Udupa

    Abstract: In this paper, we present XctDiff, an algorithm framework for reconstructing CT from a single radiograph, which decomposes the reconstruction process into two easily controllable tasks: feature extraction and CT reconstruction. Specifically, we first design a progressive feature extraction strategy that is able to extract robust 3D priors from radiographs. Then, we use the extracted prior informat…

    Submitted 13 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  21. arXiv:2404.11869  [pdf, other]

    cs.LG cs.SI

    An Efficient Loop and Clique Coarsening Algorithm for Graph Classification

    Authors: Xiaorui Qi, Qijie Bai, Yanlong Wen, Haiwei Zhang, Xiaojie Yuan

    Abstract: Graph Transformers (GTs) have made remarkable achievements in graph-level tasks. However, most existing works regard graph structures as a form of guidance or bias for enhancing node representations, which focuses on node-central perspectives and lacks explicit representations of edges and structures. One natural question arises as to whether we can leverage a hypernode to represent some structure…

    Submitted 9 December, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  22. arXiv:2404.04906  [pdf, other]

    cs.HC cs.IR

    Balancing Information Perception with Yin-Yang: Agent-Based Information Neutrality Model for Recommendation Systems

    Authors: Mengyan Wang, Yuxuan Hu, Shiqing Wu, Weihua Li, Quan Bai, Verica Rupar

    Abstract: While preference-based recommendation algorithms effectively enhance user engagement by recommending personalized content, they often result in the creation of "filter bubbles". These bubbles restrict the range of information users interact with, inadvertently reinforcing their existing viewpoints. Previous research has focused on modifying these underlying algorithms to tackle this issue. Yet,…

    Submitted 7 April, 2024; originally announced April 2024.

  23. arXiv:2402.15525  [pdf, other]

    cs.CL cs.CY

    Detecting misinformation through Framing Theory: the Frame Element-based Model

    Authors: Guan Wang, Rebecca Frederick, Jinglong Duan, William Wong, Verica Rupar, Weihua Li, Quan Bai

    Abstract: In this paper, we delve into the rapidly evolving challenge of misinformation detection, with a specific focus on the nuanced manipulation of narrative frames - an under-explored area within the AI community. The potential for Generative AI models to generate misleading narratives underscores the urgency of this problem. Drawing from communication and framing theories, we posit that the presentati…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures, 7 tables

  24. arXiv:2402.15289  [pdf, other]

    cs.CL cs.LG

    Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models

    Authors: Shunyu Liu, Jie Zhou, Qunxi Zhu, Qin Chen, Qingchun Bai, Jun Xiao, Liang He

    Abstract: Aspect-Based Sentiment Analysis (ABSA) stands as a crucial task in predicting the sentiment polarity associated with identified aspects within text. However, a notable challenge in ABSA lies in precisely determining the aspects' boundaries (start and end indices), especially for long ones, due to users' colloquial expressions. We propose DiffusionABSA, a novel diffusion model tailored for ABSA, wh…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to LREC-COLING 2024, submission version

  25. arXiv:2402.14000  [pdf, other]

    cs.CV

    Real-time 3D-aware Portrait Editing from a Single Image

    Authors: Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen

    Abstract: This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner. To this end, a lightweight module is distilled from a 3D portrait generator and a text-to-image model, which provide prior knowledge of face geometry and superior editing capability, respectively. Such a design brings two comp…

    Submitted 18 July, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ECCV 2024 camera-ready version. Project page: https://github.com/EzioBy/3dpe

  26. arXiv:2402.02042  [pdf, ps, other]

    cs.LG cs.AI

    Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: This paper explores the realm of infinite horizon average reward Constrained Markov Decision Processes (CMDPs). To the best of our knowledge, this work is the first to delve into the regret and constraint violation analysis of average reward CMDPs with a general policy parametrization. To address this challenge, we propose a primal dual-based policy gradient algorithm that adeptly manages the cons…

    Submitted 30 October, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Journal ref: NeurIPS 2024

  27. arXiv:2312.02877  [pdf, other]

    cs.CV

    DIPR: Efficient Point Cloud Registration via Dynamic Iteration

    Authors: Yang Ai, Qiang Bai, Jindong Li, Xi Yang

    Abstract: Point cloud registration (PCR) is an essential task in 3D vision. Existing methods achieve increasingly higher accuracy. However, a large proportion of non-overlapping points in point cloud registration consume a lot of computational resources while negatively affecting registration accuracy. To overcome this challenge, we introduce a novel Efficient Point Cloud Registration via Dynamic Iteration…

    Submitted 24 August, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  28. arXiv:2310.04342  [pdf, other]

    cs.DB cs.NI

    Minerva: Decentralized Collaborative Query Processing over InterPlanetary File System

    Authors: Zhiyi Yao, Bowen Ding, Qianlan Bai, Yuedong Xu

    Abstract: Data silos create barriers in accessing and utilizing data dispersed over networks. Directly sharing data easily suffers from the long downloading time, the single point failure and the untraceable data usage. In this paper, we present Minerva, a peer-to-peer cross-cluster data query system based on InterPlanetary File System (IPFS). Minerva makes use of the distributed Hash table (DHT) lookup to…

    Submitted 8 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

  29. arXiv:2309.11730  [pdf, other]

    eess.AS cs.SD

    Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

    Authors: Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li

    Abstract: Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets. To boost the system performance, researchers leverage large pretrained models such as WavLM to transfer learned high-level features to the downstream speaker recognition task. However, this approach introduces extra parameters as the pretrained model remains in the inference s…

    Submitted 26 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: submitted to ICASSP 2024

  30. arXiv:2309.01922  [pdf, ps, other]

    cs.LG cs.AI

    Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes

    Authors: Qinbo Bai, Washim Uddin Mondal, Vaneet Aggarwal

    Abstract: In this paper, we consider an infinite horizon average reward Markov Decision Process (MDP). Distinguishing itself from existing works within this context, our approach harnesses the power of the general policy gradient-based algorithm, liberating it from the constraints of assuming a linear MDP structure. We propose a policy gradient-based algorithm and show its global convergence property. We th…

    Submitted 2 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Journal ref: AAAI 2024

  31. arXiv:2308.07926  [pdf, other]

    cs.CV

    CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

    Authors: Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen

    Abstract: We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fi…

    Submitted 12 December, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code: https://github.com/qiuyu96/CoDeF

  32. arXiv:2307.02797  [pdf, other]

    cs.IR cs.AI

    BHEISR: Nudging from Bias to Balance -- Promoting Belief Harmony by Eliminating Ideological Segregation in Knowledge-based Recommendations

    Authors: Mengyan Wang, Yuxuan Hu, Zihan Yuan, Chenting Jiang, Weihua Li, Shiqing Wu, Quan Bai

    Abstract: In the realm of personalized recommendation systems, the increasing concern is the amplification of belief imbalance and user biases, a phenomenon primarily attributed to the filter bubble. Addressing this critical issue, we introduce an innovative intermediate agency (BHEISR) between users and existing recommendation systems to attenuate the negative repercussions of the filter bubble effect in e…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 26 pages

    MSC Class: 68T07 ACM Class: I.2.6; I.2.7

  33. arXiv:2306.05537  [pdf, other]

    cs.CL

    AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization

    Authors: Guan Wang, Weihua Li, Edmund M-K. Lai, Quan Bai

    Abstract: The rapid growth of information on the Internet has led to an overwhelming amount of opinions and comments on various activities, products, and services. This makes it difficult and time-consuming for users to process all the available information when making decisions. Text summarization, a Natural Language Processing (NLP) task, has been widely explored to help users quickly retrieve relevant in…

    Submitted 25 May, 2023; originally announced June 2023.

    Comments: 21 pages, 4 figures, 7 tables

  34. arXiv:2305.08272  [pdf, other]

    cs.DB

    QueryBooster: Improving SQL Performance Using Middleware Services for Human-Centered Query Rewriting

    Authors: Qiushi Bai, Sadeem Alsudais, Chen Li

    Abstract: SQL query performance is critical in database applications, and query rewriting is a technique that transforms an original query into an equivalent query with a better performance. In a wide range of database-supported systems, there is a unique problem where both the application and database layer are black boxes, and the developers need to use their knowledge about the data and domain to rewrite…

    Submitted 14 May, 2023; originally announced May 2023.

  35. HGWaveNet: A Hyperbolic Graph Neural Network for Temporal Link Prediction

    Authors: Qijie Bai, Changli Nie, Haiwei Zhang, Dongming Zhao, Xiaojie Yuan

    Abstract: Temporal link prediction, aiming to predict future edges between paired nodes in a dynamic graph, is of vital importance in diverse applications. However, existing methods are mainly built upon uniform Euclidean space, which has been found to conflict with the power-law distributions of real-world graphs and to be unable to represent the hierarchical connections between nodes effectively. With respec…

    Submitted 3 May, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: Accepted by Web Conference (WWW) 2023

    Journal ref: WWW '23: Proceedings of the ACM Web Conference 2023 (523-532)

  36. $\text{H}^2\text{TNE}$: Temporal Heterogeneous Information Network Embedding in Hyperbolic Spaces

    Authors: Qijie Bai, Jiawen Guo, Haiwei Zhang, Changli Nie, Lin Zhang, Xiaojie Yuan

    Abstract: Temporal heterogeneous information network (temporal HIN) embedding, aiming to represent various types of nodes of different timestamps into low dimensional spaces while preserving structural and semantic information, is of vital importance in diverse real-life tasks. Researchers have made great efforts on temporal HIN embedding in Euclidean spaces and got some considerable achievements. However,…

    Submitted 14 June, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

    Journal ref: The Semantic Web-ISWC 2022: 21st International Semantic Web Conference, Virtual Event, October 23-27, 2022, Proceedings (pp. 179-195)

  37. arXiv:2304.01999  [pdf, other]

    cs.CV

    Revisiting the Evaluation of Image Synthesis with GANs

    Authors: Mengping Yang, Ceyuan Yang, Yichi Zhang, Qingyan Bai, Yujun Shen, Bo Dai

    Abstract: A good metric, which promises a reliable comparison between solutions, is essential for any well-defined task. Unlike most vision tasks that have per-sample ground-truth, image synthesis tasks target generating unseen data and hence are usually evaluated through a distributional distance between one set of real samples and another set of generated samples. This study presents an empirical investig…

    Submitted 23 October, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 datasets and benchmarks track

  38. arXiv:2303.00815  [pdf, other]

    cs.CL cs.AI

    Soft Prompt Guided Joint Learning for Cross-Domain Sentiment Analysis

    Authors: Jingli Shi, Weihua Li, Quan Bai, Yi Yang, Jianhua Jiang

    Abstract: Aspect term extraction is a fundamental task in fine-grained sentiment analysis, which aims at detecting customer's opinion targets from reviews on product or service. The traditional supervised models can achieve promising results with annotated datasets, however, the performance dramatically decreases when they are applied to the task of cross-domain aspect term extraction. Existing cross-domain…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 22 pages

  39. arXiv:2302.08505  [pdf, other]

    cs.CV cs.AI

    Rapid-Motion-Track: Markerless Tracking of Fast Human Motion with Deeper Learning

    Authors: Renjie Li, Chun Yu Lao, Rebecca St. George, Katherine Lawler, Saurabh Garg, Son N. Tran, Quan Bai, Jane Alty

    Abstract: Objective: The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are us…

    Submitted 18 January, 2023; originally announced February 2023.

  40. arXiv:2302.01443  [pdf, other]

    cs.AI

    DOR: A Novel Dual-Observation-Based Approach for News Recommendation Systems

    Authors: Mengyan Wang, Weihua Li, Jingli Shi, Shiqing Wu, Quan Bai

    Abstract: Online social media platforms offer access to a vast amount of information, but sifting through the abundance of news can be overwhelming and tiring for readers. Personalised recommendation algorithms can help users find information that interests them. However, most existing models rely solely on observations of user behaviour, such as viewing history, ignoring the connections between the news an…

    Submitted 2 February, 2023; originally announced February 2023.

    MSC Class: 68T07

  41. arXiv:2212.03752  [pdf, other]

    cs.CV eess.IV

    GLeaD: Improving GANs with A Generator-Leading Task

    Authors: Qingyan Bai, Ceyuan Yang, Yinghao Xu, Xihui Liu, Yujiu Yang, Yujun Shen

    Abstract: Generative adversarial network (GAN) is formulated as a two-player game between a generator (G) and a discriminator (D), where D is asked to differentiate whether an image comes from real data or is produced by G. Under such a formulation, D plays as the rule maker and hence tends to dominate the competition. Towards a fairer game in GANs, we propose a new paradigm for adversarial training, which…

    Submitted 6 June, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Project page: https://ezioby.github.io/glead/ Code: https://github.com/EzioBy/glead/

  42. arXiv:2212.00007  [pdf, other]

    cs.HC cs.AI cs.LG

    A Light-weight, Effective and Efficient Model for Label Aggregation in Crowdsourcing

    Authors: Yi Yang, Zhong-Qiu Zhao, Quan Bai, Qing Liu, Weihua Li

    Abstract: Due to the noise in crowdsourced labels, label aggregation (LA) has emerged as a standard procedure to post-process crowdsourced labels. LA methods estimate true labels from crowdsourced labels by modeling worker qualities. Most existing LA methods are iterative in nature. They need to traverse all the crowdsourced labels multiple times in order to jointly and iteratively update true labels and w…

    Submitted 19 November, 2022; originally announced December 2022.

  43. arXiv:2211.15956  [pdf, other]

    cs.LG cs.AI

    Offline Reinforcement Learning with Closed-Form Policy Improvement Operators

    Authors: Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang

    Abstract: Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while constrained by the behavior policy to avoid a significant distributional shift. In this paper, we propose our closed-form policy improvement operators. We make a n…

    Submitted 22 July, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted at ICML 2023

  44. arXiv:2211.00185  [pdf, other]

    cs.LG cs.AI cs.CV

    Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models

    Authors: Wenli Yang, Guan Huang, Renjie Li, Jiahao Yu, Yanyu Chen, Quan Bai, Beyong Kang

    Abstract: Convolutional neural network (CNN) models have seen advanced improvements in performance in various domains, but lack of interpretability is a major barrier to assurance and regulation during operation for acceptance and deployment of AI-assisted applications. There have been many works on input interpretability focusing on analyzing the input-output relations, but the internal logic of models has…

    Submitted 31 October, 2022; originally announced November 2022.

  45. arXiv:2210.09549  [pdf, other]

    cs.CV cs.LG

    Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation

    Authors: Ruijun Li, Weihua Li, Yi Yang, Hanyu Wei, Jianhua Jiang, Quan Bai

    Abstract: Recently, diffusion models have been proven to perform remarkably well in text-to-image synthesis tasks in a number of studies, immediately presenting new study opportunities for image generation. Google's Imagen follows this research trend and outperforms DALLE2 as the best model for text-to-image generation. However, Imagen merely uses a T5 language model for text processing, which cannot ensure…

    Submitted 17 October, 2022; originally announced October 2022.

    MSC Class: 94A08 ACM Class: I.4.0

  46. arXiv:2208.02189  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis

    Authors: Qibing Bai, Tom Ko, Yu Zhang

    Abstract: In human speech, the attitude of a speaker cannot be fully expressed only by the textual content. It has to come along with the intonation. Declarative questions are commonly used in daily Cantonese conversations, and they are usually uttered with rising intonation. Vanilla neural text-to-speech (TTS) systems are not capable of synthesizing rising intonation for these sentences due to the loss of…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: Accepted by INTERSPEECH 2022

  47. arXiv:2207.02376  [pdf, other]

    cs.CV cs.AI

    A Comprehensive Review on Deep Supervision: Theories and Applications

    Authors: Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai

    Abstract: Deep supervision, also known as 'intermediate supervision' or 'auxiliary supervision', adds supervision at hidden layers of a neural network. This technique has recently been increasingly applied in deep neural network learning systems for various computer vision applications. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishi…

    Submitted 5 July, 2022; originally announced July 2022.

  48. arXiv:2206.05850  [pdf, other]

    cs.LG cs.AI eess.SY

    Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm

    Authors: Qinbo Bai, Amrit Singh Bedi, Vaneet Aggarwal

    Abstract: We consider the problem of constrained Markov decision process (CMDP) in continuous state-action spaces where the goal is to maximize the expected cumulative reward subject to some constraints. We propose a novel Conservative Natural Policy Gradient Primal-Dual Algorithm (C-NPG-PD) to achieve zero constraint violation while achieving state-of-the-art convergence results for the objective value fu…

    Submitted 16 May, 2024; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: The latest version fixed the error in the proof of Lemma 4 in AAAI2023

  49. arXiv:2205.15514  [pdf, other]

    cs.CL

    A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis

    Authors: Qi Zhang, Jie Zhou, Qin Chen, Qingchun Bai, Jun Xiao, Liang He

    Abstract: Structured sentiment analysis, which aims to extract the complex semantic structures such as holders, expressions, targets, and polarities, has obtained widespread attention from both industry and academia. Unfortunately, the existing structured sentiment analysis datasets refer to a few languages and are relatively small, limiting neural network models' performance. In this paper, we focus on the…

    Submitted 30 May, 2022; originally announced May 2022.

  50. Enhancing Event-Level Sentiment Analysis with Structured Arguments

    Authors: Qi Zhang, Jie Zhou, Qin Chen, Qinchun Bai, Liang He

    Abstract: Previous studies about event-level sentiment analysis (SA) usually model the event as a topic, a category or target terms, while the structured arguments (e.g., subject, object, time and location) that have potential effects on the sentiment are not well studied. In this paper, we redefine the task as structured event-level SA and propose an End-to-End Event-level Sentiment Analysis (…

    Submitted 30 May, 2022; originally announced May 2022.