Showing 1–50 of 70 results for author: Ruan, Y

  1. arXiv:2507.04952  [pdf, ps, other]

    cs.CL cs.SE

    ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation

    Authors: Chenchen Zhang, Yuhang Li, Can Xu, Jiaheng Liu, Ao Liu, Shihui Hu, Dengpeng Wu, Guanhua Huang, Kejiao Li, Qi Yi, Ruibin Xiong, Haotian Zhu, Yuanxing Zhang, Yuhao Jiang, Yue Zhang, Zenan Xu, Bohui Zhai, Guoxiang He, Hebin Li, Jie Zhao, Le Zhang, Lingyun Tan, Pengyu Guo, Xianshu Pang, Yang Ruan, et al. (7 additional authors not shown)

    Abstract: The generative capabilities of Large Language Models (LLMs) are rapidly expanding from static code to dynamic, interactive visual artifacts. This progress is bottlenecked by a critical evaluation gap: established benchmarks focus on algorithmic correctness and are blind to the visual fidelity and interactive integrity that define modern user experiences. To bridge this gap, we introduce ArtifactsB…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2506.23577  [pdf, ps, other]

    cs.CV

    StackCLIP: Clustering-Driven Stacked Prompt in Zero-Shot Industrial Anomaly Detection

    Authors: Yanning Hou, Yanran Ruan, Junfa Li, Shanshan Wang, Jianfeng Qiu, Ke Xu

    Abstract: Enhancing the alignment between text and image features in the CLIP model is a critical challenge in zero-shot industrial anomaly detection tasks. Recent studies predominantly utilize specific category prompts during pretraining, which can cause overfitting to the training categories and limit model generalization. To address this, we propose a method that transforms category names through multica…

    Submitted 5 July, 2025; v1 submitted 30 June, 2025; originally announced June 2025.

  3. arXiv:2506.20599  [pdf, ps, other]

    cs.CV

    SFNet: Fusion of Spatial and Frequency-Domain Features for Remote Sensing Image Forgery Detection

    Authors: Ji Qi, Xinchang Zhang, Dingqi Ye, Yongjia Ruan, Xin Guo, Shaowen Wang, Haifeng Li

    Abstract: The rapid advancement of generative artificial intelligence is producing fake remote sensing imagery (RSI) that is increasingly difficult to detect, potentially leading to erroneous intelligence, fake news, and even conspiracy theories. Existing forgery detection methods typically rely on single visual features to capture predefined artifacts, such as spatial-domain cues to detect forged objects l…

    Submitted 25 June, 2025; originally announced June 2025.

  4. arXiv:2505.12325  [pdf, other]

    cs.LG

    Neural Graduated Assignment for Maximum Common Edge Subgraphs

    Authors: Chaolong Ying, Yingqi Ruan, Xuemin Chen, Yaomin Wang, Tianshu Yu

    Abstract: The Maximum Common Edge Subgraph (MCES) problem is a crucial challenge with significant implications in domains such as biology and chemistry. Traditional approaches, which include transformations into max-clique and search-based algorithms, suffer from scalability issues when dealing with larger instances. This paper introduces "Neural Graduated Assignment" (NGA), a simple, scalable, unsupervis…

    Submitted 18 May, 2025; originally announced May 2025.

  5. arXiv:2505.08120  [pdf, other]

    cs.CL cs.LG

    Putting It All into Context: Simplifying Agents with LCLMs

    Authors: Mingjian Jiang, Yangjun Ruan, Luis Lastras, Pavan Kapanipathi, Tatsunori Hashimoto

    Abstract: Recent advances in language model (LM) agents have demonstrated significant potential for automating complex real-world tasks. To make progress on these difficult tasks, LM agent architectures have become increasingly complex, often incorporating multi-step retrieval tools, multiple agents, and scaffolding adapted to the underlying LM. In this work, we investigate whether all of this complexity is…

    Submitted 12 May, 2025; originally announced May 2025.

  6. arXiv:2503.18866  [pdf, other]

    cs.LG cs.AI cs.CL

    Reasoning to Learn from Latent Thoughts

    Authors: Yangjun Ruan, Neil Band, Chris J. Maddison, Tatsunori Hashimoto

    Abstract: Compute scaling for language model (LM) pretraining has outpaced the growth of human-written texts, leading to concerns that data will become the bottleneck to LM scaling. To continue scaling pretraining in this data-constrained regime, we propose that explicitly modeling and inferring the latent thoughts that underlie the text generation process can significantly improve pretraining data efficien…

    Submitted 24 March, 2025; originally announced March 2025.

  7. arXiv:2503.05505  [pdf, other]

    cs.CL

    Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework

    Authors: Yusong Ke, Hongru Lin, Yuting Ruan, Junya Tang, Li Li

    Abstract: Large language models (LLMs) are increasingly adopted in medical question-answering (QA) scenarios. However, LLMs can generate hallucinations and nonfactual information, undermining their trustworthiness in high-stakes medical tasks. Conformal Prediction (CP) provides a statistically rigorous framework for marginal (average) coverage guarantees but has limited exploration in medical QA. This paper…

    Submitted 8 May, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

    Comments: Published by Mathematics

  8. arXiv:2502.10510  [pdf, other]

    cs.LG stat.ML

    MixMin: Finding Data Mixtures via Convex Minimization

    Authors: Anvith Thudi, Evianne Rovers, Yangjun Ruan, Tristan Thrush, Chris J. Maddison

    Abstract: Modern machine learning pipelines are increasingly combining and mixing data from diverse and disparate sources, e.g., pre-training large language models. Yet, finding the optimal data mixture is a challenging and open problem. We formalize this data mixing problem as a bi-level objective: the best mixture is the one that would lead to the best model for a downstream objective. Unfortunately, this…

    Submitted 22 May, 2025; v1 submitted 14 February, 2025; originally announced February 2025.

    Comments: Proceedings of the 42nd International Conference on Machine Learning

  9. arXiv:2501.15098  [pdf, other]

    cs.LG cs.AI

    CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter

    Authors: Zihang Li, Yangdong Ruan, Wenjun Liu, Zhengyang Wang, Tong Yang

    Abstract: Although retrieval-augmented generation (RAG) significantly improves generation quality by retrieving external knowledge bases and integrating generated content, it faces computational efficiency bottlenecks, particularly in knowledge retrieval tasks involving hierarchical structures for Tree-RAG. This paper proposes a Tree-RAG acceleration method based on the improved Cuckoo Filter, which optimize…

    Submitted 25 January, 2025; originally announced January 2025.

  10. arXiv:2501.04389  [pdf, other]

    cs.IT

    Towards accurate and reliable ICU outcome prediction: a multimodal learning framework based on belief function theory using structured EHRs and free-text notes

    Authors: Yucheng Ruan, Daniel J. Tan, See Kiong Ng, Ling Huang, Mengling Feng

    Abstract: Accurate Intensive Care Unit (ICU) outcome prediction is critical for improving patient treatment quality and ICU resource allocation. Existing research mainly focuses on structured data, e.g. demographics and vital signs, and lacks effective frameworks to integrate clinical notes from heterogeneous electronic health records (EHRs). This study aims to explore a multimodal framework based on belief…

    Submitted 25 February, 2025; v1 submitted 8 January, 2025; originally announced January 2025.

  11. arXiv:2411.15240  [pdf]

    cs.LG cs.AI cs.HC q-bio.QM

    Foundation Models for Wearable Movement Data in Mental Health Research

    Authors: Franklin Y. Ruan, Aiwei Zhang, Jenny Y. Oh, SouYoung Jin, Nicholas C. Jacobson

    Abstract: Pretrained foundation models and transformer architectures have driven the success of large language models (LLMs) and other modern AI breakthroughs. However, similar advancements in health data modeling remain limited due to the need for innovative adaptations. Wearable movement data offers a valuable avenue for exploration, as it's a core feature in nearly all commercial smartwatches, well estab…

    Submitted 28 June, 2025; v1 submitted 21 November, 2024; originally announced November 2024.

  12. arXiv:2411.09852  [pdf, ps, other]

    cs.IR cs.AI cs.LG

    InterFormer: Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction

    Authors: Zhichen Zeng, Xiaolong Liu, Mengyue Hang, Xiaoyi Liu, Qinghai Zhou, Chaofei Yang, Yiqun Liu, Yichen Ruan, Laming Chen, Yuxin Chen, Yujia Hao, Jiaqi Xu, Jade Nie, Xi Liu, Buyun Zhang, Wei Wen, Siyang Yuan, Hang Yin, Xin Zhang, Kai Wang, Wen-Yen Chen, Yiping Han, Huayu Li, Chunzhi Yang, Bo Long, et al. (3 additional authors not shown)

    Abstract: Click-through rate (CTR) prediction, which predicts the probability of a user clicking an ad, is a fundamental task in recommender systems. The emergence of heterogeneous information, such as user profile and behavior sequences, depicts user interests from different aspects. A mutually beneficial integration of heterogeneous information is the cornerstone towards the success of CTR prediction. How…

    Submitted 25 June, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: 11 pages, 6 figures

  13. arXiv:2410.20783  [pdf, other]

    cs.CL cs.AI cs.LG

    Graph-based Uncertainty Metrics for Long-form Language Model Outputs

    Authors: Mingjian Jiang, Yangjun Ruan, Prasanna Sattigeri, Salim Roukos, Tatsunori Hashimoto

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly improved text generation capabilities, but these systems are still known to hallucinate, and granular uncertainty estimation for long-form LLM generations remains challenging. In this work, we propose Graph Uncertainty -- which represents the relationship between LLM generations and claims within them as a bipartite graph and e…

    Submitted 28 October, 2024; originally announced October 2024.

    Comments: Accepted as a Spotlight paper at NeurIPS 2024

  14. arXiv:2409.01957  [pdf, ps, other]

    cs.IT eess.SP

    Power Control and Random Serving Mode Allocation for CJT-NCJT Hybrid Mode Enabled Cell-Free Massive MIMO With Limited Fronthauls

    Authors: Hangyu Zhang, Rui Zhang, Yongzhao Li, Yuhan Ruan, Tao Li, Dong Yang

    Abstract: With a great potential of improving the service fairness and quality for user equipments (UEs), cell-free massive multiple-input multiple-output (mMIMO) has been regarded as an emerging candidate for 6G network architectures. Under ideal assumptions, the coherent joint transmission (CJT) serving mode has been considered as an optimal option for cell-free mMIMO systems, since it can achieve coheren…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: 6 pages, 2 figures, accepted by GLOBECOM 2024

  15. arXiv:2409.00991  [pdf, other]

    cs.CV cs.AI

    3D Priors-Guided Diffusion for Blind Face Restoration

    Authors: Xiaobin Lu, Xiaobin Hu, Jun Luo, Ben Zhu, Yaping Ruan, Wenqi Ren

    Abstract: Blind face restoration endeavors to restore a clear face image from a degraded counterpart. Recent approaches employing Generative Adversarial Networks (GANs) as priors have demonstrated remarkable success in this field. However, these methods encounter challenges in achieving a balance between realism and fidelity, particularly in complex degradation scenarios. To inherit the exceptional realism…

    Submitted 12 September, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: This paper was accepted by ACM MM 2024, and the project page is accessible at: https://github.com/838143396/3Diffusion

  16. arXiv:2408.10548  [pdf, other]

    cs.CL

    Language Modeling on Tabular Data: A Survey of Foundations, Techniques and Evolution

    Authors: Yucheng Ruan, Xiang Lan, Jingying Ma, Yizhi Dong, Kai He, Mengling Feng

    Abstract: Tabular data, a prevalent data type across various domains, presents unique challenges due to its heterogeneous nature and complex structural relationships. Achieving high predictive performance and robustness in tabular data analysis holds significant promise for numerous applications. Influenced by recent advancements in natural language processing, particularly transformer architectures, new me…

    Submitted 20 August, 2024; originally announced August 2024.

  17. arXiv:2406.13281  [pdf, other]

    cs.CV

    ECAFormer: Low-light Image Enhancement using Cross Attention

    Authors: Yudi Ruan, Hao Ma, Weikai Li, Xiao Wang

    Abstract: Low-light image enhancement (LLIE) is critical in computer vision. Existing LLIE methods often fail to discover the underlying relationships between different sub-components, causing the loss of complementary information between multiple modules and network layers, ultimately resulting in the loss of image details. To address this shortcoming, we design a hierarchical mutual Enhancement via a Cross Atte…

    Submitted 22 December, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  18. arXiv:2406.13161  [pdf, other]

    cs.AI cs.CL cs.LG cs.PL

    APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts

    Authors: Honghua Dong, Qidong Su, Yubo Gao, Zhaoyu Li, Yangjun Ruan, Gennady Pekhimenko, Chris J. Maddison, Xujie Si

    Abstract: Large Language Models (LLMs) have become increasingly capable of handling diverse tasks with the aid of well-crafted prompts and integration of external tools, but as task complexity rises, the workflow involving LLMs can be complicated and thus challenging to implement and maintain. To address this challenge, we propose APPL, A Prompt Programming Language that acts as a bridge between computer pr…

    Submitted 18 June, 2024; originally announced June 2024.

  19. arXiv:2406.02240  [pdf, other]

    cs.NI

    Quantum Computing in Wireless Communications and Networking: A Tutorial-cum-Survey

    Authors: Wei Zhao, Tangjie Weng, Yue Ruan, Zhi Liu, Xuangou Wu, Xiao Zheng, Nei Kato

    Abstract: Owing to its outstanding parallel computing capabilities, quantum computing (QC) has been a subject of continuous attention. With the gradual maturation of QC platforms, it has increasingly played a significant role in various fields such as transportation, pharmaceuticals, and industrial manufacturing, achieving unprecedented milestones. In modern society, wireless communication stands as an indis…

    Submitted 19 November, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  20. arXiv:2405.10938  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Observational Scaling Laws and the Predictability of Language Model Performance

    Authors: Yangjun Ruan, Chris J. Maddison, Tatsunori Hashimoto

    Abstract: Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of training models across many different scales has limited their use. We propose an alternative, observational approach that bypasses model training and instead builds scaling laws from ~100 publically…

    Submitted 1 October, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted at NeurIPS 2024 as a spotlight

  21. arXiv:2404.13841  [pdf, other]

    cs.LG cs.AI

    Fair Concurrent Training of Multiple Models in Federated Learning

    Authors: Marie Siew, Haoran Zhang, Jong-Ik Park, Yuezhou Liu, Yichen Ruan, Lili Su, Stratis Ioannidis, Edmund Yeh, Carlee Joe-Wong

    Abstract: Federated learning (FL) enables collaborative learning across multiple clients. In most FL work, all clients train a single learning task. However, the recent proliferation of FL applications may increasingly require multiple FL tasks to be trained simultaneously, sharing clients' computing and communication resources, which we call Multiple-Model Federated Learning (MMFL). Current MMFL algorithms…

    Submitted 19 May, 2025; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted at IEEE Transactions on Networking (TON) 2025

  22. arXiv:2403.05268  [pdf, ps, other]

    cs.CL cs.LG

    Deep Prompt Multi-task Network for Abuse Language Detection

    Authors: Jian Zhu, Yuping Ruan, Jingfei Chang, Wenhui Sun, Hui Wan, Jian Long, Cheng Luo

    Abstract: The detection of abusive language remains a long-standing challenge with the extensive use of social networks. The detection task of abusive language suffers from limited accuracy. We argue that the existing detection methods utilize the fine-tuning technique of the pre-trained language models (PLMs) to handle downstream tasks. Hence, these methods fail to stimulate the general knowledge of the PL…

    Submitted 24 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by the International Conference on Pattern Recognition (ICPR) 2024

  23. arXiv:2403.04246  [pdf, other]

    stat.ML cs.AI cs.LG

    Efficient CNN-LSTM based Parameter Estimation of Levy Driven Stochastic Differential Equations

    Authors: Shuaiyu Li, Yang Ruan, Changzhou Long, Yuzhong Cheng

    Abstract: This study addresses the challenges in parameter estimation of stochastic differential equations driven by non-Gaussian noises, which are critical in understanding dynamic phenomena such as price fluctuations and the spread of infectious diseases. Previous research highlighted the potential of LSTM networks in estimating parameters of alpha stable Levy driven SDEs but faced limitations including h…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 2023 International Conference on Machine Learning and Applications (ICMLA)

  24. arXiv:2312.10386  [pdf, other]

    cs.LG

    RedCore: Relative Advantage Aware Cross-modal Representation Learning for Missing Modalities with Imbalanced Missing Rates

    Authors: Jun Sun, Xinxin Zhang, Shoukang Han, Yu-ping Ruan, Taihao Li

    Abstract: Multimodal learning is susceptible to modality missing, which poses a major obstacle for its practical applications and, thus, invigorates increasing research interest. In this paper, we investigate two challenging problems: 1) when modality missing exists in the training data, how to exploit the incomplete samples while guaranteeing that they are properly supervised? 2) when the missing rates of…

    Submitted 16 December, 2023; originally announced December 2023.

  25. arXiv:2312.07048  [pdf, other]

    cs.CV

    Edge Wasserstein Distance Loss for Oriented Object Detection

    Authors: Yuke Zhu, Yumeng Ruan, Zihua Xiong, Sheng Guo

    Abstract: Regression loss design is an essential topic for oriented object detection. Due to the periodicity of the angle and the ambiguity of width and height definition, traditional L1-distance loss and its variants have suffered from the metric discontinuity and the square-like problem. As a solution, the distribution based methods show significant advantages by representing oriented boxes as distri…

    Submitted 12 December, 2023; originally announced December 2023.

  26. arXiv:2311.18358  [pdf, other]

    cs.CV

    TIDE: Test Time Few Shot Object Detection

    Authors: Weikai Li, Hongfeng Wei, Yanlai Wu, Jie Yang, Yudi Ruan, Yuan Li, Ying Tang

    Abstract: Few-shot object detection (FSOD) aims to extract semantic knowledge from limited object instances of novel categories within a target domain. Recent advances in FSOD focus on fine-tuning the base model based on a few objects via meta-learning or data augmentation. Despite their success, the majority of them are grounded with parametric readjustment to generalize on novel objects, which face consid…

    Submitted 30 November, 2023; originally announced November 2023.

  27. arXiv:2310.13901  [pdf, other]

    cs.LG eess.SY

    Towards Hyperparameter-Agnostic DNN Training via Dynamical System Insights

    Authors: Carmel Fiscko, Aayushya Agarwal, Yihan Ruan, Soummya Kar, Larry Pileggi, Bruno Sinopoli

    Abstract: We present a stochastic first-order optimization method specialized for deep neural networks (DNNs), ECCO-DNN. This method models the optimization variable trajectory as a dynamical system and develops a discretization algorithm that adaptively selects step sizes based on the trajectory's shape. This provides two key insights: designing the dynamical system for fast continuous-time convergence and…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 25 pages, 11 figures

  28. On Synthetic Data for Back Translation

    Authors: Jiahao Xu, Yubin Ruan, Wei Bi, Guoping Huang, Shuming Shi, Lihui Chen, Lemao Liu

    Abstract: Back translation (BT) is one of the most significant technologies in NMT research fields. Existing attempts on BT share a common characteristic: they employ either beam search or random sampling to generate synthetic data with a backward model, but little work studies the role of synthetic data in the performance of BT. This motivates us to ask a fundamental question: what kind of synthetic da…

    Submitted 20 October, 2023; originally announced October 2023.

    Journal ref: In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 419–430, Seattle, United States. Association for Computational Linguistics

  29. arXiv:2310.05694  [pdf]

    cs.CL

    A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics

    Authors: Kai He, Rui Mao, Qika Lin, Yucheng Ruan, Xiang Lan, Mengling Feng, Erik Cambria

    Abstract: The utilization of large language models (LLMs) in the Healthcare domain has generated both excitement and concern due to their ability to effectively respond to free-text queries with certain professional knowledge. This survey outlines the capabilities of the currently developed LLMs for Healthcare and explicates their development process, with the aim of providing an overview of the development…

    Submitted 26 January, 2025; v1 submitted 9 October, 2023; originally announced October 2023.

  30. arXiv:2309.15817  [pdf, other]

    cs.AI cs.CL cs.LG

    Identifying the Risks of LM Agents with an LM-Emulated Sandbox

    Authors: Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto

    Abstract: Recent advances in Language Model (LM) agents and tool use, exemplified by applications like ChatGPT Plugins, enable a rich set of capabilities but also amplify potential risks - such as leaking private data or causing financial losses. Identifying these risks is labor-intensive, necessitating implementing the tools, setting up the environment for each test scenario manually, and finding risky cas…

    Submitted 17 May, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

  31. arXiv:2308.09346  [pdf, other]

    cs.CV

    Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching

    Authors: Jiazheng Xing, Mengmeng Wang, Yudi Ruan, Bofan Chen, Yaowei Guo, Boyu Mu, Guang Dai, Jingdong Wang, Yong Liu

    Abstract: Class prototype construction and matching are core aspects of few-shot action recognition. Previous methods mainly focus on designing spatiotemporal relation modeling modules or complex temporal alignment algorithms. Despite the promising results, they ignored the value of class prototype construction and matching, leading to unsatisfactory performance in recognizing similar categories in every ta…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV2023

  32. arXiv:2305.11304  [pdf, other]

    cs.LG

    pTSE: A Multi-model Ensemble Method for Probabilistic Time Series Forecasting

    Authors: Yunyi Zhou, Zhixuan Chu, Yijia Ruan, Ge Jin, Yuchen Huang, Sheng Li

    Abstract: Various probabilistic time series forecasting models have sprung up and shown remarkably good performance. However, the choice of model highly relies on the characteristics of the input time series and the fixed distribution that the model is based on. Due to the fact that the probability distributions cannot be averaged over different models straightforwardly, the current time series model ensemb…

    Submitted 30 August, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: The 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)

  33. PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba

    Authors: Jianying Wang, Tongliang Li, Haoze Song, Xinjun Yang, Wenchao Zhou, Feifei Li, Baoyue Yan, Qianqian Wu, Yukun Liang, Chengjun Ying, Yujie Wang, Baokai Chen, Chang Cai, Yubin Ruan, Xiaoyi Weng, Shibin Chen, Liang Yin, Chengzhong Yang, Xin Cai, Hongyan Xing, Nanlong Yu, Xiaofei Chen, Dapeng Huang, Jianling Sun

    Abstract: Cloud-native databases have become the de-facto choice for mission-critical applications on the cloud due to the need for high availability, resource elasticity, and cost efficiency. Meanwhile, driven by the increasing connectivity between data generation and analysis, users prefer a single database to efficiently process both OLTP and OLAP workloads, which enhances data freshness and reduces the…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 14 pages, 16 figures, to be published in ACM SIGMOD 2023

  34. arXiv:2305.05835  [pdf, other]

    eess.IV cs.CV cs.LG

    Reference-based OCT Angiogram Super-resolution with Learnable Texture Generation

    Authors: Yuyan Ruan, Dawei Yang, Ziqi Tang, An Ran Ran, Carol Y. Cheung, Hao Chen

    Abstract: Optical coherence tomography angiography (OCTA) is a new imaging modality to visualize retinal microvasculature and has been readily adopted in clinics. High-resolution OCT angiograms are important to qualitatively and quantitatively identify potential biomarkers for different retinal diseases accurately. However, one significant problem of OCTA is the inevitable decrease in resolution when increa…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 12 pages, 11 figures

    MSC Class: 68T07

    ACM Class: I.2; I.4

  35. arXiv:2303.17408  [pdf, other]

    cs.CL

    P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data

    Authors: Yucheng Ruan, Xiang Lan, Daniel J. Tan, Hairil Rizal Abdullah, Mengling Feng

    Abstract: Medical tabular data, abundant in Electronic Health Records (EHRs), is a valuable resource for diverse medical tasks such as risk prediction. While deep learning approaches, particularly transformer-based models, have shown remarkable performance in tabular data prediction, there are still problems remaining for existing work to be effectively adapted to the medical domain, such as ignoring unstruct…

    Submitted 10 April, 2025; v1 submitted 30 March, 2023; originally announced March 2023.

  36. arXiv:2302.03916  [pdf, other]

    cs.LG

    QS-ADN: Quasi-Supervised Artifact Disentanglement Network for Low-Dose CT Image Denoising by Local Similarity Among Unpaired Data

    Authors: Yuhui Ruan, Qiao Yuan, Chuang Niu, Chen Li, Yudong Yao, Ge Wang, Yueyang Teng

    Abstract: Deep learning has been successfully applied to low-dose CT (LDCT) image denoising for reducing potential radiation risk. However, the widely reported supervised LDCT denoising networks require a training set of paired images, which is expensive to obtain and cannot be perfectly simulated. Unsupervised learning utilizes unpaired data and is highly desirable for LDCT denoising. As an example, an art…

    Submitted 8 February, 2023; originally announced February 2023.

  37. arXiv:2211.09981  [pdf, other]

    cs.LG cs.AI stat.ML

    Weighted Ensemble Self-Supervised Learning

    Authors: Yangjun Ruan, Saurabh Singh, Warren Morningstar, Alexander A. Alemi, Sergey Ioffe, Ian Fischer, Joshua V. Dillon

    Abstract: Ensembling has proven to be a powerful technique for boosting model performance, uncertainty estimation, and robustness in supervised learning. Advances in self-supervised learning (SSL) enable leveraging large unlabeled corpora for state-of-the-art few-shot and supervised learning performance. In this paper, we explore how ensemble methods can improve recent SSL techniques by developing a framewo…

    Submitted 9 April, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted by ICLR 2023

  38. arXiv:2211.08682  [pdf, other]

    cs.CL

    Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models

    Authors: Wang Qi, Yu-Ping Ruan, Yuan Zuo, Taihao Li

    Abstract: Conventional fine-tuning encounters increasing difficulties given the size of current Pre-trained Language Models, which makes parameter-efficient tuning the focal point of frontier research. Previous methods in this field add tunable adapters into MHA and/or FFN of Transformer blocks to enable PLMs to achieve transferability. However, as an important part of Transformer architecture, the powe…

    Submitted 9 December, 2022; v1 submitted 16 November, 2022; originally announced November 2022.

  39. arXiv:2210.06361  [pdf, other]

    cs.CV

    MFFN: Multi-view Feature Fusion Network for Camouflaged Object Detection

    Authors: Dehua Zheng, Xiaochen Zheng, Laurence T. Yang, Yuan Gao, Chenlu Zhu, Yiheng Ruan

    Abstract: Recent research about camouflaged object detection (COD) aims to segment highly concealed objects hidden in complex surroundings. The tiny, fuzzy camouflaged objects result in visually indistinguishable properties. However, current single-view COD detectors are sensitive to background distractors. Therefore, blurred boundaries and variable shapes of the camouflaged objects are challenging to be fu…

    Submitted 19 October, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: In Proceedings of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

  40. arXiv:2206.08497  [pdf, other]

    cs.GR cs.CV

    Unsupervised Kinematic Motion Detection for Part-segmented 3D Shape Collections

    Authors: Xianghao Xu, Yifan Ruan, Srinath Sridhar, Daniel Ritchie

    Abstract: 3D models of manufactured objects are important for populating virtual worlds and for synthetic data generation for vision and robotics. To be most useful, such objects should be articulated: their parts should move when interacted with. While articulated object datasets exist, creating them is labor-intensive. Learning-based prediction of part motions can help, but all existing methods require an…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: SIGGRAPH 2022

  41. arXiv:2204.07763  [pdf, other]

    cs.SD cs.LG eess.AS

    UFRC: A Unified Framework for Reliable COVID-19 Detection on Crowdsourced Cough Audio

    Authors: Jiangeng Chang, Yucheng Ruan, Cui Shaoze, John Soong Tshon Yit, Mengling Feng

    Abstract: We suggested a unified system with core components of data augmentation, ImageNet-pretrained ResNet-50, cost-sensitive loss, deep ensemble learning, and uncertainty estimation to quickly and consistently detect COVID-19 using acoustic evidence. To increase the model's capacity to identify a minority class (infected samples), data augmentation and cost-sensitive loss are incorporated. In the COVID-…

    Submitted 30 June, 2022; v1 submitted 16 April, 2022; originally announced April 2022.

  42. arXiv:2202.08396  [pdf, other]

    cs.LG cs.AI cs.LO

    Augment with Care: Contrastive Learning for Combinatorial Problems

    Authors: Haonan Duan, Pashootan Vaezipoor, Max B. Paulus, Yangjun Ruan, Chris J. Maddison

    Abstract: Supervised learning can improve the design of state-of-the-art solvers for combinatorial problems, but labelling large numbers of combinatorial instances is often impractical due to exponential worst-case complexity. Inspired by the recent success of contrastive pre-training for images, we conduct a scientific study of the effect of augmentation design on contrastive pre-training for the Boolean s…

    Submitted 20 June, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  43. arXiv:2201.07366  [pdf, other]

    cs.CV

    TriCoLo: Trimodal Contrastive Loss for Text to Shape Retrieval

    Authors: Yue Ruan, Han-Hung Lee, Yiming Zhang, Ke Zhang, Angel X. Chang

    Abstract: Text-to-shape retrieval is an increasingly relevant problem with the growth of 3D shape data. Recent work on contrastive losses for learning joint embeddings over multimodal data has been successful at tasks such as retrieval and classification. Thus far, work on joint representation learning for 3D shapes and text has focused on improving embeddings through modeling of complex attention between r…

    Submitted 27 December, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted by WACV 2024

  44. arXiv:2201.00057  [pdf, other]

    cs.LG cs.AI cs.IT stat.ML

    Optimal Representations for Covariate Shift

    Authors: Yangjun Ruan, Yann Dubois, Chris J. Maddison

    Abstract: Machine learning systems often experience a distribution shift between training and testing. In this paper, we introduce a simple variational objective whose optima are exactly the set of all representations on which risk minimizers are guaranteed to be robust to any distribution shift that preserves the Bayes predictor, e.g., covariate shifts. Our objective has two components. First, a representa…

    Submitted 14 March, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

    Comments: Accepted at ICLR 2022

  45. arXiv:2112.06053  [pdf, other]

    cs.LG

    FedSoft: Soft Clustered Federated Learning with Proximal Local Updating

    Authors: Yichen Ruan, Carlee Joe-Wong

    Abstract: Traditionally, clustered federated learning groups clients with the same data distribution into a cluster, so that every client is uniquely associated with one data distribution and helps train a model for this distribution. We relax this hard association assumption to soft clustered federated learning, which allows every local dataset to follow a mixture of multiple source distributions. We propo…

    Submitted 22 March, 2022; v1 submitted 11 December, 2021; originally announced December 2021.

  46. arXiv:2105.14879  [pdf, other]

    cs.CL

    SemEval-2021 Task 4: Reading Comprehension of Abstract Meaning

    Authors: Boyuan Zheng, Xiaoyu Yang, Yu-Ping Ruan, Zhenhua Ling, Quan Liu, Si Wei, Xiaodan Zhu

    Abstract: This paper introduces the SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM). This shared task is designed to help evaluate the ability of machines in representing and understanding abstract concepts. Given a passage and the corresponding question, a participating system is expected to choose the correct answer from five candidates of abstract concepts in a cloze-style m…

    Submitted 1 June, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

  47. Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation

    Authors: Yu-Ping Ruan, Zhen-Hua Ling

    Abstract: This paper presents an emotion-regularized conditional variational autoencoder (Emo-CVAE) model for generating emotional conversation responses. In conventional CVAE-based emotional response generation, emotion labels are simply used as additional conditions in prior, posterior and decoder networks. Considering that emotion styles are naturally entangled with semantic contents in the language spac…

    Submitted 18 April, 2021; originally announced April 2021.

    Comments: Accepted by IEEE Transactions on Affective Computing

  48. arXiv:2102.11086  [pdf, other]

    cs.LG cs.AI cs.IT stat.CO

    Improving Lossless Compression Rates via Monte Carlo Bits-Back Coding

    Authors: Yangjun Ruan, Karen Ullrich, Daniel Severo, James Townsend, Ashish Khisti, Arnaud Doucet, Alireza Makhzani, Chris J. Maddison

    Abstract: Latent variable models have been successfully applied in lossless compression with the bits-back coding algorithm. However, bits-back suffers from an increase in the bitrate equal to the KL divergence between the approximate posterior and the true posterior. In this paper, we show how to remove this gap asymptotically by deriving bits-back coding algorithms from tighter variational bounds. The key…

    Submitted 14 June, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

  49. arXiv:2007.01980  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design

    Authors: Yufei Ruan, Jiaqi Yang, Yuan Zhou

    Abstract: Motivated by practical needs such as large-scale learning, we study the impact of adaptivity constraints to linear contextual bandits, a central problem in online active learning. We consider two popular limited adaptivity models in literature: batch learning and rare policy switches. We show that, when the context vectors are adversarially chosen in $d$-dimensional linear contextual bandits, the…

    Submitted 23 April, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

  50. arXiv:2006.06954  [pdf, other]

    cs.LG stat.ML

    Towards Flexible Device Participation in Federated Learning

    Authors: Yichen Ruan, Xiaoxi Zhang, Shu-Che Liang, Carlee Joe-Wong

    Abstract: Traditional federated learning algorithms impose strict requirements on the participation rates of devices, which limit the potential reach of federated learning. This paper extends the current learning paradigm to include devices that may become inactive, compute incomplete updates, and depart or arrive in the middle of training. We derive analytical results to illustrate how allowing more flexib…

    Submitted 25 February, 2021; v1 submitted 12 June, 2020; originally announced June 2020.