Showing 1–32 of 32 results for author: Ruan, T

  1. arXiv:2505.19015  [pdf, other]

    cs.CV cs.MM

    Can Multimodal Large Language Models Understand Spatial Relations?

    Authors: Jingping Liu, Ziyan Liu, Zhedong Cen, Yan Zhou, Yinan Zou, Weiyan Zhang, Haiyun Jiang, Tong Ruan

    Abstract: Spatial relation reasoning is a crucial task for multimodal large language models (MLLMs) to understand the objective world. However, current benchmarks have issues like relying on bounding boxes, ignoring perspective substitutions, or allowing questions to be answered using only the model's prior knowledge without image understanding. To address these issues, we introduce SpatialMQA, a human-anno…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: 13 pages, 19 figures

  2. arXiv:2502.13677  [pdf, other]

    cs.RO

    A Framework for Semantics-based Situational Awareness during Mobile Robot Deployments

    Authors: Tianshu Ruan, Aniketh Ramesh, Hao Wang, Alix Johnstone-Morfoisse, Gokcenur Altindal, Paul Norman, Grigoris Nikolaou, Rustam Stolkin, Manolis Chiou

    Abstract: Deployment of robots into hazardous environments typically involves a "Human-Robot Teaming" (HRT) paradigm, in which a human supervisor interacts with a remotely operating robot inside the hazardous zone. Situational Awareness (SA) is vital for enabling HRT, to support navigation, planning, and decision-making. This paper explores issues of higher-level "semantic" information and understanding…

    Submitted 19 February, 2025; originally announced February 2025.

  3. arXiv:2502.11703  [pdf, other]

    cs.CL

    CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation

    Authors: Guangya Yu, Yanhao Li, Zongying Jiang, Yuxiong Jin, Li Dai, Yupian Lin, Ruihui Hou, Weiyan Zhang, Yongqi Fan, Qi Ye, Jingping Liu, Tong Ruan

    Abstract: Medical quality control indicators are essential to assess the qualifications of healthcare institutions for medical services. With the impressive performance of large language models (LLMs) like GPT-4 in the medical field, leveraging these technologies for the Medical Quality Control Indicator Calculation (MQCIC) presents a promising approach. In this work, (1) we introduce a real-world task MQCI…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 16 pages

  4. arXiv:2412.19420  [pdf]

    cs.DB cs.DS

    A Matrix Logic Approach to Efficient Frequent Itemset Discovery in Large Data Sets

    Authors: Xuan Li, Tingyi Ruan, Yankaiqi Li, Quanchao Lu, Xiaoxuan Sun

    Abstract: This paper proposes a frequent itemset mining algorithm based on the Boolean matrix method, aiming to solve the storage and computational bottlenecks of traditional frequent pattern mining algorithms in high-dimensional and large-scale transaction databases. By representing the itemsets in the transaction database as Boolean matrices, the algorithm uses Boolean logic operations such as AND and OR…

    Submitted 26 December, 2024; originally announced December 2024.

  5. arXiv:2412.18288  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Towards understanding how attention mechanism works in deep learning

    Authors: Tianyu Ruan, Shihua Zhang

    Abstract: Attention mechanism has been extensively integrated within mainstream neural network architectures, such as Transformers and graph attention networks. Yet, its underlying working principles remain somewhat elusive. What is its essence? Are there any connections between it and traditional machine learning algorithms? In this study, we inspect the process of computing similarity using classic metric…

    Submitted 24 December, 2024; originally announced December 2024.

    Comments: 38 pages, 6 figures

    MSC Class: 68T07 ACM Class: I.2.6; I.5.1

  6. arXiv:2411.12161  [pdf]

    cs.DC

    Adaptive Cache Management for Complex Storage Systems Using CNN-LSTM-Based Spatiotemporal Prediction

    Authors: Xiaoye Wang, Xuan Li, Linji Wang, Tingyi Ruan, Pochun Li

    Abstract: This paper proposes an intelligent cache management strategy based on CNN-LSTM to improve the performance and cache hit rate of storage systems. Through comparative experiments with traditional algorithms (such as LRU and LFU) and other deep learning models (such as RNN, GRU-RNN and LSTM), the results show that the CNN-LSTM model has significant advantages in cache demand prediction. The MSE and M…

    Submitted 18 November, 2024; originally announced November 2024.

  7. arXiv:2408.10039  [pdf, other]

    cs.AI

    MSDiagnosis: A Benchmark for Evaluating Large Language Models in Multi-Step Clinical Diagnosis

    Authors: Ruihui Hou, Shencheng Chen, Yongqi Fan, Guangya Yu, Lifeng Zhu, Jing Sun, Jingping Liu, Tong Ruan

    Abstract: Clinical diagnosis is critical in medical practice, typically requiring a continuous and evolving process that includes primary diagnosis, differential diagnosis, and final diagnosis. However, most existing clinical diagnostic tasks are single-step processes, which does not align with the complex multi-step diagnostic procedures found in real-world clinical settings. In this paper, we propose a Ch…

    Submitted 16 December, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  8. arXiv:2407.10990  [pdf]

    cs.CL cs.AI

    MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

    Authors: Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang

    Abstract: Ensuring the general efficacy and goodness for human beings from medical large language models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese med…

    Submitted 23 June, 2024; originally announced July 2024.

    Comments: 25 pages, 4 figures

  9. arXiv:2406.15019  [pdf, other]

    cs.CL

    MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens

    Authors: Yongqi Fan, Hongli Sun, Kui Xue, Xiaofan Zhang, Shaoting Zhang, Tong Ruan

    Abstract: Numerous advanced Large Language Models (LLMs) now support context lengths up to 128K, and some extend to 200K. Some benchmarks in the generic domain have also followed up on evaluating long-context capabilities. In the medical domain, tasks are distinctive due to the unique contexts and need for domain expertise, necessitating further evaluation. However, despite the frequent presence of long tex…

    Submitted 21 June, 2024; originally announced June 2024.

  10. arXiv:2405.10630  [pdf, other]

    cs.CL cs.AI

    Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges

    Authors: Xiaoming Shi, Zeming Liu, Li Du, Yuxuan Wang, Hongru Wang, Yuhang Guo, Tong Ruan, Jie Xu, Shaoting Zhang

    Abstract: This paper surveys and organizes research works on medical dialog systems, which is an important yet challenging task. Although these systems have been surveyed in the medical community from an application perspective, a systematic review from a rigorous technical perspective has to date remained noticeably absent. As a result, an overview of the categories, methods, and evaluation of medical dial…

    Submitted 17 May, 2024; originally announced May 2024.

  11. arXiv:2404.17897  [pdf, other]

    cs.CL

    Tool Calling: Enhancing Medication Consultation via Retrieval-Augmented Large Language Models

    Authors: Zhongzhen Huang, Kui Xue, Yongqi Fan, Linjie Mu, Ruoyu Liu, Tong Ruan, Shaoting Zhang, Xiaofan Zhang

    Abstract: Large-scale language models (LLMs) have achieved remarkable success across various language tasks but suffer from hallucinations and temporal misalignment. To mitigate these shortcomings, Retrieval-augmented generation (RAG) has been utilized to provide external knowledge to facilitate the answer generation. However, applying such models to the medical domain faces several challenges due to the la…

    Submitted 27 April, 2024; originally announced April 2024.

  12. Learning Invariant Inter-pixel Correlations for Superpixel Generation

    Authors: Sen Xu, Shikui Wei, Tao Ruan, Lixin Liao

    Abstract: Deep superpixel algorithms have made remarkable strides by substituting hand-crafted features with learnable ones. Nevertheless, we observe that existing deep superpixel methods, serving as mid-level representation operations, remain sensitive to the statistical properties (e.g., color distribution, high-level semantics) embedded within the training dataset. Consequently, learnable features exhibi…

    Submitted 9 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI24

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6351-6359 (2024)

  13. arXiv:2402.18028  [pdf, other]

    cs.CV

    OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine

    Authors: Xiaosong Wang, Xiaofan Zhang, Guotai Wang, Junjun He, Zhongyu Li, Wentao Zhu, Yi Guo, Qi Dou, Xiaoxiao Li, Dequan Wang, Liang Hong, Qicheng Lao, Tong Ruan, Yukun Zhou, Yixue Li, Jie Zhao, Kang Li, Xin Sun, Lifeng Zhu, Shaoting Zhang

    Abstract: The emerging trend of advancing generalist artificial intelligence, such as GPTv4 and Gemini, has reshaped the landscape of research (academia and industry) in machine learning and many other research areas. However, domain-specific applications of such foundation models (e.g., in medicine) remain untouched or often at their very early stages. It will require an individual set of transfer learning…

    Submitted 3 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Technical Report. Visit https://github.com/openmedlab for more details

  14. arXiv:2312.02441  [pdf, other]

    cs.CL

    MedDM:LLM-executable clinical guidance tree for clinical decision-making

    Authors: Binbin Li, Tianxin Meng, Xiaoming Shi, Jie Zhai, Tong Ruan

    Abstract: It is becoming increasingly emphasis on the importance of LLM participating in clinical diagnosis decision-making. However, the low specialization refers to that current medical LLMs can not provide specific medical advice, which are more like a medical Q&A. And there is no suitable clinical guidance tree data set that can be used directly with LLM. To address this issue, we first propose LLM-exe…

    Submitted 4 December, 2023; originally announced December 2023.

  15. arXiv:2308.07635  [pdf, other]

    cs.CL cs.AI

    LLM-Mini-CEX: Automatic Evaluation of Large Language Model for Diagnostic Conversation

    Authors: Xiaoming Shi, Jie Xu, Jinru Ding, Jiali Pang, Sichen Liu, Shuqing Luo, Xingwei Peng, Lu Lu, Haihong Yang, Mingtao Hu, Tong Ruan, Shaoting Zhang

    Abstract: There is an increasing interest in developing LLMs for medical diagnosis to improve diagnosis efficiency. Despite their alluring technological potential, there is no unified and comprehensive evaluation criterion, leading to the inability to evaluate the quality and potential risks of medical LLMs, further hindering the application of LLMs in medical treatment scenarios. Besides, current evaluatio…

    Submitted 15 August, 2023; originally announced August 2023.

  16. arXiv:2211.14095  [pdf, other]

    cs.RO cs.AI cs.HC cs.MA

    A Hierarchical Variable Autonomy Mixed-Initiative Framework for Human-Robot Teaming in Mobile Robotics

    Authors: Dimitris Panagopoulos, Giannis Petousakis, Aniketh Ramesh, Tianshu Ruan, Grigoris Nikolaou, Rustam Stolkin, Manolis Chiou

    Abstract: This paper presents a Mixed-Initiative (MI) framework for addressing the problem of control authority transfer between a remote human operator and an AI agent when cooperatively controlling a mobile robot. Our Hierarchical Expert-guided Mixed-Initiative Control Switcher (HierEMICS) leverages information on the human operator's state and intent. The control switching policies are based on a critica…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 6 pages, 4 figures, ICHMS 2022, First two Authors contributed equally

  17. AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model

    Authors: Yongyu Yan, Kui Xue, Xiaoming Shi, Qi Ye, Jingping Liu, Tong Ruan

    Abstract: Continual pretraining is a popular way of building a domain-specific pretrained language model from a general-domain language model. In spite of its high efficiency, continual pretraining suffers from catastrophic forgetting, which may harm the model's performance in downstream tasks. To alleviate the issue, in this paper, we propose a continual pretraining method for the BERT-based model, named A…

    Submitted 19 October, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

  18. A Taxonomy of Semantic Information in Robot-Assisted Disaster Response

    Authors: Tianshu Ruan, Hao Wang, Rustam Stolkin, Manolis Chiou

    Abstract: This paper proposes a taxonomy of semantic information in robot-assisted disaster response. Robots are increasingly being used in hazardous environment industries and emergency response teams to perform various tasks. Operational decision-making in such applications requires a complex semantic understanding of environments that are remote from the human operator. Low-level sensory data from the ro…

    Submitted 30 September, 2022; originally announced October 2022.

  19. arXiv:2206.15312  [pdf, other]

    cs.CL cs.LG

    FL-Tuning: Layer Tuning for Feed-Forward Network in Transformer

    Authors: Jingping Liu, Yuqiu Song, Kui Xue, Hongli Sun, Chao Wang, Lihan Chen, Haiyun Jiang, Jiaqing Liang, Tong Ruan

    Abstract: Prompt tuning is an emerging way of adapting pre-trained language models to downstream tasks. However, the existing studies are mainly to add prompts to the input sequence. This way would not work as expected due to the intermediate multi-head self-attention and feed-forward network computation, making model optimization not very smooth. Hence, we propose a novel tuning way called layer tuning, ai…

    Submitted 30 June, 2022; originally announced June 2022.

  20. arXiv:2203.09946  [pdf, other]

    cs.CL

    Prompt-based Generative Approach towards Multi-Hierarchical Medical Dialogue State Tracking

    Authors: Jun Liu, Tong Ruan, Haofen Wang, Huanhuan Zhang

    Abstract: The medical dialogue system is a promising application that can provide great convenience for patients. The dialogue state tracking (DST) module in the medical dialogue system which interprets utterances into the machine-readable structure for downstream tasks is particularly challenging. Firstly, the states need to be able to represent compound entities such as symptoms with their body part or di…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: 7 pages, 1 figure

  21. arXiv:2106.00501  [pdf, other]

    cs.AI eess.SP

    A Unified Cognitive Learning Framework for Adapting to Dynamic Environment and Tasks

    Authors: Qihui Wu, Tianchen Ruan, Fuhui Zhou, Yang Huang, Fan Xu, Shijin Zhao, Ya Liu, Xuyang Huang

    Abstract: Many machine learning frameworks have been proposed and used in wireless communications for realizing diverse goals. However, their incapability of adapting to the dynamic wireless environment and tasks and of self-learning limit their extensive applications and achievable performance. Inspired by the great flexibility and adaptation of primate behaviors due to the brain cognitive mechanism, a uni…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: This paper has been submitted to IEEE Wireless Communications Magazine (minor revision)

  22. arXiv:2101.09101  [pdf, other]

    cs.CL cs.LG

    A multi-perspective combined recall and rank framework for Chinese procedure terminology normalization

    Authors: Ming Liang, Kui Xue, Tong Ruan

    Abstract: Medical terminology normalization aims to map the clinical mention to terminologies come from a knowledge base, which plays an important role in analyzing Electronic Health Record (EHR) and many downstream tasks. In this paper, we focus on Chinese procedure terminology normalization. The expression of terminologies are various and one medical mention may be linked to multiple terminologies. Previou…

    Submitted 22 January, 2021; originally announced January 2021.

  23. arXiv:2010.11724  [pdf, other]

    cs.CV

    LID 2020: The Learning from Imperfect Data Challenge Results

    Authors: Yunchao Wei, Shuai Zheng, Ming-Ming Cheng, Hang Zhao, Liwei Wang, Errui Ding, Yi Yang, Antonio Torralba, Ting Liu, Guolei Sun, Wenguan Wang, Luc Van Gool, Wonho Bae, Junhyug Noh, Jinhwan Seo, Gunhee Kim, Hao Zhao, Ming Lu, Anbang Yao, Yiwen Guo, Yurong Chen, Li Zhang, Chuangchuang Tan, Tao Ruan, Guanghua Gu, et al. (10 additional authors not shown)

    Abstract: Learning from imperfect data becomes an issue in many industrial applications after the research community has made profound progress in supervised learning from perfectly annotated datasets. The purpose of the Learning from Imperfect Data (LID) workshop is to inspire and facilitate the research in developing novel approaches that would harness the imperfect data and improve the data-efficiency du…

    Submitted 17 October, 2020; originally announced October 2020.

    Comments: Summary of the 2nd Learning from Imperfect Data Workshop in conjunction with CVPR 2020

  24. VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification

    Authors: Zhedong Zheng, Tao Ruan, Yunchao Wei, Yi Yang, Tao Mei

    Abstract: One fundamental challenge of vehicle re-identification (re-id) is to learn robust and discriminative visual representation, given the significant intra-class vehicle variations across different camera views. As the existing vehicle datasets are limited in terms of training images and viewpoints, we propose to build a unique large-scale vehicle dataset (called VehicleNet) by harnessing four public…

    Submitted 29 April, 2022; v1 submitted 14 April, 2020; originally announced April 2020.

  25. arXiv:1908.07721  [pdf, other]

    cs.CL

    Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text

    Authors: Kui Xue, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Huanhuan Zhang, Ping He

    Abstract: Entity and relation extraction is the necessary step in structuring medical text. However, the feature extraction ability of the bidirectional long short term memory network in the existing model does not achieve the best effect. At the same time, the language model has achieved excellent results in more and more natural language processing tasks. In this paper, we present a focused attention mode…

    Submitted 22 October, 2019; v1 submitted 21 August, 2019; originally announced August 2019.

    Comments: 8 pages, 2 figures, submitted to BIBM 2019, accepted as a regular paper

  26. arXiv:1908.06606  [pdf, other]

    cs.CL

    Question Answering based Clinical Text Structuring Using Pre-trained Language Model

    Authors: Jiahui Qiu, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Jinlin Liu, Jing Sun

    Abstract: Clinical text structuring is a critical and fundamental task for clinical research. Traditional methods such as task-specific end-to-end models and pipeline models usually suffer from the lack of dataset and error propagation. In this paper, we present a question answering based clinical text structuring (QA-CTS) task to unify different specific tasks and make dataset shareable. A novel model that…

    Submitted 22 October, 2019; v1 submitted 19 August, 2019; originally announced August 2019.

  27. arXiv:1812.09905  [pdf, other]

    cs.CY cs.AI cs.DB

    PatientEG Dataset: Bringing Event Graph Model with Temporal Relations to Electronic Medical Records

    Authors: Xuli Liu, Jihao Jin, Qi Wang, Tong Ruan, Yangming Zhou, Daqi Gao, Yichao Yin

    Abstract: Medical activities, such as diagnoses, medicine treatments, and laboratory tests, as well as temporal relations between these activities are the basic concepts in clinical research. However, existing relational data model on electronic medical records (EMRs) lacks explicit and accurate semantic definitions of these concepts. It leads to the inconvenience of query construction and the inefficiency…

    Submitted 24 December, 2018; originally announced December 2018.

  28. arXiv:1809.05996  [pdf, other]

    cs.CV

    Devil in the Details: Towards Accurate Single and Multiple Human Parsing

    Authors: Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei, Yao Zhao, Thomas Huang

    Abstract: Human parsing has received considerable interest due to its wide application potentials. Nevertheless, it is still unclear how to develop an accurate human parsing system in an efficient and elegant way. In this paper, we identify several useful properties, including feature resolution, global context information and edge details, and perform rigorous analyses to reveal how to leverage them to ben…

    Submitted 29 November, 2018; v1 submitted 16 September, 2018; originally announced September 2018.

  29. arXiv:1808.08669  [pdf, other]

    cs.CL

    Fast and Accurate Recognition of Chinese Clinical Named Entities with Residual Dilated Convolutions

    Authors: Jiahui Qiu, Qi Wang, Yangming Zhou, Tong Ruan, Ju Gao

    Abstract: Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translation research. In recent years, deep learning methods have achieved significant success in CNER tasks. However, these methods depend greatly on Recurrent Neur…

    Submitted 27 November, 2018; v1 submitted 26 August, 2018; originally announced August 2018.

    Comments: 8 pages, 3 figures. Accepted as regular paper by 2018 IEEE International Conference on Bioinformatics and Biomedicine. arXiv admin note: text overlap with arXiv:1804.05017

  30. arXiv:1807.06718  [pdf, other]

    cs.CL

    Automatic Severity Classification of Coronary Artery Disease via Recurrent Capsule Network

    Authors: Qi Wang, Jiahui Qiu, Yangming Zhou, Tong Ruan, Daqi Gao, Ju Gao

    Abstract: Coronary artery disease (CAD) is one of the leading causes of cardiovascular disease deaths. CAD condition progresses rapidly, if not diagnosed and treated at an early stage may eventually lead to an irreversible state of the heart muscle death. Invasive coronary arteriography is the gold standard technique for CAD diagnosis. Coronary arteriography texts describe which part has stenosis and how mu…

    Submitted 27 November, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: 8 pages, 5 figures

  31. arXiv:1805.04827  [pdf, other]

    cs.CL

    An attention-based Bi-GRU-CapsNet model for hypernymy detection between compound entities

    Authors: Qi Wang, Chenming Xu, Yangming Zhou, Tong Ruan, Daqi Gao, Ping He

    Abstract: Named entities are usually composable and extensible. Typical examples are names of symptoms and diseases in medical areas. To distinguish these entities from general entities, we name them compound entities. In this paper, we present an attention-based Bi-GRU-CapsNet model to detect hypernymy relationship between compound entities. Our model consists of several important components. To a…

    Submitted 27 November, 2018; v1 submitted 13 May, 2018; originally announced May 2018.

    Comments: 5 pages, 3 figures. Accepted as short paper by 2018 International Conference on Bioinformatics and Biomedicine

  32. arXiv:1804.05017  [pdf, other]

    cs.CL

    Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition

    Authors: Qi Wang, Yuhang Xia, Yangming Zhou, Tong Ruan, Daqi Gao, Ping He

    Abstract: Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity recognition and many other Natural Language Processin…

    Submitted 13 April, 2018; originally announced April 2018.

    Comments: 21 pages, 6 figures