Search | arXiv e-print repository

Synergy-Informed Design of Platform Trials for Combination Therapies

Authors: Nan Miles Xi, Man Mandy Jin, Lin Wang, Xin Huang

Abstract: Combination drug therapies hold significant promise for enhancing treatment efficacy, particularly in fields such as oncology, immunotherapy, and infectious diseases. However, designing clinical trials for these regimens poses unique statistical challenges due to multiple hypothesis testing, shared control groups, and overlapping treatment components that induce complex correlation structures. In this paper, we develop a novel statistical framework tailored for early-phase translational combination therapy trials, with a focus on platform trial designs. Our methodology introduces a generalized Dunnett's procedure that controls false positive rates by accounting for the correlations between treatment arms. Additionally, we propose strategies for power analysis and sample size optimization that leverage preclinical data to estimate effect sizes, synergy parameters, and inter-arm correlations. Simulation studies demonstrate that our approach not only controls various false positive metrics under diverse trial scenarios but also informs optimal allocation ratios to maximize power. A real-data application further illustrates the integration of translational preclinical insights into the clinical trial design process. An open-source R package is provided to support the application of our methods in practice. Overall, our framework offers statistically rigorous guidance for the design of early-phase combination therapy trials, aiming to enhance the efficiency of the bench-to-bedside transition.

Submitted 3 June, 2025; originally announced June 2025.

arXiv:2504.19580 [pdf, other]

ARTEMIS: Autoregressive End-to-End Trajectory Planning with Mixture of Experts for Autonomous Driving

Authors: Renju Feng, Ning Xi, Duanfeng Chu, Rukang Wang, Zejian Deng, Anzheng Wang, Liping Lu, Jinxiang Wang, Yanjun Huang

Abstract: This paper presents ARTEMIS, an end-to-end autonomous driving framework that combines autoregressive trajectory planning with Mixture-of-Experts (MoE). Traditional modular methods suffer from error propagation, while existing end-to-end models typically employ static one-shot inference paradigms that inadequately capture the dynamic changes of the environment. ARTEMIS takes a different approach by generating trajectory waypoints sequentially, preserving critical temporal dependencies while dynamically routing scene-specific queries to specialized expert networks. It effectively relieves trajectory quality degradation issues encountered when guidance information is ambiguous, and overcomes the inherent representational limitations of singular network architectures when processing diverse driving scenarios. Additionally, we use a lightweight batch reallocation strategy that significantly improves the training speed of the Mixture-of-Experts model. Through experiments on the NAVSIM dataset, ARTEMIS exhibits superior competitive performance, achieving 87.0 PDMS and 83.1 EPDMS with a ResNet-34 backbone and demonstrating state-of-the-art performance on multiple metrics.

Submitted 4 May, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

arXiv:2504.11298 [pdf, other]

doi 10.1038/s41586-023-06885-w

Giant Magnetocaloric Effect in Spin Supersolid Candidate Na$_2$BaCo(PO$_4$)$_2$

Authors: Junsen Xiang, Chuandi Zhang, Yuan Gao, Wolfang Schmidt, Karin Schmalzl, Chin-Wei Wang, Bo Li, Ning Xi, Xin-Yang Liu, Hai Jin, Gang Li, Jun Shen, Ziyu Chen, Yang Qi, Yuan Wan, Wentao Jin, Wei Li, Peijie Sun, Gang Su

Abstract: Supersolid, an exotic quantum state of matter that consists of particles forming an incompressible solid structure while simultaneously showing superfluidity of zero viscosity [1], is one of the long-standing pursuits in fundamental research [2, 3]. Although the initial report of $^4$He supersolid turned out to be an artifact [4], this intriguing quantum matter has inspired enthusiastic investigations into ultracold quantum gases [5-8]. Nevertheless, the realization of supersolidity in condensed matter remains elusive. Here we find evidence for a quantum magnetic analogue of supersolid -- the spin supersolid -- in the recently synthesized triangular-lattice antiferromagnet Na$_2$BaCo(PO$_4$)$_2$ [9]. Notably, a giant magnetocaloric effect related to the spin supersolidity is observed in the demagnetization cooling process, manifesting itself as two prominent valley-like regimes, with the lowest temperature attaining below 100 mK. Not only is there an experimentally determined series of critical fields but the demagnetization cooling profile also shows excellent agreement with the theoretical simulations with an easy-axis Heisenberg model. Neutron diffractions also successfully locate the proposed spin supersolid phases by revealing the coexistence of three-sublattice spin solid order and interlayer incommensurability indicative of the spin superfluidity.
Thus, our results indicate a strong entropic effect of the spin supersolid phase in a frustrated quantum magnet and open up a viable and promising avenue for applications in sub-Kelvin refrigeration, especially in the context of persistent concerns about helium shortages [10, 11].

Submitted 15 April, 2025; originally announced April 2025.

Comments: 19 pages, 13 figures

Journal ref: Nature volume 625, pages 270-275 (2024)

arXiv:2503.10412 [pdf, other]

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Authors: Luyuan Xie, Tianyu Luan, Wenyuan Cai, Guochen Yan, Zhaoyu Chen, Nan Xi, Yuejian Fang, Qingni Shen, Zhonghai Wu, Junsong Yuan

Abstract: Federated learning has wide applications in the medical field. It enables knowledge sharing among different healthcare institutes while protecting patients' privacy. However, existing federated learning systems are typically centralized, requiring clients to upload client-specific knowledge to a central server for aggregation. This centralized approach would integrate the knowledge from each client into a centralized server, and the knowledge would be already undermined during the centralized integration before it reaches back to each client. Besides, the centralized approach also creates a dependency on the central server, which may affect training stability if the server malfunctions or connections are unstable. To address these issues, we propose a decentralized federated learning framework named dFLMoE. In our framework, clients directly exchange lightweight head models with each other. After exchanging, each client treats both local and received head models as individual experts, and utilizes a client-specific Mixture of Experts (MoE) approach to make collective decisions. This design not only reduces the knowledge damage with client-specific aggregations but also removes the dependency on the central server to enhance the robustness of the framework. We validate our framework on multiple medical tasks, demonstrating that our method evidently outperforms state-of-the-art approaches under both model homogeneity and heterogeneity settings.

Submitted 19 May, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

Comments: Accepted by CVPR 2025

Journal ref: Accepted by CVPR 2025

arXiv:2503.04785 [pdf, other]

Mapping Trustworthiness in Large Language Models: A Bibliometric Analysis Bridging Theory to Practice

Authors: José Siqueira de Cerqueira, Kai-Kristian Kemell, Rebekah Rousi, Nannan Xi, Juho Hamari, Pekka Abrahamsson

Abstract: The rapid proliferation of Large Language Models (LLMs) has raised significant trustworthiness and ethical concerns. Despite the widespread adoption of LLMs across domains, there is still no clear consensus on how to define and operationalise trustworthiness. This study aims to bridge the gap between theoretical discussion and practical implementation by analysing research trends, definitions of trustworthiness, and practical techniques. We conducted a bibliometric mapping analysis of 2,006 publications from Web of Science (2019-2025) using the Bibliometrix, and manually reviewed 68 papers. We found a shift from traditional AI ethics discussion to LLM trustworthiness frameworks. We identified 18 different definitions of trust/trustworthiness, with transparency, explainability and reliability emerging as the most common dimensions. We identified 20 strategies to enhance LLM trustworthiness, with fine-tuning and retrieval-augmented generation (RAG) being the most prominent. Most of the strategies are developer-driven and applied during the post-training phase. Several authors propose fragmented terminologies rather than unified frameworks, leading to the risks of "ethics washing," where ethical discourse is adopted without a genuine regulatory commitment. Our findings highlight: persistent gaps between theoretical taxonomies and practical implementation, the crucial role of the developer in operationalising trust, and call for standardised frameworks and stronger regulatory measures to enable trustworthy and ethical deployment of LLMs.

Submitted 4 May, 2025; v1 submitted 27 February, 2025; originally announced March 2025.

arXiv:2503.03152 [pdf, other]

UnPuzzle: A Unified Framework for Pathology Image Analysis

Authors: Dankai Liao, Sicheng Chen, Nuwa Xi, Qiaochu Xue, Jieyu Li, Lingxuan Hou, Zeyu Liu, Chang Han Low, Yufeng Wu, Yiling Liu, Yanqin Jiang, Dandan Li, Shangqing Lyu

Abstract: Pathology image analysis plays a pivotal role in medical diagnosis, with deep learning techniques significantly advancing diagnostic accuracy and research. While numerous studies have been conducted to address specific pathological tasks, the lack of standardization in pre-processing methods and model/database architectures complicates fair comparisons across different approaches. This highlights the need for a unified pipeline and comprehensive benchmarks to enable consistent evaluation and accelerate research progress. In this paper, we present UnPuzzle, a novel and unified framework for pathological AI research that covers a broad range of pathology tasks with benchmark results. From high-level to low-level, upstream to downstream tasks, UnPuzzle offers a modular pipeline that encompasses data pre-processing, model composition, task configuration, and experiment conduction. Specifically, it facilitates efficient benchmarking for both Whole Slide Images (WSIs) and Region of Interest (ROI) tasks. Moreover, the framework supports various learning paradigms, including self-supervised learning, multi-task learning, and multi-modal learning, enabling comprehensive development of pathology AI models. Through extensive benchmarking across multiple datasets, we demonstrate the effectiveness of UnPuzzle in streamlining pathology AI research and promoting reproducibility. We envision UnPuzzle as a cornerstone for future advancements in pathology AI, providing a more accessible, transparent, and standardized approach to model evaluation.
The UnPuzzle repository is publicly available at https://github.com/Puzzle-AI/UnPuzzle.

Submitted 28 March, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

Comments: 11 pages, 2 figures

arXiv:2502.13760 [pdf, other]

Muscle Activation Estimation by Optimizing the Musculoskeletal Model for Personalized Strength and Conditioning Training

Authors: Xi Wu, Chenzui Li, Kehan Zou, Ning Xi, Fei Chen

Abstract: Musculoskeletal models are pivotal in the domains of rehabilitation and resistance training to analyze muscle conditions. However, individual variability in musculoskeletal parameters and the immeasurability of some internal biomechanical variables pose significant obstacles to accurate personalized modelling. Furthermore, muscle activation estimation can be challenging due to the inherent redundancy of the musculoskeletal system, where multiple muscles drive a single joint. This study develops a whole-body musculoskeletal model for strength and conditioning training and calibrates relevant muscle parameters with an electromyography-based optimization method. By utilizing the personalized musculoskeletal model, muscle activation can be subsequently estimated to analyze the performance of exercises. Bench press and deadlift are chosen for experimental verification to affirm the efficacy of this approach.

Submitted 20 February, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

arXiv:2412.16615 [pdf, other]

Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval

Authors: Luo Ji, Feixiang Guo, Teng Chen, Qingqing Gu, Xiaoyu Wang, Ningyuan Xi, Yihong Wang, Peng Yu, Yue Zhao, Hongyang Lei, Zhonglin Jiang, Yong Chen

Abstract: Despite the recent advancement in Retrieval-Augmented Generation (RAG) systems, most retrieval methodologies are often developed for factual retrieval, which assumes query and positive documents are semantically similar. In this paper, we instead propose and study a more challenging type of retrieval task, called hidden rationale retrieval, in which query and document are not similar but can be inferred by reasoning chains, logic relationships, or empirical experiences. To address such problems, an instruction-tuned Large language model (LLM) with a cross-encoder architecture could be a reasonable choice. To further strengthen pioneering LLM-based retrievers, we design a special instruction that transforms the retrieval task into a generative task by prompting LLM to answer a binary-choice question. The model can be fine-tuned with direct preference optimization (DPO). The framework is also optimized for computational efficiency with no performance degradation. We name this retrieval framework by RaHoRe and verify its zero-shot and fine-tuned performance superiority on Emotional Support Conversation (ESC), compared with previous retrieval works. Our study suggests the potential to employ LLM as a foundation for a wider scope of retrieval tasks. Our codes, models, and datasets are available on https://github.com/flyfree5/LaHoRe.

Submitted 9 April, 2025; v1 submitted 21 December, 2024; originally announced December 2024.

Comments: 10 pages, 3 figures, ECIR 2025

arXiv:2412.05342 [pdf, ps, other]

Multi-Party Supervised Fine-tuning of Language Models for Multi-Party Dialogue Generation

Authors: Xiaoyu Wang, Ningyuan Xi, Teng Chen, Qingqing Gu, Yue Zhao, Xiaokai Chen, Zhonglin Jiang, Yong Chen, Luo Ji

Abstract: Large Language Models (LLMs) are usually fine-tuned to participate in dyadic or two-party dialogues and cannot adapt well to multi-party dialogues (MPD), which hinders their applications in scenarios such as multi-personal meetings, discussions and daily communication. Previous LLM-based research mainly focuses on the multi-agent framework, while the base LLMs are still fine-tuned pairwise. In this work, we design a multi-party fine-tuning framework (MuPaS) for LLMs on the multi-party dialogue datasets, and prove such a straightforward framework can let the LLM align with the multi-party conversation style efficiently and effectively. We also design two training strategies which can convert MuPaS into the MPD simulator. Substantial experiments show that MuPaS can achieve state-of-the-art multi-party response, higher accuracy of the-next-speaker prediction, higher human and automatic evaluated utterance qualities, and can even generate reasonably with out-of-distribution scene, topic and role descriptions. The MuPaS framework bridges the LLM training with more complicated multi-party applications, such as conversation generation, virtual rehearsal or meta-universe.

Submitted 11 June, 2025; v1 submitted 6 December, 2024; originally announced December 2024.

Comments: Accepted by IJCNN 2025

arXiv:2411.08881 [pdf, other]

Can We Trust AI Agents? A Case Study of an LLM-Based Multi-Agent System for Ethical AI

Authors: José Antonio Siqueira de Cerqueira, Mamia Agbese, Rebekah Rousi, Nannan Xi, Juho Hamari, Pekka Abrahamsson

Abstract: AI-based systems, including Large Language Models (LLM), impact millions by supporting diverse tasks but face issues like misinformation, bias, and misuse. AI ethics is crucial as new technologies and concerns emerge, but objective, practical guidance remains debated. This study examines the use of LLMs for AI ethics in practice, assessing how LLM trustworthiness-enhancing techniques affect software development in this context. Using the Design Science Research (DSR) method, we identify techniques for LLM trustworthiness: multi-agents, distinct roles, structured communication, and multiple rounds of debate. We design a multi-agent prototype LLM-MAS, where agents engage in structured discussions on real-world AI ethics issues from the AI Incident Database. We evaluate the prototype across three case scenarios using thematic analysis, hierarchical clustering, comparative (baseline) studies, and running source code. The system generates approximately 2,000 lines of code per case, compared to only 80 lines in baseline trials. Discussions reveal terms like bias detection, transparency, accountability, user consent, GDPR compliance, fairness evaluation, and EU AI Act compliance, showing this prototype's ability to generate extensive source code and documentation addressing often overlooked AI ethics issues. However, practical challenges in source code integration and dependency management may limit its use by practitioners.

Submitted 16 May, 2025; v1 submitted 25 October, 2024; originally announced November 2024.

ACM Class: I.2.0; K.6.3

arXiv:2410.22236 [pdf, other]

Quantum Supercritical Regime with the Universally Boosted Magnetocaloric Effect

Authors: Enze Lv, Ning Xi, Yuliang Jin, Wei Li

Abstract: Across finite temperatures and fields, a quantum critical point (QCP) can extend to a quantum critical regime (QCR), characterized by prominent quantum fluctuations and universal scalings. The QCR is essential for comprehending many-body systems and correlated quantum materials, attracting intensive research interest over the past decades. In this study, we identify a distinct quantum supercritical regime (QSR) that also originates from the QCP but is driven by a symmetry-breaking field $h$, which couples to the order parameter. The QSR has crossover lines following $T \propto h^{\frac{z\nu}{\beta+\gamma}}$, where $\beta, \gamma, z, \nu$ are the critical exponents. Amongst other intriguing phenomena in QSR, the magnetocaloric effect (MCE) exhibits a universally diverging magnetic Grüneisen ratio as $\Gamma_h \equiv \frac{1}{T} (\frac{\partial T}{\partial h})_{S} \propto T^{-\frac{\beta+\gamma}{z\nu}}$. This constitutes a boost in the universal MCE, when compared to that of QCR, as even a small field $h$ can induce a dramatic temperature variation. Experimental realizations involving quantum Ising and Heisenberg magnets are discussed. These systems offer an ideal platform for investigating quantum supercriticality and also hold significant potential as advanced refrigerants.

Submitted 4 November, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

Comments: 5 pages, 4 figures

arXiv:2409.12059 [pdf, other]

MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning

Authors: Ningyuan Xi, Xiaoyu Wang, Yetao Wu, Teng Chen, Qingqing Gu, Yue Zhao, Jinxian Qu, Zhonglin Jiang, Yong Chen, Luo Ji

Abstract: Large Language Models can reasonably understand and generate human expressions but may lack thorough thinking and reasoning mechanisms. Recently there have been several studies which enhance the thinking ability of language models but most of them are not data-driven or training-based. In this paper, we are motivated by the cognitive mechanism in the natural world, and design a novel model architecture called TaS which allows it to first consider the thoughts and then express the response based upon the query. We design several pipelines to annotate or generate the thought contents from prompt-response samples, then add language heads in a middle layer which behaves as the thinking layer. We train the language model by the thoughts-augmented data and successfully let the thinking layer automatically generate reasonable thoughts and finally output more reasonable responses. Both qualitative examples and quantitative results validate the effectiveness and performance of TaS. Our code is available at https://anonymous.4open.science/r/TadE.

Submitted 25 April, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

Comments: 19 pages, 7 figures

arXiv:2409.06624 [pdf, other]

A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio

Authors: Ningyuan Xi, Yetao Wu, Kun Fan, Teng Chen, Qingqing Gu, Peng Yu, Jinxian Qu, Chenxi Liu, Zhonglin Jiang, Yong Chen, Luo Ji

Abstract: Large Language Models (LLMs) often need to be Continual Pre-Trained (CPT) to obtain unfamiliar language skills or adapt to new domains. The huge training cost of CPT often asks for cautious choice of key hyper-parameters such as the mixture ratio of extra language or domain corpus. However, there is no systematic study which bridges the gap between the optimal mixture ratio and the actual model performance, and the gap between experimental scaling law and the actual deployment in the full model size. In this paper, we perform CPT on Llama-3 8B and 70B to enhance its Chinese ability. We study the optimal correlation between the Additional Language Mixture Ratio (ALMR) and the Learning Rate (LR) on the 8B size, which directly indicates the optimal experimental setup. By thorough choice of hyper-parameters, and subsequent fine-tuning, the model capability is improved not only on the Chinese-related benchmark, but also in some specific domains including math, coding and emotional intelligence. We deploy the final 70B version of the LLM on a real-life chat system, which obtains satisfying performance.

Submitted 10 September, 2024; originally announced September 2024.

Comments: 11 pages, 4 figures

arXiv:2409.06601 [pdf, other]

LaMsS: When Large Language Models Meet Self-Skepticism

Authors: Yetao Wu, Yihong Wang, Teng Chen, Ningyuan Xi, Qingqing Gu, Hongyang Lei, Luo Ji

Abstract: Hallucination is a major challenge for large language models (LLMs), preventing their further application in some fields. The skeptical thinking of humankind could be useful for LLMs to self-cognition, self-reflection and alleviate their hallucinations. Inspired by this consideration, we propose a novel approach called LaMsS, which combines the semantic understanding capability of LLMs with self-skepticism. By introducing a series of skepticism tokens and augmenting them into the vocabulary, we conduct both pretraining and finetuning, which allow the LLM to decode each normal token followed by a skeptical token, representing different skepticism levels. By calculating the response skepticism given a query, one can define a new self-aware LLM which is only willing to answer with a relatively lower skepticism level than the threshold. By examining the accuracy, AUC and AP of willingly answering questions, we demonstrate that LaMsS achieves better performance than baselines on both multi-choice questions and open-domain question-answering benchmarks, and can generalize to multi-task and out-of-domain settings. Our study sheds some light on the self-skepticism modeling on further artificial intelligence. Project code and model checkpoints can be found in https://anonymous.4open.science/r/SM-1E76.

Submitted 25 April, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

Comments: 11 pages, 6 figures, ICLR 2025 Workshop SSI-FM

arXiv:2408.02566 [pdf, other]

Magnetocaloric Effect of Topological Excitations in Kitaev Magnets

Authors: Han Li, Enze Lv, Ning Xi, Yuan Gao, Yang Qi, Wei Li, Gang Su

Abstract: Traditional magnetic sub-Kelvin cooling relies on the nearly free local moments in hydrate paramagnetic salts, whose utility is hampered by the dilute magnetic ions and low thermal conductivity. Here we propose to use instead fractional excitations inherent to quantum spin liquids (QSLs) as an alternative, which are sensitive to external fields and can induce a very distinctive magnetocaloric effect. With state-of-the-art tensor-network approach, we compute low-temperature properties of Kitaev honeycomb model. For the ferromagnetic case, strong demagnetization cooling effect is observed due to the nearly free $Z_2$ vortices via spin fractionalization, described by a paramagnetic equation of state with a renormalized Curie constant. For the antiferromagnetic Kitaev case, we uncover an intermediate-field gapless QSL phase with very large spin entropy, possibly due to the emergence of spinon Fermi surface. Potential realization of topological excitation cooling in Kitaev materials is also discussed, which may offer a promising pathway to circumvent existing limitations in the paramagnetic hydrates.

Submitted 5 August, 2024; originally announced August 2024.

Comments: 10 pages, 4 figures; supplementary materials; to appear in Nat. Commun. (2024)

arXiv:2405.13005 [pdf]

Understanding Sarcoidosis Using Large Language Models and Social Media Data

Authors: Nan Miles Xi, Hong-Long Ji, Lin Wang

Abstract: Sarcoidosis is a rare inflammatory disease characterized by the formation of granulomas in various organs. The disease presents diagnostic and treatment challenges due to its diverse manifestations and unpredictable nature. In this study, we employed a Large Language Model (LLM) to analyze sarcoidosis-related discussions on the social media platform Reddit. Our findings underscore the efficacy of LLMs in accurately identifying sarcoidosis-related content. We discovered a wide array of symptoms reported by patients, with fatigue, swollen lymph nodes, and shortness of breath as the most prevalent. Prednisone was the most prescribed medication, while infliximab showed the highest effectiveness in improving prognoses. Notably, our analysis revealed disparities in prognosis based on age and gender, with women and younger patients experiencing good and polarized outcomes, respectively. Furthermore, unsupervised clustering identified three distinct patient subgroups (phenotypes) with unique symptom profiles, prognostic outcomes, and demographic distributions. Finally, sentiment analysis revealed a moderate negative impact on patients' mental health post-diagnosis, particularly among women and younger individuals. Our study represents the first application of LLMs to understand sarcoidosis through social media data. It contributes to understanding the disease by providing data-driven insights into its manifestations, treatments, prognoses, and impact on patients' lives.
Our findings have direct implications for improving personalized treatment strategies and enhancing the quality of care for individuals living with sarcoidosis.

Submitted 27 October, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

Journal ref: Journal of Healthcare Informatics Research, 2024

arXiv:2404.00625 [pdf, other]

Scalable second-order consensus of hierarchical groups

Authors: Jiamin Wang, Jian Liu, Feng Xiao, Ning Xi, Yuanshi Zheng

Abstract: Motivated by widespread dominance hierarchy, growth of group sizes, and feedback mechanisms in social species, we are devoted to exploring the scalable second-order consensus of hierarchical groups. More specifically, a hierarchical group consists of a collection of agents with double-integrator dynamics on a directed acyclic graph with additional reverse edges, which characterize feedback mechanisms across hierarchical layers. As the group size grows and the reverse edges appear, we investigate whether the absolute velocity protocol and the relative velocity protocol can preserve the system consensus property without tuning the control gains. It is rigorously proved that the absolute velocity protocol is able to achieve completely scalable second-order consensus but the relative velocity protocol cannot. This result theoretically reveals how the scalable coordination behavior in hierarchical groups is determined by local interaction rules. Moreover, we develop a hierarchical structure in order to achieve scalable second-order consensus for networks of any size and with any number of reverse edges.

Submitted 31 March, 2024; originally announced April 2024.

Comments: 9 pages, 1 figure

arXiv:2403.11895 [pdf, other]

Thermal Tensor Network Approach for Spin-Lattice Relaxation in Quantum Magnets

Authors: Ning Xi, Yuan Gao, Chengchen Li, Shuang Liang, Rong Yu, Xiaoqun Wang, Wei Li

Abstract: Low-dimensional quantum magnets, particularly those with strong spin frustration, are characterized by their notable spin fluctuations. Nuclear magnetic resonance (NMR) serves as a sensitive probe of low-energy fluctuations that offers valuable insight into rich magnetic phases and emergent phenomena in quantum magnets. Although experimentally accessible, the numerical simulation of NMR relaxation rates, specifically the spin-lattice relaxation rate $1/T_1$, remains a significant challenge. Analytical continuation based on Monte Carlo calculations is hampered by the notorious negative sign for frustrated systems, and the real-time simulations incur significant costs to capture low-energy fluctuations. Here we propose computing the relaxation rate using thermal tensor networks (TTNs), which provides a streamlined approach by calculating its imaginary-time proxy. We showcase the accuracy and versatility of our methodology by applying it to one-dimensional spin chains and two-dimensional lattices, where we find that the critical exponents $η$ and $zν$ can be extracted from the low-temperature scalings of the simulated $1/T_1$ near quantum critical points. Our results also provide insights into the low-dimensional and frustrated magnetic materials, elucidating universal scaling behaviors in the Ising chain compound CoNb$_2$O$_6$ and revealing the renormalized classical behaviors in the triangular-lattice antiferromagnet Ba$_8$CoNb$_6$O$_{24}$. We apply the approach to an effective model of the family of frustrated magnets AYbCh$_2$ (A = Na, K, Cs, and Ch = O, S, Se), and find dramatic changes from the spin-ordered phase to the proposed quantum spin liquid phase. Overall, with high reliability and accuracy, the TTN methodology offers a systematic strategy for studying the intricate dynamics observed across a broad spectrum of quantum magnets and related fields.

Submitted 20 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: 15 pages, 12 figures

arXiv:2403.10785 [pdf, other]

Emergent $D_8^{(1)}$ spectrum and topological soliton excitation in CoNb$_2$O$_6$

Authors: Ning Xi, Xiao Wang, Yunjing Gao, Yunfeng Jiang, Rong Yu, Jianda Wu

Abstract: Quantum integrability emerging near a quantum critical point (QCP) is manifested by an exotic excitation spectrum that is organized by the associated algebraic structure. A well-known example is the emergent $E_8$ integrability near the QCP of a transverse field Ising chain (TFIC), which was long predicted theoretically and initially proposed to be realized in the quasi-one-dimensional (q1D) quantum magnet CoNb$_2$O$_6$. However, later measurements on the spin excitation spectrum of this material revealed a series of satellite peaks that cannot be described by the $E_8$ Lie algebra. Motivated by this experimental progress, we hereby revisit the spin excitations of CoNb$_2$O$_6$ by combining numerical calculation and analytical analysis. We show that, as effects of strong interchain fluctuations, the spectrum of the system near the 1D QCP is characterized by the $D_{8}^{(1)}$ Lie algebra with robust topological soliton excitation. We further show that the $D_{8}^{(1)}$ spectrum can be realized in a broad class of interacting quantum systems. Our results advance the exploration of integrability and manipulation of topological excitations in quantum critical systems.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 6 pages, 3 figures - Supplementary Material 5 pages

arXiv:2403.01969 [pdf, other]

AS-ES Learning: Towards Efficient CoT Learning in Small Models

Authors: Nuwa Xi, Yuhan Chen, Sendong Zhao, Haochun Wang, Bing Qin, Ting Liu

Abstract: Chain-of-Thought (CoT) serves as a critical emerging ability in LLMs, especially when it comes to logical reasoning. Attempts have been made to induce such ability in small models as well by distilling from the data with CoT generated by Large Language Models (LLMs). However, existing methods often simply generate and incorporate more data from LLMs and fail to note the importance of efficiently utilizing existing CoT data. We here propose a new training paradigm AS-ES (Abstractive Segments - Extractive Segments) learning, which exploits the inherent information in CoT for iterative generation. Experiments show that our methods surpass the direct seq2seq training on CoT-extensive tasks like MWP and PET summarization, without data augmentation or altering the model itself. Furthermore, we explore the reason behind the inefficiency of small models in learning CoT and provide an explanation of why AS-ES learning works, giving insights into the underlying mechanism of CoT.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2402.11229 [pdf, other]

doi 10.1103/PhysRevB.111.L201117

Spin dynamics and dark particle in a weak-coupled quantum Ising ladder with $\mathcal{D}_8^{(1)}$ spectrum

Authors: Yunjing Gao, Xiao Wang, Ning Xi, Yunfeng Jiang, Rong Yu, Jianda Wu

Abstract: Emergent Ising$_h^2$ integrability is anticipated in a quantum Ising ladder composed of two weakly-coupled critical transverse field Ising chains. The system is remarkable for including eight types of massive relativistic particles, with their scattering matrix and mass spectrum characterized by the $\mathcal{D}_8^{(1)}$ Lie algebra. In this article, by computing the spin dynamical structure factors following the analytical form factor approach, we clearly identify dispersive single-particle excitations of (anti-) soliton and breathers as well as their multi-particle continua in the spectra, which is further confirmed by the numerical simulations. We show that the selection rule inherent in the parity and topological charge of the theory leads to a significant result: charge-parity-odd particles, termed dark particles, cannot be directly excited from the ground state through any local or quasi-local operations. This in turn suggests the long lifetime of the lightest dark particle.

Submitted 28 May, 2025; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: 5 pages, 4 figures - Supplementary Material 10 pages

Journal ref: Phys. Rev. B 111, L201117 (2025)

arXiv:2402.01349 [pdf, other]

LLMs May Perform MCQA by Selecting the Least Incorrect Option

Authors: Haochun Wang, Sendong Zhao, Zewen Qiang, Nuwa Xi, Bing Qin, Ting Liu

Abstract: In the field of NLP, Large Language Models (LLMs) have markedly enhanced performance across a variety of tasks. However, the comprehensive evaluation of LLMs remains an inevitable challenge for the community. Recently, the adoption of Multiple Choice Question Answering (MCQA) as a benchmark for assessing LLMs has gained considerable traction. However, concerns regarding the robustness of this evaluative method persist. Building upon previous discussions on the issue of variability, we reveal an additional dimension of concern: LLMs may perform MCQA by selecting the least incorrect option rather than a distinctly correct one. This observation suggests that LLMs might regard multiple options as correct, which could undermine the reliability of MCQA as a metric for evaluating LLMs. To address this challenge, we introduce an enhanced dataset augmentation method for MCQA, termed MCQA+, to provide a more accurate reflection of the model performance, thereby highlighting the necessity for more sophisticated evaluation mechanisms in the assessment of LLM capabilities.

Submitted 6 December, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: COLING 2025

arXiv:2401.16107 [pdf, other]

Beyond Direct Diagnosis: LLM-based Multi-Specialist Agent Consultation for Automatic Diagnosis

Authors: Haochun Wang, Sendong Zhao, Zewen Qiang, Nuwa Xi, Bing Qin, Ting Liu

Abstract: Automatic diagnosis is a significant application of AI in healthcare, where diagnoses are generated based on the symptom description of patients. Previous works have approached this task directly by modeling the relationship between the normalized symptoms and all possible diseases. However, in the clinical diagnostic process, patients are initially consulted by a general practitioner and, if necessary, referred to specialists in specific domains for a more comprehensive evaluation. The final diagnosis often emerges from a collaborative consultation among medical specialist groups. Recently, large language models have shown impressive capabilities in natural language understanding. In this study, we adopt tuning-free LLM-based agents as medical practitioners and propose the Agent-derived Multi-Specialist Consultation (AMSC) framework to model the diagnosis process in the real world by adaptively fusing probability distributions of agents over potential diseases. Experimental results demonstrate the superiority of our approach compared with baselines. Notably, our approach requires significantly less parameter updating and training time, enhancing efficiency and practical utility. Furthermore, we delve into a novel perspective on the role of implicit symptoms within the context of automatic diagnosis.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2312.15932 [pdf, other]

doi 10.1038/s41567-023-02212-2

Observation of a 1/3 Magnetisation Plateau Phase as Evidence for the Kitaev Interaction in a Honeycomb-Lattice Antiferromagnet

Authors: Yanyan Shangguan, Song Bao, Zhao-Yang Dong, Ning Xi, Yi-Peng Gao, Zhen Ma, Wei Wang, Zhongyuan Qi, Shuai Zhang, Zhentao Huang, Junbo Liao, Xiaoxue Zhao, Bo Zhang, Shufan Cheng, Hao Xu, Dehong Yu, Richard A. Mole, Naoki Murai, Seiko Ohira-Kawamura, Lunhua He, Jiazheng Hao, Qing-Bo Yan, Fengqi Song, Wei Li, Shun-Li Yu, et al. (2 additional authors not shown)

Abstract: Fractional magnetisation plateaus, in which the magnetisation is pinned at a fraction of its saturated value within a range of external magnetic field, are spectacular macroscopic manifestations of collective quantum behaviours. One prominent example of the plateau phase is found in spin-1/2 triangular-lattice antiferromagnets featuring strong geometrical frustration, and is often interpreted as a quantum-fluctuation-stabilised state in magnetic field via the "order-by-disorder" mechanism. Here, we observe an unprecedented 1/3 magnetisation plateau between 5.2 and 7.4 T at 2 K in a spin-1 antiferromagnet Na$_3$Ni$_2$BiO$_6$ with a honeycomb lattice, where conventionally no geometrical frustration is anticipated. By carrying out elastic neutron scattering measurements, we propose the spin structure of the plateau phase to be an unusual partial spin-flop ferrimagnetic order, transitioning from the zigzag antiferromagnetic order in zero field. Our theoretical calculations show that the plateau phase is stabilised by the bond-anisotropic Kitaev interaction. These results provide a new paradigm for the exploration of rich quantum phases in frustrated magnets and exotic Kitaev physics in high-spin systems.

Submitted 26 December, 2023; originally announced December 2023.

Comments: Submitted version, 10 pages, 5 figures. Final version has been published in Nature Physics

Journal ref: Nature Physics 19, 1883-1889 (2023)

arXiv:2309.05978 [pdf, ps, other]

doi 10.1007/s11432-023-3865-0

CToMP: A Cycle-task-oriented Memory Protection Scheme for Unmanned Systems

Authors: Chengyan Ma, Ning Xi, Di Lu, Yebo Feng, Jianfeng Ma

Abstract: Memory corruption attacks (MCAs) refer to malicious behaviors of system intruders that modify the contents of a memory location to disrupt the normal operation of computing systems, causing leakage of sensitive data or perturbations to ongoing processes. Unlike general-purpose systems, unmanned systems cannot deploy complete security protection schemes, due to their limitations in size, cost and performance. MCAs in unmanned systems are particularly difficult to defend against. Furthermore, MCAs have diverse and unpredictable attack interfaces in unmanned systems, severely impacting digital and physical sectors. In this paper, we first generalize, model and taxonomize MCAs found in unmanned systems currently, laying the foundation for designing a portable and general defense approach. According to different attack mechanisms, we found that MCAs are mainly categorized into two types--return2libc and return2shellcode. To tackle return2libc attacks, we model the erratic operation of unmanned systems with cycles and then propose a cycle-task-oriented memory protection (CToMP) approach to protect control flows from tampering. To defend against return2shellcode attacks, we introduce a secure process stack with a randomized memory address by leveraging the memory pool to prevent Shellcode from being executed. Moreover, we discuss the mechanism by which CToMP resists the ROP attack, a novel variant of return2libc attacks. Finally, we implement CToMP on CUAV V5+ with Ardupilot and Crazyflie. The evaluation and security analysis results demonstrate that the proposed approach CToMP is resilient to various MCAs in unmanned systems with low footprints and system overhead.

Submitted 12 September, 2023; originally announced September 2023.

Comments: This paper has been accepted by SCIENCE CHINA Information Sciences

arXiv:2309.05203 [pdf, other]

From Artificially Real to Real: Leveraging Pseudo Data from Large Language Models for Low-Resource Molecule Discovery

Authors: Yuhan Chen, Nuwa Xi, Yanrui Du, Haochun Wang, Jianyu Chen, Sendong Zhao, Bing Qin

Abstract: Molecule discovery serves as a cornerstone in numerous scientific domains, fueling the development of new materials and innovative drug designs. Recent developments of in-silico molecule discovery have highlighted the promising results of cross-modal techniques, which bridge molecular structures with their descriptive annotations. However, these cross-modal methods frequently encounter the issue of data scarcity, hampering their performance and application. In this paper, we address the low-resource challenge by utilizing artificially-real data generated by Large Language Models (LLMs). We first introduce a retrieval-based prompting strategy to construct high-quality pseudo data, then explore the optimal method to effectively leverage this pseudo data. Experiments show that using pseudo data for domain adaptation outperforms all existing methods, while also requiring a smaller model scale, reduced data size and lower training cost, highlighting its efficiency. Furthermore, our method shows a sustained improvement as the volume of pseudo data increases, revealing the great potential of pseudo data in advancing low-resource cross-modal molecule discovery. Our code and data are available at https://github.com/SCIR-HI/ArtificiallyR2R.

Submitted 5 March, 2024; v1 submitted 10 September, 2023; originally announced September 2023.

Comments: AAAI2024

arXiv:2309.04175 [pdf, other]

doi 10.1145/3686807

Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese

Authors: Haochun Wang, Sendong Zhao, Zewen Qiang, Zijian Li, Nuwa Xi, Yanrui Du, MuZhen Cai, Haoqiang Guo, Yuhan Chen, Haoming Xu, Bing Qin, Ting Liu

Abstract: Large Language Models (LLMs) have demonstrated remarkable success in diverse natural language processing (NLP) tasks in general domains. However, LLMs sometimes generate responses that hallucinate medical facts due to limited domain knowledge. Such shortcomings pose potential risks in the utilization of LLMs within medical contexts. To address this challenge, we propose knowledge-tuning, which leverages structured medical knowledge bases for the LLMs to grasp domain knowledge efficiently and facilitate reliable response generation. We also release cMedKnowQA, a Chinese medical knowledge question-answering dataset constructed from medical knowledge bases to assess the medical knowledge proficiency of LLMs. Experimental results show that LLMs knowledge-tuned with cMedKnowQA exhibit higher levels of accuracy in response generation compared with vanilla instruction-tuning, offering a new reliable way for the domain adaptation of LLMs.

Submitted 8 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 figures

arXiv:2309.04174 [pdf, other]

Manifold-based Verbalizer Space Re-embedding for Tuning-free Prompt-based Classification

Authors: Haochun Wang, Sendong Zhao, Chi Liu, Nuwa Xi, Muzhen Cai, Bing Qin, Ting Liu

Abstract: Prompt-based classification adapts tasks to a cloze question format utilizing the [MASK] token and the filled tokens are then mapped to labels through pre-defined verbalizers. Recent studies have explored the use of verbalizer embeddings to reduce labor in this process. However, all existing studies require a tuning process for either the pre-trained models or additional trainable embeddings. Meanwhile, the distance between high-dimensional verbalizer embeddings should not be measured by Euclidean distance due to the potential for non-linear manifolds in the representation space. In this study, we propose a tuning-free manifold-based space re-embedding method called Locally Linear Embedding with Intra-class Neighborhood Constraint (LLE-INC) for verbalizer embeddings, which preserves local properties within the same class as guidance for classification. Experimental results indicate that even without tuning any parameters, our LLE-INC is on par with automated verbalizers with parameter tuning. And with the parameter updating, our approach further enhances prompt-based tuning by up to 3.2%. Furthermore, experiments with the LLaMA-7B&13B indicate that LLE-INC is an efficient tuning-free classification approach for the hyper-scale language models.

Submitted 29 January, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Accepted by AAAI 2024, 11 pages, 3 figures

arXiv:2307.09769 [pdf, other]

Source-Free Domain Adaptation for Medical Image Segmentation via Prototype-Anchored Feature Alignment and Contrastive Learning

Authors: Qinji Yu, Nan Xi, Junsong Yuan, Ziyu Zhou, Kang Dang, Xiaowei Ding

Abstract: Unsupervised domain adaptation (UDA) has increasingly gained interest for its capacity to transfer the knowledge learned from a labeled source domain to an unlabeled target domain. However, typical UDA methods require concurrent access to both the source and target domain data, which largely limits their application in medical scenarios where source data is often unavailable due to privacy concerns. To tackle the source data-absent problem, we present a novel two-stage source-free domain adaptation (SFDA) framework for medical image segmentation, where only a well-trained source segmentation model and unlabeled target data are available during domain adaptation. Specifically, in the prototype-anchored feature alignment stage, we first utilize the weights of the pre-trained pixel-wise classifier as source prototypes, which preserve the information of source features. Then, we introduce the bi-directional transport to align the target features with class prototypes by minimizing its expected cost. On top of that, a contrastive learning stage is further devised to utilize those pixels with unreliable predictions for a more compact target feature distribution. Extensive experiments on a cross-modality medical segmentation task demonstrate the superiority of our method in large domain discrepancy settings compared with the state-of-the-art SFDA approaches and even some UDA methods. Code is available at https://github.com/CSCYQJ/MICCAI23-ProtoContra-SFDA.

Submitted 19 July, 2023; originally announced July 2023.

Comments: Accepted by MICCAI23

arXiv:2307.05355 [pdf, other]

UniCoRN: Unified Cognitive Signal ReconstructioN bridging cognitive signals and human language

Authors: Nuwa Xi, Sendong Zhao, Haochun Wang, Chi Liu, Bing Qin, Ting Liu

Abstract: Decoding text stimuli from cognitive signals (e.g. fMRI) enhances our understanding of the human language system, paving the way for building versatile Brain-Computer Interfaces. However, existing studies largely focus on decoding individual word-level fMRI volumes from a restricted vocabulary, which is far too idealized for real-world application. In this paper, we propose fMRI2text, the first open-vocabulary task aiming to bridge fMRI time series and human language. Furthermore, to explore the potential of this new task, we present a baseline solution, UniCoRN: the Unified Cognitive Signal ReconstructioN for Brain Decoding. By reconstructing both individual time points and time series, UniCoRN establishes a robust encoder for cognitive signals (fMRI & EEG). Leveraging a pre-trained language model as decoder, UniCoRN proves its efficacy in decoding coherent text from fMRI series across various split settings. Our model achieves a 34.77% BLEU score on fMRI2text, and a 37.04% BLEU when generalized to EEG-to-text decoding, thereby surpassing the former baseline. Experimental results indicate the feasibility of decoding consecutive fMRI volumes, and the effectiveness of decoding different cognitive signals using a unified structure.

Submitted 6 July, 2023; originally announced July 2023.

Comments: the 61st Annual Meeting of the Association for Computational Linguistics

arXiv:2306.16288 [pdf, other]

doi 10.1038/s41535-024-00636-4

Emergent criticality in fully frustrated quantum magnets

Authors: Yuchen Fan, Ning Xi, Changle Liu, Bruce Normand, Rong Yu

Abstract: Phase transitions in condensed matter are often linked to exotic emergent properties. We study the fully frustrated bilayer Heisenberg antiferromagnet to demonstrate that an applied magnetic field creates a novel emergent criticality. The quantum phase diagram contains four states, the DS (singlets on every interlayer dimer bond), DTAF (all triplets with antiferromagnetic order), TC (a singlet-triplet checkerboard) and FM (saturated ferromagnet). The thermal phase diagram is dominated by a wall of discontinuities extending from the zero-field DTAF-DS transition to a quantum critical endpoint where the field drives the DTAF and TC into the FM. This first-order wall is terminated at finite temperatures by a line of critical points, where the Berezinskii-Kosterlitz-Thouless (BKT) transition of the DTAF and the thermal Ising transition of the TC also terminate. We demonstrate by quantum Monte Carlo simulations that the BKT transition does not change the Ising nature of the DTAF-DS critical line. By contrast, the combination of symmetries merging on the multicritical DTAF-TC line leads to a 4-state Potts universality not contained in the microscopic Hamiltonian, which we associate with the Ashkin-Teller model. Our results represent a systematic step in understanding emergent phenomena in quantum magnetic materials including the "Shastry-Sutherland compound" SrCu$_2$(BO$_3$)$_2$.

Submitted 28 June, 2023; originally announced June 2023.

Comments: 10+8 pages, 5+7 figures

Journal ref: npj Quantum Materials 9, 25 (2024)

arXiv:2306.09563 [pdf, ps, other]

doi 10.1103/PhysRevB.107.L220401

Nature of the 1/9-magnetization plateau in the spin-1/2 kagome Heisenberg antiferromagnet

Authors: Da-zhi Fang, Ning Xi, Shi-Ju Ran, Gang Su

Abstract: The nature of the 1/9-magnetization plateau of the spin-1/2 kagome Heisenberg antiferromagnet remains controversial due to the exotic physical properties and high complexity induced by the geometrical frustration. Instead of a Z3 quantum spin liquid revealed on a cylinder, we show on an infinite-size lattice that the 1/9 plateau can be described by a valence bond crystal (VBC) that breaks spatial translational invariance. Consistent results are achieved by two accurate tensor network methods, namely the full-update infinite projected-entangled pair states and the projected-entangled simplex states. The VBC exhibits an hourglass pattern with $\sqrt{3}\times\sqrt{3}$ spatial symmetry, demonstrated by magnetizations, bond energies, and three-body correlators. The spatial inversion symmetry in the $\sqrt{3}\times\sqrt{3}$ VBC is instantly broken with the presence of the difference between the coupling strengths in the up and down triangles, suggesting the existence of gapless excitations. The gapless nature of the 1/9 plateau is further indicated by the scaling behaviors of the entanglement entropy and the correlation length, which indicate a c=1 conformal field theory.

Submitted 15 June, 2023; originally announced June 2023.

Comments: 6 pages, 5 figures, Supplemental Material at http://link.aps.org/supplemental/10.1103/PhysRevB.107.L220401

Journal ref: Phys. Rev. B 107, L220401 (2023)

arXiv:2304.06975 [pdf, other]

HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge

Authors: Haochun Wang, Chi Liu, Nuwa Xi, Zewen Qiang, Sendong Zhao, Bing Qin, Ting Liu

Abstract: Large Language Models (LLMs), such as the LLaMA model, have demonstrated their effectiveness in various general-domain natural language processing (NLP) tasks. Nevertheless, LLMs have not yet performed optimally in biomedical domain tasks due to the need for medical expertise in the responses. In response to this challenge, we propose HuaTuo, a LLaMA-based model that has been supervised-fine-tuned with generated QA (Question-Answer) instances. The experimental results demonstrate that HuaTuo generates responses that possess more reliable medical knowledge. Our proposed HuaTuo model is accessible at https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese.

Submitted 14 April, 2023; originally announced April 2023.

Comments: LLaMA-based Chinese Medical model - HuaTuo. Model, code and training data are available at https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese

arXiv:2304.05642 [pdf, other]

Global Prompt Cell: A Portable Control Module for Effective Prompt Tuning

Authors: Chi Liu, Haochun Wang, Nuwa Xi, Sendong Zhao, Bing Qin

Abstract: As a novel approach to tuning pre-trained models, prompt tuning involves freezing the parameters in downstream tasks while inserting trainable embeddings into inputs in the first layer. However, previous methods have mainly focused on the initialization of prompt embeddings. The strategy of training and utilizing prompt embeddings in a reasonable way has become a limiting factor in the effectiveness of prompt tuning. To address this issue, we introduce the Global Prompt Cell (GPC), a portable control module for prompt tuning that selectively preserves prompt information across all encoder layers. Our experimental results demonstrate a 5.8% improvement on SuperGLUE datasets compared to vanilla prompt tuning.

Submitted 13 May, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

arXiv:2302.06596 [pdf, other]

doi 10.1103/PhysRevLett.131.116702

Plaquette Singlet Transition, Magnetic Barocaloric Effect, and Spin Supersolidity in the Shastry-Sutherland Model

Authors: Junsen Wang, Han Li, Ning Xi, Yuan Gao, Qing-Bo Yan, Wei Li, Gang Su

Abstract: Inspired by recent experimental measurements [Guo \textit{et al.}, Phys. Rev. Lett.~\textbf{124}, 206602 (2020); Jiménez \textit{et al.}, Nature \textbf{592}, 370 (2021)] on frustrated quantum magnet SrCu$_2$(BO$_3$)$_2$ under combined pressure and magnetic fields, we study the related spin-$1/2$ Shastry-Sutherland (SS) model using state-of-the-art tensor network methods. By calculating thermodynamics, correlations and susceptibilities, we find, in zero magnetic field, not only a line of first-order plaquette-singlet (PS) to dimer-singlet phase transition ending with a critical point, but also signatures of the ordered PS transition with its critical endpoint terminating on this first-order line. Moreover, we uncover prominent magnetic barocaloric responses, a novel type of quantum correlation induced cooling effect, in the strongly fluctuating supercritical regime. Under finite fields, we identify a quantum phase transition from the PS phase to the spin supersolid phase that breaks simultaneously lattice translational and spin rotational symmetries. The present findings on the SS model are accessible in current experiments and would shed new light on exotic critical and supercritical phenomena in archetypal frustrated quantum magnets.

Submitted 14 September, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

Comments: Close to the published version. 7 pages, 4 figures (SM 9 pages, 12 figures)

Journal ref: Phys. Rev. Lett. 131, 116702 (2023)

arXiv:2301.03571 [pdf, other]

Dipolar Spin Liquid Ending with Quantum Critical Point in a Gd-based Triangular Magnet

Authors: Junsen Xiang, Cheng Su, Ning Xi, Zhendong Fu, Zhuo Chen, Hai Jin, Ziyu Chen, Zhao-Jun Mo, Yang Qi, Jun Shen, Long Zhang, Wentao Jin, Wei Li, Peijie Sun, Gang Su

Abstract: By performing experiment and model studies on a triangular-lattice dipolar magnet KBaGd(BO$_3$)$_2$ (KBGB), we find the highly frustrated magnet with a planar anisotropy hosts a strongly fluctuating dipolar spin liquid (DSL), which originates from the intriguing interplay between dipolar and Heisenberg interactions. The DSL constitutes an extended regime in the field-temperature phase diagram, which gets lowered in temperature as field increases and eventually ends with an unconventional quantum critical point (QCP) at $B_c\simeq 0.75$~T. Based on dipolar Heisenberg model calculations, we identify the DSL as a Berezinskii-Kosterlitz-Thouless (BKT) phase with emergent U(1) symmetry. Due to the tremendous entropy accumulation that can be related to the strong BKT and quantum fluctuations, unprecedented magnetic cooling effects are observed in the DSL regime and particularly near the QCP, making KBGB a superior dipolar coolant to commercial Gd-based refrigerants. We establish the phase diagram for triangular-lattice dipolar quantum magnets where emergent symmetry plays an essential role, and provide a basis and open an avenue for their applications in sub-Kelvin refrigeration.

Submitted 16 January, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

Comments: 7 pages and 4 figures in main text, 5 pages and 4 figures in Supplementary Materials

arXiv:2212.12114 [pdf]

Predicting Survival of Tongue Cancer Patients by Machine Learning Models

Authors: Angelos Vasilopoulos, Nan Miles Xi

Abstract: Tongue cancer is a common oral cavity malignancy that originates in the mouth and throat. Much effort has been invested in improving its diagnosis, treatment, and management. Surgical removal, chemotherapy, and radiation therapy remain the major treatment for tongue cancer. The survival of patients determines the treatment effect. Previous studies have identified certain survival and risk factors based on descriptive statistics, ignoring the complex, nonlinear relationship among clinical and demographic variables. In this study, we utilize five cutting-edge machine learning models and clinical data to predict the survival of tongue cancer patients after treatment. Five-fold cross-validation, bootstrap analysis, and permutation feature importance are applied to estimate and interpret model performance. The prognostic factors identified by our method are consistent with previous clinical studies. Our method is accurate, interpretable, and thus useable as additional evidence in tongue cancer treatment and management.

Submitted 22 December, 2022; originally announced December 2022.

arXiv:2211.00772 [pdf]

Tuning hyperparameters of doublet-detection methods for single-cell RNA sequencing data

Authors: Nan Miles Xi, Angelos Vasilopoulos

Abstract: The existence of doublets in single-cell RNA sequencing (scRNA-seq) data poses a great challenge in downstream data analysis. Computational doublet-detection methods have been developed to remove doublets from scRNA-seq data. Yet, the default hyperparameter settings of those methods may not provide optimal performance. Here, we propose a strategy to tune hyperparameters for a cutting-edge doublet-detection method. We utilize a full factorial design to explore the relationship between hyperparameters and detection accuracy on 16 real scRNA-seq datasets. The optimal hyperparameters are obtained by a response surface model and convex optimization. We show that the optimal hyperparameters provide top performance across scRNA-seq datasets under various biological conditions. Our tuning strategy can be applied to other computational doublet-detection methods. It also offers insights into hyperparameter tuning for broader computational methods in scRNA-seq data analysis.

Submitted 5 February, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

arXiv:2210.04151 [pdf]

Prediction of Drug-Induced TdP Risks Using Machine Learning and Rabbit Ventricular Wedge Assay

Authors: Jaela Foster-Burns, Nan Miles Xi

Abstract: Torsades de pointes (TdP) is an irregular heart rhythm as a side effect of drugs and may cause sudden cardiac death. A machine learning model that can accurately identify drug TdP risk is necessary. This study uses multinomial logistic regression models to predict three-class drug TdP risks based on datasets generated from rabbit ventricular wedge assay experiments. The training-test split and five-fold cross-validation provide unbiased measurements for prediction accuracy. We utilize bootstrap to construct a 95% confidence interval for prediction accuracy. The model interpretation is further demonstrated by permutation predictor importance. Our study offers an interpretable modeling method suitable for drug TdP risk prediction. Our method can be easily generalized to broader applications of drug side effect assessment.

Submitted 8 October, 2022; originally announced October 2022.

arXiv:2209.06453 [pdf, other]

Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words

Authors: Haochun Wang, Chi Liu, Nuwa Xi, Sendong Zhao, Meizhi Ju, Shiwei Zhang, Ziheng Zhang, Yefeng Zheng, Bing Qin, Ting Liu

Abstract: Prompt-based fine-tuning for pre-trained models has proven effective for many natural language processing tasks under few-shot settings in general domain. However, tuning with prompt in biomedical domain has not been investigated thoroughly. Biomedical words are often rare in general domain, but quite ubiquitous in biomedical contexts, which dramatically deteriorates the performance of pre-trained models on downstream biomedical applications even after fine-tuning, especially in low-resource scenarios. We propose a simple yet effective approach to helping models learn rare biomedical words during tuning with prompt. Experimental results show that our method can achieve up to 6% improvement in biomedical natural language inference task without any extra parameters or training steps using few-shot vanilla prompt settings.

Submitted 14 September, 2022; originally announced September 2022.

Comments: Accepted to COLING 2022

arXiv:2205.03238 [pdf]

Ultra-sensitive Flexible Sponge-Sensor Array for Muscle Activities Detection and Human Limb Motion Recognition

Authors: Jiao Suo, Yifan Liu, Clio Cheng, Keer Wang, Meng Chen, Ho-yin Chan, Roy Vellaisamy, Ning Xi, Vivian W. Q. Lou, Wen Jung Li

Abstract: Human limb motion tracking and recognition plays an important role in medical rehabilitation training, lower limb assistance, prosthetics design for amputees, feedback control for assistive robots, etc. Lightweight wearable sensors, including inertial sensors, surface electromyography sensors, and flexible strain/pressure sensors, are promising to become the next-generation human motion capture devices. Herein, we present a wireless wearable device consisting of a sixteen-channel flexible sponge-based pressure sensor array to recognize various human lower limb motions by detecting contours on the human skin caused by calf gastrocnemius muscle actions. Each sensing element is a round porous structure of thin carbon nanotube/polydimethylsiloxane nanocomposites with a diameter of 4 mm and thickness of about 400 μm. Ten human subjects were recruited to perform ten different lower limb motions while wearing the developed device. The motion classification result with the support vector machine method shows a macro-recall of about 97.3% for all ten motions tested. This work demonstrates a portable wearable muscle activity detection device with a lower limb motion recognition application, which can be potentially used in assistive robot control, healthcare, sports monitoring, etc.

Submitted 29 June, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

Comments: 17 pages, 6 figures

arXiv:2204.08133 [pdf, other]

doi 10.1126/science.adc9487

Proximate deconfined quantum critical point in SrCu2(BO3)2

Authors: Yi Cui, Lu Liu, Huihang Lin, Kai-Hsin Wu, Wenshan Hong, Xuefei Liu, Cong Li, Ze Hu, Ning Xi, Shiliang Li, Rong Yu, Anders W. Sandvik, Weiqiang Yu

Abstract: The deconfined quantum critical point (DQCP) represents a paradigm shift in quantum matter studies, presenting a "beyond Landau" scenario for order--order transitions. Its experimental realization, however, has remained elusive. Using high-pressure $^{11}$B NMR measurements on the quantum magnet SrCu$_2$(BO$_3$)$_2$, we here demonstrate a magnetic-field induced plaquette-singlet to antiferromagnetic transition above 1.8 GPa at a remarkably low temperature, $T_{\rm c}\simeq 0.07$ K. First-order signatures of the transition weaken with increasing pressure, and we observe quantum critical scaling at the highest pressure, 2.4 GPa. Supported by model calculations, we suggest that these observations can be explained by a proximate DQCP inducing critical quantum fluctuations and emergent O(3) symmetry of the order parameters. Our findings take the DQCP from a theoretical concept to a concrete experimental platform.

Submitted 18 October, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

Journal ref: Science 380, 1179 (2023)

arXiv:2203.15804 [pdf]

Improving The Diagnosis of Thyroid Cancer by Machine Learning and Clinical Data

Authors: Nan Miles Xi, Lin Wang, Chuanjia Yang

Abstract: Thyroid cancer is a common endocrine carcinoma that occurs in the thyroid gland. Much effort has been invested in improving its diagnosis, and thyroidectomy remains the primary treatment method. A successful operation without unnecessary side injuries relies on an accurate preoperative diagnosis. Current human assessment of thyroid nodule malignancy is prone to errors and may not guarantee an accurate preoperative diagnosis. This study proposed a machine learning framework to predict thyroid nodule malignancy based on a novel clinical dataset we collected. The 10-fold cross-validation, bootstrap analysis, and permutation predictor importance were applied to estimate and interpret the model performance under uncertainty. The comparison between model prediction and expert assessment shows the advantage of our framework over human judgment in predicting thyroid nodule malignancy. Our method is accurate, interpretable, and thus useable as additional evidence in the preoperative diagnosis for thyroid cancer.

Submitted 27 March, 2022; originally announced March 2022.

arXiv:2202.13375 [pdf, other]

doi 10.1088/1674-1056/ac5987

Dynamical signatures of the one-dimensional deconfined quantum critical point

Authors: Ning Xi, Rong Yu

Abstract: We study the critical scaling and dynamical signatures of fractionalized excitations at two different deconfined quantum critical points (DQCPs) in an $S = 1/2$ spin chain by using the time evolution of infinite matrix product states. The scaling of the correlation functions and the dispersion of the conserved current correlations explicitly show the emergence of enhanced continuous symmetries at these DQCPs. The dynamical structure factors in several different channels reveal the development of deconfined fractionalized excitations at the DQCPs. Furthermore, we find an effective spin-charge separation at the DQCP between the ferromagnetic (FM) and valence bond solid (VBS) phases, and identify two continua associated to different types of fractionalized excitations at the DQCP between the $X$-direction and $Z$-direction FM phases. Our findings not only provide direct evidence for the DQCP in one dimension but also shed light on exploring the DQCP in higher dimensions.

Submitted 27 February, 2022; originally announced February 2022.

arXiv:2202.12543 [pdf, other]

doi 10.1088/1751-8121/ac7181

Emergent O(4) symmetry at an one-dimensional deconfined quantum tricritical point

Authors: Ning Xi, Rong Yu

Abstract: We show an $\rm O(4)$ symmetry emerges at a deconfined quantum tricritical point of a valence bond solid and two ferromagnetic phases in an $S = 1/2$ frustrated spin chain by combining analytical analysis and numerical calculations with the time evolution of infinite matrix product states. With this symmetry, the valence-bond solid and the three magnetic order parameters form an $\rm O(4)$ pseudovector in the infrared limit, and can continuously rotate into each other. We numerically determine the location of the quantum tricritical point and study the scaling of the correlation functions of the $\rm O(4)$ vector components and associated conserved currents. The critical behaviors of these correlation functions are all in accord with field theoretical results. The emergent $\rm O(4)$ symmetry at the tricritical point is justified by the integer value of the scaling dimension of the emergent Noether conserved currents. Our findings not only give direct evidence of such a high emergent symmetry at a one-dimensional valence bond solid to magnetic transition but also shed light on exploring emergent symmetries in higher dimensions.

Submitted 25 February, 2022; originally announced February 2022.

Comments: submitted to: Journal of Physics A: Mathematical and Theoretical

arXiv:2202.00302 [pdf, ps, other]

The based rings of two-sided cells in an affine Weyl group of type $\tilde B_3$, II

Authors: Yannan Qiu, Nanhua Xi

Abstract: We compute the based rings of two-sided cells corresponding to the unipotent classes in $Sp_6(\mathbb C)$ with Jordan blocks (33), (411), (222) respectively. The results for the first two two-sided cells also verify Lusztig's conjecture on the structure of the based rings of two-sided cells of an affine Weyl group. The result for the last two-sided cell partially suggests a modification of Lusztig's conjecture on the structure of the based rings of two-sided cells of an affine Weyl group.

Submitted 1 February, 2022; originally announced February 2022.

arXiv:2201.05669 [pdf]

Prediction of Drug-Induced TdP Risks Using Machine Learning and Rabbit Ventricular Wedge Assay

Authors: Nan Miles Xi, Dalong Patrick Huang

Abstract: The evaluation of drug-induced Torsades de pointes (TdP) risks is crucial in drug safety assessment. In this study, we discuss machine learning approaches in the prediction of drug-induced TdP risks using preclinical data. Specifically, the random forest model was trained on the dataset generated by the rabbit ventricular wedge assay. The model prediction performance was measured on 28 drugs from the Comprehensive In Vitro Proarrhythmia Assay initiative. Leave-one-drug-out cross-validation provided an unbiased estimation of model performance. Stratified bootstrap revealed the uncertainty in the asymptotic model prediction. Our study validated the utility of machine learning approaches in predicting drug-induced TdP risks from preclinical data. Our methods can be extended to other preclinical protocols and serve as a supplementary evaluation in drug safety assessment.

Submitted 14 January, 2022; originally announced January 2022.

Comments: arXiv admin note: text overlap with arXiv:2108.00543

arXiv:2111.07368 [pdf, other]

doi 10.1103/PhysRevB.107.L220408

First-order transition between the plaquette valence bond solid and antiferromagnetic phases of the Shastry-Sutherland model

Authors: Ning Xi, Hongyu Chen, Z. Y. Xie, Rong Yu

Abstract: We study the ground state phase diagram of the Shastry-Sutherland model by using the variational optimization of the infinite tensor network states, and find a weakly first-order transition between the plaquette and the antiferromagnetic states. The full plaquette state strongly competes with the empty plaquette ground state, with an energy difference less than $10^{-4}J$. We show a staggered ring exchange interaction that preserves the Shastry-Sutherland lattice symmetry can stabilize the full plaquette ground state. In light of this, we propose the triple point where the full plaquette, empty plaquette, and antiferromagnetic phases meet as a deconfined quantum critical point.

Submitted 14 November, 2021; originally announced November 2021.

Comments: 5.5+2 pages, 4+3 figures

Journal ref: Phys. Rev. B 107, L220408 (2023)

arXiv:2108.00543 [pdf]

Statistical Learning in Preclinical Drug Proarrhythmic Assessment

Authors: Nan Miles Xi, Yu-Yi Hsu, Qianyu Dang, Dalong Patrick Huang

Abstract: Torsades de pointes (TdP) is an irregular heart rhythm characterized by faster beat rates and potentially could lead to sudden cardiac death. Much effort has been invested in understanding the drug-induced TdP in preclinical studies. However, a comprehensive statistical learning framework that can accurately predict the drug-induced TdP risk from preclinical data is still lacking. We proposed ordinal logistic regression and ordinal random forest models to predict low-, intermediate-, and high-risk drugs based on datasets generated from two experimental protocols. Leave-one-drug-out cross-validation, stratified bootstrap, and permutation predictor importance were applied to estimate and interpret the model performance under uncertainty. The potential outlier drugs identified by our models are consistent with their descriptions in the literature. Our method is accurate, interpretable, and thus useable as supplemental evidence in the drug safety assessment.

Submitted 7 January, 2022; v1 submitted 1 August, 2021; originally announced August 2021.

arXiv:2107.08983 [pdf, ps, other]

The based rings of two-sided cells in an affine weyl group of type $\tilde B_3$, I

Authors: Yannan Qiu, Nanhua Xi

Abstract: For type $\tilde B_3$ we show that Lusztig's conjecture on the structure of the based ring of the two-sided cell corresponding to the unipotent class in $Sp_6(\mathbb C)$ with 3 equal Jordan blocks needs to be modified.

Submitted 8 November, 2021; v1 submitted 19 July, 2021; originally announced July 2021.

Showing 1–50 of 76 results for author: Xi, N