-
The Ramsey number of the 4-cycle versus a book graph
Authors:
Chunyang Dou,
Tianyu Li,
Qizhong Lin,
Xing Peng
Abstract:
Given positive integers $n$ and $k$, the book graph $B_n^{(k)}$ consists of $n$ copies of $K_{k+1}$ sharing a common $K_k$. The book graph is a common generalization of a star and a clique, which can be seen by taking $k=1$ and $n=1$ respectively. In addition, the Ramsey number of a book graph is closely related to the diagonal Ramsey number. Thus the study of extremal problems related to the book graph is of substantial significance. In this paper, we aim to investigate the Ramsey number $r(C_4,B_n^{(k)})$, which is the smallest integer $N$ such that for any graph $G$ on $N$ vertices, either $G$ contains $C_4$ as a subgraph or the complement $\overline{G}$ contains $B_n^{(k)}$ as a subgraph. For $k=1$, pioneering work by Parsons ({\it Trans.~Amer.~Math.~Soc.,} 209 (1975), 33--44) gives an upper bound for $r(C_4,B_n^{(1)})$, which is tight for infinitely many $n$. For $k=2$, in a recent paper ({\em J. Graph Theory,} 103 (2023), 309--322), the second, the third, and the fourth authors obtained the exact value of $r(C_4,B_{n}^{(2)})$ for infinitely many $n$. The goal of this paper is to prove a similar result for each integer $k \geq 3$. To be precise, given an integer $k \geq 3$ and a constant $0<\varepsilon<1$, let $n=q^2-kq+t+\binom{k}{2}-k$ and $Q(k,\varepsilon)=(320k^4)^{k+1}/\varepsilon^{2k}$, where $1 \leq t \leq (1-\varepsilon)q$. We first establish an upper bound for $r(C_4,B_n^{(k)})$ provided $q \geq Q(k,\varepsilon)$. Then we show the upper bound is tight for $q \geq Q(k,\varepsilon)$ being a prime power and $1 \leq t \leq (1-\varepsilon)q$ under some assumptions. The proof leverages a simple but novel refinement of a well-known inequality related to a $C_4$-free graph. Therefore, for each $k \geq 3$, we obtain the exact value of $r(C_4,B_n^{(k)})$ for infinitely many $n$. Moreover, we prove general upper and lower bounds of $r(C_4,B_n^{(k)})$ for $k \geq 3$.
Submitted 12 June, 2025;
originally announced June 2025.
-
Towards Structure-aware Model for Multi-modal Knowledge Graph Completion
Authors:
Linyu Li,
Zhi Jin,
Yichi Zhang,
Dongming Jin,
Chengfeng Dou,
Yuanpeng He,
Xuan Zhang,
Haiyan Zhao
Abstract:
Knowledge graphs (KGs) play a key role in promoting various multimedia and AI applications. However, with the explosive growth of multi-modal information, traditional knowledge graph completion (KGC) models cannot be directly applied. This has attracted a large number of researchers to study multi-modal knowledge graph completion (MMKGC). Since MMKG extends KG to the visual and textual domains, MMKGC faces two main challenges: (1) how to deal with the fine-grained modality information interaction and awareness; (2) how to ensure the dominant role of graph structure in multi-modal knowledge fusion and deal with the noise generated by other modalities during modality fusion. To address these challenges, this paper proposes a novel MMKGC model named TSAM, which integrates fine-grained modality interaction and dominant graph structure to form a high-performance MMKGC framework. Specifically, to solve the challenges, TSAM proposes the Fine-grained Modality Awareness Fusion method (FgMAF), which uses pre-trained language models to better capture fine-grained semantic information interaction of different modalities and employs an attention mechanism to achieve fine-grained modality awareness and fusion. Additionally, TSAM presents the Structure-aware Contrastive Learning method (SaCL), which utilizes two contrastive learning approaches to align other modalities more closely with the structured modality. Extensive experiments show that the proposed TSAM model significantly outperforms existing MMKGC models on widely used multi-modal datasets.
Submitted 28 May, 2025;
originally announced May 2025.
-
PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation
Authors:
Zhengwei Tao,
Zhi Jin,
Bincheng Li,
Xiaoying Bai,
Haiyan Zhao,
Chengfeng Dou,
Xiancai Chen,
Jia Li,
Linyu Li,
Chongyang Tao
Abstract:
Predicting future events stands as one of the ultimate aspirations of artificial intelligence. Recent advances in large language model (LLM)-based systems have shown remarkable potential in forecasting future events, thereby garnering significant interest in the research community. Currently, several benchmarks have been established to evaluate forecasting capabilities by formalizing event prediction as a retrieval-augmented generation (RAG) and reasoning task. In these benchmarks, each prediction question is answered with relevant retrieved news articles. However, because there is no consideration of whether the questions can be supported by valid or sufficient rationales, some of the questions in these benchmarks may be inherently noninferable. To address this issue, we introduce a new benchmark, PROPHET, which comprises inferable forecasting questions paired with relevant news for retrieval. To ensure the inferability of the benchmark, we propose Causal Intervened Likelihood (CIL), a statistical measure that assesses inferability through causal inference. In constructing this benchmark, we first collected recent trend forecasting questions and then filtered the data using CIL, resulting in an inferable benchmark for event prediction. Through extensive experiments, we first demonstrate the validity of CIL and conduct in-depth investigations into event prediction with the aid of CIL. Subsequently, we evaluate several representative prediction systems on PROPHET, drawing valuable insights for future directions.
Submitted 2 April, 2025;
originally announced April 2025.
-
Enhancing LLM Generation with Knowledge Hypergraph for Evidence-Based Medicine
Authors:
Chengfeng Dou,
Ying Zhang,
Zhi Jin,
Wenpin Jiao,
Haiyan Zhao,
Yongqiang Zhao,
Zhengwei Tao
Abstract:
Evidence-based medicine (EBM) plays a crucial role in the application of large language models (LLMs) in healthcare, as it provides reliable support for medical decision-making processes. Although it benefits from current retrieval-augmented generation~(RAG) technologies, it still faces two significant challenges: the collection of dispersed evidence and the efficient organization of this evidence to support the complex queries necessary for EBM. To tackle these issues, we propose using LLMs to gather scattered evidence from multiple sources and present a knowledge hypergraph-based evidence management model to integrate this evidence while capturing intricate relationships. Furthermore, to better support complex queries, we have developed an Importance-Driven Evidence Prioritization (IDEP) algorithm that utilizes the LLM to generate multiple evidence features, each with an associated importance score, which are then used to rank the evidence and produce the final retrieval results. Experimental results from six datasets demonstrate that our approach outperforms existing RAG techniques in application domains of interest to EBM, such as medical quizzing, hallucination detection, and decision support. Test sets and the constructed knowledge graph can be accessed at \href{https://drive.google.com/file/d/1WJ9QTokK3MdkjEmwuFQxwH96j_Byawj_/view?usp=drive_link}{https://drive.google.com/rag4ebm}.
Submitted 18 March, 2025;
originally announced March 2025.
-
Turán numbers of cycles plus a general graph
Authors:
Chunyang Dou,
Fu-tao Hu,
Xing Peng
Abstract:
For a family of graphs $\cal F$, a graph $G$ is $\cal F$-free if it does not contain a member of $\cal F$ as a subgraph. The Turán number $\textrm{ex}(n,{\cal F})$ is the maximum number of edges in an $n$-vertex graph which is $\cal F$-free. Let ${\cal C}_{\geq k}$ be the set of cycles with length at least $k$. In this paper, we investigate the Turán number of $\{{\cal C}_{\geq k}, F\}$ for a general graph $F$. To be precise, we determine $\textrm{ex}(n, \{{\cal C}_{\geq k}, F\})$ apart from a constant additive term, where $F$ either is a 2-connected nonbipartite graph or is a 2-connected bipartite graph under some conditions. This is an extension of a previous result on the Turán number of $\{{\cal C}_{\geq k}, K_r\}$ by the first author, Ning, and the third author.
Submitted 12 December, 2024; v1 submitted 26 November, 2024;
originally announced November 2024.
-
The number of edges in graphs with bounded clique number and circumference
Authors:
Chunyang Dou,
Bo Ning,
Xing Peng
Abstract:
Let $\cal H$ be a family of graphs. The Turán number ${\rm ex}(n,{\cal H})$ is the maximum possible number of edges in an $n$-vertex graph which does not contain any member of $\cal H$ as a subgraph. As a common generalization of Turán's theorem and Erdős-Gallai theorem on the Turán number of matchings, Alon and Frankl determined ${\rm ex}(n,{\cal H})$ for ${\cal H}=\{K_r,M_k\}$, where $M_k$ is a matching of size $k$. Replacing $M_k$ by $P_k$, Katona and Xiao obtained the Turán number of ${\cal H}=\{K_r,P_k\}$ for $r \leq \lfloor k/2 \rfloor$ and sufficiently large $n$. In addition, they proposed a conjecture for the case of $r \geq \lfloor k/2 \rfloor+1$ and sufficiently large $n$. Motivated by the fact that the result for ${\rm ex}(n,P_k)$ can be deduced from the one for ${\rm ex}(n,{\cal C}_{\geq k})$, we investigate the Turán number of ${\cal H}=\{K_r, {\cal C}_{\geq k}\}$ in this paper. In other words, we aim to determine the maximum number of edges in graphs with clique number at most $r-1$ and circumference at most $k-1$. For ${\cal H}=\{K_r, {\cal C}_{\geq k}\}$, we are able to show the value of ${\rm ex}(n,{\cal H})$ for $r \geq \lfloor (k-1)/2\rfloor+2$ and all $n$. As an application of this result, we confirm Katona and Xiao's conjecture in a stronger form. For $r \leq \lfloor (k-1)/2\rfloor+1$, we manage to show the value of ${\rm ex}(n,{\cal H})$ for sufficiently large $n$.
Submitted 12 December, 2024; v1 submitted 8 October, 2024;
originally announced October 2024.
-
Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment
Authors:
Chengfeng Dou,
Ying Zhang,
Zhi Jin,
Wenpin Jiao,
Haiyan Zhao,
Yongqiang Zhao,
Zhengwei Tao
Abstract:
This research examines the use of Reinforcement Learning from AI Feedback (RLAIF) techniques to improve healthcare dialogue models, with the aim of tackling the challenges of preference-aligned data annotation while reducing the reliance on medical experts. We argue that the primary challenges in current RLAIF research for healthcare are the limitations of automated evaluation methods and the difficulties in accurately representing physician preferences. To address these challenges, we present a new evaluation framework based on standardized patient examinations. This framework is designed to objectively assess the effectiveness of large language models (LLMs) in guiding users and following instructions, enabling a comprehensive comparison across different models. Furthermore, our investigation of effective ways to express physician preferences using Constitutional AI algorithms highlighted the particular effectiveness of flowcharts. Utilizing this finding, we introduce an innovative agent-based approach for annotating preference data. This approach autonomously creates medical dialogue flows tailored to the patient's condition, demonstrates strong generalization abilities, and reduces the need for expert involvement. Our results show that the agent-based approach outperforms existing RLAIF annotation methods in standardized patient examinations and surpasses current open source medical dialogue LLMs in various test scenarios.
Submitted 5 October, 2024;
originally announced October 2024.
-
Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Authors:
Sriram Yenamandra,
Arun Ramachandran,
Mukul Khanna,
Karmesh Yadav,
Jay Vakil,
Andrew Melnik,
Michael Büttner,
Leon Harz,
Lyon Brown,
Gora Chand Nandi,
Arjun PS,
Gaurav Kumar Yadav,
Rahul Kala,
Robert Haschke,
Yang Luo,
Jinxin Zhu,
Yansen Han,
Bingyi Lu,
Xuan Gu,
Qinyuan Liu,
Yaping Zhao,
Qiting Ye,
Chenxiao Dou,
Yansong Chua,
Volodymyr Kuzma
, et al. (20 additional authors not shown)
Abstract:
In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments. To this end, we proposed Open Vocabulary Mobile Manipulation as a key benchmark task for robotics: finding any object in a novel environment and placing it on any receptacle surface within that environment. We organized a NeurIPS 2023 competition featuring both simulation and real-world components to evaluate solutions to this task. Our baselines on the most challenging version of this task, using real perception in simulation, achieved only a 0.8% success rate; by the end of the competition, the best participants achieved a 10.8% success rate, a 13x improvement. We observed that the most successful teams employed a variety of methods, yet two common threads emerged among the best solutions: enhancing error detection and recovery, and improving the integration of perception with decision-making processes. In this paper, we detail the results and methodologies used, both in simulation and real-world settings. We discuss the lessons learned and their implications for future research. Additionally, we compare performance in real and simulated environments, emphasizing the necessity for robust generalization to novel settings.
Submitted 9 July, 2024;
originally announced July 2024.
-
Integrating Physician Diagnostic Logic into Large Language Models: Preference Learning from Process Feedback
Authors:
Chengfeng Dou,
Zhi Jin,
Wenpin Jiao,
Haiyan Zhao,
Yongqiang Zhao,
Zhenwei Tao
Abstract:
The use of large language models in medical dialogue generation has garnered significant attention, with a focus on improving response quality and fluency. While previous studies have made progress in optimizing model performance for single-round medical Q&A tasks, there is a need to enhance the model's capability for multi-round conversations to avoid logical inconsistencies. To address this, we propose an approach called preference learning from process feedback~(PLPF), which integrates the doctor's diagnostic logic into LLMs. PLPF involves rule modeling, preference data generation, and preference alignment to train the model to adhere to the diagnostic process. Experimental results using Standardized Patient Testing show that PLPF enhances the diagnostic accuracy of the baseline model in medical conversations by 17.6%, outperforming traditional reinforcement learning from human feedback. Additionally, PLPF demonstrates effectiveness in both multi-round and single-round dialogue tasks, showcasing its potential for improving medical dialogue generation.
Submitted 2 August, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Enhancing the Spatial Awareness Capability of Multi-Modal Large Language Model
Authors:
Yongqiang Zhao,
Zhenyu Li,
Zhi Jin,
Feng Zhang,
Haiyan Zhao,
Chengfeng Dou,
Zhengwei Tao,
Xinhai Xu,
Donghong Liu
Abstract:
The Multi-Modal Large Language Model (MLLM) refers to an extension of the Large Language Model (LLM) equipped with the capability to receive and infer multi-modal data. Spatial awareness stands as one of the crucial abilities of MLLM, encompassing diverse skills related to understanding spatial relationships among objects and between objects and the scene area. Industries such as autonomous driving, smart healthcare, robotics, and virtual and augmented reality heavily demand MLLM's spatial awareness capabilities. However, there exists a noticeable gap between the current spatial awareness capabilities of MLLM and the requirements set by human needs. To address this issue, this paper proposes using more precise spatial position information between objects to guide MLLM in providing more accurate responses to user-related inquiries. Specifically, for a particular multi-modal task, we utilize algorithms for acquiring geometric spatial information and scene graphs to obtain relevant geometric spatial information and scene details of objects involved in the query. Subsequently, based on this information, we direct MLLM to address spatial awareness-related queries posed by the user. Extensive experiments were conducted on benchmarks such as MME and MM-Vet, and with other multi-modal large language models. The experimental results thoroughly confirm the efficacy of the proposed method in enhancing the spatial awareness tasks and associated tasks of MLLM.
Submitted 31 October, 2023; v1 submitted 31 October, 2023;
originally announced October 2023.
-
PlugMed: Improving Specificity in Patient-Centered Medical Dialogue Generation using In-Context Learning
Authors:
Chengfeng Dou,
Zhi Jin,
Wenping Jiao,
Haiyan Zhao,
Zhenwei Tao,
Yongqiang Zhao
Abstract:
Patient-centered medical dialogue systems strive to offer diagnostic interpretation services to users with limited medical knowledge, emphasizing the importance of providing responses specific to each patient. It is difficult for large language models (LLMs) to guarantee the specificity of responses, in spite of their promising performance even in some tasks in the medical field. Inspired by in-context learning, we propose PlugMed, a Plug-and-Play Medical Dialogue System, to address this challenge. PlugMed is equipped with two modules, the prompt generation (PG) module and the response ranking (RR) module, to enhance LLMs' dialogue strategies and improve the specificity of the dialogue. The PG module is designed to stimulate the imitative ability of LLMs by providing them with real dialogues from similar patients as prompts. The RR module incorporates a fine-tuned small model as a response filter to enable the selection of appropriate responses generated by LLMs. Furthermore, we introduce a new evaluation method based on matching both the user's intent and high-frequency medical terms to effectively assess the specificity of the responses. We conduct experimental evaluations on three medical dialogue datasets, and the results, including both automatic and human evaluation, demonstrate the effectiveness of our approach.
Submitted 18 October, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Binary stochasticity enabled highly efficient neuromorphic deep learning achieves better-than-software accuracy
Authors:
Yang Li,
Wei Wang,
Ming Wang,
Chunmeng Dou,
Zhengyu Ma,
Huihui Zhou,
Peng Zhang,
Nicola Lepri,
Xumeng Zhang,
Qing Luo,
Xiaoxin Xu,
Guanhua Yang,
Feng Zhang,
Ling Li,
Daniele Ielmini,
Ming Liu
Abstract:
Deep learning needs high-precision handling of forwarding signals, backpropagating errors, and updating weights. This is inherently required by the learning algorithm since the gradient descent learning rule relies on the chain product of partial derivatives. However, such high-precision learning is challenging to implement in hardware systems that use noisy analog memristors as artificial synapses, and it is not biologically plausible. Memristor-based implementations generally result in an excessive cost of neuronal circuits and stringent demands for idealized synaptic devices. Here, we demonstrate that the requirement for high precision is not necessary and that more efficient deep learning can be achieved when this requirement is lifted. We propose a binary stochastic learning algorithm that modifies all elementary neural network operations, by introducing (i) stochastic binarization of both the forwarding signals and the activation function derivatives, (ii) signed binarization of the backpropagating errors, and (iii) step-wise weight updates. Through an extensive hybrid approach of software simulation and hardware experiments, we find that binary stochastic deep learning systems can provide better performance than the software-based benchmarks using the high-precision learning algorithm. Also, the binary stochastic algorithm strongly simplifies the neural network operations in hardware, resulting in an improvement of the energy efficiency for the multiply-and-accumulate operations by more than three orders of magnitude.
Submitted 25 April, 2023;
originally announced April 2023.
-
SeSQL: Yet Another Large-scale Session-level Chinese Text-to-SQL Dataset
Authors:
Saihao Huang,
Lijie Wang,
Zhenghua Li,
Zeyang Liu,
Chenhui Dou,
Fukang Yan,
Xinyan Xiao,
Hua Wu,
Min Zhang
Abstract:
As the first session-level Chinese dataset, CHASE contains two separate parts, i.e., 2,003 sessions manually constructed from scratch (CHASE-C), and 3,456 sessions translated from English SParC (CHASE-T). We find the two parts are highly discrepant and incompatible as training and evaluation data. In this work, we present SeSQL, yet another large-scale session-level text-to-SQL dataset in Chinese, consisting of 5,028 sessions all manually constructed from scratch. In order to guarantee data quality, we adopt an iterative annotation workflow to facilitate intense and in-time review of previous-round natural language (NL) questions and SQL queries. Moreover, by completing all context-dependent NL questions, we obtain 27,012 context-independent question/SQL pairs, allowing SeSQL to be used as the largest dataset for single-round multi-DB text-to-SQL parsing. We conduct benchmark session-level text-to-SQL parsing experiments on SeSQL by employing three competitive session-level parsers, and present detailed analysis.
Submitted 26 August, 2022;
originally announced August 2022.
-
BEIKE NLP at SemEval-2022 Task 4: Prompt-Based Paragraph Classification for Patronizing and Condescending Language Detection
Authors:
Yong Deng,
Chenxiao Dou,
Liangyu Chen,
Deqiang Miao,
Xianghui Sun,
Baochang Ma,
Xiangang Li
Abstract:
The PCL detection task aims to identify and categorize language that is patronizing or condescending towards vulnerable communities in the general media. Compared to other NLP tasks of paragraph classification, the negative language presented in the PCL detection task is usually more implicit and subtle, and therefore harder to recognize, making the performance of common text-classification approaches disappointing. Targeting the PCL detection problem in SemEval-2022 Task 4, in this paper, we give an introduction to our team's solution, which exploits the power of prompt-based learning on paragraph classification. We reformulate the task as an appropriate cloze prompt and use pre-trained Masked Language Models to fill the cloze slot. For the two subtasks, binary classification and multi-label classification, a DeBERTa model is adopted and fine-tuned to predict masked label words of task-specific prompts. On the evaluation dataset, for binary classification, our approach achieves an F1-score of 0.6406; for multi-label classification, our approach achieves a macro-F1-score of 0.4689 and ranks first on the leaderboard.
Submitted 2 August, 2022;
originally announced August 2022.
-
To Answer or Not to Answer? Improving Machine Reading Comprehension Model with Span-based Contrastive Learning
Authors:
Yunjie Ji,
Liangyu Chen,
Chenxiao Dou,
Baochang Ma,
Xiangang Li
Abstract:
Machine Reading Comprehension with Unanswerable Questions is a difficult NLP task, challenged by questions that cannot be answered from passages. It is observed that subtle literal changes often make an answerable question unanswerable; however, most MRC models fail to recognize such changes. To address this problem, in this paper, we propose a span-based method of Contrastive Learning (spanCL) which explicitly contrasts answerable questions with their answerable and unanswerable counterparts at the answer span level. With spanCL, MRC models are forced to perceive crucial semantic changes from slight literal differences. Experiments on the SQuAD 2.0 dataset show that spanCL can improve baselines significantly, yielding 0.86-2.14 absolute EM improvements. Additional experiments also show that spanCL is an effective way to utilize generated questions.
Submitted 2 August, 2022;
originally announced August 2022.
-
Function-words Enhanced Attention Networks for Few-Shot Inverse Relation Classification
Authors:
Chunliu Dou,
Shaojuan Wu,
Xiaowang Zhang,
Zhiyong Feng,
Kewen Wang
Abstract:
Relation classification aims to identify semantic relations between two entities in a given text. While existing models perform well for classifying inverse relations with large datasets, their performance is significantly reduced for few-shot learning. In this paper, we propose a function words adaptively enhanced attention framework (FAEA) for few-shot inverse relation classification, in which a hybrid attention model is designed to attend to class-related function words based on meta-learning. As the involvement of function words brings in significant intra-class redundancy, an adaptive message passing mechanism is introduced to capture and transfer inter-class differences. We mathematically analyze the negative impact of function words from dot-product measurement, which explains why the message passing mechanism effectively reduces the impact. Our experimental results show that FAEA outperforms strong baselines; in particular, the inverse relation accuracy is improved by 14.33% under the 1-shot setting in FewRel1.0.
Submitted 26 April, 2022;
originally announced April 2022.
-
Improvement of Printing Quality for Laser-induced Forward Transfer based Laser-Assisted Bioprinting Process using a CFD-based numerical model
Authors:
Jie Qu,
Chaoran Dou,
Ben Xu,
Jianzhi Li,
Zhonghao Rao,
Andrew Tsin
Abstract:
As one of the three-dimensional (3D) bioprinting techniques with great application potential, laser-induced-forward-transfer (LIFT) based laser assisted bioprinting (LAB) transfers the bioink through a developed jet flow, and the printing quality highly depends on the stability of jet flow regime. To understand the connection between the jet flow and printing outcomes, a Computational Fluid Dynamic (CFD) model was developed for the first time to accurately describe the jet flow regime and provide a guidance for optimal printing process planning. By adopting the printing parameters recommended by the CFD model, the printing quality was greatly improved by forming stable jet regime and organized printing patterns on the substrate, and the size of printed droplet can also be accurately predicted through a static equilibrium model. The ultimate goal of this research is to direct the LIFT-based LAB process and eventually improve the quality of bioprinting.
Submitted 16 March, 2021;
originally announced March 2021.
-
Graphene Overcoats for Ultra-High Storage Density Magnetic Media
Authors:
N. Dwivedi,
A. K. Ott,
K. Sasikumar,
C. Dou,
R. J. Yeo,
B. Narayanan,
U. Sassi,
D. De Fazio,
G. Soavi,
T. Dutta,
S. K. R. S. Sankaranarayanan,
A. C. Ferrari,
C. S. Bhatia
Abstract:
Hard disk drives (HDDs) are used as secondary storage in a number of digital electronic devices owing to low cost ($<$0.1\$/GB at 2016 prices) and large data storage capacity (10TB with a 3.5 inch HDD). Due to the exponentially increasing amount of data, there is a need to increase areal storage densities beyond $\sim$1Tb/in$^2$. This requires the thickness of carbon overcoats (COCs) to be $<$2nm. Friction, wear, corrosion, and thermal stability are critical concerns $<$2nm, where most of the protective properties of current COCs are lost. This limits current technology and restricts COC integration with heat assisted magnetic recording technology (HAMR), since this also requires laser irradiation stability. Here we show that graphene-based overcoats can overcome all these limitations. 2-4 layers of graphene enable two-fold reduction in friction and provide better corrosion and wear than state-of-the-art COCs. A single graphene layer is enough to reduce corrosion $\sim$2.5 times. We also show that graphene can withstand HAMR conditions. Thus, graphene-based overcoats can enable ultrahigh areal density HDDs $>$10Tb/in$^2$.
Submitted 2 June, 2019;
originally announced June 2019.
-
A New Upper Bound for the Largest Growth Rate of Linear Rayleigh--Taylor Instability
Authors:
Changsheng Dou,
Jialiang Wang,
Weiwei Wang
Abstract:
We investigate the effect of surface tension on the linear Rayleigh--Taylor (RT) instability in stratified incompressible viscous fluids with or without (interface) surface tension. The existence of linear RT instability solutions with largest growth rate $Λ$ is proved under the instability condition (i.e., the surface tension coefficient $\vartheta$ is less than a threshold $\vartheta_{\rm c}$) by modified variational method of PDEs. Moreover we find a new upper bound for $Λ$. In particular, we observe from the upper bound that $Λ$ decreasingly converges to zero, as $\vartheta$ goes from zero to the threshold $\vartheta_{\rm c}$. The convergence behavior of $Λ$ mathematically verifies the classical RT instability experiment that the instability growth is limited by surface tension during the linear stage.
Submitted 20 February, 2021; v1 submitted 30 January, 2019;
originally announced January 2019.
-
Tetrahedral amorphous carbon resistive memories with graphene-based electrodes
Authors:
A. K. Ott,
C. Dou,
U. Sassi,
I. Goykhman,
D. Yoon,
J. Wu,
A. Lombardo,
A. C. Ferrari
Abstract:
Resistive-switching memories are alternative to Si-based ones, which face scaling and high power consumption issues. Tetrahedral amorphous carbon (ta-C) shows reversible, non-volatile resistive switching. Here we report polarity independent ta-C resistive memory devices with graphene-based electrodes. Our devices show ON/OFF resistance ratios $\sim 4\times 10^5$, ten times higher than with metal electrodes, with no increase in switching power, and low power density $\sim$14 $μ$W/$μ$m$^2$. We attribute this to a suppressed tunneling current due to the low density of states of graphene near the Dirac point, consistent with the current-voltage characteristics derived from a quantum point contact model. Our devices also have multiple resistive states. This allows storing more than one bit per cell. This can be exploited in a range of signal processing/computing-type operations, such as implementing logic, providing synaptic and neuron-like mimics, and performing analogue signal processing in non-von-Neumann architectures.
Submitted 5 May, 2018;
originally announced May 2018.
-
Global weak solutions to 3D compressible Primitive equations with density-dependent viscosity
Authors:
Fengchao Wang,
Changsheng Dou,
Quansen Jiu
Abstract:
This paper is devoted to investigating the global existence of weak solutions for the compressible primitive equations (CPE) with damping term in a three-dimensional torus for large initial data. The system takes into account density-dependent viscosity. In our proof, we represent the vertical velocity as a function of the density and the horizontal velocity, which allows us to use the Faedo-Galerkin method to obtain the global existence of the approximate solutions. Motivated by Vasseur and Yu [yucheng2016], we obtain the key estimates of the lower bound of the density and the Bresch-Desjardins entropy on the approximate solutions. Based on these estimates, using compactness arguments, we prove the global existence of weak solutions of CPE by vanishing the parameters in our approximate system step by step.
Submitted 12 December, 2017;
originally announced December 2017.
-
Existence of strong solutions to the steady Navier-Stokes equations for a compressible heat-conductive fluid with large forces
Authors:
Changsheng Dou,
Fei Jiang,
Song Jiang,
Yong-Fu Yang
Abstract:
We prove that there exists a strong solution to the Dirichlet boundary value problem for the steady Navier-Stokes equations of a compressible heat-conductive fluid with large external forces in a bounded domain $R^d (d = 2, 3)$, provided that the Mach number is appropriately small. At the same time, the low Mach number limit is rigorously verified. The basic idea in the proof is to split the equations into two parts, one of which is similar to the steady incompressible Navier-Stokes equations with large forces, while another part corresponds to the steady compressible heat-conductive Navier-Stokes equations with small forces. The existence is then established by dealing with these two parts separately, establishing uniform in the Mach number a priori estimates and exploiting the known results on the steady incompressible Navier-Stokes equations.
Submitted 7 August, 2014; v1 submitted 27 February, 2013;
originally announced February 2013.
-
Weak-strong uniqueness property for the compressible flow of liquid crystals
Authors:
Yong-Fu Yang,
Changsheng Dou,
Qiangchang Ju
Abstract:
Weak-strong uniqueness property in the class of finite energy weak solutions is established for two different compressible liquid crystal systems by the method of relative entropy. To overcome the difficulties caused by the molecular direction with inhomogeneous Dirichlet boundary condition, new techniques are introduced to build up the relative entropy inequalities.
Submitted 6 May, 2012; v1 submitted 30 March, 2012;
originally announced March 2012.
-
On One-dimensional Compressible Navier-Stokes Equations with Degenerate Viscosity and Constant State at Far fields
Authors:
Changsheng Dou,
Quansen Jiu
Abstract:
In this paper, we are concerned with the Cauchy problem for one-dimensional compressible isentropic Navier-Stokes equations with density-dependent viscosity $μ(ρ)=ρ^α(α>0)$ and pressure $P(ρ)=ρ^γ\ (γ>1)$. We will establish the global existence and asymptotic behavior of weak solutions for any $α>0$ and $γ>1$ under the assumption that the density function keeps a constant state at far fields. This enlarges the ranges of $α$ and $γ$ and improves the previous results presented by Jiu and Xin. As a result, in the case that $0<α<\frac12$, we obtain the large time behavior of the strong solution obtained by Mellet and Vasseur when the solution has a lower bound (no vacuum).
Submitted 19 March, 2012;
originally announced March 2012.
-
Local well-posedness and blow up criterion for the Inviscid Boussinesq system in Hölder spaces
Authors:
Xiaona Cui,
Changsheng Dou,
Quansen Jiu
Abstract:
We prove the local in time existence and a blow up criterion of solution in the Hölder spaces for the inviscid Boussinesq system in $R^{N},N\geq2$, under the assumptions that the initial values $θ_{0},u_{0}\in C^{r}$, with $r>1$.
Submitted 21 September, 2010;
originally announced September 2010.