-
SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model
Authors:
Yifan Chang,
Yukang Feng,
Jianwen Sun,
Jiaxin Ai,
Chuanhao Li,
S. Kevin Zhou,
Kaipeng Zhang
Abstract:
Recent years have seen rapid advances in AI-driven image generation. Early diffusion models emphasized perceptual quality, while newer multimodal models like GPT-4o-image integrate high-level reasoning, improving semantic understanding and structural composition. Scientific illustration generation exemplifies this evolution: unlike general image synthesis, it demands accurate interpretation of technical content and transformation of abstract ideas into clear, standardized visuals. This task is significantly more knowledge-intensive and laborious, often requiring hours of manual work and specialized tools. Automating it in a controllable, intelligent manner would provide substantial practical value. Yet, no benchmark currently exists to evaluate AI on this front. To fill this gap, we introduce SridBench, the first benchmark for scientific figure generation. It comprises 1,120 instances curated from leading scientific papers across 13 natural and computer science disciplines, collected via human experts and MLLMs. Each sample is evaluated along six dimensions, including semantic fidelity and structural accuracy. Experimental results reveal that even top-tier models like GPT-4o-image lag behind human performance, with common issues in text/visual clarity and scientific correctness. These findings highlight the need for more advanced reasoning-driven visual generation capabilities.
Submitted 28 May, 2025;
originally announced May 2025.
-
PIV-FlowDiffuser:Transfer-learning-based denoising diffusion models for PIV
Authors:
Qianyu Zhu,
Junjie Wang,
Jeremiah Hu,
Jia Ai,
Yong Lee
Abstract:
Deep learning algorithms have significantly reduced the computational time and improved the spatial resolution of particle image velocimetry (PIV). However, models trained on synthetic datasets may perform worse on practical particle images due to domain gaps. As a result, special residual patterns are often observed in the vector fields of deep learning-based estimators. To reduce this special noise step by step, we employ a denoising diffusion model (FlowDiffuser) for PIV analysis. The data-hungry iterative denoising diffusion model is trained via a transfer learning strategy, resulting in our PIV-FlowDiffuser method. Specifically, we (1) pre-train a FlowDiffuser model on multiple optical flow datasets from the computer vision community, such as Sintel and KITTI, and (2) fine-tune the pre-trained model on synthetic PIV datasets. Note that the PIV images are upsampled by a factor of two to resolve the small-scale turbulent flow structures. The visualized results indicate that our PIV-FlowDiffuser effectively suppresses the noise patterns. As a result, the denoising diffusion model reduces the average end-point error (AEE) by 59.4% relative to the RAFT256-PIV baseline on Cai's classic dataset. In addition, PIV-FlowDiffuser exhibits enhanced generalization performance on unseen particle images due to transfer learning. Overall, this study highlights transfer-learning-based denoising diffusion models for PIV. Interested readers are referred to the repository https://github.com/Zhu-Qianyu/PIV-FlowDiffuser for a detailed implementation.
Submitted 21 April, 2025;
originally announced April 2025.
-
Agent That Debugs: Dynamic State-Guided Vulnerability Repair
Authors:
Zhengyao Liu,
Yunlong Ma,
Jingxuan Xu,
Junchen Ai,
Xiang Gao,
Hailong Sun,
Abhik Roychoudhury
Abstract:
In recent years, more vulnerabilities have been discovered every day, while manual vulnerability repair requires specialized knowledge and is time-consuming. As a result, many detected or even published vulnerabilities remain unpatched, thereby increasing the exposure of software systems to attacks. Recent advancements in agents based on Large Language Models have demonstrated their increasing capabilities in code understanding and generation, which can be promising to achieve automated vulnerability repair. However, the effectiveness of agents based on static information retrieval is still not sufficient for patch generation. To address the challenge, we propose a program repair agent called VulDebugger that fully utilizes both static and dynamic context, and it debugs programs in a manner akin to humans. The agent inspects the actual state of the program via the debugger and infers expected states via constraints that need to be satisfied. By continuously comparing the actual state with the expected state, it deeply understands the root causes of the vulnerabilities and ultimately accomplishes repairs. We experimentally evaluated VulDebugger on 50 real-life projects. With 60.00% successfully fixed, VulDebugger significantly outperforms state-of-the-art approaches for vulnerability repair.
Submitted 10 April, 2025;
originally announced April 2025.
-
InstructMPC: A Human-LLM-in-the-Loop Framework for Context-Aware Control
Authors:
Ruixiang Wu,
Jiahao Ai,
Tongxin Li
Abstract:
Model Predictive Control (MPC) is a powerful control strategy widely utilized in domains like energy management, building control, and autonomous systems. However, its effectiveness in real-world settings is challenged by the need to incorporate context-specific predictions and expert instructions, which traditional MPC often neglects. We propose InstructMPC, a novel framework that addresses this gap by integrating real-time human instructions through a Large Language Model (LLM) to produce context-aware predictions for MPC. Our method employs a Language-to-Distribution (L2D) module to translate contextual information into predictive disturbance trajectories, which are then incorporated into the MPC optimization. Unlike existing context-aware and language-based MPC models, InstructMPC enables dynamic human-LLM interaction and fine-tunes the L2D module in a closed loop with theoretical performance guarantees, achieving a regret bound of $O(\sqrt{T\log T})$ for linear dynamics when optimized via advanced fine-tuning methods such as Direct Preference Optimization (DPO) using a tailored loss function.
Submitted 14 April, 2025; v1 submitted 8 April, 2025;
originally announced April 2025.
-
MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models
Authors:
Pengfei Zhou,
Fanrui Zhang,
Xiaopeng Peng,
Zhaopan Xu,
Jiaxin Ai,
Yansheng Qiu,
Chuanhao Li,
Zhen Li,
Ming Li,
Yukang Feng,
Jianwen Sun,
Haoquan Zhang,
Zizhen Li,
Xiaofeng Mao,
Wangbo Zhao,
Kai Wang,
Xiaojun Chang,
Wenqi Shao,
Yang You,
Kaipeng Zhang
Abstract:
Multimodal reasoning, which integrates language and visual cues into problem solving and decision making, is a fundamental aspect of human intelligence and a crucial step toward artificial general intelligence. However, the evaluation of multimodal reasoning capabilities in Multimodal Large Language Models (MLLMs) remains inadequate. Most existing reasoning benchmarks are constrained by limited data size, narrow domain coverage, and unstructured knowledge distribution. To close these gaps, we introduce MDK12-Bench, a multi-disciplinary benchmark assessing the reasoning capabilities of MLLMs via real-world K-12 examinations. Spanning six disciplines (math, physics, chemistry, biology, geography, and information science), our benchmark comprises 140K reasoning instances across diverse difficulty levels from primary school to 12th grade. It features 6,827 instance-level knowledge point annotations based on a well-organized knowledge structure, detailed answer explanations, difficulty labels and cross-year partitions, providing a robust platform for comprehensive evaluation. Additionally, we present a novel dynamic evaluation framework to mitigate data contamination issues by bootstrapping question forms, question types, and image styles during evaluation. Extensive experiments on MDK12-Bench reveal significant limitations of current MLLMs in multimodal reasoning. The findings on our benchmark provide insights into the development of next-generation models. Our data and code are available at https://github.com/LanceZPF/MDK12.
Submitted 8 April, 2025;
originally announced April 2025.
-
Concentration inequalities for the sum in sampling without replacement: an approach via majorization
Authors:
Jianhang Ai,
Ondřej Kuželka,
Christos Pelekis
Abstract:
Let $P=(x_1,\ldots,x_n)$ be a population consisting of $n\ge 2$ real numbers whose sum is zero, and let $k <n$ be a positive integer. We sample $k$ elements from $P$ without replacement and denote by $X_P$ the sum of the elements in our sample. In this article, using ideas from the theory of majorization, we deduce non-asymptotic lower and upper bounds on the probability that $X_P$ exceeds its expected value.
Submitted 26 March, 2025;
originally announced March 2025.
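The quantity studied in the sampling-without-replacement entry above can be computed exactly for small populations by brute-force enumeration. The following is only an illustrative sketch, not the authors' method: the function name `prob_sum_exceeds_mean` and the example population are hypothetical, and the paper's majorization-based bounds are not reproduced here.

```python
from fractions import Fraction
from itertools import combinations


def prob_sum_exceeds_mean(population, k):
    """Exact P(X_P > E[X_P]) when k elements are drawn uniformly without replacement.

    Hypothetical helper for illustration; it enumerates all k-subsets, so it is
    only practical for small populations.
    """
    n = len(population)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    total = sum(population)
    samples = list(combinations(range(n), k))
    # E[X_P] = k * total / n, so X_P > E[X_P] is equivalent to n * X_P > k * total.
    hits = sum(
        1 for idx in samples
        if n * sum(population[i] for i in idx) > k * total
    )
    return Fraction(hits, len(samples))


# Example population with zero sum (an assumption chosen for illustration).
P = (-3, 1, 1, 1)
for k in (1, 2, 3):
    print(k, prob_sum_exceeds_mean(P, k))
```

For a zero-sum population the expected value of the sample sum is zero, so the function simply counts the samples with a positive sum.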
-
On high discrepancy $1$-factorizations of complete graphs
Authors:
Jiangdong Ai,
Fankang He,
Seonghyuk Im,
Hyunwoo Lee
Abstract:
We proved that for every sufficiently large $n$, the complete graph $K_{2n}$ with an arbitrary edge signing $σ: E(K_{2n}) \to \{-1, +1\}$ admits a high discrepancy $1$-factor decomposition. That is, there exists a universal constant $c > 0$ such that every edge-signed $K_{2n}$ has a perfect matching decomposition $\{ψ_1, \ldots, ψ_{2n-1}\}$, where for each perfect matching $ψ_i$, the discrepancy $\lvert \frac{1}{n} \sum_{e\in E(ψ_i)} σ(e) \rvert$ is at least $c$.
Submitted 21 March, 2025;
originally announced March 2025.
-
PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models
Authors:
Zhaopan Xu,
Pengfei Zhou,
Weidong Tang,
Jiaxin Ai,
Wangbo Zhao,
Xiaojiang Peng,
Kai Wang,
Yang You,
Wenqi Shao,
Hongxun Yao,
Kaipeng Zhang
Abstract:
In recent years, Multimodal Large Language Models (MLLMs) have demonstrated remarkable advancements in tasks such as visual question answering, visual understanding, and reasoning. However, this impressive progress relies on vast amounts of data collected from the internet, raising significant concerns about privacy and security. To address these issues, machine unlearning (MU) has emerged as a promising solution, enabling the removal of specific knowledge from an already trained model without requiring retraining from scratch. Although MU for MLLMs has gained attention, current evaluations of its efficacy remain incomplete, and the underlying problem is often poorly defined, which hinders the development of strategies for creating more secure and trustworthy systems. To bridge this gap, we introduce a benchmark, named PEBench, which includes a dataset of personal entities and corresponding general event scenes, designed to comprehensively assess the performance of MU for MLLMs. Through PEBench, we aim to provide a standardized and robust framework to advance research in secure and privacy-preserving multimodal models. We benchmarked 6 MU methods, revealing their strengths and limitations, and shedding light on key challenges and opportunities for MU in MLLMs.
Submitted 16 March, 2025;
originally announced March 2025.
-
MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification
Authors:
Zhaopan Xu,
Pengfei Zhou,
Jiaxin Ai,
Wangbo Zhao,
Kai Wang,
Xiaojiang Peng,
Wenqi Shao,
Hongxun Yao,
Kaipeng Zhang
Abstract:
Reasoning is an essential capacity for large language models (LLMs) to address complex tasks, where the identification of process errors is vital for improving this ability. Recently, process-level reward models (PRMs) were proposed to provide step-wise rewards that facilitate reinforcement learning and data production during training and guide LLMs toward correct steps during inference, thereby improving reasoning accuracy. However, existing benchmarks of PRMs are text-based and focus on error detection, neglecting other scenarios like reasoning search. To address this gap, we introduce MPBench, a comprehensive, multi-task, multimodal benchmark designed to systematically assess the effectiveness of PRMs in diverse scenarios. MPBench employs three evaluation paradigms, each targeting a specific role of PRMs in the reasoning process: (1) Step Correctness, which assesses the correctness of each intermediate reasoning step; (2) Answer Aggregation, which aggregates multiple solutions and selects the best one; and (3) Reasoning Process Search, which guides the search for optimal reasoning steps during inference. Through these paradigms, MPBench makes comprehensive evaluations and provides insights into the development of multimodal PRMs.
Submitted 16 March, 2025;
originally announced March 2025.
-
ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges
Authors:
Jiaxin Ai,
Pengfei Zhou,
Zhaopan Xu,
Ming Li,
Fanrui Zhang,
Zizhen Li,
Jianwen Sun,
Yukang Feng,
Baojin Huang,
Zhongyuan Wang,
Kaipeng Zhang
Abstract:
As multi-modal large language models (MLLMs) frequently exhibit errors when solving scientific problems, evaluating the validity of their reasoning processes is critical for ensuring reliability and uncovering fine-grained model weaknesses. Since human evaluation is laborious and costly, prompting MLLMs as automated process judges has become a common practice. However, the reliability of these model-based judges remains uncertain. To address this, we introduce ProJudgeBench, the first comprehensive benchmark specifically designed for evaluating abilities of MLLM-based process judges. ProJudgeBench comprises 2,400 test cases and 50,118 step-level labels, spanning four scientific disciplines with diverse difficulty levels and multi-modal content. In ProJudgeBench, each step is meticulously annotated by human experts for correctness, error type, and explanation, enabling a systematic evaluation of judges' capabilities to detect, classify and diagnose errors. Evaluation on ProJudgeBench reveals a significant performance gap between open-source and proprietary models. To bridge this gap, we further propose ProJudge-173k, a large-scale instruction-tuning dataset, and a Dynamic Dual-Phase fine-tuning strategy that encourages models to explicitly reason through problem-solving before assessing solutions. Both contributions significantly enhance the process evaluation capabilities of open-source models. All the resources will be released to foster future research of reliable multi-modal process evaluation.
Submitted 9 March, 2025;
originally announced March 2025.
-
ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy
Authors:
Jianwen Sun,
Yukang Feng,
Chuanhao Li,
Fanrui Zhang,
Zizhen Li,
Jiaxin Ai,
Sizhuo Zhou,
Yu Dai,
Shenglin Zhang,
Kaipeng Zhang
Abstract:
Unified models (UniMs) for multimodal understanding and generation have recently received much attention in the area of vision and language. Existing UniMs are designed to simultaneously learn both multimodal understanding and generation capabilities, demanding substantial computational resources, and often struggle to generate interleaved text-image. We present ARMOR, a resource-efficient and pure autoregressive framework that achieves both understanding and generation by fine-tuning existing multimodal large language models (MLLMs). Specifically, ARMOR extends existing MLLMs from three perspectives: (1) For model architecture, an asymmetric encoder-decoder architecture with a forward-switching mechanism is introduced to unify embedding space integrating textual and visual modalities for enabling natural text-image interleaved generation with minimal computational overhead. (2) For training data, a meticulously curated, high-quality interleaved dataset is collected for fine-tuning MLLMs. (3) For the training algorithm, we propose a "what or how to generate" algorithm to empower existing MLLMs with multimodal generation capabilities while preserving their multimodal understanding capabilities, through three progressive training stages based on the collected dataset. Experimental results demonstrate that ARMOR upgrades existing MLLMs to UniMs with promising image generation capabilities, using limited training resources. Our code will be released soon at https://armor.github.io.
Submitted 9 March, 2025;
originally announced March 2025.
-
Strong Ramsey game on two boards
Authors:
Jiangdong Ai,
Jun Gao,
Zixiang Xu,
Xin Yan
Abstract:
The strong Ramsey game $R(\mathcal{B}, H)$ is a two-player game played on a graph $\mathcal{B}$, referred to as the board, with a target graph $H$. In this game, two players, $P_1$ and $P_2$, alternately claim unclaimed edges of $\mathcal{B}$, starting with $P_1$. The goal is to claim a subgraph isomorphic to $H$, with the first player achieving this declared the winner. A fundamental open question, persisting for over three decades, asks whether there exists a graph $H$ such that in the game $R(K_n, H)$, $P_1$ does not have a winning strategy in a bounded number of moves as $n \to \infty$.
In this paper, we shift the focus to the variant $R(K_n \sqcup K_n, H)$, introduced by David, Hartarsky, and Tiba, where the board $K_n \sqcup K_n$ consists of two disjoint copies of $K_n$. We prove that there exist infinitely many graphs $H$ such that $P_1$ cannot win in $R(K_n \sqcup K_n, H)$ within a bounded number of moves through a concise proof. This perhaps provides evidence for the existence of examples to the above longstanding open problem.
Submitted 27 January, 2025; v1 submitted 12 January, 2025;
originally announced January 2025.
-
Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection
Authors:
Jikang Cheng,
Zhiyuan Yan,
Ying Zhang,
Li Hao,
Jiaxin Ai,
Qin Zou,
Chen Li,
Zhongyuan Wang
Abstract:
The rapid advancement of face forgery techniques has introduced a growing variety of forgeries. Incremental Face Forgery Detection (IFFD), involving gradually adding new forgery data to fine-tune the previously trained model, has been introduced as a promising strategy to deal with evolving forgery methods. However, a naively trained IFFD model is prone to catastrophic forgetting when new forgeries are integrated, as treating all forgeries as a single "Fake" class in the Real/Fake classification can cause different forgery types to override one another, thereby resulting in the forgetting of unique characteristics from earlier tasks and limiting the model's effectiveness in learning forgery specificity and generality. In this paper, we propose to stack the latent feature distributions of previous and new tasks brick by brick, $\textit{i.e.}$, achieving $\textbf{aligned feature isolation}$. In this manner, we aim to preserve learned forgery information and accumulate new knowledge by minimizing distribution overriding, thereby mitigating catastrophic forgetting. To achieve this, we first introduce Sparse Uniform Replay (SUR) to obtain the representative subsets that could be treated as the uniformly sparse versions of the previous global distributions. We then propose a Latent-space Incremental Detector (LID) that leverages SUR data to isolate and align distributions. For evaluation, we construct a more advanced and comprehensive benchmark tailored for IFFD. The leading experimental results validate the superiority of our method.
Submitted 28 March, 2025; v1 submitted 18 November, 2024;
originally announced November 2024.
-
Arc-disjoint in- and out-branchings in semicomplete split digraphs
Authors:
Jiangdong Ai,
Yiming Hao,
Zhaoxiang Li,
Qi Shao
Abstract:
An \emph{out-tree (in-tree)} is an oriented tree where every vertex except one, called the \emph{root}, has in-degree (out-degree) one. An \emph{out-branching $B^+_u$ (in-branching $B^-_u$)} of a digraph $D$ is a spanning out-tree (in-tree) rooted at $u$. A \emph{good $(u,v)$-pair} in $D$ is a pair of branchings $B^+_u, B^-_v$ which are arc-disjoint. Thomassen proved that deciding whether a digraph has any good pair is NP-complete. A \emph{semicomplete split digraph} is a digraph where the vertex set is the disjoint union of two non-empty sets, $V_1$ and $V_2$, such that $V_1$ is an independent set, the subdigraph induced by $V_2$ is semicomplete, and every vertex in $V_1$ is adjacent to every vertex in $V_2$. In this paper, we prove that every $2$-arc-strong semicomplete split digraph $D$ contains a good $(u, v)$-pair for any choice of vertices $u, v$ of $D$, thereby confirming a conjecture by Bang-Jensen and Wang [Bang-Jensen and Wang, J. Graph Theory, 2024].
Submitted 16 October, 2024;
originally announced October 2024.
-
Spanning weakly even trees of graphs
Authors:
Jiangdong Ai,
M. N. Ellingham,
Zhipeng Gao,
Yixuan Huang,
Xiangzhou Liu,
Songling Shan,
Simon Špacapan,
Jun Yue
Abstract:
Let $G$ be a graph (with multiple edges allowed) and let $T$ be a tree in $G$. We say that $T$ is $\textit{even}$ if every leaf of $T$ belongs to the same part of the bipartition of $T$, and that $T$ is $\textit{weakly even}$ if every leaf of $T$ that has maximum degree in $G$ belongs to the same part of the bipartition of $T$. We confirm two recent conjectures of Jackson and Yoshimoto by showing that every connected graph that is not a regular bipartite graph has a spanning weakly even tree.
Submitted 17 October, 2024; v1 submitted 23 September, 2024;
originally announced September 2024.
-
A short note on spanning even trees
Authors:
Jiangdong Ai,
Zhipeng Gao,
Xiangzhou Liu,
Jun Yue
Abstract:
We call a tree $T$ \emph{even} if every pair of its leaves is joined by a path of even length. Jackson and Yoshimoto [J. Graph Theory, 2024] conjectured that every $r$-regular nonbipartite connected graph $G$ has a spanning even tree. They verified this conjecture for the case when $G$ has a $2$-factor. In this paper, we prove that the conjecture holds when $r$ is odd, thereby resolving the only remaining unsolved case for this conjecture.
Submitted 10 September, 2024; v1 submitted 13 August, 2024;
originally announced August 2024.
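The definition of an even tree in the entry above (every pair of leaves joined by a path of even length) is straightforward to check on a concrete tree. A minimal sketch follows, assuming the tree is given as an adjacency dictionary; the function name `is_even_tree` and the input format are illustrative choices, not taken from the paper. Since paths in a tree are unique, it suffices to measure distances from one fixed leaf: all leaves are pairwise at even distance exactly when each of them is at even distance from that leaf.

```python
from collections import deque


def is_even_tree(adj):
    """Return True if every pair of leaves of the tree is joined by an even-length path.

    `adj` maps each vertex to a list of its neighbours and is assumed to
    describe a tree (connected, acyclic); this is not verified here.
    """
    leaves = [v for v, nbrs in adj.items() if len(nbrs) == 1]
    if not leaves:
        return True  # single-vertex tree: there is no pair of leaves to check
    root = leaves[0]
    depth = {root: 0}
    queue = deque([root])
    # Breadth-first search from one leaf records the distance to every vertex.
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in depth:
                depth[w] = depth[v] + 1
                queue.append(w)
    return all(depth[leaf] % 2 == 0 for leaf in leaves)


# A star with three leaves: any two leaves are at distance 2.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
# A path on four vertices: its two leaves are at distance 3.
path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(is_even_tree(star), is_even_tree(path4))
```

This is equivalent to the bipartition formulation used in the "Spanning weakly even trees of graphs" entry: a tree is even exactly when all of its leaves lie in the same part of its bipartition.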
-
IDRetracor: Towards Visual Forensics Against Malicious Face Swapping
Authors:
Jikang Cheng,
Jiaxin Ai,
Zhen Han,
Chao Liang,
Qin Zou,
Zhongyuan Wang,
Qian Wang
Abstract:
The face swapping technique based on deepfake methods poses significant social risks to personal identity security. While numerous deepfake detection methods have been proposed as countermeasures against malicious face swapping, they can only output binary labels (Fake/Real) for distinguishing fake content without reliable and traceable evidence. To achieve visual forensics and target face attribution, we propose a novel task named face retracing, which considers retracing the original target face from the given fake one via inverse mapping. Toward this goal, we propose an IDRetracor that can retrace arbitrary original target identities from fake faces generated by multiple face swapping methods. Specifically, we first adopt a mapping resolver to perceive the possible solution space of the original target face for the inverse mappings. Then, we propose mapping-aware convolutions to retrace the original target face from the fake one. Such convolutions contain multiple kernels that can be combined under the control of the mapping resolver to tackle different face swapping mappings dynamically. Extensive experiments demonstrate that the IDRetracor exhibits promising retracing performance from both quantitative and qualitative perspectives.
Submitted 13 August, 2024;
originally announced August 2024.
-
A complete characterization of split digraphs with a strong arc decomposition
Authors:
Jiangdong Ai,
Fankang He,
Zhaoxiang Li,
Zhongmei Qin,
Changxin Wang
Abstract:
A \textbf{strong arc decomposition} of a (multi-)digraph $D(V, A)$ is a partition of its arc set $A$ into two disjoint arc sets $A_1$ and $A_2$ such that both of the spanning subdigraphs $D(V, A_1)$ and $D(V, A_2)$ are strong. In this paper, we fully characterize all split digraphs that do not have a strong decomposition. This resolves two problems proposed by Bang-Jensen and Wang and contributes to a series of efforts aimed at addressing this problem for specific graph classes. This work continues the research on semicomplete composition [Bang-Jensen, Gutin and Yeo, J. Graph Theory, 2020]; on locally semicomplete digraphs [Bang-Jensen and Huang, J. Combin. Theory Ser. B, 2010]; on a type of tournaments [Bang-Jensen and Yeo, Combinatorica, 2004].
Submitted 5 August, 2024;
originally announced August 2024.
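The definition of a strong arc decomposition in the abstract above can be checked mechanically on small examples. The following is a minimal sketch, not taken from the paper: the function names `is_strong` and `is_strong_arc_decomposition` are hypothetical, arcs are given as a list so that parallel arcs of a multi-digraph stay distinct, and the vertex set is assumed to be non-empty.

```python
from collections import defaultdict


def is_strong(vertices, arcs):
    """Return True if the digraph (vertices, arcs) is strongly connected.

    Assumes a non-empty vertex collection. A digraph is strong exactly when
    one fixed vertex reaches every vertex and is reached by every vertex.
    """
    vertices = list(vertices)
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        fwd[u].append(v)
        rev[v].append(u)

    def reaches_all(adj):
        # Depth-first search from the first vertex along the given adjacency.
        seen = {vertices[0]}
        stack = [vertices[0]]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == len(vertices)

    return reaches_all(fwd) and reaches_all(rev)


def is_strong_arc_decomposition(vertices, arcs, a1_indices):
    """Check whether the arcs indexed by a1_indices and the remaining arcs
    both form strong spanning subdigraphs on the full vertex set."""
    a1_indices = set(a1_indices)
    a1 = [arc for i, arc in enumerate(arcs) if i in a1_indices]
    a2 = [arc for i, arc in enumerate(arcs) if i not in a1_indices]
    return is_strong(vertices, a1) and is_strong(vertices, a2)
```

For instance, the digraph on three vertices consisting of two oppositely oriented directed triangles splits into those two triangles, each of which is strong.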
-
Number of Subgraphs and Their Converses in Tournaments and New Digraph Polynomials
Authors:
Jiangdong Ai,
Gregory Gutin,
Hui Lei,
Anders Yeo,
Yacong Zhou
Abstract:
An oriented graph $D$ is converse invariant if, for any tournament $T$, the number of copies of $D$ in $T$ is equal to that of its converse $-D$. El Sahili and Ghazo Hanna [J. Graph Theory 102 (2023), 684-701] showed that any oriented graph $D$ with maximum degree at most 2 is converse invariant. They proposed a question: Can we characterize all converse invariant oriented graphs?
In this paper, we introduce a digraph polynomial and employ it to give a necessary condition for an oriented graph to be converse invariant. This polynomial serves as a cornerstone in proving all the results presented in this paper. In particular, we characterize all orientations of trees with diameter at most 3 that are converse invariant. We also show that all orientations of regular graphs are not converse invariant if $D$ and $-D$ have different degree sequences. In addition, in contrast to the findings of El Sahili and Ghazo Hanna, we prove that every connected graph $G$ with maximum degree at least $3$, admits an orientation $D$ of $G$ such that $D$ is not converse invariant. We pose one conjecture.
Submitted 24 July, 2024;
originally announced July 2024.
-
A variable version of the quasi-kernel conjecture
Authors:
Jiangdong Ai,
Xiangzhou Liu,
Fei Peng
Abstract:
A quasi-kernel of a digraph $D$ is an independent set $Q$ such that every vertex can reach $Q$ in at most two steps. A 48-year-old conjecture made by P.L. Erdős and Székely, denoted the small QK conjecture, says that every sink-free digraph contains a quasi-kernel of size at most $n/2$.
Recently, Spiro posed the large QK conjecture, that every sink-free digraph contains a quasi-kernel $Q$ such that $|N^-[Q]|\geq n/2$, and showed that it follows from the small QK conjecture.
In this paper, we establish that the large QK conjecture implies the small QK conjecture with a weaker constant. We also show that the large QK conjecture is equivalent to a sharp version of it, answering affirmatively a question of Spiro. We formulate variable versions of these conjectures, which are still open in general.
Not many digraphs are known to have quasi-kernels of size $(1-α)n$ or less. We show this for digraphs with bounded dichromatic number, by proving the stronger statement that every sink-free digraph contains a quasi-kernel of size at most $(1-1/k)n$, where $k$ is the digraph's kernel-perfect number.
Submitted 13 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
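The quasi-kernel definition in the abstract above translates directly into a small checker. This is an illustrative sketch only: `is_quasi_kernel` is a hypothetical name, the digraph is assumed to be given as a vertex collection and a list of arcs `(u, v)` meaning $u \to v$, and "reach $Q$ in at most two steps" is read as following at most two arcs.

```python
def is_quasi_kernel(vertices, arcs, q):
    """Return True if q is a quasi-kernel of the digraph (vertices, arcs)."""
    q = set(q)
    out = {v: set() for v in vertices}
    for u, v in arcs:
        out[u].add(v)

    # Independence: no arc may join two vertices of q.
    if any(u in q and v in q for u, v in arcs):
        return False

    for v in vertices:
        if v in q:
            continue  # zero steps
        if out[v] & q:
            continue  # one step
        if any(out[w] & q for w in out[v]):
            continue  # two steps
        return False
    return True
```

On the directed 4-cycle $0 \to 1 \to 2 \to 3 \to 0$, the set $\{0, 2\}$ passes the check and has size $n/2$, while $\{0\}$ fails because vertex $1$ needs three steps to reach it.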
-
Superionic surface Li-ion transport in carbonaceous materials
Authors:
Jianbin Zhou,
Shen Wang,
Chaoshan Wu,
Ji Qi,
Hongli Wan,
Shen Lai,
Shijie Feng,
Tsz Wai Ko,
Zhaohui Liang,
Ke Zhou,
Nimrod Harpak,
Nick Solan,
Mengchen Liu,
Zeyu Hui,
Paulina J. Ai,
Kent Griffith,
Chunsheng Wang,
Shyue Ping Ong,
Yan Yao,
Ping Liu
Abstract:
Unlike Li-ion transport in the bulk of carbonaceous materials, little is known about Li-ion diffusion on their surface. In this study, we have discovered an ultra-fast Li-ion transport phenomenon on the surface of carbonaceous materials, particularly when they have limited Li insertion capacity along with a high surface area. This is exemplified by a carbon black, Ketjen Black (KB). An ionic conductivity of 18.1 mS cm$^{-1}$ at room temperature is observed, far exceeding most solid-state ion conductors. Theoretical calculations reveal a low diffusion barrier for the surface Li species. The species is also identified as Li*, which features a partial positive charge. As a result, lithiated KB functions effectively as an interlayer between Li and solid-state electrolytes (SSE) to mitigate dendrite growth and cell shorting. This function is found to be electrolyte agnostic, effective for both sulfide and halide SSEs. Further, lithiated KB can act as a high-performance mixed ion/electron conductor that is thermodynamically stable at potentials near Li metal. A graphite anode mixed with KB instead of a solid electrolyte demonstrates full utilization with a capacity retention of ~85% over 300 cycles. The discovery of this surface-mediated ultra-fast Li-ion transport mechanism provides new directions for the design of solid-state ion conductors and solid-state batteries.
Submitted 27 May, 2024;
originally announced May 2024.
-
A Survey of Neural Network Robustness Assessment in Image Recognition
Authors:
Jie Wang,
Jun Ai,
Minyan Lu,
Haoran Su,
Dan Yu,
Yutao Zhang,
Junda Zhu,
Jingyu Liu
Abstract:
In recent years, the robustness assessment of neural networks has received significant attention. Robustness plays a critical role in ensuring reliable operation of artificial intelligence (AI) systems in complex and uncertain environments. Deep learning's robustness problem is particularly significant, highlighted by the discovery of adversarial attacks on image classification models. Researchers have dedicated efforts to evaluate robustness in diverse perturbation conditions for image recognition tasks. Robustness assessment encompasses two main techniques: robustness verification/certification for deliberate adversarial attacks and robustness testing for random data corruptions. In this survey, we present a detailed examination of both adversarial robustness (AR) and corruption robustness (CR) in neural network assessment. Analyzing current research papers and standards, we provide an extensive overview of robustness assessment in image recognition. Three essential aspects are analyzed: concepts, metrics, and assessment methods. We investigate the perturbation metrics and range representations used to measure the degree of perturbations on images, as well as the robustness metrics specifically for the robustness conditions of classification models. The strengths and limitations of the existing methods are also discussed, and some potential directions for future research are provided.
Submitted 15 April, 2024; v1 submitted 12 April, 2024;
originally announced April 2024.
-
On degree power sum in $P_k$-free graphs
Authors:
Jiangdong Ai,
Fankang He,
Yihang Liu,
Bo Ning
Abstract:
Let $G$ be a graph on $n$ vertices with degree sequence $(d_1,d_2,\ldots,d_n)$. For a real $p \geq 1$, let $D_p(G)=\sum_{i=1}^n d_i^p$. A Turán-type problem for the degree power sum was initiated by Caro and Yuster: determining the function $D_p(n,H) :=\max \{D_p(G): \text{$G$ is an $n$-vertex $H$-free graph}\}$. They obtained exact values for certain graphs $H$. For a path $P_k$, they mentioned that "a close examination of the proof of Theorem 1.2 shows that the value of $n_0(k)$ in the statement of the theorem is $O(k^2)$", namely, they could show that the $n$-vertex $P_k$-free graph with maximum degree power sum is $W_{n,k-1,\lfloor \frac{k}{2} \rfloor -1} = K_{\lfloor \frac{k}{2} \rfloor -1} \vee \left((n - \lceil \frac{k}{2} \rceil)K_1 \cup K_{1+k-2\lfloor \frac{k}{2} \rfloor} \right)$ when $n \geq c k^2$ for some constant $c$. In this note, we improve the requirement on $n$ to a bound linear in $k$ by a different approach. The bound is tight up to a constant factor.
Submitted 10 April, 2024;
originally announced April 2024.
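The quantity $D_p(G)$ and the extremal graph $W_{n,k-1,\lfloor k/2 \rfloor -1}$ from the abstract above are easy to compute for small parameters. The sketch below is illustrative only: the function names and the vertex labelling $0,\ldots,n-1$ are assumptions, and $n \geq k \geq 2$ is assumed so that every part has a non-negative size.

```python
from itertools import combinations


def degree_power_sum(n, edges, p):
    """Compute D_p(G) = sum of d_i^p for a simple graph on vertices 0..n-1."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(d ** p for d in deg)


def extremal_candidate(n, k):
    """Edge list of K_s joined to ((n - ceil(k/2)) K_1 union K_c),
    where s = floor(k/2) - 1 and c = 1 + k - 2*floor(k/2)."""
    s = k // 2 - 1
    independent = n - (k + 1) // 2          # n - ceil(k/2) isolated vertices
    core = list(range(s))                   # the dominating clique K_s
    isolated = list(range(s, s + independent))
    small = list(range(s + independent, n)) # the clique K_c (c is 1 or 2)
    edges = set(combinations(core, 2))
    edges |= {(u, v) for u in core for v in isolated + small}
    edges |= set(combinations(small, 2))
    return sorted(edges)
```

As a sanity check, for $n=6$ and $k=4$ the construction is the star $K_{1,5}$, whose degree sequence $(5,1,1,1,1,1)$ gives $D_2 = 25 + 5 = 30$.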
-
TDANet: A Novel Temporal Denoise Convolutional Neural Network With Attention for Fault Diagnosis
Authors:
Zhongzhi Li,
Rong Fan,
Jingqi Tu,
Jinyi Ma,
Jianliang Ai,
Yiqun Dong
Abstract:
Fault diagnosis plays a crucial role in maintaining the operational integrity of mechanical systems, preventing significant losses due to unexpected failures. As intelligent manufacturing and data-driven approaches evolve, Deep Learning (DL) has emerged as a pivotal technique in fault diagnosis research, recognized for its ability to autonomously extract complex features. However, the practical application of current fault diagnosis methods is challenged by the complexity of industrial environments. This paper proposes the Temporal Denoise Convolutional Neural Network With Attention (TDANet), designed to improve fault diagnosis performance in noisy environments. This model transforms one-dimensional signals into two-dimensional tensors based on their periodic properties, employing multi-scale 2D convolution kernels to extract signal information both within and across periods. This method enables effective identification of signal characteristics that vary over multiple time scales. The TDANet incorporates a Temporal Variable Denoise (TVD) module with residual connections and a Multi-head Attention Fusion (MAF) module, enhancing the saliency of information within noisy data and maintaining effective fault diagnosis performance. Evaluation on two datasets, CWRU (single sensor) and Real aircraft sensor fault (multiple sensors), demonstrates that the TDANet model significantly outperforms existing deep learning approaches in terms of diagnostic accuracy under noisy environments.
Submitted 28 March, 2024;
originally announced March 2024.
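The abstract above describes turning a one-dimensional signal into a two-dimensional tensor according to its period, so that rows hold samples within a period and columns line up the same phase across periods. The following is a rough sketch of that folding step alone. It is not the TDANet implementation: the name `fold_by_period` is hypothetical, the period is assumed to be known in advance, and dropping the trailing partial period is a simplification chosen here.

```python
def fold_by_period(signal, period):
    """Fold a 1D sequence into a list of rows, one row per full period.

    Row r holds samples r*period .. (r+1)*period - 1. Any trailing samples
    that do not fill a whole period are dropped (an assumption of this sketch).
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")
    rows = len(signal) // period
    return [list(signal[r * period:(r + 1) * period]) for r in range(rows)]
```

With this layout, a 2D kernel sliding along a row sees variation within one period, and sliding down a column sees variation across periods.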
-
Piercing independent sets in graphs without large induced matching
Authors:
Jiangdong Ai,
Hong Liu,
Zixiang Xu,
Qiang Zhou
Abstract:
Given a graph $G$, denote by $h(G)$ the smallest size of a subset of $V(G)$ which intersects every maximum independent set of $G$. We prove that any graph $G$ without induced matching of size $t$ satisfies $h(G)\le ω(G)^{3t-3+o(1)}$. This resolves a conjecture of Hajebi, Li and Spirkl (Hitting all maximum stable sets in $P_{5}$-free graphs, JCTB 2024).
Submitted 28 March, 2024;
originally announced March 2024.
-
Solution on strong partition of $2$-balanced regular multipartite tournaments
Authors:
Jiangdong Ai,
Fankang He,
Yihang Liu
Abstract:
We call a partition of a $c$-partite tournament into tournaments of order $c$ strong if each tournament is strongly connected. The strong partition number, denoted $ST(r)$, is the minimum integer $c'$ such that every regular $r$-balanced $c$-partite tournament with $c\geq c'$ has a strong partition. Figueroa, Montellano-Ballesteros and Olsen showed the existence of $ST(r)$ for all $r\geq 2$ and proved that $5\leq ST(2)\leq 7$. In this note, we establish that $ST(2)=6$ and we also exhibit the unique $2$-balanced $5$-partite tournament which has no strong partition.
Submitted 13 March, 2024;
originally announced March 2024.
-
Not all distributional shifts are equal: Fine-grained robust conformal inference
Authors:
Jiahao Ai,
Zhimei Ren
Abstract:
We introduce a fine-grained framework for uncertainty quantification of predictive models under distributional shifts. This framework distinguishes the shift in covariate distributions from that in the conditional relationship between the outcome ($Y$) and the covariates ($X$). We propose to reweight the training samples to adjust for an identifiable covariate shift while protecting against worst-case conditional distribution shift bounded in an $f$-divergence ball. Based on ideas from conformal inference and distributionally robust learning, we present an algorithm that outputs (approximately) valid and efficient prediction intervals in the presence of distributional shifts. As a use case, we apply the framework to sensitivity analysis of individual treatment effects with hidden confounding. The proposed methods are evaluated in simulation studies and four real data applications, demonstrating superior robustness and efficiency compared with existing benchmarks.
Submitted 18 May, 2025; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Scalable and reliable deep transfer learning for intelligent fault detection via multi-scale neural processes embedded with knowledge
Authors:
Zhongzhi Li,
Jingqi Tu,
Jiacheng Zhu,
Jianliang Ai,
Yiqun Dong
Abstract:
Deep transfer learning (DTL) is a fundamental method in the field of Intelligent Fault Detection (IFD). It aims to mitigate the performance degradation that arises from discrepancies in data distribution between the training set (source domain) and the testing set (target domain). Because fault data collection is challenging and certain faults are scarce, DTL-based methods are limited by the amount of observable data, which reduces their detection performance in the target domain. Furthermore, DTL-based methods lack the comprehensive uncertainty analysis that is essential for building reliable IFD systems. To address these problems, this paper proposes a novel DTL-based method known as Neural Processes-based deep transfer learning with graph convolution network (GTNP). The feature-based transfer strategy of GTNP bridges the data distribution discrepancies between the source domain and the target domain in high-dimensional space. Both the joint modeling based on global and local latent variables and the sparse sampling strategy reduce the demand for observable data in the target domain. The multi-scale uncertainty analysis is obtained by using the distribution characteristics of global and local latent variables. Global analysis of uncertainty enables GTNP to provide quantitative values that reflect the complexity of methods and the difficulty of tasks. Local analysis of uncertainty allows GTNP to model uncertainty (confidence of the fault detection result) at each sample affected by noise and bias. The proposed method is validated on 3 IFD tasks, consistently showing the superior detection performance of GTNP compared to the other DTL-based methods.
Submitted 20 February, 2024;
originally announced February 2024.
-
A new perspective from hypertournaments to tournaments
Authors:
Jiangdong Ai,
Qiming Dai,
Qiwen Guo,
Yingqi Hu,
Changxin Wang
Abstract:
A $k$-tournament $H$ on $n$ vertices is a pair $(V, A)$ for $2\leq k\leq n$, where $V(H)$ is a set of vertices, and $A(H)$ is a set of all possible $k$-tuples of vertices, such that for any $k$-subset $S$ of $V$, $A(H)$ contains exactly one of the $k!$ possible permutations of $S$. In this paper, we investigate the relationship between a hyperdigraph and its corresponding normal digraph. Particularly, drawing on a result from Gutin and Yeo, we establish an intrinsic relationship between a strong $k$-tournament and a strong tournament, which enables us to provide an alternative (more straightforward and concise) proof for some previously known results and get some new results.
Submitted 24 January, 2024;
originally announced January 2024.
-
DualTeacher: Bridging Coexistence of Unlabelled Classes for Semi-supervised Incremental Object Detection
Authors:
Ziqi Yuan,
Liyuan Wang,
Wenbo Ding,
Xingxing Zhang,
Jiachen Zhong,
Jianyong Ai,
Jianmin Li,
Jun Zhu
Abstract:
In real-world applications, an object detector often encounters object instances from new classes and needs to accommodate them effectively. Previous work formulated this critical problem as incremental object detection (IOD), which assumes the object instances of new classes to be fully annotated in incremental data. However, as supervisory signals are usually rare and expensive, the supervised IOD may not be practical for implementation. In this work, we consider a more realistic setting named semi-supervised IOD (SSIOD), where the object detector needs to learn new classes incrementally from a few labelled data and massive unlabelled data without catastrophic forgetting of old classes. A commonly-used strategy for supervised IOD is to encourage the current model (as a student) to mimic the behavior of the old model (as a teacher), but it generally fails in SSIOD because a dominant number of object instances from old and new classes are coexisting and unlabelled, with the teacher only recognizing a fraction of them. Observing that learning only the classes of interest tends to preclude detection of other classes, we propose to bridge the coexistence of unlabelled classes by constructing two teacher models respectively for old and new classes, and using the concatenation of their predictions to instruct the student. This approach is referred to as DualTeacher, which can serve as a strong baseline for SSIOD with limited resource overhead and no extra hyperparameters. We build various benchmarks for SSIOD and perform extensive experiments to demonstrate the superiority of our approach (e.g., the performance lead is up to 18.28 AP on MS-COCO). Our code is available at https://github.com/chuxiuhong/DualTeacher.
Submitted 13 December, 2023;
originally announced January 2024.
-
BD-MSA: Body decouple VHR Remote Sensing Image Change Detection method guided by multi-scale feature information aggregation
Authors:
Yonghui Tan,
Xiaolong Li,
Yishu Chen,
Jinquan Ai
Abstract:
The purpose of remote sensing image change detection (RSCD) is to detect differences between bi-temporal images taken at the same place. Deep learning has been extensively applied to RSCD tasks, yielding significant results. However, due to the shooting angle of the satellite, the impact of thin clouds, and certain lighting conditions, current RSCD algorithms cannot properly handle the fuzzy edges of the change region in some remote sensing images. To solve this issue, we propose Body Decouple Multi-Scale by feature Aggregation change detection (BD-MSA), a novel model that collects both global and local feature map information in the channel and space dimensions of the feature map during the training and prediction phases. This approach allows us to successfully extract the change region's boundary information while also decoupling the change region's main body from its boundary. Extensive experiments show that the model described in this paper achieves the best assessment metrics and evaluation results on the publicly available datasets DSIFN-CD, S2Looking and WHU-CD when compared to other models.
Submitted 3 March, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Graph operations and a unified method for kinds of Turán-type problems on paths, cycles and matchings
Authors:
Jiangdong Ai,
Hui Lei,
Bo Ning,
Yongtang Shi
Abstract:
Let $G$ be a connected graph and $\mathcal{P}(G)$ a graph parameter. We say that $\mathcal{P}(G)$ is feasible if $\mathcal{P}(G)$ satisfies the following properties: (I) $\mathcal{P}(G)\leq \mathcal{P}(G_{uv})$, if $G_{uv}=G[u\to v]$ for any $u,v$, where $G_{uv}$ is the graph obtained by applying Kelmans operation from $u$ to $v$; (II) $\mathcal{P}(G) <\mathcal{P}(G+e)$ for any edge $e\notin E(G)$. Let $P_k$ be a path of order $k$, $\mathcal{C}_{\geq k}$ the set of all cycles of length at least $k$ and $M_{k+1}$ a matching containing $k+1$ independent edges. In this paper, we mainly prove the following three results: (i) Let $n\geq k\geq 5$ and let $t=\left\lfloor\frac{k-1}{2}\right\rfloor$. Let $G$ be a $2$-connected $n$-vertex $\mathcal{C}_{\geq k}$-free graph with the maximum $\mathcal{P}(G)$ where $\mathcal{P}(G)$ is feasible. Then, $G\in \mathcal{G}^1_{n,k}=\{W_{n,k,s}=K_{s}\vee ((n-k+s)K_1\cup K_{k-2s}): 2\leq s\leq t\}$. (ii) Let $n\geq k\geq 4$ and let $t=\left\lfloor\frac{k}{2}\right\rfloor-1$. Let $G$ be a connected $n$-vertex $P_{k}$-free graph with the maximum $\mathcal{P}(G)$ where $\mathcal{P}(G)$ is feasible. Then, $G\in \mathcal{G}^2_{n,k}=\{W_{n,k-1,s}=K_{s}\vee ((n-k+s+1)K_1\cup K_{k-2s-1}): 1\leq s\leq t\}.$ (iii) Let $G$ be a connected $n$-vertex $M_{k+1}$-free graph with the maximum $\mathcal{P}(G)$ where $\mathcal{P}(G)$ is feasible. Then, $G\cong K_n$ when $n=2k+1$ and $G\in \mathcal{G}^3_{n,k}=\{K_s\vee ((n-2k+s-1)K_1\cup K_{2k-2s+1}):1\leq s\leq k\}$ when $n\geq 2k+2$. Directly derived from these three main results, we obtain a series of applications in Turán-type problems, generalized Turán-type problems, powers of graph degrees in extremal graph theory, and problems related to spectral radius, and signless Laplacian spectral radius in spectral graph theory.
Submitted 29 January, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
PointSSC: A Cooperative Vehicle-Infrastructure Point Cloud Benchmark for Semantic Scene Completion
Authors:
Yuxiang Yan,
Boda Liu,
Jianfei Ai,
Qinbu Li,
Ru Wan,
Jian Pu
Abstract:
Semantic Scene Completion (SSC) aims to jointly generate space occupancies and semantic labels for complex 3D scenes. Most existing SSC models focus on volumetric representations, which are memory-inefficient for large outdoor spaces. Point clouds provide a lightweight alternative but existing benchmarks lack outdoor point cloud scenes with semantic labels. To address this, we introduce PointSSC, the first cooperative vehicle-infrastructure point cloud benchmark for semantic scene completion. These scenes exhibit long-range perception and minimal occlusion. We develop an automated annotation pipeline leveraging Semantic Segment Anything to efficiently assign semantics. To benchmark progress, we propose a LiDAR-based model with a Spatial-Aware Transformer for global and local feature extraction and a Completion and Segmentation Cooperative Module for joint completion and segmentation. PointSSC provides a challenging testbed to drive advances in semantic point cloud completion for real-world navigation. The code and datasets are available at https://github.com/yyxssm/PointSSC.
Submitted 6 March, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
On Seymour's and Sullivan's Second Neighbourhood Conjectures
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin,
Shujing Wang,
Anders Yeo,
Yacong Zhou
Abstract:
For a vertex $x$ of a digraph, $d^+(x)$ ($d^-(x)$, resp.) is the number of vertices at distance 1 from (to, resp.) $x$ and $d^{++}(x)$ is the number of vertices at distance 2 from $x$. In 1995, Seymour conjectured that for any oriented graph $D$ there exists a vertex $x$ such that $d^+(x)\leq d^{++}(x)$. In 2006, Sullivan conjectured that there exists a vertex $x$ in $D$ such that $d^-(x)\leq d^{++}(x)$. We give a sufficient condition in terms of the number of transitive triangles for an oriented graph to satisfy Sullivan's conjecture. In particular, this implies that Sullivan's conjecture holds for all orientations of planar graphs and of triangle-free graphs. An oriented graph $D$ is an oriented split graph if the vertices of $D$ can be partitioned into vertex sets $X$ and $Y$ such that $X$ is an independent set and $Y$ induces a tournament. We also show that the two conjectures hold for some families of oriented split graphs, in particular, when $Y$ induces a regular or an almost regular tournament.
Submitted 6 June, 2023;
originally announced June 2023.
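The quantities $d^+(x)$, $d^-(x)$ and $d^{++}(x)$ from the abstract above can be computed directly, which makes it easy to test both conjectures on small oriented graphs. This is a minimal sketch under the abstract's definitions: the function names are hypothetical, and $d^{++}(x)$ is taken to be the number of vertices at distance exactly 2 from $x$.

```python
def neighbourhood_counts(vertices, arcs):
    """Map each vertex x to the triple (d_plus, d_minus, d_plus_plus)."""
    out = {v: set() for v in vertices}
    inn = {v: set() for v in vertices}
    for u, v in arcs:
        out[u].add(v)
        inn[v].add(u)
    counts = {}
    for x in vertices:
        # Vertices reachable by a path of length 2, minus those at distance 0 or 1.
        second = set().union(*(out[y] for y in out[x])) - out[x] - {x}
        counts[x] = (len(out[x]), len(inn[x]), len(second))
    return counts


def seymour_vertices(vertices, arcs):
    """Vertices x with d^+(x) <= d^{++}(x)."""
    c = neighbourhood_counts(vertices, arcs)
    return [x for x in vertices if c[x][0] <= c[x][2]]


def sullivan_vertices(vertices, arcs):
    """Vertices x with d^-(x) <= d^{++}(x)."""
    c = neighbourhood_counts(vertices, arcs)
    return [x for x in vertices if c[x][1] <= c[x][2]]
```

In the transitive tournament on three vertices with arcs $0 \to 1$, $0 \to 2$, $1 \to 2$, the sink $2$ is the only vertex satisfying Seymour's inequality and the source $0$ is the only one satisfying Sullivan's.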
-
Bounds on Maximum Weight Directed Cut
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin,
Anders Yeo,
Yacong Zhou
Abstract:
We obtain lower and upper bounds for the maximum weight of a directed cut in the classes of weighted digraphs and weighted acyclic digraphs as well as in some of their subclasses. We compare our results with those obtained for the maximum size of a directed cut in unweighted digraphs. In particular, we show that a lower bound obtained by Alon, Bollobás, Gyárfás, Lehel and Scott (J Graph Th 55(1) (2007)) for unweighted acyclic digraphs can be extended to weighted digraphs with the maximum length of a cycle being bounded by a constant and the weight of every arc being at least one. We state a number of open problems.
Submitted 20 April, 2023;
originally announced April 2023.
-
WS-3D-Lane: Weakly Supervised 3D Lane Detection With 2D Lane Labels
Authors:
Jianyong Ai,
Wenbo Ding,
Jiuhua Zhao,
Jiachen Zhong
Abstract:
Compared to 2D lanes, real 3D lane data is difficult to collect accurately. In this paper, we propose a novel method for training 3D lane detection with only 2D lane labels, called weakly supervised 3D lane detection (WS-3D-Lane). Under the assumptions of constant lane width and equal height on adjacent lanes, we indirectly supervise 3D lane heights during training. To overcome the problem of the dynamic change of the camera pitch during data collection, a camera pitch self-calibration method is proposed. In anchor representation, we propose a double-layer anchor with an improved non-maximum suppression (NMS) method, which enables the anchor-based method to predict two lane lines that are close. Experiments are conducted on the basis of 3D-LaneNet under two supervision methods. Under the weakly supervised setting, our WS-3D-Lane outperforms the previous 3D-LaneNet: F-score rises to 92.3% on the Apollo 3D synthetic dataset, and F1 rises to 74.5% on ONCE-3DLanes. Meanwhile, WS-3D-Lane in the purely supervised setting achieves larger gains and outperforms the state of the art. To the best of our knowledge, WS-3D-Lane is the first attempt at 3D lane detection under the weakly supervised setting.
△ Less
Submitted 17 January, 2023; v1 submitted 23 September, 2022;
originally announced September 2022.
-
Fault Detection and Classification of Aerospace Sensors using a VGG16-based Deep Neural Network
Authors:
Zhongzhi Li,
Yunmei Zhao,
Jinyi Ma,
Jianliang Ai,
Yiqun Dong
Abstract:
Compared with traditional model-based fault detection and classification (FDC) methods, deep neural networks (DNN) have proven effective for aerospace sensor FDC problems. However, the time consumed in training the DNN is excessive, and explainability analysis for the FDC neural network is still underwhelming. A concept known as imagefication-based intelligent FDC has been studied in recent years. This concept advocates stacking the sensor measurement data into an image format; the sensor FDC issue is then transformed into an abnormal-region detection problem on the stacked image, which may well borrow from recent advances in the machine vision realm. Although promising results have been claimed in imagefication-based intelligent FDC research, due to the small size of the stacked image, small convolutional kernels and shallow DNN layers were used, which hinders the FDC performance. In this paper, we first propose a data augmentation method which inflates the stacked image to a larger size (corresponding to the VGG16 net developed in the machine vision realm). The FDC neural network is then trained by fine-tuning VGG16 directly. To truncate and compress the FDC net size (hence its running time), we perform model pruning on the fine-tuned net. The class activation mapping (CAM) method is also adopted for explainability analysis of the FDC net to verify its internal operations. Via data augmentation, fine-tuning from VGG16, and model pruning, the FDC net developed in this paper attains an FDC accuracy of 98.90% across 4 aircraft at 5 flight conditions (running time 26 ms). The CAM results also verify the FDC net w.r.t. its internal operations.
Submitted 26 July, 2022;
originally announced July 2022.
-
Results on the Small Quasi-Kernel Conjecture
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin,
Anders Yeo,
Yacong Zhou
Abstract:
A {\em quasi-kernel} of a digraph $D$ is an independent set $Q\subseteq V(D)$ such that for every vertex $v\in V(D)\backslash Q$, there exists a directed path with one or two arcs from $v$ to a vertex $u\in Q$. In 1974, Chvátal and Lovász proved that every digraph has a quasi-kernel. In 1976, Erdős and Székely conjectured that every sink-free digraph $D=(V(D),A(D))$ has a quasi-kernel of size at most $|V(D)|/2$. In this paper, we give a new method to show that the conjecture holds for a generalization of anti-claw-free digraphs. For any sink-free one-way split digraph $D$ of order $n$, when $n\geq 3$, we show a stronger result that $D$ has a quasi-kernel of size at most $\frac{n+3}{2} - \sqrt{n}$, and the bound is sharp.
Submitted 25 July, 2022;
originally announced July 2022.
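The quasi-kernel definition in the abstract above can be checked directly on small digraphs. The following is a minimal sketch, not the authors' method: the function names `is_quasi_kernel` and `smallest_quasi_kernel` and the arc-list representation are assumptions made for illustration, and the search is plain brute force.

```python
from itertools import combinations


def is_quasi_kernel(vertices, arcs, Q):
    """Check the definition: Q is independent, and every vertex outside Q
    reaches some vertex of Q by a directed path with one or two arcs."""
    Q = set(Q)
    out = {v: set() for v in vertices}
    for u, v in arcs:
        out[u].add(v)
    # Independence: no arc has both endpoints in Q.
    if any(u in Q and v in Q for u, v in arcs):
        return False
    for v in vertices:
        if v in Q:
            continue
        one_step = out[v]
        two_step = set().union(*(out[w] for w in one_step))
        if not (one_step | two_step) & Q:
            return False
    return True


def smallest_quasi_kernel(vertices, arcs):
    """Brute-force search over subsets by increasing size (hypothetical helper)."""
    for k in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, k):
            if is_quasi_kernel(vertices, arcs, candidate):
                return set(candidate)
    return None


# Directed 4-cycle: sink-free, and its smallest quasi-kernel has size 2 = |V|/2.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(smallest_quasi_kernel([0, 1, 2, 3], cycle))  # {0, 2}
```

The directed 4-cycle is sink-free and no single vertex is a quasi-kernel, so it meets the conjectured bound $|V(D)|/2$ with equality.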
-
Deepfake Face Traceability with Disentangling Reversing Network
Authors:
Jiaxin Ai,
Zhongyuan Wang,
Baojin Huang,
Zhen Han
Abstract:
Deepfake faces not only violate the privacy of personal identity, but also confuse the public and cause huge social harm. Current deepfake detection stays at the level of distinguishing real from fake and cannot trace the original genuine face corresponding to a fake face; that is, it cannot trace the source of evidence. Deepfake countermeasure technology for judicial forensics urgently calls for deepfake traceability. This paper pioneers an interesting question about face deepfakes: active forensics that "know it and how it happened". Given that deepfake faces do not completely discard the features of the original faces, especially facial expressions and poses, we argue that original faces can be approximately inferred from their deepfake counterparts. Correspondingly, we design a disentangling reversing network that decouples the latent space features of deepfake faces under the supervision of fake-original face pair samples to infer the original faces in reverse.
Submitted 7 July, 2022;
originally announced July 2022.
-
Augmented Imagefication: A Data-driven Fault Detection Method for Aircraft Air Data Sensors
Authors:
Hang Zhao,
Jinyi Ma,
Zhongzhi Li,
Yiqun Dong,
Jianliang Ai
Abstract:
In this paper, a novel data-driven approach named Augmented Imagefication for fault detection (FD) of aircraft air data sensors (ADS) is proposed. Taking the FD problem of aircraft air data sensors as an example, an online FD scheme on an edge device based on a deep neural network (DNN) is developed. First, the aircraft inertial reference unit measurements are adopted as equivalent inputs, which is scalable to different aircraft/flight cases. Data associated with 6 different aircraft/flight conditions are collected to provide diversity (scalability) in the training/testing database. Then Augmented Imagefication is proposed for the DNN-based prediction of flying conditions. The raw data are reshaped as a grayscale image for convolutional operation, and the necessity of augmentation is analyzed and pointed out. Different kinds of augmentation methods, i.e. Flip, Repeat, Tile and their combinations, are discussed; the results show that the All Repeat operation in both axes of the image matrix leads to the best DNN performance. The interpretability of the DNN is studied based on Grad-CAM, which provides a better understanding and further solidifies the robustness of the DNN. Next, the DNN model, VGG-16 with augmented imagefication data, is optimized for mobile hardware deployment. After pruning of the DNN, a lightweight model (98.79% smaller than the original VGG-16) with high accuracy (slightly up by 0.27%) and fast speed (time delay reduced by 87.54%) is obtained. Hyperparameter optimization of the DNN based on TPE is also implemented, and the best combination of hyperparameters is determined (learning rate 0.001, 600 training epochs, and batch size 100 yield the highest accuracy of 0.987). Finally, an online FD deployment based on an edge device, Jetson Nano, is developed and real-time monitoring of aircraft is achieved. We believe that this method is instructive for addressing FD problems in other similar fields.
Submitted 28 June, 2022; v1 submitted 17 June, 2022;
originally announced June 2022.
-
Consistent Attack: Universal Adversarial Perturbation on Embodied Vision Navigation
Authors:
Chengyang Ying,
You Qiaoben,
Xinning Zhou,
Hang Su,
Wenbo Ding,
Jianyong Ai
Abstract:
Embodied agents in vision navigation coupled with deep neural networks have attracted increasing attention. However, deep neural networks have been shown vulnerable to malicious adversarial noises, which may potentially cause catastrophic failures in Embodied Vision Navigation. Among different adversarial noises, universal adversarial perturbations (UAP), i.e., a constant image-agnostic perturbation applied on every input frame of the agent, play a critical role in Embodied Vision Navigation since they are computation-efficient and application-practical during the attack. However, existing UAP methods ignore the system dynamics of Embodied Vision Navigation and might be sub-optimal. In order to extend UAP to the sequential decision setting, we formulate the disturbed environment under the universal noise $δ$, as a $δ$-disturbed Markov Decision Process ($δ$-MDP). Based on the formulation, we analyze the properties of $δ$-MDP and propose two novel Consistent Attack methods, named Reward UAP and Trajectory UAP, for attacking Embodied agents, which consider the dynamic of the MDP and calculate universal noises by estimating the disturbed distribution and the disturbed Q function. For various victim models, our Consistent Attack can cause a significant drop in their performance in the PointGoal task in Habitat with different datasets and different scenes. Extensive experimental results indicate that there exist serious potential risks for applying Embodied Vision Navigation methods to the real world.
Submitted 25 March, 2023; v1 submitted 12 June, 2022;
originally announced June 2022.
-
DistAD: Software Anomaly Detection Based on Execution Trace Distribution
Authors:
Shiyi Kong,
Jun Ai,
Minyan Lu,
Shuguang Wang,
W. Eric Wong
Abstract:
Modern software systems have become increasingly complex, which makes them difficult to test and validate. Detecting partial software anomalies in complex systems at runtime can assist with handling unintended software behaviors, avoiding catastrophic software failures and improving software runtime availability. These detection techniques aim to identify the manifestation of faults (anomalies) before they ultimately lead to unavoidable failures, thus supporting subsequent runtime fault-tolerant techniques. In this work, we propose a novel anomaly detection method named DistAD, which is based on the distribution of software runtime dynamic execution traces. Unlike existing works that use key performance indicators, the execution trace is collected during runtime via intrusive instrumentation. Instrumentation is controlled by a sampling mechanism to avoid excessive overhead. Bi-directional Long Short-Term Memory (Bi-LSTM), a Recurrent Neural Network (RNN) architecture, is used to perform the anomaly detection. The whole framework is constructed under a One-Class Neural Network (OCNN) learning mode, which helps overcome the lack of labeled samples and data imbalance issues. A series of controlled experiments are conducted on a widely used database system named Cassandra to demonstrate the validity and feasibility of the proposed method. The overhead introduced by the intrusive probing is also evaluated. The results show that DistAD can achieve more than 70% accuracy and 90% recall (in normal states) with no more than 2 times overhead compared with unmonitored executions.
Submitted 26 April, 2022; v1 submitted 28 February, 2022;
originally announced February 2022.
-
Extended Path Partition Conjecture for Semicomplete and Acyclic Compositions
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin,
Yacong Zhou
Abstract:
Let $D$ be a digraph and let $λ(D)$ denote the number of vertices in a longest path of $D$. For a pair of vertex-disjoint induced subdigraphs $A$ and $B$ of $D$, we say that $(A,B)$ is a partition of $D$ if $V(A)\cup V(B)=V(D).$ The Path Partition Conjecture (PPC) states that for every digraph, $D$, and every integer $q$ with $1\leq q\leqλ(D)-1$, there exists a partition $(A,B)$ of $D$ such that $λ(A)\leq q$ and $λ(B)\leqλ(D)-q.$ Let $T$ be a digraph with vertex set $\{u_1,\dots, u_t\}$ and for every $i\in [t]$, let $H_i$ be a digraph with vertex set $\{u_{i,j_i}\colon\, j_i\in [n_i]\}$. The {\em composition} $Q=T[H_1,\dots , H_t]$ of $T$ and $H_1,\ldots, H_t$ is a digraph with vertex set $\{u_{i,j_i}\colon\, i\in [t], j_i\in [n_i]\}$ and arc set $$A(Q)=\cup^t_{i=1}A(H_i)\cup \{u_{i,j_i}u_{p,q_p}\colon\, u_iu_p\in A(T), j_i\in [n_i], q_p\in [n_p]\}.$$ We say that $Q$ is acyclic {(semicomplete, respectively)} if $T$ is acyclic {(semicomplete, respectively)}. In this paper, we introduce a conjecture stronger than PPC using a property first studied by Bang-Jensen, Nielsen and Yeo (2006) and show that the stronger conjecture holds for wide families of acyclic and semicomplete compositions.
Submitted 18 November, 2021;
originally announced November 2021.
-
Detection Software Content Failures Using Dynamic Execution Information
Authors:
Shiyi Kong,
Minyan Lu,
Jun Ai,
Shuguang Wang
Abstract:
Modern software systems have become too complex to be tested and validated. Detecting partial software failures in complex systems at runtime helps handle unintended software behaviors, avoid catastrophic software failures and improve software runtime availability. These detection techniques aim to find the manifestation of faults before they finally lead to unavoidable failures, thus supporting subsequent runtime fault-tolerant techniques. We review state-of-the-art articles and find that content failures account for the majority of all software failures, but methods for detecting them are rarely studied. In this work, we propose a novel failure detection indicator for software content failures based on software runtime dynamic execution information. The runtime information is recorded during software execution, then transformed into a measure named runtime entropy and finally fed into machine learning models. The machine learning models are built to classify the intended and unintended behaviors of the target software systems. A series of controlled experiments on several open source projects are conducted to demonstrate the feasibility of the method. We also evaluate the accuracy of the machine learning models built in this work.
Submitted 13 October, 2021;
originally announced October 2021.
-
Efficient DETR: Improving End-to-End Object Detector with Dense Prior
Authors:
Zhuyu Yao,
Jiangbo Ai,
Boxun Li,
Chi Zhang
Abstract:
The recently proposed end-to-end transformer detectors, such as DETR and Deformable DETR, have a cascade structure of 6 stacked decoder layers that update object queries iteratively, without which their performance degrades seriously. In this paper, we find that the random initialization of object containers, which include object queries and reference points, is mainly responsible for the requirement of multiple iterations. Based on our findings, we propose Efficient DETR, a simple and efficient pipeline for end-to-end object detection. By taking advantage of both dense detection and sparse set detection, Efficient DETR leverages a dense prior to initialize the object containers and bridges the gap between the 1-decoder structure and the 6-decoder structure. Experiments conducted on MS COCO show that our method, with only 3 encoder layers and 1 decoder layer, achieves competitive performance with state-of-the-art object detection methods. Efficient DETR is also robust in crowded scenes. It outperforms modern detectors on the CrowdHuman dataset by a large margin.
Submitted 3 April, 2021;
originally announced April 2021.
-
Kings in Multipartite Hypertournaments
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin
Abstract:
In his paper "Kings in Bipartite Hypertournaments" (Graphs $\&$ Combinatorics 35, 2019), Petrovic stated two conjectures on 4-kings in multipartite hypertournaments. We prove one of these conjectures and give counterexamples for the other.
Submitted 16 July, 2021; v1 submitted 11 November, 2020;
originally announced November 2020.
-
Attribute-guided Feature Extraction and Augmentation Robust Learning for Vehicle Re-identification
Authors:
Chaoran Zhuge,
Yujie Peng,
Yadong Li,
Jiangbo Ai,
Junru Chen
Abstract:
Vehicle re-identification is one of the core technologies of intelligent transportation systems and smart cities, but large intra-class diversity and inter-class similarity pose great challenges for existing methods. In this paper, we propose a multi-guided learning approach which utilizes attribute information and introduces two novel random augmentations to improve robustness during training. In addition, we propose an attribute constraint method and a group re-ranking strategy to refine matching results. Our method achieves an mAP of 66.83% and a rank-1 accuracy of 76.05% in the CVPR 2020 AI City Challenge.
Submitted 13 May, 2020;
originally announced May 2020.
-
Proximity and Remoteness in Directed and Undirected Graphs
Authors:
Jiangdong Ai,
Stefanie Gerke,
Gregory Gutin,
Sonwabile Mafunda
Abstract:
Let $D$ be a strongly connected digraph. The average distance $\barσ(v)$ of a vertex $v$ of $D$ is the arithmetic mean of the distances from $v$ to all other vertices of $D$. The remoteness $ρ(D)$ and proximity $π(D)$ of $D$ are the maximum and the minimum of the average distances of the vertices of $D$, respectively. We obtain sharp upper and lower bounds on $π(D)$ and $ρ(D)$ as a function of the order $n$ of $D$ and describe the extreme digraphs for all the bounds. We also obtain such bounds for strong tournaments. We show that for a strong tournament $T$, we have $π(T)=ρ(T)$ if and only if $T$ is regular. Due to this result, one may conjecture that every strong digraph $D$ with $π(D)=ρ(D)$ is regular. We present an infinite family of non-regular strong digraphs $D$ such that $π(D)=ρ(D).$ We describe such a family for undirected graphs as well.
Submitted 28 January, 2020;
originally announced January 2020.
-
Face Attribute Invertion
Authors:
X G Tu,
Y Luo,
H S Zhang,
W J Ai,
Z Ma,
M Xie
Abstract:
Manipulating human facial images between two domains is an important and interesting problem. Most existing methods address this issue by applying two generators or one generator with extra conditional inputs. In this paper, we propose a novel self-perception method based on GANs for automatic face attribute inversion. The proposed method takes face images as inputs and employs only a single generator without being conditioned on other inputs. Benefiting from the multi-loss strategy and a modified U-net structure, our model is quite stable in training and capable of preserving finer details of the original face images.
Submitted 14 January, 2020;
originally announced January 2020.
-
Improved measurements of the absolute branching fractions of the inclusive decays $D^{+(0)}\toφX$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
M. Alekseev,
A. Amoroso,
F. F. An,
Q. An,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
J. V. Bennett,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Biernat,
J. Bloms,
I. Boyko,
R. A. Briere,
H. Cai
, et al. (462 additional authors not shown)
Abstract:
By analyzing 2.93 fb$^{-1}$ of $e^+e^-$ annihilation data taken at the center-of-mass energy $\sqrt s=$ 3.773 GeV with the BESIII detector, we determine the branching fractions of the inclusive decays $D^+\toφX$ and $D^0\toφX$ to be $(1.135\pm0.034\pm0.031)\%$ and $(1.091\pm0.027\pm0.035)\%$, respectively, where $X$ denotes any possible particle combination. The first uncertainties are statistical and the second systematic. We also determine the branching fractions of the decays $D\toφX$ and their charge conjugate modes $\bar{D}\toφ\bar{X}$ separately for the first time, and no significant CP asymmetry is observed.
Submitted 17 October, 2019; v1 submitted 14 August, 2019;
originally announced August 2019.