Wi-Fi Sensing Tool Release: Gathering 802.11ax Channel State Information from a Commercial Wi-Fi Access Point

Authors: Zisheng Wang, Feng Li, Hangbin Zhao, Zihuan Mao, Yaodong Zhang, Qisheng Huang, Bo Cao, Mingming Cao, Baolin He, Qilin Hou

Abstract: Wi-Fi sensing has emerged as a powerful technology, leveraging channel state information (CSI) extracted from wireless data packets to enable diverse applications, ranging from human presence detection to gesture recognition and health monitoring. However, tools for CSI extraction from commercial Wi-Fi access points are scarce and out of date. This paper introduces ZTECSITool, a toolkit designed to capture high-resolution CSI measurements from commercial Wi-Fi 6 (802.11ax) access points, supporting bandwidths up to 160 MHz and 512 subcarriers. ZTECSITool bridges a critical gap in Wi-Fi sensing research, facilitating the development of next-generation sensing systems. The toolkit includes customized firmware and open-source software tools for configuring, collecting, and parsing CSI data, offering researchers a robust platform for advanced sensing applications. We detail the command protocols for CSI extraction, including band selection, STA filtering, and report configuration, and provide insights into the data structure of the reported CSI. Additionally, we present a Python-based graphical interface for real-time CSI visualization and analysis.

Submitted 20 June, 2025; originally announced June 2025.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2506.12463 [pdf, ps, other]

Adding links wisely: how an influencer seeks for leadership in opinion dynamics?

Authors: Lingfei Wang, Yu Xing, Yuhao Yi, Ming Cao, Karl H. Johansson

Abstract: This paper investigates the problem of leadership development for an external influencer using the Friedkin-Johnsen (FJ) opinion dynamics model, where the influencer is modeled as a fully stubborn agent and leadership is quantified by social power. The influencer seeks to maximize her social power by strategically adding a limited number of links to regular agents. This optimization problem is shown to be equivalent to maximizing the absorbing probability to the influencer in an augmented Markov chain. The resulting objective function is both monotone and submodular, enabling the use of a greedy algorithm to compute an approximate solution. To handle large-scale networks efficiently, a random walk sampling over the Markov chain is employed to reduce computational complexity. Analytical characterizations of the solution are provided for both low and high stubbornness of regular agents. Specific network topologies are also examined: for complete graphs with rank-one weight matrices, the problem reduces to a hyperbolic 0-1 programming problem, which is solvable in polynomial time; for symmetric ring graphs with circulant weight matrices and uniform agent stubbornness, the optimal strategy involves selecting agents that are sufficiently dispersed across the network. Numerical simulations are presented for illustration.

Submitted 14 June, 2025; originally announced June 2025.
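
The abstract of "Adding links wisely" above relies on the fact that a greedy rule gives an approximate solution when the objective is monotone and submodular. The sketch below illustrates only that generic greedy rule under a cardinality budget. It is not the authors' algorithm: the absorbing-probability objective is replaced by a toy set-coverage function, and the names `greedy_maximize`, `coverage`, and `REACH` are hypothetical and introduced purely for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Set

def greedy_maximize(
    candidates: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    budget: int,
) -> Set[str]:
    """Pick up to `budget` items, each time adding the largest marginal gain."""
    pool = list(candidates)
    chosen: Set[str] = set()
    for _ in range(budget):
        base = objective(frozenset(chosen))
        best_item, best_gain = None, 0.0
        for item in pool:
            if item in chosen:
                continue
            gain = objective(frozenset(chosen | {item})) - base
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining item improves the objective
            break
        chosen.add(best_item)
    return chosen

# Toy stand-in objective (an assumption, not from the paper): each regular
# agent "reaches" a set of nodes, and the value of a selection is the number
# of distinct nodes reached. Set coverage is monotone and submodular.
REACH = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}

def coverage(selected: FrozenSet[str]) -> float:
    covered: Set[int] = set()
    for agent in selected:
        covered |= REACH[agent]
    return float(len(covered))

picked = greedy_maximize(sorted(REACH), coverage, budget=2)
print(sorted(picked), coverage(frozenset(picked)))
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimal value, which is the usual sense of "approximate solution" in this setting.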

arXiv:2505.08592 [pdf, other]

Communication-Efficient Distributed Online Nonconvex Optimization with Time-Varying Constraints

Authors: Kunpeng Zhang, Lei Xu, Xinlei Yi, Guanghui Wen, Ming Cao, Karl H. Johansson, Tianyou Chai, Tao Yang

Abstract: This paper considers distributed online nonconvex optimization with time-varying inequality constraints over a network of agents, where the nonconvex local loss and convex local constraint functions can vary arbitrarily across iterations, and information about them is privately revealed to each agent at each iteration. For a uniformly jointly strongly connected time-varying directed graph, we propose two distributed bandit online primal-dual algorithms with compressed communication to efficiently utilize communication resources in the one-point and two-point bandit feedback settings, respectively. In nonconvex optimization, finding a globally optimal decision is often NP-hard. As a result, the standard regret metric used in online convex optimization becomes inapplicable. To measure the performance of the proposed algorithms, we use a network regret metric grounded in the first-order optimality condition associated with the variational inequality. We show that the compressed algorithms establish sublinear network regret and cumulative constraint violation bounds. Finally, a simulation example is presented to validate the theoretical results.

Submitted 14 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

Comments: 56 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2503.22410

arXiv:2504.20504 [pdf, other]

Quality-factor inspired deep neural network solver for solving inverse scattering problems

Authors: Yutong Du, Zicheng Liu, Miao Cao, Zupeng Liang, Yali Zong, Changyou Li

Abstract: Deep neural networks have been applied to address electromagnetic inverse scattering problems (ISPs) and have shown superior imaging performance, which can be affected by the training dataset, the network architecture and the applied loss function. Here, the quality of data samples is assessed by a defined quality factor. Based on the quality factor, the composition of the training dataset is optimized. The network architecture integrates residual connections and a channel attention mechanism to improve feature extraction. A loss function that incorporates data-fitting error, physical-information constraints and the desired feature of the solution is designed and analyzed to suppress the background artifacts and improve the reconstruction accuracy. Various numerical analyses are performed to demonstrate the superiority of the proposed quality-factor inspired deep neural network (QuaDNN) solver, and the imaging performance is finally verified by an experimental imaging test.

Submitted 29 April, 2025; originally announced April 2025.

arXiv:2503.24089 [pdf, ps, other]

Initial State Privacy of Nonlinear Systems on Riemannian Manifolds

Authors: Le Liu, Yu Kawano, Antai Xie, Ming Cao

Abstract: In this paper, we investigate initial state privacy protection for discrete-time nonlinear closed systems. By capturing Riemannian geometric structures inherent in such privacy challenges, we refine the concept of differential privacy through the introduction of an initial state adjacency set based on Riemannian distances. A new differential privacy condition is formulated using incremental output boundedness, enabling the design of time-varying Laplacian noise to achieve specified privacy guarantees. The proposed framework extends beyond initial state protection to also cover system parameter privacy, which is demonstrated as a special application.

Submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.23922 [pdf, ps, other]

Distributionally Robust Model Order Reduction for Linear Systems

Authors: Le Liu, Yu Kawano, Yangming Dou, Ming Cao

Abstract: In this paper, we investigate distributionally robust model order reduction for linear, discrete-time, time-invariant systems. The external input is assumed to follow an uncertain distribution within a Wasserstein ambiguity set. We begin by considering the case where the distribution is certain and formulate an optimization problem to obtain the reduced model. When the distribution is uncertain, the interaction between the reduced-order model and the distribution is modeled by a Stackelberg game. To ensure solvability, we first introduce the Gelbrich distance and demonstrate that the Stackelberg game within a Wasserstein ambiguity set is equivalent to that within a Gelbrich ambiguity set. Then, we propose a nested optimization problem to solve the Stackelberg game. Furthermore, the nested optimization problem is relaxed into a nested convex optimization problem, ensuring computational feasibility. Finally, a simulation is presented to illustrate the effectiveness of the proposed method.

Submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.23903 [pdf, ps, other]

Privacy Preservation for Statistical Input in Dynamical Systems

Authors: Le Liu, Yu Kawano, Ming Cao

Abstract: This paper addresses the challenge of privacy preservation for statistical inputs in dynamical systems. Motivated by an autonomous building application, we formulate a privacy preservation problem for statistical inputs in linear time-invariant systems. What makes this problem widely applicable is that the inputs, rather than being assumed to be deterministic, follow a probability distribution, inherently embedding privacy-sensitive information that requires protection. This formulation also presents a technical challenge as conventional differential privacy mechanisms are not directly applicable. Through rigorous analysis, we develop a strategy to achieve $(0, δ)$ differential privacy by adding noise. Finally, the effectiveness of our methods is demonstrated by revisiting the autonomous building application.

Submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.22410 [pdf, ps, other]

Distributed Constrained Online Nonconvex Optimization with Compressed Communication

Authors: Kunpeng Zhang, Lei Xu, Xinlei Yi, Ming Cao, Karl H. Johansson, Tianyou Chai, Tao Yang

Abstract: This paper considers distributed online nonconvex optimization with time-varying inequality constraints over a network of agents. For a time-varying graph, we propose a distributed online primal-dual algorithm with compressed communication to efficiently utilize communication resources. We show that the proposed algorithm establishes an $\mathcal{O}(T^{\max\{1-\theta_1,\theta_1\}})$ network regret bound and an $\mathcal{O}(T^{1-\theta_1/2})$ network cumulative constraint violation bound, where $T$ is the number of iterations and $\theta_1 \in (0,1)$ is a user-defined trade-off parameter. When Slater's condition holds (i.e., there is a point that strictly satisfies the inequality constraints at all iterations), the network cumulative constraint violation bound is reduced to $\mathcal{O}(T^{1-\theta_1})$. These bounds are comparable to the state-of-the-art results established by existing distributed online algorithms with perfect communication for distributed online convex optimization with (time-varying) inequality constraints. Finally, a simulation example is presented to validate the theoretical results.

Submitted 28 March, 2025; originally announced March 2025.

Comments: 35 pages, 2 figures. arXiv admin note: text overlap with arXiv:2411.11574
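
As a worked reading of the bounds quoted in the abstract of arXiv:2503.22410 above, the following substitutes the value $\theta_1 = 1/2$ into the three stated exponents. The value is chosen here only for illustration and is not taken from the paper.

```latex
\begin{align*}
\theta_1 = \tfrac12:\quad
& \text{network regret} = \mathcal{O}\big(T^{\max\{1/2,\,1/2\}}\big) = \mathcal{O}\big(T^{1/2}\big), \\
& \text{constraint violation} = \mathcal{O}\big(T^{1 - 1/4}\big) = \mathcal{O}\big(T^{3/4}\big), \\
& \text{constraint violation under Slater's condition} = \mathcal{O}\big(T^{1 - 1/2}\big) = \mathcal{O}\big(T^{1/2}\big).
\end{align*}
```

The regret exponent $\max\{1-\theta_1,\theta_1\}$ is smallest at $\theta_1 = 1/2$. Choosing $\theta_1 > 1/2$ lowers both violation exponents but raises the regret exponent to $\theta_1$, which is the trade-off the parameter controls.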

arXiv:2503.21487 [pdf, ps, other]

On Tensor-based Polynomial Hamiltonian Systems

Authors: Shaoxuan Cui, Guofeng Zhang, Hildeberto Jardon-Kojakhmetov, Ming Cao

Abstract: It is known that a linear system with a system matrix A constitutes a Hamiltonian system with a quadratic Hamiltonian if and only if A is a Hamiltonian matrix. This provides a straightforward method to verify whether a linear system is Hamiltonian or whether a given Hamiltonian function corresponds to a linear system. These techniques fundamentally rely on the properties of Hamiltonian matrices. Building on recent advances in tensor algebra, this paper generalizes such results to a broad class of polynomial systems. As the systems of interest can be naturally represented in tensor forms, we name them tensor-based polynomial systems. Our main contribution is that we formally define Hamiltonian cubical tensors and characterize their properties. Crucially, we demonstrate that a tensor-based polynomial system is a Hamiltonian system with a polynomial Hamiltonian if and only if all associated system tensors are Hamiltonian cubical tensors, in direct parallel to the linear case. Additionally, we establish a computationally tractable stability criterion for tensor-based polynomial Hamiltonian systems. Finally, we validate all theoretical results through numerical examples and provide a further intuitive discussion.

Submitted 27 March, 2025; originally announced March 2025.
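
The linear-case test mentioned in the abstract of "On Tensor-based Polynomial Hamiltonian Systems" above can be sketched with the standard textbook definition: a 2n x 2n matrix A is Hamiltonian when J A is symmetric, where J = [[0, I], [-I, 0]]. The sketch below covers only this matrix case, not the paper's cubical-tensor generalization. The function name `is_hamiltonian_matrix` and the example matrices are hypothetical, and the sign convention for J is an assumption.

```python
from typing import List

Matrix = List[List[float]]

def is_hamiltonian_matrix(a: Matrix, tol: float = 1e-12) -> bool:
    """Return True if J @ a is symmetric, with J = [[0, I], [-I, 0]]."""
    size = len(a)
    if size == 0 or size % 2 != 0 or any(len(row) != size for row in a):
        return False
    n = size // 2
    # Left-multiplying by J moves the bottom block rows of `a` to the top
    # and the negated top block rows to the bottom.
    ja = [
        list(a[i + n]) if i < n else [-value for value in a[i - n]]
        for i in range(size)
    ]
    return all(
        abs(ja[i][j] - ja[j][i]) <= tol
        for i in range(size)
        for j in range(size)
    )

# Harmonic oscillator q' = p, p' = -q, with H = (q^2 + p^2) / 2.
oscillator = [[0.0, 1.0], [-1.0, 0.0]]
# Adding damping to p' breaks the Hamiltonian structure.
damped = [[0.0, 1.0], [-1.0, -0.5]]

print(is_hamiltonian_matrix(oscillator), is_hamiltonian_matrix(damped))
```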

arXiv:2502.13390 [pdf, other]

Deep-Unfolded Massive Grant-Free Transmission in Cell-Free Wireless Communication Systems

Authors: Gangle Sun, Mengyao Cao, Wenjin Wang, Wei Xu, Christoph Studer

Abstract: Grant-free transmission and cell-free communication are vital in improving coverage and quality-of-service for massive machine-type communication. This paper proposes a novel framework of joint active user detection, channel estimation, and data detection (JACD) for massive grant-free transmission in cell-free wireless communication systems. We formulate JACD as an optimization problem and solve it approximately using forward-backward splitting. To deal with the discrete symbol constraint, we relax the discrete constellation to its convex hull and propose two approaches that promote solutions from the constellation set. To reduce complexity, we replace costly computations with approximate shrinkage operations and approximate posterior mean estimator computations. To improve active user detection (AUD) performance, we introduce a soft-output AUD module that considers both the data estimates and channel conditions. To jointly optimize all algorithm hyper-parameters and to improve JACD performance, we further deploy deep unfolding together with a momentum strategy, resulting in two algorithms called DU-ABC and DU-POEM. Finally, we demonstrate the efficacy of the proposed JACD algorithms via extensive system simulations.

Submitted 18 February, 2025; originally announced February 2025.

Comments: To appear in the IEEE Transactions on Signal Processing

arXiv:2502.08276 [pdf, ps, other]

Higher-order Laplacian dynamics on hypergraphs with cooperative and antagonistic interactions

Authors: Shaoxuan Cui, Chencheng Zhang, Bin Jiang, Hildeberto Jardón Kojakhmetov, Ming Cao

Abstract: Laplacian dynamics on a signless graph characterize a class of linear interactions, where pairwise cooperative interactions between all agents lead to the convergence to a common state. On a structurally balanced signed graph, the agents converge to values of the same magnitude but opposite signs (bipartite consensus), as illustrated by the well-known Altafini model. These interactions have been modeled using traditional graphs, where the relationships between agents are always pairwise. In comparison, higher-order networks (such as hypergraphs) offer the possibility to capture more complex, group-wise interactions among agents. This raises a natural question: can collective behavior be analyzed by using hypergraphs? The answer is affirmative. In this paper, higher-order Laplacian dynamics on signless hypergraphs are first introduced and various collective convergence behaviors are investigated, in the framework of homogeneous and non-homogeneous polynomial systems. Furthermore, by employing gauge transformations and leveraging tensor similarities, we extend these dynamics to signed hypergraphs, drawing parallels to the Altafini model. Moreover, we explore non-polynomial interaction functions within this framework. The theoretical results are demonstrated through several numerical examples.

Submitted 12 February, 2025; originally announced February 2025.
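
The pairwise baseline described in the abstract of "Higher-order Laplacian dynamics on hypergraphs" above (consensus on a signless graph, bipartite consensus on a structurally balanced signed graph) can be illustrated with a small simulation of x' = -L x. This sketch covers only the classical graph case, not the paper's hypergraph dynamics. The forward-Euler discretization, the step size, the three-node graphs, and the names `signed_laplacian` and `simulate` are all assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]

def signed_laplacian(adj: Matrix) -> Matrix:
    """L = D - A, where D holds the row sums of |a_ij| (Altafini-style)."""
    n = len(adj)
    lap = [[-adj[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        lap[i][i] = sum(abs(adj[i][j]) for j in range(n) if j != i)
    return lap

def simulate(adj: Matrix, x0: List[float], dt: float = 0.1, steps: int = 500) -> List[float]:
    """Integrate x' = -L x with forward Euler and return the final state."""
    lap = signed_laplacian(adj)
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        flow = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi - dt * fi for xi, fi in zip(x, flow)]
    return x

# Path graph 0 - 1 - 2 with cooperative (positive) edges only.
cooperative = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
# Same graph with an antagonistic edge between nodes 1 and 2; it is
# structurally balanced with camps {0, 1} and {2}.
signed = [[0.0, 1.0, 0.0], [1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]

consensus = simulate(cooperative, [1.0, 2.0, 6.0])
bipartite = simulate(signed, [1.0, 2.0, 6.0])
print([round(v, 6) for v in consensus], [round(v, 6) for v in bipartite])
```

In the cooperative case the states approach the average of the initial values. In the signed case the two camps approach values of equal magnitude and opposite sign, which follows from applying the gauge transformation diag(1, 1, -1) to recover the cooperative dynamics.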

arXiv:2502.03171

Hybrid Near-Field and Far-Field Localization with Multiple Holographic MIMO Surfaces

Authors: Mengyuan Cao

Abstract: Localization methods based on holographic multiple input multiple output (HMIMO) have gained much attention for their potential to achieve high accuracy. By deploying multiple HMIMOs, we can improve the link quality and system coverage. As the scale of HMIMO increases to improve beam control capability, the near-field (NF) region of each HMIMO expands. However, existing multiple HMIMO-enabled methods mainly focus on the far-field (FF) of each HMIMO, which leads to low localization accuracy when applied in the NF. In this paper, a hybrid NF and FF localization method aided by multiple RISs, a low-cost implementation of HMIMO, is proposed. In such a scenario, it is difficult to achieve user localization and RIS optimization since the equivalent NF of all RISs expands, which results in high complexity, and we need to handle the interference caused by multiple RISs. To tackle this challenge, we propose a two-phase RIS-enabled localization method that first estimates the relative locations of the user to each RIS and then fuses the results to obtain the global estimation. In this way, the algorithm complexity is reduced. We formulate the RIS optimization problem to keep the RIS sidelobe as low as possible to minimize the interference. The effectiveness of the proposed method is verified through simulations.

Submitted 1 March, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

Comments: After further discussion and review, we believe that the current research findings require additional experiments and verification, and more work is currently underway. Therefore, we would like to withdraw the paper in order to further improve the study

arXiv:2501.17868 [pdf, other]

Hybrid Near-field and Far-field Localization with Holographic MIMO

Authors: Mengyuan Cao, Haobo Zhang, Yonina C. Eldar, Hongliang Zhang

Abstract: Due to its ability to precisely control wireless beams, holographic multiple-input multiple-output (HMIMO) is expected to be a promising solution to achieve high-accuracy localization. However, as the scale of HMIMO increases to improve beam control capability, the corresponding near-field (NF) region expands, indicating that users may exist in both NF and far-field (FF) regions with different electromagnetic transmission characteristics. As a result, existing methods for pure NF or FF localization are no longer applicable. We consider a hybrid NF and FF localization scenario in this paper, where a base station (BS) locates multiple users in both NF and FF regions with the aid of a reconfigurable intelligent surface (RIS), which is a low-cost implementation of HMIMO. In such a scenario, it is difficult to locate the users and optimize the RIS phase shifts because whether the location of the user is in the NF or FF region is unknown, and the channels of different users are coupled. To tackle this challenge, we propose an RIS-enabled localization method that searches the users in both NF and FF regions and tackles the coupling issue by jointly estimating all user locations. We derive the localization error bound by considering the channel coupling and propose an RIS phase shift optimization algorithm that minimizes the derived bound. Simulations show the effectiveness of the proposed method and demonstrate the performance gain compared to pure NF and FF techniques.

Submitted 14 January, 2025; originally announced January 2025.

arXiv:2501.13751 [pdf, other]

On Disentangled Training for Nonlinear Transform in Learned Image Compression

Authors: Han Li, Shaohui Li, Wenrui Dai, Maida Cao, Nuowen Kan, Chenglin Li, Junni Zou, Hongkai Xiong

Abstract: Learned image compression (LIC) has demonstrated superior rate-distortion (R-D) performance compared to traditional codecs, but is challenged by training inefficiency that could incur more than two weeks to train a state-of-the-art model from scratch. Existing LIC methods overlook the slow convergence caused by compacting energy in learning nonlinear transforms. In this paper, we first reveal that such energy compaction consists of two components, i.e., feature decorrelation and uneven energy modulation. On such basis, we propose a linear auxiliary transform (AuxT) to disentangle energy compaction in training nonlinear transforms. The proposed AuxT obtains coarse approximation to achieve efficient energy compaction such that distribution fitting with the nonlinear transforms can be simplified to fine details. We then develop wavelet-based linear shortcuts (WLSs) for AuxT that leverage wavelet-based downsampling and orthogonal linear projection for feature decorrelation and subband-aware scaling for

Submitted 15 February, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

Comments: Accepted by ICLR2025

arXiv:2501.06566 [pdf, other]

Cooperative Aerial Robot Inspection Challenge: A Benchmark for Heterogeneous Multi-UAV Planning and Lessons Learned

Authors: Muqing Cao, Thien-Minh Nguyen, Shenghai Yuan, Andreas Anastasiou, Angelos Zacharia, Savvas Papaioannou, Panayiotis Kolios, Christos G. Panayiotou, Marios M. Polycarpou, Xinhang Xu, Mingjie Zhang, Fei Gao, Boyu Zhou, Ben M. Chen, Lihua Xie

Abstract: We propose the Cooperative Aerial Robot Inspection Challenge (CARIC), a simulation-based benchmark for motion planning algorithms in heterogeneous multi-UAV systems. CARIC features UAV teams with complementary sensors, realistic constraints, and evaluation metrics prioritizing inspection quality and efficiency. It offers a ready-to-use perception-control software stack and diverse scenarios to support the development and evaluation of task allocation and motion planning algorithms. Competitions using CARIC were held at IEEE CDC 2023 and the IROS 2024 Workshop on Multi-Robot Perception and Navigation, attracting innovative solutions from research teams worldwide. This paper examines the top three teams from CDC 2023, analyzing their exploration, inspection, and task allocation strategies while drawing insights into their performance across scenarios. The results highlight the task's complexity and suggest promising directions for future research in cooperative multi-UAV systems.

Submitted 14 January, 2025; v1 submitted 11 January, 2025; originally announced January 2025.

Comments: Please find our website at https://ntu-aris.github.io/caric

arXiv:2412.15819 [pdf, ps, other]

Robustness-enhanced Myoelectric Control with GAN-based Open-set Recognition

Authors: Cheng Wang, Ziyang Feng, Pin Zhang, Manjiang Cao, Yiming Yuan, Tengfei Chang

Abstract: Electromyography (EMG) signals are widely used in human motion recognition and medical rehabilitation, yet their variability and susceptibility to noise significantly limit the reliability of myoelectric control systems. Existing recognition algorithms often fail to handle unfamiliar actions effectively, leading to system instability and errors. This paper proposes a novel framework based on Generative Adversarial Networks (GANs) to enhance the robustness and usability of myoelectric control systems by enabling open-set recognition. The method incorporates a GAN-based discriminator to identify and reject unknown actions, maintaining system stability by preventing misclassifications. Experimental evaluations on publicly available and self-collected datasets demonstrate a recognition accuracy of 97.6% for known actions and a 23.6% improvement in Active Error Rate (AER) after rejecting unknown actions. The proposed approach is computationally efficient and suitable for deployment on edge devices, making it practical for real-world applications.

Submitted 29 May, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

Comments: 11 pages, 14 figures

arXiv:2411.11574 [pdf, ps, other]

Reduced Network Cumulative Constraint Violation for Distributed Bandit Convex Optimization under Slater Condition

Authors: Kunpeng Zhang, Xinlei Yi, Jinliang Ding, Ming Cao, Karl H. Johansson, Tao Yang

Abstract: This paper studies the distributed bandit convex optimization problem with time-varying inequality constraints, where the goal is to minimize network regret and cumulative constraint violation. To calculate network cumulative constraint violation, existing distributed bandit online algorithms solving this problem directly use the clipped constraint function to replace its original constraint function. However, the use of the clipping operation renders the Slater condition (i.e., there exists a point that strictly satisfies the inequality constraints at all iterations) ineffective in achieving reduced network cumulative constraint violation. To tackle this challenge, we propose a new distributed bandit online primal-dual algorithm. If local loss functions are convex, we show that the proposed algorithm establishes sublinear network regret and cumulative constraint violation bounds. When the Slater condition holds, the network cumulative constraint violation bound is reduced. In addition, if local loss functions are strongly convex, for the case where strongly convex parameters are unknown, the network regret bound is reduced. For the case where strongly convex parameters are known, the network regret and cumulative constraint violation bounds are further reduced. To the best of our knowledge, this paper is among the first to establish reduced (network) cumulative constraint violation bounds for (distributed) bandit convex optimization with time-varying constraints under the Slater condition. Finally, a numerical example is provided to verify the theoretical results.

Submitted 28 March, 2025; v1 submitted 18 November, 2024; originally announced November 2024.

arXiv:2410.19857 [pdf, other]

Co-evolutionary control of a class of coupled mixed-feedback systems

Authors: Luis Guillermo Venegas-Pineda, Hildeberto Jardón-Kojakhmetov, Ming Cao

Abstract: Oscillatory behavior is ubiquitous in many natural and engineered systems, often emerging through self-regulating mechanisms. In this paper, we address the challenge of stabilizing a desired oscillatory pattern in a networked system where neither the internal dynamics nor the interconnections can be changed. To achieve this, we propose two distinct control strategies. The first requires the full knowledge of the system generating the desired oscillatory pattern, while the second only needs local error information. In addition, the controllers are implemented as co-evolutionary, or adaptive, rules of some edges in an extended plant-controller network. We validate our approach in several insightful scenarios, including synchronization and systems with time-varying network structures.

Submitted 4 February, 2025; v1 submitted 22 October, 2024; originally announced October 2024.

arXiv:2410.03178 [pdf, other]

Optimal Control in Both Steady State and Transient Process with Unknown Disturbances

Authors: Ming Li, Zhaojian Wang, Feng Liu, Ming Cao, Bo Yang

Abstract: The scheme of online optimization as a feedback controller is widely used to steer the states of a physical system to the optimal solution of a predefined optimization problem. Such methods focus on regulating the physical states to the optimal solution in the steady state, without considering the performance during the transient process. In this paper, we simultaneously consider the performance in both the steady state and the transient process of a linear time-invariant system with unknown disturbances. The performance of the transient process is illustrated by the concept of overtaking optimality. An overtaking optimal controller with known disturbances is derived to achieve the transient overtaking optimality while guaranteeing steady-state performance. Then, we propose a disturbance independent near-optimal controller, which can achieve optimal steady-state performance and approach the overtaking optimal performance in the transient process. The system performance gap between the overtaking optimal controller and the proposed controller proves to be inversely proportional to the control gains. A case study on a power system with four buses is used to validate the effectiveness of the two controllers.

Submitted 4 October, 2024; originally announced October 2024.

arXiv:2409.05044 [pdf, other]

An Analysis of Logit Learning with the r-Lambert Function

Authors: Rory Gavin, Ming Cao, Keith Paarporn

Abstract: The well-known replicator equation in evolutionary game theory describes how population-level behaviors change over time when individuals make decisions using simple imitation learning rules. In this paper, we study evolutionary dynamics based on a fundamentally different class of learning rules known as logit learning. Numerous previous studies on logit dynamics provide numerical evidence of bifurcations of multiple fixed points for several types of games. Our results here provide a more explicit analysis of the logit fixed points and their stability properties for the entire class of two-strategy population games -- by way of the $r$-Lambert function. We find that for Prisoner's Dilemma and anti-coordination games, there is only a single fixed point for all rationality levels. However, coordination games exhibit a pitchfork bifurcation: there is a single fixed point in a low-rationality regime, and three fixed points in a high-rationality regime. We provide an implicit characterization for the level of rationality where this bifurcation occurs. In all cases, the set of logit fixed points converges to the full set of Nash equilibria in the high rationality limit.

Submitted 16 February, 2025; v1 submitted 8 September, 2024; originally announced September 2024.

Comments: 9 pages, one figure, to be included in CDC 2024 conference proceedings

arXiv:2406.16935 [pdf, other]

Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex

Authors: Spandan Madan, Will Xiao, Mingran Cao, Hanspeter Pfister, Margaret Livingstone, Gabriel Kreiman

Abstract: We characterized the generalization capabilities of DNN-based encoding models when predicting neuronal responses from the visual cortex. We collected MacaqueITBench, a large-scale dataset of neural population responses from the macaque inferior temporal (IT) cortex to over $300,000$ images, comprising $8,233$ unique natural images presented to seven monkeys over $109$ sessions. Using MacaqueITBench, we investigated the impact of distribution shifts on models predicting neural activity by dividing the images into Out-Of-Distribution (OOD) train and test splits. The OOD splits included several different image-computable types including image contrast, hue, intensity, temperature, and saturation. Compared to the performance on in-distribution test images -- the conventional way these models have been evaluated -- models performed worse at predicting neuronal responses to out-of-distribution images, retaining as little as $20\%$ of the performance on in-distribution test images. The generalization performance under OOD shifts can be well accounted for by a simple image similarity metric -- the cosine distance between image representations extracted from a pre-trained object recognition model is a strong predictor of neural predictivity under different distribution shifts. The dataset of images, neuronal firing rate recordings, and computational benchmarks are hosted publicly at: https://bit.ly/3zeutVd.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.12703 [pdf, other]

Coarse-Fine Spectral-Aware Deformable Convolution For Hyperspectral Image Reconstruction

Authors: Jincheng Yang, Lishun Wang, Miao Cao, Huan Wang, Yinping Zhao, Xin Yuan

Abstract: We study the inverse problem of Coded Aperture Snapshot Spectral Imaging (CASSI), which captures a spatial-spectral data cube using snapshot 2D measurements and uses algorithms to reconstruct 3D hyperspectral images (HSI). However, current methods based on Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies and non-local similarities. The recently popular Transformer-based methods are poorly deployed on downstream tasks due to the high computational cost caused by self-attention. In this paper, we propose Coarse-Fine Spectral-Aware Deformable Convolution Network (CFSDCN), applying deformable convolutional networks (DCN) to this task for the first time. Considering the sparsity of HSI, we design a deformable convolution module that exploits its deformability to capture long-range dependencies and non-local similarities. In addition, we propose a new spectral information interaction module that considers both coarse-grained and fine-grained spectral similarities. Extensive experiments demonstrate that our CFSDCN significantly outperforms previous state-of-the-art (SOTA) methods on both simulated and real HSI datasets.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 7 pages, 5 figures, Accepted by ICIP2024

arXiv:2406.06329 [pdf, other]

A Parameter-efficient Language Extension Framework for Multilingual ASR

Authors: Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee

Abstract: Covering all languages with a multilingual speech recognition model (MASR) is very difficult. Performing language extension on top of an existing MASR is a desirable choice. In this study, the MASR continual learning problem is probabilistically decomposed into language identity prediction (LP) and cross-lingual adaptation (XLA) sub-problems. Based on this, we propose an architecture-based framework for language extension that can fundamentally solve catastrophic forgetting, dubbed PELE. PELE is designed to be parameter-efficient, incrementally incorporating an add-on module to adapt to a new language. Specifically, different parameter-efficient fine-tuning (PEFT) modules and their variants are explored as potential candidates to perform XLA. Experiments are carried out on 5 new languages with a wide range of low-resourced data sizes. The best-performing PEFT candidate can achieve satisfactory performance across all languages and demonstrates superiority in three of five languages over the continual joint learning setting. Notably, PEFT methods focusing on weight parameters or input features are revealed to be limited in performance, showing significantly inferior extension capabilities compared to inserting a lightweight module in between layers such as an Adapter.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech 2024

arXiv:2405.18969 [pdf, ps, other]

Global and local observability of hypergraphs

Authors: Chencheng Zhang, Hao Yang, Shaoxuan Cui, Bin Jiang, Ming Cao

Abstract: This paper studies observability for non-uniform hypergraphs with inputs and outputs. To capture higher-order interactions, we define a canonical non-homogeneous dynamical system with nonlinear outputs on hypergraphs. We then construct algebraic necessary and sufficient conditions based on polynomial ideals and varieties for global observability at an initial state of hypergraphs. An example is given to illustrate the proposed criteria for observability. Further, necessary and sufficient conditions for local observability are derived based on rank conditions of observability matrices, which provide a framework to study local observability for non-uniform hypergraphs. Finally, the similarity of observability for hypergraphs is proposed using similarity of tensors, which reveals the relation of observability between two hypergraphs and helps to check the observability intuitively.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.18333 [pdf, other]

On the analysis of a higher-order Lotka-Volterra model: an application of S-tensors and the polynomial complementarity problem

Authors: Shaoxuan Cui, Qi Zhao, Guofeng Zhang, Hildeberto Jardón-Kojakhmetov, Ming Cao

Abstract: It is known that the effect of species' density on species' growth is non-additive in real ecological systems. This challenges the conventional Lotka-Volterra model, where the interactions are always pairwise and their effects are additive. To address this challenge, we introduce HOIs (Higher-Order Interactions) which are able to capture, for example, the indirect effect of one species on a second one correlating to a third species. Towards this end, we propose a general higher-order Lotka-Volterra model. We provide an existence result of a positive equilibrium for a non-homogeneous polynomial equation system with the help of S-tensors. Afterward, by utilizing the latter result, as well as the theory of monotone systems and results from the polynomial complementarity problem, we provide comprehensive results regarding the existence, uniqueness, and stability of the corresponding equilibrium. These results can be regarded as natural extensions of many analogous ones for the classical Lotka-Volterra model, especially in the case of full cooperation, competition among two factions, and pure competition. Finally, illustrative numerical examples are provided to highlight our contributions.

Submitted 8 July, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

arXiv:2404.06784 [pdf]

Statistical evaluation of 571 GaAs quantum point contact transistors showing the 0.7 anomaly in quantized conductance using millikelvin cryogenic on-chip multiplexing

Authors: Pengcheng Ma, Kaveh Delfanazari, Reuben K. Puddy, Jiahui Li, Moda Cao, Teng Yi, Jonathan P. Griffiths, Harvey E. Beere, David A. Ritchie, Michael J. Kelly, Charles G. Smith

Abstract: The mass production and the practical number of cryogenic quantum devices producible in a single chip are limited by the number of electrical contact pads and wiring of the cryostat or dilution refrigerator. It is, therefore, beneficial to contrast the measurements of hundreds of devices fabricated in a single chip in one cooldown process to promote the scalability, integrability, reliability, and reproducibility of quantum devices and to save evaluation time, cost and energy. Here, we use a cryogenic on-chip multiplexer architecture and investigate the statistics of the 0.7 anomaly observed on the first three plateaus of the quantized conductance of semiconductor quantum point contact (QPC) transistors. Our single chips contain 256 split gate field effect QPC transistors (QFET) each, with two 16-branch multiplexed source-drain and gate pads, allowing individual transistors to be selected, addressed and controlled through an electrostatic gate voltage process. A total of 1280 quantum transistors with nano-scale dimensions are patterned in 5 different chips of GaAs heterostructures. From the measurements of 571 functioning QPCs taken at temperatures T = 1.4 K and T = 40 mK, it is found that the spontaneous polarisation model and Kondo effect do not fit our results. Furthermore, some of the features in our data largely agree with the van Hove model with short-range interactions.
Our approach provides further insight into the quantum mechanical properties and microscopic origin of the 0.7 anomaly in QPCs, paving the way for the development of semiconducting quantum circuits and integrated cryogenic electronics, for scalable quantum logic control, readout, synthesis, and processing applications.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2403.19238 [pdf, other]

Taming Lookup Tables for Efficient Image Retouching

Authors: Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang

Abstract: The widespread use of high-definition screens in edge devices, such as end-user cameras, smartphones, and televisions, is spurring a significant demand for image enhancement. Existing enhancement models often optimize for high performance while falling short of reducing hardware inference time and power consumption, especially on edge devices with constrained computing and storage resources. To this end, we propose Image Color Enhancement Lookup Table (ICELUT) that adopts LUTs for extremely efficient edge inference, without any convolutional neural network (CNN). During training, we leverage pointwise (1x1) convolution to extract color information, alongside a split fully connected layer to incorporate global information. Both components are then seamlessly converted into LUTs for hardware-agnostic deployment. ICELUT achieves near-state-of-the-art performance and remarkably low power consumption. We observe that the pointwise network structure exhibits robust scalability, maintaining performance even with a heavily downsampled 32x32 input image. These enable ICELUT, the first-ever purely LUT-based image enhancer, to reach an unprecedented speed of 0.4ms on GPU and 7ms on CPU, at least one order of magnitude faster than any CNN solution. Code is available at https://github.com/Stephen0808/ICELUT.

Submitted 13 July, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

Comments: Accepted by ECCV2024

arXiv:2403.03416 [pdf, other]

On discrete-time polynomial dynamical systems on hypergraphs

Authors: Shaoxuan Cui, Guofeng Zhang, Hildeberto Jardón-Kojakhmetov, Ming Cao

Abstract: This paper studies the stability of discrete-time polynomial dynamical systems on hypergraphs by utilizing the Perron-Frobenius theorem for nonnegative tensors with respect to the tensors' Z-eigenvalues and Z-eigenvectors. Firstly, for a multilinear polynomial system on a uniform hypergraph, we study the stability of the origin of the corresponding systems. Next, we extend our results to non-homogeneous polynomial systems on non-uniform hypergraphs. We confirm that the local stability of any discrete-time polynomial system is in general dominated by pairwise terms. Assuming that the origin is locally stable, we construct a conservative (but explicit) region of attraction from the system parameters. Finally, we validate our results via some numerical examples.

Submitted 5 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

Comments: arXiv admin note: text overlap with arXiv:2401.03652

arXiv:2403.03048 [pdf, other]

Design of Stochastic Quantizers for Privacy Preservation

Authors: Le Liu, Yu Kawano, Ming Cao

Abstract: In this paper, we examine the role of stochastic quantizers for privacy preservation. We first employ a static stochastic quantizer and investigate its corresponding privacy-preserving properties. Specifically, we demonstrate that a sufficiently large quantization step guarantees $(0, δ)$ differential privacy. Additionally, the degradation of control performance caused by quantization is evaluated as the tracking error of output regulation. These two analyses characterize the trade-off between privacy and control performance, determined by the quantization step. This insight enables us to use quantization intentionally as a means to achieve the seemingly conflicting two goals of maintaining control performance and preserving privacy at the same time; towards this end, we further investigate a dynamic stochastic quantizer. Under a stability assumption, the dynamic stochastic quantizer can enhance privacy, more than the static one, while achieving the same control performance. We further handle the unstable case by additionally applying input Gaussian noise.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 11 pages, 4 figures

arXiv:2402.09752 [pdf]

Vector spectrometer with Hertz-level resolution and super-recognition capability

Authors: Ting Qing, Shupeng Li, Huashan Yang, Lihan Wang, Yijie Fang, Xiaohu Tang, Meihui Cao, Jianming Lu, Jijun He, Junqiu Liu, Yueguang Lyu, Shilong Pan

Abstract: High-resolution optical spectrometers are crucial in revealing intricate characteristics of signals, determining laser frequencies, measuring physical constants, identifying substances, and advancing biosensing applications. Conventional spectrometers, however, often grapple with inherent trade-offs among spectral resolution, wavelength range, and accuracy. Furthermore, even at high resolution, resolving overlapping spectral lines during spectroscopic analyses remains a huge challenge. Here, we propose a vector spectrometer with ultrahigh resolution, combining broadband optical frequency hopping, ultrafine microwave-photonic scanning, and vector detection. A programmable frequency-hopping laser was developed, facilitating a sub-Hz linewidth and Hz-level frequency stability, an improvement of four and six orders of magnitude, respectively, compared to those of state-of-the-art tunable lasers. We also designed an asymmetric optical transmitter and receiver to eliminate measurement errors arising from modulation nonlinearity and multi-channel crosstalk. The resultant vector spectrometer exhibits an unprecedented frequency resolution of 2 Hz, surpassing the state-of-the-art by four orders of magnitude, over a 33-nm range. Through high-resolution vector analysis, we observed that group delay information enhances the separation capability of overlapping spectral lines by over 47%, significantly streamlining the real-time identification of diverse substances.
Our technique fills the gap in optical spectrometers with resolutions below 10 kHz and enables vector measurement to embrace revolution in functionality.

Submitted 6 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: 21 pages, 6 figures

arXiv:2401.06334 [pdf, other]

Unified Near-field and Far-field Localization with Holographic MIMO

Authors: Mengyuan Cao, Haobo Zhang, Boya Di, Hongliang Zhang

Abstract: Localization using holographic multiple-input multiple-output surfaces, such as the reconfigurable intelligent surface (RIS), has gained increasing attention due to its ability to accurately localize users in non-line-of-sight conditions. However, existing RIS-enabled localization methods assume the users are in either the near-field (NF) or the far-field (FF) region, which results in high complexity or low localization accuracy, respectively, when they are applied in the whole area. In this paper, a unified NF and FF localization method is proposed for the RIS-enabled localization system to overcome the above issue. Specifically, the NF and FF regions are both divided into grids. The RIS reflects the signals from the user to the base station (BS), and then the BS uses the received signals to determine the grid where the user is located. Compared with existing NF- or FF-only schemes, the design of the location estimation method and the RIS phase shift optimization algorithm is more challenging because they are based on a hybrid NF and FF model. To tackle these challenges, we formulate the optimization problems for location estimation and RIS phase shifts, and design two algorithms to effectively solve the formulated problems, respectively. The effectiveness of the proposed method is verified through simulations.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.03689 [pdf, other]

LUPET: Incorporating Hierarchical Information Path into Multilingual ASR

Authors: Wei Liu, Jingyong Hou, Dong Yang, Muyong Cao, Tan Lee

Abstract: Toward high-performance multilingual automatic speech recognition (ASR), various types of linguistic information and model design have demonstrated their effectiveness independently. They include language identity (LID), phoneme information, language-specific processing modules, and cross-lingual self-supervised speech representation. It is expected that leveraging their benefits synergistically in a unified solution would further improve the overall system performance. This paper presents a novel design of a hierarchical information path, named LUPET, which sequentially encodes, from the shallow layers to deep layers, multiple aspects of linguistic and acoustic information at diverse granularity scales. The path starts from LID prediction, followed by acoustic unit discovery, phoneme sharing, and finally token recognition routed by a mixture-of-experts. ASR experiments are carried out on 10 languages in the Common Voice corpus. The results demonstrate the superior performance of LUPET as compared to the baseline systems. Most importantly, LUPET effectively mitigates the issue of performance compromise of high-resource languages with low-resource ones in the multilingual setting.

Submitted 8 January, 2025; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: Accepted by Interspeech 2024

arXiv:2401.03652 [pdf, other]

On Metzler positive systems on hypergraphs

Authors: Shaoxuan Cui, Guofeng Zhang, Hildeberto Jardón-Kojakhmetov, Ming Cao

Abstract: In graph-theoretical terms, an edge in a graph connects two vertices while a hyperedge of a hypergraph connects two or more vertices. If the hypergraph's hyperedges further connect the same number of vertices, it is said to be uniform. In algebraic graph theory, a graph can be characterized by an adjacency matrix, and similarly, a uniform hypergraph can be characterized by an adjacency tensor. This similarity enables us to extend existing tools of matrix analysis for studying dynamical systems evolving on graphs to the study of a class of polynomial dynamical systems evolving on hypergraphs utilizing the properties of tensors. To be more precise, in this paper, we first extend the concept of a Metzler matrix to a Metzler tensor and then describe some useful properties of such tensors. Next, we focus on positive systems on hypergraphs associated with Metzler tensors. More importantly, we design control laws to stabilize the origin of this class of Metzler positive systems on hypergraphs. In the end, we apply our findings to two classic dynamical systems: a higher-order Lotka-Volterra population dynamics system and a higher-order SIS epidemic dynamic process. The corresponding novel stability results are accompanied by ample numerical examples.

Submitted 4 November, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

arXiv:2312.09445 [pdf, other]

IncepSE: Leveraging InceptionTime's performance with Squeeze and Excitation mechanism in ECG analysis

Authors: Tue Minh Cao, Nhat Hong Tran, Le Phi Nguyen, Hieu Huy Pham, Hung Thanh Nguyen

Abstract: Our study focuses on the potential for modifications of Inception-like architecture within the electrocardiogram (ECG) domain. To this end, we introduce IncepSE, a novel network characterized by strategic architectural incorporation that leverages the strengths of both InceptionTime and channel attention mechanisms. Furthermore, we propose a training setup that employs stabilization techniques aimed at tackling the formidable challenges of the severely imbalanced PTB-XL dataset and gradient corruption. By this means, we manage to set a new high for deep learning models trained in a supervised manner across the majority of tasks. Our model consistently surpasses InceptionTime by substantial margins compared to other state-of-the-art methods in this domain, notably a 0.013 AUROC score improvement in the "all" task, while also mitigating the inherent dataset fluctuations during training.

Submitted 16 November, 2023; originally announced December 2023.

arXiv:2310.17720 [pdf]

Advancing Brain Tumor Detection: A Thorough Investigation of CNNs, Clustering, and SoftMax Classification in the Analysis of MRI Images

Authors: Jonayet Miah, Duc M Cao, Md Abu Sayed3, Md Siam Taluckder, Md Sabbirul Haque, Fuad Mahmud

Abstract: Brain tumors pose a significant global health challenge due to their high prevalence and mortality rates across all age groups. Detecting brain tumors at an early stage is crucial for effective treatment and patient outcomes. This study presents a comprehensive investigation into the use of Convolutional Neural Networks (CNNs) for brain tumor detection using Magnetic Resonance Imaging (MRI) images. The dataset, consisting of MRI scans from both healthy individuals and patients with brain tumors, was processed and fed into the CNN architecture. The SoftMax Fully Connected layer was employed to classify the images, achieving an accuracy of 98%. To evaluate the CNN's performance, two other classifiers, Radial Basis Function (RBF) and Decision Tree (DT), were utilized, yielding accuracy rates of 98.24% and 95.64%, respectively. The study also introduced a clustering method for feature extraction, improving CNN's accuracy. Sensitivity, Specificity, and Precision were employed alongside accuracy to comprehensively evaluate the network's performance. Notably, the SoftMax classifier demonstrated the highest accuracy among the categorizers, achieving 99.52% accuracy on test data. The presented research contributes to the growing field of deep learning in medical image analysis. The combination of CNNs and MRI data offers a promising tool for accurately detecting brain tumors, with potential implications for early diagnosis and improved patient care.

Submitted 26 October, 2023; originally announced October 2023.

Journal ref: JOIV : International Journal on Informatics Visualization, JOIV : Int. J. Inform. Visualization ISSN / E-ISSN 2549-9610 / 2549-9904, 2023

arXiv:2309.02670 [pdf, other]

Progressive Attention Guidance for Whole Slide Vulvovaginal Candidiasis Screening

Authors: Jiangdong Cai, Honglin Xiong, Maosong Cao, Luyan Liu, Lichi Zhang, Qian Wang

Abstract: Vulvovaginal candidiasis (VVC) is the most prevalent human candidal infection, estimated to afflict approximately 75% of all women at least once in their lifetime. It leads to several symptoms including pruritus, vaginal soreness, and so on. Automatic whole slide image (WSI) classification is in high demand because of the huge burden of disease control and prevention. However, a WSI-based computer-aided VVC screening method is still lacking due to the scarce labeled data and unique properties of candida. Candida in WSI is challenging for conventional classification models to capture due to its distinctive elongated shape, the small proportion of its spatial distribution, and the style gap from WSIs. To make the model focus on the candida more easily, we propose an attention-guided method, which can obtain a robust diagnosis classification model. Specifically, we first use a pre-trained detection model as prior instruction to initialize the classification model. Then we design a Skip Self-Attention module to refine the attention onto the fine-grained features of candida. Finally, we use a contrastive learning method to alleviate the overfitting caused by the style gap of WSIs and suppress the attention to false positive regions. Our experimental results demonstrate that our framework achieves state-of-the-art performance. Code and example data are available at https://github.com/cjdbehumble/MICCAI2023-VVC-Screening.

Submitted 5 September, 2023; originally announced September 2023.

Comments: Accepted in the main conference MICCAI 2023

Journal ref: 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023)

arXiv:2308.14637 [pdf, other]

Joint Active User Detection, Channel Estimation, and Data Detection for Massive Grant-Free Transmission in Cell-Free Systems

Authors: Gangle Sun, Mengyao Cao, Wenjin Wang, Wei Xu, Christoph Studer

Abstract: Cell-free communication has the potential to significantly improve grant-free transmission in massive machine-type communication, wherein multiple access points jointly serve a large number of user equipments to improve coverage and spectral efficiency. In this paper, we propose a novel framework for joint active user detection (AUD), channel estimation (CE), and data detection (DD) for massive grant-free transmission in cell-free systems. We formulate an optimization problem for joint AUD, CE, and DD by considering both the sparsity of the data matrix, which arises from intermittent user activity, and the sparsity of the effective channel matrix, which arises from intermittent user activity and large-scale fading. We approximately solve this optimization problem with a box-constrained forward-backward splitting algorithm, which significantly improves AUD, CE, and DD performance. We demonstrate the effectiveness of the proposed framework through simulation experiments.

Submitted 28 August, 2023; originally announced August 2023.

Comments: To be presented at IEEE SPAWC 2023

arXiv:2308.04285 [pdf, other]

Flocking control against the malicious agent

Authors: Chencheng Zhang, Hao Yang, Bin Jiang, Ming Cao

Abstract: This paper investigates the flocking control of a swarm with a malicious agent that falsifies its controller parameters to cause collision, division, and escape of agents in the swarm. A novel geometric flocking condition is established by designing the configuration of the malicious agent and its neighbors, under which we propose a hierarchical geometric configuration-based flocking control method. To help detect the malicious agent, a parameter estimation mechanism is also provided. The proposed method can achieve the flocking control goal while containing the malicious agent in the swarm without removing it. Experimental results show the effectiveness of the theoretical results.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2307.06182 [pdf, other]

CellGAN: Conditional Cervical Cell Synthesis for Augmenting Cytopathological Image Classification

Authors: Zhenrong Shen, Maosong Cao, Sheng Wang, Lichi Zhang, Qian Wang

Abstract: Automatic examination of thin-prep cytologic test (TCT) slides can assist pathologists in finding cervical abnormality for accurate and efficient cancer screening. Current solutions mostly need to localize suspicious cells and classify abnormality based on local patches, concerning the fact that whole slide images of TCT are extremely large. It thus requires many annotations of normal and abnormal cervical cells, to supervise the training of the patch-level classifier for promising performance. In this paper, we propose CellGAN to synthesize cytopathological images of various cervical cell types for augmenting patch-level cell classification. Built upon a lightweight backbone, CellGAN is equipped with a non-linear class mapping network to effectively incorporate cell type information into image generation. We also propose the Skip-layer Global Context module to model the complex spatial relationship of the cells, and attain high fidelity of the synthesized images through adversarial learning. Our experiments demonstrate that CellGAN can produce visually plausible TCT cytopathological images for different cell types. We also validate the effectiveness of using CellGAN to greatly augment patch-level cell classification performance.

Submitted 12 July, 2023; originally announced July 2023.

Journal ref: 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023)

arXiv:2306.00619 [pdf, other]

General SIS diffusion process with indirect spreading pathways on a hypergraph

Authors: Shaoxuan Cui, Fangzhou Liu, Hildeberto Jardón-Kojakhmetov, Ming Cao

Abstract: While conventional graphs only characterize pairwise interactions, higher-order networks (hypergraph, simplicial complex) capture multi-body interactions, which is a potentially more suitable modeling framework for a complex real system. However, the introduction of higher-order interactions brings new challenges for the rigorous analysis of such systems on a higher-order network. In this paper, we study a series of SIS-type diffusion processes with both indirect and direct pathways on a directed hypergraph. In a concrete case, the model we propose is based on a specific choice (polynomial) of interaction function (how several agents influence each other when they are in a hyperedge). Then, by the same choice of interaction function, we further extend the system and propose a bi-virus competing model on a directed hypergraph by coupling two single-virus models together. Finally, the most general model in this paper considers an abstract interaction function under single-virus and bi-virus settings. For the single-virus model, we provide the results regarding healthy state and endemic equilibrium. For the bi-virus setting, we further give an analysis of the existence and stability of the healthy state, dominant endemic equilibria, and coexisting equilibria. All theoretical results are finally supported by some numerical examples.

Submitted 9 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

MSC Class: 05C65; 34D05; 34C12; 37N25; 92D30

arXiv:2305.10983 [pdf, other]

Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment

Authors: Tianhe Wu, Shuwei Shi, Haoming Cai, Mingdeng Cao, Jing Xiao, Yinqiang Zheng, Yujiu Yang

Abstract: Blind Omnidirectional Image Quality Assessment (BOIQA) aims to objectively assess the human perceptual quality of omnidirectional images (ODIs) without relying on pristine-quality image information. It is becoming more significant with the increasing advancement of virtual reality (VR) technology. However, the quality assessment of ODIs is severely hampered by the fact that the existing BOIQA pipeline lacks the modeling of the observer's browsing process. To tackle this issue, we propose a novel multi-sequence network for BOIQA called Assessor360, which is derived from the realistic multi-assessor ODI quality assessment procedure. Specifically, we propose a generalized Recursive Probability Sampling (RPS) method for the BOIQA task, combining content and detail information to generate multiple pseudo-viewport sequences from a given starting point. Additionally, we design a Multi-scale Feature Aggregation (MFA) module with a Distortion-aware Block (DAB) to fuse distorted and semantic features of each viewport. We also devise a Temporal Modeling Module (TMM) to learn the viewport transition in the temporal domain. Extensive experimental results demonstrate that Assessor360 outperforms state-of-the-art methods on multiple OIQA datasets. The code and models are available at https://github.com/TianheWu/Assessor360.

Submitted 10 October, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.10006 [pdf, other]

EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging

Authors: Lishun Wang, Miao Cao, Xin Yuan

Abstract: Video snapshot compressive imaging (SCI) uses a two-dimensional detector to capture consecutive video frames during a single exposure time. Following this, an efficient reconstruction algorithm needs to be designed to reconstruct the desired video frames. Although recent deep learning-based state-of-the-art (SOTA) reconstruction algorithms have achieved good results in most tasks, they still face the following challenges due to excessive model complexity and GPU memory limitations: 1) these models incur high computational cost, and 2) they are usually unable to reconstruct large-scale video frames at high compression ratios. To address these issues, we develop an efficient network for video SCI by using dense connections and a space-time factorization mechanism within a single residual block, dubbed EfficientSCI. The EfficientSCI network can establish spatial-temporal correlation well by using convolution in the spatial domain and Transformer in the temporal domain, respectively. We show for the first time that a UHD color video with high compression ratio can be reconstructed from a snapshot 2D measurement using a single end-to-end deep learning model with PSNR above 32 dB. Extensive results on both simulation and real data show that our method significantly outperforms all previous SOTA algorithms with better real-time performance. The code is at https://github.com/ucaswangls/EfficientSCI.git.

Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2304.11080 [pdf, other]

Multimodal contrastive learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals and patient metadata

Authors: Tue M. Cao, Nhat H. Tran, Phi Le Nguyen, Hieu Pham

Abstract: This work discusses the use of contrastive learning and deep learning for diagnosing cardiovascular diseases from electrocardiography (ECG) signals. While ECG signals usually contain 12 leads (channels), many healthcare facilities and devices lack access to all 12 leads. This raises the problem of how to use fewer ECG leads to produce meaningful diagnoses with high performance. We introduce a simple experiment to test whether contrastive learning can be applied to this task. More specifically, we added the similarity between the embedding vectors of the 12-lead signal and the fewer-lead ECG signal to the loss function to bring these representations closer together. Despite its simplicity, this has been shown to improve diagnostic performance with all lead combinations, demonstrating the potential of contrastive learning on this task.

Submitted 18 April, 2023; originally announced April 2023.

Comments: Accepted for presentation at the Midwest Machine Learning Symposium (MMLS 2023), Chicago, IL, USA

arXiv:2302.07584 [pdf, other]

Fast and Blind Speech Copy-Move Detection and Localization in Noise

Authors: Dong Yang, Mingle Liu, Muyong Cao

Abstract: Copy-move forgery on speech (CMF), coupled with post-processing techniques, presents a great challenge to the forensic detection and localization of tampered areas. Most of the existing CMF detection approaches necessitate pre-segmentation of speech to facilitate similarity calculations among these segments. However, these approaches usually suffer from the problems of uncontrollable computational complexity and sensitivity to the presence of a word that is read multiple times within a speech recording. To address these issues, we propose a local feature tensors-based CMF detection algorithm that can transform duplicate detection and localization problems into a special tensor-matching procedure, accompanied by complete theoretical analysis as support. Through extensive experimentation, we have demonstrated that our method exhibits computational efficiency and robustness against post-processing techniques. Notably, it can effectively and blindly detect tampered segments, even those as short as a fractional second. These advantages highlight the promising potential of our approach for practical applications.

Submitted 8 September, 2023; v1 submitted 15 February, 2023; originally announced February 2023.

arXiv:2302.03453 [pdf, other]

OSRT: Omnidirectional Image Super-Resolution with Distortion-aware Transformer

Authors: Fanghua Yu, Xintao Wang, Mingdeng Cao, Gen Li, Ying Shan, Chao Dong

Abstract: Omnidirectional images (ODIs) have obtained lots of research interest for immersive experiences. Although ODIs require extremely high resolution to capture details of the entire scene, the resolutions of most ODIs are insufficient. Previous methods attempt to solve this issue by image super-resolution (SR) on equirectangular projection (ERP) images. However, they omit geometric properties of ERP in the degradation process, and their models can hardly generalize to real ERP images. In this paper, we propose Fisheye downsampling, which mimics the real-world imaging process and synthesizes more realistic low-resolution samples. Then we design a distortion-aware Transformer (OSRT) to modulate ERP distortions continuously and self-adaptively. Without a cumbersome process, OSRT outperforms previous methods by about 0.2dB on PSNR. Moreover, we propose a convenient data augmentation strategy, which synthesizes pseudo ERP images from plain images. This simple strategy can alleviate the over-fitting problem of large networks and significantly boost the performance of ODISR. Extensive experiments have demonstrated the state-of-the-art performance of our OSRT. Codes and models will be available at https://github.com/Fanghua-Yu/OSRT.

Submitted 9 February, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

Comments: main paper + supplement

arXiv:2301.12213 [pdf, other]

The Domain of Attraction of the Desired Path in Vector-field Guided Path Following

Authors: Weijia Yao, Bohuan Lin, Brian D. O. Anderson, Ming Cao

Abstract: In the vector-field guided path-following problem, a sufficiently smooth vector field is designed such that its integral curves converge to and move along a one-dimensional geometric desired path. The existence of singular points where the vector field vanishes creates a topological obstruction to global convergence to the desired path and some associated topological analysis has been conducted in our previous work. In this paper, we strengthen the result in our previous work by showing that the domain of attraction of the desired path, which is a compact asymptotically stable one-dimensional embedded submanifold of an $n$-dimensional ambient manifold $\mathcal{M}$, is homeomorphic to $\mathbb{R}^{n-1} \times \mathbb{S}^1$, and not just homotopy equivalent to $\mathbb{S}^1$. This result is extended for a $k$-dimensional compact manifold for $k \ge 2$.

Submitted 28 January, 2023; originally announced January 2023.

arXiv:2301.03047 [pdf, other]

Large-scale Global Low-rank Optimization for Computational Compressed Imaging

Authors: Daoyu Li, Hanwen Xu, Miao Cao, Xin Yuan, David J. Brady, Liheng Bian

Abstract: Computational reconstruction plays a vital role in computer vision and computational photography. Most of the conventional optimization and deep learning techniques explore local information for reconstruction. Recently, nonlocal low-rank (NLR) reconstruction has achieved remarkable success in improving accuracy and generalization. However, the computational cost has inhibited NLR from seeking global structural similarity, which consequentially keeps it trapped in the tradeoff between accuracy and efficiency and prevents it from high-dimensional large-scale tasks. To address this challenge, we report here the global low-rank (GLR) optimization technique, realizing highly-efficient large-scale reconstruction with global self-similarity. Inspired by the self-attention mechanism in deep learning, GLR extracts exemplar image patches by feature detection instead of conventional uniform selection. This directly produces key patches using structural features to avoid burdensome computational redundancy. Further, it performs patch matching across the entire image via neural-based convolution, which produces the global similarity heat map in parallel, rather than conventional sequential block-wise matching. As such, GLR improves patch grouping efficiency by more than one order of magnitude. We experimentally demonstrate GLR's effectiveness on temporal, frequency, and spectral dimensions, including different computational imaging modalities of compressive temporal imaging, magnetic resonance imaging, and multispectral filter array demosaicing. This work presents the superiority of inherent fusion of deep learning strategies and iterative optimization, and breaks the persistent dilemma of the tradeoff between accuracy and efficiency for various large-scale reconstruction tasks.

Submitted 8 January, 2023; originally announced January 2023.

arXiv:2209.09478 [pdf, other]

Guiding vector fields for the distributed motion coordination of mobile robots

Authors: Weijia Yao, Hector Garcia de Marina, Zhiyong Sun, Ming Cao

Abstract: We propose coordinating guiding vector fields to achieve two tasks simultaneously with a team of robots: first, the guidance and navigation of multiple robots to possibly different paths or surfaces typically embedded in 2D or 3D; second, their motion coordination while tracking their prescribed paths or surfaces. The motion coordination is defined by desired parametric displacements between robots on the path or surface. Such a desired displacement is achieved by controlling the virtual coordinates, which correspond to the path or surface's parameters, between guiding vector fields. Rigorous mathematical guarantees underpinned by dynamical systems theory and Lyapunov theory are provided for the effective distributed motion coordination and navigation of robots on paths or surfaces from all initial positions. As an example for practical robotic applications, we derive a control algorithm from the proposed coordinating guiding vector fields for a Dubins-car-like model with actuation saturation. Our proposed algorithm is distributed and scalable to an arbitrary number of robots. Furthermore, extensive illustrative simulations and fixed-wing aircraft outdoor experiments validate the effectiveness and robustness of our algorithm.

Submitted 30 October, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

Comments: Evolved paper from arXiv:2103.12372. Accepted to IEEE Transactions on Robotics. Supplementary video: https://www.bilibili.com/video/BV16e4y147xp/

arXiv:2209.01578 [pdf, other]

Spatial-Temporal Transformer for Video Snapshot Compressive Imaging

Authors: Lishun Wang, Miao Cao, Yong Zhong, Xin Yuan

Abstract: Video snapshot compressive imaging (SCI) captures multiple sequential video frames by a single measurement using the idea of computational imaging. The underlying principle is to modulate high-speed frames through different masks and these modulated frames are summed to a single measurement captured by a low-speed 2D sensor (dubbed optical encoder); following this, algorithms are employed to reconstruct the desired high-speed frames (dubbed software decoder) if needed. In this paper, we consider the reconstruction algorithm in video SCI, i.e., recovering a series of video frames from a compressed measurement. Specifically, we propose a Spatial-Temporal transFormer (STFormer) to exploit the correlation in both spatial and temporal domains. STFormer network is composed of a token generation block, a video reconstruction block, and these two blocks are connected by a series of STFormer blocks. Each STFormer block consists of a spatial self-attention branch, a temporal self-attention branch and the outputs of these two branches are integrated by a fusion network. Extensive results on both simulated and real data demonstrate the state-of-the-art performance of STFormer. The code and models are publicly available at https://github.com/ucaswangls/STFormer.git

Submitted 8 September, 2022; v1 submitted 4 September, 2022; originally announced September 2022.

arXiv:2207.11388 [pdf, other]

Low-Complexity Acoustic Echo Cancellation with Neural Kalman Filtering

Authors: Dong Yang, Fei Jiang, Wei Wu, Xuefei Fang, Muyong Cao

Abstract: The Kalman filter has been adopted in acoustic echo cancellation due to its robustness to double-talk, fast convergence, and good steady-state performance. The performance of the Kalman filter is closely related to the estimation accuracy of the state noise covariance and the observation noise covariance. Estimation error may lead to unacceptable results; in particular, when the echo path suffers abrupt changes, the tracking performance of the Kalman filter can be degraded significantly. In this paper, we propose neural Kalman filtering (NKF), which uses neural networks to implicitly model the covariance of the state noise and observation noise and to output the Kalman gain in real time. Experimental results on both synthetic test sets and real-recorded test sets show that the proposed NKF has superior convergence and re-convergence performance while ensuring low near-end speech degradation compared with the state-of-the-art model-based methods. Moreover, the model size of the proposed NKF is merely 5.3 K and the RTF is as low as 0.09, which indicates that it can be deployed on low-resource platforms.

Submitted 29 October, 2022; v1 submitted 22 July, 2022; originally announced July 2022.

Showing 1–50 of 122 results for author: Cao, M