-
Optimization-Free Patch Attack on Stereo Depth Estimation
Authors:
Hangcheng Liu,
Xu Kuang,
Xingshuo Han,
Xingwan Wu,
Haoran Ou,
Shangwei Guo,
Xingyi Huang,
Tao Xiang,
Tianwei Zhang
Abstract:
Stereo Depth Estimation (SDE) is essential for scene understanding in vision-based systems like autonomous driving. However, recent studies show that SDE models are vulnerable to adversarial attacks, which are often limited to unrealistic settings, e.g., digital perturbations on separate stereo views in static scenes, restricting their real-world applicability. This raises a critical question: how can we design physically realizable, scene-adaptive, and transferable attacks against SDE under realistic constraints?
To answer this, we make two key contributions. First, we propose a unified attack framework that extends optimization-based techniques to four core stages of stereo matching: feature extraction, cost-volume construction, cost aggregation, and disparity regression. A comprehensive stage-wise evaluation across 9 mainstream SDE models, under constraints like photometric consistency, reveals that optimization-based patches suffer from poor transferability. Interestingly, partially transferable patches suggest that patterns, rather than pixel-level perturbations, may be key to generalizable attacks. Motivated by this, we present PatchHunter, the first optimization-free adversarial patch attack against SDE. PatchHunter formulates patch generation as a reinforcement learning-driven search over a structured space of visual patterns crafted to disrupt SDE assumptions.
We validate PatchHunter across three levels: the KITTI dataset, the CARLA simulator, and real-world vehicle deployment. PatchHunter not only surpasses optimization-based methods in effectiveness but also achieves significantly better black-box transferability. Even under challenging physical conditions like low light, PatchHunter maintains high attack success (e.g., D1-all > 0.4), whereas optimization-based methods fail.
Submitted 21 June, 2025;
originally announced June 2025.
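The abstract above reports attack strength as D1-all. As a minimal sketch, assuming the usual KITTI 2015 convention (a pixel counts as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity), the metric could be computed as follows. The function name `d1_all` and the flat-list input are illustrative assumptions, not taken from the paper.

```python
def d1_all(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of valid pixels whose disparity error exceeds both thresholds.

    pred, gt: flat sequences of predicted and ground-truth disparities.
    Assumption: ground-truth values <= 0 mark pixels without a label and are skipped.
    """
    outliers = 0
    valid = 0
    for p, g in zip(pred, gt):
        if g <= 0:
            continue  # no ground truth for this pixel
        valid += 1
        err = abs(p - g)
        # Outlier only if the error is large in absolute AND relative terms.
        if err > abs_thresh and err > rel_thresh * g:
            outliers += 1
    return outliers / valid if valid else 0.0
```

Under this reading, a reported D1-all above 0.4 would mean that more than 40% of labeled pixels receive a badly wrong disparity.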
-
Holmes: Automated Fact Check with Large Language Models
Authors:
Haoran Ou,
Gelei Deng,
Xingshuo Han,
Jie Zhang,
Xinlei He,
Han Qiu,
Shangwei Guo,
Tianwei Zhang
Abstract:
The rise of Internet connectivity has accelerated the spread of disinformation, threatening societal trust, decision-making, and national security. Disinformation has evolved from simple text to complex multimodal forms combining images and text, challenging existing detection methods. Traditional deep learning models struggle to capture the complexity of multimodal disinformation. Inspired by advances in AI, this study explores using Large Language Models (LLMs) for automated disinformation detection. The empirical study shows that (1) LLMs alone cannot reliably assess the truthfulness of claims; (2) providing relevant evidence significantly improves their performance; (3) however, LLMs cannot autonomously search for accurate evidence. To address this, we propose Holmes, an end-to-end framework featuring a novel evidence retrieval method that assists LLMs in collecting high-quality evidence. Our approach uses (1) LLM-powered summarization to extract key information from open sources and (2) a new algorithm and metrics to evaluate evidence quality. Holmes enables LLMs to verify claims and generate justifications effectively. Experiments show Holmes achieves 88.3% accuracy on two open-source datasets and 90.2% in real-time verification tasks. Notably, our improved evidence retrieval boosts fact-checking accuracy by 30.8% over existing methods.
Submitted 5 May, 2025;
originally announced May 2025.
-
Multiple-Particle Autofocusing Algorithm Using Axial Resolution and Morphological Analyses Based on Digital Holography
Authors:
Wei-Na Li,
Yi Zhou,
Jiatai Chen,
Hongjie Ou,
XiangSheng Xie
Abstract:
We propose an autofocusing algorithm to obtain, relatively accurately, the 3D position of each particle, particularly its axial location, and particle number of a dense transparent particle solution via its hologram. First, morphological analyses and constrained intensity are used on raw reconstructed images to obtain information on candidate focused particles. Second, axial resolution is used to obtain the real focused particles. Based on the mean intensity and equivalent diameter of each candidate focused particle, all focused particles are eventually secured. Our proposed method can rapidly provide relatively accurate ground-truth axial positions to solve the autofocusing problem that occurs with dense particles.
Submitted 23 March, 2025;
originally announced March 2025.
-
Experimental Evaluation of an SDN Controller for Open Optical-circuit-switched Networks
Authors:
Kazuya Anazawa,
Takeru Inoue,
Toru Mano,
Hiroshi Ou,
Hirotaka Ujikawa,
Dmitrii Briantcev,
Sumaiya Binte Ali,
Devika Dass,
Hideki Nishizawa,
Yoshiaki Sone,
Eoin Kenny,
Marco Ruffini,
Daniel Kilper,
Eiji Oki,
Koichi Takasugi
Abstract:
Open optical networks have been considered to be important for cost-effectively building and operating the networks. Recently, the optical-circuit-switches (OCSes) have attracted industry and academia because of their cost efficiency and higher capacity than traditional electrical packet switches (EPSes) and reconfigurable optical add drop multiplexers (ROADMs). Though the open interfaces and control planes for traditional ROADMs and transponders have been defined by several standard-defining organizations (SDOs), those of OCSes have not. Considering that several OCSes have already been installed in production datacenter networks (DCNs) and several OCS products are on the market, bringing the openness and interoperability into the OCS-based networks has become important. Motivated by this fact, this paper investigates a software-defined networking (SDN) controller for open optical-circuit-switched networks. To this end, we identified the use cases of OCSes and derived the controller requirements for supporting them. We then proposed a multi-vendor (MV) OCS controller framework that satisfies the derived requirements; it was designed to quickly and consistently operate fiber paths upon receiving the operation requests. We validated our controller by implementing it and evaluating its performance on actual MV-OCS networks. It satisfied all the requirements, and fiber paths could be configured within 1.0 second by using our controller.
Submitted 29 April, 2025; v1 submitted 28 January, 2025;
originally announced January 2025.
-
Oedipus: LLM-enchanced Reasoning CAPTCHA Solver
Authors:
Gelei Deng,
Haoran Ou,
Yi Liu,
Jie Zhang,
Tianwei Zhang,
Yang Liu
Abstract:
CAPTCHAs have become a ubiquitous tool in safeguarding applications from automated bots. Over time, the arms race between CAPTCHA development and evasion techniques has led to increasingly sophisticated and diverse designs. The latest iteration, reasoning CAPTCHAs, exploits tasks that are intuitively simple for humans but challenging for conventional AI technologies, thereby enhancing security measures.
Driven by the evolving AI capabilities, particularly the advancements in Large Language Models (LLMs), we investigate the potential of multimodal LLMs to solve modern reasoning CAPTCHAs. Our empirical analysis reveals that, despite their advanced reasoning capabilities, LLMs struggle to solve these CAPTCHAs effectively. In response, we introduce Oedipus, an innovative end-to-end framework for automated reasoning CAPTCHA solving. Central to this framework is a novel strategy that dissects the complex and human-easy-AI-hard tasks into a sequence of simpler and AI-easy steps. This is achieved through the development of a Domain Specific Language (DSL) for CAPTCHAs that guides LLMs in generating actionable sub-steps for each CAPTCHA challenge. The DSL is customized to ensure that each unit operation is a highly solvable subtask revealed in our previous empirical study. These sub-steps are then tackled sequentially using the Chain-of-Thought (CoT) methodology.
Our evaluation shows that Oedipus effectively resolves the studied CAPTCHAs, achieving an average success rate of 63.5%. Remarkably, it also shows adaptability to the most recent CAPTCHA designs introduced in late 2023, which are not included in our initial study. This prompts a discussion on future strategies for designing reasoning CAPTCHAs that can effectively counter advanced AI solutions.
Submitted 13 May, 2024;
originally announced May 2024.
-
Safeguarding adaptive methods: global convergence of Barzilai-Borwein and other stepsize choices
Authors:
Hongjia Ou,
Andreas Themelis
Abstract:
Leveraging on recent advancements on adaptive methods for convex minimization problems, this paper provides a linesearch-free proximal gradient framework for globalizing the convergence of popular stepsize choices such as Barzilai-Borwein and one-dimensional Anderson acceleration. This framework can cope with problems in which the gradient of the differentiable function is merely locally Hölder continuous. Our analysis not only encompasses but also refines existing results upon which it builds. The theory is corroborated by numerical evidence that showcases the synergetic interplay between fast stepsize selections and adaptive methods.
Submitted 13 May, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
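For readers unfamiliar with the Barzilai-Borwein stepsize named in the abstract above, the following is a minimal sketch of the textbook "long" BB step, alpha = <s, s> / <s, y> with s = x_k - x_{k-1} and y = grad f(x_k) - grad f(x_{k-1}), applied to plain gradient descent. It is not the paper's proximal framework and contains none of its safeguarding; the function name, the fallback to `alpha0` when <s, y> is not positive, and the default parameters are illustrative assumptions.

```python
def bb_gradient_descent(grad, x0, alpha0=1e-3, iters=200, tol=1e-10):
    """Gradient descent with the long Barzilai-Borwein stepsize (illustrative sketch)."""
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))

    x_prev = list(x0)
    g_prev = grad(x_prev)
    # The first step uses a fixed stepsize because no history exists yet.
    x = [xi - alpha0 * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(iters):
        g = grad(x)
        if dot(g, g) ** 0.5 < tol:
            break
        s = [a - b for a, b in zip(x, x_prev)]  # change in iterate
        y = [a - b for a, b in zip(g, g_prev)]  # change in gradient
        sy = dot(s, y)
        # Assumed fallback: reuse alpha0 when the curvature estimate is not positive.
        alpha = dot(s, s) / sy if sy > 0 else alpha0
        x_prev, g_prev = x, g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


# Usage: minimize f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2), whose gradient is (x[0], 10 * x[1]).
x_star = bb_gradient_descent(lambda x: [x[0], 10.0 * x[1]], [1.0, 1.0])
```

On its own the BB step is non-monotone and is only guaranteed to converge in restricted settings such as strictly convex quadratics, which is the kind of gap a globalization framework is meant to close.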
-
A Light-Weight LiDAR-Inertial SLAM System with Loop Closing
Authors:
Kangcheng Liu,
Huosen Ou
Abstract:
In this work, we propose a lightweight integrated LiDAR-Inertial SLAM system with high efficiency and strong loop closure capability. We found that current state-of-the-art LiDAR-Inertial SLAM systems perform poorly in loop closure: they often fail under large drift and suffer from limited efficiency in large-scale environments. First, to improve the speed of the whole LiDAR-Inertial SLAM system, we propose a new sparse voxel-hashing data structure. Second, to improve point cloud-based localization performance, we integrate loop closure algorithms. Extensive experiments on complicated, large-scale real scenes demonstrate the effectiveness and robustness of the proposed LiDAR-Inertial SLAM system.
Submitted 19 December, 2022; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Networked Restless Multi-Armed Bandits for Mobile Interventions
Authors:
Han-Ching Ou,
Christoph Siebenbrunner,
Jackson Killian,
Meredith B Brooks,
David Kempe,
Yevgeniy Vorobeychik,
Milind Tambe
Abstract:
Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties which arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the-art baselines in three mobile intervention domains using real-world graphs.
Submitted 28 January, 2022;
originally announced January 2022.
-
Contingency-Aware Influence Maximization: A Reinforcement Learning Approach
Authors:
Haipeng Chen,
Wei Qiu,
Han-Ching Ou,
Bo An,
Milind Tambe
Abstract:
The influence maximization (IM) problem aims at finding a subset of seed nodes in a social network that maximizes the spread of influence. In this study, we focus on a sub-class of IM problems, called contingency-aware IM, in which it is uncertain whether nodes are willing to be seeds when invited. Such contingency-aware IM is critical for applications for non-profit organizations in low-resource communities (e.g., spreading awareness of disease prevention). Despite the initial success, a major practical obstacle in promoting the solutions to more communities is the tremendous runtime of the greedy algorithms and the lack of high performance computing (HPC) for the non-profits in the field: whenever there is a new social network, the non-profits usually do not have the HPC resources to recalculate the solutions. Motivated by this and inspired by the line of work that uses reinforcement learning (RL) to address combinatorial optimization on graphs, we formalize the problem as a Markov Decision Process (MDP), use RL to learn an IM policy over historically seen networks, and generalize to unseen networks with negligible runtime at test phase. To fully exploit the properties of our targeted problem, we propose two technical innovations that improve the existing methods, including state-abstraction and theoretically grounded reward shaping. Empirical results show that our method achieves influence as high as the state-of-the-art methods for contingency-aware IM, while having negligible runtime at test phase.
Submitted 13 June, 2021;
originally announced June 2021.
-
Active Screening for Recurrent Diseases: A Reinforcement Learning Approach
Authors:
Han-Ching Ou,
Haipeng Chen,
Shahin Jabbari,
Milind Tambe
Abstract:
Active screening is a common approach in controlling the spread of recurring infectious diseases such as tuberculosis and influenza. In this approach, health workers periodically select a subset of population for screening. However, given the limited number of health workers, only a small subset of the population can be visited in any given time period. Given the recurrent nature of the disease and rapid spreading, the goal is to minimize the number of infections over a long time horizon. Active screening can be formalized as a sequential combinatorial optimization over the network of people and their connections. The main computational challenges in this formalization arise from i) the combinatorial nature of the problem, ii) the need of sequential planning and iii) the uncertainties in the infectiousness states of the population.
Previous works on active screening fail to scale to large time horizons while fully considering the future effect of current interventions. In this paper, we propose a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQN), with several innovative adaptations that are designed to address the above challenges. First, we use graph convolutional networks (GCNs) to represent the Q-function in a way that exploits the node correlations of the underlying contact network. Second, to avoid solving a combinatorial optimization problem in each time period, we decompose the node set selection as a sub-sequence of decisions, and further design a two-level RL framework that solves the problem in a hierarchical way. Finally, to speed up the slow convergence of RL which arises from reward sparseness, we incorporate ideas from curriculum learning into our hierarchical RL approach. We evaluate our RL algorithm on several real-world networks.
Submitted 19 April, 2021; v1 submitted 7 January, 2021;
originally announced January 2021.
-
Woodpecker-DL: Accelerating Deep Neural Networks via Hardware-Aware Multifaceted Optimizations
Authors:
Yongchao Liu,
Yue Jin,
Yong Chen,
Teng Teng,
Hang Ou,
Rui Zhao,
Yao Zhang
Abstract:
Accelerating deep model training and inference is crucial in practice. Existing deep learning frameworks usually concentrate on optimizing training speed and pay less attention to inference-specific optimizations. In fact, model inference differs from training in terms of computation, e.g., parameters are refreshed at each gradient update step during training but kept invariant during inference. These special characteristics of model inference open new opportunities for its optimization. In this paper, we propose a hardware-aware optimization framework, namely Woodpecker-DL (WPK), to accelerate inference by taking advantage of multiple joint optimizations from the perspectives of graph optimization, automated searches, domain-specific language (DSL) compiler techniques and system-level exploration. In WPK, we investigated two new automated search approaches based on genetic algorithm and reinforcement learning, respectively, to hunt the best operator code configurations targeting specific hardware. A customized DSL compiler is further attached to these search algorithms to generate efficient code. To create an optimized inference plan, WPK systematically explores high-speed operator implementations from third-party libraries besides our automatically generated code and singles out the best implementation per operator for use. Extensive experiments demonstrated that on a Tesla P100 GPU, we can achieve a maximum speedup of 5.40 over cuDNN and 1.63 over TVM on individual convolution operators, and run up to 1.18 times faster than TensorRT for end-to-end model inference.
Submitted 11 August, 2020;
originally announced August 2020.
-
Neural Network-Aided BCJR Algorithm for Joint Symbol Detection and Channel Decoding
Authors:
Wen-Chiao Tsai,
Chieh-Fang Teng,
Han-Mo Ou,
An-Yeu Wu
Abstract:
Recently, deep learning-assisted communication systems have achieved many eye-catching results and attracted more and more researchers in this emerging field. Instead of completely replacing the functional blocks of communication systems with neural networks, a hybrid manner of BCJRNet symbol detection is proposed to combine the advantages of the BCJR algorithm and neural networks. However, its separate block design not only degrades the system performance but also results in additional hardware complexity. In this work, we propose a BCJR receiver for joint symbol detection and channel decoding. It can simultaneously utilize the trellis diagram and channel state information for a more accurate calculation of branch probability and thus achieve global optimum with 2.3 dB gain over separate block design. Furthermore, a dedicated neural network model is proposed to replace the channel-model-based computation of the BCJR receiver, which can avoid the requirements of perfect CSI and is more robust under CSI uncertainty with 1.0 dB gain.
Submitted 21 July, 2020; v1 submitted 30 May, 2020;
originally announced June 2020.
-
Neural Network-based Equalizer by Utilizing Coding Gain in Advance
Authors:
Chieh-Fang Teng,
Han-Mo Ou,
An-Yeu Wu
Abstract:
Recently, deep learning has been exploited in many fields with revolutionary breakthroughs. In the light of this, deep learning-assisted communication systems have also attracted much attention in recent years and have potential to break down the conventional design rule for communication systems. In this work, we propose two kinds of neural network-based equalizers to exploit different characteristics between convolutional neural networks and recurrent neural networks. The equalizer in conventional block-based design may destroy the code structure and degrade the capacity of coding gain for decoder. On the contrary, our proposed approach not only eliminates channel fading, but also exploits the code structure with utilization of coding gain in advance, which can effectively increase the overall utilization of coding gain with more than 1.5 dB gain.
Submitted 31 August, 2019; v1 submitted 10 July, 2019;
originally announced July 2019.
-
Who and When to Screen: Multi-Round Active Screening for Recurrent Infectious Diseases Under Uncertainty
Authors:
Han-Ching Ou,
Arunesh Sinha,
Sze-Chuan Suen,
Andrew Perrault,
Milind Tambe
Abstract:
Controlling recurrent infectious diseases is a vital yet complicated problem. In this paper, we propose a novel active screening model (ACTS) and algorithms to facilitate active screening for recurrent diseases (no permanent immunity) under infection uncertainty. Our contributions are: (1) A new approach to modeling multi-round network-based screening/contact tracing under uncertainty, which is a common real-life practice in a variety of diseases; (2) Two novel algorithms, Full- and Fast-REMEDY. Full-REMEDY considers the effect of future actions and finds a policy that provides high solution quality, where Fast-REMEDY scales linearly in the size of the network; (3) We evaluate Full- and Fast-REMEDY on several real-world datasets which emulate human contact and find that they control diseases better than the baselines. To the best of our knowledge, this is the first work on multi-round active screening with uncertainty for diseases with no permanent immunity.
Submitted 13 March, 2019;
originally announced March 2019.