-
From Noise to Precision: A Diffusion-Driven Approach to Zero-Inflated Precipitation Prediction
Authors:
Wentao Gao,
Jiuyong Li,
Lin Liu,
Thuc Duy Le,
Xiongren Chen,
Xiaojing Du,
Jixue Liu,
Yanchang Zhao,
Yun Chen
Abstract:
Zero-inflated data pose significant challenges in precipitation forecasting due to the predominance of zeros with sparse non-zero events. To address this, we propose the Zero Inflation Diffusion Framework (ZIDF), which integrates Gaussian perturbation for smoothing zero-inflated distributions, Transformer-based prediction for capturing temporal patterns, and diffusion-based denoising to restore the original data structure. In our experiments, we use observational precipitation data collected from South Australia along with synthetically generated zero-inflated data. Results show that ZIDF demonstrates significant performance improvements over multiple state-of-the-art precipitation forecasting models, achieving up to 56.7\% reduction in MSE and 21.1\% reduction in MAE relative to the baseline Non-stationary Transformer. These findings highlight ZIDF's ability to robustly handle sparse time series data and suggest its potential generalizability to other domains where zero inflation is a key challenge.
Submitted 1 September, 2025;
originally announced September 2025.
-
Are Enterprises Ready for Quantum-Safe Cybersecurity?
Authors:
Tran Duc Le,
Phuc Hao Do,
Truong Duy Dinh,
Van Dai Pham
Abstract:
Quantum computing threatens to undermine classical cryptography by breaking widely deployed encryption and signature schemes. This paper examines enterprise readiness for quantum-safe cybersecurity through three perspectives: (i) the technologist view, assessing the maturity of post-quantum cryptography (PQC) and quantum key distribution (QKD); (ii) the enterprise (CISO/CIO) view, analyzing organizational awareness, risk management, and operational barriers; and (iii) the threat actor view, evaluating the evolving quantum threat and the urgency of migration. Using recent standards (e.g., NIST's 2024 PQC algorithms), industry surveys, and threat intelligence, we synthesize findings via a SWOT analysis to map strengths, weaknesses, opportunities, and threats. Results indicate uneven and generally insufficient preparedness: while PQC standards and niche QKD deployments signal technical progress, fewer than 5\% of enterprises have formal quantum-transition plans, and many underestimate "harvest now, decrypt later" risks. Financial, telecom, and government sectors have begun migration, but most industries remain exploratory or stalled by costs, complexity, and skills gaps. Expert consensus places cryptanalytically relevant quantum computers in the 2030s, yet delayed preparation could leave today's data vulnerable for decades. We recommend immediate steps: establishing crypto-agility, creating quantum transition roadmaps, prioritizing PQC deployment in high-value systems, and upskilling cybersecurity teams. A coordinated, proactive approach is essential to secure current and future digital assets in the quantum era.
Submitted 1 September, 2025;
originally announced September 2025.
-
VOTA: Parallelizing 6G-RAN Experimentation with Virtualized Over-The-Air Workloads
Authors:
Chang Liu,
T. D. Khoa Le,
Rahul Saini,
Kishor C. Joshi,
George Exarchakos
Abstract:
Testbed sharing, a practice in which different researchers concurrently develop independent use cases on top of the same testbed, is ubiquitous in wireless experimental research. Its key drawback is experimental inconvenience: one must delay experiments or tolerate compute and RF interference that harms experimental fidelity. In this paper, we propose \textbf{VOTA}, an open-source, software-only testbed scaling method that leverages real-time virtualization and frequency tuning to maximize parallel experiments while controlling interference. In a demonstration of two interference-sensitive 6G use cases -- \textit{MIMO iDFT/DFT Offloading} and \textit{O-RAN DoS Attack} -- running side-by-side on a 32-core host, we showcase VOTA capabilities: \textbf{dedicated-like} results while allowing \textbf{2.67$\times$} more sharing opportunities.
Submitted 29 August, 2025;
originally announced September 2025.
-
Raising the Bar: An Asymptotic Comparison of Classical and Quantum Shortest Path Algorithms
Authors:
Phuc Hao Do,
Tran Duc Le
Abstract:
The Single-Source Shortest Path (SSSP) problem is a cornerstone of computer science with vast applications, for which Dijkstra's algorithm has long been the classical baseline. While various quantum algorithms have been proposed, their performance has typically been benchmarked against this decades-old approach. This landscape was recently reshaped by the introduction of a new classical algorithm by Duan et al. with a complexity of $O(m \cdot (\log n)^{2/3})$. This development necessitates a re-evaluation of the quantum advantage narrative for SSSP. In this paper, we conduct a systematic theoretical comparison of modern quantum and classical SSSP algorithms in light of this new classical frontier. Through an analysis of their theoretical cost functions, we illustrate how their relative scaling compares across scenarios that vary in graph density and path length. Our analysis suggests a nuanced picture: sophisticated quantum algorithms, such as the one by Wesolowski and Piddock, can exhibit more favorable asymptotic scaling, but only in regimes characterized by short solution paths. Conversely, for problems involving long paths, state-of-the-art classical algorithms appear to maintain a scaling advantage. Our work provides an updated perspective for future quantum algorithm development and underscores that the pursuit of quantum advantage is a dynamic race where the classical goalposts are continually shifting.
Submitted 16 August, 2025;
originally announced August 2025.
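To make the scaling argument in the abstract above concrete, here is a minimal sketch that compares two classical cost functions. It assumes the textbook bound O(m + n log n) for Dijkstra's algorithm with a Fibonacci heap, which the abstract does not state, and uses the O(m (log n)^{2/3}) bound quoted for Duan et al. Constant factors are dropped and the function names are illustrative, so the numbers indicate relative growth only, not running times.

```python
import math


def dijkstra_cost(n: int, m: int) -> float:
    """Asymptotic cost m + n*log2(n) (assumed Fibonacci-heap Dijkstra bound, constants dropped)."""
    return m + n * math.log2(n)


def duan_cost(n: int, m: int) -> float:
    """Asymptotic cost m * (log2 n)^(2/3), the bound quoted in the abstract (constants dropped)."""
    return m * math.log2(n) ** (2 / 3)


if __name__ == "__main__":
    # Sparse graph (m proportional to n) versus dense graph (m proportional to n^2).
    for label, n, m in [("sparse", 10**6, 2 * 10**6), ("dense", 1000, 500_000)]:
        print(label, dijkstra_cost(n, m), duan_cost(n, m))
```

Under these simplified cost models the newer bound is smaller on the sparse instance and larger on the dense one, which is consistent with treating graph density as a key axis of comparison.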
-
ACM Multimedia Grand Challenge on ENT Endoscopy Analysis
Authors:
Trong-Thuan Nguyen,
Viet-Tham Huynh,
Thao Thi Phuong Dao,
Ha Nguyen Thi,
Tien To Vu Thuy,
Uyen Hanh Tran,
Tam V. Nguyen,
Thanh Dinh Le,
Minh-Triet Tran
Abstract:
Automated analysis of endoscopic imagery is a critical yet underdeveloped component of ENT (ear, nose, and throat) care, hindered by variability in devices and operators, subtle and localized findings, and fine-grained distinctions such as laterality and vocal-fold state. In addition to classification, clinicians require reliable retrieval of similar cases, both visually and through concise textual descriptions. These capabilities are rarely supported by existing public benchmarks. To this end, we introduce ENTRep, the ACM Multimedia 2025 Grand Challenge on ENT endoscopy analysis, which integrates fine-grained anatomical classification with image-to-image and text-to-image retrieval under bilingual (Vietnamese and English) clinical supervision. Specifically, the dataset comprises expert-annotated images, labeled for anatomical region and normal or abnormal status, and accompanied by dual-language narrative descriptions. In addition, we define three benchmark tasks, standardize the submission protocol, and evaluate performance on public and private test splits using server-side scoring. Moreover, we report results from the top-performing teams and provide an insightful discussion.
Submitted 6 August, 2025;
originally announced August 2025.
-
Challenges in Applying Variational Quantum Algorithms to Dynamic Satellite Network Routing
Authors:
Phuc Hao Do,
Tran Duc Le
Abstract:
Applying near-term variational quantum algorithms to the problem of dynamic satellite network routing represents a promising direction for quantum computing. In this work, we provide a critical evaluation of two major approaches: static quantum optimizers such as the Variational Quantum Eigensolver (VQE) and the Quantum Approximate Optimization Algorithm (QAOA) for offline route computation, and Quantum Reinforcement Learning (QRL) methods for online decision-making. Using ideal, noise-free simulations, we find that these algorithms face significant challenges. Specifically, static optimizers are unable to solve even a classically easy 4-node shortest path problem due to the complexity of the optimization landscape. Likewise, a basic QRL agent based on policy gradient methods fails to learn a useful routing strategy in a dynamic 8-node environment and performs no better than random actions. These negative findings highlight key obstacles that must be addressed before quantum algorithms can offer real advantages in communication networks. We discuss the underlying causes of these limitations, including barren plateaus and learning instability, and suggest future research directions to overcome them.
Submitted 6 August, 2025;
originally announced August 2025.
-
A Genetic Algorithm Framework for Optimizing Three-Impulse Orbital Transfers with Poliastro Simulation
Authors:
Phuc Hao Do,
Tran Duc Le
Abstract:
Orbital maneuver planning is a critical aspect of mission design, aimed at minimizing propellant consumption, which is directly correlated with the total velocity change ($ΔV$). While analytical solutions like the Hohmann and Bi-elliptic transfers offer optimal strategies for specific cases, they lack the flexibility for more general optimization problems. This paper presents a computational framework that couples a Genetic Algorithm (GA) with the Poliastro orbital mechanics library to autonomously discover fuel-optimal, three-impulse transfer trajectories between coplanar circular orbits. We validate this framework across two distinct scenarios: a low-energy transfer from Low Earth Orbit (LEO) to a Geostationary Orbit (GEO), and a high-energy transfer to a distant orbit with a radius 20 times that of LEO. Our results demonstrate the framework's remarkable adaptability. For the LEO-to-GEO transfer, the GA precisely converges to the classical Hohmann transfer, achieving an identical $ΔV$ of 3853.96 m/s and validating the method's accuracy. Conversely, for the high-energy transfer, the GA identifies a superior Bi-elliptic trajectory that yields a significant $ΔV$ saving of 213.47 m/s compared to the Hohmann transfer. This fuel efficiency, however, necessitates a trade-off, extending the mission duration from approximately 1 day to over 140 years. This work demonstrates an accessible and powerful toolchain for the rapid prototyping of optimal trajectories, showcasing how combining evolutionary algorithms with open-source libraries provides a robust method for solving complex astrodynamics problems and quantifying their critical design trade-offs.
Submitted 5 August, 2025;
originally announced August 2025.
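The Hohmann versus bi-elliptic trade-off described in the abstract above can be checked with the closed-form two-body expressions. The sketch below is not the authors' GA or Poliastro pipeline. It uses only the vis-viva equation in normalized units (gravitational parameter mu = 1, inner radius 1), and the intermediate apoapsis radius `rb` is an assumed free parameter chosen for illustration.

```python
import math


def _speed(r: float, a: float, mu: float = 1.0) -> float:
    """Orbital speed at radius r on an orbit with semi-major axis a (vis-viva equation)."""
    return math.sqrt(mu * (2.0 / r - 1.0 / a))


def hohmann_dv(r1: float, r2: float, mu: float = 1.0) -> float:
    """Total delta-v of a two-impulse Hohmann transfer between coplanar circular orbits."""
    a_t = (r1 + r2) / 2.0  # semi-major axis of the transfer ellipse
    dv1 = abs(_speed(r1, a_t, mu) - _speed(r1, r1, mu))
    dv2 = abs(_speed(r2, r2, mu) - _speed(r2, a_t, mu))
    return dv1 + dv2


def bielliptic_dv(r1: float, r2: float, rb: float, mu: float = 1.0) -> float:
    """Total delta-v of a three-impulse bi-elliptic transfer via intermediate apoapsis rb."""
    a1 = (r1 + rb) / 2.0  # first ellipse: r1 up to rb
    a2 = (r2 + rb) / 2.0  # second ellipse: rb back down to r2
    dv1 = abs(_speed(r1, a1, mu) - _speed(r1, r1, mu))
    dv2 = abs(_speed(rb, a2, mu) - _speed(rb, a1, mu))
    dv3 = abs(_speed(r2, r2, mu) - _speed(r2, a2, mu))
    return dv1 + dv2 + dv3


if __name__ == "__main__":
    # Radius ratio 20 (the high-energy case in the abstract) with an assumed rb = 10 * r2.
    print("Hohmann:", hohmann_dv(1.0, 20.0))
    print("Bi-elliptic:", bielliptic_dv(1.0, 20.0, 200.0))
```

In these normalized units the bi-elliptic total is lower than the Hohmann total at a radius ratio of 20, while at a LEO-to-GEO-like ratio of about 6.6 the Hohmann transfer stays cheaper, which matches the qualitative behaviour the abstract reports.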
-
AES-RV: Hardware-Efficient RISC-V Accelerator with Low-Latency AES Instruction Extension for IoT Security
Authors:
Van Tinh Nguyen,
Phuc Hung Pham,
Vu Trung Duong Le,
Hoai Luan Pham,
Tuan Hai Vu,
Thi Diem Tran
Abstract:
The Advanced Encryption Standard (AES) is a widely adopted cryptographic algorithm essential for securing embedded systems and IoT platforms. However, existing AES hardware accelerators often face limitations in performance, energy efficiency, and flexibility. This paper presents AES-RV, a hardware-efficient RISC-V accelerator featuring low-latency AES instruction extensions optimized for real-time processing across all AES modes and key sizes. AES-RV integrates three key innovations: high-bandwidth internal buffers for continuous data processing, a specialized AES unit with custom low-latency instructions, and a pipelined system supported by a ping-pong memory transfer mechanism. Implemented on the Xilinx ZCU102 SoC FPGA, AES-RV achieves up to 255.97 times speedup and up to 453.04 times higher energy efficiency compared to baseline and conventional CPU/GPU platforms. It also demonstrates superior throughput and area efficiency against state-of-the-art AES accelerators, making it a strong candidate for secure and high-performance embedded systems.
Submitted 17 May, 2025;
originally announced May 2025.
-
Topology-Guided Knowledge Distillation for Efficient Point Cloud Processing
Authors:
Luu Tung Hai,
Thinh D. Le,
Zhicheng Ding,
Qing Tian,
Truong-Son Hy
Abstract:
Point cloud processing has gained significant attention due to its critical role in applications such as autonomous driving and 3D object recognition. However, deploying high-performance models like Point Transformer V3 in resource-constrained environments remains challenging due to their high computational and memory demands. This work introduces a novel distillation framework that leverages topology-aware representations and gradient-guided knowledge distillation to effectively transfer knowledge from a high-capacity teacher to a lightweight student model. Our approach captures the underlying geometric structures of point clouds while selectively guiding the student model's learning process through gradient-based feature alignment. Experimental results on the NuScenes, SemanticKITTI, and Waymo datasets demonstrate that the proposed method achieves competitive performance, with an approximately 16x reduction in model size and a nearly 1.9x decrease in inference time compared to its teacher model. Notably, on NuScenes, our method achieves state-of-the-art performance among knowledge distillation techniques trained solely on LiDAR data, surpassing prior knowledge distillation baselines in segmentation performance. Our implementation is available publicly at:
https://github.com/HySonLab/PointDistill
Submitted 12 May, 2025;
originally announced May 2025.
-
Lifelong Whole Slide Image Analysis: Online Vision-Language Adaptation and Past-to-Present Gradient Distillation
Authors:
Doanh C. Bui,
Hoai Luan Pham,
Vu Trung Duong Le,
Tuan Hai Vu,
Van Duy Tran,
Khang Nguyen,
Yasuhiko Nakashima
Abstract:
Whole Slide Images (WSIs) play a crucial role in accurate cancer diagnosis and prognosis, as they provide tissue details at the cellular level. However, the rapid growth of computational tasks involving WSIs poses significant challenges. Given that WSIs are gigapixels in size, they present difficulties in terms of storage, processing, and model training. Therefore, it is essential to develop lifelong learning approaches for WSI analysis. In scenarios where slides are distributed across multiple institutes, we aim to leverage them to develop a unified online model as a computational tool for cancer diagnosis in clinical and hospital settings. In this study, we introduce ADaFGrad, a method designed to enhance lifelong learning for whole-slide image (WSI) analysis. First, we leverage pathology vision-language foundation models to develop a framework that enables interaction between a slide's regional tissue features and a predefined text-based prototype buffer. Additionally, we propose a gradient-distillation mechanism that mimics the gradient of a logit with respect to the classification-head parameters across past and current iterations in a continual-learning setting. We construct a sequence of six TCGA datasets for training and evaluation. Experimental results show that ADaFGrad outperforms both state-of-the-art WSI-specific and conventional continual-learning methods after only a few training epochs, exceeding them by up to +5.068% in the class-incremental learning scenario while exhibiting the least forgetting (i.e., retaining the most knowledge from previous tasks). Moreover, ADaFGrad surpasses its baseline by as much as +40.084% in accuracy, further demonstrating the effectiveness of the proposed modules.
Submitted 4 May, 2025;
originally announced May 2025.
-
ZeroSlide: Is Zero-Shot Classification Adequate for Lifelong Learning in Whole-Slide Image Analysis in the Era of Pathology Vision-Language Foundation Models?
Authors:
Doanh C. Bui,
Hoai Luan Pham,
Vu Trung Duong Le,
Tuan Hai Vu,
Van Duy Tran,
Yasuhiko Nakashima
Abstract:
Lifelong learning for whole slide images (WSIs) poses the challenge of training a unified model to perform multiple WSI-related tasks, such as cancer subtyping and tumor classification, in a distributed, continual fashion. This is a practical problem in clinics and hospitals, as WSIs are large and costly to store, process, and transfer. Training new models whenever new tasks are defined is time-consuming. Recent work has applied regularization- and rehearsal-based methods to this setting. However, the rise of vision-language foundation models that align diagnostic text with pathology images raises the question: are these models alone sufficient for lifelong WSI learning using zero-shot classification, or is further investigation into continual learning strategies needed to improve performance? To our knowledge, this is the first study to compare conventional continual-learning approaches with vision-language zero-shot classification for WSIs. Our source code and experimental results will be available soon.
Submitted 22 April, 2025;
originally announced April 2025.
-
Evaluating Developer-written Unit Test Case Reduction for Java -- A Replication Study
Authors:
Tuan D Le,
Brandon Wilber,
Arpit Christi
Abstract:
Failing test case reduction can promote efficient debugging because a developer may not need to observe components that are not relevant to inducing failure. Failing test case reduction can also improve the efficiency of fault localization. These considerations have prompted researchers to study the reduction process, the reduction output, and the removed entities. Christi et al. studied test reduction using a tool called ReduSharptor for C# tests. They considered the test to be an Abstract Syntax Tree (AST). Based on that, they studied the reduction outcome and removed entities in terms of Leaf nodes and Non-Leaf nodes of the AST. They claimed that (1) leaf nodes are removed in large numbers, and (2) the probability of removal is slightly higher than non-leaf nodes. We replicate their results using a different test case reduction tool, ReduJavator, for Java unit tests. We evaluate test reduction using 30 randomly chosen bugs from the Defects4J database and 30 mutants for 6 open-source projects. Our results confirm their first claim: leaf nodes are removed in large numbers. Our results are inconclusive regarding their second claim; we cannot confirm that the probability of removal is higher for non-leaf nodes.
Submitted 8 January, 2025;
originally announced January 2025.
-
Disentangled Representation Learning for Causal Inference with Instruments
Authors:
Debo Cheng,
Jiuyong Li,
Lin Liu,
Ziqi Xu,
Weijia Zhang,
Jixue Liu,
Thuc Duy Le
Abstract:
Latent confounders are a fundamental challenge for inferring causal effects from observational data. The instrumental variable (IV) approach is a practical way to address this challenge. Existing IV based estimators need a known IV or other strong assumptions, such as the existence of two or more IVs in the system, which limits the application of the IV approach. In this paper, we consider a relaxed requirement, which assumes there is an IV proxy in the system without knowing which variable is the proxy. We propose a Variational AutoEncoder (VAE) based disentangled representation learning method to learn an IV representation from a dataset with latent confounders and then utilise the IV representation to obtain an unbiased estimation of the causal effect from the data. Extensive experiments on synthetic and real-world data have demonstrated that the proposed algorithm outperforms the existing IV based estimators and VAE-based estimators.
Submitted 5 December, 2024;
originally announced December 2024.
-
Learning Time-Varying Instruments for Identifying Causal Effects in Time-Series Data
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Thuc Duy Le,
Xudong Guo,
Shichao Zhang
Abstract:
Querying causal effects from time-series data is important across various fields, including healthcare, economics, climate science, and epidemiology. However, this task becomes complex in the presence of time-varying latent confounders, which affect both treatment and outcome variables over time and can introduce bias in causal effect estimation. Traditional instrumental variable (IV) methods are limited in addressing such complexities due to the need for predefined IVs or strong assumptions that do not hold in dynamic settings. To tackle these issues, we develop a novel Time-varying Conditional Instrumental Variables (CIV) method for Debiasing causal effect estimation, referred to as TDCIV. TDCIV leverages Long Short-Term Memory (LSTM) and Variational Autoencoder (VAE) models to disentangle and learn the representations of time-varying CIV and its conditioning set from proxy variables without prior knowledge. Under the assumptions of the Markov property and availability of proxy variables, we theoretically establish the validity of these learned representations for addressing the biases from time-varying latent confounders, thus enabling accurate causal effect estimation. Our proposed TDCIV is the first to effectively learn time-varying CIV and its associated conditioning set without relying on domain-specific knowledge.
Submitted 26 November, 2024;
originally announced November 2024.
-
FQsun: A Configurable Wave Function-Based Quantum Emulator for Power-Efficient Quantum Simulations
Authors:
Tuan Hai Vu,
Vu Trung Duong Le,
Hoai Luan Pham,
Quoc Chuong Nguyen,
Yasuhiko Nakashima
Abstract:
Quantum computers are promising for solving complex problems, but access to real quantum hardware remains limited due to high costs. Although software simulators on CPUs/GPUs such as Qiskit, ProjectQ, and Qsun offer flexibility and support for many qubits, they struggle with high power consumption and limited processing speed, especially as qubit counts scale. Accordingly, quantum emulators implemented on dedicated hardware, such as FPGAs and analog circuits, offer a promising path for addressing energy efficiency concerns. However, existing studies on hardware-based emulators still face challenges in terms of limited flexibility and lack of fidelity evaluation. To overcome these gaps, we propose FQsun, a quantum emulator that enhances performance by integrating four key innovations: efficient memory organization, a configurable Quantum Gate Unit (QGU), optimized scheduling, and multiple number precisions. Five FQsun versions with different number precisions are implemented on the Xilinx ZCU102, consuming a maximum power of 2.41W. Experimental results demonstrate high fidelity, low mean square error, and high normalized gate speed, particularly with 32-bit versions, establishing FQsun's capability as a precise quantum emulator. Benchmarking on well-known quantum algorithms reveals that FQsun achieves a superior power-delay product, outperforming software simulators on CPUs in the processing speed range.
Submitted 18 March, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
Linking Model Intervention to Causal Interpretation in Model Explanation
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Kui Yu,
Thuc Duy Le,
Jixue Liu
Abstract:
Intervention intuition is often used in model explanation, where the intervention effect of a feature on the outcome is quantified by the difference of a model prediction when the feature value is changed from the current value to the baseline value. Such a model intervention effect of a feature is inherently associational. In this paper, we study the conditions under which an intuitive model intervention effect has a causal interpretation, i.e., when it indicates whether a feature is a direct cause of the outcome. This work links the model intervention effect to the causal interpretation of a model. Such an interpretation capability is important since it indicates whether a machine learning model is trustworthy to domain experts. The conditions also reveal the limitations of using a model intervention effect for causal interpretation in an environment with unobserved features. Experiments on semi-synthetic datasets have been conducted to validate theorems and show the potential for using the model intervention effect for model interpretation.
Submitted 21 October, 2024;
originally announced October 2024.
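The model intervention effect defined in the abstract above (the change in a model's prediction when one feature is moved from its current value to a baseline value) can be sketched in a few lines. The function name, the dict-based feature representation, the toy linear model, and the sign convention (current prediction minus baseline prediction) are all assumptions made for illustration. The abstract does not specify them, and the sketch says nothing about when the effect has a causal interpretation.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def intervention_effect(
    model: Callable[[Features], float],
    x: Features,
    feature: str,
    baseline: float,
) -> float:
    """Prediction at x minus prediction with `feature` set to `baseline` (assumed sign convention)."""
    x_base = dict(x)  # copy so the caller's instance is left unchanged
    x_base[feature] = baseline
    return model(x) - model(x_base)


def toy_model(x: Features) -> float:
    """Hypothetical linear model used only to exercise the function."""
    return 2 * x["a"] + 3 * x["b"]


if __name__ == "__main__":
    instance = {"a": 5, "b": 1}
    print(intervention_effect(toy_model, instance, "a", baseline=1))
```

Because the quantity is computed purely from model outputs, it reflects what the model has learned from associations in the data, which is why the abstract treats the conditions for a causal reading as a separate question.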
-
Theoretical Analysis of the Efficient-Memory Matrix Storage Method for Quantum Emulation Accelerators with Gate Fusion on FPGAs
Authors:
Tran Xuan Hieu Le,
Hoai Luan Pham,
Tuan Hai Vu,
Vu Trung Duong Le,
Nakashima Yasuhiko
Abstract:
Quantum emulators play an important role in the development and testing of quantum algorithms, especially given the limitations of the current FTQC era. Developing high-speed, memory-optimized quantum emulators is a growing research trend, with gate fusion being a promising technique. However, existing gate fusion implementations often struggle to efficiently support large-scale quantum systems with a high number of qubits due to a lack of optimizations for the exponential growth in memory requirements. Therefore, this study proposes the EMMS (Efficient-Memory Matrix Storage) method for storing quantum operators and states, along with an EMMS-based Quantum Emulator Accelerator (QEA) architecture that incorporates multiple processing elements (PEs) to accelerate tensor product and matrix multiplication computations in quantum emulation with gate fusion. The theoretical analysis of the QEA on the Xilinx ZCU102 FPGA, using varying numbers of PEs and different depths of unitary and local data memory, reveals a linear increase in memory depth with the number of qubits. This scaling highlights the potential of the EMMS-based QEA to accommodate larger quantum circuits, providing insights into selecting appropriate memory sizes and FPGA devices. Furthermore, the estimated performance of the QEA with PE counts ranging from $2^2$ to $2^5$ on the Xilinx ZCU102 FPGA demonstrates that increasing the number of PEs significantly reduces the computation cycle count for circuits with fewer than 18 qubits, making it significantly faster than previous works.
Submitted 14 October, 2024;
originally announced October 2024.
-
TSI: A Multi-View Representation Learning Approach for Time Series Forecasting
Authors:
Wentao Gao,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Thuc Duy Le,
Debo Cheng,
Yanchang Zhao,
Yun Chen
Abstract:
With the growing demand for long-sequence time-series forecasting in real-world applications such as electricity consumption planning, time series forecasting is becoming increasingly important across domains, as recent advances in representation learning within the field highlight. This study introduces a novel multi-view approach for time series forecasting that integrates trend and seasonal representations with an Independent Component Analysis (ICA)-based representation. Recognizing the limitations of existing methods in representing complex and high-dimensional time series data, this research addresses the challenge by combining TS (trend and seasonality) and ICA (independent components) perspectives. This approach offers a holistic understanding of time series data, going beyond traditional models that often miss nuanced, nonlinear relationships. The efficacy of the TSI model is demonstrated through comprehensive testing on various benchmark datasets, where it outperforms current state-of-the-art models, particularly in multivariate forecasting. The method not only enhances forecasting accuracy but also provides a more in-depth understanding of time series data. Using ICA as one of the views lays the groundwork for further exploration and methodological advancements in time series forecasting, opening new avenues for research and practical applications.
Submitted 29 September, 2024;
originally announced September 2024.
-
Continuous Learning of Transformer-based Audio Deepfake Detection
Authors:
Tuan Duy Nguyen Le,
Kah Kuan Teh,
Huy Dat Tran
Abstract:
This paper proposes a novel framework for audio deepfake detection with two main objectives: i) attaining the highest possible accuracy on available fake data, and ii) effectively performing continuous learning on new fake data in a few-shot learning manner. Specifically, we build a large audio deepfake collection using various deep audio generation methods. The data is further enhanced with additional augmentation methods to increase variation across compression, far-field recordings, noise, and other distortions. We then adopt the Audio Spectrogram Transformer for the audio deepfake detection model. Accordingly, the proposed method achieves promising performance on various benchmark datasets. Furthermore, we present a continuous learning plugin module to update the trained model most effectively with the fewest possible labeled data points of the new fake type. The proposed method outperforms the conventional direct fine-tuning approach with far fewer labeled data points.
Submitted 9 September, 2024;
originally announced September 2024.
-
Deconfounding Multi-Cause Latent Confounders: A Factor-Model Approach to Climate Model Bias Correction
Authors:
Wentao Gao,
Jiuyong Li,
Debo Cheng,
Lin Liu,
Jixue Liu,
Thuc Duy Le,
Xiaojing Du,
Xiongren Chen,
Yanchang Zhao,
Yun Chen
Abstract:
Global Climate Models (GCMs) are crucial for predicting future climate changes by simulating the Earth system. However, GCM outputs exhibit systematic biases due to model uncertainties, parameterization simplifications, and inadequate representation of complex climate phenomena. Traditional bias correction methods, which rely on historical observation data and statistical techniques, often neglect unobserved confounders, leading to biased results. This paper proposes a novel bias correction approach that uses both GCM and observational data to learn a factor model that captures multi-cause latent confounders. Inspired by recent advances in causality-based time series deconfounding, our method first constructs a factor model to learn latent confounders from historical data and then applies them to enhance the bias correction process using advanced time series forecasting models. The experimental results demonstrate significant improvements in the accuracy of precipitation outputs. By addressing unobserved confounders, our approach offers a robust and theoretically grounded solution for climate model bias correction.
Submitted 6 June, 2025; v1 submitted 21 August, 2024;
originally announced August 2024.
-
Exploring the Limitations of Kolmogorov-Arnold Networks in Classification: Insights to Software Training and Hardware Implementation
Authors:
Van Duy Tran,
Tran Xuan Hieu Le,
Thi Diem Tran,
Hoai Luan Pham,
Vu Trung Duong Le,
Tuan Hai Vu,
Van Tinh Nguyen,
Yasuhiko Nakashima
Abstract:
Kolmogorov-Arnold Networks (KANs), a novel type of neural network, have recently gained popularity and attention due to their ability to substitute for multi-layer perceptrons (MLPs) in artificial intelligence (AI) with higher accuracy and interpretability. However, KAN assessment is still limited and cannot provide an in-depth analysis of a specific domain. Furthermore, no study has been conducted on the implementation of KANs in hardware design, which would directly demonstrate whether KANs are truly superior to MLPs in practical applications. In this paper, we therefore focus on verifying KANs on classification problems, a common but significant topic in AI, using four different types of datasets. Furthermore, the corresponding hardware implementation is considered using the Vitis high-level synthesis (HLS) tool. To the best of our knowledge, this is the first article to implement hardware for KANs. The results indicate that KANs cannot achieve higher accuracy than MLPs on highly complex datasets while utilizing substantially more hardware resources. Therefore, the MLP remains an effective approach for achieving accuracy and efficiency in software and hardware implementation.
Submitted 25 July, 2024; v1 submitted 25 July, 2024;
originally announced July 2024.
-
A Critique of Chen's "The 2-MAXSAT Problem Can Be Solved in Polynomial Time"
Authors:
Tran Duy Anh Le,
Michael P. Reidy,
Eliot J. Smith
Abstract:
In this paper, we examine Yangjun Chen's technical report titled ``The 2-MAXSAT Problem Can Be Solved in Polynomial Time'' [Che23], which revises and expands upon their conference paper of the same name [Che22]. Chen's paper purports to build a polynomial-time algorithm for the ${\rm NP}$-complete problem 2-MAXSAT by converting a 2-CNF formula into a graph that is then searched. We show through multiple counterexamples that Chen's proposed algorithms contain flaws, and we find that the structures they create lack properly formalized definitions. Furthermore, we elaborate on how the author fails to prove the correctness of their algorithms and how they make overgeneralizations in their time analysis of their proposed solution. Due to these issues, we conclude that Chen's technical report [Che23] and conference paper [Che22] both fail to provide a proof that ${\rm P}={\rm NP}$.
Submitted 21 February, 2024;
originally announced April 2024.
-
Robust COVID-19 Detection in CT Images with CLIP
Authors:
Li Lin,
Yamini Sri Krubha,
Zhenhuan Yang,
Cheng Ren,
Thuc Duy Le,
Irene Amerini,
Xin Wang,
Shu Hu
Abstract:
In the realm of medical imaging, particularly for COVID-19 detection, deep learning models face substantial challenges such as the necessity for extensive computational resources, the paucity of well-annotated datasets, and a significant amount of unlabeled data. In this work, we introduce the first lightweight detector designed to overcome these obstacles, leveraging a frozen CLIP image encoder and a trainable multilayer perceptron (MLP). Enhanced with Conditional Value at Risk (CVaR) for robustness and a loss landscape flattening strategy for improved generalization, our model is tailored for high efficacy in COVID-19 detection. Furthermore, we integrate a teacher-student framework to capitalize on the vast amounts of unlabeled data, enabling our model to achieve superior performance despite the inherent data limitations. Experimental results on the COV19-CT-DB dataset demonstrate the effectiveness of our approach, surpassing the baseline by up to 10.6% in `macro' F1 score in supervised learning. The code is available at https://github.com/Purdue-M2/COVID-19_Detection_M2_PURDUE.
Submitted 8 September, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Low-resource classification of mobility functioning information in clinical sentences using large language models
Authors:
Tuan Dung Le,
Thanh Duong,
Thanh Thieu
Abstract:
Objective: Function is increasingly recognized as an important indicator of whole-person health. This study evaluates the ability of publicly available large language models (LLMs) to accurately identify the presence of functioning information in clinical notes. We explore various strategies to improve performance on this task. Materials and Methods: We collect a balanced binary classification dataset of 1000 sentences from the Mobility NER dataset, which was curated from n2c2 clinical notes. For evaluation, we construct zero-shot and few-shot prompts to ask the LLMs whether a given sentence contains mobility functioning information. Two sampling techniques, random sampling and k-nearest neighbor (kNN)-based sampling, are used to select the few-shot examples. Furthermore, we apply a parameter-efficient prompt-based fine-tuning method to the LLMs and evaluate their performance under various training settings. Results: Flan-T5-xxl outperforms all other models in both zero-shot and few-shot settings, achieving an F1 score of 0.865 with a single demonstrative example selected by kNN sampling. In prompt-based fine-tuning experiments, this foundation model also demonstrates superior performance across all low-resource settings, achieving an F1 score of 0.922 using the full training dataset. The smaller model, Flan-T5-xl, requires fine-tuning with only 2.3M additional parameters to achieve performance comparable to the fully fine-tuned Gatortron-base model, with both surpassing an F1 score of 0.9. Conclusion: Open-source instruction-tuned LLMs demonstrate impressive in-context learning capability in the mobility functioning classification task. The performance of these models can be further improved by continuing fine-tuning on a task-specific dataset.
Submitted 15 December, 2023;
originally announced December 2023.
-
Instrumental Variable Estimation for Causal Inference in Longitudinal Data with Time-Dependent Latent Confounders
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Wentao Gao,
Thuc Duy Le
Abstract:
Causal inference from longitudinal observational data is a challenging problem due to the difficulty in correctly identifying the time-dependent confounders, especially in the presence of latent time-dependent confounders. Instrumental variable (IV) is a powerful tool for addressing the latent confounders issue, but the traditional IV technique cannot deal with latent time-dependent confounders in longitudinal studies. In this work, we propose a novel Time-dependent Instrumental Factor Model (TIFM) for time-varying causal effect estimation from data with latent time-dependent confounders. At each time-step, the proposed TIFM method employs the Recurrent Neural Network (RNN) architecture to infer latent IV, and then uses the inferred latent IV factor for addressing the confounding bias caused by the latent time-dependent confounders. We provide a theoretical analysis for the proposed TIFM method regarding causal effect estimation in longitudinal data. Extensive evaluation with synthetic datasets demonstrates the effectiveness of TIFM in addressing causal effect estimation over time. We further apply TIFM to a climate dataset to showcase the potential of the proposed method in tackling real-world problems.
Submitted 12 December, 2023;
originally announced December 2023.
-
On Czerwinski's "${\rm P} \neq {\rm NP}$ relative to a ${\rm P}$-complete oracle"
Authors:
Michael C. Chavrimootoo,
Tran Duy Anh Le,
Michael P. Reidy,
Eliot J. Smith
Abstract:
In this paper, we take a closer look at Czerwinski's "${\rm P}\neq{\rm NP}$ relative to a ${\rm P}$-complete oracle" [Cze23]. There are (uncountably) infinitely-many relativized worlds where ${\rm P}$ and ${\rm NP}$ differ, and it is well-known that for any ${\rm P}$-complete problem $A$, ${\rm P}^A \neq {\rm NP}^A \iff {\rm P}\neq {\rm NP}$. The paper defines two sets ${\rm D}_{\rm P}$ and ${\rm D}_{\rm NP}$ and builds the purported proof of their main theorem on the claim that an oracle Turing machine with ${\rm D}_{\rm NP}$ as its oracle and that accepts ${\rm D}_{\rm P}$ must make $\Theta(2^n)$ queries to the oracle. We invalidate the latter by proving that there is an oracle Turing machine with ${\rm D}_{\rm NP}$ as its oracle that accepts ${\rm D}_{\rm P}$ and yet only makes one query to the oracle. We thus conclude that Czerwinski's paper [Cze23] fails to establish that ${\rm P} \neq {\rm NP}$.
Submitted 7 December, 2023;
originally announced December 2023.
-
Conditional Instrumental Variable Regression with Representation Learning for Causal Inference
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Thuc Duy Le
Abstract:
This paper studies the challenging problem of estimating causal effects from observational data in the presence of unobserved confounders. The two-stage least squares (TSLS) method and its variants with a standard instrumental variable (IV) are commonly used to eliminate confounding bias, including the bias caused by unobserved confounders, but they rely on the linearity assumption. In addition, the strict condition of unconfounded instruments imposed on a standard IV is too strong to be practical. To address these two practical problems of the standard IV method (the linearity assumption and the strict condition), we use a conditional IV (CIV) to relax the unconfounded instrument condition of the standard IV and propose a non-linear CIV regression with Confounding Balancing Representation Learning, CBRL.CIV, for jointly eliminating the confounding bias from unobserved confounders and balancing the observed confounders, without the linearity assumption. We theoretically demonstrate the soundness of CBRL.CIV. Extensive experiments on synthetic and two real-world datasets show the competitive performance of CBRL.CIV against state-of-the-art IV-based estimators and its superiority in dealing with the non-linear situation.
Submitted 3 October, 2023;
originally announced October 2023.
-
Advancing Wound Filling Extraction on 3D Faces: Auto-Segmentation and Wound Face Regeneration Approach
Authors:
Duong Q. Nguyen,
Thinh D. Le,
Phuong D. Nguyen,
Nga T. K. Le,
H. Nguyen-Xuan
Abstract:
Facial wound segmentation plays a crucial role in preoperative planning and optimizing patient outcomes in various medical applications. In this paper, we propose an efficient approach for automating 3D facial wound segmentation using a two-stream graph convolutional network. Our method leverages the Cir3D-FaIR dataset and addresses the challenge of data imbalance through extensive experimentation with different loss functions. To achieve accurate segmentation, we conducted thorough experiments and selected a high-performing model from the trained models. The selected model demonstrates exceptional segmentation performance for complex 3D facial wounds. Furthermore, based on the segmentation model, we propose an improved approach for extracting 3D facial wound fillers and compare it to the results of the previous study. Our method achieved a remarkable accuracy of 0.9999986\% on the test suite, surpassing the performance of the previous method. From this result, we use 3D printing technology to illustrate the shape of the wound filling. The outcomes of this study have significant implications for physicians involved in preoperative planning and intervention design. By automating facial wound segmentation and improving the accuracy of wound-filling extraction, our approach can assist in carefully assessing and optimizing interventions, leading to enhanced patient outcomes. Additionally, it contributes to advancing facial reconstruction techniques by utilizing machine learning and 3D bioprinting for printing skin tissue implants. Our source code is available at \url{https://github.com/SIMOGroup/WoundFilling3D}.
Submitted 12 July, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Learning Conditional Instrumental Variable Representation for Causal Effect Estimation
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Thuc Duy Le,
Jixue Liu
Abstract:
One of the fundamental challenges in causal inference is to estimate the causal effect of a treatment on its outcome of interest from observational data. However, causal effect estimation often suffers from the impacts of confounding bias caused by unmeasured confounders that affect both the treatment and the outcome. The instrumental variable (IV) approach is a powerful way to eliminate the confounding bias from latent confounders. However, the existing IV-based estimators require a nominated IV, and for a conditional IV (CIV) the corresponding conditioning set too, for causal effect estimation. This limits the application of IV-based estimators. In this paper, by leveraging the advantage of disentangled representation learning, we propose a novel method, named DVAE.CIV, for learning and disentangling the representations of CIV and the representations of its conditioning set for causal effect estimations from data with latent confounders. Extensive experimental results on both synthetic and real-world datasets demonstrate the superiority of the proposed DVAE.CIV method against the existing causal effect estimators.
Submitted 20 June, 2023;
originally announced June 2023.
-
Linking a predictive model to causal effect estimation
Authors:
Jiuyong Li,
Lin Liu,
Ziqi Xu,
Ha Xuan Tran,
Thuc Duy Le,
Jixue Liu
Abstract:
A predictive model makes outcome predictions based on some given features, i.e., it estimates the conditional probability of the outcome given a feature vector. In general, a predictive model cannot estimate the causal effect of a feature on the outcome, i.e., how the outcome will change if the feature is changed while keeping the values of other features unchanged. This is because causal effect estimation requires interventional probabilities. However, many real world problems such as personalised decision making, recommendation, and fairness computing, need to know the causal effect of any feature on the outcome for a given instance. This is different from the traditional causal effect estimation problem with a fixed treatment variable. This paper first tackles the challenge of estimating the causal effect of any feature (as the treatment) on the outcome w.r.t. a given instance. The theoretical results naturally link a predictive model to causal effect estimations and imply that a predictive model is causally interpretable when the conditions identified in the paper are satisfied. The paper also reveals the robust property of a causally interpretable model. We use experiments to demonstrate that various types of predictive models, when satisfying the conditions identified in this paper, can estimate the causal effects of features as accurately as state-of-the-art causal effect estimation methods. We also show the potential of such causally interpretable predictive models for robust predictions and personalised decision making.
Submitted 10 April, 2023;
originally announced April 2023.
-
Application of Self-Supervised Learning to MICA Model for Reconstructing Imperfect 3D Facial Structures
Authors:
Phuong D. Nguyen,
Thinh D. Le,
Duong Q. Nguyen,
Binh Nguyen,
H. Nguyen-Xuan
Abstract:
In this study, we emphasize the integration of a pre-trained MICA model with an imperfect face dataset, employing a self-supervised learning approach. We present an innovative method for regenerating flawed facial structures, yielding 3D printable outputs that effectively support physicians in their patient treatment process. Our results highlight the model's capacity for concealing scars and achieving comprehensive facial reconstructions without discernible scarring. By capitalizing on pre-trained models and necessitating only a few hours of supplementary training, our methodology adeptly devises an optimal model for reconstructing damaged and imperfect facial features. Harnessing contemporary 3D printing technology, we institute a standardized protocol for fabricating realistic, camouflaging mask models for patients in a laboratory environment.
Submitted 8 April, 2023;
originally announced April 2023.
-
3D Facial Imperfection Regeneration: Deep learning approach and 3D printing prototypes
Authors:
Phuong D. Nguyen,
Thinh D. Le,
Duong Q. Nguyen,
Thanh Q. Nguyen,
Li-Wei Chou,
H. Nguyen-Xuan
Abstract:
This study explores the potential of a fully convolutional mesh autoencoder model for regenerating 3D natural faces in the presence of imperfect areas. We utilize deep learning approaches in graph processing and analysis to investigate the capability of the model to recreate a filling part for facial scars. Our approach to dataset creation is able to generate a facial scar rationally in a virtual space that corresponds to the unique circumstances. In particular, we propose a new method, named 3D Facial Imperfection Regeneration (3D-FaIR), for reproducing a complete face reconstruction based on the remaining features of the patient's face. To further enhance the applicability of the present research, we develop an improved outlier technique to separate the wounds of patients and provide appropriate wound cover models. In addition, a Cir3D-FaIR dataset of imperfect faces and the open-source code are released at https://github.com/SIMOGroup/3DFaIR. Our findings demonstrate the potential of the proposed approach to help patients recover more quickly and safely through convenient techniques. We hope that this research can contribute to the development of new products and innovative solutions for facial scar regeneration.
Submitted 25 March, 2023;
originally announced March 2023.
-
Causal Inference with Conditional Instruments using Deep Generative Models
Authors:
Debo Cheng,
Ziqi Xu,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Thuc Duy Le
Abstract:
The instrumental variable (IV) approach is a widely used way to estimate the causal effects of a treatment on an outcome of interest from observational data with latent confounders. A standard IV is expected to be related to the treatment variable and independent of all other variables in the system. However, it is challenging to search for a standard IV from data directly due to the strict conditions. The conditional IV (CIV) method has been proposed to allow a variable to be an instrument conditioning on a set of variables, allowing a wider choice of possible IVs and enabling broader practical applications of the IV approach. Nevertheless, there is no data-driven method to discover a CIV and its conditioning set directly from data. To fill this gap, we propose to learn the representations of the information of a CIV and its conditioning set from data with latent confounders for average causal effect estimation. By taking advantage of deep generative models, we develop a novel data-driven approach for simultaneously learning the representation of a CIV from measured variables and generating the representation of its conditioning set given measured variables. Extensive experiments on synthetic and real-world datasets show that our method outperforms the existing IV methods.
Submitted 29 November, 2022;
originally announced November 2022.
-
Data-Driven Causal Effect Estimation Based on Graphical Causal Modelling: A Survey
Authors:
Debo Cheng,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Thuc Duy Le
Abstract:
In many fields of scientific research and real-world applications, unbiased estimation of causal effects from non-experimental data is crucial for understanding the mechanism underlying the data and for decision-making on effective responses or interventions. A great deal of research has been conducted to address this challenging problem from different angles. For estimating causal effects from observational data, assumptions such as the Markov condition, faithfulness, and causal sufficiency are always made. Under these assumptions, full knowledge, such as a set of covariates or an underlying causal graph, is typically required. A practical challenge is that in many applications, no such full knowledge or only some partial knowledge is available. In recent years, research has emerged that uses search strategies based on graphical causal modelling to discover useful knowledge from data for causal effect estimation under some mild assumptions, and it has shown promise in tackling this practical challenge. In this survey, we review these data-driven methods for causal effect estimation for a single treatment with a single outcome of interest and focus on the challenges faced by data-driven causal effect estimation. We concisely summarise the basic concepts and theories that are essential for data-driven causal effect estimation using graphical causal modelling but are scattered around the literature. We identify and discuss the challenges faced by data-driven causal effect estimation and characterise the existing methods by their assumptions and the approaches to tackling the challenges. We analyse the strengths and limitations of the different types of methods and present an empirical evaluation to support the discussions. We hope this review will motivate more researchers to design better data-driven methods based on graphical causal modelling for the challenging problem of causal effect estimation.
Submitted 3 December, 2023; v1 submitted 19 August, 2022;
originally announced August 2022.
-
Explanatory causal effects for model agnostic explanations
Authors:
Jiuyong Li,
Ha Xuan Tran,
Thuc Duy Le,
Lin Liu,
Kui Yu,
Jixue Liu
Abstract:
This paper studies the problem of estimating the contributions of features to the prediction of a specific instance by a machine learning model and the overall contribution of a feature to the model. The causal effect of a feature (variable) on the predicted outcome reflects the contribution of the feature to a prediction very well. A challenge is that most existing causal effects cannot be estimated from data without a known causal graph. In this paper, we define an explanatory causal effect based on a hypothetical ideal experiment. The definition brings several benefits to model agnostic explanations. First, explanations are transparent and have causal meanings. Second, the explanatory causal effect estimation can be data driven. Third, the causal effects provide both a local explanation for a specific prediction and a global explanation showing the overall importance of a feature in a predictive model. We further propose a method using individual and combined variables based on explanatory causal effects for explanations. We show the definition and the method work with experiments on some real-world data sets.
Submitted 23 June, 2022;
originally announced June 2022.
-
Ancestral Instrument Method for Causal Inference without Complete Knowledge
Authors:
Debo Cheng,
Jiuyong Li,
Lin Liu,
Jiji Zhang,
Thuc Duy Le,
Jixue Liu
Abstract:
Unobserved confounding is the main obstacle to causal effect estimation from observational data. Instrumental variables (IVs) are widely used for causal effect estimation when there exist latent confounders. With the standard IV method, when a given IV is valid, unbiased estimation can be obtained, but the validity requirement on a standard IV is strict and untestable. Conditional IVs have been proposed to relax the requirement of standard IVs by conditioning on a set of observed variables (known as a conditioning set for a conditional IV). However, the criterion for finding a conditioning set for a conditional IV needs a directed acyclic graph (DAG) representing the causal relationships of both observed and unobserved variables. This makes it challenging to discover a conditioning set directly from data. In this paper, by leveraging maximal ancestral graphs (MAGs) for causal inference with latent variables, we study the graphical properties of ancestral IVs, a type of conditional IVs using MAGs, and develop the theory to support data-driven discovery of the conditioning set for a given ancestral IV in data under the pretreatment variable assumption. Based on the theory, we develop an algorithm for unbiased causal effect estimation with a given ancestral IV and observational data. Extensive experiments on synthetic and real-world datasets demonstrate the performance of the algorithm in comparison with existing IV methods.
Submitted 8 December, 2023; v1 submitted 11 January, 2022;
originally announced January 2022.
-
Dependency-based Anomaly Detection: a General Framework and Comprehensive Evaluation
Authors:
Sha Lu,
Lin Liu,
Kui Yu,
Thuc Duy Le,
Jixue Liu,
Jiuyong Li
Abstract:
Anomaly detection is crucial for understanding unusual behaviors in data, as anomalies offer valuable insights. This paper introduces Dependency-based Anomaly Detection (DepAD), a general framework that utilizes variable dependencies to uncover meaningful anomalies with better interpretability. DepAD reframes unsupervised anomaly detection as supervised feature selection and prediction tasks, which allows users to tailor anomaly detection algorithms to their specific problems and data. We extensively evaluate representative off-the-shelf techniques for the DepAD framework. Two DepAD algorithms emerge as all-rounders and superior performers in handling a wide range of datasets compared to nine state-of-the-art anomaly detection methods. Additionally, we demonstrate that DepAD algorithms provide new and insightful interpretations for detected anomalies.
Submitted 17 April, 2024; v1 submitted 12 November, 2020;
originally announced November 2020.
-
Compiling ONNX Neural Network Models Using MLIR
Authors:
Tian Jin,
Gheorghe-Teodor Bercea,
Tung D. Le,
Tong Chen,
Gong Su,
Haruki Imai,
Yasushi Negishi,
Anh Leu,
Kevin O'Brien,
Kiyokuni Kawachiya,
Alexandre E. Eichenberger
Abstract:
Deep neural network models are becoming increasingly popular and have been used in various tasks such as computer vision, speech recognition, and natural language processing. Machine learning models are commonly trained in a resource-rich environment and then deployed in a distinct environment such as high availability machines or edge devices. To assist the portability of models, the open-source community has proposed the Open Neural Network Exchange (ONNX) standard. In this paper, we present a high-level, preliminary report on our onnx-mlir compiler, which generates code for the inference of deep neural network models described in the ONNX format. Onnx-mlir is an open-source compiler implemented using the Multi-Level Intermediate Representation (MLIR) infrastructure recently integrated in the LLVM project. Onnx-mlir relies on the MLIR concept of dialects to implement its functionality. We propose here two new dialects: (1) an ONNX specific dialect that encodes the ONNX standard semantics, and (2) a loop-based dialect to provide for a common lowering point for all ONNX dialect operations. Each intermediate representation facilitates its own characteristic set of graph-level and loop-based optimizations respectively. We illustrate our approach by following several models through the proposed representations and we include some early optimization work and performance results.
Submitted 30 September, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Computational methods for cancer driver discovery: A survey
Authors:
Vu Viet Hoang Pham,
Lin Liu,
Cameron Bracken,
Gregory Goodall,
Jiuyong Li,
Thuc Duy Le
Abstract:
Motivation: Uncovering the genomic causes of cancer, known as cancer driver genes, is a fundamental task in biomedical research. Cancer driver genes drive the development and progression of cancer, thus identifying cancer driver genes and their regulatory mechanism is crucial to the design of cancer treatment and intervention. Many computational methods, which take advantage of computer science and data science, have been developed to utilise multiple types of genomic data to reveal cancer drivers and their regulatory mechanism behind cancer development and progression. Due to the complexity of the mechanistic insight of cancer genes in driving cancer and the fast development of the field, it is necessary to have a comprehensive review of the current computational methods for discovering different types of cancer drivers. Results: We survey computational methods for identifying cancer drivers from genomic data. We categorise the methods into three groups: methods for single driver identification, methods for driver module identification, and methods for identifying personalised cancer drivers. We also conduct a case study to compare the performance of the current methods. We further analyse the advantages and limitations of the current methods, and discuss the challenges and future directions of the topic. In addition, we investigate the resources for discovering and validating cancer drivers in order to provide a one-stop reference of the tools to facilitate cancer driver discovery. The ultimate goal of the paper is to help those interested in the topic to establish a solid background to carry out further research in the field.
Submitted 2 July, 2020;
originally announced July 2020.
-
A general framework for causal classification
Authors:
Jiuyong Li,
Weijia Zhang,
Lin Liu,
Kui Yu,
Thuc Duy Le,
Jixue Liu
Abstract:
In many applications, there is a need to predict the effect of an intervention on different individuals from data. For example, which customers are persuadable by a product promotion? Which patients should be treated with a certain type of treatment? These are typical causal questions involving the effect or the change in outcomes made by an intervention. The questions cannot be answered with traditional classification methods as they only use associations to predict outcomes. For personalised marketing, these questions are often answered with uplift modelling. The objective of uplift modelling is to estimate causal effect, but its literature does not discuss when the uplift represents causal effect. Causal heterogeneity modelling can solve the problem, but its assumption of unconfoundedness is untestable in data. So practitioners need guidelines in their applications when using the methods. In this paper, we use causal classification for a set of personalised decision making problems, and differentiate it from classification. We discuss the conditions when causal classification can be resolved by uplift (and causal heterogeneity) modelling methods. We also propose a general framework for causal classification, by using off-the-shelf supervised methods for flexible implementations. Experiments show that two instantiations of the framework work for causal classification and for uplift (causal heterogeneity) modelling, and are competitive with the other uplift (causal heterogeneity) modelling methods.
Submitted 14 March, 2021; v1 submitted 25 March, 2020;
originally announced March 2020.
-
Causal query in observational data with hidden variables
Authors:
Debo Cheng,
Jiuyong Li,
Lin Liu,
Jixue Liu,
Kui Yu,
Thuc Duy Le
Abstract:
This paper discusses the problem of causal query in observational data with hidden variables, with the aim of seeking the change of an outcome when "manipulating" a variable while given a set of plausible confounding variables which affect the manipulated variable and the outcome. Such an "experiment on data" to estimate the causal effect of the manipulated variable is useful for validating an experiment design using historical data or for exploring confounders when studying a new relationship. However, existing data-driven methods for causal effect estimation face some major challenges, including poor scalability with high dimensional data, low estimation accuracy due to heuristics used by the global causal structure learning algorithms, and the assumption of causal sufficiency when hidden variables are inevitable in data. In this paper, we develop a theorem for using local search to find a superset of the adjustment (or confounding) variables for causal effect estimation from observational data under a realistic pretreatment assumption. The theorem ensures that the unbiased estimate of causal effect is included in the set of causal effects estimated by the superset of adjustment variables. Based on the developed theorem, we propose a data-driven algorithm for causal query. Experiments show that the proposed algorithm is faster and produces better causal effect estimation than an existing data-driven causal effect estimation method with hidden variables. The causal effects estimated by the proposed algorithm are as accurate as those by the state-of-the-art methods using domain knowledge.
Submitted 24 November, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Identify treatment effect patterns for personalised decisions
Authors:
Jiuyong Li,
Lin Liu,
Shisheng Zhang,
Saisai Ma,
Thuc Duy Le,
Jixue Liu
Abstract:
In personalised decision making, evidence is required to determine whether an action (treatment) is suitable for an individual. Such evidence can be obtained by modelling treatment effect heterogeneity in subgroups. The existing interpretable modelling methods take a top-down approach to search for subgroups with heterogeneous treatment effects and they may miss the most specific and relevant context for an individual. In this paper, we design a Treatment effect pattern (TEP) to represent treatment effect heterogeneity in data. To achieve an interpretable presentation of TEPs, we use a local causal structure around the outcome to explicitly show how those important variables are used in modelling. We also derive a formula for unbiasedly estimating the Conditional Average Causal Effect (CATE) using the local structure in our problem setting. In the discovery process, we aim at minimising heterogeneity within each subgroup represented by a pattern. We propose a bottom-up search algorithm to discover the most specific patterns fitting individual circumstances the best for personalised decision making. Experiments show that the proposed method models treatment effect heterogeneity better than three other existing tree based methods in synthetic and real world data sets.
Submitted 23 June, 2022; v1 submitted 14 June, 2019;
originally announced June 2019.
-
Fast and Accurate 3D Medical Image Segmentation with Data-swapping Method
Authors:
Haruki Imai,
Samuel Matzek,
Tung D. Le,
Yasushi Negishi,
Kiyokuni Kawachiya
Abstract:
Deep neural network models used for medical image segmentation are large because they are trained with high-resolution three-dimensional (3D) images. Graphics processing units (GPUs) are widely used to accelerate training. However, the memory on a GPU is not large enough to train the models. A popular approach to tackling this problem is the patch-based method, which divides a large image into small patches and trains the models with these small patches. However, this method would degrade the segmentation quality if a target object spans multiple patches. In this paper, we propose a novel approach for 3D medical image segmentation that utilizes data swapping, which swaps out intermediate data from GPU memory to CPU memory to enlarge the effective GPU memory size, for training on high-resolution 3D medical images without patching. We carefully tuned parameters in the data-swapping method to obtain the best training performance for 3D U-Net, a widely used deep neural network model for medical image segmentation. We applied our tuning to train 3D U-Net with full-size images of 192 x 192 x 192 voxels in a brain tumor dataset. As a result, communication overhead, which is the most important issue, was reduced by 17.1%. Compared with the patch-based method for patches of 128 x 128 x 128 voxels, our training for full-size images improved the mean Dice score by 4.48% and 5.32% for detecting the whole tumor sub-region and the tumor core sub-region, respectively. The total training time was reduced from 164 hours to 47 hours, resulting in 3.53 times of acceleration.
Submitted 19 December, 2018;
originally announced December 2018.
-
An exploration of algorithmic discrimination in data and classification
Authors:
Jixue Liu,
Jiuyong Li,
Feiyue Ye,
Lin Liu,
Thuc Duy Le,
Ping Xiong
Abstract:
Algorithmic discrimination is an important aspect when data is used for predictive purposes. This paper analyzes the relationships between discrimination and classification, data set partitioning, and decision models, as well as correlation. The paper uses real world data sets to demonstrate the existence of discrimination and the independence between the discrimination of data sets and the discrimination of classification models.
Submitted 6 November, 2018;
originally announced November 2018.
-
FairMod - Making Predictive Models Discrimination Aware
Authors:
Jixue Liu,
Jiuyong Li,
Lin Liu,
Thuc Duy Le,
Feiyue Ye,
Gefei Li
Abstract:
Predictive models such as decision trees and neural networks may produce discrimination in their predictions. This paper proposes a method to post-process the predictions of a predictive model to make the processed predictions non-discriminatory. The method considers multiple protected variables together. Multiple protected variables make the problem more challenging than a simple protected variable. The method uses a well-cited discrimination metric and adapts it to allow the specification of explanatory variables, such as position, profession, education, that describe the contexts of the applications. It models the post-processing of predictions as a nonlinear optimization problem to find the best adjustments to the predictions so that the discrimination constraints of all protected variables are met at the same time. The proposed method is independent of classification methods. It can handle the cases that existing methods cannot handle: satisfying multiple protected attributes at the same time, allowing multiple explanatory attributes, and being independent of classification model types. An evaluation using four real world data sets shows that the proposed method is as effective as existing methods, in addition to its extra power.
Submitted 4 November, 2018;
originally announced November 2018.
-
Discovering Context Specific Causal Relationships
Authors:
Saisai Ma,
Jiuyong Li,
Lin Liu,
Thuc Duy Le
Abstract:
With the increasing need for personalised decision making, such as personalised medicine and online recommendations, growing attention has been paid to the discovery of the context and heterogeneity of causal relationships. Most existing methods, however, assume a known cause (e.g. a new drug) and focus on identifying from data the contexts of heterogeneous effects of the cause (e.g. patient groups with different responses to the new drug). There is no approach for efficiently detecting context specific causal relationships directly from observational data, i.e. discovering the causes and their contexts simultaneously. In this paper, by taking advantage of highly efficient decision tree induction and the well established causal inference framework, we propose the Tree based Context Causal rule discovery (TCC) method, for efficient exploration of context specific causal relationships from data. Experiments with both synthetic and real world data sets show that TCC can effectively discover context specific causal rules from the data.
Submitted 20 August, 2018;
originally announced August 2018.
-
TFLMS: Large Model Support in TensorFlow by Graph Rewriting
Authors:
Tung D. Le,
Haruki Imai,
Yasushi Negishi,
Kiyokuni Kawachiya
Abstract:
While accelerators such as GPUs have limited memory, deep neural networks are becoming larger and will not fit within the memory limitation of accelerators for training. We propose an approach to tackle this problem by rewriting the computational graph of a neural network, in which swap-out and swap-in operations are inserted to temporarily store intermediate results on CPU memory. In particular, we first revise the concept of a computational graph by defining a concrete semantics for variables in a graph. We then formally show how to derive swap-out and swap-in operations from an existing graph and present rules to optimize the graph. To realize our approach, we developed a module in TensorFlow, named TFLMS. TFLMS is published as a pull request in the TensorFlow repository for contributing to the TensorFlow community. With TFLMS, we were able to train ResNet-50 and 3DUnet with 4.7x and 2x larger batch size, respectively. In particular, we were able to train 3DUNet using images of size $192^3$ for image segmentation, which, without TFLMS, had been done only by dividing the images into smaller images, which affects the accuracy.
Submitted 2 October, 2019; v1 submitted 5 July, 2018;
originally announced July 2018.
-
ParallelPC: an R package for efficient constraint based causal exploration
Authors:
Thuc Duy Le,
Tao Hoang,
Jiuyong Li,
Lin Liu,
Shu Hu
Abstract:
Discovering causal relationships from data is the ultimate goal of many research areas. Constraint based causal exploration algorithms, such as PC, FCI, RFCI, PC-simple, IDA and Joint-IDA have achieved significant progress and have many applications. A common problem with these methods is the high computational complexity, which hinders their applications in real world high dimensional datasets, e.g. gene expression datasets. In this paper, we present an R package, ParallelPC, that includes the parallelised versions of these causal exploration algorithms. The parallelised algorithms help speed up the procedure of experimenting with big datasets and reduce the memory used when running the algorithms. The package is not only suitable for super-computers or clusters, but also convenient for researchers using personal computers with multi core CPUs. Our experimental results on real world datasets show that using the parallelised algorithms it is now practical to explore causal relationships in high dimensional datasets with thousands of variables in a single multicore computer. ParallelPC is available in the CRAN repository at https://cran.rproject.org/web/packages/ParallelPC/index.html.
Submitted 11 October, 2015;
originally announced October 2015.
-
Mining Combined Causes in Large Data Sets
Authors:
Saisai Ma,
Jiuyong Li,
Lin Liu,
Thuc Duy Le
Abstract:
In recent years, many methods have been developed for detecting causal relationships in observational data. Some of them have the potential to tackle large data sets. However, these methods fail to discover a combined cause, i.e. a multi-factor cause consisting of two or more component variables which individually are not causes. A straightforward approach to uncovering a combined cause is to include both individual and combined variables in the causal discovery using existing methods, but this scheme is computationally infeasible due to the huge number of combined variables. In this paper, we propose a novel approach to address this practical causal discovery problem, i.e. mining combined causes in large data sets. The experiments with both synthetic and real world data sets show that the proposed method can obtain high-quality causal discoveries with a high computational efficiency.
Submitted 15 October, 2015; v1 submitted 28 August, 2015;
originally announced August 2015.
-
From Observational Studies to Causal Rule Mining
Authors:
Jiuyong Li,
Thuc Duy Le,
Lin Liu,
Jixue Liu,
Zhou Jin,
Bingyu Sun,
Saisai Ma
Abstract:
Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct RCTs. Therefore observational studies based on passively observed data are widely accepted as an alternative to RCTs. However, in observational studies, prior knowledge is required to generate the hypotheses about the cause-effect relationships to be tested, hence they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many data sets are of high dimensionality, which leaves observational studies out of the opportunities for causal discovery from such a wealth of data sources. In another direction, many efficient data mining methods have been developed to identify associations among variables in large data sets. The problem is, causal relationships imply associations, but the reverse is not always true. However we can see the synergy between the two paradigms here. Specifically, association rule mining can be used to deal with the high-dimensionality problem while observational studies can be utilised to eliminate non-causal associations. In this paper we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large data sets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real world data sets have demonstrated the effectiveness and efficiency of CR mining. In comparison with the commonly used causal discovery methods, the proposed approach in general is faster and has better or competitive performance in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables, a feature that other causal discovery methods do not possess.
Submitted 16 August, 2015;
originally announced August 2015.