-
Automated Treatment Planning for Interstitial HDR Brachytherapy for Locally Advanced Cervical Cancer using Deep Reinforcement Learning
Authors:
Mohammadamin Moradi,
Runyu Jiang,
Yingzi Liu,
Malvern Madondo,
Tianming Wu,
James J. Sohn,
Xiaofeng Yang,
Yasmin Hasan,
Zhen Tian
Abstract:
High-dose-rate (HDR) brachytherapy plays a critical role in the treatment of locally advanced cervical cancer but remains highly dependent on manual treatment planning expertise. The objective of this study is to develop a fully automated HDR brachytherapy planning framework that integrates reinforcement learning (RL) and dose-based optimization to generate clinically acceptable treatment plans with improved consistency and efficiency. We propose a hierarchical two-stage autoplanning framework. In the first stage, a deep Q-network (DQN)-based RL agent iteratively selects treatment planning parameters (TPPs), which control the trade-offs between target coverage and organ-at-risk (OAR) sparing. The agent's state representation includes both dose-volume histogram (DVH) metrics and current TPP values, while its reward function incorporates clinical dose objectives and safety constraints, including D90, V150, V200 for targets, and D2cc for all relevant OARs (bladder, rectum, sigmoid, small bowel, and large bowel). In the second stage, a customized Adam-based optimizer computes the corresponding dwell time distribution for the selected TPPs using a clinically informed loss function. The framework was evaluated on a cohort of patients with complex applicator geometries. The proposed framework successfully learned clinically meaningful TPP adjustments across diverse patient anatomies. For the unseen test patients, the RL-based automated planning method achieved an average score of 93.89%, outperforming the clinical plans which averaged 91.86%. These findings are notable given that score improvements were achieved while maintaining full target coverage and reducing CTV hot spots in most cases.
Submitted 13 June, 2025;
originally announced June 2025.
-
On the High-Rate FDPC Codes: Construction, Encoding, and a Generalization
Authors:
Mohsen Moradi,
Sheida Rabeti,
Hessam Mahdavifar
Abstract:
Recently introduced Fair-Density Parity-Check (FDPC) codes, targeting high-rate applications, offer superior error-correction performance (ECP) compared to 5G Low-Density Parity-Check (LDPC) codes, given the same number of message-passing decoding iterations. In this paper, we present a novel construction method for FDPC codes, introduce a generalization of these codes, and propose a low-complexity encoding algorithm. Numerical results demonstrate the fast convergence of the message-passing decoder for FDPC codes.
Submitted 12 June, 2025;
originally announced June 2025.
-
Bounds and New Constructions for Girth-Constrained Regular Bipartite Graphs
Authors:
Sheida Rabeti,
Mohsen Moradi,
Hessam Mahdavifar
Abstract:
In this paper, we explore the design and analysis of regular bipartite graphs motivated by their application in low-density parity-check (LDPC) codes, specifically with constrained girth and in the high-rate regime. We focus on the relation between the girth of the graph and the sizes of the sets of variable and check nodes. We derive bounds on the number of vertices in regular bipartite graphs, showing how the required number of check nodes grows with respect to the number of variable nodes as girth grows large. Furthermore, we present two constructions for bipartite graphs with girth $\mathcal{G} = 8$: one based on a greedy construction of $(w_c, w_r)$-regular graphs, and another based on semi-regular graphs which have uniform column weight distribution with a sublinear number of check nodes. The second construction leverages sequences of integers without any length-$3$ arithmetic progression and is asymptotically optimal while maintaining a girth of $8$. Also, both constructions can offer sparse parity-check matrices for high-rate codes with medium-to-large block lengths. Our results focus solely on the graph-theoretic problem but can potentially contribute to the ongoing effort to design LDPC codes with high girth and minimum distance, specifically at high code rates.
Submitted 12 June, 2025;
originally announced June 2025.
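The abstract above refers to integer sequences without any length-3 arithmetic progression. As a minimal illustration of what such a sequence is, the sketch below builds one greedily. It is not the authors' construction: the abstract does not say which sequence they use or how it is mapped to a parity-check matrix, and the function names `greedy_ap_free` and `has_three_term_ap` are hypothetical.

```python
from itertools import combinations
from typing import List


def greedy_ap_free(count: int) -> List[int]:
    """Return the first `count` terms of the greedy 3-AP-free sequence starting at 0."""
    seq: List[int] = []
    members = set()
    n = 0
    while len(seq) < count:
        # n is larger than every accepted term, so it can only be the last
        # term of a progression a < b < n, which forces a = 2*b - n.
        if all((2 * b - n) not in members for b in seq):
            seq.append(n)
            members.add(n)
        n += 1
    return seq


def has_three_term_ap(values: List[int]) -> bool:
    """Brute-force check for any a < c in `values` whose midpoint is also present."""
    present = set(values)
    for a, c in combinations(sorted(present), 2):
        if (a + c) % 2 == 0 and (a + c) // 2 in present:
            return True
    return False


if __name__ == "__main__":
    terms = greedy_ap_free(8)
    print(terms)
    print(has_three_term_ap(terms))
```

The greedy rule accepts the smallest integer that does not complete a progression with two earlier terms; the brute-force checker is included only to confirm the property on small inputs.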
-
Continuous Self-Improvement of Large Language Models by Test-time Training with Verifier-Driven Sample Selection
Authors:
Mohammad Mahdi Moradi,
Hossam Amer,
Sudhir Mudur,
Weiwei Zhang,
Yang Liu,
Walid Ahmed
Abstract:
Learning to adapt pretrained language models to unlabeled, out-of-distribution data is a critical challenge, as models often falter on structurally novel reasoning tasks even while excelling within their training distribution. We introduce a new framework, VDS-TTT (Verifier-Driven Sample Selection for Test-Time Training), to address this efficiently. We use a learned verifier to score a pool of generated responses and select only from high-ranking pseudo-labeled examples for adaptation by fine-tuning. Specifically, for each input query our LLM generates N candidate answers; the verifier assigns a reliability score to each, and the response with the highest confidence, provided it is above a fixed threshold, is paired with its query for test-time training. We fine-tune only low-rank LoRA adapter parameters, ensuring adaptation efficiency and fast convergence. Our proposed self-supervised framework is the first to synthesize verifier-driven test-time training data for continuous self-improvement of the model. Experiments across three diverse benchmarks and three state-of-the-art LLMs demonstrate that VDS-TTT yields up to a 32.29% relative improvement over the base model and a 6.66% gain compared to verifier-based methods without test-time training, highlighting its effectiveness and efficiency for on-the-fly large language model adaptation.
Submitted 28 May, 2025; v1 submitted 25 May, 2025;
originally announced May 2025.
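The selection step described in the abstract above (generate N candidates, score each with a verifier, keep the top-scoring one only if it clears a fixed threshold) can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: `select_pseudo_label`, the `generate` and `verify` callables, the default values, and the toy stand-ins are all hypothetical, and the LoRA fine-tuning step is omitted.

```python
from typing import Callable, List, Optional, Tuple


def select_pseudo_label(
    query: str,
    generate: Callable[[str, int], List[str]],  # assumed: returns n candidate answers
    verify: Callable[[str, str], float],        # assumed: returns a reliability score
    n_candidates: int = 4,
    threshold: float = 0.5,
) -> Optional[Tuple[str, str]]:
    """Return (query, best_answer) if the best verifier score exceeds the threshold."""
    candidates = generate(query, n_candidates)
    if not candidates:
        return None
    scored = [(verify(query, answer), answer) for answer in candidates]
    best_score, best_answer = max(scored, key=lambda pair: pair[0])
    # Queries whose best candidate is not confident enough are skipped entirely.
    if best_score > threshold:
        return (query, best_answer)
    return None


# Toy stand-ins for the language model and the learned verifier.
def toy_generate(query: str, n: int) -> List[str]:
    return [f"{query} -> guess {i}" for i in range(n)]


def toy_verify(query: str, answer: str) -> float:
    return int(answer[-1]) / 10.0


if __name__ == "__main__":
    print(select_pseudo_label("2+2", toy_generate, toy_verify, 4, 0.25))
    print(select_pseudo_label("2+2", toy_generate, toy_verify, 4, 0.5))
```

Pairs returned by such a function would form the pseudo-labeled set used for test-time training; queries that return `None` contribute nothing.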
-
Balancing Computation Load and Representation Expressivity in Parallel Hybrid Neural Networks
Authors:
Mohammad Mahdi Moradi,
Walid Ahmed,
Shuangyue Wen,
Sudhir Mudur,
Weiwei Zhang,
Yang Liu
Abstract:
Attention and State-Space Models (SSMs), when combined in a hybrid network in sequence or in parallel, provide complementary strengths. A hybrid sequential pipeline alternates between applying a transformer to the input and feeding its output into an SSM. This results in idle periods in the individual components, increasing end-to-end latency and lowering throughput caps. In the parallel hybrid architecture, the transformer operates independently in parallel with the SSM, and these pairs are cascaded, with output from one pair forming the input to the next. Two issues are (i) creating an expressive knowledge representation with the inherently divergent outputs from these separate branches, and (ii) load balancing the computation between these parallel branches while maintaining representation fidelity. In this work we present FlowHN, a novel parallel hybrid network architecture that accommodates various strategies for load balancing, achieved through appropriate distribution of input tokens between the two branches. Two differentiating factors in FlowHN are a FLOP-aware dynamic token split between the attention and SSM branches, yielding an efficient balance in compute load, and a method to fuse the highly divergent outputs from individual branches for enhancing representation expressivity. Together they enable much better token processing speeds, avoid bottlenecks, and at the same time yield significantly improved accuracy compared to other competing works. We conduct comprehensive experiments on autoregressive language modeling for models with 135M, 350M, and 1B parameters. FlowHN outperforms sequential hybrid models and its parallel counterpart, achieving up to 4× higher Tokens per Second (TPS) and 2× better Model FLOPs Utilization (MFU).
Submitted 28 May, 2025; v1 submitted 25 May, 2025;
originally announced May 2025.
-
GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance
Authors:
Mohammad Mahdi Moradi,
Sudhir Mudur
Abstract:
Knowledge-Based Visual Question Answering (KB-VQA) methods focus on tasks that demand reasoning with information extending beyond the explicit content depicted in the image. Early methods relied on explicit knowledge bases to provide this auxiliary information. Recent approaches leverage Large Language Models (LLMs) as implicit knowledge sources. While KB-VQA methods have demonstrated promising results, their potential remains constrained as the auxiliary text provided may not be relevant to the question context, and may also include irrelevant information that could misguide the answer predictor. We introduce a novel four-stage framework called Grounding Caption-Guided Knowledge-Based Visual Question Answering (GC-KBVQA), which enables LLMs to effectively perform zero-shot VQA tasks without the need for end-to-end multimodal training. Innovations include grounding question-aware caption generation to move beyond generic descriptions and have compact, yet detailed and context-rich information. This is combined with knowledge from external sources to create highly informative prompts for the LLM. GC-KBVQA can address a variety of VQA tasks, and does not require task-specific fine-tuning, thus reducing both costs and deployment complexity by leveraging general-purpose, pre-trained LLMs. Comparison with competing KB-VQA methods shows significantly improved performance. Our code will be made public.
Submitted 25 May, 2025;
originally announced May 2025.
-
PAC codes with Bounded-Complexity Sequential Decoding: Pareto Distribution and Code Design
Authors:
Mohsen Moradi,
Hessam Mahdavifar
Abstract:
Recently, a novel variation of polar codes known as polarization-adjusted convolutional (PAC) codes has been introduced by Arıkan. These codes significantly outperform conventional polar and convolutional codes, particularly for short codeword lengths, and are shown to operate very close to the optimal bounds. It has also been shown that if the rate profile of PAC codes does not adhere to certain polarized cutoff rate constraints, the computation complexity for their sequential decoding grows exponentially. In this paper, we address the converse problem, demonstrating that if the rate profile of a PAC code follows the polarized cutoff rate constraints, the required computations for its sequential decoding can be bounded with a distribution that follows a Pareto distribution. This serves as a guideline for the rate-profile design of PAC codes. For a high-rate PAC\,$(1024,899)$ code, simulation results show that the PAC code with Fano decoder, when constructed based on the polarized cutoff rate constraints, achieves a coding gain of more than $0.75$ dB at a frame error rate (FER) of $10^{-5}$ compared to the state-of-the-art 5G polar and LDPC codes.
Submitted 8 December, 2024;
originally announced December 2024.
-
Multi-armed Bandits with Missing Outcome
Authors:
Ilia Mahrooghi,
Mahshad Moradi,
Sina Akbari,
Negar Kiyavash
Abstract:
While significant progress has been made in designing algorithms that minimize regret in online decision-making, real-world scenarios often introduce additional complexities, perhaps the most challenging of which is missing outcomes. Overlooking this aspect or simply assuming random missingness invariably leads to biased estimates of the rewards and may result in linear regret. Despite the practical relevance of this challenge, no rigorous methodology currently exists for systematically handling missingness, especially when the missingness mechanism is not random. In this paper, we address this gap in the context of multi-armed bandits (MAB) with missing outcomes by analyzing the impact of different missingness mechanisms on achievable regret bounds. We introduce algorithms that account for missingness under both missing at random (MAR) and missing not at random (MNAR) models. Through both analytical and simulation studies, we demonstrate the drastic improvements in decision-making by accounting for missingness in these settings.
Submitted 8 November, 2024;
originally announced November 2024.
-
Kolmogorov-Arnold Network Autoencoders
Authors:
Mohammadamin Moradi,
Shirin Panahi,
Erik Bollt,
Ying-Cheng Lai
Abstract:
Deep learning models have revolutionized various domains, with Multi-Layer Perceptrons (MLPs) being a cornerstone for tasks like data regression and image classification. However, a recent study has introduced Kolmogorov-Arnold Networks (KANs) as promising alternatives to MLPs, leveraging activation functions placed on edges rather than nodes. This structural shift aligns KANs closely with the Kolmogorov-Arnold representation theorem, potentially enhancing both model accuracy and interpretability. In this study, we explore the efficacy of KANs in the context of data representation via autoencoders, comparing their performance with traditional Convolutional Neural Networks (CNNs) on the MNIST, SVHN, and CIFAR-10 datasets. Our results demonstrate that KAN-based autoencoders achieve competitive performance in terms of reconstruction accuracy, thereby suggesting their viability as effective tools in data analysis tasks.
Submitted 2 October, 2024;
originally announced October 2024.
-
Data-driven model discovery with Kolmogorov-Arnold networks
Authors:
Mohammadamin Moradi,
Shirin Panahi,
Erik M. Bollt,
Ying-Cheng Lai
Abstract:
Data-driven model discovery of complex dynamical systems is typically done using sparse optimization, but it has a fundamental limitation: it requires sparsity, in that the underlying governing equations of the system contain only a small number of elementary mathematical terms. Examples where sparse optimization fails abound, such as the classic Ikeda or optical-cavity map in nonlinear dynamics and a large variety of ecosystems. Exploiting the recently articulated Kolmogorov-Arnold networks, we develop a general model-discovery framework for any dynamical system, including those that do not satisfy the sparsity condition. In particular, we demonstrate non-uniqueness in that a large number of approximate models of the system can be found which generate the same invariant set with the correct statistics such as the Lyapunov exponents and Kullback-Leibler divergence. An analogy to shadowing of numerical trajectories in chaotic systems is pointed out.
Submitted 23 September, 2024;
originally announced September 2024.
-
AIM 2024 Challenge on Video Saliency Prediction: Methods and Results
Authors:
Andrey Moskalenko,
Alexey Bryncev,
Dmitry Vatolin,
Radu Timofte,
Gen Zhan,
Li Yang,
Yunlong Tang,
Yiting Liao,
Jiongzhi Lin,
Baitao Huang,
Morteza Moradi,
Mohammad Moradi,
Francesco Rundo,
Concetto Spampinato,
Ali Borji,
Simone Palazzo,
Yuxin Zhu,
Yinan Sun,
Huiyu Duan,
Yuqin Cao,
Ziheng Jia,
Qiang Hu,
Xiongkuo Min,
Guangtao Zhai,
Hao Fang
, et al. (8 additional authors not shown)
Abstract:
This paper reviews the Challenge on Video Saliency Prediction at AIM 2024. The goal of the participants was to develop a method for predicting accurate saliency maps for the provided set of video sequences. Saliency maps are widely exploited in various applications, including video compression, quality assessment, visual perception studies, the advertising industry, etc. For this competition, a previously unused large-scale audio-visual mouse saliency (AViMoS) dataset of 1500 videos with more than 70 observers per video was collected using crowdsourced mouse tracking. The dataset collection methodology has been validated using conventional eye-tracking data and has shown high consistency. Over 30 teams registered in the challenge, and there are 7 teams that submitted the results in the final phase. The final phase solutions were tested and ranked by commonly used quality metrics on a private test subset. The results of this evaluation and the descriptions of the solutions are presented in this report. All data, including the private test subset, is made publicly available on the challenge homepage - https://challenges.videoprocessing.ai/challenges/video-saliency-prediction.html.
Submitted 23 September, 2024;
originally announced September 2024.
-
On Fast SC-based Polar Decoders: Metric Polarization and a Pruning Technique
Authors:
Mohsen Moradi,
Hessam Mahdavifar
Abstract:
Short- to medium-block-length polar-like and polarization-adjusted convolutional (PAC) codes have demonstrated exceptional error-correction performance through sequential decoding. Successive cancellation list (SCL) decoding of polar-like and PAC codes can potentially match the performance of sequential decoding though a relatively large list size is often required. By benefiting from an optimal metric function, sequential decoding can find the correct path corresponding to the transmitted data by following almost one path on average at high Eb/N0 regimes. When considering a large number of paths in SCL decoding, a main bottleneck emerges that is the need for a rather expensive sorting operation at each level of decoding of data bits. In this paper, we propose a method to obtain the optimal metric function for each depth of the polarization tree through a process that we call polarization of the metric function. One of the major advantages of the proposed metric function is that it can be utilized in fast SC-based (FSC) and SCL-based (FSCL) decoders, i.e., decoders that opt to skip the so-called rate-1 and rate-0 nodes in the binary tree representation for significantly more efficient implementation. Furthermore, based on the average value of the polarized metric function of FSC-based decoders, we introduce a pruning technique that keeps only the paths whose metric values are close to the average value. As a result, our proposed technique significantly reduces the number of required sorting operations for FSCL-based decoding algorithms. For instance, for a high-rate PAC(128,99) code, SCL decoding with a list size of 32 achieves error-correction performance comparable to the Fano algorithm. Our method reduces the number of sorting operations of FSCL decoding to 33%, further decreasing latency.
Submitted 19 March, 2025; v1 submitted 7 August, 2024;
originally announced August 2024.
-
Artificial intelligence for context-aware visual change detection in software test automation
Authors:
Milad Moradi,
Ke Yan,
David Colwell,
Rhona Asgari
Abstract:
Automated software testing is integral to the software development process, streamlining workflows and ensuring product reliability. Visual testing within this context, especially concerning user interface (UI) and user experience (UX) validation, stands as one of the crucial determinants of overall software quality. Nevertheless, conventional methods like pixel-wise comparison and region-based visual change detection fall short in capturing contextual similarities and nuanced alterations and in understanding the spatial relationships between UI elements. In this paper, we introduce a novel graph-based method for visual change detection in software test automation. Leveraging a machine learning model, our method accurately identifies UI controls from software screenshots and constructs a graph representing contextual and spatial relationships between the controls. This information is then used to find correspondence between UI controls within screenshots of different versions of the software. The resulting graph encapsulates the intricate layout of the UI and underlying contextual relations, providing a holistic and context-aware model. This model is finally used to detect and highlight visual regressions in the UI. Comprehensive experiments on different datasets showed that our change detector can accurately detect visual software changes in various simple and complex test scenarios. Moreover, it outperformed pixel-wise comparison and region-based baselines by a large margin in more complex testing scenarios. This work not only contributes to the advancement of visual change detection but also holds practical implications, offering a robust solution for real-world software test automation challenges, enhancing reliability, and ensuring the seamless evolution of software interfaces.
Submitted 1 May, 2024;
originally announced May 2024.
-
Enhancing Predictive Accuracy in Pharmaceutical Sales Through An Ensemble Kernel Gaussian Process Regression Approach
Authors:
Shahin Mirshekari,
Mohammadreza Moradi,
Hossein Jafari,
Mehdi Jafari,
Mohammad Ensaf
Abstract:
This research employs Gaussian Process Regression (GPR) with an ensemble kernel, integrating Exponential Squared, Revised Matérn, and Rational Quadratic kernels to analyze pharmaceutical sales data. Bayesian optimization was used to identify optimal kernel weights: 0.76 for Exponential Squared, 0.21 for Revised Matérn, and 0.13 for Rational Quadratic. The ensemble kernel demonstrated superior performance in predictive accuracy, achieving an \( R^2 \) score near 1.0, and significantly lower values in Mean Squared Error (MSE), Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE). These findings highlight the efficacy of ensemble kernels in GPR for predictive analytics in complex pharmaceutical sales datasets.
Submitted 14 April, 2024;
originally announced April 2024.
-
Exploring the landscape of large language models: Foundations, techniques, and challenges
Authors:
Milad Moradi,
Ke Yan,
David Colwell,
Matthias Samwald,
Rhona Asgari
Abstract:
In this review paper, we delve into the realm of Large Language Models (LLMs), covering their foundational principles, diverse applications, and nuanced training processes. The article sheds light on the mechanics of in-context learning and a spectrum of fine-tuning approaches, with a special focus on methods that optimize efficiency in parameter usage. Additionally, it explores how LLMs can be more closely aligned with human preferences through innovative reinforcement learning frameworks and other novel methods that incorporate human feedback. The article also examines the emerging technique of retrieval augmented generation, integrating external knowledge into LLMs. The ethical dimensions of LLM deployment are discussed, underscoring the need for mindful and responsible application. Concluding with a perspective on future research trajectories, this review offers a succinct yet comprehensive overview of the current state and emerging trends in the evolving landscape of LLMs, serving as an insightful guide for both researchers and practitioners in artificial intelligence.
Submitted 18 April, 2024;
originally announced April 2024.
-
Socially Pertinent Robots in Gerontological Healthcare
Authors:
Xavier Alameda-Pineda,
Angus Addlesee,
Daniel Hernández García,
Chris Reinke,
Soraya Arias,
Federica Arrigoni,
Alex Auternaud,
Lauriane Blavette,
Cigdem Beyan,
Luis Gomez Camara,
Ohad Cohen,
Alessandro Conti,
Sébastien Dacunha,
Christian Dondrup,
Yoav Ellinson,
Francesco Ferro,
Sharon Gannot,
Florian Gras,
Nancie Gunson,
Radu Horaud,
Moreno D'Incà,
Imad Kimouche,
Séverin Lemaignan,
Oliver Lemon,
Cyril Liotard
, et al. (19 additional authors not shown)
Abstract:
Despite the many recent achievements in developing and deploying social robotics, there are still many underexplored environments and applications for which systematic evaluation of such systems by end-users is necessary. While several robotic platforms have been used in gerontological healthcare, the question of whether or not a social interactive robot with multi-modal conversational capabilities will be useful and accepted in real-life facilities is yet to be answered. This paper is an attempt to partially answer this question, via two waves of experiments with patients and companions in a day-care gerontological facility in Paris with a full-sized humanoid robot endowed with social and conversational interaction capabilities. The software architecture, developed during the H2020 SPRING project, together with the experimental protocol, allowed us to evaluate the acceptability (AES) and usability (SUS) with more than 60 end-users. Overall, the users are receptive to this technology, especially when the robot perception and action skills are robust to environmental clutter and flexible to handle a plethora of different interactions.
Submitted 11 February, 2025; v1 submitted 11 April, 2024;
originally announced April 2024.
-
SalFoM: Dynamic Saliency Prediction with Video Foundation Models
Authors:
Morteza Moradi,
Mohammad Moradi,
Francesco Rundo,
Concetto Spampinato,
Ali Borji,
Simone Palazzo
Abstract:
Recent advancements in video saliency prediction (VSP) have shown promising performance compared to the human visual system, whose emulation is the primary goal of VSP. However, current state-of-the-art models employ spatio-temporal transformers trained on limited amounts of data, hindering generalizability and adaptation to downstream tasks. The benefits of vision foundation models present a potential solution to improve the VSP process. However, adapting image foundation models to the video domain presents significant challenges in modeling scene dynamics and capturing temporal information. To address these challenges, and as the first initiative to design a VSP model based on video foundation models, we introduce SalFoM, a novel encoder-decoder video transformer architecture. Our model employs UnMasked Teacher (UMT) as a feature extractor and presents a heterogeneous decoder which features a locality-aware spatio-temporal transformer and integrates local and global spatio-temporal information from various perspectives to produce the final saliency map. Our qualitative and quantitative experiments on the challenging VSP benchmark datasets of DHF1K, Hollywood-2 and UCF-Sports demonstrate the superiority of our proposed model in comparison with the state-of-the-art methods.
Submitted 3 April, 2024;
originally announced April 2024.
-
Machine-learning prediction of tipping with applications to the Atlantic Meridional Overturning Circulation
Authors:
Shirin Panahi,
Ling-Wei Kong,
Mohammadamin Moradi,
Zheng-Meng Zhai,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Anticipating a tipping point, a transition from one stable steady state to another, is a problem of broad relevance due to the ubiquity of the phenomenon in diverse fields. The steady-state nature of the dynamics about a tipping point makes its prediction significantly more challenging than predicting other types of critical transitions from oscillatory or chaotic dynamics. Exploiting the benefits of noise, we develop a general data-driven and machine-learning approach to predicting potential future tipping in nonautonomous dynamical systems and validate the framework using examples from different fields. As an application, we address the problem of predicting the potential collapse of the Atlantic Meridional Overturning Circulation (AMOC), possibly driven by climate-induced changes in the freshwater input to the North Atlantic. Our predictions based on synthetic and currently available empirical data place a potential collapse window spanning from 2040 to 2065, consistent with results in the current literature.
Submitted 17 October, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Random forests for detecting weak signals and extracting physical information: a case study of magnetic navigation
Authors:
Mohammadamin Moradi,
Zheng-Meng Zhai,
Aaron Nielsen,
Ying-Cheng Lai
Abstract:
It was recently demonstrated that two machine-learning architectures, reservoir computing and time-delayed feed-forward neural networks, can be exploited for detecting the Earth's anomaly magnetic field immersed in overwhelming complex signals for magnetic navigation in a GPS-denied environment. The accuracy of the detected anomaly field corresponds to a positioning accuracy in the range of 10 to 40 meters. To increase the accuracy and reduce the uncertainty of weak signal detection as well as to directly obtain the position information, we exploit the machine-learning model of random forests that combines the output of multiple decision trees to give optimal values of the physical quantities of interest. In particular, from time-series data gathered from the cockpit of a flying airplane during various maneuvering stages, where strong background complex signals are caused by other elements of the Earth's magnetic field and the fields produced by the electronic systems in the cockpit, we demonstrate that the random-forest algorithm performs remarkably well in detecting the weak anomaly field and in filtering the position of the aircraft. With the aid of the conventional inertial navigation system, the positioning error can be reduced to less than 10 meters. We also find that, contrary to the conventional wisdom, the classic Tolles-Lawson model for calibrating and removing the magnetic field generated by the body of the aircraft is not necessary and may even be detrimental for the success of the random-forest method.
Submitted 21 February, 2024;
originally announced February 2024.
-
PAC Code Rate-Profile Design Using Search-Constrained Optimization Algorithms
Authors:
Mohsen Moradi,
David G. M. Mitchell
Abstract:
In this paper, we introduce a novel rate-profile design based on search-constrained optimization techniques to assess the performance of polarization-adjusted convolutional (PAC) codes under Fano (sequential) decoding. The results demonstrate that the resulting PAC code offers much reduced computational complexity compared to a construction based on a conventional genetic algorithm, without a loss in error-correction performance. As the fitness function of our algorithm, we propose an adaptive successive cancellation list decoding algorithm to determine the weight distribution of the rate profiles. The simulation results indicate that, for a PAC(256, 128) code, only 8% of the population requires that their fitness function be evaluated with a large list size. This represents an improvement of almost 92% over a conventional evolutionary algorithm. For a PAC(64, 32) code, this improvement is about 99%. We also plot the performance of the high-rate PAC(128, 105) and PAC(64, 51) codes, and the results show that they exhibit superior performance compared to other algorithms.
Submitted 18 January, 2024;
originally announced January 2024.
-
Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding
Authors:
Morteza Moradi,
Simone Palazzo,
Concetto Spampinato
Abstract:
In recent years, finding an effective and efficient strategy for exploiting spatial and temporal information has been a hot research topic in video saliency prediction (VSP). With the emergence of spatio-temporal transformers, the weakness of prior strategies, e.g., 3D convolutional networks and LSTM-based networks, in capturing long-range dependencies has been effectively compensated for. While VSP has drawn benefits from spatio-temporal transformers, finding the most effective way of aggregating temporal features is still challenging. To address this concern, we propose a transformer-based video saliency prediction approach with a high temporal dimension decoding network (THTD-Net). This strategy accounts for the lack of complex hierarchical interactions between features that are extracted from the transformer-based spatio-temporal encoder: in particular, it does not require multiple decoders and aims at gradually reducing temporal features' dimensions in the decoder. This decoder-based architecture yields comparable performance to multi-branch and over-complicated models on common benchmarks such as DHF1K, UCF-Sports and Hollywood-2.
Submitted 15 January, 2024;
originally announced January 2024.
-
Single-Microphone Speaker Separation and Voice Activity Detection in Noisy and Reverberant Environments
Authors:
Renana Opochinsky,
Mordehay Moradi,
Sharon Gannot
Abstract:
Speech separation involves extracting an individual speaker's voice from a multi-speaker audio signal. The increasing complexity of real-world environments, where multiple speakers might converse simultaneously, underscores the importance of effective speech separation techniques. This work presents a single-microphone speaker separation network with TF attention aimed at noisy and reverberant environments. We dub this new architecture the Separation TF Attention Network (Sep-TFAnet). In addition, we present a variant of the separation network, dubbed $ \text{Sep-TFAnet}^{\text{VAD}}$, which incorporates a voice activity detector (VAD) into the separation network.
The separation module is based on a temporal convolutional network (TCN) backbone inspired by the Conv-Tasnet architecture with multiple modifications. Rather than a learned encoder and decoder, we use short-time Fourier transform (STFT) and inverse short-time Fourier transform (iSTFT) for the analysis and synthesis, respectively. Our system is specially developed for human-robotic interactions and should support online mode. The separation capabilities of $ \text{Sep-TFAnet}^{\text{VAD}}$ and Sep-TFAnet were evaluated and extensively analyzed under several acoustic conditions, demonstrating their advantages over competing methods. Since separation networks trained on simulated data tend to perform poorly on real recordings, we also demonstrate the ability of the proposed scheme to better generalize to realistic examples recorded in our acoustic lab by a humanoid robot. Project page: https://Sep-TFAnet.github.io
Submitted 7 January, 2024;
originally announced January 2024.
-
Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling
Authors:
Fahime Shahrokh,
Nasser Ghadiri,
Rasoul Samani,
Milad Moradi
Abstract:
Biomedical Named Entity Recognition (NER) is a fundamental task of Biomedical Natural Language Processing for extracting relevant information from biomedical texts, such as clinical records, scientific publications, and electronic health records. The conventional approaches for biomedical NER mainly use traditional machine learning techniques, such as Conditional Random Fields and Support Vector Machines, or deep learning-based models like Recurrent Neural Networks and Convolutional Neural Networks. Recently, Transformer-based models, including BERT, have been used in the domain of biomedical NER and have demonstrated remarkable results. However, these models are often based on word-level embeddings, limiting their ability to capture character-level information, which is effective in biomedical NER due to the high variability and complexity of biomedical texts. To address these limitations, this paper proposes a hybrid approach that integrates the strengths of multiple models. The approach leverages a fine-tuned BERT to provide contextualized word embeddings and a pre-trained multi-channel CNN to capture character-level information, followed by a BiLSTM + CRF for sequence labelling and modelling dependencies between the words in the text. In addition, we propose an enhanced labelling method as part of pre-processing to enhance the identification of the entity's beginning word and thus improve the identification of multi-word entities, a common challenge in biomedical NER. By integrating these models and the pre-processing method, our proposed model effectively captures both contextual information and detailed character-level information. We evaluated our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11. These results illustrate the proficiency of our proposed model in performing biomedical Named Entity Recognition.
Submitted 24 December, 2023;
originally announced December 2023.
-
Machine-learning parameter tracking with partial state observation
Authors:
Zheng-Meng Zhai,
Mohammadamin Moradi,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Complex and nonlinear dynamical systems often involve parameters that change with time, accurate tracking of which is essential to tasks such as state estimation, prediction, and control. Existing machine-learning methods require full state observation of the underlying system and tacitly assume adiabatic changes in the parameter. Formulating an inverse problem and exploiting reservoir computing, we develop a model-free and fully data-driven framework to accurately track time-varying parameters from partial state observation in real time. In particular, with training data from a subset of the dynamical variables of the system for a small number of known parameter values, the framework is able to accurately predict the parameter variations in time. Low- and high-dimensional, Markovian and non-Markovian nonlinear dynamical systems are used to demonstrate the power of the machine-learning based parameter-tracking framework. Pertinent issues affecting the tracking performance are addressed.
Submitted 15 November, 2023;
originally announced November 2023.
-
Multilevel User Credibility Assessment in Social Networks
Authors:
Mohammad Moradi,
Mostafa Haghir Chehreghani
Abstract:
Online social networks are major platforms for disseminating both real and fake news. Many users, intentionally or unintentionally, spread harmful content, fake news, and rumors in fields such as politics and business. Consequently, numerous studies have been conducted in recent years to assess user credibility. A significant shortcoming of most existing methods is that they categorize users as either real or fake. However, in real-world applications, it is often more desirable to consider several levels of user credibility. Another limitation is that existing approaches only utilize a portion of important features, which reduces their performance. In this paper, due to the lack of an appropriate dataset for multilevel user credibility assessment, we first design a method to collect data suitable for assessing credibility at multiple levels. Then, we develop the MultiCred model, which places users at one of several levels of credibility based on a rich and diverse set of features extracted from users' profiles, tweets, and comments. MultiCred leverages deep language models to analyze textual data and deep neural models to process non-textual features. Our extensive experiments reveal that MultiCred significantly outperforms existing approaches in terms of several accuracy measures.
Submitted 11 January, 2025; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Model-free tracking control of complex dynamical trajectories with machine learning
Authors:
Zheng-Meng Zhai,
Mohammadamin Moradi,
Ling-Wei Kong,
Bryan Glaz,
Mulugeta Haile,
Ying-Cheng Lai
Abstract:
Nonlinear tracking control enabling a dynamical system to track a desired trajectory is fundamental to robotics, serving a wide range of civil and defense applications. In control engineering, designing tracking control requires complete knowledge of the system model and equations. We develop a model-free, machine-learning framework to control a two-arm robotic manipulator using only partially observed states, where the controller is realized by reservoir computing. Stochastic input is exploited for training, which consists of the observed partial state vector as the first and its immediate future as the second component so that the neural machine regards the latter as the future state of the former. In the testing (deployment) phase, the immediate-future component is replaced by the desired observational vector from the reference trajectory. We demonstrate the effectiveness of the control framework using a variety of periodic and chaotic signals, and establish its robustness against measurement noise, disturbances, and uncertainties.
Submitted 20 September, 2023;
originally announced September 2023.
-
Polarization-Adjusted Convolutional (PAC) Codes as a Concatenation of Inner Cyclic and Outer Polar- and Reed-Muller-like Codes
Authors:
Mohsen Moradi
Abstract:
Polarization-adjusted convolutional (PAC) codes are a new family of linear block codes that can perform close to the theoretical bounds in the short block-length regime. These codes combine polar coding and convolutional coding. In this study, we show that PAC codes are equivalent to a new class of codes consisting of inner cyclic codes and outer polar- and Reed-Muller-like codes. We leverage the properties of cyclic codes to establish that PAC codes outperform polar- and Reed-Muller-like codes in terms of minimum distance.
Submitted 2 August, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
-
Model-agnostic explainable artificial intelligence for object detection in image data
Authors:
Milad Moradi,
Ke Yan,
David Colwell,
Matthias Samwald,
Rhona Asgari
Abstract:
In recent years, deep neural networks have been widely used for building high-performance Artificial Intelligence (AI) systems for computer vision applications. Object detection is a fundamental task in computer vision, which has progressed greatly through the development of large and intricate AI models. However, the lack of transparency is a big challenge that may hinder the widespread adoption of these models. Explainable artificial intelligence is a field of research where methods are developed to help users understand the behavior, decision logic, and vulnerabilities of AI systems. Previously, a few explanation methods based on random masking were developed for object detection. However, random masks may raise some issues regarding the actual importance of pixels within an image. In this paper, we design and implement a black-box explanation method named Black-box Object Detection Explanation by Masking (BODEM) by adopting a hierarchical random masking approach for object detection systems. We propose a hierarchical random masking framework in which coarse-grained masks are used in lower levels to find salient regions within an image, and fine-grained masks are used to refine the salient regions in higher levels. Experiments on various object detection datasets and models showed that BODEM can effectively explain the behavior of object detectors. Moreover, our method outperformed Detector Randomized Input Sampling for Explanation (D-RISE) and Local Interpretable Model-agnostic Explanations (LIME) with respect to different quantitative measures of explanation effectiveness. The experimental results demonstrate that BODEM can be an effective method for explaining and validating object detection systems in black-box testing scenarios.
Submitted 4 September, 2024; v1 submitted 30 March, 2023;
originally announced March 2023.
-
ThoughtSource: A central hub for large language model reasoning data
Authors:
Simon Ott,
Konstantin Hebenstreit,
Valentin Liévin,
Christoffer Egeberg Hother,
Milad Moradi,
Maximilian Mayrhauser,
Robert Praas,
Ole Winther,
Matthias Samwald
Abstract:
Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results across a wide range of tasks. LLMs are still limited, however, in that they frequently fail at complex reasoning, their reasoning processes are opaque, they are prone to 'hallucinate' facts, and there are concerns about their underlying biases. Letting models verbalize reasoning steps as natural language, a technique known as chain-of-thought prompting, has recently been proposed as a way to address some of these issues. Here we present ThoughtSource, a meta-dataset and software library for chain-of-thought (CoT) reasoning. The goal of ThoughtSource is to improve future artificial intelligence systems by facilitating qualitative understanding of CoTs, enabling empirical evaluations, and providing training data. This first release of ThoughtSource integrates seven scientific/medical, three general-domain and five math word question answering datasets.
Submitted 27 July, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
TinyHD: Efficient Video Saliency Prediction with Heterogeneous Decoders using Hierarchical Maps Distillation
Authors:
Feiyan Hu,
Simone Palazzo,
Federica Proietto Salanitri,
Giovanni Bellitto,
Morteza Moradi,
Concetto Spampinato,
Kevin McGuinness
Abstract:
Video saliency prediction has recently attracted the attention of the research community, as it is an upstream task for several practical applications. However, current solutions are particularly computationally demanding, especially due to the wide usage of spatio-temporal 3D convolutions. We observe that, while different model architectures achieve similar performance on benchmarks, visual variations between predicted saliency maps are still significant. Inspired by this intuition, we propose a lightweight model that employs multiple simple heterogeneous decoders and adopts several practical approaches to improve accuracy while keeping computational costs low, such as hierarchical multi-map knowledge distillation, multi-output saliency prediction, unlabeled auxiliary datasets and channel reduction with teacher assistant supervision. Our approach achieves saliency prediction accuracy on par with or better than state-of-the-art methods on DHF1K, UCF-Sports and Hollywood2 benchmarks, while significantly enhancing the efficiency of the model. Code is available at https://github.com/feiyanhu/tinyHD
Submitted 11 January, 2023;
originally announced January 2023.
-
Towards Automatic Prediction of Outcome in Treatment of Cerebral Aneurysms
Authors:
Ashutosh Jadhav,
Satyananda Kashyap,
Hakan Bulu,
Ronak Dholakia,
Amon Y. Liu,
Tanveer Syeda-Mahmood,
William R. Patterson,
Hussain Rangwala,
Mehdi Moradi
Abstract:
Intrasaccular flow disruptors treat cerebral aneurysms by diverting the blood flow from the aneurysm sac. Residual flow into the sac after the intervention is a failure that could be due to the use of an undersized device, or to the vascular anatomy and clinical condition of the patient. We report a machine learning model, based on over 100 clinical and imaging features, that predicts the outcome of wide-neck bifurcation aneurysm treatment with an intravascular embolization device. We combine clinical features with a diverse set of common and novel imaging measurements within a random forest model. We also develop neural network segmentation algorithms in 2D and 3D to contour the sac in angiographic images and automatically calculate the imaging features. These deliver 90% overlap with manual contouring in 2D and 83% in 3D. Our predictive model classifies complete vs. partial occlusion outcomes with an accuracy of 75.31% and a weighted F1-score of 0.74.
Submitted 18 November, 2022;
originally announced November 2022.
-
Application of Guessing to Sequential Decoding of Polarization-Adjusted Convolutional (PAC) Codes
Authors:
Mohsen Moradi
Abstract:
Despite the extreme error-correction performance, the amount of computation of sequential decoding of the polarization-adjusted convolutional (PAC) codes is random. In sequential decoding of convolutional codes, the computational cutoff rate denotes the boundary between rates for which the average computational complexity of decoding is finite and those for which it is infinite. In this paper, by benefiting from the polarization and guessing techniques, we prove that the computational cutoff rate in sequential decoding of pre-transformed polar codes polarizes. The polarization of the computational cutoff rate affects the criteria for the rate-profile construction of the pre-transformed polar codes. We propose a technique for taming the Reed-Muller (RM) rate-profile construction, and the performance results demonstrate that the error-correction performance of the PAC codes can achieve the theoretical bounds using the tamed-RM rate-profile construction, which requires a significantly lower computational complexity than the RM rate-profile construction.
Submitted 29 November, 2022; v1 submitted 8 August, 2022;
originally announced August 2022.
-
CheXRelNet: An Anatomy-Aware Model for Tracking Longitudinal Relationships between Chest X-Rays
Authors:
Gaurang Karwande,
Amarachi Mbakawe,
Joy T. Wu,
Leo A. Celi,
Mehdi Moradi,
Ismini Lourentzou
Abstract:
Despite the progress in utilizing deep learning to automate chest radiograph interpretation and disease diagnosis tasks, change between sequential Chest X-rays (CXRs) has received limited attention. Monitoring the progression of pathologies that are visualized through chest imaging poses several challenges in anatomical motion estimation and image registration, i.e., spatially aligning the two images and modeling temporal dynamics in change detection. In this work, we propose CheXRelNet, a neural model that can track longitudinal pathology change relations between two CXRs. CheXRelNet incorporates local and global visual features, utilizes inter-image and intra-image anatomical information, and learns dependencies between anatomical region attributes, to accurately predict disease change for a pair of CXRs. Experimental results on the Chest ImaGenome dataset show increased downstream performance compared to baselines. Code is available at https://github.com/PLAN-Lab/ChexRelNet
Submitted 15 September, 2022; v1 submitted 7 August, 2022;
originally announced August 2022.
-
A Tree Pruning Technique for Decoding Complexity Reduction of Polar Codes and PAC Codes
Authors:
Mohsen Moradi,
Amir Mozammel
Abstract:
The sorting operation is one of the main bottlenecks of successive-cancellation list (SCL) decoding. This paper introduces an improvement to SCL decoding for polar and pre-transformed polar codes that reduces the number of sorting operations without degrading the code's error-correction performance. In SCL decoding with an optimum metric function, we show that, on average, the correct branch's bit-metric value must be equal to the bit-channel capacity, whereas the average bit-metric value of a wrong branch can be at most zero. This implies that a wrong path's partial path metric value deviates from the bit-channel capacity's partial summation. For relatively reliable bit-channels, the bit metric for a wrong branch becomes a very large negative number, which enables us to detect and prune such paths. We prove that, for a threshold lower than the bit-channel cutoff rate, the probability of pruning the correct path decreases exponentially with the given threshold. Based on these findings, we present a pruning technique, and the experimental results demonstrate a substantial decrease in the number of sorting procedures required for SCL decoding. In the stack algorithm, a similar technique is used to significantly reduce the average number of paths in the stack.
Submitted 25 July, 2022;
originally announced July 2022.
-
A global analysis of metrics used for measuring performance in natural language processing
Authors:
Kathrin Blagec,
Georg Dorffner,
Milad Moradi,
Simon Ott,
Matthias Samwald
Abstract:
Measuring the performance of natural language processing models is challenging. Traditionally used metrics, such as BLEU and ROUGE, originally devised for machine translation and summarization, have been shown to suffer from low correlation with human judgment and a lack of transferability to other tasks and languages. In the past 15 years, a wide range of alternative metrics have been proposed. However, it is unclear to what extent this has had an impact on NLP benchmarking efforts. Here we provide the first large-scale cross-sectional analysis of metrics used for measuring performance in natural language processing. We curated, mapped and systematized more than 3500 machine learning model performance results from the open repository 'Papers with Code' to enable a global and comprehensive analysis. Our results suggest that the large majority of natural language processing metrics currently used have properties that may result in an inadequate reflection of a model's performance. Furthermore, we found that ambiguities and inconsistencies in the reporting of metrics may lead to difficulties in interpreting and comparing model performances, impairing transparency and reproducibility in NLP research.
Submitted 25 April, 2022;
originally announced April 2022.
-
Deep Learning, Natural Language Processing, and Explainable Artificial Intelligence in the Biomedical Domain
Authors:
Milad Moradi,
Matthias Samwald
Abstract:
In this article, we first give an introduction to artificial intelligence and its applications in biology and medicine in Section 1. Deep learning methods are then described in Section 2. We narrow the focus of the study to textual data in Section 3, where natural language processing and its applications in the biomedical domain are described. In Section 4, we give an introduction to explainable artificial intelligence and discuss the importance of explainability of artificial intelligence systems, especially in the biomedical domain.
Submitted 7 March, 2022; v1 submitted 25 February, 2022;
originally announced February 2022.
-
3D Segmentation with Fully Trainable Gabor Kernels and Pearson's Correlation Coefficient
Authors:
Ken C. L. Wong,
Mehdi Moradi
Abstract:
The convolutional layer and loss function are two fundamental components in deep learning. Because of the success of conventional deep learning kernels, the less versatile Gabor kernels have become less popular despite the fact that they can provide abundant features at different frequencies, orientations, and scales with far fewer parameters. For existing loss functions for multi-class image segmentation, there is usually a tradeoff among accuracy, robustness to hyperparameters, and manual weight selections for combining different losses. Therefore, to gain the benefits of using Gabor kernels while keeping the advantage of automatic feature generation in deep learning, we propose a fully trainable Gabor-based convolutional layer where all Gabor parameters are trainable through backpropagation. Furthermore, we propose a loss function based on Pearson's correlation coefficient, which is accurate, robust to learning rates, and does not require manual weight selections. Experiments on 43 3D brain magnetic resonance images with 19 anatomical structures show that, using the proposed loss function with a proper combination of conventional and Gabor-based kernels, we can train a network with only 1.6 million parameters to achieve an average Dice coefficient of 83%. This size is 44 times smaller than the original V-Net, which has 71 million parameters. This paper demonstrates the potential of using learnable parametric kernels in deep learning for 3D segmentation.
Submitted 15 December, 2022; v1 submitted 10 January, 2022;
originally announced January 2022.
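To illustrate the idea of a loss built on Pearson's correlation coefficient, here is a minimal NumPy sketch. It is not the authors' formulation, which the abstract does not spell out: the function name `pearson_loss`, the `eps` stabilizer, and the choice of `1 - r` as the loss value are all assumptions made for illustration.

```python
import numpy as np

def pearson_loss(pred, target, eps=1e-8):
    """Illustrative loss: 1 - Pearson's r between prediction and target.

    Assumption: both inputs are arrays of the same shape (for example,
    per-voxel scores and a binary mask for a single class).
    """
    p = np.asarray(pred, dtype=float).ravel()
    t = np.asarray(target, dtype=float).ravel()
    # Center both signals so the dot product below is a covariance term.
    p = p - p.mean()
    t = t - t.mean()
    # Pearson's r; eps guards against division by zero for constant inputs.
    r = (p @ t) / (np.sqrt((p @ p) * (t @ t)) + eps)
    return 1.0 - r

mask = np.array([0.0, 1.0, 1.0, 0.0])
perfect = pearson_loss(mask, mask)         # close to 0: perfectly correlated
inverted = pearson_loss(1.0 - mask, mask)  # close to 2: perfectly anti-correlated
```

Because the correlation is invariant to the scale and offset of the prediction, a loss of this shape needs no per-class weighting, which is consistent with the abstract's claim that no manual weight selection is required.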
-
Improving the robustness and accuracy of biomedical language models through adversarial training
Authors:
Milad Moradi,
Matthias Samwald
Abstract:
Deep transformer neural network models have improved the predictive accuracy of intelligent text processing systems in the biomedical domain. They have obtained state-of-the-art performance scores on a wide variety of biomedical and clinical Natural Language Processing (NLP) benchmarks. However, the robustness and reliability of these models have been less explored so far. Neural NLP models can be easily fooled by adversarial samples, i.e. minor changes to input that preserve the meaning and understandability of the text but force the NLP system to make erroneous decisions. This raises serious concerns about the security and trustworthiness of biomedical NLP systems, especially when they are intended to be deployed in real-world use cases. We investigated the robustness of several transformer neural language models, i.e. BioBERT, SciBERT, BioMed-RoBERTa, and Bio-ClinicalBERT, on a wide range of biomedical and clinical text processing tasks. We implemented various adversarial attack methods to test the NLP systems in different attack scenarios. Experimental results showed that the biomedical NLP models are sensitive to adversarial samples; their performance dropped on average by 21 and 18.9 absolute percent on character-level and word-level adversarial noise, respectively. Conducting extensive adversarial training experiments, we fine-tuned the NLP models on a mixture of clean samples and adversarial inputs. Results showed that adversarial training is an effective defense mechanism against adversarial noise; the models' robustness improved on average by 11.3 absolute percent. In addition, the models' performance on clean data increased on average by 2.4 absolute percent, demonstrating that adversarial training can boost generalization abilities of biomedical NLP systems.
Submitted 16 November, 2021;
originally announced November 2021.
-
GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain
Authors:
Milad Moradi,
Kathrin Blagec,
Florian Haberl,
Matthias Samwald
Abstract:
Deep neural language models have set new breakthroughs in many tasks of Natural Language Processing (NLP). Recent work has shown that deep transformer language models (pretrained on large amounts of texts) can achieve high levels of task-specific few-shot performance comparable to state-of-the-art models. However, the ability of these large language models in few-shot transfer learning has not yet been explored in the biomedical domain. We investigated the performance of two powerful transformer language models, i.e. GPT-3 and BioBERT, in few-shot settings on various biomedical NLP tasks. The experimental results showed that, to a great extent, both models underperform a language model fine-tuned on the full training data. Although GPT-3 had already achieved near state-of-the-art results in few-shot knowledge transfer on open-domain NLP tasks, it could not perform as effectively as BioBERT, which is orders of magnitude smaller than GPT-3. Given that BioBERT was already pretrained on large biomedical text corpora, our study suggests that language models may largely benefit from in-domain pretraining in task-specific few-shot learning. However, in-domain pretraining seems not to be sufficient; novel pretraining and few-shot learning strategies are required in the biomedical NLP domain.
Submitted 1 June, 2022; v1 submitted 6 September, 2021;
originally announced September 2021.
-
Deep learning models are not robust against noise in clinical text
Authors:
Milad Moradi,
Kathrin Blagec,
Matthias Samwald
Abstract:
Artificial Intelligence (AI) systems are attracting increasing interest in the medical domain due to their ability to learn complicated tasks that require human intelligence and expert knowledge. AI systems that utilize high-performance Natural Language Processing (NLP) models have achieved state-of-the-art results on a wide variety of clinical text processing benchmarks. They have even outperformed human accuracy on some tasks. However, performance evaluation of such AI systems has been limited to accuracy measures on curated and clean benchmark datasets that may not properly reflect how robustly these systems can operate in real-world situations. In order to address this challenge, we introduce and implement a wide variety of perturbation methods that simulate different types of noise and variability in clinical text data. While noisy samples produced by these perturbation methods can often be understood by humans, they may cause AI systems to make erroneous decisions. Conducting extensive experiments on several clinical text processing tasks, we evaluated the robustness of high-performance NLP models against various types of character-level and word-level noise. The results revealed that the NLP models' performance degrades when the input contains small amounts of noise. This study is a significant step towards exposing vulnerabilities of AI models utilized in clinical text processing systems. The proposed perturbation methods can be used in performance evaluation tests to assess how robustly clinical NLP models can operate on noisy data in real-world settings.
Submitted 27 August, 2021;
originally announced August 2021.
-
Evaluating the Robustness of Neural Language Models to Input Perturbations
Authors:
Milad Moradi,
Matthias Samwald
Abstract:
High-performance neural language models have obtained state-of-the-art results on a wide range of Natural Language Processing (NLP) tasks. However, results for common benchmark datasets often do not reflect model reliability and robustness when applied to noisy, real-world data. In this study, we design and implement various types of character-level and word-level perturbation methods to simulate realistic scenarios in which input texts may be slightly noisy or different from the data distribution on which NLP systems were trained. Conducting comprehensive experiments on different NLP tasks, we investigate the ability of high-performance language models such as BERT, XLNet, RoBERTa, and ELMo in handling different types of input perturbations. The results suggest that language models are sensitive to input perturbations and their performance can decrease even when small changes are introduced. We highlight that models need to be further improved and that current benchmarks do not reflect model robustness well. We argue that evaluations on perturbed inputs should routinely complement widely-used benchmarks in order to yield a more realistic understanding of NLP systems' robustness.
Submitted 27 August, 2021;
originally announced August 2021.
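As a rough illustration of what a character-level perturbation can look like, the sketch below swaps two adjacent characters inside selected words. It is a hypothetical example, not the perturbation suite from the paper: the helper names `swap_adjacent_chars` and `perturb_text`, the `rate` parameter, and the whitespace tokenization are assumptions.

```python
import random

def swap_adjacent_chars(word, rng):
    """Swap one random pair of adjacent characters; short words are returned unchanged."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def perturb_text(text, rate=0.2, seed=0):
    """Perturb each whitespace-separated token with probability `rate`.

    A fixed seed makes the noise reproducible across evaluation runs.
    """
    rng = random.Random(seed)
    tokens = text.split()
    noisy = [swap_adjacent_chars(tok, rng) if rng.random() < rate else tok
             for tok in tokens]
    return " ".join(noisy)

clean = "patient denies chest pain"
noisy = perturb_text(clean, rate=1.0, seed=42)  # every token gets one swap
```

Running a model on both `clean` and `noisy` and comparing the scores gives a simple robustness check of the kind the abstract argues should accompany standard benchmarks.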
-
Hybrid deep learning methods for phenotype prediction from clinical notes
Authors:
Sahar Khalafi,
Nasser Ghadiri,
Milad Moradi
Abstract:
Identifying patient cohorts from clinical notes in secondary electronic health records is a fundamental task in clinical information management. However, with the growing number of clinical notes, it becomes challenging to analyze the data manually for phenotype detection. Automatic extraction of clinical concepts would help to identify the patient phenotypes correctly. This paper proposes a novel hybrid model for automatically extracting patient phenotypes using natural language processing and deep learning models to determine the patient phenotypes without dictionaries and human intervention. The model is based on a neural bidirectional sequence model (BiLSTM or BiGRU) and a CNN layer for phenotype identification. An extra CNN layer is run parallel to the hybrid model to extract more features related to each phenotype. We used pre-trained embeddings such as FastText and Word2vec separately as the input layers to evaluate the performance of the different embeddings. Experimental results using the MIMIC III database in internal comparison demonstrate that the proposed model achieved significant performance improvement over existing models. The enhanced version of our model with an extra CNN layer obtained a relatively higher F1-score than the original hybrid model. We also showed that a BiGRU layer with FastText embeddings performed better than a BiLSTM layer at identifying patient phenotypes.
Submitted 3 May, 2022; v1 submitted 16 August, 2021;
originally announced August 2021.
-
Basis Scaling and Double Pruning for Efficient Inference in Network-Based Transfer Learning
Authors:
Ken C. L. Wong,
Satyananda Kashyap,
Mehdi Moradi
Abstract:
Network-based transfer learning allows the reuse of deep learning features with limited data, but the resulting models can be unnecessarily large. Although network pruning can improve inference efficiency, existing algorithms usually require fine-tuning that may not be suitable for small datasets. In this paper, using the singular value decomposition, we decompose a convolutional layer into two layers: a convolutional layer with the orthonormal basis vectors as the filters, and a "BasisScalingConv" layer which is responsible for rescaling the features and transforming them back to the original space. As the filters in each decomposed layer are linearly independent, when using the proposed basis scaling factors with the Taylor approximation of importance, pruning can be more effective and fine-tuning individual weights is unnecessary. Furthermore, as the numbers of input and output channels of the original convolutional layer remain unchanged after basis pruning, it is applicable to virtually all architectures and can be combined with existing pruning algorithms for double pruning to further increase the pruning capability. When transferring knowledge from ImageNet pre-trained models to different target domains, with less than 1% reduction in classification accuracies, we can achieve pruning ratios up to 74.6% for CIFAR-10 and 98.9% for MNIST in model parameters.
Submitted 20 December, 2023; v1 submitted 5 August, 2021;
originally announced August 2021.
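The SVD-based decomposition described in this abstract can be sketched with NumPy on a flattened filter matrix. This is only an illustration under stated assumptions: the shapes, the variable names, and the singular-value-based truncation at the end are hypothetical, and the paper itself ranks basis filters with scaling factors and a Taylor approximation of importance rather than by singular value alone.

```python
import numpy as np

rng = np.random.default_rng(0)
out_ch, in_ch, k = 8, 3, 3

# Assumption: each row is one convolutional filter flattened to in_ch * k * k values.
W = rng.standard_normal((out_ch, in_ch * k * k))

# Thin SVD: W = U @ diag(s) @ Vt, where the rows of Vt are orthonormal.
U, s, Vt = np.linalg.svd(W, full_matrices=False)

basis_filters = Vt       # first layer: orthonormal basis vectors used as filters
back_projection = U * s  # second layer: rescale features and map back to the original space

patch = rng.standard_normal(in_ch * k * k)  # one flattened input patch
original = W @ patch
decomposed = back_projection @ (basis_filters @ patch)

# Illustrative pruning (an assumption): keep only the 4 largest singular directions.
keep = 4
pruned = back_projection[:, :keep] @ (basis_filters[:keep] @ patch)
```

The two-layer form reproduces the original layer's output exactly before pruning, and the output dimension stays at `out_ch` after pruning, which matches the abstract's point that the numbers of input and output channels remain unchanged.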
-
Chest ImaGenome Dataset for Clinical Reasoning
Authors:
Joy T. Wu,
Nkechinyere N. Agu,
Ismini Lourentzou,
Arjun Sharma,
Joseph A. Paguio,
Jasper S. Yao,
Edward C. Dee,
William Mitchell,
Satyananda Kashyap,
Andrea Giovannini,
Leo A. Celi,
Mehdi Moradi
Abstract:
Despite the progress in automatic detection of radiologic findings from chest X-ray (CXR) images in recent years, a quantitative evaluation of the explainability of these models is hampered by the lack of locally labeled datasets for different findings. With the exception of a few expert-labeled small-scale datasets for specific findings, such as pneumonia and pneumothorax, most of the CXR deep learning models to date are trained on global "weak" labels extracted from text reports, or trained via a joint image and unstructured text learning strategy. Inspired by the Visual Genome effort in the computer vision community, we constructed the first Chest ImaGenome dataset with a scene graph data structure to describe $242,072$ images. Local annotations are automatically produced using a joint rule-based natural language processing (NLP) and atlas-based bounding box detection pipeline. Through a radiologist-constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, we provide: i) $1,256$ combinations of relation annotations between $29$ CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph per image, ii) over $670,000$ localized comparison relations (for improved, worsened, or no change) between the anatomical locations across sequential exams, as well as iii) a manually annotated gold standard scene graph dataset from $500$ unique patients.
Submitted 31 July, 2021;
originally announced August 2021.
-
Concatenated Reed-Solomon and Polarization-Adjusted Convolutional (PAC) Codes
Authors:
Mohsen Moradi,
Amir Mozammel
Abstract:
Two concatenated coding schemes incorporating algebraic Reed-Solomon (RS) codes and polarization-adjusted convolutional (PAC) codes are proposed. Simulation results show that at a bit error rate of $10^{-5}$, a concatenated scheme using RS and PAC codes has more than $0.25$ dB coding gain over the NASA standard concatenation scheme, which uses RS and convolutional codes.
Submitted 16 June, 2021;
originally announced June 2021.
-
A Monte-Carlo Based Construction of Polarization-Adjusted Convolutional (PAC) Codes
Authors:
Mohsen Moradi,
Amir Mozammel
Abstract:
This paper proposes a rate-profile construction method for polarization-adjusted convolutional (PAC) codes of any code length and rate, which is capable of maintaining a trade-off between the error-correction performance and decoding complexity of PAC codes. The proposed method can improve the error-correction performance of PAC codes while guaranteeing a low mean sequential decoding complexity for signal-to-noise ratio (SNR) values beyond a target SNR value.
Submitted 15 June, 2021;
originally announced June 2021.
-
AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray
Authors:
Nkechinyere N. Agu,
Joy T. Wu,
Hanqing Chao,
Ismini Lourentzou,
Arjun Sharma,
Mehdi Moradi,
Pingkun Yan,
James Hendler
Abstract:
Radiologists usually observe anatomical regions of chest X-ray images as well as the overall image before making a decision. However, most existing deep learning models only look at the entire X-ray image for classification, failing to utilize important anatomical information. In this paper, we propose a novel multi-label chest X-ray classification model that accurately classifies the image findings and also localizes the findings to their correct anatomical regions. Specifically, our model consists of two modules, the detection module and the anatomical dependency module. The latter utilizes graph convolutional networks, which enable our model to learn not only the label dependency but also the relationship between the anatomical regions in the chest X-ray. We further utilize a method to efficiently create an adjacency matrix for the anatomical regions using the correlation of the label across the different regions. Detailed experiments and analysis of our results show the effectiveness of our method when compared to the current state-of-the-art multi-label chest X-ray image classification methods while also providing accurate location information.
Submitted 20 May, 2021;
originally announced May 2021.
-
Channel Scaling: A Scale-and-Select Approach for Transfer Learning
Authors:
Ken C. L. Wong,
Satyananda Kashyap,
Mehdi Moradi
Abstract:
Transfer learning with pre-trained neural networks is a common strategy for training classifiers in medical image analysis. Without proper channel selections, this often results in unnecessarily large models that hinder deployment and explainability. In this paper, we propose a novel approach to efficiently build small and well-performing networks by introducing channel-scaling layers. A channel-scaling layer is attached to each frozen convolutional layer, with the trainable scaling weights inferring the importance of the corresponding feature channels. Unlike fine-tuning approaches, we maintain the weights of the original channels, and large datasets are not required. By imposing L1 regularization and thresholding on the scaling weights, this framework iteratively removes unnecessary feature channels from a pre-trained model. Using an ImageNet pre-trained VGG16 model, we demonstrate the capabilities of the proposed framework on classifying opacity from chest X-ray images. The results show that we can reduce the number of parameters by 95% while delivering superior performance.
Submitted 22 March, 2021;
originally announced March 2021.
-
Explaining Black-box Models for Biomedical Text Classification
Authors:
Milad Moradi,
Matthias Samwald
Abstract:
In this paper, we propose a novel method named Biomedical Confident Itemsets Explanation (BioCIE), aiming at post-hoc explanation of black-box machine learning models for biomedical text classification. Using sources of domain knowledge and a confident itemset mining method, BioCIE discretizes the decision space of a black-box into smaller subspaces and extracts semantic relationships between the input text and class labels in different subspaces. Confident itemsets discover how biomedical concepts are related to class labels in the black-box's decision space. BioCIE uses the itemsets to approximate the black-box's behavior for individual predictions. Optimizing fidelity, interpretability, and coverage measures, BioCIE produces class-wise explanations that represent decision boundaries of the black-box. Results of evaluations on various biomedical text classification tasks and black-box models demonstrated that BioCIE can outperform perturbation-based and decision set methods in terms of producing concise, accurate, and interpretable explanations. BioCIE improved the fidelity of instance-wise and class-wise explanations by 11.6% and 7.5%, respectively. It also improved the interpretability of explanations by 8%. BioCIE can be effectively used to explain how a black-box biomedical text classification model semantically relates input texts to class labels. The source code and supplementary material are available at https://github.com/mmoradi-iut/BioCIE.
Submitted 20 December, 2020;
originally announced December 2020.
-
On the Metric and Computation of PAC Codes
Authors:
Mohsen Moradi
Abstract:
In this paper, we present a metric function that is optimal on average, which leads to a significantly low decoding computation while maintaining the superiority of the polarization-adjusted convolutional (PAC) codes' error-correction performance. With our proposed metric function, the PAC codes' decoding computation is comparable to that of sequential decoding of conventional convolutional codes (CC). Moreover, simulation results show an improvement in the low-rate PAC codes' error-correction performance when using our proposed metric function. We prove that choosing the polarized cutoff rate as the metric function's bias value reduces the probability of the sequential decoder advancing in the wrong path exponentially with respect to the wrong path depth. We also prove that the upper bound of the PAC codes' computation has a Pareto distribution; our simulation results also verify this. Furthermore, we present a scaling-bias procedure and a method of choosing threshold spacing for the search-limited sequential decoding that substantially improves the decoder's average computation. Our results show that for some codes with a length of 128, the search-limited PAC codes can achieve an error-correction performance close to that of the polar codes under successive cancellation list decoding with a list size of 64 and CRC length of 11, with a considerably lower computation.
Submitted 10 December, 2020;
originally announced December 2020.