Search | arXiv e-print repository

arXiv:2508.11197 [pdf, ps, other]

E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection

Authors: Ahmad Mousavi, Yeganeh Abdollahinejad, Roberto Corizzo, Nathalie Japkowicz, Zois Boukouvalas

Abstract: Detecting multimodal misinformation on social media remains challenging due to inconsistencies between modalities, changes in temporal patterns, and substantial class imbalance. Many existing methods treat posts independently and fail to capture the event-level structure that connects them across time and modality. We propose E-CaTCH, an interpretable and scalable framework for robustly detecting… ▽ More Detecting multimodal misinformation on social media remains challenging due to inconsistencies between modalities, changes in temporal patterns, and substantial class imbalance. Many existing methods treat posts independently and fail to capture the event-level structure that connects them across time and modality. We propose E-CaTCH, an interpretable and scalable framework for robustly detecting misinformation. If needed, E-CaTCH clusters posts into pseudo-events based on textual similarity and temporal proximity, then processes each event independently. Within each event, textual and visual features are extracted using pre-trained BERT and ResNet encoders, refined via intra-modal self-attention, and aligned through bidirectional cross-modal attention. A soft gating mechanism fuses these representations to form contextualized, content-aware embeddings of each post. To model temporal evolution, E-CaTCH segments events into overlapping time windows and uses a trend-aware LSTM, enhanced with semantic shift and momentum signals, to encode narrative progression over time. Classification is performed at the event level, enabling better alignment with real-world misinformation dynamics. To address class imbalance and promote stable learning, the model integrates adaptive class weighting, temporal consistency regularization, and hard-example mining. The total loss is aggregated across all events. Extensive experiments on Fakeddit, IND, and COVID-19 MISINFOGRAPH demonstrate that E-CaTCH consistently outperforms state-of-the-art baselines. Cross-dataset evaluations further demonstrate its robustness, generalizability, and practical applicability across diverse misinformation scenarios. △ Less

Submitted 15 August, 2025; originally announced August 2025.

arXiv:2508.08076 [pdf, ps, other]

Heterogeneity in Entity Matching: A Survey and Experimental Analysis

Authors: Mohammad Hossein Moslemi, Amir Mousavi, Behshid Behkamal, Mostafa Milani

Abstract: Entity matching (EM) is a fundamental task in data integration and analytics, essential for identifying records that refer to the same real-world entity across diverse sources. In practice, datasets often differ widely in structure, format, schema, and semantics, creating substantial challenges for EM. We refer to this setting as Heterogeneous EM (HEM). This survey offers a unified perspective on… ▽ More Entity matching (EM) is a fundamental task in data integration and analytics, essential for identifying records that refer to the same real-world entity across diverse sources. In practice, datasets often differ widely in structure, format, schema, and semantics, creating substantial challenges for EM. We refer to this setting as Heterogeneous EM (HEM). This survey offers a unified perspective on HEM by introducing a taxonomy, grounded in prior work, that distinguishes two primary categories -- representation and semantic heterogeneity -- and their subtypes. The taxonomy provides a systematic lens for understanding how variations in data form and meaning shape the complexity of matching tasks. We then connect this framework to the FAIR principles -- Findability, Accessibility, Interoperability, and Reusability -- demonstrating how they both reveal the challenges of HEM and suggest strategies for mitigating them. Building on this foundation, we critically review recent EM methods, examining their ability to address different heterogeneity types, and conduct targeted experiments on state-of-the-art models to evaluate their robustness and adaptability under semantic heterogeneity. Our analysis uncovers persistent limitations in current approaches and points to promising directions for future research, including multimodal matching, human-in-the-loop workflows, deeper integration with large language models and knowledge graphs, and fairness-aware evaluation in heterogeneous settings. △ Less

Submitted 11 August, 2025; originally announced August 2025.

Comments: Survey and experimental analysis on heterogeneous entity matching

MSC Class: 68P20 68P20 68P20 ACM Class: H.2.8; H.2.4; I.2.7

arXiv:2506.15524 [pdf, ps, other]

NTIRE 2025 Image Shadow Removal Challenge Report

Authors: Florin-Alexandru Vasluianu, Tim Seizinger, Zhuyun Zhou, Cailian Chen, Zongwei Wu, Radu Timofte, Mingjia Li, Jin Hu, Hainuo Wang, Hengxing Liu, Jiarui Wang, Qiming Hu, Xiaojie Guo, Xin Lu, Jiarong Yang, Yuanfei Bao, Anya Hu, Zihao Fan, Kunyu Wang, Jie Xiao, Xi Wang, Xueyang Fu, Zheng-Jun Zha, Yu-Fan Lin, Chia-Ming Lee , et al. (57 additional authors not shown)

Abstract: This work examines the findings of the NTIRE 2025 Shadow Removal Challenge. A total of 306 participants have registered, with 17 teams successfully submitting their solutions during the final evaluation phase. Following the last two editions, this challenge had two evaluation tracks: one focusing on reconstruction fidelity and the other on visual perception through a user study. Both tracks were e… ▽ More This work examines the findings of the NTIRE 2025 Shadow Removal Challenge. A total of 306 participants have registered, with 17 teams successfully submitting their solutions during the final evaluation phase. Following the last two editions, this challenge had two evaluation tracks: one focusing on reconstruction fidelity and the other on visual perception through a user study. Both tracks were evaluated with images from the WSRD+ dataset, simulating interactions between self- and cast-shadows with a large number of diverse objects, textures, and materials. △ Less

Submitted 18 June, 2025; originally announced June 2025.

arXiv:2504.14092 [pdf, other]

Retinex-guided Histogram Transformer for Mask-free Shadow Removal

Authors: Wei Dong, Han Zhou, Seyed Amirreza Mousavi, Jun Chen

Abstract: While deep learning methods have achieved notable progress in shadow removal, many existing approaches rely on shadow masks that are difficult to obtain, limiting their generalization to real-world scenes. In this work, we propose ReHiT, an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. We first introduce a dual-branch pipeline… ▽ More While deep learning methods have achieved notable progress in shadow removal, many existing approaches rely on shadow masks that are difficult to obtain, limiting their generalization to real-world scenes. In this work, we propose ReHiT, an efficient mask-free shadow removal framework based on a hybrid CNN-Transformer architecture guided by Retinex theory. We first introduce a dual-branch pipeline to separately model reflectance and illumination components, and each is restored by our developed Illumination-Guided Hybrid CNN-Transformer (IG-HCT) module. Second, besides the CNN-based blocks that are capable of learning residual dense features and performing multi-scale semantic fusion, multi-scale semantic fusion, we develop the Illumination-Guided Histogram Transformer Block (IGHB) to effectively handle non-uniform illumination and spatially complex shadows. Extensive experiments on several benchmark datasets validate the effectiveness of our approach over existing mask-free methods. Trained solely on the NTIRE 2025 Shadow Removal Challenge dataset, our solution delivers competitive results with one of the smallest parameter sizes and fastest inference speeds among top-ranked entries, highlighting its applicability for real-world applications with limited computational resources. The code is available at https://github.com/dongw22/oath. △ Less

Submitted 18 April, 2025; originally announced April 2025.

Comments: Accpeted by CVPR 2025 NTIRE Workshop, Retinex Guidance, Histogram Transformer

arXiv:2504.03807 [pdf]

From Keypoints to Realism: A Realistic and Accurate Virtual Try-on Network from 2D Images

Authors: Maliheh Toozandehjani, Ali Mousavi, Reza Taheri

Abstract: The aim of image-based virtual try-on is to generate realistic images of individuals wearing target garments, ensuring that the pose, body shape and characteristics of the target garment are accurately preserved. Existing methods often fail to reproduce the fine details of target garments effectively and lack generalizability to new scenarios. In the proposed method, the person's initial garment i… ▽ More The aim of image-based virtual try-on is to generate realistic images of individuals wearing target garments, ensuring that the pose, body shape and characteristics of the target garment are accurately preserved. Existing methods often fail to reproduce the fine details of target garments effectively and lack generalizability to new scenarios. In the proposed method, the person's initial garment is completely removed. Subsequently, a precise warping is performed using the predicted keypoints to fully align the target garment with the body structure and pose of the individual. Based on the warped garment, a body segmentation map is more accurately predicted. Then, using an alignment-aware segment normalization, the misaligned areas between the warped garment and the predicted garment region in the segmentation map are removed. Finally, the generator produces the final image with high visual quality, reconstructing the precise characteristics of the target garment, including its overall shape and texture. This approach emphasizes preserving garment characteristics and improving adaptability to various poses, providing better generalization for diverse applications. △ Less

Submitted 4 April, 2025; originally announced April 2025.

Comments: in Persian language

arXiv:2503.18890 [pdf, ps, other]

Public-Key Quantum Money and Fast Real Transforms

Authors: Jake Doliskani, Morteza Mirzaei, Ali Mousavi

Abstract: We propose a public-key quantum money scheme based on group actions and the Hartley transform. Our scheme adapts the quantum money scheme of Zhandry (2024), replacing the Fourier transform with the Hartley transform. This substitution ensures the banknotes have real amplitudes rather than complex amplitudes, which could offer both computational and theoretical advantages. To support this new con… ▽ More We propose a public-key quantum money scheme based on group actions and the Hartley transform. Our scheme adapts the quantum money scheme of Zhandry (2024), replacing the Fourier transform with the Hartley transform. This substitution ensures the banknotes have real amplitudes rather than complex amplitudes, which could offer both computational and theoretical advantages. To support this new construction, we propose a new verification algorithm that uses group action twists to address verification failures caused by the switch to real amplitudes. We also show how to efficiently compute the serial number associated with a money state using a new algorithm based on continuous-time quantum walks. Finally, we present a recursive algorithm for the quantum Hartley transform, achieving lower gate complexity than prior work and demonstrate how to compute other real quantum transforms, such as the quantum sine transform, using the quantum Hartley transform as a subroutine. △ Less

Submitted 17 July, 2025; v1 submitted 24 March, 2025; originally announced March 2025.

arXiv:2503.02228 [pdf, other]

One Patient's Annotation is Another One's Initialization: Towards Zero-Shot Surgical Video Segmentation with Cross-Patient Initialization

Authors: Seyed Amir Mousavi, Utku Ozbulak, Francesca Tozzi, Nikdokht Rashidian, Wouter Willaert, Joris Vankerschaver, Wesley De Neve

Abstract: Video object segmentation is an emerging technology that is well-suited for real-time surgical video segmentation, offering valuable clinical assistance in the operating room by ensuring consistent frame tracking. However, its adoption is limited by the need for manual intervention to select the tracked object, making it impractical in surgical settings. In this work, we tackle this challenge with… ▽ More Video object segmentation is an emerging technology that is well-suited for real-time surgical video segmentation, offering valuable clinical assistance in the operating room by ensuring consistent frame tracking. However, its adoption is limited by the need for manual intervention to select the tracked object, making it impractical in surgical settings. In this work, we tackle this challenge with an innovative solution: using previously annotated frames from other patients as the tracking frames. We find that this unconventional approach can match or even surpass the performance of using patients' own tracking frames, enabling more autonomous and efficient AI-assisted surgical workflows. Furthermore, we analyze the benefits and limitations of this approach, highlighting its potential to enhance segmentation accuracy while reducing the need for manual input. Our findings provide insights into key factors influencing performance, offering a foundation for future research on optimizing cross-patient frame selection for real-time surgical video analysis. △ Less

Submitted 3 March, 2025; originally announced March 2025.

arXiv:2503.01743 [pdf, other]

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Authors: Microsoft, :, Abdelrahman Abouelenin, Atabak Ashfaq, Adam Atkinson, Hany Awadalla, Nguyen Bach, Jianmin Bao, Alon Benhaim, Martin Cai, Vishrav Chaudhary, Congcong Chen, Dong Chen, Dongdong Chen, Junkun Chen, Weizhu Chen, Yen-Chun Chen, Yi-ling Chen, Qi Dai, Xiyang Dai, Ruchao Fan, Mei Gao, Min Gao, Amit Garg, Abhishek Goswami , et al. (51 additional authors not shown)

Abstract: We introduce Phi-4-Mini and Phi-4-Multimodal, compact yet highly capable language and multimodal models. Phi-4-Mini is a 3.8-billion-parameter language model trained on high-quality web and synthetic data, significantly outperforming recent open-source models of similar size and matching the performance of models twice its size on math and coding tasks requiring complex reasoning. This achievement… ▽ More We introduce Phi-4-Mini and Phi-4-Multimodal, compact yet highly capable language and multimodal models. Phi-4-Mini is a 3.8-billion-parameter language model trained on high-quality web and synthetic data, significantly outperforming recent open-source models of similar size and matching the performance of models twice its size on math and coding tasks requiring complex reasoning. This achievement is driven by a carefully curated synthetic data recipe emphasizing high-quality math and coding datasets. Compared to its predecessor, Phi-3.5-Mini, Phi-4-Mini features an expanded vocabulary size of 200K tokens to better support multilingual applications, as well as group query attention for more efficient long-sequence generation. Phi-4-Multimodal is a multimodal model that integrates text, vision, and speech/audio input modalities into a single model. Its novel modality extension approach leverages LoRA adapters and modality-specific routers to allow multiple inference modes combining various modalities without interference. For example, it now ranks first in the OpenASR leaderboard to date, although the LoRA component of the speech/audio modality has just 460 million parameters. Phi-4-Multimodal supports scenarios involving (vision + language), (vision + speech), and (speech/audio) inputs, outperforming larger vision-language and speech-language models on a wide range of tasks. Additionally, we experiment to further train Phi-4-Mini to enhance its reasoning capabilities. Despite its compact 3.8-billion-parameter size, this experimental version achieves reasoning performance on par with or surpassing significantly larger models, including DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-Llama-8B. △ Less

Submitted 7 March, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

Comments: 39 pages

arXiv:2502.20934 [pdf, ps, other]

Revisiting the Evaluation Bias Introduced by Frame Sampling Strategies in Surgical Video Segmentation Using SAM2

Authors: Utku Ozbulak, Seyed Amir Mousavi, Francesca Tozzi, Niki Rashidian, Wouter Willaert, Wesley De Neve, Joris Vankerschaver

Abstract: Real-time video segmentation is a promising opportunity for AI-assisted surgery, offering intraoperative guidance by identifying tools and anatomical structures. Despite growing interest in surgical video segmentation, annotation protocols vary widely across datasets -- some provide dense, frame-by-frame labels, while others rely on sparse annotations sampled at low frame rates such as 1 FPS. In t… ▽ More Real-time video segmentation is a promising opportunity for AI-assisted surgery, offering intraoperative guidance by identifying tools and anatomical structures. Despite growing interest in surgical video segmentation, annotation protocols vary widely across datasets -- some provide dense, frame-by-frame labels, while others rely on sparse annotations sampled at low frame rates such as 1 FPS. In this study, we investigate how such inconsistencies in annotation density and frame rate sampling influence the evaluation of zero-shot segmentation models, using SAM2 as a case study for cholecystectomy procedures. Surprisingly, we find that under conventional sparse evaluation settings, lower frame rates can appear to outperform higher ones due to a smoothing effect that conceals temporal inconsistencies. However, when assessed under real-time streaming conditions, higher frame rates yield superior segmentation stability, particularly for dynamic objects like surgical graspers. To understand how these differences align with human perception, we conducted a survey among surgeons, nurses, and machine learning engineers and found that participants consistently preferred high-FPS segmentation overlays, reinforcing the importance of evaluating every frame in real-time applications rather than relying on sparse sampling strategies. Our findings highlight the risk of evaluation bias that is introduced by inconsistent dataset protocols and bring attention to the need for temporally fair benchmarking in surgical video AI. △ Less

Submitted 30 July, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

Comments: Accepted for publication in the 28th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) Workshop on Fairness of AI in Medical Imaging (FAIMI), 2025

arXiv:2501.18109 [pdf]

Influence of High-Performance Image-to-Image Translation Networks on Clinical Visual Assessment and Outcome Prediction: Utilizing Ultrasound to MRI Translation in Prostate Cancer

Authors: Mohammad R. Salmanpour, Amin Mousavi, Yixi Xu, William B Weeks, Ilker Hacihaliloglu

Abstract: Purpose: This study examines the core traits of image-to-image translation (I2I) networks, focusing on their effectiveness and adaptability in everyday clinical settings. Methods: We have analyzed data from 794 patients diagnosed with prostate cancer (PCa), using ten prominent 2D/3D I2I networks to convert ultrasound (US) images into MRI scans. We also introduced a new analysis of Radiomic feature… ▽ More Purpose: This study examines the core traits of image-to-image translation (I2I) networks, focusing on their effectiveness and adaptability in everyday clinical settings. Methods: We have analyzed data from 794 patients diagnosed with prostate cancer (PCa), using ten prominent 2D/3D I2I networks to convert ultrasound (US) images into MRI scans. We also introduced a new analysis of Radiomic features (RF) via the Spearman correlation coefficient to explore whether networks with high performance (SSIM>85%) could detect subtle RFs. Our study further examined synthetic images by 7 invited physicians. As a final evaluation study, we have investigated the improvement that are achieved using the synthetic MRI data on two traditional machine learning and one deep learning method. Results: In quantitative assessment, 2D-Pix2Pix network substantially outperformed the other 7 networks, with an average SSIM~0.855. The RF analysis revealed that 76 out of 186 RFs were identified using the 2D-Pix2Pix algorithm alone, although half of the RFs were lost during the translation process. A detailed qualitative review by 7 medical doctors noted a deficiency in low-level feature recognition in I2I tasks. Furthermore, the study found that synthesized image-based classification outperformed US image-based classification with an average accuracy and AUC~0.93. Conclusion: This study showed that while 2D-Pix2Pix outperformed cutting-edge networks in low-level feature discovery and overall error and similarity metrics, it still requires improvement in low-level feature performance, as highlighted by Group 3. Further, the study found using synthetic image-based classification outperformed original US image-based methods. △ Less

Submitted 19 July, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

Comments: 9 pages, 4 figures and 1 table

MSC Class: 14J60 (Primary) 14F05; 14J26 (Secondary) ACM Class: F.2.2

arXiv:2501.11268 [pdf, ps, other]

$\ell_0$-Regularized Quadratic Surface Support Vector Machines

Authors: Ahmad Mousavi, Ramin Zandvakili

Abstract: Kernel-free quadratic surface support vector machines have recently gained traction due to their flexibility in modeling nonlinear decision boundaries without relying on kernel functions. However, the introduction of a full quadratic classifier significantly increases the number of model parameters, scaling quadratically with data dimensionality, which often leads to overfitting and makes interpre… ▽ More Kernel-free quadratic surface support vector machines have recently gained traction due to their flexibility in modeling nonlinear decision boundaries without relying on kernel functions. However, the introduction of a full quadratic classifier significantly increases the number of model parameters, scaling quadratically with data dimensionality, which often leads to overfitting and makes interpretation difficult. To address these challenges, we propose a sparse variant of the QSVM by enforcing a cardinality constraint on the model parameters. While enhancing generalization and promoting sparsity, leveraging the $\ell_0$-norm inevitably incurs additional computational complexity. To tackle this, we develop a penalty decomposition algorithm capable of producing solutions that provably satisfy the first-order Lu-Zhang optimality conditions. Our approach accommodates both hinge and quadratic loss functions. In both cases, we demonstrate that the subproblems arising within the algorithm either admit closed-form solutions or can be solved efficiently through dual formulations, which contributes to the method's overall effectiveness. We also analyze the convergence behavior of the algorithm under both loss settings. Finally, we validate our approach on several real-world datasets, demonstrating its ability to reduce overfitting while maintaining strong classification performance. The complete implementation and experimental code are publicly available at https://github.com/raminzandvakili/L0-QSVM. △ Less

Submitted 10 August, 2025; v1 submitted 19 January, 2025; originally announced January 2025.

arXiv:2412.18409 [pdf, other]

The Impact of the Single-Label Assumption in Image Recognition Benchmarking

Authors: Esla Timothy Anzaku, Seyed Amir Mousavi, Arnout Van Messem, Wesley De Neve

Abstract: Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation protocols and the actual complexity of visual data. This mismatch can penalize DNNs for predicting correct but unannotated labels, which may partly explain reported… ▽ More Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation protocols and the actual complexity of visual data. This mismatch can penalize DNNs for predicting correct but unannotated labels, which may partly explain reported accuracy drops, such as the widely cited 11 to 14 percent top-1 accuracy decline on ImageNetV2, a replication test set for ImageNet. This raises the question: do such drops reflect genuine generalization failures or artifacts of restrictive evaluation metrics? We rigorously assess the impact of multi-label characteristics on reported accuracy gaps. To evaluate the multi-label prediction capability (MLPC) of single-label-trained models, we introduce a variable top-$k$ evaluation, where $k$ matches the number of valid labels per image. Applied to 315 ImageNet-trained models, our analyses demonstrate that conventional top-1 accuracy disproportionately penalizes valid but secondary predictions. We also propose Aggregate Subgroup Model Accuracy (ASMA) to better capture multi-label performance across model subgroups. Our results reveal wide variability in MLPC, with some models consistently ranking multiple correct labels higher. Under this evaluation, the perceived gap between ImageNet and ImageNetV2 narrows substantially. To further isolate multi-label recognition performance from contextual cues, we introduce PatchML, a synthetic dataset containing systematically combined object patches. PatchML demonstrates that many models trained with single-label supervision nonetheless recognize multiple objects. Altogether, these findings highlight limitations in single-label evaluation and reveal that modern DNNs have stronger multi-label capabilities than standard metrics suggest. △ Less

Submitted 27 May, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

Comments: 34 pages, 7 figures

arXiv:2412.01936 [pdf, other]

Kernel-Free Universum Quadratic Surface Twin Support Vector Machines for Imbalanced Data

Authors: Hossein Moosaei, Milan Hladík, Ahmad Mousavi, Zheming Gao, Haojie Fu

Abstract: Binary classification tasks with imbalanced classes pose significant challenges in machine learning. Traditional classifiers often struggle to accurately capture the characteristics of the minority class, resulting in biased models with subpar predictive performance. In this paper, we introduce a novel approach to tackle this issue by leveraging Universum points to support the minority class withi… ▽ More Binary classification tasks with imbalanced classes pose significant challenges in machine learning. Traditional classifiers often struggle to accurately capture the characteristics of the minority class, resulting in biased models with subpar predictive performance. In this paper, we introduce a novel approach to tackle this issue by leveraging Universum points to support the minority class within quadratic twin support vector machine models. Unlike traditional classifiers, our models utilize quadratic surfaces instead of hyperplanes for binary classification, providing greater flexibility in modeling complex decision boundaries. By incorporating Universum points, our approach enhances classification accuracy and generalization performance on imbalanced datasets. We generated four artificial datasets to demonstrate the flexibility of the proposed methods. Additionally, we validated the effectiveness of our approach through empirical evaluations on benchmark datasets, showing superior performance compared to conventional classifiers and existing methods for imbalanced classification. △ Less

Submitted 2 December, 2024; originally announced December 2024.

arXiv:2412.00068 [pdf]

Enhanced Lung Cancer Survival Prediction using Semi-Supervised Pseudo-Labeling and Learning from Diverse PET/CT Datasets

Authors: Mohammad R. Salmanpour, Arman Gorji, Amin Mousavi, Ali Fathi Jouzdani, Nima Sanati, Mehdi Maghsudi, Bonnie Leung, Cheryl Ho, Ren Yuan, Arman Rahmim

Abstract: Objective: This study explores a semi-supervised learning (SSL), pseudo-labeled strategy using diverse datasets to enhance lung cancer (LCa) survival predictions, analyzing Handcrafted and Deep Radiomic Features (HRF/DRF) from PET/CT scans with Hybrid Machine Learning Systems (HMLS). Methods: We collected 199 LCa patients with both PET & CT images, obtained from The Cancer Imaging Archive (TCIA) a… ▽ More Objective: This study explores a semi-supervised learning (SSL), pseudo-labeled strategy using diverse datasets to enhance lung cancer (LCa) survival predictions, analyzing Handcrafted and Deep Radiomic Features (HRF/DRF) from PET/CT scans with Hybrid Machine Learning Systems (HMLS). Methods: We collected 199 LCa patients with both PET & CT images, obtained from The Cancer Imaging Archive (TCIA) and our local database, alongside 408 head&neck cancer (HNCa) PET/CT images from TCIA. We extracted 215 HRFs and 1024 DRFs by PySERA and a 3D-Autoencoder, respectively, within the ViSERA software, from segmented primary tumors. The supervised strategy (SL) employed a HMLSs: PCA connected with 4 classifiers on both HRF and DRFs. SSL strategy expanded the datasets by adding 408 pseudo-labeled HNCa cases (labeled by Random Forest algorithm) to 199 LCa cases, using the same HMLSs techniques. Furthermore, Principal Component Analysis (PCA) linked with 4 survival prediction algorithms were utilized in survival hazard ratio analysis. Results: SSL strategy outperformed SL method (p-value<0.05), achieving an average accuracy of 0.85 with DRFs from PET and PCA+ Multi-Layer Perceptron (MLP), compared to 0.65 for SL strategy using DRFs from CT and PCA+ K-Nearest Neighbor (KNN). Additionally, PCA linked with Component-wise Gradient Boosting Survival Analysis on both HRFs and DRFs, as extracted from CT, had an average c-index of 0.80 with a Log Rank p-value<<0.001, confirmed by external testing. Conclusions: Shifting from HRFs and SL to DRFs and SSL strategies, particularly in contexts with limited data points, enabling CT or PET alone to significantly achieve high predictive performance. △ Less

Submitted 25 November, 2024; originally announced December 2024.

Comments: 12 pages and 7 figures

arXiv:2408.05948 [pdf, other]

ConvKGYarn: Spinning Configurable and Scalable Conversational Knowledge Graph QA datasets with Large Language Models

Authors: Ronak Pradeep, Daniel Lee, Ali Mousavi, Jeff Pound, Yisi Sang, Jimmy Lin, Ihab Ilyas, Saloni Potdar, Mostafa Arefiyan, Yunyao Li

Abstract: The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an idea… ▽ More The rapid advancement of Large Language Models (LLMs) and conversational assistants necessitates dynamic, scalable, and configurable conversational datasets for training and evaluation. These datasets must accommodate diverse user interaction modes, including text and voice, each presenting unique modeling challenges. Knowledge Graphs (KGs), with their structured and evolving nature, offer an ideal foundation for current and precise knowledge. Although human-curated KG-based conversational datasets exist, they struggle to keep pace with the rapidly changing user information needs. We present ConvKGYarn, a scalable method for generating up-to-date and configurable conversational KGQA datasets. Qualitative psychometric analyses confirm our method can generate high-quality datasets rivaling a popular conversational KGQA dataset while offering it at scale and covering a wide range of human-interaction configurations. We showcase its utility by testing LLMs on diverse conversations - exploring model behavior on conversational KGQA sets with different configurations grounded in the same KG fact set. Our results highlight the ability of ConvKGYarn to improve KGQA foundations and evaluate parametric knowledge of LLMs, thus offering a robust solution to the constantly evolving landscape of conversational assistants. △ Less

Submitted 12 August, 2024; originally announced August 2024.

arXiv:2406.04496 [pdf, other]

Time Sensitive Knowledge Editing through Efficient Finetuning

Authors: Xiou Ge, Ali Mousavi, Edouard Grave, Armand Joulin, Kun Qian, Benjamin Han, Mostafa Arefiyan, Yunyao Li

Abstract: Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up-to-date remains a challenge once pretraining is complete. It is thus essential to design effective methods to both update obsolete knowledge and induce new knowledge into LLMs. Existing locate-and-edit knowledge e… ▽ More Large Language Models (LLMs) have demonstrated impressive capability in different tasks and are bringing transformative changes to many domains. However, keeping the knowledge in LLMs up-to-date remains a challenge once pretraining is complete. It is thus essential to design effective methods to both update obsolete knowledge and induce new knowledge into LLMs. Existing locate-and-edit knowledge editing (KE) method suffers from two limitations. First, the post-edit LLMs by such methods generally have poor capability in answering complex queries that require multi-hop reasoning. Second, the long run-time of such locate-and-edit methods to perform knowledge edits make it infeasible for large scale KE in practice. In this paper, we explore Parameter-Efficient Fine-Tuning (PEFT) techniques as an alternative for KE. We curate a more comprehensive temporal KE dataset with both knowledge update and knowledge injection examples for KE performance benchmarking. We further probe the effect of fine-tuning on a range of layers in an LLM for the multi-hop QA task. We find that PEFT performs better than locate-and-edit techniques for time-sensitive knowledge edits. △ Less

Submitted 22 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: ACL 2024 main

arXiv:2404.01626 [pdf, other]

Entity Disambiguation via Fusion Entity Decoding

Authors: Junxiong Wang, Ali Mousavi, Omar Attia, Ronak Pradeep, Saloni Potdar, Alexander M. Rush, Umar Farooq Minhas, Yunyao Li

Abstract: Entity disambiguation (ED), which links the mentions of ambiguous entities to their referent entities in a knowledge base, serves as a core component in entity linking (EL). Existing generative approaches demonstrate improved accuracy compared to classification approaches under the standardized ZELDA benchmark. Nevertheless, generative approaches suffer from the need for large-scale pre-training a… ▽ More Entity disambiguation (ED), which links the mentions of ambiguous entities to their referent entities in a knowledge base, serves as a core component in entity linking (EL). Existing generative approaches demonstrate improved accuracy compared to classification approaches under the standardized ZELDA benchmark. Nevertheless, generative approaches suffer from the need for large-scale pre-training and inefficient generation. Most importantly, entity descriptions, which could contain crucial information to distinguish similar entities from each other, are often overlooked. We propose an encoder-decoder model to disambiguate entities with more detailed entity descriptions. Given text and candidate entities, the encoder learns interactions between the text and each candidate entity, producing representations for each entity candidate. The decoder then fuses the representations of entity candidates together and selects the correct entity. Our experiments, conducted on various entity disambiguation benchmarks, demonstrate the strong and robust performance of this model, particularly +1.5% in the ZELDA benchmark compared with GENRE. Furthermore, we integrate this approach into the retrieval/reader framework and observe +1.5% improvements in end-to-end entity linking in the GERBIL benchmark compared with EntQA. △ Less

Submitted 7 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: Accepted at NAACL'24 main

arXiv:2311.13546 [pdf, other]

Enigma: Privacy-Preserving Execution of QAOA on Untrusted Quantum Computers

Authors: Ramin Ayanzadeh, Ahmad Mousavi, Narges Alavisamani, Moinuddin Qureshi

Abstract: Quantum computers can solve problems that are beyond the capabilities of conventional computers. As quantum computers are expensive and hard to maintain, the typical model for performing quantum computation is to send the circuit to a quantum cloud provider. This leads to privacy concerns for commercial entities as an untrusted server can learn protected information from the provided circuit. Curr… ▽ More Quantum computers can solve problems that are beyond the capabilities of conventional computers. As quantum computers are expensive and hard to maintain, the typical model for performing quantum computation is to send the circuit to a quantum cloud provider. This leads to privacy concerns for commercial entities as an untrusted server can learn protected information from the provided circuit. Current proposals for Secure Quantum Computing (SQC) either rely on emerging technologies (such as quantum networks) or incur prohibitive overheads (for Quantum Homomorphic Encryption). The goal of our paper is to enable low-cost privacy-preserving quantum computation that can be used with current systems. We propose Enigma, a suite of privacy-preserving schemes specifically designed for the Quantum Approximate Optimization Algorithm (QAOA). Unlike previous SQC techniques that obfuscate quantum circuits, Enigma transforms the input problem of QAOA, such that the resulting circuit and the outcomes are unintelligible to the server. We introduce three variants of Enigma. Enigma-I protects the coefficients of QAOA using random phase flipping and fudging of values. Enigma-II protects the nodes of the graph by introducing decoy qubits, which are indistinguishable from primary ones. Enigma-III protects the edge information of the graph by modifying the graph such that each node has an identical number of connections. For all variants of Enigma, we demonstrate that we can still obtain the solution for the original problem. We evaluate Enigma using IBM quantum devices and show that the privacy improvements of Enigma come at only a small reduction in fidelity (1%-13%). △ Less

Submitted 22 November, 2023; originally announced November 2023.

arXiv:2309.11669 [pdf, other]

Construction of Paired Knowledge Graph-Text Datasets Informed by Cyclic Evaluation

Authors: Ali Mousavi, Xin Zhan, He Bai, Peng Shi, Theo Rekatsinas, Benjamin Han, Yunyao Li, Jeff Pound, Josh Susskind, Natalie Schluter, Ihab Ilyas, Navdeep Jaitly

Abstract: Datasets that pair Knowledge Graphs (KG) and text together (KG-T) can be used to train forward and reverse neural models that generate text from KG and vice versa. However models trained on datasets where KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise and find… ▽ More Datasets that pair Knowledge Graphs (KG) and text together (KG-T) can be used to train forward and reverse neural models that generate text from KG and vice versa. However models trained on datasets where KG and text pairs are not equivalent can suffer from more hallucination and poorer recall. In this paper, we verify this empirically by generating datasets with different levels of noise and find that noisier datasets do indeed lead to more hallucination. We argue that the ability of forward and reverse models trained on a dataset to cyclically regenerate source KG or text is a proxy for the equivalence between the KG and the text in the dataset. Using cyclic evaluation we find that manually created WebNLG is much better than automatically created TeKGen and T-REx. Guided by these observations, we construct a new, improved dataset called LAGRANGE using heuristics meant to improve equivalence between KG and text and show the impact of each of the heuristics on cyclic evaluation. We also construct two synthetic datasets using large language models (LLMs), and observe that these are conducive to models that perform significantly well on cyclic generation of text, but less so on cyclic generation of KGs, probably because of a lack of a consistent underlying ontology. △ Less

Submitted 20 September, 2023; originally announced September 2023.

Comments: 16 pages

arXiv:2305.09464 [pdf, other]

doi 10.1145/3555041.3589672

Growing and Serving Large Open-domain Knowledge Graphs

Authors: Ihab F. Ilyas, JP Lacerda, Yunyao Li, Umar Farooq Minhas, Ali Mousavi, Jeffrey Pound, Theodoros Rekatsinas, Chiraag Sumanth

Abstract: Applications of large open-domain knowledge graphs (KGs) to real-world problems pose many unique challenges. In this paper, we present extensions to Saga our platform for continuous construction and serving of knowledge at scale. In particular, we describe a pipeline for training knowledge graph embeddings that powers key capabilities such as fact ranking, fact verification, a related entities ser… ▽ More Applications of large open-domain knowledge graphs (KGs) to real-world problems pose many unique challenges. In this paper, we present extensions to Saga our platform for continuous construction and serving of knowledge at scale. In particular, we describe a pipeline for training knowledge graph embeddings that powers key capabilities such as fact ranking, fact verification, a related entities service, and support for entity linking. We then describe how our platform, including graph embeddings, can be leveraged to create a Semantic Annotation service that links unstructured Web documents to entities in our KG. Semantic annotation of the Web effectively expands our knowledge graph with edges to open-domain Web content which can be used in various search and ranking problems. Finally, we leverage annotated Web documents to drive Open-domain Knowledge Extraction. This targeted extraction framework identifies important coverage issues in the KG, then finds relevant data sources for target entities on the Web and extracts missing information to enrich the KG. Finally, we describe adaptations to our knowledge platform needed to construct and serve private personal knowledge on-device. This includes private incremental KG construction, cross-device knowledge sync, and global knowledge enrichment. △ Less

Submitted 16 May, 2023; originally announced May 2023.

Comments: To be published in SIGMOD 2023

arXiv:2304.01926 [pdf]

High-Throughput Vector Similarity Search in Knowledge Graphs

Authors: Jason Mohoney, Anil Pacaci, Shihabur Rahman Chowdhury, Ali Mousavi, Ihab F. Ilyas, Umar Farooq Minhas, Jeffrey Pound, Theodoros Rekatsinas

Abstract: There is an increasing adoption of machine learning for encoding data into vectors to serve online recommendation and search use cases. As a result, recent data management systems propose augmenting query processing with online vector similarity search. In this work, we explore vector similarity search in the context of Knowledge Graphs (KGs). Motivated by the tasks of finding related KG queries a… ▽ More There is an increasing adoption of machine learning for encoding data into vectors to serve online recommendation and search use cases. As a result, recent data management systems propose augmenting query processing with online vector similarity search. In this work, we explore vector similarity search in the context of Knowledge Graphs (KGs). Motivated by the tasks of finding related KG queries and entities for past KG query workloads, we focus on hybrid vector similarity search (hybrid queries for short) where part of the query corresponds to vector similarity search and part of the query corresponds to predicates over relational attributes associated with the underlying data vectors. For example, given past KG queries for a song entity, we want to construct new queries for new song entities whose vector representations are close to the vector representation of the entity in the past KG query. But entities in a KG also have non-vector attributes such as a song associated with an artist, a genre, and a release date. Therefore, suggested entities must also satisfy query predicates over non-vector attributes beyond a vector-based similarity predicate. While these tasks are central to KGs, our contributions are generally applicable to hybrid queries. In contrast to prior works that optimize online queries, we focus on enabling efficient batch processing of past hybrid query workloads. We present our system, HQI, for high-throughput batch processing of hybrid queries. We introduce a workload-aware vector data partitioning scheme to tailor the vector index layout to the given workload and describe a multi-query optimization technique to reduce the overhead of vector similarity computations. We evaluate our methods on industrial workloads and demonstrate that HQI yields a 31x improvement in throughput for finding related KG queries compared to existing hybrid query processing approaches. △ Less

Submitted 4 April, 2023; originally announced April 2023.

Comments: 13 pages, 7 figures, to be published in ACM SIGMOD 2023

arXiv:2301.01795 [pdf, other]

PACO: Parts and Attributes of Common Objects

Authors: Vignesh Ramanathan, Anmol Kalia, Vladan Petrovic, Yi Wen, Baixue Zheng, Baishan Guo, Rui Wang, Aaron Marquez, Rama Kovvuri, Abhishek Kadian, Amir Mousavi, Yiwen Song, Abhimanyu Dubey, Dhruv Mahajan

Abstract: Object models are gradually progressing from predicting just category labels to providing detailed descriptions of object instances. This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part cate… ▽ More Object models are gradually progressing from predicting just category labels to providing detailed descriptions of object instances. This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part categories and 55 attributes across image (LVIS) and video (Ego4D) datasets. We provide 641K part masks annotated across 260K object boxes, with roughly half of them exhaustively annotated with attributes as well. We design evaluation metrics and provide benchmark results for three tasks on the dataset: part mask segmentation, object and part attribute prediction and zero-shot instance detection. Dataset, models, and code are open-sourced at https://github.com/facebookresearch/paco. △ Less

Submitted 4 January, 2023; originally announced January 2023.

arXiv:2211.04088 [pdf, other]

A Penalty-Based Method for Communication-Efficient Decentralized Bilevel Programming

Authors: Parvin Nazari, Ahmad Mousavi, Davoud Ataee Tarzanagh, George Michailidis

Abstract: Bilevel programming has recently received attention in the literature due to its wide range of applications, including reinforcement learning and hyper-parameter optimization. However, it is widely assumed that the underlying bilevel optimization problem is solved either by a single machine or, in the case of multiple machines connected in a star-shaped network, i.e., in a federated learning setti… ▽ More Bilevel programming has recently received attention in the literature due to its wide range of applications, including reinforcement learning and hyper-parameter optimization. However, it is widely assumed that the underlying bilevel optimization problem is solved either by a single machine or, in the case of multiple machines connected in a star-shaped network, i.e., in a federated learning setting. The latter approach suffers from a high communication cost on the central node (e.g., parameter server). Hence, there is an interest in developing methods that solve bilevel optimization problems in a communication-efficient, decentralized manner. To that end, this paper introduces a penalty function-based decentralized algorithm with theoretical guarantees for this class of optimization problems. Specifically, a distributed alternating gradient-type algorithm for solving consensus bilevel programming over a decentralized network is developed. A key feature of the proposed algorithm is the estimation of the hyper-gradient of the penalty function through decentralized computation of matrix-vector products and a few vector communications. The estimation is integrated into an alternating algorithm for solving the penalized reformulation of the bilevel optimization problem. Under appropriate step sizes and penalty parameters, our theoretical framework ensures non-asymptotic convergence to the optimal solution of the original problem under various convexity conditions. Our theoretical result highlights improvements in the iteration complexity of decentralized bilevel optimization, all while making efficient use of vector communication. Empirical results demonstrate that the proposed method performs well in real-world settings. △ Less

Submitted 10 October, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: To appear in Automatica

arXiv:2209.13716 [pdf, other]

doi 10.1109/LSP.2021.3068616

Hamiltonian Adaptive Importance Sampling

Authors: Ali Mousavi, Reza Monsefi, Víctor Elvira

Abstract: Importance sampling (IS) is a powerful Monte Carlo (MC) methodology for approximating integrals, for instance in the context of Bayesian inference. In IS, the samples are simulated from the so-called proposal distribution, and the choice of this proposal is key for achieving a high performance. In adaptive IS (AIS) methods, a set of proposals is iteratively improved. AIS is a relevant and timely m… ▽ More Importance sampling (IS) is a powerful Monte Carlo (MC) methodology for approximating integrals, for instance in the context of Bayesian inference. In IS, the samples are simulated from the so-called proposal distribution, and the choice of this proposal is key for achieving a high performance. In adaptive IS (AIS) methods, a set of proposals is iteratively improved. AIS is a relevant and timely methodology although many limitations remain yet to be overcome, e.g., the curse of dimensionality in high-dimensional and multi-modal problems. Moreover, the Hamiltonian Monte Carlo (HMC) algorithm has become increasingly popular in machine learning and statistics. HMC has several appealing features such as its exploratory behavior, especially in high-dimensional targets, when other methods suffer. In this paper, we introduce the novel Hamiltonian adaptive importance sampling (HAIS) method. HAIS implements a two-step adaptive process with parallel HMC chains that cooperate at each iteration. The proposed HAIS efficiently adapts a population of proposals, extracting the advantages of HMC. HAIS can be understood as a particular instance of the generic layered AIS family with an additional resampling step. HAIS achieves a significant performance improvement in high-dimensional problems w.r.t. state-of-the-art algorithms. We discuss the statistical properties of HAIS and show its high performance in two challenging examples. △ Less

Submitted 27 September, 2022; originally announced September 2022.

Journal ref: in IEEE Signal Processing Letters, vol. 28, pp. 713-717, 2021

arXiv:2104.01331 [pdf, other]

Sparse Universum Quadratic Surface Support Vector Machine Models for Binary Classification

Authors: Hossein Moosaei, Ahmad Mousavi, Milan Hladík, Zheming Gao

Abstract: In binary classification, kernel-free linear or quadratic support vector machines are proposed to avoid dealing with difficulties such as finding appropriate kernel functions or tuning their hyper-parameters. Furthermore, Universum data points, which do not belong to any class, can be exploited to embed prior knowledge into the corresponding models so that the generalization performance is improve… ▽ More In binary classification, kernel-free linear or quadratic support vector machines are proposed to avoid dealing with difficulties such as finding appropriate kernel functions or tuning their hyper-parameters. Furthermore, Universum data points, which do not belong to any class, can be exploited to embed prior knowledge into the corresponding models so that the generalization performance is improved. In this paper, we design novel kernel-free Universum quadratic surface support vector machine models. Further, we propose the L1 norm regularized version that is beneficial for detecting potential sparsity patterns in the Hessian of the quadratic surface and reducing to the standard linear models if the data points are (almost) linearly separable. The proposed models are convex such that standard numerical solvers can be utilized for solving them. Nonetheless, we formulate a least squares version of the L1 norm regularized model and next, design an effective tailored algorithm that only requires solving one linear system. Several theoretical properties of these models are then reported/proved as well. We finally conduct numerical experiments on both artificial and public benchmark data sets to demonstrate the feasibility and effectiveness of the proposed models. △ Less

Submitted 3 April, 2021; originally announced April 2021.

arXiv:2102.00820 [pdf]

Adaptive Neuro Fuzzy Networks based on Quantum Subtractive Clustering

Authors: Ali Mousavi, Mehrdad Jalali, Mahdi Yaghoubi

Abstract: Data mining techniques can be used to discover useful patterns by exploring and analyzing data and it's feasible to synergitically combine machine learning tools to discover fuzzy classification rules.In this paper, an adaptive Neuro fuzzy network with TSK fuzzy type and an improved quantum subtractive clustering has been developed. Quantum clustering (QC) is an intuition from quantum mechanics wh… ▽ More Data mining techniques can be used to discover useful patterns by exploring and analyzing data and it's feasible to synergitically combine machine learning tools to discover fuzzy classification rules.In this paper, an adaptive Neuro fuzzy network with TSK fuzzy type and an improved quantum subtractive clustering has been developed. Quantum clustering (QC) is an intuition from quantum mechanics which uses Schrodinger potential and time-consuming gradient descent method. The principle advantage and shortcoming of QC is analyzed and based on its shortcomings, an improved algorithm through a subtractive clustering method is proposed. Cluster centers represent a general model with essential characteristics of data which can be use as premise part of fuzzy rules.The experimental results revealed that proposed Anfis based on quantum subtractive clustering yielded good approximation and generalization capabilities and impressive decrease in the number of fuzzy rules and network output accuracy in comparison with traditional methods. △ Less

Submitted 26 January, 2021; originally announced February 2021.

Comments: Proceedings of the International Conference on Data Science (IC-DATA), The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (WorldComp), Nevada, USA, 2011

arXiv:2011.02714 [pdf]

Obstacles in Fully Automatic Program Repair: A survey

Authors: S. Amirhossein Mousavi, Donya Azizi Babani, Francesco Flammini

Abstract: The current article is an interdisciplinary attempt to decipher automatic program repair processes. The review is done by the manner typical to human science known as diffraction. We attempt to spot a gap in the literature of self-healing and self-repair operations and further investigate the approaches that would enable us to tackle the problems we face. As a conclusion, we suggest a shift in the… ▽ More The current article is an interdisciplinary attempt to decipher automatic program repair processes. The review is done by the manner typical to human science known as diffraction. We attempt to spot a gap in the literature of self-healing and self-repair operations and further investigate the approaches that would enable us to tackle the problems we face. As a conclusion, we suggest a shift in the current approach to automatic program repair operations in order to attain our goals. The emphasis of this review is to achieve full automation. Several obstacles are shortly mentioned in the current essay but the main shortage that is covered is the overfitting obstacle, and this particular problem is investigated in the stream that is related to full automation of the repair process. △ Less

Submitted 5 November, 2020; originally announced November 2020.

arXiv:2007.13893 [pdf, other]

Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders

Authors: Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi

Abstract: Off-policy evaluation (OPE) in reinforcement learning is an important problem in settings where experimentation is limited, such as education and healthcare. But, in these very same settings, observed actions are often confounded by unobserved variables making OPE even more difficult. We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where… ▽ More Off-policy evaluation (OPE) in reinforcement learning is an important problem in settings where experimentation is limited, such as education and healthcare. But, in these very same settings, observed actions are often confounded by unobserved variables making OPE even more difficult. We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where states and actions can act as proxies for the unobserved confounders. We show how, given only a latent variable model for states and actions, policy value can be identified from off-policy data. Our method involves two stages. In the first, we show how to use proxies to estimate stationary distribution ratios, extending recent work on breaking the curse of horizon to the confounded setting. In the second, we show optimal balancing can be combined with such learned ratios to obtain policy value while avoiding direct modeling of reward functions. We establish theoretical guarantees of consistency, and benchmark our method empirically. △ Less

Submitted 27 July, 2020; originally announced July 2020.

arXiv:2005.03059 [pdf]

CovidCTNet: An Open-Source Deep Learning Approach to Identify Covid-19 Using CT Image

Authors: Tahereh Javaheri, Morteza Homayounfar, Zohreh Amoozgar, Reza Reiazi, Fatemeh Homayounieh, Engy Abbas, Azadeh Laali, Amir Reza Radmard, Mohammad Hadi Gharib, Seyed Ali Javad Mousavi, Omid Ghaemi, Rosa Babaei, Hadi Karimi Mobin, Mehdi Hosseinzadeh, Rana Jahanban-Esfahlan, Khaled Seidi, Mannudeep K. Kalra, Guanglan Zhang, L. T. Chitkushev, Benjamin Haibe-Kains, Reza Malekzadeh, Reza Rawassizadeh

Abstract: Coronavirus disease 2019 (Covid-19) is highly contagious with limited treatment options. Early and accurate diagnosis of Covid-19 is crucial in reducing the spread of the disease and its accompanied mortality. Currently, detection by reverse transcriptase polymerase chain reaction (RT-PCR) is the gold standard of outpatient and inpatient detection of Covid-19. RT-PCR is a rapid method, however, it… ▽ More Coronavirus disease 2019 (Covid-19) is highly contagious with limited treatment options. Early and accurate diagnosis of Covid-19 is crucial in reducing the spread of the disease and its accompanied mortality. Currently, detection by reverse transcriptase polymerase chain reaction (RT-PCR) is the gold standard of outpatient and inpatient detection of Covid-19. RT-PCR is a rapid method, however, its accuracy in detection is only ~70-75%. Another approved strategy is computed tomography (CT) imaging. CT imaging has a much higher sensitivity of ~80-98%, but similar accuracy of 70%. To enhance the accuracy of CT imaging detection, we developed an open-source set of algorithms called CovidCTNet that successfully differentiates Covid-19 from community-acquired pneumonia (CAP) and other lung diseases. CovidCTNet increases the accuracy of CT imaging detection to 90% compared to radiologists (70%). The model is designed to work with heterogeneous and small sample sizes independent of the CT imaging hardware. In order to facilitate the detection of Covid-19 globally and assist radiologists and physicians in the screening process, we are releasing all algorithms and parametric details in an open-source format. Open-source sharing of our CovidCTNet enables developers to rapidly improve and optimize services, while preserving user privacy and data ownership. △ Less

Submitted 15 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 5 figures

arXiv:2003.11126 [pdf, other]

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Authors: Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou

Abstract: Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, \cite{liu18breaking} proposed an approach that avoids the \emph{curse of horizon} suffered by typical importance-sampling-based methods. While showing promising… ▽ More Off-policy estimation for long-horizon problems is important in many real-life applications such as healthcare and robotics, where high-fidelity simulators may not be available and on-policy evaluation is expensive or impossible. Recently, \cite{liu18breaking} proposed an approach that avoids the \emph{curse of horizon} suffered by typical importance-sampling-based methods. While showing promising results, this approach is limited in practice as it requires data be drawn from the \emph{stationary distribution} of a \emph{known} behavior policy. In this work, we propose a novel approach that eliminates such limitations. In particular, we formulate the problem as solving for the fixed point of a certain operator. Using tools from Reproducing Kernel Hilbert Spaces (RKHSs), we develop a new estimator that computes importance ratios of stationary distributions, without knowledge of how the off-policy data are collected. We analyze its asymptotic consistency and finite-sample generalization. Experiments on benchmarks verify the effectiveness of our approach. △ Less

Submitted 24 March, 2020; originally announced March 2020.

Comments: Published at ICLR 2020

arXiv:1908.08616 [pdf, other]

doi 10.3934/jimo.2021046

Quadratic Surface Support Vector Machine with L1 Norm Regularization

Authors: Ahmad Mousavi, Zheming Gao, Lanshan Han, Alvin Lim

Abstract: We propose $\ell_1$ norm regularized quadratic surface support vector machine models for binary classification in supervised learning. We establish their desired theoretical properties, including the existence and uniqueness of the optimal solution, reduction to the standard SVMs over (almost) linearly separable data sets, and detection of true sparsity pattern over (almost) quadratically separabl… ▽ More We propose $\ell_1$ norm regularized quadratic surface support vector machine models for binary classification in supervised learning. We establish their desired theoretical properties, including the existence and uniqueness of the optimal solution, reduction to the standard SVMs over (almost) linearly separable data sets, and detection of true sparsity pattern over (almost) quadratically separable data sets if the penalty parameter of $\ell_1$ norm is large enough. We also demonstrate their promising practical efficiency by conducting various numerical experiments on both synthetic and publicly available benchmark data sets. △ Less

Submitted 30 January, 2021; v1 submitted 22 August, 2019; originally announced August 2019.

arXiv:1908.01014 [pdf, other]

A Survey on Compressive Sensing: Classical Results and Recent Advancements

Authors: Ahmad Mousavi, Mehdi Rezaee, Ramin Ayanzadeh

Abstract: Recovering sparse signals from linear measurements has demonstrated outstanding utility in a vast variety of real-world applications. Compressive sensing is the topic that studies the associated raised questions for the possibility of a successful recovery. This topic is well-nourished and numerous results are available in the literature. However, their dispersity makes it challenging and time-con… ▽ More Recovering sparse signals from linear measurements has demonstrated outstanding utility in a vast variety of real-world applications. Compressive sensing is the topic that studies the associated raised questions for the possibility of a successful recovery. This topic is well-nourished and numerous results are available in the literature. However, their dispersity makes it challenging and time-consuming for readers and practitioners to quickly grasp its main ideas and classical algorithms, and further touch upon the recent advancements in this surging field. Besides, the sparsity notion has already demonstrated its effectiveness in many contemporary fields. Thus, these results are useful and inspiring for further investigation of related questions in these emerging fields from new perspectives. In this survey, we gather and overview vital classical tools and algorithms in compressive sensing and describe significant recent advancements. We conclude this survey by a numerical comparison of the performance of described approaches on an interesting application. △ Less

Submitted 22 July, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

arXiv:1806.10744 [pdf]

Intention to Use SMS Vaccination Reminder and Management System among Health Centers in Malaysia: The Mediating Effect of Attitude

Authors: Kamal Karkonasasi, Cheah Yu-N, Seyed Aliakbar Mousavi

Abstract: The purpose of this study is to find out influential factors of intention to use the proposed system among health centers by considering the technology acceptance model as the underlying theory. The proposed model of this study is tested using data collected from self-directed questionnaires filled up by 70 nurses in government hospitals and government clinics in Malaysia. A research method, which… ▽ More The purpose of this study is to find out influential factors of intention to use the proposed system among health centers by considering the technology acceptance model as the underlying theory. The proposed model of this study is tested using data collected from self-directed questionnaires filled up by 70 nurses in government hospitals and government clinics in Malaysia. A research method, which is based on the multi-analytical approach of Multiple Regression Analysis and Artificial Neural Networks, helps to refine the results of this study. The compatibility of the proposed system only has a significant and positive effect on attitude to use the system among other factors. Moreover, health centers' intention to use the system is further influenced by their perceived usefulness than their attitude to use. The mediating effect of Attitude is also proved. However, there is not any statistically significant difference in Intention scores accounting for the participants' demographic characteristics. The implication of this study is that by developing the system, which fits to the current procedures of health centers, we give them positive feelings. IT educational programs also should be provided for nurses to identify the benefits of the proposed system. It is also important to consider the indirect effects from independent variables on intention to use the system. However, the system does not require to consider the demographic characteristics of its end users. △ Less

Submitted 27 June, 2018; originally announced June 2018.

arXiv:1805.10531 [pdf, other]

Unsupervised Learning with Stein's Unbiased Risk Estimator

Authors: Christopher A. Metzler, Ali Mousavi, Reinhard Heckel, Richard G. Baraniuk

Abstract: Learning from unlabeled and noisy data is one of the grand challenges of machine learning. As such, it has seen a flurry of research with new ideas proposed continuously. In this work, we revisit a classical idea: Stein's Unbiased Risk Estimator (SURE). We show that, in the context of image recovery, SURE and its generalizations can be used to train convolutional neural networks (CNNs) for a range… ▽ More Learning from unlabeled and noisy data is one of the grand challenges of machine learning. As such, it has seen a flurry of research with new ideas proposed continuously. In this work, we revisit a classical idea: Stein's Unbiased Risk Estimator (SURE). We show that, in the context of image recovery, SURE and its generalizations can be used to train convolutional neural networks (CNNs) for a range of image denoising and recovery problems without any ground truth data. Specifically, our goal is to reconstruct an image $x$ from a noisy linear transformation (measurement) of the image. We consider two scenarios: one where no additional data is available and one where we have measurements of other images that are drawn from the same noisy distribution as $x$, but have no access to the clean images. Such is the case, for instance, in the context of medical imaging, microscopy, and astronomy, where noise-less ground truth data is rarely available. We show that in this situation, SURE can be used to estimate the mean-squared-error loss associated with an estimate of $x$. Using this estimate of the loss, we train networks to perform denoising and compressed sensing recovery. In addition, we also use the SURE framework to partially explain and improve upon an intriguing results presented by Ulyanov et al. in "Deep Image Prior": that a network initialized with random weights and fit to a single noisy image can effectively denoise that image. Public implementations of the networks and methods described in this paper can be found at https://github.com/ricedsp/D-AMP_Toolbox. △ Less

Submitted 22 July, 2020; v1 submitted 26 May, 2018; originally announced May 2018.

arXiv:1707.03386 [pdf, ps, other]

DeepCodec: Adaptive Sensing and Recovery via Deep Convolutional Neural Networks

Authors: Ali Mousavi, Gautam Dasarathy, Richard G. Baraniuk

Abstract: In this paper we develop a novel computational sensing framework for sensing and recovering structured signals. When trained on a set of representative signals, our framework learns to take undersampled measurements and recover signals from them using a deep convolutional neural network. In other words, it learns a transformation from the original signals to a near-optimal number of undersampled m… ▽ More In this paper we develop a novel computational sensing framework for sensing and recovering structured signals. When trained on a set of representative signals, our framework learns to take undersampled measurements and recover signals from them using a deep convolutional neural network. In other words, it learns a transformation from the original signals to a near-optimal number of undersampled measurements and the inverse transformation from measurements to signals. This is in contrast to traditional compressive sensing (CS) systems that use random linear measurements and convex optimization or iterative algorithms for signal recovery. We compare our new framework with $\ell_1$-minimization from the phase transition point of view and demonstrate that it outperforms $\ell_1$-minimization in the regions of phase transition plot where $\ell_1$-minimization cannot recover the exact solution. In addition, we experimentally demonstrate how learning measurements enhances the overall recovery performance, speeds up training of recovery framework, and leads to having fewer parameters to learn. △ Less

Submitted 11 July, 2017; originally announced July 2017.

arXiv:1704.06625 [pdf, other]

Learned D-AMP: Principled Neural Network based Compressive Image Recovery

Authors: Christopher A. Metzler, Ali Mousavi, Richard G. Baraniuk

Abstract: Compressive image recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they can run orders of magnitude faster than existing techniques. However, these methods are largely unprincipled black boxes tha… ▽ More Compressive image recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they can run orders of magnitude faster than existing techniques. However, these methods are largely unprincipled black boxes that are difficult to train and often-times specific to a single measurement matrix. It was recently demonstrated that iterative sparse-signal-recovery algorithms can be "unrolled" to form interpretable deep networks. Taking inspiration from this work, we develop a novel neural network architecture that mimics the behavior of the denoising-based approximate message passing (D-AMP) algorithm. We call this new network Learned D-AMP (LDAMP). The LDAMP network is easy to train, can be applied to a variety of different measurement matrices, and comes with a state-evolution heuristic that accurately predicts its performance. Most importantly, it outperforms the state-of-the-art BM3D-AMP and NLR-CS algorithms in terms of both accuracy and run time. At high resolutions, and when used with sensing matrices that have fast implementations, LDAMP runs over $50\times$ faster than BM3D-AMP and hundreds of times faster than NLR-CS. △ Less

Submitted 6 November, 2017; v1 submitted 21 April, 2017; originally announced April 2017.

arXiv:1701.03891 [pdf, ps, other]

Learning to Invert: Signal Recovery via Deep Convolutional Networks

Authors: Ali Mousavi, Richard G. Baraniuk

Abstract: The promise of compressive sensing (CS) has been offset by two significant challenges. First, real-world data is not exactly sparse in a fixed basis. Second, current high-performance recovery algorithms are slow to converge, which limits CS to either non-real-time applications or scenarios where massive back-end computing is available. In this paper, we attack both of these challenges head-on by d… ▽ More The promise of compressive sensing (CS) has been offset by two significant challenges. First, real-world data is not exactly sparse in a fixed basis. Second, current high-performance recovery algorithms are slow to converge, which limits CS to either non-real-time applications or scenarios where massive back-end computing is available. In this paper, we attack both of these challenges head-on by developing a new signal recovery framework we call {\em DeepInverse} that learns the inverse transformation from measurement vectors to signals using a {\em deep convolutional network}. When trained on a set of representative images, the network learns both a representation for the signals (addressing challenge one) and an inverse map approximating a greedy or convex recovery algorithm (addressing challenge two). Our experiments indicate that the DeepInverse network closely approximates the solution produced by state-of-the-art CS recovery algorithms yet is hundreds of times faster in run time. The tradeoff for the ultrafast run time is a computationally intensive, off-line training procedure typical to deep networks. However, the training needs to be completed only once, which makes the approach attractive for a host of sparse recovery problems. △ Less

Submitted 14 January, 2017; originally announced January 2017.

Comments: Accepted at The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing

arXiv:1511.01017 [pdf, ps, other]

Consistent Parameter Estimation for LASSO and Approximate Message Passing

Authors: Ali Mousavi, Arian Maleki, Richard G. Baraniuk

Abstract: We consider the problem of recovering a vector $β_o \in \mathbb{R}^p$ from $n$ random and noisy linear observations $y= Xβ_o + w$, where $X$ is the measurement matrix and $w$ is noise. The LASSO estimate is given by the solution to the optimization problem $\hatβ_λ = \arg \min_β \frac{1}{2} \|y-Xβ\|_2^2 + λ\| β\|_1$. Among the iterative algorithms that have been proposed for solving this optimizat… ▽ More We consider the problem of recovering a vector $β_o \in \mathbb{R}^p$ from $n$ random and noisy linear observations $y= Xβ_o + w$, where $X$ is the measurement matrix and $w$ is noise. The LASSO estimate is given by the solution to the optimization problem $\hatβ_λ = \arg \min_β \frac{1}{2} \|y-Xβ\|_2^2 + λ\| β\|_1$. Among the iterative algorithms that have been proposed for solving this optimization problem, approximate message passing (AMP) has attracted attention for its fast convergence. Despite significant progress in the theoretical analysis of the estimates of LASSO and AMP, little is known about their behavior as a function of the regularization parameter $λ$, or the thereshold parameters $τ^t$. For instance the following basic questions have not yet been studied in the literature: (i) How does the size of the active set $\|\hatβ^λ\|_0/p$ behave as a function of $λ$? (ii) How does the mean square error $\|\hatβ_λ - β_o\|_2^2/p$ behave as a function of $λ$? (iii) How does $\|β^t - β_o \|_2^2/p$ behave as a function of $τ^1, \ldots, τ^{t-1}$? Answering these questions will help in addressing practical challenges regarding the optimal tuning of $λ$ or $τ^1, τ^2, \ldots$. This paper answers these questions in the asymptotic setting and shows how these results can be employed in deriving simple and theoretically optimal approaches for tuning the parameters $τ^1, \ldots, τ^t$ for AMP or $λ$ for LASSO. It also explores the connection between the optimal tuning of the parameters of AMP and the optimal tuning of LASSO. △ Less

Submitted 4 November, 2015; v1 submitted 3 November, 2015; originally announced November 2015.

Comments: arXiv admin note: text overlap with arXiv:1309.5979

arXiv:1508.04073 [pdf, ps, other]

An Information-Theoretic Measure of Dependency Among Variables in Large Datasets

Authors: Ali Mousavi, Richard G. Baraniuk

Abstract: The maximal information coefficient (MIC), which measures the amount of dependence between two variables, is able to detect both linear and non-linear associations. However, computational cost grows rapidly as a function of the dataset size. In this paper, we develop a computationally efficient approximation to the MIC that replaces its dynamic programming step with a much simpler technique based… ▽ More The maximal information coefficient (MIC), which measures the amount of dependence between two variables, is able to detect both linear and non-linear associations. However, computational cost grows rapidly as a function of the dataset size. In this paper, we develop a computationally efficient approximation to the MIC that replaces its dynamic programming step with a much simpler technique based on the uniform partitioning of data grid. A variety of experiments demonstrate the quality of our approximation. △ Less

Submitted 17 August, 2015; originally announced August 2015.

arXiv:1508.04065 [pdf, ps, other]

doi 10.1109/ALLERTON.2015.7447163

A Deep Learning Approach to Structured Signal Recovery

Authors: Ali Mousavi, Ankit B. Patel, Richard G. Baraniuk

Abstract: In this paper, we develop a new framework for sensing and recovering structured signals. In contrast to compressive sensing (CS) systems that employ linear measurements, sparse representations, and computationally complex convex/greedy algorithms, we introduce a deep learning framework that supports both linear and mildly nonlinear measurements, that learns a structured representation from trainin… ▽ More In this paper, we develop a new framework for sensing and recovering structured signals. In contrast to compressive sensing (CS) systems that employ linear measurements, sparse representations, and computationally complex convex/greedy algorithms, we introduce a deep learning framework that supports both linear and mildly nonlinear measurements, that learns a structured representation from training data, and that efficiently computes a signal estimate. In particular, we apply a stacked denoising autoencoder (SDA), as an unsupervised feature learner. SDA enables us to capture statistical dependencies between the different elements of certain signals and improve signal recovery performance as compared to the CS approach. △ Less

Submitted 17 August, 2015; originally announced August 2015.

Journal ref: In Proceeding of 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)

arXiv:1311.0035 [pdf, ps, other]

Parameterless Optimal Approximate Message Passing

Authors: Ali Mousavi, Arian Maleki, Richard G. Baraniuk

Abstract: Iterative thresholding algorithms are well-suited for high-dimensional problems in sparse recovery and compressive sensing. The performance of this class of algorithms depends heavily on the tuning of certain threshold parameters. In particular, both the final reconstruction error and the convergence rate of the algorithm crucially rely on how the threshold parameter is set at each step of the alg… ▽ More Iterative thresholding algorithms are well-suited for high-dimensional problems in sparse recovery and compressive sensing. The performance of this class of algorithms depends heavily on the tuning of certain threshold parameters. In particular, both the final reconstruction error and the convergence rate of the algorithm crucially rely on how the threshold parameter is set at each step of the algorithm. In this paper, we propose a parameter-free approximate message passing (AMP) algorithm that sets the threshold parameter at each iteration in a fully automatic way without either having an information about the signal to be reconstructed or needing any tuning from the user. We show that the proposed method attains both the minimum reconstruction error and the highest convergence rate. Our method is based on applying the Stein unbiased risk estimate (SURE) along with a modified gradient descent to find the optimal threshold in each iteration. Motivated by the connections between AMP and LASSO, it could be employed to find the solution of the LASSO for the optimal regularization parameter. To the best of our knowledge, this is the first work concerning parameter tuning that obtains the fastest convergence rate with theoretical guarantees. △ Less

Submitted 31 October, 2013; originally announced November 2013.

arXiv:1309.5979 [pdf, other]

Asymptotic Analysis of LASSOs Solution Path with Implications for Approximate Message Passing

Authors: Ali Mousavi, Arian Maleki, Richard G. Baraniuk

Abstract: This paper concerns the performance of the LASSO (also knows as basis pursuit denoising) for recovering sparse signals from undersampled, randomized, noisy measurements. We consider the recovery of the signal $x_o \in \mathbb{R}^N$ from $n$ random and noisy linear observations $y= Ax_o + w$, where $A$ is the measurement matrix and $w$ is the noise. The LASSO estimate is given by the solution to th… ▽ More This paper concerns the performance of the LASSO (also knows as basis pursuit denoising) for recovering sparse signals from undersampled, randomized, noisy measurements. We consider the recovery of the signal $x_o \in \mathbb{R}^N$ from $n$ random and noisy linear observations $y= Ax_o + w$, where $A$ is the measurement matrix and $w$ is the noise. The LASSO estimate is given by the solution to the optimization problem $x_o$ with $\hat{x}_λ = \arg \min_x \frac{1}{2} \|y-Ax\|_2^2 + λ\|x\|_1$. Despite major progress in the theoretical analysis of the LASSO solution, little is known about its behavior as a function of the regularization parameter $λ$. In this paper we study two questions in the asymptotic setting (i.e., where $N \rightarrow \infty$, $n \rightarrow \infty$ while the ratio $n/N$ converges to a fixed number in $(0,1)$): (i) How does the size of the active set $\|\hat{x}_λ\|_0/N$ behave as a function of $λ$, and (ii) How does the mean square error $\|\hat{x}_λ - x_o\|_2^2/N$ behave as a function of $λ$? We then employ these results in a new, reliable algorithm for solving LASSO based on approximate message passing (AMP). △ Less

Submitted 23 September, 2013; originally announced September 2013.

arXiv:1201.6112 [pdf]

An Efficient Method for Mining Event-Related Potential Patterns

Authors: Seyed Aliakbar Mousavi, Muhammad Rafie Hj Arshad, Hasimah Hj Mohamed, Saleh Ali Alomari

Abstract: In the present paper, we propose a Neuroelectromagnetic Ontology Framework (NOF) for mining Event-related Potentials (ERP) patterns as well as the process. The aim for this research is to develop an infrastructure for mining, analysis and sharing the ERP domain ontologies. The outcome of this research is a Neuroelectromagnetic knowledge-based system. The framework has 5 stages: 1) Data pre-process… ▽ More In the present paper, we propose a Neuroelectromagnetic Ontology Framework (NOF) for mining Event-related Potentials (ERP) patterns as well as the process. The aim for this research is to develop an infrastructure for mining, analysis and sharing the ERP domain ontologies. The outcome of this research is a Neuroelectromagnetic knowledge-based system. The framework has 5 stages: 1) Data pre-processing and preparation; 2) Data mining application; 3) Rule Comparison and Evaluation; 4) Association rules Post-processing 5) Domain Ontologies. In 5th stage a new set of hidden rules can be discovered base on comparing association rules by domain ontologies and expert rules. △ Less

Submitted 30 January, 2012; originally announced January 2012.

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 6, No 1, November 2011

arXiv:1001.4423 [pdf]

A New Decoding Scheme for Errorless Codes for Overloaded CDMA with Active User Detection

Authors: Ali Mousavi, Pedram Pad, Farokh Marvasti

Abstract: Recently, a new class of binary codes for overloaded CDMA systems are proposed that not only has the ability of errorless communication but also suitable for detecting active users. These codes are called COWDA [1]. In [1], a Maximum Likelihood (ML) decoder is proposed for this class of codes. Although the proposed scheme of coding/decoding show impressive performance, the decoder can be improve… ▽ More Recently, a new class of binary codes for overloaded CDMA systems are proposed that not only has the ability of errorless communication but also suitable for detecting active users. These codes are called COWDA [1]. In [1], a Maximum Likelihood (ML) decoder is proposed for this class of codes. Although the proposed scheme of coding/decoding show impressive performance, the decoder can be improved. In this paper by assuming more practical conditions for the traffic in the system, we suggest an algorithm that increases the performance of the decoder several orders of magnitude (the Bit-Error-Rate (BER) is divided by a factor of 400 in some Eb/N0's The algorithm supposes the Poison distribution for the time of activation/deactivation of the users. △ Less

Submitted 25 January, 2010; originally announced January 2010.

Showing 1–44 of 44 results for author: Mousavi, A