Search | arXiv e-print repository

doi 10.1145/3715336.3735754

IKIWISI: An Interactive Visual Pattern Generator for Evaluating the Reliability of Vision-Language Models Without Ground Truth

Authors: Md Touhidul Islam, Imran Kabir, Md Alimoor Reza, Syed Masum Billah

Abstract: We present IKIWISI ("I Know It When I See It"), an interactive visual pattern generator for assessing vision-language models in video object recognition when ground truth is unavailable. IKIWISI transforms model outputs into a binary heatmap where green cells indicate object presence and red cells indicate object absence. This visualization leverages humans' innate pattern recognition abilities to evaluate model reliability. IKIWISI introduces "spy objects": adversarial instances users know are absent, to discern models hallucinating on nonexistent items. The tool functions as a cognitive audit mechanism, surfacing mismatches between human and machine perception by visualizing where models diverge from human understanding. Our study with 15 participants found that users considered IKIWISI easy to use, made assessments that correlated with objective metrics when available, and reached informed conclusions by examining only a small fraction of heatmap cells. This approach not only complements traditional evaluation methods through visual assessment of model behavior with custom object sets, but also reveals opportunities for improving alignment between human perception and machine understanding in vision-language systems.

Submitted 28 May, 2025; originally announced May 2025.

Comments: Accepted at DIS'25 (Funchal, Portugal)

arXiv:2505.18866 [pdf, ps, other]

Distribution-Aware Mobility-Assisted Decentralized Federated Learning

Authors: Md Farhamdur Reza, Reza Jahani, Richeng Jin, Huaiyu Dai

Abstract: Decentralized federated learning (DFL) has attracted significant attention due to its scalability and independence from a central server. In practice, some participating clients can be mobile, yet the impact of user mobility on DFL performance remains largely unexplored, despite its potential to facilitate communication and model convergence. In this work, we demonstrate that introducing a small fraction of mobile clients, even with random movement, can significantly improve the accuracy of DFL by facilitating information flow. To further enhance performance, we propose novel distribution-aware mobility patterns, where mobile clients strategically navigate the network, leveraging knowledge of data distributions and static client locations. The proposed moving strategies mitigate the impact of data heterogeneity and boost learning convergence. Extensive experiments validate the effectiveness of induced mobility in DFL and demonstrate the superiority of our proposed mobility patterns over random movement.

Submitted 24 May, 2025; originally announced May 2025.

Comments: Under review for possible publication in IEEE GLOBECOM 2025

arXiv:2505.07217 [pdf, ps, other]

Computation of Vector-Valued Invariants for a Finite Complex Reflection Group

Authors: Masashi Kosuda, Shoyu Nagaoka, Manabu Oura, A. K. M. Selim Reza

Abstract: We consider the complex reflection group $H_1$, identified as No. 8 in the Shephard-Todd classification. In this paper, we present computations of the vector-valued invariants associated with various representations of $H_1$. Additionally, we investigate the structure of the corresponding invariant rings.

Submitted 12 May, 2025; originally announced May 2025.

MSC Class: 20F55; 20C30; 13A50

arXiv:2505.03470 [pdf, other]

Blending 3D Geometry and Machine Learning for Multi-View Stereopsis

Authors: Vibhas Vats, Md. Alimoor Reza, David Crandall, Soon-heung Jung

Abstract: Traditional multi-view stereo (MVS) methods primarily depend on photometric and geometric consistency constraints. In contrast, modern learning-based algorithms often rely on the plane sweep algorithm to infer 3D geometry, applying explicit geometric consistency (GC) checks only as a post-processing step, with no impact on the learning process itself. In this work, we introduce GC MVSNet plus plus, a novel approach that actively enforces geometric consistency of reference view depth maps across multiple source views (multi view) and at various scales (multi scale) during the learning phase (see Fig. 1). This integrated GC check significantly accelerates the learning process by directly penalizing geometrically inconsistent pixels, effectively halving the number of training iterations compared to other MVS methods. Furthermore, we introduce a densely connected cost regularization network with two distinct block designs, simple and feature dense, optimized to harness dense feature connections for enhanced regularization. Extensive experiments demonstrate that our approach achieves a new state of the art on the DTU and BlendedMVS datasets and secures second place on the Tanks and Temples benchmark. To our knowledge, GC MVSNet plus plus is the first method to enforce multi-view, multi-scale supervised geometric consistency during learning. Our code is available.

Submitted 6 May, 2025; originally announced May 2025.

Comments: A pre-print -- paper under-review. arXiv admin note: substantial text overlap with arXiv:2310.19583

arXiv:2505.00410 [pdf, other]

Machine Learning Meets Transparency in Osteoporosis Risk Assessment: A Comparative Study of ML and Explainability Analysis

Authors: Farhana Elias, Md Shihab Reza, Muhammad Zawad Mahmud, Samiha Islam, Shahran Rahman Alve

Abstract: The present research tackles the difficulty of predicting osteoporosis risk via machine learning (ML) approaches, emphasizing the use of explainable artificial intelligence (XAI) to improve model transparency. Osteoporosis is a significant public health concern, sometimes remaining untreated owing to its asymptomatic characteristics, and early identification is essential to avert fractures. The research assesses six machine learning classifiers (Random Forest, Logistic Regression, XGBoost, AdaBoost, LightGBM, and Gradient Boosting) and utilizes a dataset based on clinical, demographic, and lifestyle variables. The models are refined using GridSearchCV to calibrate hyperparameters, with the objective of enhancing predictive efficacy. XGBoost had the greatest accuracy (91%) among the evaluated models, surpassing others in precision (0.92), recall (0.91), and F1-score (0.90). The research further integrates XAI approaches, such as SHAP, LIME, and Permutation Feature Importance, to elucidate the decision-making process of the optimal model. The study indicates that age is the primary determinant in forecasting osteoporosis risk, followed by hormonal alterations and familial history. These results corroborate clinical knowledge and affirm the models' therapeutic significance. The research underscores the significance of explainability in machine learning models for healthcare applications, guaranteeing that physicians can rely on the system's predictions. The report ultimately proposes directions for further research, such as validation across varied populations and the integration of supplementary biomarkers for enhanced predictive accuracy.

Submitted 9 May, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

Comments: Submitted in an international conference

arXiv:2504.16993 [pdf, other]

Micro-Transfer Printed Continuous-Wave and Mode-Locked Laser Integration at 800 nm on a Silicon Nitride Platform

Authors: Max Kiewiet, Stijn Cuyvers, Maximilien Billet, Konstantinos Akritidis, Valeria Bonito Oliva, Gaudhaman Jeevanandam, Sandeep Saseendran, Manuel Reza, Pol Van Dorpe, Roelof Jansen, Joost Brouckaert, Günther Roelkens, Kasper Van Gasse, Bart Kuyken

Abstract: Applications such as augmented and virtual reality (AR/VR), optical atomic clocks, and quantum computing require photonic integration of (near-)visible laser sources to enable commercialization at scale. The heterogeneous integration of III-V optical gain materials with low-loss silicon nitride waveguides enables complex photonic circuits with low-noise lasers on a single chip. Previous such demonstrations are mostly geared towards telecommunication wavelengths. At shorter wavelengths, limited options exist for efficient light coupling between III-V and silicon nitride waveguides. Recent advances in wafer-bonded devices at these wavelengths require complex coupling structures and suffer from poor heat dissipation. Here, we overcome these challenges and demonstrate a wafer-scale micro-transfer printing method integrating functional III-V devices directly onto the silicon substrate of a commercial silicon nitride platform. We show butt-coupling of efficient GaAs-based amplifiers operating at 800 nm with integrated saturable absorbers to silicon nitride cavities. This resulted in extended-cavity continuous-wave and mode-locked lasers generating pulse trains with repetition rates ranging from 3.2 to 9.2 GHz and excellent passive stability with a fundamental radio-frequency linewidth of 519 Hz. These results show the potential to build complex, high-performance fully-integrated laser systems at 800 nm using scalable manufacturing, promising advances for AR/VR, nonlinear photonics, timekeeping, quantum computing, and beyond.

Submitted 23 April, 2025; originally announced April 2025.

Comments: 9 pages main article, 7 pages supplementary. 4 main figures, 9 supplementary figures

arXiv:2504.12488 [pdf, other]

Co-Writing with AI, on Human Terms: Aligning Research with User Demands Across the Writing Process

Authors: Mohi Reza, Jeb Thomas-Mitchell, Peter Dushniku, Nathan Laundry, Joseph Jay Williams, Anastasia Kuzminykh

Abstract: As generative AI tools like ChatGPT become integral to everyday writing, critical questions arise about how to preserve writers' sense of agency and ownership when using these tools. Yet, a systematic understanding of how AI assistance affects different aspects of the writing process - and how this shapes writers' agency - remains underexplored. To address this gap, we conducted a systematic review of 109 HCI papers using the PRISMA approach. From this literature, we identify four overarching design strategies for AI writing support: structured guidance, guided exploration, active co-writing, and critical feedback - mapped across the four key cognitive processes in writing: planning, translating, reviewing, and monitoring. We complement this analysis with interviews of 15 writers across diverse domains. Our findings reveal that writers' desired levels of AI intervention vary across the writing process: content-focused writers (e.g., academics) prioritize ownership during planning, while form-focused writers (e.g., creatives) value control over translating and reviewing. Writers' preferences are also shaped by contextual goals, values, and notions of originality and authorship. By examining when ownership matters, what writers want to own, and how AI interactions shape agency, we surface both alignment and gaps between research and user needs. Our findings offer actionable design guidance for developing human-centered writing tools for co-writing with AI, on human terms.

Submitted 16 April, 2025; originally announced April 2025.

ACM Class: H.5.2; I.2.7; I.2.6; I.7.2

arXiv:2504.02750 [pdf]

Investigation of the influence of electrostatic excitation on instabilities and electron transport in ExB plasma configurations

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll, Benedict Rose

Abstract: Partially magnetized plasmas in ExB configurations - where the electric and magnetic fields are mutually perpendicular - exhibit a cross-field transport behavior, which is widely believed to be dominantly governed by complex instability-driven mechanisms. This phenomenon plays a crucial role in a variety of plasma technologies, including Hall thrusters, where azimuthal instabilities significantly influence electron confinement and, hence, device performance. While the impact of prominent plasma instabilities, such as the electron cyclotron drift instability (ECDI) and the modified two-stream instability (MTSI) on cross-field transport of electron species is well recognized and widely studied, strategies for actively manipulating these dynamics remain underexplored. In this study, we investigate the effect of targeted wave excitation on instability evolution and electron transport using one- and two-dimensional particle-in-cell simulations of representative plasma discharge configurations. A time-varying electric field is applied axially to modulate the spectral energy distribution of the instabilities across a range of forcing frequencies and amplitudes. Our results reveal that the so-called "unsteady forcing" can both suppress and amplify instability modes depending on excitation parameters. In particular, across both 1D and 2D simulation configurations, forcing near 40 MHz effectively reduces ECDI amplitude and decreases axial electron transport by about 30%, while high-frequency excitation near the electron cyclotron frequency induces spectral broadening, inverse energy cascades, and enhanced transport. These findings point to the role of nonlinear frequency locking and energy pathway disruption as mechanisms for modifying instability-driven transport. Our results offer insights into potential pathways to enhance plasma confinement and control in next-generation ExB devices.

Submitted 3 April, 2025; originally announced April 2025.

Comments: 20 pages, 15 figures

arXiv:2503.12827 [pdf, other]

GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack

Authors: Md Farhamdur Reza, Richeng Jin, Tianfu Wu, Huaiyu Dai

Abstract: Existing score-based adversarial attacks mainly focus on crafting $top$-1 adversarial examples against classifiers with single-label classification. Their attack success rate and query efficiency are often less than satisfactory, particularly under small perturbation requirements; moreover, the vulnerability of classifiers with multi-label learning is yet to be studied. In this paper, we propose a comprehensive surrogate-free score-based attack, named geometric score-based black-box attack (GSBA$^K$), to craft adversarial examples in an aggressive $top$-$K$ setting for both untargeted and targeted attacks, where the goal is to change the $top$-$K$ predictions of the target classifier. We introduce novel gradient-based methods to find a good initial boundary point to attack. Our iterative method employs novel gradient estimation techniques, particularly effective in the $top$-$K$ setting, on the decision boundary to effectively exploit the geometry of the decision boundary. Additionally, GSBA$^K$ can be used to attack classifiers with $top$-$K$ multi-label learning. Extensive experimental results on ImageNet and PASCAL VOC datasets validate the effectiveness of GSBA$^K$ in crafting $top$-$K$ adversarial examples.

Submitted 30 May, 2025; v1 submitted 17 March, 2025; originally announced March 2025.

Comments: License changed to CC BY 4.0 to align with ICLR 2025. No changes to content. Published at: https://openreview.net/forum?id=htX7AoHyln

arXiv:2503.12663 [pdf, other]

Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding

Authors: Imran Kabir, Md Alimoor Reza, Syed Billah

Abstract: Large multimodal models (LMMs) are increasingly integrated into autonomous driving systems for user interaction. However, their limitations in fine-grained spatial reasoning pose challenges for system interpretability and user trust. We introduce Logic-RAG, a novel Retrieval-Augmented Generation (RAG) framework that improves LMMs' spatial understanding in driving scenarios. Logic-RAG constructs a dynamic knowledge base (KB) about object-object relationships in first-order logic (FOL) using a perception module, a query-to-logic embedder, and a logical inference engine. We evaluated Logic-RAG on visual-spatial queries using both synthetic and real-world driving videos. When using popular LMMs (GPT-4V, Claude 3.5) as proxies for an autonomous driving system, these models achieved only 55% accuracy on synthetic driving scenes and under 75% on real-world driving scenes. Augmenting them with Logic-RAG increased their accuracies to over 80% and 90%, respectively. An ablation study showed that even without logical inference, the fact-based context constructed by Logic-RAG alone improved accuracy by 15%. Logic-RAG is extensible: it allows seamless replacement of individual components with improved versions and enables domain experts to compose new knowledge in both FOL and natural language. In sum, Logic-RAG addresses critical spatial reasoning deficiencies in LMMs for autonomous driving applications. Code and data are available at https://github.com/Imran2205/LogicRAG.

Submitted 16 March, 2025; originally announced March 2025.

arXiv:2502.11203 [pdf]

Multiscale autonomous forecasting of plasma systems' dynamics using neural networks

Authors: Farbod Faraji, Maryam Reza

Abstract: Plasma systems exhibit complex multiscale dynamics, resolving which poses significant challenges for conventional numerical simulations. Machine learning (ML) offers an alternative by learning data-driven representations of these dynamics. Yet existing ML time-stepping models suffer from error accumulation, instability, and limited long-term forecasting horizons. This paper demonstrates the application of a hierarchical multiscale neural network architecture for autonomous plasma forecasting. The framework integrates multiple neural networks trained across different temporal scales to capture both fine-scale and large-scale behaviors while mitigating compounding error in recursive evaluation. Fine-scale networks accurately resolve fast-evolving features, while coarse-scale networks provide broader temporal context, reducing the frequency of recursive updates and limiting the accumulation of small prediction errors over time. We first evaluate the method using canonical nonlinear dynamical systems and compare its performance against classical single-scale neural networks. The results demonstrate that single-scale neural networks experience rapid divergence due to recursive error accumulation, whereas the multiscale approach improves stability and extends prediction horizons. Next, our ML model is applied to two plasma configurations of high scientific and applied significance, demonstrating its ability to preserve spatial structures and capture multiscale plasma dynamics. By leveraging multiple time-stepping resolutions, the applied framework is shown to outperform conventional single-scale networks for the studied plasma test cases. The results of this work position the hierarchical multiscale neural network as a promising tool for efficient plasma forecasting and digital twin applications.

Submitted 28 February, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

Comments: 29 pages, 25 figures

arXiv:2501.17823 [pdf, ps, other]

Robust Multimodal Learning via Cross-Modal Proxy Tokens

Authors: Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif

Abstract: Multimodal models often experience a significant performance drop when one or more modalities are missing during inference. To address this challenge, we propose a simple yet effective approach that enhances robustness to missing modalities while maintaining strong performance when all modalities are available. Our method introduces cross-modal proxy tokens (CMPTs), which approximate the class token of a missing modality by attending only to the tokens of the available modality without requiring explicit modality generation or auxiliary networks. To efficiently learn these approximations with minimal computational overhead, we employ low-rank adapters in frozen unimodal encoders and jointly optimize an alignment loss with a task-specific loss. Extensive experiments on five multimodal datasets show that our method outperforms state-of-the-art baselines across various missing rates while achieving competitive results in complete-modality settings. Overall, our method offers a flexible and efficient solution for robust multimodal learning. The code and pretrained models will be released on GitHub.

Submitted 2 June, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

Comments: 21 Pages, 9 Figures, 6 Tables

arXiv:2501.16814 [pdf]

Dynamic Metadata Schemes in the Neutron and Photon Science Communities: A Case Study of X-Ray Photon Correlation Spectroscopy

Authors: Amir Tosson, Mohammad Reza, Christian Gutt

Abstract: Metadata is one of the most important aspects for advancing data management practices within all research communities. Definitions and schemes of metadata are inter alia of particular significance in the domain of neutron and photon scattering experiments covering a broad area of different scientific disciplines. The demand of describing continuously evolving highly nonstandardized experiments, including the resulting processed and published data, constitutes a considerable challenge for a static definition of metadata. Here, we present the concept of dynamic metadata for the neutron and photon scientific community, which enriches a static set of defined basic metadata. We explore the idea of dynamic metadata with the help of the use case of X-ray Photon Correlation Spectroscopy (XPCS), which is a synchrotron-based scattering technique that allows the investigation of nanoscale dynamic processes. It serves here as a demonstrator of how dynamic metadata can improve data acquisition, sharing, and analysis workflows. Our approach enables researchers to tailor metadata definitions dynamically and adapt them to the evolving demands of describing data and results from a diverse set of experiments. We demonstrate that dynamic metadata standards yield advantages that enhance data reproducibility, interoperability, and the dissemination of knowledge.

Submitted 28 January, 2025; originally announced January 2025.

Journal ref: Engineering and Technology International Journal of Computer and Information Engineering, Vol:18, No:5, 2024

arXiv:2412.19160 [pdf, other]

Cross-Spectral Vision Transformer for Biometric Authentication using Forehead Subcutaneous Vein Pattern and Periocular Pattern

Authors: Arun K. Sharma, Shubhobrata Bhattacharya, Motahar Reza, Bishakh Bhattacharya

Abstract: Traditional biometric systems have encountered significant setbacks due to various unavoidable factors; for example, face recognition-based biometrics fail when face masks are worn, and fingerprints create hygiene concerns. This paper proposes a novel lightweight cross-spectral vision transformer (CS-ViT) for biometric authentication using forehead subcutaneous vein patterns and periocular patterns, offering a promising alternative to traditional methods, capable of performing well even with face masks and without any physical touch. The proposed framework comprises a cross-spectral dual-channel architecture designed to handle two distinct biometric traits and to capture inter-dependencies in terms of relative spectral patterns. Each channel consists of a Phase-Only Correlation Cross-Spectral Attention (POC-CSA) that captures their individual as well as correlated patterns. The computation of cross-spectral attention using POC extracts the phase correlation in the spatial features. Therefore, it is robust against the resolution/intensity variations and illumination of the input images, assuming both biometric traits are from the same person. The lightweight model is suitable for edge device deployment. The performance of the proposed algorithm was rigorously evaluated using the Forehead Subcutaneous Vein Pattern and Periocular Biometric Pattern (FSVP-PBP) database. The results demonstrated the superiority of the algorithm over state-of-the-art methods, achieving a remarkable classification accuracy of 98.8% with the combined vein and periocular patterns.

Submitted 3 March, 2025; v1 submitted 26 December, 2024; originally announced December 2024.

Comments: Submitted to IEEE TPAMI

arXiv:2412.12532 [pdf, other]

Addressing Small and Imbalanced Medical Image Datasets Using Generative Models: A Comparative Study of DDPM and PGGANs with Random and Greedy K Sampling

Authors: Iman Khazrak, Shakhnoza Takhirova, Mostafa M. Rezaee, Mehrdad Yadollahi, Robert C. Green II, Shuteng Niu

Abstract: The development of accurate medical image classification models is often constrained by privacy concerns and data scarcity for certain conditions, leading to small and imbalanced datasets. To address these limitations, this study explores the use of generative models, such as Denoising Diffusion Probabilistic Models (DDPM) and Progressive Growing Generative Adversarial Networks (PGGANs), for dataset augmentation. The research introduces a framework to assess the impact of synthetic images generated by DDPM and PGGANs on the performance of four models: a custom CNN, Untrained VGG16, Pretrained VGG16, and Pretrained ResNet50. Experiments were conducted using Random Sampling and Greedy K Sampling to create small, imbalanced datasets. The synthetic images were evaluated using Frechet Inception Distance (FID) and compared to original datasets through classification metrics. The results show that DDPM consistently generated more realistic images with lower FID scores and significantly outperformed PGGANs in improving classification metrics across all models and datasets. Incorporating DDPM-generated images into the original datasets increased accuracy by up to 6%, enhancing model robustness and stability, particularly in imbalanced scenarios. Random Sampling demonstrated superior stability, while Greedy K Sampling offered diversity at the cost of higher FID scores. This study highlights the efficacy of DDPM in augmenting small, imbalanced medical image datasets, improving model performance by balancing the dataset and expanding its size.

Submitted 16 December, 2024; originally announced December 2024.

arXiv:2412.04183 [pdf]

Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach

Authors: Md Shihab Reza, Monirul Islam Mahmud, Ifti Azad Abeer, Nova Ahmed

Abstract: The development of computing has made credit scoring approaches possible, with various machine learning (ML) and deep learning (DL) techniques becoming more and more valuable. While complex models yield more accurate predictions, their interpretability is often weakened, which is a concern for credit scoring that places importance on decision fairness. As features of the dataset are a crucial factor for the credit scoring system, we implement Linear Discriminant Analysis (LDA) as a feature reduction technique, which reduces the burden of the model's complexity. We compared 6 different machine learning models, 1 deep learning model, and a hybrid model with and without using LDA. From the results, we found that our hybrid model, XG-DNN, outperformed other models with the highest accuracy of 99.45% and a 99% F1 score with LDA. Lastly, to interpret model decisions, we applied 2 different explainable AI techniques, LIME (local) and Morris Sensitivity Analysis (global). Through this research, we showed how feature reduction techniques can be used without affecting the performance and explainability of the model, which can be very useful in resource-constrained settings to optimize the computational workload.

Submitted 5 December, 2024; originally announced December 2024.

Comments: Accepted at the International Conference on Computer and Information Technology (ICCIT) 2024

arXiv:2411.05759 [pdf]

Latest progress on the reduced-order particle-in-cell scheme: II. Quasi-3D implementation and verification

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: Across many plasma applications, the underlying phenomena and interactions among the involved processes are known to exhibit three-dimensional characteristics. Furthermore, the global properties and evolution of plasma systems are often determined by a process called inverse energy cascade, where kinetic plasma processes at the microscopic scale interact and lead to macroscopic coherent structures. These structures can have a major impact on the stability of plasma discharges, with detrimental effects on the operation and performance of plasma technologies. Kinetic particle-in-cell (PIC) methods offer a sufficient level of fidelity to capture these processes and behaviors. However, three-dimensional PIC simulations that can cost-effectively overcome the curse of dimensionality and enable full-scale simulations of real-world time significance have remained elusive. Tackling the enormous computational cost issue associated with conventional PIC schemes, the computationally efficient reduced-order (RO) PIC approach provides a viable path to 3D simulations of real-size plasma systems. This part II paper builds upon the improvements to the RO-PIC's underpinning formulation discussed in part I and extends the novel "first-order" RO-PIC formulation to 3D. The resulting Quasi-3D (Q3D) implementation is rigorously verified in this paper, both at the module level of the Q3D reduced-dimension Poisson solver (RDPS) and at the global PIC code level. The plasma test cases employed correspond to 3D versions of the 2D configurations studied in Part I, including a 3D extension to the Diocotron instability problem. The detailed verifications of the Q3D RO-PIC confirm that it maintains the expected levels of cost-efficiency and accuracy, demonstrating the ability of the approach to indistinguishably reproduce full-3D simulation results at a fraction of the computational cost.

Submitted 8 November, 2024; originally announced November 2024.

Comments: 24 pages, 21 figures

arXiv:2411.05751 [pdf]

Latest progress on the reduced-order particle-in-cell scheme: I. refining the underlying formulation

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: The particle-in-cell (PIC) method is a well-established and widely used kinetic plasma modelling approach that provides a hybrid Lagrangian-Eulerian approach to solve the plasma kinetic equation. Despite its power in capturing details of the underlying physics of plasmas, conventional PIC implementations are associated with a significant computational cost, rendering their applications for real-world plasma science and engineering challenges impractical. The acceleration of the PIC method has thus become a topic of high interest, with several approaches having been pursued to this end. Among these, the concept of reduced-order (RO) PIC simulations, first introduced in 2023, provides a uniquely flexible and computationally efficient framework for kinetic plasma modelling - characteristics verified extensively in various plasma configurations. In this two-part article, we report the latest progress achieved on RO-PIC. Part I article revisits the original RO-PIC formulation and introduces refinements that substantially enhance the cost-efficiency and accuracy of the method. We discuss these refinements in comparison against the original formulation, illustrating the progression to a "first-order" implementation from the baseline "zeroth-order" one. In a detailed step-by-step verification, we first test the newly updated reduced-dimension Poisson solver (RDPS) in the first-order RO-PIC against its zeroth-order counterpart using test-case Poisson problems. Next, comparing against the zeroth-order version, we examine the performance of the complete first-order RO-PIC code in two-dimensional plasma problems. The detailed verifications demonstrate that the improvements in the RO-PIC formulation enable the approach to provide full-2D-equivalent results at a substantially lower (up to an order of magnitude) computational cost compared to the zeroth-order RO-PIC.

Submitted 8 November, 2024; originally announced November 2024.

Comments: 29 pages, 24 figures

arXiv:2411.02962 [pdf, ps, other]

Brown Halmos Operator Identity and Toeplitz Operators on the Dirichlet Space

Authors: Ashish Kujur, Md Ramiz Reza

Abstract: A well-known result of Brown and Halmos shows that the Toeplitz operators induced by $L^{\infty}(\mathbb T)$ symbols on the Hardy space of the unit disc $\mathbb D$ are characterized by the operator identity $T_{\bar{z}}AT_z=A,$ where $T_z, T_{\bar{z}}$ are the Toeplitz operators induced by the functions $z$ and $\bar{z}$ on the unit circle $\mathbb T$ respectively. In this paper we introduce and study a class of Toeplitz operators on the Dirichlet space $\mathcal{D} _0$ induced by a symbol class $\mathcal T(\mathcal D _0)= \overline{H^{\infty}_0(\mathbb D)} + \mathcal M(\mathcal D_0 ),$ where $H^{\infty}_0(\mathbb D)$ denotes the set of all bounded analytic functions on $\mathbb D$ vanishing at $0$ and $\mathcal M(\mathcal D _0)$ denotes the multiplier algebra of the Dirichlet space $\mathcal D_0.$ We find that the Toeplitz operators on the Dirichlet space $\mathcal D$ induced by the symbol class $\mathcal T(\mathcal D _0)$ are completely characterized by the operator identity $T_{\bar{z}}AT_z=A.$

Submitted 5 November, 2024; originally announced November 2024.

Comments: 17 pages

MSC Class: 47B35; 46E22; 47A62

arXiv:2410.16547 [pdf, other]

PromptHive: Bringing Subject Matter Experts Back to the Forefront with Collaborative Prompt Engineering for Educational Content Creation

Authors: Mohi Reza, Ioannis Anastasopoulos, Shreya Bhandari, Zachary A. Pardos

Abstract: Involving subject matter experts in prompt engineering can guide LLM outputs toward more helpful, accurate, and tailored content that meets the diverse needs of different domains. However, iterating towards effective prompts can be challenging without adequate interface support for systematic experimentation within specific task contexts. In this work, we introduce PromptHive, a collaborative interface for prompt authoring, designed to better connect domain knowledge with prompt engineering through features that encourage rapid iteration on prompt variations. We conducted an evaluation study with ten subject matter experts in math and validated our design through two collaborative prompt-writing sessions and a learning gain study with 358 learners. Our results elucidate the prompt iteration process and validate the tool's usability, enabling non-AI experts to craft prompts that generate content comparable to human-authored materials while reducing perceived cognitive load by half and shortening the authoring process from several months to just a few hours.

Submitted 21 October, 2024; originally announced October 2024.

arXiv:2410.03010 [pdf, other]

MMP: Towards Robust Multi-Modal Learning with Masked Modality Projection

Authors: Niki Nezakati, Md Kaykobad Reza, Ameya Patil, Mashhour Solh, M. Salman Asif

Abstract: Multimodal learning seeks to combine data from multiple input sources to enhance the performance of different downstream tasks. In real-world scenarios, performance can degrade substantially if some input modalities are missing. Existing methods that can handle missing modalities involve custom training or adaptation steps for each input modality combination. These approaches are either tied to specific modalities or become computationally expensive as the number of input modalities increases. In this paper, we propose Masked Modality Projection (MMP), a method designed to train a single model that is robust to any missing modality scenario. We achieve this by randomly masking a subset of modalities during training and learning to project available input modalities to estimate the tokens for the masked modalities. This approach enables the model to effectively learn to leverage the information from the available modalities to compensate for the missing ones, enhancing missing modality robustness. We conduct a series of experiments with various baseline models and datasets to assess the effectiveness of this strategy. Experiments demonstrate that our approach improves robustness to different missing modality scenarios, outperforming existing methods designed for missing modalities or specific modality combinations.

Submitted 7 October, 2024; v1 submitted 3 October, 2024; originally announced October 2024.

arXiv:2409.20507 [pdf, other]

Constraining Cosmology with Simulation-based inference and Optical Galaxy Cluster Abundance

Authors: Moonzarin Reza, Yuanyuan Zhang, Camille Avestruz, Louis E. Strigari, Simone Shevchuk, Francisco Villaescusa-Navarro

Abstract: We test the robustness of simulation-based inference (SBI) in the context of cosmological parameter estimation from galaxy cluster counts and masses in simulated optical datasets. We construct "simulations" using analytical models for the galaxy cluster halo mass function (HMF) and for the observed richness (number of observed member galaxies) to train and test the SBI method. We compare the SBI parameter posterior samples to those from an MCMC analysis that uses the same analytical models to construct predictions of the observed data vector. The two methods exhibit comparable performance, with reliable constraints derived for the primary cosmological parameters ($\Omega_m$ and $\sigma_8$) and the richness-mass relation parameters. We also perform out-of-domain tests with observables constructed from galaxy cluster-sized halos in the Quijote simulations. Again, the SBI and MCMC results have comparable posteriors, with similar uncertainties and biases. Unsurprisingly, upon evaluating the SBI method on thousands of simulated data vectors that span the parameter space, SBI exhibits worsened posterior calibration metrics in the out-of-domain application. We note that such calibration tests with MCMC are less computationally feasible and highlight the potential use of SBI to stress-test limitations of analytical models, such as those used to construct models for inference with MCMC.

Submitted 30 September, 2024; originally announced September 2024.

arXiv:2409.18506 [pdf, other]

Med-IC: Fusing a Single Layer Involution with Convolutions for Enhanced Medical Image Classification and Segmentation

Authors: Md. Farhadul Islam, Sarah Zabeen, Meem Arafat Manab, Mohammad Rakibul Hasan Mahin, Joyanta Jyoti Mondal, Md. Tanzim Reza, Md Zahidul Hasan, Munima Haque, Farig Sadeque, Jannatun Noor

Abstract: The majority of medical images, especially those that resemble cells, have similar characteristics. These images, which occur in a variety of shapes, often show abnormalities in the organ or cell region. The convolution operation possesses a restricted capability to extract visual patterns across several spatial regions of an image. The involution process, which is the inverse operation of convolution, complements this inherent lack of spatial information extraction present in convolutions. In this study, we investigate how applying a single layer of involution prior to a convolutional neural network (CNN) architecture can significantly improve classification and segmentation performance, with a comparatively negligible amount of weight parameters. The study additionally shows how excessive use of involution layers might result in inaccurate predictions in a particular type of medical image. According to our findings from experiments, the strategy of adding only a single involution layer before a CNN-based model outperforms most of the previous works.

Submitted 27 September, 2024; originally announced September 2024.

Comments: 13 pages, 5 figures, 4 tables, preprint submitted to an Elsevier journal

MSC Class: 68T45

ACM Class: I.4.6; I.4.9; I.5.4; J.3

arXiv:2409.02349 [pdf]

Machine Learning Applications to Computational Plasma Physics and Reduced-Order Plasma Modeling: A Perspective

Authors: Farbod Faraji, Maryam Reza

Abstract: Machine learning (ML) provides a broad spectrum of tools and architectures that enable the transformation of data from simulations and experiments into useful and explainable science, thereby augmenting domain knowledge. Furthermore, ML-enhanced numerical modelling can revamp scientific computing for real-world complex engineering systems, creating unique opportunities to examine the operation of the technologies in detail and automate their optimization and control. In recent years, ML applications have seen significant growth across various scientific domains, particularly in fluid mechanics, where ML has shown great promise in enhancing computational modeling of fluid flows. In contrast, ML applications in numerical plasma physics research remain relatively limited in scope and extent. Despite this, the close relationship between fluid mechanics and plasma physics presents a valuable opportunity to create a roadmap for transferring ML advances in fluid flow modeling to computational plasma physics. This Perspective aims to outline such a roadmap. We begin by discussing some general fundamental aspects of ML, including the various categories of ML algorithms and the different types of problems that can be solved with the help of ML. With regard to each problem type, we then present specific examples from the use of ML in computational fluid dynamics, reviewing several insightful prior efforts. We also review recent ML applications in plasma physics for each problem type. The paper discusses promising future directions and development pathways for ML in plasma modelling within the different application areas. Additionally, we point out prominent challenges that must be addressed to realize ML's full potential in computational plasma physics, including the need for cost-effective high-fidelity simulation tools for extensive data generation.

Submitted 3 September, 2024; originally announced September 2024.

Comments: 42 pages, 20 figures

arXiv:2408.13175 [pdf, other]

doi 10.1145/3663548.3688538

Identifying Crucial Objects in Blind and Low-Vision Individuals' Navigation

Authors: Md Touhidul Islam, Imran Kabir, Elena Ariel Pearce, Md Alimoor Reza, Syed Masum Billah

Abstract: This paper presents a curated list of 90 objects essential for the navigation of blind and low-vision (BLV) individuals, encompassing road, sidewalk, and indoor environments. We develop the initial list by analyzing 21 publicly available videos featuring BLV individuals navigating various settings. Then, we refine the list through feedback from a focus group study involving blind, low-vision, and sighted companions of BLV individuals. A subsequent analysis reveals that most contemporary datasets used to train recent computer vision models contain only a small subset of the objects in our proposed list. Furthermore, we provide detailed object labeling for these 90 objects across 31 video segments derived from the original 21 videos. Finally, we make the object list, the 21 videos, and object labeling in the 31 video segments publicly available. This paper aims to fill the existing gap and foster the development of more inclusive and effective navigation aids for the BLV community.

Submitted 23 August, 2024; originally announced August 2024.

Comments: Paper accepted at ASSETS'24 (Oct 27-30, 2024, St. John's, Newfoundland, Canada). arXiv admin note: substantial text overlap with arXiv:2407.16777

arXiv:2408.08401 [pdf, other]

Understanding Help-Seeking Behavior of Students Using LLMs vs. Web Search for Writing SQL Queries

Authors: Harsh Kumar, Mohi Reza, Jeb Mitchell, Ilya Musabirov, Lisa Zhang, Michael Liut

Abstract: Growth in the use of large language models (LLMs) in programming education is altering how students write SQL queries. Traditionally, students relied heavily on web search for coding assistance, but this has shifted with the adoption of LLMs like ChatGPT. However, the comparative process and outcomes of using web search versus LLMs for coding help remain underexplored. To address this, we conducted a randomized interview study in a database classroom to compare web search and LLMs, including a publicly available LLM (ChatGPT) and an instructor-tuned LLM, for writing SQL queries. Our findings indicate that using an instructor-tuned LLM required significantly more interactions than both ChatGPT and web search, but resulted in a similar number of edits to the final SQL query. No significant differences were found in the quality of the final SQL queries between conditions, although the LLM conditions directionally showed higher query quality. Furthermore, students using the instructor-tuned LLM reported a lower mental demand. These results have implications for learning and productivity in programming education.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2407.19815 [pdf, ps, other]

Weight Enumerators of codes over $\mathbb{F}_2$ and over $\mathbb{Z}_4$

Authors: A. K. M. Selim Reza, Manabu Oura, Nur Hamid

Abstract: Weight enumerators are important tools for deciphering the algebraic structure of the related code spaces and for understanding group actions on these spaces. Our study focuses on symmetrized weight enumerators of pairs of Type II codes over the finite field $\mathbb{F}_{2}$ and the ring $\mathbb{Z}_{4}$. These pairs have been examined as invariants for a specified group. In particular, we concentrate on the scenarios where the space of the invariant ring is of degree 8 and 16. Our findings show that in certain situations, the ring produced by the symmetrized weight enumerators precisely matches with the invariant ring of the designated group. This coincidence points to a profound relationship between the invariant ring's structure and the algebraic characteristics of the weight enumerators.

Submitted 19 September, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

Comments: 9 Pages including Appendix

MSC Class: Primary 94B05; Secondary 05E99

arXiv:2407.18522 [pdf]

First-principles investigation of the physical properties of wide band gap hexagonal AlPO4 compound for possible applications

Authors: A. S. M. Muhasin Reza, Md. Asif Afzal, S. H. Naqib

Abstract: In this study, using density functional theory, we have investigated the bulk physical properties, namely the structural, electronic band structure, elastic, thermal, and optical properties and the bonding features, of the AlPO4 compound in the hexagonal form. The values of our optimized structural parameters are very close to the previous results. Most of the results presented in this work are novel. The elastic constants indicate that AlPO4 is mechanically stable and brittle in nature. The compound is moderately hard and possesses a low machinability index. AlPO4 exhibits significant elastic anisotropy. The charge density distribution, bond population analysis, Vickers hardness, thermo-mechanical properties, and optical properties have been investigated for the first time. The electronic band structure calculations reveal clear insulating behavior with a band gap of 6.0 eV. Band structure calculations were carried out without and with spin-orbit coupling (SOC) to explore a possible topological signature. The energy-dependent optical properties conform to the electronic band structure calculations. Major optical properties like dielectric functions, refractive index, photoconductivity, absorption coefficient, loss function and reflectivity are calculated and discussed in detail in this study. The compound is optically anisotropic. It is an efficient absorber and reflector of ultraviolet light.

Submitted 26 July, 2024; originally announced July 2024.

arXiv:2407.16777 [pdf, other]

A Dataset for Crucial Object Recognition in Blind and Low-Vision Individuals' Navigation

Authors: Md Touhidul Islam, Imran Kabir, Elena Ariel Pearce, Md Alimoor Reza, Syed Masum Billah

Abstract: This paper introduces a dataset for improving real-time object recognition systems to aid blind and low-vision (BLV) individuals in navigation tasks. The dataset comprises 21 videos of BLV individuals navigating outdoor spaces, and a taxonomy of 90 objects crucial for BLV navigation, refined through a focus group study. We also provide object labeling for the 90 objects across 31 video segments created from the 21 videos. A deeper analysis reveals that most contemporary datasets used in training computer vision models contain only a small subset of the taxonomy in our dataset. Preliminary evaluation of state-of-the-art computer vision models on our dataset highlights shortcomings in accurately detecting key objects relevant to BLV navigation, emphasizing the need for specialized datasets. We make our dataset publicly available, offering valuable resources for developing more inclusive navigation systems for BLV individuals.

Submitted 23 July, 2024; originally announced July 2024.

Comments: 16 pages, 4 figures

arXiv:2405.11955 [pdf, other]

Shallow Recurrent Decoder for Reduced Order Modeling of Plasma Dynamics

Authors: J. Nathan Kutz, Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: Reduced order models are becoming increasingly important for rendering complex and multiscale spatio-temporal dynamics computationally tractable. The computational efficiency of such surrogate models is especially important for design, exhaustive exploration and physical understanding. Plasma simulations, in particular those applied to the study of ${\bf E}\times {\bf B}$ plasma discharges and technologies, such as Hall thrusters, require substantial computational resources in order to resolve the multidimensional dynamics that span across wide spatial and temporal scales. Although high-fidelity computational tools are available to simulate such systems over limited conditions and in highly simplified geometries, simulations of full-size systems and/or extensive parametric studies over many geometric configurations and under different physical conditions are computationally intractable with conventional numerical tools. Thus, scientific studies and industrially oriented modeling of plasma systems, including the important ${\bf E}\times {\bf B}$ technologies, stand to significantly benefit from reduced order modeling algorithms. We develop a model reduction scheme based upon a Shallow REcurrent Decoder (SHRED) architecture. The scheme uses a neural network for encoding limited sensor measurements in time (sequence-to-sequence encoding) to full state-space reconstructions via a decoder network. Based upon the theory of separation of variables, the SHRED architecture is capable of (i) reconstructing full spatio-temporal fields with as few as three point sensors, even the fields that are not measured with sensor feeds but that are in dynamic coupling with the measured field, and (ii) forecasting the future state of the system using neural network roll-outs from the trained time encoding model.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 12 pages, 7 figures

arXiv:2404.06593 [pdf, other]

Spatially Optimized Compact Deep Metric Learning Model for Similarity Search

Authors: Md. Farhadul Islam, Md. Tanzim Reza, Meem Arafat Manab, Mohammad Rakibul Hasan Mahin, Sarah Zabeen, Jannatun Noor

Abstract: Spatial optimization is often overlooked in many computer vision tasks. Filters should be able to recognize the features of an object regardless of where it is in the image. Similarity search is a crucial task where spatial features decide an important output. The capacity of convolution to capture visual patterns across various locations is limited. In contrast to convolution, the involution kernel is dynamically created at each pixel based on the pixel value and parameters that have been learned. This study demonstrates that utilizing a single layer of involution feature extractor alongside a compact convolution model significantly enhances the performance of similarity search. Additionally, we improve predictions by using the GELU activation function rather than the ReLU. The negligible amount of weight parameters in involution with a compact model with better performance makes the model very useful in real-world implementations. Our proposed model is below 1 megabyte in size. We have experimented with our proposed methodology and other models on CIFAR-10, FashionMNIST, and MNIST datasets. Our proposed method outperforms across all three datasets.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 5 pages, 3 figures

MSC Class: 68 ACM Class: I.4.7; I.2.6; I.2.10
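The abstract above mentions swapping ReLU for GELU. As a minimal sketch of what the two standard activation functions compute (this is not the authors' code, and the function names are chosen here purely for illustration):

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    # Compare the two activations on a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  gelu={gelu(x):.4f}")
```

Unlike ReLU, GELU is smooth and lets small negative inputs through with a small negative output, which is the usual motivation given for preferring it.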

arXiv:2403.15937 [pdf, other]

Model, Analyze, and Comprehend User Interactions within a Social Media Platform

Authors: Md Kaykobad Reza, S M Maksudul Alam, Yiran Luo, Youzhe Liu, Md Siam

Abstract: In this study, we propose a novel graph-based approach to model, analyze and comprehend user interactions within a social media platform based on post-comment relationship. We construct a user interaction graph from social media data and analyze it to gain insights into community dynamics, user behavior, and content preferences. Our investigation reveals that while 56.05% of the active users are strongly connected within the community, only 0.8% of them significantly contribute to its dynamics. Moreover, we observe temporal variations in community activity, with certain periods experiencing heightened engagement. Additionally, our findings highlight a correlation between user activity and popularity, showing that more active users are generally more popular. Alongside these, a preference for positive and informative content is also observed, where 82.41% of users preferred positive and informative content. Overall, our study provides a comprehensive framework for understanding and managing online communities, leveraging graph-based techniques to gain valuable insights into user behavior and community dynamics.

Submitted 28 November, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

Comments: Accepted by 27th International Conference on Computer and Information Technology (ICCIT), 2024. 6 Pages, 6 Figures
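The abstract above does not say how the interaction graph is built or how "strongly connected" is measured. The sketch below is one plausible reading, under two assumptions made here: each comment becomes a directed edge from the commenter to the post author, and connectivity is measured through strongly connected components (Kosaraju's algorithm). All function names and the sample data are hypothetical.

```python
def build_interaction_graph(comments):
    """Build {user: set of users they commented on} from (commenter, author) pairs."""
    graph = {}
    for commenter, author in comments:
        graph.setdefault(commenter, set())
        graph.setdefault(author, set())
        if commenter != author:  # ignore self-comments (an assumption)
            graph[commenter].add(author)
    return graph


def strongly_connected_components(graph):
    """Kosaraju's algorithm, written iteratively to avoid recursion limits."""
    # Pass 1: record nodes in order of depth-first completion.
    order, seen = [], set()
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:  # all neighbours explored: node is finished
                order.append(node)
                stack.pop()
    # Pass 2: explore the reversed graph in decreasing finish order.
    reverse = {node: set() for node in graph}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].add(src)
    components, assigned = [], set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def share_in_largest_component(graph):
    """Fraction of users that belong to the largest strongly connected component."""
    largest = max(strongly_connected_components(graph), key=len)
    return len(largest) / len(graph)


if __name__ == "__main__":
    # Toy data: a, b, c comment on each other in a cycle; d only comments on a.
    sample = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
    print(share_in_largest_component(build_interaction_graph(sample)))
```

On real data the reported percentage would depend on how active users are filtered and on the edge direction chosen, neither of which is specified in the abstract.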

arXiv:2403.01532 [pdf]

Data-driven local operator finding for reduced-order modelling of plasma systems: II. Application to parametric dynamics

Authors: Farbod Faraji, Maryam Reza, Aaron Knoll, J. Nathan Kutz

Abstract: Real-world systems often exhibit dynamics influenced by various parameters, either inherent or externally controllable, necessitating models capable of reliably capturing these parametric behaviors. Plasma technologies exemplify such systems. For example, phenomena governing global dynamics in Hall thrusters (a spacecraft propulsion technology) vary with various parameters, such as the "self-sustained electric field". In this Part II, following on the introduction of our novel data-driven local operator finding algorithm, Phi Method, in Part I, we showcase the method's effectiveness in learning parametric dynamics to predict system behavior across unseen parameter spaces. We present two adaptations: the "parametric Phi Method" and the "ensemble Phi Method", which are demonstrated through 2D fluid-flow-past-a-cylinder and 1D Hall-thruster-plasma-discharge problems. Comparative evaluation against parametric OPT-DMD in the fluid case demonstrates superior predictive performance of the parametric Phi Method. Across both test cases, parametric and ensemble Phi Method reliably recover governing parametric PDEs and offer accurate predictions over test parameters. Ensemble ROM analysis underscores Phi Method's robust learning of dominant dynamic coefficients with high confidence.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 24 pages, 17 figures

arXiv:2403.01523 [pdf]

Data-driven local operator finding for reduced-order modelling of plasma systems: I. Concept and verifications

Authors: Farbod Faraji, Maryam Reza, Aaron Knoll, J. Nathan Kutz

Abstract: Reduced-order plasma models that can efficiently predict plasma behavior across various settings and configurations are highly sought after yet elusive. The demand for such models has surged in the past decade due to their potential to facilitate scientific research and expedite the development of plasma technologies. In line with the advancements in computational power and data-driven methods, we introduce the "Phi Method" in this two-part article. Part I presents this novel algorithm, which employs constrained regression on a candidate term library informed by numerical discretization schemes to discover discretized systems of differential equations. We demonstrate Phi Method's efficacy in deriving reliable and robust reduced-order models (ROMs) for three test cases: the Lorenz attractor, flow past a cylinder, and a 1D Hall-thruster-representative plasma. Part II will delve into the method's application for parametric dynamics discovery. Our results show that ROMs derived from the Phi Method provide remarkably accurate predictions of systems' behavior, whether derived from steady-state or transient-state data. This underscores the method's potential for transforming plasma system modeling.

Submitted 3 March, 2024; originally announced March 2024.

Comments: 27 pages, 18 figures

arXiv:2402.03417 [pdf, other]

A Computer Vision Based Approach for Stalking Detection Using a CNN-LSTM-MLP Hybrid Fusion Model

Authors: Murad Hasan, Shahriar Iqbal, Md. Billal Hossain Faisal, Md. Musnad Hossin Neloy, Md. Tonmoy Kabir, Md. Tanzim Reza, Md. Golam Rabiul Alam, Md Zia Uddin

Abstract: Criminal and suspicious activity detection has become a popular research topic in recent years. The rapid growth of computer vision technologies has had a crucial impact on solving this issue. However, physical stalking detection is still a less explored area despite the evolution of modern technology. Nowadays, stalking in public places has become a common occurrence with women being the most affected. Stalking is a visible action that usually occurs before any criminal activity begins as the stalker begins to follow, loiter, and stare at the victim before committing any criminal activity such as assault, kidnapping, rape, and so on. Therefore, it has become a necessity to detect stalking as all of these criminal activities can be stopped in the first place through stalking detection. In this research, we propose a novel deep learning-based hybrid fusion model to detect potential stalkers from a single video with a minimal number of frames. We extract multiple relevant features, such as facial landmarks, head pose estimation, and relative distance, as numerical values from video frames. This data is fed into a multilayer perceptron (MLP) to perform a classification task between a stalking and a non-stalking scenario. Simultaneously, the video frames are fed into a combination of convolutional and LSTM models to extract the spatio-temporal features. We use a fusion of these numerical and spatio-temporal features to build a classifier to detect stalking incidents. Additionally, we introduce a dataset consisting of stalking and non-stalking videos gathered from various feature films and television series, which is also used to train the model. The experimental results show the efficiency and dynamism of our proposed stalker detection system, achieving 89.58% testing accuracy with a significant improvement as compared to the state-of-the-art approaches.

Submitted 5 February, 2024; originally announced February 2024.

Comments: Under review for publication in the PLOS ONE journal, 17 pages, 9 figures

arXiv:2401.00548 [pdf, ps, other]

Cesàro summability of Taylor series in higher order weighted Dirichlet type spaces

Authors: Soumitra Ghara, Rajeev Gupta, Md. Ramiz Reza

Abstract: For a positive integer $m$ and a finite non-negative Borel measure $μ$ on the unit circle, we study the Hadamard multipliers of higher order weighted Dirichlet-type spaces $\mathcal H_{μ, m}$. We show that if $α>\frac{1}{2},$ then for any $f$ in $\mathcal H_{μ, m},$ the sequence of generalized Cesàro sums $\{σ_n^α[f]\}$ converges to $f$. We further show that if $α=\frac{1}{2}$ then for the Dirac delta measure supported at any point on the unit circle, the previous statement breaks down for every positive integer $m$.

Submitted 31 December, 2023; originally announced January 2024.

Comments: 14 pages, comments and suggestions are welcome

MSC Class: 41A10; 40G05; 46E20
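The abstract above does not spell out the generalized Cesàro sums $σ_n^α[f]$. Assuming the standard $(C, α)$ definition, $σ_n^α[f](z) = \frac{1}{A_n^α}\sum_{k=0}^{n} A_{n-k}^α a_k z^k$ with $A_k^α = \binom{k+α}{k}$ and Taylor coefficients $a_k$, a small numerical sketch looks like this (function names are hypothetical, and the paper's normalization may differ):

```python
def cesaro_weights(n, alpha):
    """Return [A_0, ..., A_n] where A_k = binom(k + alpha, k), built by recurrence."""
    weights = [1.0]
    for k in range(1, n + 1):
        weights.append(weights[-1] * (k + alpha) / k)
    return weights


def cesaro_mean(coeffs, z, alpha):
    """(C, alpha) mean of order n = len(coeffs) - 1 of the series sum a_k z^k."""
    n = len(coeffs) - 1
    weights = cesaro_weights(n, alpha)
    total = sum(weights[n - k] * a * z**k for k, a in enumerate(coeffs))
    return total / weights[n]


if __name__ == "__main__":
    coeffs = [1.0, 1.0, 1.0]  # a_0, a_1, a_2 of a toy series
    # alpha = 0 gives the ordinary partial sum; alpha = 1 gives the Fejér mean.
    print(cesaro_mean(coeffs, 1.0, 0.0), cesaro_mean(coeffs, 1.0, 1.0))
```

With $α = 0$ every weight equals 1 and the mean reduces to the ordinary partial sum, while $α = 1$ gives the classical Fejér (arithmetic) mean of partial sums; the threshold $α = \frac{1}{2}$ in the abstract sits between these two cases.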

arXiv:2311.10883 [pdf, other]

Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models

Authors: Yimeng Li, Navid Rajabi, Sulabh Shrestha, Md Alimoor Reza, Jana Kosecka

Abstract: The image annotation stage is a critical and often the most time-consuming part required for training and evaluating object detection and semantic segmentation models. Deployment of the existing models in novel environments often requires detecting novel semantic classes not present in the training data. Furthermore, indoor scenes contain significant viewpoint variations, which need to be handled properly by trained perception models. We propose to leverage the recent advancements in state-of-the-art models for bottom-up segmentation (SAM), object detection (Detic), and semantic segmentation (MaskFormer), all trained on large-scale datasets. We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments, with the ultimate goal of facilitating the training of lightweight models for various downstream tasks. We also propose a multi-view labeling fusion stage, which considers the setting where multiple views of the scenes are available and can be used to identify and rectify single-view inconsistencies. We demonstrate the effectiveness of the proposed approach on the Active Vision dataset and the ADE20K dataset. We evaluate the quality of our labeling process by comparing it with human annotations. Also, we demonstrate the effectiveness of the obtained labels in downstream tasks such as object goal navigation and part discovery. In the context of object goal navigation, we depict enhanced performance using this fusion approach compared to a zero-shot baseline that utilizes large monolithic vision-language pre-trained models.

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2310.19583 [pdf, other]

GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo

Authors: Vibhas K. Vats, Sripad Joshi, David J. Crandall, Md. Alimoor Reza, Soon-heung Jung

Abstract: Traditional multi-view stereo (MVS) methods rely heavily on photometric and geometric consistency constraints, but newer machine learning-based MVS methods check geometric consistency across multiple source views only as a post-processing step. In this paper, we present a novel approach that explicitly encourages geometric consistency of reference view depth maps across multiple source views at different scales during learning (see Fig. 1). We find that adding this geometric consistency loss significantly accelerates learning by explicitly penalizing geometrically inconsistent pixels, reducing the training iteration requirements to nearly half that of other MVS methods. Our extensive experiments show that our approach achieves a new state-of-the-art on the DTU and BlendedMVS datasets, and competitive results on the Tanks and Temples benchmark. To the best of our knowledge, GC-MVSNet is the first attempt to enforce multi-view, multi-scale geometric consistency during learning.

Submitted 21 December, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: Accepted in WACV 2024 Link: https://openaccess.thecvf.com/content/WACV2024/html/Vats_GC-MVSNet_Multi-View_Multi-Scale_Geometrically-Consistent_Multi-View_Stereo_WACV_2024_paper.html

Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

arXiv:2310.13712 [pdf, other]

doi 10.1145/3687038

Impact of Guidance and Interaction Strategies for LLM Use on Learner Performance and Perception

Authors: Harsh Kumar, Ilya Musabirov, Mohi Reza, Jiakai Shi, Xinyuan Wang, Joseph Jay Williams, Anastasia Kuzminykh, Michael Liut

Abstract: Personalized chatbot-based teaching assistants can be crucial in addressing increasing classroom sizes, especially where direct teacher presence is limited. Large language models (LLMs) offer a promising avenue, with increasing research exploring their educational utility. However, the challenge lies not only in establishing the efficacy of LLMs but also in discerning the nuances of interaction between learners and these models, which impact learners' engagement and results. We conducted a formative study in an undergraduate computer science classroom (N=145) and a controlled experiment on Prolific (N=356) to explore the impact of four pedagogically informed guidance strategies on the learners' performance, confidence and trust in LLMs. Direct LLM answers marginally improved performance, while refining student solutions fostered trust. Structured guidance reduced random queries as well as instances of students copy-pasting assignment questions to the LLM. Our work highlights the role that teachers can play in shaping LLM-supported learning environments.

Submitted 19 August, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: To appear in CSCW 2024

arXiv:2310.03986 [pdf, other]

doi 10.1109/TPAMI.2024.3476487

Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation

Authors: Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

Abstract: Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in some correlated modalities. However, we observe that the performance of several existing multimodal networks significantly deteriorates if one or multiple modalities are absent at test time. To enable robustness to missing modalities, we propose a simple and parameter-efficient adaptation procedure for pretrained multimodal networks. In particular, we exploit modulation of intermediate features to compensate for the missing modalities. We demonstrate that such adaptation can partially bridge the performance drop due to missing modalities and outperform independent, dedicated networks trained for the available modality combinations in some cases. The proposed adaptation requires an extremely small number of parameters (e.g., fewer than 1% of the total parameters) and is applicable to a wide range of modality combinations and tasks. We conduct a series of experiments to highlight the missing modality robustness of our proposed method on five different multimodal tasks across seven datasets. Our proposed method demonstrates versatility across various tasks and datasets, and outperforms existing methods for robust multimodal learning with missing modalities.

Submitted 7 October, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). 28 pages, 6 figures, 17 tables

arXiv:2310.00117 [pdf, other]

doi 10.1145/3613904.3641899

ABScribe: Rapid Exploration & Organization of Multiple Writing Variations in Human-AI Co-Writing Tasks using Large Language Models

Authors: Mohi Reza, Nathan Laundry, Ilya Musabirov, Peter Dushniku, Zhi Yuan "Michael" Yu, Kashish Mittal, Tovi Grossman, Michael Liut, Anastasia Kuzminykh, Joseph Jay Williams

Abstract: Exploring alternative ideas by rewriting text is integral to the writing process. State-of-the-art Large Language Models (LLMs) can simplify writing variation generation. However, current interfaces pose challenges for simultaneous consideration of multiple variations: creating new variations without overwriting text can be difficult, and pasting them sequentially can clutter documents, increasing workload and disrupting writers' flow. To tackle this, we present ABScribe, an interface that supports rapid, yet visually structured, exploration and organization of writing variations in human-AI co-writing tasks. With ABScribe, users can swiftly modify variations using LLM prompts, which are auto-converted into reusable buttons. Variations are stored adjacently within text fields for rapid in-place comparisons using mouse-over interactions on a popup toolbar. Our user study with 12 writers shows that ABScribe significantly reduces task workload (d = 1.20, p < 0.001), enhances user perceptions of the revision process (d = 2.41, p < 0.001) compared to a popular baseline workflow, and provides insights into how writers explore variations using LLMs.

Submitted 27 March, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

Comments: CHI 2024

arXiv:2309.05005 [pdf]

Effects of the applied fields' strength on the plasma behavior and processes in ExB plasma discharges of various propellants: II. Magnetic field

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: We present in this part II the effects of the magnetic field intensity on the properties of the plasma discharge and the underlying phenomena for different propellant ion masses. The plasma setup represents a perpendicular configuration of the electric and magnetic fields, with the electric field along the axial direction and the magnetic field along the radial direction. The magnetic field intensity is changed from 5 to 30 mT, with 5 mT increments. The propellant gases are xenon, krypton, and argon. The simulations are carried out using a particle-in-cell (PIC) code based on the computationally efficient reduced-order PIC scheme. Similar to the observations in part I, we show that, across all propellants, the variation in the intensity of the magnetic field yields two distinct regimes of the plasma, where either the Modified Two Stream Instability (MTSI) or the Electron Cyclotron Drift Instability (ECDI) is present. Nonetheless, a third plasma regime is also observed for cases with moderate values of the magnetic field intensity (15 and 20 mT), in which the ECDI and the MTSI co-exist with comparable amplitudes. This described change in the plasma regime becomes clearly reflected in the radial distribution of the axial electron current density and the electron temperature anisotropy. Contrary to the effect of the electric field magnitude in part I, we observed here that the MTSI is absent at the relatively low magnetic field intensities (5 and 10 mT). At the relatively high magnitudes of the magnetic field (25 and 30 mT), the MTSI becomes strongly present, a long-wavelength wave mode develops, and the ECDI is not excited. An exception to this latter observation was noticed for xenon, for which the ECDI's presence persists up to the magnetic field peak value of 25 mT.

Submitted 10 September, 2023; originally announced September 2023.

Comments: 17 pages, 15 figures

arXiv:2309.05001 [pdf]

Effects of the applied fields' strength on the plasma behavior and processes in ExB plasma discharges of various propellants: I. Electric field

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: We present, in this two-part article, an extensive study on the influence that the magnitudes of the applied electric (E) and magnetic (B) fields have on a collisionless plasma discharge of xenon, krypton, and argon in a 2D radial-azimuthal configuration with perpendicular orientation of the fields. The dependency of the behavior and the underlying processes of ExB discharges on the strength of the electromagnetic field and ion mass has not yet been studied in depth and in a manner that can distinguish the role of each individual factor. This has been, on the one hand, due to the significant computational cost of conventional high-fidelity particle-in-cell (PIC) codes that do not allow for extensive simulations over a broad parameter space within practical timeframes. On the other hand, the experimental efforts have been limited, in part, by the measurements' spatial and temporal resolution. In this sense, the notably reduced computational cost of the reduced-order PIC scheme makes it possible to numerically shed light on the parametric variations in various aspects of the physics of ExB discharges, such as high resolution spatial-temporal mappings of the plasma instabilities. In part I of the article, we focus on the effects of the E-field intensity. We demonstrate that the intensity of the field determines two distinct plasma regimes, which are characterized by different dominant instability campaigns. At relatively low E-field magnitudes, the Modified Two Stream Instability (MTSI) is dominant, whereas, at relatively high E-field magnitudes, the MTSI is mitigated, and the Electron Cyclotron Drift Instability (ECDI) becomes dominant. These two regimes are identified for all studied propellants. Consequent to the change in the plasma regime, the radial distribution of the axial electron current density and the electron temperature anisotropy vary.

Submitted 10 September, 2023; originally announced September 2023.

Comments: 20 pages, 16 figures

arXiv:2309.04001 [pdf, other]

doi 10.1109/OJSP.2024.3389812

MMSFormer: Multimodal Transformer for Material and Semantic Segmentation

Authors: Md Kaykobad Reza, Ashley Prater-Bennette, M. Salman Asif

Abstract: Leveraging information across diverse modalities is known to enhance performance on multimodal segmentation tasks. However, effectively fusing information from different modalities remains challenging due to the unique characteristics of each modality. In this paper, we propose a novel fusion strategy that can effectively fuse information from different modality combinations. We also propose a new… ▽ More Leveraging information across diverse modalities is known to enhance performance on multimodal segmentation tasks. However, effectively fusing information from different modalities remains challenging due to the unique characteristics of each modality. In this paper, we propose a novel fusion strategy that can effectively fuse information from different modality combinations. We also propose a new model named Multi-Modal Segmentation TransFormer (MMSFormer) that incorporates the proposed fusion strategy to perform multimodal material and semantic segmentation tasks. MMSFormer outperforms current state-of-the-art models on three different datasets. As we begin with only one input modality, performance improves progressively as additional modalities are incorporated, showcasing the effectiveness of the fusion block in combining useful information from diverse input modalities. Ablation studies show that different modules in the fusion block are crucial for overall model performance. Furthermore, our ablation studies also highlight the capacity of different input modalities to improve performance in the identification of different types of materials. The code and pretrained models will be made available at https://github.com/csiplab/MMSFormer. △ Less

Submitted 7 April, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

Comments: Accepted by IEEE Open Journal of Signal Processing. 15 pages, 3 figures, 9 tables

arXiv:2309.00565 [pdf]

Influence of the magnetic field's curvature on the radial-azimuthal dynamics of a Hall thruster plasma discharge with different propellants

Authors: Maryam Reza, Farbod Faraji, Aaron Knoll

Abstract: The topology of the applied magnetic field is an important design aspect of Hall thrusters. For modern Hall thrusters, the field topology most often features curved lines with a concave (negative) curvature upstream of the field peak and a convex (positive) curvature downstream. Additionally, the advent of the magnetic shielding technique has resulted in the design of Hall thrusters with non-conventional magnetic fields that exhibit high degrees of concavity upstream of the field's peak. We carry out a rigorous and detailed study of the effects that the magnetic field's curvature has on the plasma properties and the underlying processes in a 2D configuration representative of a Hall thruster's radial-azimuthal cross-section. The analyses are performed for plasma discharges of three propellants: xenon, krypton, and argon. For each propellant, we have carried out high-fidelity reduced-order particle-in-cell (PIC) simulations with various degrees of positive and negative curvatures of the magnetic field. Corresponding 1D radial PIC simulations were also performed for xenon to compare the observations between 1D and 2D simulations. We observed that there are distinct differences in the plasma phenomena between the cases with positive and negative field curvatures. The instability spectrum in the cases of positive curvature is mostly dominated by the Electron Cyclotron Drift Instability, whereas the Modified Two Stream Instability is dominant in the negative-curvature cases. The distribution of the plasma properties, particularly the electron and ion temperatures, and the contribution of various mechanisms to electrons' cross-field transport showed notable variations with the field's curvature, especially between the positive and the negative values. Finally, the magnetic field curvature was observed to strongly influence the ion beam divergence along the radial and azimuthal coordinates.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 25 pages, 24 figures

arXiv:2308.13727 [pdf]

Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: II. dynamics forecasting

Authors: Farbod Faraji, Maryam Reza, Aaron Knoll, J. Nathan Kutz

Abstract: In part I of the article, we demonstrated that a variant of the Dynamic Mode Decomposition (DMD) algorithm based on variable projection optimization, called Optimized DMD (OPT-DMD), enables a robust identification of the dominant spatiotemporally coherent modes underlying the data across various test cases representing different physical parameters in an ExB simulation configuration. As the OPT-DMD can be constrained to produce stable reduced-order models (ROMs) by construction, in this paper, we extend the application of the OPT-DMD and investigate the capabilities of the linear ROM from this algorithm toward forecasting in time of the plasma dynamics in configurations representative of the radial-azimuthal and axial-azimuthal cross-sections of a Hall thruster and over a range of simulation parameters in each test case. The predictive capacity of the OPT-DMD ROM is assessed primarily in terms of short-term dynamics forecast or, in other words, for large ratios of training-to-test data. However, the utility of the ROM for long-term dynamics forecasting is also presented for an example case in the radial-azimuthal configuration. The model's predictive performance is heterogeneous across various test cases. Nonetheless, a remarkable predictiveness is observed in the test cases that do not exhibit highly transient behaviors. Moreover, in all investigated cases, the error between the ground-truth and the reconstructed data from the OPT-DMD ROM remains bounded over time within both the training and the test window. As a result, despite its limitation in terms of generalized applicability to all plasma conditions, the OPT-DMD is proven as a reliable method to develop low computational cost and highly predictive data-driven reduced-order models in systems with a quasi-periodic global evolution of the plasma state.

Submitted 25 August, 2023; originally announced August 2023.

Comments: 14 pages, 14 figures

arXiv:2308.13726 [pdf]

Dynamic Mode Decomposition for data-driven analysis and reduced-order modelling of ExB plasmas: I. Extraction of spatiotemporally coherent patterns

Authors: Farbod Faraji, Maryam Reza, Aaron Knoll, J. Nathan Kutz

Abstract: In this two-part article, we evaluate the utility and the generalizability of the Dynamic Mode Decomposition (DMD) algorithm for data-driven analysis and reduced-order modelling of plasma dynamics in cross-field ExB configurations. The DMD algorithm is an interpretable data-driven method that finds a best-fit linear model describing the time evolution of spatiotemporally coherent structures (patterns) in data. We have applied the DMD to extensive high-fidelity datasets generated using a particle-in-cell (PIC) code based on a cost-efficient reduced-order PIC scheme. In this part, we first provide an overview of the concept of DMD and its underpinning Proper Orthogonal and Singular Value Decomposition methods. Two of the main DMD variants are next introduced. We then present and discuss the results of the DMD application in terms of the identification and extraction of the dominant spatiotemporal modes from high-fidelity data over a range of simulation conditions. We demonstrate that the DMD variant based on variable projection optimization (OPT-DMD) outperforms the basic DMD method in identification of the modes underlying the data, leading to notably more reliable reconstruction of the ground-truth. Furthermore, we show in multiple test cases that the discrete frequency spectrum of OPT-DMD-extracted modes is consistent with the temporal spectrum from the Fast Fourier Transform of the data. This observation implies that the OPT-DMD augments the conventional spectral analyses by being able to uniquely reveal the spatial structure of the dominant modes in the frequency spectra, thus, yielding more accessible, comprehensive information on the spatiotemporal characteristics of the plasma phenomena.

Submitted 25 August, 2023; originally announced August 2023.

Comments: 21 pages, 16 figures

arXiv:2308.03163 [pdf, other]

CGBA: Curvature-aware Geometric Black-box Attack

Authors: Md Farhamdur Reza, Ali Rahmati, Tianfu Wu, Huaiyu Dai

Abstract: Decision-based black-box attacks often necessitate a large number of queries to craft an adversarial example. Moreover, decision-based attacks based on querying boundary points in the estimated normal vector direction often suffer from inefficiency and convergence issues. In this paper, we propose a novel query-efficient curvature-aware geometric decision-based black-box attack (CGBA) that conducts boundary search along a semicircular path on a restricted 2D plane to ensure finding a boundary point successfully irrespective of the boundary curvature. While the proposed CGBA attack can work effectively for an arbitrary decision boundary, it is particularly efficient in exploiting the low curvature to craft high-quality adversarial examples, which is widely seen and experimentally verified in commonly used classifiers under non-targeted attacks. In contrast, the decision boundaries often exhibit higher curvature under targeted attacks. Thus, we develop a new query-efficient variant, CGBA-H, that is adapted for the targeted attack. In addition, we further design an algorithm to obtain a better initial boundary point at the expense of some extra queries, which considerably enhances the performance of the targeted attack. Extensive experiments are conducted to evaluate the performance of our proposed methods against some well-known classifiers on the ImageNet and CIFAR10 datasets, demonstrating the superiority of CGBA and CGBA-H over state-of-the-art non-targeted and targeted attacks, respectively. The source code is available at https://github.com/Farhamdur/CGBA.

Submitted 6 August, 2023; originally announced August 2023.

Comments: This paper is accepted to publish in ICCV

arXiv:2305.18297 [pdf]

Using disturbance function for vibration analysis of a beam with an open edge crack

Authors: Mousa Rezaee, Saeed Lotfan, Vahid A. Maleki

Abstract: In this article, the model presented by Shen and Pierre to investigate the transverse vibration behavior of a simply supported beam has been revised. This is done by applying more realistic assumptions. The crack is modeled as a continuous disturbance and the disturbance function is provided based on fracture mechanics. Next, the natural frequencies corresponding to the model are extracted using the Galerkin method. The effect of crack parameters on the vibration behavior of the cracked beam is investigated. The obtained results show that the natural frequencies of the beam decrease with increasing crack depth. At the end, the obtained results are compared with the experimental results. The results show that the presented model is improved compared to previous models and predicts the vibration behavior of cracked beams with better accuracy for different crack parameters.

Submitted 12 March, 2023; originally announced May 2023.

Comments: in Persian language. 20th Annual International Conference of Iranian Society of Mechanical, Shiraz, Iran, 2012

arXiv:2305.07427 [pdf]

Ab-initio investigation of the physical properties of BaAgAs Dirac semimetal and its possible thermo-mechanical and optoelectronic applications

Authors: A. S. M. Muhasin Reza, S. H. Naqib

Abstract: BaAgAs is a ternary Dirac semimetal which can be tuned across a number of topological orders. In this study we have investigated the bulk physical properties of BaAgAs using density functional theory based computations. Most of the results presented in this work are novel. The optimized structural parameters are in good agreement with previous results. The elastic constants indicate that BaAgAs is mechanically stable and brittle in nature. The compound is moderately hard and possesses fair degree of machinability. There is significant mechanical/elastic anisotropy in BaAgAs. The Debye temperature of the compound is medium and the phonon thermal conductivity and melting temperature are moderate as well. The bonding character is mixed with notable covalent contribution. The electronic band structure calculations reveal clear semimetallic behavior with a Dirac node at the Fermi level. BaAgAs has a small ellipsoidal Fermi surface centered at the G-point of the Brillouin zone. The phonon dispersion curves show dynamical stability. There is a clear phonon band gap between the acoustic and the optical branches. The energy dependent optical constants conform to the band structure calculations. The compound is an efficient absorber of the ultraviolet light and has potential to be used as an anti-reflection coating. Optical anisotropy of BaAgAs is moderate. The computed repulsive Coulomb pseudopotential is low indicating that the electronic correlations in this compound are not strong.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Submitted for publication. arXiv admin note: text overlap with arXiv:2303.01319
