Search | arXiv e-print repository

AI-Driven Robotics for Free-Space Optics

Authors: Shiekh Zia Uddin, Sachin Vaidya, Shrish Choudhary, Zhuo Chen, Raafat K. Salib, Luke Huang, Dirk R. Englund, Marin Soljačić

Abstract: Tabletop optical experiments are foundational to research in many areas of science, including photonics, quantum optics, materials science, metrology, and biomedical imaging. However, these experiments remain fundamentally reliant on manual design, assembly, and alignment, limiting throughput and reproducibility. Optics currently lacks generalizable robotic systems capable of operating across a diverse range of setups in realistic laboratory environments. Here we present OptoMate, an autonomous platform that integrates generative AI, computer vision, and precision robotics to enable automation of free-space optics experiments. Our platform interprets user-defined goals to generate valid optical setups using a fine-tuned large language model (LLM), assembles the setup via robotic pick-and-place with sub-millimeter accuracy, and performs fine alignment using a robot-deployable tool. The system then executes a range of automated measurements, including laser beam characterization, polarization mapping, and spectroscopy tasks. This work demonstrates the first flexible, AI-driven automation platform for optics, offering a path toward remote operation, cloud labs, and high-throughput discovery in the optical sciences.

Submitted 7 May, 2025; originally announced May 2025.

arXiv:2502.04367 [pdf, other]

Hybrid Deep Learning Framework for Classification of Kidney CT Images: Diagnosis of Stones, Cysts, and Tumors

Authors: Kiran Sharma, Ziya Uddin, Adarsh Wadal, Dhruv Gupta

Abstract: Medical image classification is a vital research area that utilizes advanced computational techniques to improve disease diagnosis and treatment planning. Deep learning models, especially Convolutional Neural Networks (CNNs), have transformed this field by providing automated and precise analysis of complex medical images. This study introduces a hybrid deep learning model that integrates a pre-trained ResNet101 with a custom CNN to classify kidney CT images into four categories: normal, stone, cyst, and tumor. The proposed model leverages feature fusion to enhance classification accuracy, achieving 99.73% training accuracy and 100% testing accuracy. Using a dataset of 12,446 CT images and advanced feature mapping techniques, the hybrid CNN model outperforms standalone ResNet101. This architecture delivers a robust and efficient solution for automated kidney disease diagnosis, providing improved precision, recall, and reduced testing time, making it highly suitable for clinical applications.

Submitted 5 February, 2025; originally announced February 2025.

arXiv:2411.17447 [pdf, other]

Exploring Structural Dynamics in Retracted and Non-Retracted Author's Collaboration Networks: A Quantitative Analysis

Authors: Kiran Sharma, Aanchal Sharma, Jazlyn Jose, Vansh Saini, Raghavraj Sobti, Ziya Uddin

Abstract: Retractions undermine the reliability of scientific literature and the foundation of future research. Analyzing collaboration networks in retracted papers can identify risk factors, such as recurring co-authors or institutions. This study compared the network structures of retracted and non-retracted papers, using data from Retraction Watch and Scopus for 30 authors with significant retractions. Collaboration networks were constructed, and network properties analyzed. Retracted networks showed hierarchical and centralized structures, while non-retracted networks exhibited distributed collaboration with stronger clustering and connectivity. Statistical tests, including $t$-tests and Cohen's $d$, revealed significant differences in metrics like Degree Centrality and Weighted Degree, highlighting distinct structural dynamics. These insights into retraction-prone collaborations can guide policies to improve research integrity.

Submitted 26 November, 2024; originally announced November 2024.

arXiv:2410.23365 [pdf, other]

Automated Personnel Selection for Software Engineers Using LLM-Based Profile Evaluation

Authors: Ahmed Akib Jawad Karim, Shahria Hoque, Md. Golam Rabiul Alam, Md. Zia Uddin

Abstract: Organizational success in today's competitive employment market depends on choosing the right staff. This work evaluates software engineer profiles using an automated staff selection method based on advanced natural language processing (NLP) techniques. A fresh dataset was generated by collecting LinkedIn profiles with important attributes like education, experience, skills, and self-introduction. Transformer models, including RoBERTa, DistilBERT, and a customized BERT variation, LastBERT, were adjusted with the help of expert feedback. The models were meant to predict whether a candidate's profile fit the selection criteria, thereby allowing automated ranking and assessment. With 85% accuracy and an F1 score of 0.85, RoBERTa performed the best; DistilBERT provided comparable results at lower computing expense. Though lightweight, LastBERT proved to be less effective, with 75% accuracy. The reusable models provide a scalable answer for further categorization challenges. This work presents a fresh dataset and technique and shows how transformer models could improve recruiting procedures. Expanding the dataset, enhancing model interpretability, and implementing the system in actual environments will be part of future activities.

Submitted 3 November, 2024; v1 submitted 30 October, 2024; originally announced October 2024.

Comments: 6 pages, 12 figures, conference paper

arXiv:2410.00422 [pdf, other]

Exploring Physics-Informed Neural Networks: From Fundamentals to Applications in Complex Systems

Authors: Sai Ganga, Ziya Uddin

Abstract: Physics-informed neural networks (PINNs) have emerged as a versatile and widely applicable concept across various science and engineering domains over the past decade. This article offers a comprehensive overview of the fundamentals of PINNs, tracing their evolution, modifications, and variants. It explores the impact of different parameters on PINNs and the optimization algorithms involved. The review also delves into the theoretical advancements related to the convergence, consistency, and stability of numerical solutions using PINNs, while highlighting the current state of the art. Given their ability to address equations involving complex physics, the article discusses various applications of PINNs, with a particular focus on their utility in computational fluid dynamics problems. Additionally, it identifies current gaps in the research and outlines future directions for the continued development of PINNs.

Submitted 1 October, 2024; originally announced October 2024.

arXiv:2404.15298 [pdf, other]

Unraveling Retraction Dynamics in COVID-19 Research: Patterns, Reasons, and Implications

Authors: Parul Khurana, Ziya Uddin, Kiran Sharma

Abstract: Amid the COVID-19 pandemic, while the world sought solutions, some scholars exploited the situation for personal gains through deceptive studies and manipulated data. This paper presents the extent of 400 retracted COVID-19 papers listed by the Retraction Watch database until February 2024. The primary purpose of the research was to analyze journal quality and retraction trends. For all stakeholders involved, such as editors, relevant researchers, and policymakers, evaluating the journal's quality is crucial information since it could help them effectively stop such incidents and their negative effects in the future. The present research results imply that one-fourth of publications were retracted within the first month of their publication, followed by an additional 6% within six months of publication. One-third of the retractions originated from Q1 journals, with another significant portion coming from Q2 (29.8%). A notable percentage of the retracted papers (23.2%) lacked publishing impact, signifying their publication as conference papers or in journals not indexed by Scopus. An examination of the retraction reasons reveals that one-fourth of retractions were due to numerous causes, mostly in Q2 journals, and another quarter were due to data problems, with the majority happening in Q1 publications. Elsevier retracted 31 of the papers, with the majority published in Q1, followed by Springer (11.5%), predominantly in Q2. Retracted papers were mainly associated with the USA, China, and India.
In the USA, retractions were primarily from Q1 journals followed by no-impact publications; in China, it was Q1 followed by Q2, and in India, it was Q2 followed by no-impact publications. The study also examined author contributions, revealing that 69.3% were male contributors, with females (30.7%) mainly holding middle author positions.

Submitted 26 March, 2024; originally announced April 2024.

Comments: 13 Pages, 9 figures

arXiv:2404.08423 [pdf, other]

SIR-RL: Reinforcement Learning for Optimized Policy Control during Epidemiological Outbreaks in Emerging Market and Developing Economies

Authors: Maeghal Jain, Ziya Uddin, Wubshet Ibrahim

Abstract: The outbreak of COVID-19 has highlighted the intricate interplay between public health and economic stability on a global scale. This study proposes a novel reinforcement learning framework designed to optimize health and economic outcomes during pandemics. The framework leverages the SIR model, integrating both lockdown measures (via a stringency index) and vaccination strategies to simulate disease dynamics. The stringency index, indicative of the severity of lockdown measures, influences both the spread of the disease and the economic health of a country. Developing nations, which bear a disproportionate economic burden under stringent lockdowns, are the primary focus of our study. By implementing reinforcement learning, we aim to optimize governmental responses and strike a balance between the competing costs associated with public health and economic stability. This approach also enhances transparency in governmental decision-making by establishing a well-defined reward function for the reinforcement learning agent. In essence, this study introduces an innovative and ethical strategy to navigate the challenge of balancing public health and economic stability amidst infectious disease outbreaks.

Submitted 30 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: 27 pages, 12 figures

arXiv:2402.03417 [pdf, other]

A Computer Vision Based Approach for Stalking Detection Using a CNN-LSTM-MLP Hybrid Fusion Model

Authors: Murad Hasan, Shahriar Iqbal, Md. Billal Hossain Faisal, Md. Musnad Hossin Neloy, Md. Tonmoy Kabir, Md. Tanzim Reza, Md. Golam Rabiul Alam, Md Zia Uddin

Abstract: Criminal and suspicious activity detection has become a popular research topic in recent years. The rapid growth of computer vision technologies has had a crucial impact on solving this issue. However, physical stalking detection is still a less explored area despite the evolution of modern technology. Nowadays, stalking in public places has become a common occurrence with women being the most affected. Stalking is a visible action that usually occurs before any criminal activity begins as the stalker begins to follow, loiter, and stare at the victim before committing any criminal activity such as assault, kidnapping, rape, and so on. Therefore, it has become a necessity to detect stalking as all of these criminal activities can be stopped in the first place through stalking detection. In this research, we propose a novel deep learning-based hybrid fusion model to detect potential stalkers from a single video with a minimal number of frames. We extract multiple relevant features, such as facial landmarks, head pose estimation, and relative distance, as numerical values from video frames. This data is fed into a multilayer perceptron (MLP) to perform a classification task between a stalking and a non-stalking scenario. Simultaneously, the video frames are fed into a combination of convolutional and LSTM models to extract the spatio-temporal features. We use a fusion of these numerical and spatio-temporal features to build a classifier to detect stalking incidents.
Additionally, we introduce a dataset consisting of stalking and non-stalking videos gathered from various feature films and television series, which is also used to train the model. The experimental results show the efficiency and dynamism of our proposed stalker detection system, achieving 89.58% testing accuracy with a significant improvement as compared to the state-of-the-art approaches.

Submitted 5 February, 2024; originally announced February 2024.

Comments: Under review for publication in the PLOS ONE journal, 17 pages, 9 figures

arXiv:2306.15677 [pdf, other]

Measuring the continuous research impact of a researcher: The Kz index

Authors: Kiran Sharma, Ziya Uddin

Abstract: The ongoing discussion regarding the utilization of individual research performance for academic hiring, funding allocation, and resource distribution has prompted the need for improved metrics. While traditional measures such as total publications, citation counts, and the h-index provide a general overview of research impact, they fall short of capturing the continuous contribution of researchers over time. To address this limitation, we propose the implementation of the Kz index, which takes into account both publication impact and age. In this study, we calculated Kz scores for 376 research profiles. Kz reveals that researchers with the same h-index can exhibit different Kz scores, and vice versa. Furthermore, we observed instances where researchers with lower citation counts obtained higher Kz scores, and vice versa. Interestingly, the Kz metric follows a log-normal distribution. This highlights its potential as a valuable tool for ranking researchers and facilitating informed decision-making processes. By measuring the continuous research impact, we enable fair evaluations, enhance decision-making processes, and provide focused career advancement support and funding opportunities.

Submitted 12 June, 2023; originally announced June 2023.

arXiv:2304.11046 [pdf]

doi 10.1007/s11042-023-14597-6

Affective social anthropomorphic intelligent system

Authors: Md. Adyelullahil Mamun, Hasnat Md. Abdullah, Md. Golam Rabiul Alam, Muhammad Mehedi Hassan, Md. Zia Uddin

Abstract: Human conversational styles are measured by the sense of humor, personality, and tone of voice. These characteristics have become essential for conversational intelligent virtual assistants. However, most state-of-the-art intelligent virtual assistants (IVAs) fail to interpret the affective semantics of human voices. This research proposes an anthropomorphic intelligent system that can hold a proper human-like conversation with emotion and personality. A voice style transfer method is also proposed to map the attributes of a specific emotion. Initially, the frequency domain data (Mel-Spectrogram) is created by converting the temporal audio wave data, which comprises discrete patterns for audio features such as notes, pitch, rhythm, and melody. A collateral CNN-Transformer-Encoder is used to predict seven different affective states from voice. The voice is also fed in parallel to deep-speech, an RNN model that generates the text transcription from the spectrogram. Then the transcribed text is transferred to the multi-domain conversation agent using blended skill talk, a transformer-based retrieve-and-generate generation strategy, and beam-search decoding, and an appropriate textual response is generated. The system learns an invertible mapping of data to a latent space that can be manipulated and generates a Mel-spectrogram frame based on previous Mel-spectrogram frames for voice synthesis and style transfer. Finally, the waveform is generated using WaveGlow from the spectrogram.
The outcomes of the studies we conducted on individual models were promising. Furthermore, users who interacted with the system provided positive feedback, demonstrating the system's effectiveness.

Submitted 19 April, 2023; originally announced April 2023.

Comments: Multimedia Tools and Applications (2023)

arXiv:2211.14607 [pdf, other]

Sketch2FullStack: Generating Skeleton Code of Full Stack Website and Application from Sketch using Deep Learning and Computer Vision

Authors: Somoy Subandhu Barua, Imam Mohammad Zulkarnain, Abhishek Roy, Md. Golam Rabiul Alam, Md Zia Uddin

Abstract: Full-stack web or app development requires a software firm or, more specifically, a team of experienced developers to contribute a large portion of their time and resources to design the website and then convert it to code. As a result, the efficiency of the development team is significantly reduced when it comes to converting UI wireframes and database schemas into an actual working system. It would save valuable resources and speed up the overall workflow if clients or developers could automate this process of converting the pre-made full-stack website design into partially, if not fully, working code. In this paper, we present a novel approach to generating skeleton code from sketched images using Deep Learning and Computer Vision approaches. The training dataset consists of first-hand sketched images of low-fidelity wireframes, database schemas, and class diagrams. The approach consists of three parts: first, front-end or UI element detection and extraction from custom-made UI wireframes; second, individual database table creation from schema designs; and lastly, class file creation from class diagrams.

Submitted 26 November, 2022; originally announced November 2022.

Comments: 12 pages, 10 figures, preprint

MSC Class: 68T07 (Primary) ACM Class: I.2.2; I.2.10; I.2.5; I.4.0; I.4.9; I.7.0; D.2.1; D.2.2

arXiv:2003.01519 [pdf, other]

doi 10.1016/j.comcom.2020.02.065

Amateur Drones Detection: A machine learning approach utilizing the acoustic signals in the presence of strong interference

Authors: Zahoor Uddin, Muhammad Altaf, Muhammad Bilal, Lewis Nkenyereye, Ali Kashif Bashir

Abstract: Owing to their small size, sensing capabilities, and autonomous nature, Unmanned Air Vehicles (UAVs) have enormous applications in various areas, e.g., remote sensing, navigation, archaeology, journalism, environmental science, and agriculture. However, the unmonitored deployment of UAVs, called amateur drones (AmDr), can lead to serious security threats and risk to human life and infrastructure. Therefore, timely detection of the AmDr is essential for the protection and security of sensitive organizations, human life, and other vital infrastructure. AmDrs can be detected using different techniques based on sound, video, thermal, and radio frequencies. However, the performance of these techniques is limited in severe atmospheric conditions. In this paper, we propose an efficient unsupervised machine learning approach of independent component analysis (ICA) to detect various acoustic signals, i.e., sounds of birds, airplanes, thunderstorms, rain, wind, and the UAVs in a practical scenario. After unmixing the signals, features like Mel Frequency Cepstral Coefficients (MFCC), the power spectral density (PSD), and the Root Mean Square Value (RMS) of the PSD are extracted by using ICA. The PSD and the RMS of PSD signals are extracted by first passing the signals through octave band filter banks. Based on the above features, the signals are classified using Support Vector Machines (SVM) and K Nearest Neighbor (KNN) to detect the presence or absence of AmDr.
A unique feature of the proposed technique is the detection of a single or multiple AmDrs at a time in the presence of multiple acoustic interfering signals. The proposed technique is verified through extensive simulations, and it is observed that the RMS values of PSD with KNN perform better than the MFCC with KNN and SVM.

Submitted 28 February, 2020; originally announced March 2020.

Comments: 25 pages, 10 figures, accepted for the publication in future issue of "Computer Communications (2020)"

MSC Class: 68T45; 68T10; 62H30; ACM Class: C.2; C.2.4; G.3

Showing 1–12 of 12 results for author: Uddin, Z