
CONQURE: A Co-Execution Environment for Quantum and Classical Resources

Authors: Atulya Mahesh, Swastik Mittal, Frank Mueller

Abstract: Cutting-edge classical computing today relies on a combination of CPU-based computing with a strong reliance on accelerators. In particular, high-performance computing (HPC) and machine learning (ML) rely heavily on acceleration via GPUs for numerical kernels. In the future, acceleration via quantum devices may complement GPUs for kernels where algorithms provide quantum advantage, i.e., significant speedups over classical algorithms. Computing with quantum kernels mapped onto quantum processing units (QPUs) requires seamless integration into HPC and ML. However, quantum offloading onto HPC/cloud lacks open-source software infrastructure. For classical algorithms, parallelization standards, such as OpenMP, MPI, or CUDA, exist. In contrast, a lack of quantum abstractions currently limits the adoption of quantum acceleration in practical applications, creating a gap between quantum algorithm development and practical HPC integration. Such integration needs to extend to efficient quantum offloading of kernels, which further requires scheduling of quantum resources, control of QPU kernel execution, tracking of QPU results, providing results to classical calling contexts and coordination with HPC scheduling. This work proposes CONQURE, a co-execution environment for quantum and classical resources.
CONQURE is a fully open-source cloud queue framework that presents a novel modular scheduling framework allowing users to offload OpenMP quantum kernels to QPUs as quantum circuits, to relay results back to calling contexts in classical computing, and to schedule quantum resources via our CONQURE API. We show our API has a low overhead averaging 12.7 ms in our tests, and we demonstrate functionality on an ion-trap device. Our OpenMP extension enables the parallelization of VQE runs with a 3.1X reduction in runtime.

Submitted 16 June, 2025; v1 submitted 4 May, 2025; originally announced May 2025.

arXiv:2410.00708 [pdf, other]

Hybrid Quantum Neural Network based Indoor User Localization using Cloud Quantum Computing

Authors: Sparsh Mittal, Yash Chand, Neel Kanth Kundu

Abstract: This paper proposes a hybrid quantum neural network (HQNN) for indoor user localization using received signal strength indicator (RSSI) values. We use publicly available RSSI datasets for indoor localization using WiFi, Bluetooth, and Zigbee to test the performance of the proposed HQNN. We also compare the performance of the HQNN with the recently proposed quantum fingerprinting-based user localization method. Our results show that the proposed HQNN performs better than the quantum fingerprinting algorithm since the HQNN has trainable parameters in the quantum circuits, whereas the quantum fingerprinting algorithm uses a fixed quantum circuit to calculate the similarity between the test data point and the fingerprint dataset. Unlike prior works, we also test the performance of the HQNN and quantum fingerprint algorithm on a real IBM quantum computer using cloud quantum computing services. Therefore, this paper examines the performance of the HQNN on noisy intermediate-scale quantum (NISQ) devices using real-world RSSI localization datasets. The novelty of our approach lies in the use of simple feature maps and ansatz with fewer neurons, alongside testing on actual quantum hardware using real-world data, demonstrating practical applicability in real-world scenarios.

Submitted 1 October, 2024; originally announced October 2024.

Comments: This work has been accepted for presentation at the IEEE TENSYMP 2024 conference

arXiv:2407.20696 [pdf]

Implementation of Formal Standard for Interoperability in M&S/System of Systems Integration with DEVS/SOA

Authors: Saurabh Mittal, Bernard P. Zeigler, José L. Risco-Martín

Abstract: Modeling and Simulation (M&S) is finding increasing application in development and testing of command and control systems comprised of information-intensive component systems. Achieving interoperability is one of the chief System of Systems (SoS) engineering objectives in the development of command and control (C2) capabilities for joint and coalition warfare. In this paper, we apply an SoS perspective on the integration of M&S with such systems. We employ recently developed interoperability concepts based on linguistic categories along with the Discrete Event System Specification (DEVS) formalism to implement a standard for interoperability. We will show how the developed standard is implemented in DEVS/SOA net-centric modeling and simulation framework that uses XML-based Service Oriented Architecture (SOA). We will discuss the simulator interfaces and the design issues in their implementation in DEVS/SOA. We will illustrate the application of DEVS/SOA in a multi-agent test instrumentation system that is deployable as a SOA.

Submitted 30 July, 2024; originally announced July 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2407.03686

Journal ref: The International C2 Journal, 3(1), pp. 1-61, 2009

arXiv:2401.10373 [pdf, other]

Harmonized Spatial and Spectral Learning for Robust and Generalized Medical Image Segmentation

Authors: Vandan Gorade, Sparsh Mittal, Debesh Jha, Rekha Singhal, Ulas Bagci

Abstract: Deep learning has demonstrated remarkable achievements in medical image segmentation. However, prevailing deep learning models struggle with poor generalization due to (i) intra-class variations, where the same class appears differently in different samples, and (ii) inter-class independence, resulting in difficulties capturing intricate relationships between distinct objects, leading to higher false negative cases. This paper presents a novel approach that synergizes spatial and spectral representations to enhance domain-generalized medical image segmentation. We introduce the innovative Spectral Correlation Coefficient objective to improve the model's capacity to capture middle-order features and contextual long-range dependencies. This objective complements traditional spatial objectives by incorporating valuable spectral information. Extensive experiments reveal that optimizing this objective with existing architectures like UNet and TransUNet significantly enhances generalization, interpretability, and noise robustness, producing more confident predictions. For instance, in cardiac segmentation, we observe a 0.81 pp and 1.63 pp (pp = percentage point) improvement in DSC over UNet and TransUNet, respectively. Our interpretability study demonstrates that, in most tasks, objectives optimized with UNet outperform even TransUNet by introducing global contextual information alongside local details. These findings underscore the versatility and effectiveness of our proposed method across diverse imaging modalities and medical domains.

Submitted 8 August, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: Early Accepted at ICPR-2024 for Oral Presentation

arXiv:2312.01128 [pdf, other]

SPEEDNet: Salient Pyramidal Enhancement Encoder-Decoder Network for Colonoscopy Images

Authors: Tushir Sahu, Vidhi Bhatt, Sai Chandra Teja R, Sparsh Mittal, Nagesh Kumar S

Abstract: Accurate identification and precise delineation of regions of significance, such as tumors or lesions, is a pivotal goal in medical imaging analysis. This paper proposes SPEEDNet, a novel architecture for precisely segmenting lesions within colonoscopy images. SPEEDNet uses a novel block named Dilated-Involutional Pyramidal Convolution Fusion (DIPC). A DIPC block combines the dilated involution layers pairwise into a pyramidal structure to convert the feature maps into a compact space. This lowers the total number of parameters while improving the learning of representations across an optimal receptive field, thereby reducing the blurring effect. On the EBHISeg dataset, SPEEDNet outperforms three previous networks: UNet, FeedNet, and AttesResDUNet. Specifically, SPEEDNet attains an average dice score of 0.952 and a recall of 0.971. Qualitative results and ablation studies provide additional insights into the effectiveness of SPEEDNet. The model size of SPEEDNet is 9.81 MB, significantly smaller than that of UNet (22.84 MB), FeedNet (185.58 MB), and AttesResDUNet (140.09 MB).

Submitted 2 December, 2023; originally announced December 2023.

Comments: 5 pages, 3 figures

arXiv:2311.08085 [pdf, other]

Optimizing Electric Vehicle Efficiency with Real-Time Telemetry using Machine Learning

Authors: Aryaman Rao, Harshit Gupta, Parth Singh, Shivam Mittal, Utkrash Singh, Dinesh Kumar Vishwakarma

Abstract: In the contemporary world with degrading natural resources, the urgency of energy efficiency has become imperative due to the conservation and environmental safeguarding. Therefore, it's crucial to look for advanced technology to minimize energy consumption. This research focuses on the optimization of battery-electric city style vehicles through the use of a real-time in-car telemetry system that communicates between components through the robust Controller Area Network (CAN) protocol. By harnessing real-time data from various sensors embedded within vehicles, our driving assistance system provides the driver with visual and haptic actionable feedback that guides the driver on using the optimum driving style to minimize power consumed by the vehicle. To develop the pace feedback mechanism for the driver, real-time data is collected through a Shell Eco Marathon Urban Concept vehicle platform and, after pre-processing, it is analyzed using the novel machine learning algorithm TEMSL, which outperforms the existing baseline approaches across various performance metrics. This innovative method, after numerous experiments, has proven effective in enhancing energy efficiency, guiding the driver along the track, and reducing human errors. The driving-assistance system offers a range of utilities, from cost savings and extended vehicle lifespan to significant contributions to environmental conservation and sustainable driving practices.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2207.00960 [pdf, other]

doi 10.1016/j.compind.2022.103720

WaferSegClassNet -- A Light-weight Network for Classification and Segmentation of Semiconductor Wafer Defects

Authors: Subhrajit Nag, Dhruv Makwana, Sai Chandra Teja R, Sparsh Mittal, C Krishna Mohan

Abstract: As the integration density and design intricacy of semiconductor wafers increase, the magnitude and complexity of defects in them are also on the rise. Since the manual inspection of wafer defects is costly, an automated artificial intelligence (AI) based computer-vision approach is highly desired. The previous works on defect analysis have several limitations, such as low accuracy and the need for separate models for classification and segmentation. For analyzing mixed-type defects, some previous works require separately training one model for each defect type, which is non-scalable. In this paper, we present WaferSegClassNet (WSCN), a novel network based on encoder-decoder architecture. WSCN performs simultaneous classification and segmentation of both single and mixed-type wafer defects. WSCN uses a "shared encoder" for classification and segmentation, which allows training WSCN end-to-end. We use N-pair contrastive loss to first pretrain the encoder and then use BCE-Dice loss for segmentation, and categorical cross-entropy loss for classification. Use of N-pair contrastive loss helps in better embedding representation in the latent dimension of wafer maps. WSCN has a model size of only 0.51 MB and performs only 0.2M FLOPS. Thus, it is much lighter than other state-of-the-art models. Also, it requires only 150 epochs for convergence, compared to 4,000 epochs needed by a previous work. We evaluate our model on the MixedWM38 dataset, which has 38,015 images. WSCN achieves an average classification accuracy of 98.2% and a dice coefficient of 0.9999.
We are the first to show segmentation results on the MixedWM38 dataset. The source code can be obtained from https://github.com/ckmvigil/WaferSegClassNet.

Submitted 3 July, 2022; originally announced July 2022.

Comments: 11 pages, 2 figures, 7 tables, Published in Computers in Industry

Journal ref: Volume 142, 2022, 103720, ISSN 0166-3615

arXiv:2105.11241 [pdf]

Generation of COVID-19 Chest CT Scan Images using Generative Adversarial Networks

Authors: Prerak Mann, Sahaj Jain, Saurabh Mittal, Aruna Bhat

Abstract: SARS-CoV-2, also known as COVID-19 or Coronavirus, is a viral contagious disease that is caused by a novel coronavirus, and has been rapidly spreading across the globe. It is very important to test and isolate people to reduce spread, and from here comes the need to do this quickly and efficiently. According to some studies, Chest-CT outperforms RT-PCR lab testing, which is the current standard, when diagnosing COVID-19 patients. Due to this, computer vision researchers have developed various deep learning systems that can predict COVID-19 using a Chest-CT scan correctly to a certain degree. The accuracy of these systems is limited since deep learning neural networks such as CNNs (Convolutional Neural Networks) need a significantly large quantity of data for training in order to produce good quality results. Since the disease is relatively recent and more focus has been on CXR (Chest XRay) images, the available chest CT Scan image dataset is much smaller. We propose a method, by utilizing GANs, to generate synthetic chest CT images of both positive and negative COVID-19 patients. Using a pre-built predictive model, we concluded that around 40% of the generated images are correctly predicted as COVID-19 positive. The dataset thus generated can be used to train a CNN-based classifier which can help determine COVID-19 in a patient with greater accuracy.

Submitted 20 May, 2021; originally announced May 2021.

arXiv:2008.03205 [pdf, other]

Multi-Task Driven Explainable Diagnosis of COVID-19 using Chest X-ray Images

Authors: Aakarsh Malhotra, Surbhi Mittal, Puspita Majumdar, Saheb Chhabra, Kartik Thakral, Mayank Vatsa, Richa Singh, Santanu Chaudhury, Ashwin Pudrod, Anjali Agrawal

Abstract: With an increasing number of COVID-19 cases globally, all the countries are ramping up the testing numbers. While the RT-PCR kits are available in sufficient quantity in several countries, others are facing challenges with limited availability of testing kits and processing centers in remote areas. This has motivated researchers to find alternate methods of testing which are reliable, easily accessible and faster. Chest X-Ray is one of the modalities that is gaining acceptance as a screening modality. Towards this direction, the paper has two primary contributions. Firstly, we present the COVID-19 Multi-Task Network which is an automated end-to-end network for COVID-19 screening. The proposed network not only predicts whether the CXR has COVID-19 features present or not, it also performs semantic segmentation of the regions of interest to make the model explainable. Secondly, with the help of medical professionals, we manually annotate the lung regions of 9000 frontal chest radiographs taken from ChestXray-14, CheXpert and a consolidated COVID-19 dataset. Further, 200 chest radiographs pertaining to COVID-19 patients are also annotated for semantic segmentation. This database will be released to the research community.

Submitted 3 August, 2020; originally announced August 2020.

arXiv:2006.15103 [pdf, other]

DRACO: Co-Optimizing Hardware Utilization, and Performance of DNNs on Systolic Accelerator

Authors: Nandan Kumar Jha, Shreyas Ravishankar, Sparsh Mittal, Arvind Kaushik, Dipan Mandal, Mahesh Chandra

Abstract: The number of processing elements (PEs) in a fixed-sized systolic accelerator is well matched for large and compute-bound DNNs; whereas, memory-bound DNNs suffer from PE underutilization and fail to achieve peak performance and energy efficiency. To mitigate this, specialized dataflow and/or micro-architectural techniques have been proposed. However, due to the longer development cycle and the rapid pace of evolution in the deep learning fields, these hardware-based solutions can be obsolete and ineffective in dealing with PE underutilization for state-of-the-art DNNs. In this work, we address the challenge of PE underutilization at the algorithm front and propose data reuse aware co-optimization (DRACO). This improves the PE utilization of memory-bound DNNs without any additional need for dataflow/micro-architecture modifications. Furthermore, unlike the previous co-optimization methods, DRACO not only maximizes performance and energy efficiency but also improves the predictive performance of DNNs. To the best of our knowledge, DRACO is the first work that resolves the resource underutilization challenge at the algorithm level and demonstrates a trade-off between computational efficiency, PE utilization, and predictive performance of DNN. Compared to the state-of-the-art row stationary dataflow, DRACO achieves 41.8% and 42.6% improvement in average PE utilization and inference latency (respectively) with negligible loss in predictive performance in MobileNetV1 on a $64\times64$ systolic array.
DRACO provides seminal insights for utilization-aware DNN design methodologies that can fully leverage the computation power of systolic array-based hardware accelerators.

Submitted 26 June, 2020; originally announced June 2020.

Comments: Accepted as a conference paper in the IEEE Computer Society Annual Symposium on VLSI (ISVLSI). Limassol, CYPRUS, July 6-8, 2020

ACM Class: I.5.1; I.5.2; C.0; C.1.3

arXiv:2006.15100 [pdf, other]

doi 10.1109/VLSID49098.2020.00044

E2GC: Energy-efficient Group Convolution in Deep Neural Networks

Authors: Nandan Kumar Jha, Rajat Saini, Subhrajit Nag, Sparsh Mittal

Abstract: The number of groups ($g$) in group convolution (GConv) is selected to boost the predictive performance of deep neural networks (DNNs) in a compute and parameter efficient manner. However, we show that naive selection of $g$ in GConv creates an imbalance between the computational complexity and degree of data reuse, which leads to suboptimal energy efficiency in DNNs. We devise an optimum group size model, which enables a balance between computational cost and data movement cost, thus optimizing the energy efficiency of DNNs. Based on the insights from this model, we propose an "energy-efficient group convolution" (E2GC) module where, unlike the previous implementations of GConv, the group size ($G$) remains constant. Further, to demonstrate the efficacy of the E2GC module, we incorporate this module in the design of MobileNet-V1 and ResNeXt-50 and perform experiments on two GPUs, P100 and P4000. We show that, at comparable computational complexity, DNNs with constant group size (E2GC) are more energy-efficient than DNNs with a fixed number of groups (F$g$GC). For example, on P100 GPU, the energy-efficiency of MobileNet-V1 and ResNeXt-50 is increased by 10.8% and 4.73% (respectively) when E2GC modules substitute the F$g$GC modules in both the DNNs. Furthermore, through our extensive experimentation with ImageNet-1K and Food-101 image classification datasets, we show that the E2GC module enables a trade-off between generalization ability and representational power of DNN.
Thus, the predictive performance of DNNs can be optimized by selecting an appropriate $G$. The code and trained models are available at https://github.com/iithcandle/E2GC-release.

Submitted 26 June, 2020; originally announced June 2020.

Comments: Accepted as a conference paper in 2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID)

ACM Class: I.5.1; I.5.2; I.5.5; C.0

Journal ref: VLSID (2020) 155-160

arXiv:1911.04410 [pdf]

A deep learning framework for morphologic detail beyond the diffraction limit in infrared spectroscopic imaging

Authors: Kianoush Falahkheirkhah, Kevin Yeh, Shachi Mittal, Luke Pfister, Rohit Bhargava

Abstract: Infrared (IR) microscopes measure spectral information that quantifies molecular content to assign the identity of biomedical cells but lack the spatial quality of optical microscopy to appreciate morphologic features. Here, we propose a method to utilize the semantic information of cellular identity from IR imaging with the morphologic detail of pathology images in a deep learning-based approach to image super-resolution. Using Generative Adversarial Networks (GANs), we enhance the spatial detail in IR imaging beyond the diffraction limit while retaining their spectral contrast. This technique can be rapidly integrated with modern IR microscopes to provide a framework useful for routine pathology.

Submitted 19 December, 2019; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: corrected typos (the word "lack" was missing in the abstract)

arXiv:1307.4952 [pdf, other]

The Pin-Bang Theory: Discovering The Pinterest World

Authors: Sudip Mittal, Neha Gupta, Prateek Dewan, Ponnurangam Kumaraguru

Abstract: Pinterest is an image-based online social network, which was launched in the year 2010 and has gained a lot of traction ever since. Within 3 years, Pinterest has attained 48.7 million unique users. This stupendous growth makes it interesting to study Pinterest, and gives rise to multiple questions about its users and content. We characterized Pinterest on the basis of large scale crawls of 3.3 million user profiles, and 58.8 million pins. In particular, we explored various attributes of users, pins, boards, pin sources, and user locations, in detail and performed topical analysis of user generated textual content. The characterization revealed most prominent topics among users and pins, top image sources, and geographical distribution of users on Pinterest. We then investigated this social network from a privacy and security standpoint, and found traces of malware in the form of pin sources. Instances of Personally Identifiable Information (PII) leakage were also discovered in the form of phone numbers, BBM (Blackberry Messenger) pins, and email addresses. Further, our analysis demonstrated how Pinterest is a potential venue for copyright infringement, by showing that almost half of the images shared on Pinterest go uncredited. To the best of our knowledge, this is the first attempt to characterize Pinterest at such a large scale.

Submitted 18 July, 2013; originally announced July 2013.

Comments: 15 pages, 10 figures, 5 tables

MSC Class: 68

Showing 1–13 of 13 results for author: Mittal, S