Search | arXiv e-print repository

Compiler Optimization via LLM Reasoning for Efficient Model Serving

Authors: Sujun Tang, Christopher Priebe, Rohan Mahapatra, Lianhui Qin, Hadi Esmaeilzadeh

Abstract: While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven substantial performance improvements, but existing compilers struggle with neural workloads due to the exponentially large and highly interdependent space of possible transformations. Although existing stochastic search techniques can be effective, they are often sample-inefficient and fail to leverage the structural context underlying compilation decisions. We set out to investigate the research question of whether reasoning with large language models (LLMs), without any retraining, can leverage the context-aware decision space of compiler optimization to significantly improve sample efficiency. To that end, we introduce a novel compilation framework (dubbed REASONING COMPILER) that formulates optimization as a sequential, context-aware decision process, guided by a large language model and structured Monte Carlo tree search (MCTS). The LLM acts as a proposal mechanism, suggesting hardware-aware transformations that reflect the current program state and accumulated performance feedback. MCTS incorporates the LLM-generated proposals to balance exploration and exploitation, facilitating structured, context-sensitive traversal of the expansive compiler optimization space. By achieving substantial speedups with markedly fewer samples than leading neural compilers, our approach demonstrates the potential of LLM-guided reasoning to transform the landscape of compiler optimization.

Submitted 2 June, 2025; originally announced June 2025.

arXiv:2505.12114 [pdf, ps, other]

Behind the Screens: Uncovering Bias in AI-Driven Video Interview Assessments Using Counterfactuals

Authors: Dena F. Mujtaba, Nihar R. Mahapatra

Abstract: AI-enhanced personality assessments are increasingly shaping hiring decisions, using affective computing to predict traits from the Big Five (OCEAN) model. However, integrating AI into these assessments raises ethical concerns, especially around bias amplification rooted in training data. These biases can lead to discriminatory outcomes based on protected attributes like gender, ethnicity, and age. To address this, we introduce a counterfactual-based framework to systematically evaluate and quantify bias in AI-driven personality assessments. Our approach employs generative adversarial networks (GANs) to generate counterfactual representations of job applicants by altering protected attributes, enabling fairness analysis without access to the underlying model. Unlike traditional bias assessments that focus on unimodal or static data, our method supports multimodal evaluation, spanning visual, audio, and textual features. This comprehensive approach is particularly important in high-stakes applications like hiring, where third-party vendors often provide AI systems as black boxes. Applied to a state-of-the-art personality prediction model, our method reveals significant disparities across demographic groups. We also validate our framework using a protected attribute classifier to confirm the effectiveness of our counterfactual generation. This work provides a scalable tool for fairness auditing of commercial AI hiring platforms, especially in black-box settings where training data and model internals are inaccessible. Our results highlight the importance of counterfactual approaches in improving ethical transparency in affective computing.

Submitted 17 May, 2025; originally announced May 2025.

arXiv:2406.10177 [pdf, other]

doi 10.21437/Interspeech.2024-2246

Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation

Authors: Dena Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Caryn Herring, Jia Bin

Abstract: Automatic speech recognition (ASR) systems often falter while processing stuttering-related disfluencies -- such as involuntary blocks and word repetitions -- yielding inaccurate transcripts. A critical barrier to progress is the scarcity of large, annotated disfluent speech datasets. Therefore, we present an inclusive ASR design approach, leveraging large-scale self-supervised learning on standard speech followed by targeted fine-tuning and data augmentation on a smaller, curated dataset of disfluent speech. Our data augmentation technique enriches training datasets with various disfluencies, enhancing ASR processing of these speech patterns. Results show that fine-tuning wav2vec 2.0 with even a relatively small, labeled dataset, alongside data augmentation, can significantly reduce word error rates for disfluent speech. Our approach not only advances ASR inclusivity for people who stutter, but also paves the way for ASRs that can accommodate wider speech variations.

Submitted 1 October, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

Comments: Included in 2024 Proceedings of INTERSPEECH

ACM Class: I.2; K.4

arXiv:2405.19699 [pdf, other]

Fairness in AI-Driven Recruitment: Challenges, Metrics, Methods, and Future Directions

Authors: Dena F. Mujtaba, Nihar R. Mahapatra

Abstract: The recruitment process significantly impacts an organization's performance, productivity, and culture. Traditionally, human resource experts and industrial-organizational psychologists have developed systematic hiring methods, including job advertising, candidate skill assessments, and structured interviews to ensure candidate-organization fit. Recently, recruitment practices have shifted dramatically toward artificial intelligence (AI)-based methods, driven by the need to efficiently manage large applicant pools. However, reliance on AI raises concerns about the amplification and propagation of human biases embedded within hiring algorithms, as empirically demonstrated by biases in candidate ranking systems and automated interview assessments. Consequently, algorithmic fairness has emerged as a critical consideration in AI-driven recruitment, aimed at rigorously addressing and mitigating these biases. This paper systematically reviews biases identified in AI-driven recruitment systems, categorizes fairness metrics and bias mitigation techniques, and highlights auditing approaches used in practice. We emphasize critical gaps and current limitations, proposing future directions to guide researchers and practitioners toward more equitable AI recruitment practices, promoting fair candidate treatment and enhancing organizational outcomes.

Submitted 18 May, 2025; v1 submitted 30 May, 2024; originally announced May 2024.

ACM Class: K.4.3; I.2.0; J.4

arXiv:2405.06150 [pdf, other]

Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech

Authors: Dena Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Hope Gerlach-Houck, Caryn Herring, Jia Bin

Abstract: Automatic speech recognition (ASR) systems, increasingly prevalent in education, healthcare, employment, and mobile technology, face significant challenges in inclusivity, particularly for the 80 million-strong global community of people who stutter. These systems often fail to accurately interpret speech patterns deviating from typical fluency, leading to critical usability issues and misinterpretations. This study evaluates six leading ASRs, analyzing their performance on both a real-world dataset of speech samples from individuals who stutter and a synthetic dataset derived from the widely-used LibriSpeech benchmark. The synthetic dataset, uniquely designed to incorporate various stuttering events, enables an in-depth analysis of each ASR's handling of disfluent speech. Our comprehensive assessment includes metrics such as word error rate (WER), character error rate (CER), and semantic accuracy of the transcripts. The results reveal a consistent and statistically significant accuracy bias across all ASRs against disfluent speech, manifesting in significant syntactical and semantic inaccuracies in transcriptions. These findings highlight a critical gap in current ASR technologies, underscoring the need for effective bias mitigation strategies. Addressing this bias is imperative not only to improve the technology's usability for people who stutter but also to ensure their equitable and inclusive participation in the rapidly evolving digital landscape.

Submitted 9 May, 2024; originally announced May 2024.

Comments: Accepted to NAACL 2024
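
The abstract above names word error rate (WER) as one of its metrics. As a minimal sketch of the standard definition (word-level edit distance divided by the number of reference words), and not the paper's own evaluation code, the following Python function may help make the metric concrete. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Illustrative sketch; tokenization by whitespace is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For instance, a transcript that keeps a repeated word, such as "i i want coffee" against the reference "i want coffee", counts one insertion over three reference words.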

arXiv:2310.17912 [pdf]

Restoring the Broken Covenant Between Compilers and Deep Learning Accelerators

Authors: Sean Kinzer, Soroush Ghodrati, Rohan Mahapatra, Byung Hoon Ahn, Edwin Mascarenhas, Xiaolong Li, Janarbek Matai, Liang Zhang, Hadi Esmaeilzadeh

Abstract: Deep learning accelerators address the computational demands of Deep Neural Networks (DNNs), departing from the traditional Von Neumann execution model. They leverage specialized hardware to align with the application domain's structure. Compilers for these accelerators face distinct challenges compared to those for general-purpose processors. These challenges include exposing and managing more micro-architectural features, handling software-managed scratch pads for on-chip storage, explicitly managing data movement, and matching DNN layers with varying hardware capabilities. These complexities necessitate a new approach to compiler design, as traditional compilers mainly focused on generating fine-grained instruction sequences while abstracting micro-architecture details. This paper introduces the Architecture Covenant Graph (ACG), an abstract representation of an architectural structure's components and their programmable capabilities. By enabling the compiler to work with the ACG, it allows for adaptable compilation workflows when making changes to accelerator design, reducing the need for a complete compiler redevelopment. Codelets, which express DNN operation functionality and evolve into execution mappings on the ACG, are key to this process. The Covenant compiler efficiently targets diverse deep learning accelerators, achieving 93.8% performance compared to state-of-the-art, hand-tuned DNN layer implementations when compiling 14 DNN layers from various models on two different architectures.

Submitted 27 October, 2023; originally announced October 2023.

arXiv:2308.12120 [pdf, other]

An Open-Source ML-Based Full-Stack Optimization Framework for Machine Learning Accelerators

Authors: Hadi Esmaeilzadeh, Soroush Ghodrati, Andrew B. Kahng, Joon Kyung Kim, Sean Kinzer, Sayak Kundu, Rohan Mahapatra, Susmita Dey Manasi, Sachin Sapatnekar, Zhiang Wang, Ziqing Zeng

Abstract: Parameterizable machine learning (ML) accelerators are the product of recent breakthroughs in ML. To fully enable their design space exploration (DSE), we propose a physical-design-driven, learning-based prediction framework for hardware-accelerated deep neural network (DNN) and non-DNN ML algorithms. It adopts a unified approach that combines backend power, performance, and area (PPA) analysis with frontend performance simulation, thereby achieving a realistic estimation of both backend PPA and system metrics such as runtime and energy. In addition, our framework includes a fully automated DSE technique, which optimizes backend and system metrics through an automated search of architectural and backend parameters. Experimental studies show that our approach consistently predicts backend PPA and system metrics with an average 7% or less prediction error for the ASIC implementation of two deep learning accelerator platforms, VTA and VeriGOOD-ML, in both a commercial 12 nm process and a research-oriented 45 nm process.

Submitted 23 August, 2023; originally announced August 2023.

Comments: This is an extended version of our work titled "Physically Accurate Learning-based Performance Prediction of Hardware-accelerated ML Algorithms" published in MLCAD 2022

arXiv:2305.09248 [pdf, other]

Maximum-Width Rainbow-Bisecting Empty Annulus

Authors: Sang Won Bae, Sandip Banerjee, Arpita Baral, Priya Ranjan Sinha Mahapatra, Sang Duk Yoon

Abstract: Given a set of $n$ colored points with $k$ colors in the plane, we study the problem of computing a maximum-width rainbow-bisecting empty annulus, specifically for axis-parallel square, axis-parallel rectangular, and circular annuli. We call a region rainbow if it contains at least one point of each color. The maximum-width rainbow-bisecting empty annulus problem asks to find an annulus $A$ of a particular shape with maximum possible width such that $A$ does not contain any input points and it bisects the input point set into two parts, each of which is a rainbow. We compute a maximum-width rainbow-bisecting empty axis-parallel square, axis-parallel rectangular and circular annulus in $O(n^3)$ time using $O(n)$ space, in $O(k^2n^2\log n)$ time using $O(n\log n)$ space and in $O(n^3)$ time using $O(n^2)$ space respectively.

Submitted 26 March, 2024; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: A preliminary version is accepted in EuroCG 2021 and the expanded version is accepted in the journal Computational Geometry: Theory and Applications

arXiv:2303.03483 [pdf]

In-Storage Domain-Specific Acceleration for Serverless Computing

Authors: Rohan Mahapatra, Soroush Ghodrati, Byung Hoon Ahn, Sean Kinzer, Shu-ting Wang, Hanyang Xu, Lavanya Karthikeyan, Hardik Sharma, Amir Yazdanbakhsh, Mohammad Alian, Hadi Esmaeilzadeh

Abstract: While (1) serverless computing is emerging as a popular form of cloud execution, datacenters are going through major changes: (2) storage disaggregation at the system infrastructure level and (3) integration of domain-specific accelerators at the hardware level. Each of these three trends individually provides significant benefits; however, when combined the benefits diminish. Specifically, the paper makes the key observation that for serverless functions, the overhead of accessing disaggregated persistent storage overshadows the gains from accelerators. Therefore, to benefit from all these trends in conjunction, we propose Domain-Specific Computational Storage for Serverless (DSCS-Serverless). This idea contributes a serverless model that leverages a programmable accelerator within computational storage to combine the benefits of acceleration and storage disaggregation simultaneously. Our results with eight applications show that integrating a comparatively small accelerator within the storage (DSCS-Serverless) that fits within its power constraints (15 Watts) significantly outperforms a traditional disaggregated system that utilizes the NVIDIA RTX 2080 Ti GPU (250 Watts). Further, the work highlights that disaggregation, serverless model, and the limited power budget for computation in storage require a different design than the conventional practices of integrating microprocessors and FPGAs. This insight is in contrast with current practices of designing computational storage that are yet to address the challenges associated with the shifts in datacenters. In comparison with two such conventional designs that either use quad-core ARM A57 or a Xilinx FPGA, DSCS-Serverless provides 3.7x and 1.7x end-to-end application speedup, 4.3x and 1.9x energy reduction, and 3.2x and 2.3x higher cost efficiency, respectively.

Submitted 23 March, 2024; v1 submitted 6 March, 2023; originally announced March 2023.

arXiv:2108.06265 [pdf]

A reduced-order modeling framework for simulating signatures of faults in a bladed disk

Authors: Divya Shyam Singh, Atul Agrawal, D. Roy Mahapatra

Abstract: This paper reports a reduced-order modeling framework of bladed disks on a rotating shaft to simulate the vibration signature of faults like cracks in different components aiming towards simulated data-driven machine learning. We have employed lumped and one-dimensional analytical models of the subcomponents for better insight into the complex dynamic response. The framework seeks to address some of the challenges encountered in analyzing and optimizing fault detection and identification schemes for health monitoring of rotating turbomachinery, including aero-engines. We model the bladed disks and shafts by combining lumped elements and one-dimensional finite elements, leading to a coupled system. The simulation results are in good agreement with previously published data. We model the cracks in a blade analytically with their effective reduced stiffness approximation. Multiple types of faults are modeled, including cracks in the blades of single and two-stage bladed disks, Fan Blade Off (FBO), and Foreign Object Damage (FOD). We have applied aero-engine operational loading conditions to simulate realistic scenarios of online health monitoring. The proposed reduced-order simulation framework will have applications in probabilistic signal modeling, machine learning toward fault signature identification, and parameter estimation with measured vibration signals.

Submitted 23 August, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: 39 Pages, 12 Figures

arXiv:2105.11250 [pdf, other]

Efficient Reporting of Top-k Subset Sums

Authors: Biswajit Sanyal, Subhashis Majumder, Priya Ranjan Sinha Mahapatra

Abstract: The "Subset Sum problem" is a very well-known NP-complete problem. In this work, a top-k variation of the "Subset Sum problem" is considered. This problem has wide application in recommendation systems, where instead of k best objects the k best subsets of objects with the lowest (or highest) overall scores are required. Given an input set R of n real numbers and a positive integer k, our target is to generate the k best subsets of R such that the sum of their elements is minimized. Our solution methodology is based on constructing a metadata structure G for a given n. Each node of G stores a bit vector of size n from which a subset of R can be retrieved. Here it is shown that the construction of the whole graph G is not needed. To answer a query, only implicit traversal of the required portion of G on demand is sufficient, which obviously gets rid of the preprocessing step, thereby reducing the overall time and space requirement. A modified algorithm is then proposed to generate each subset incrementally, where it is shown that it is possible to do away with the explicit storage of the bit vector. This not only improves the space requirement but also improves the asymptotic time complexity. Finally, a variation of our algorithm that reports only the top-k subset sums has been compared with an existing algorithm, which shows that our algorithm performs better both in terms of time and space requirement by a constant factor.

Submitted 25 August, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

Comments: 27 pages, 8 figures, 2 tables, 2 algorithms, 3 functions
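
The abstract above describes reporting the k smallest subset sums by traversing an implicit graph on demand instead of building it up front. The paper's structure G and its exact algorithm are not given in this listing, so the sketch below swaps in a well-known heap-based enumeration that follows the same general idea: each state is expanded lazily into at most two successors. The function name `top_k_subset_sums`, the restriction to non-negative inputs, and the choice to report only non-empty subsets are all assumptions made for illustration.

```python
import heapq


def top_k_subset_sums(values, k):
    """Return the k smallest sums over non-empty subsets, in nondecreasing order.

    Illustrative sketch (not the paper's algorithm). Assumes non-negative
    inputs so that every successor state has a sum >= its parent's sum.
    """
    a = sorted(values)
    if not a or k <= 0:
        return []
    if a[0] < 0:
        raise ValueError("this sketch assumes non-negative numbers")

    # A state (total, i) stands for a subset whose largest chosen index is i.
    heap = [(a[0], 0)]
    sums = []
    while heap and len(sums) < k:
        total, i = heapq.heappop(heap)
        sums.append(total)
        if i + 1 < len(a):
            # Successor 1: keep the subset and also add element i + 1.
            heapq.heappush(heap, (total + a[i + 1], i + 1))
            # Successor 2: swap element i out for element i + 1.
            heapq.heappush(heap, (total - a[i] + a[i + 1], i + 1))
    return sums
```

Each non-empty subset is reached by exactly one sequence of "add" and "swap" steps, so no subset is reported twice, and the heap never holds more than O(k) states when only k results are requested.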

arXiv:2102.10140 [pdf]

BPLight-CNN: A Photonics-based Backpropagation Accelerator for Deep Learning

Authors: D. Dang, S. V. R. Chittamuru, S. Pasricha, R. Mahapatra, D. Sahoo

Abstract: Training deep learning networks involves continuous weight updates across the various layers of the deep network while using a backpropagation algorithm (BP). This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pre-trained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high performance deep learning training. We present the design for a convolutional neural network, BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models including LeNet and VGG-Net. The proposed design achieves (i) at least 34x speedup, 34x improvement in computational efficiency, and 38.5x energy savings, during training; and (ii) 29x speedup, 31x improvement in computational efficiency, and 38.7x improvement in energy savings, during inference compared to the state-of-the-art designs. All these comparisons are done at a 16-bit resolution; and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared to the state-of-the-art.

Submitted 19 February, 2021; originally announced February 2021.

Report number: epic-21-01

arXiv:1908.09085 [pdf]

doi 10.1109/TITS.2019.2899321

Adaptive Group-based Zero Knowledge Proof-Authentication Protocol (AGZKP-AP) in Vehicular Ad Hoc Networks

Authors: Amar A. Rasheed, Rabi N. Mahapatra, Felix G. Hamza-Lup

Abstract: Vehicular Ad Hoc Networks (VANETs) are a particular subclass of mobile ad hoc networks that raise a number of security challenges, notably from the way users authenticate to the network. Authentication technologies based on existing security policies and access control rules in such networks assume full trust in Roadside Units (RSUs) and authentication servers. The disclosure of authentication parameters enables user traceability over the network. VANETs' trusted entities (e.g. RSUs) can utilize such information to track a user's traveling behavior, violating user privacy and anonymity. In this paper, we propose a novel, light-weight, Adaptive Group-based Zero Knowledge Proof-Authentication Protocol (AGZKP-AP) for VANETs. The proposed authentication protocol is capable of offering various levels of user privacy settings based on the type of services available on such networks. Our scheme is based on the Zero-Knowledge-Proof (ZKP) crypto approach with the support of trade-off options. Users have the option to make critical decisions on the level of privacy and the amount of resource usage they prefer, such as short system response time versus the number of private information disclosures. Furthermore, AGZKP-AP incorporates a distributed privilege control and revoking mechanism that renders a user's private information to law enforcement in case of a traffic violation.

Submitted 23 August, 2019; originally announced August 2019.

Journal ref: IEEE Intelligent Transportation Systems (2019)

arXiv:1811.06217 [pdf, other]

Maximum-Width Empty Square and Rectangular Annulus

Authors: Sang Won Bae, Arpita Baral, Priya Ranjan Sinha Mahapatra

Abstract: An annulus is, informally, a ring-shaped region, often described by two concentric circles. The maximum-width empty annulus problem asks to find an annulus of a certain shape with the maximum possible width that avoids a given set of $n$ points in the plane. This problem can also be interpreted as the problem of finding an optimal location of a ring-shaped obnoxious facility among the input points. In this paper, we study square and rectangular variants of the maximum-width empty annulus problem, and present the first nontrivial algorithms. Specifically, our algorithms run in $O(n^3)$ and $O(n^2 \log n)$ time for computing a maximum-width empty axis-parallel square and rectangular annulus, respectively. Both algorithms use only $O(n)$ space.

Submitted 15 November, 2018; originally announced November 2018.

arXiv:1712.00375 [pdf, ps, other]

Maximum-width Axis-Parallel Empty Rectangular Annulus

Authors: Arpita Baral, Abhilash Gondane, Sanjib Sadhu, Priya Ranjan Sinha Mahapatra

Abstract: Given a set $P$ of $n$ points on $\mathbb R^{2}$, we address the problem of computing an axis-parallel empty rectangular annulus $A$ of maximum width such that no point of $P$ lies inside $A$ but all points of $P$ must lie inside, outside and on the boundaries of two parallel rectangles forming the annulus $A$. We propose an $O(n^3)$ time and $O(n)$ space algorithm to solve the problem. In a particular case when the inner rectangle of an axis-parallel empty rectangular annulus reduces to an input point we can solve the problem in $O(n \log n)$ time and $O(n)$ space.

Submitted 1 December, 2017; originally announced December 2017.

arXiv:1703.09651 [pdf]

Structural Damage Identification Using Artificial Neural Network and Synthetic data

Authors: Divya Shyam Singha, G. B. L. Chowdarya, D Roy Mahapatraa

Abstract: This paper presents a real-time vibration-based identification technique using measured frequency response functions (FRFs) under random vibration loading. Artificial Neural Networks (ANNs) are trained to map damage fingerprints to damage characteristic parameters. Principal component analysis (PCA) is used to tackle the problem of high dimensionality and high noise of data, which is common for industrial structures. The present study considers crack, rivet hole expansion and redundant uniform mass as damages on the structure. Frequency response function data, after being reduced in size using PCA, is fed to individual neural networks to localize and predict the severity of damage on the structure. The system of ANNs is trained with both numerical and experimental model data to make the system reliable and robust. The methodology is applied to a numerical model of a stiffened panel structure, where damages are confined close to the stiffener. The results showed that, in all the cases considered, it is possible to localize and predict the severity of the damage occurrence with very good accuracy and reliability.

Submitted 27 March, 2017; originally announced March 2017.

Comments: 6 pages, 6 figures, ISSS conference

arXiv:1703.08211 [pdf]

A Time-shared Photonic Reservoir Computer for Big Data Analytics

Authors: Dharanidhar Dang, Rabi Mahapatra

Abstract: Information processing has reached the era of big data. Big data challenges are difficult to address with the traditional von Neumann or Turing approach. Hence, implementation of new computational techniques is essential. Nanophotonics, with its remarkable speed and multiplexing capability, is a promising candidate for such implementations. This paper proposes a novel photonic computing system made up of a Mach-Zehnder interferometer and an optical fiber spool to emulate a powerful machine learning technique called reservoir computing. The proposed system is also integrated with a time-division-multiplexing circuit to facilitate parallel computation of multiple tasks, which is the first of its kind. The proposed design performs large-scale tasks like spoken digit recognition, channel equalization, and time-series prediction. Experimental results with a standard photonic simulator demonstrate significant performance in terms of speed and accuracy compared to state-of-the-art digital and software implementations.

Submitted 23 March, 2017; originally announced March 2017.

Comments: 4 pages, 4 figures

arXiv:1609.06630 [pdf, ps, other]

No-hole $λ$-$L(k, k-1, \ldots, 2, 1)$-labeling for Square Grid

Authors: Soumen Atta, Priya Ranjan Sinha Mahapatra, Stanisław Goldstein

Abstract: Given a fixed $k \in \mathbb{Z}^+$ and $λ \in \mathbb{Z}^+$, the objective of a $λ$-$L(k, k-1, \ldots, 2, 1)$-labeling of a graph $G$ is to assign non-negative integers (known as labels) from the set $\{0, \ldots, λ-1\}$ to the vertices of $G$ such that adjacent vertices receive values which differ by at least $k$, vertices connected by a path of length two receive values which differ by at least $k-1$, and so on. Vertices which are at least $k+1$ distance apart can receive the same label. The smallest $λ$ for which there exists a $λ$-$L(k, k-1, \ldots, 2, 1)$-labeling of $G$ is known as the $L(k, k-1, \ldots, 2, 1)$-labeling number of $G$ and is denoted by $λ_k(G)$. The ratio between the upper bound and the lower bound of a $λ$-$L(k, k-1, \ldots, 2, 1)$-labeling is known as the approximation ratio. In this paper a lower bound on the value of the labeling number for the square grid is computed and a formula is proposed which yields a $λ$-$L(k, k-1, \ldots, 2, 1)$-labeling of the square grid, with approximation ratio at most $\frac{9}{8}$. The labeling presented is a no-hole one, i.e., it uses each label from $0$ to $λ-1$ at least once.

Submitted 22 December, 2016; v1 submitted 21 September, 2016; originally announced September 2016.

arXiv:cs/0607001 [pdf]

A Novel Application of Lifting Scheme for Multiresolution Correlation of Complex Radar Signals

Authors: Chinmoy Bhattacharya, P. R. Mahapatra

Abstract: The lifting scheme of the discrete wavelet transform (DWT) is now quite well established as an efficient technique for image compression, and has been incorporated into the JPEG2000 standards. However, the potential of the lifting scheme has not been exploited in the context of correlation-based processing, such as that encountered in radar applications. This paper presents a complete and consistent framework for the application of DWT for correlation of complex signals. In particular, lifting scheme factorization of biorthogonal filterbanks is carried out in dual analysis basis spaces for multiresolution correlation of complex radar signals in the DWT domain only. A causal formulation of lifting for orthogonal filterbanks is also developed. The resulting parallel algorithms and consequent saving of computational effort are briefly dealt with.

Submitted 1 July, 2006; originally announced July 2006.

Showing 1–19 of 19 results for author: Mahapatra, R