Search | arXiv e-print repository

PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation

Authors: Yifan Yin, Zhengtao Han, Shivam Aarya, Jianxin Wang, Shuhang Xu, Jiawei Peng, Angtian Wang, Alan Yuille, Tianmin Shu

Abstract: Fine-grained robot manipulation, such as lifting and rotating a bottle to display the label on the cap, requires robust reasoning about object parts and their relationships with intended tasks. Despite recent advances in training general-purpose robot manipulation policies guided by language instructions, there is a notable lack of large-scale datasets for fine-grained manipulation tasks with part-level instructions and diverse 3D object instances annotated with part-level labels. In this work, we introduce PartInstruct, the first large-scale benchmark for training and evaluating fine-grained robot manipulation models using part-level instructions. PartInstruct comprises 513 object instances across 14 categories, each annotated with part-level information, and 1302 fine-grained manipulation tasks organized into 16 task classes. Our training set consists of over 10,000 expert demonstrations synthesized in a 3D simulator, where each demonstration is paired with a high-level task instruction, a chain of base part-based skill instructions, and ground-truth 3D information about the object and its parts. Additionally, we designed a comprehensive test suite to evaluate the generalizability of learned policies across new states, objects, and tasks. We evaluated several state-of-the-art robot manipulation approaches, including end-to-end vision-language policy learning and bi-level planning models for robot manipulation on our benchmark. The experimental results reveal that current models struggle to robustly ground part concepts and predict actions in 3D space, and face challenges when manipulating object parts in long-horizon tasks.

Submitted 27 May, 2025; originally announced May 2025.

arXiv:2505.10498 [pdf, other]

Batched Nonparametric Bandits via k-Nearest Neighbor UCB

Authors: Sakshi Arya

Abstract: We study sequential decision-making in batched nonparametric contextual bandits, where actions are selected over a finite horizon divided into a small number of batches. Motivated by constraints in domains such as medicine and marketing -- where online feedback is limited -- we propose a nonparametric algorithm that combines adaptive k-nearest neighbor (k-NN) regression with the upper confidence bound (UCB) principle. Our method, BaNk-UCB, is fully nonparametric, adapts to the context dimension, and is simple to implement. Unlike prior work relying on parametric or binning-based estimators, BaNk-UCB uses local geometry to estimate rewards and adaptively balances exploration and exploitation. We provide near-optimal regret guarantees under standard Lipschitz smoothness and margin assumptions, using a theoretically motivated batch schedule that balances regret across batches and achieves minimax-optimal rates. Empirical evaluations on synthetic and real-world datasets demonstrate that BaNk-UCB consistently outperforms binning-based baselines.

Submitted 15 May, 2025; originally announced May 2025.

Comments: 25 pages, 6 figures

MSC Class: 68T05; 62L05; 62G08; 68Q32 ACM Class: F.2.2; I.2.6
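
The abstract above combines k-NN regression with a UCB rule. As a rough illustration of that general idea only, here is a minimal sketch of a generic k-NN UCB index. It is not the paper's BaNk-UCB: the function names, the fixed `k`, and the specific bonus term are all assumptions made for this example, and the batching schedule is omitted.

```python
import math

def knn_ucb_index(context, history, k, t):
    """Illustrative k-NN UCB index for a single arm (hypothetical helper).

    history: list of (past_context, reward) pairs observed for this arm.
    Returns the mean reward of the k nearest past contexts plus a bonus.
    """
    if not history:
        return math.inf  # an arm with no data is always explored first
    # Keep the k past observations closest (Euclidean) to the current context.
    nearest = sorted(history, key=lambda h: math.dist(h[0], context))[:k]
    mean_reward = sum(r for _, r in nearest) / len(nearest)
    # Distance to the farthest retained neighbour: a crude proxy for bias.
    radius = math.dist(nearest[-1][0], context)
    # Assumed bonus: a variance-style term plus the neighbourhood radius.
    bonus = math.sqrt(math.log(max(t, 2)) / len(nearest)) + radius
    return mean_reward + bonus

def choose_arm(context, histories, k, t):
    """Pick the arm whose k-NN UCB index is largest for this context."""
    return max(range(len(histories)),
               key=lambda a: knn_ucb_index(context, histories[a], k, t))
```

In a batched setting, the per-arm histories would only be refreshed at batch boundaries, so every decision within a batch uses the same stored observations.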

arXiv:2504.18673 [pdf, other]

Can Third-parties Read Our Emotions?

Authors: Jiayi Li, Yingfan Zhou, Pranav Narayanan Venkit, Halima Binte Islam, Sneha Arya, Shomir Wilson, Sarah Rajtmajer

Abstract: Natural Language Processing tasks that aim to infer an author's private states, e.g., emotions and opinions, from their written text, typically rely on datasets annotated by third-party annotators. However, the assumption that third-party annotators can accurately capture authors' private states remains largely unexamined. In this study, we present human subjects experiments on emotion recognition tasks that directly compare third-party annotations with first-party (author-provided) emotion labels. Our findings reveal significant limitations in third-party annotations, whether provided by human annotators or large language models (LLMs), in faithfully representing authors' private states. However, LLMs outperform human annotators nearly across the board. We further explore methods to improve third-party annotation quality. We find that demographic similarity between first-party authors and third-party human annotators enhances annotation performance, while incorporating first-party demographic information into prompts leads to a marginal but statistically significant improvement in LLMs' performance. We introduce a framework for evaluating the limitations of third-party annotations and call for refined annotation practices to accurately represent and model authors' private states.

Submitted 25 April, 2025; originally announced April 2025.

arXiv:2503.00565 [pdf, other]

Semi-Parametric Batched Global Multi-Armed Bandits with Covariates

Authors: Sakshi Arya, Hyebin Song

Abstract: The multi-armed bandits (MAB) framework is a widely used approach for sequential decision-making, where a decision-maker selects an arm in each round with the goal of maximizing long-term rewards. Moreover, in many practical applications, such as personalized medicine and recommendation systems, feedback is provided in batches, contextual information is available at the time of decision-making, and rewards from different arms are related rather than independent. We propose a novel semi-parametric framework for batched bandits with covariates and a shared parameter across arms, leveraging the single-index regression (SIR) model to capture relationships between arm rewards while balancing interpretability and flexibility. Our algorithm, Batched single-Index Dynamic binning and Successive arm elimination (BIDS), employs a batched successive arm elimination strategy with a dynamic binning mechanism guided by the single-index direction. We consider two settings: one where a pilot direction is available and another where the direction is estimated from data, deriving theoretical regret bounds for both cases. When a pilot direction is available with sufficient accuracy, our approach achieves minimax-optimal rates (with $d = 1$) for nonparametric batched bandits, circumventing the curse of dimensionality. Extensive experiments on simulated and real-world datasets demonstrate the effectiveness of our algorithm compared to the nonparametric batched bandit method introduced by \cite{jiang2024batched}.

Submitted 1 March, 2025; originally announced March 2025.

MSC Class: 62L05; 62G05

arXiv:2412.15380 [pdf, other]

Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-supervised Medical Image Segmentation

Authors: Meghana Karri, Amit Soni Arya, Koushik Biswas, Nicolò Gennaro, Vedat Cicek, Gorkem Durak, Yuri S. Velichko, Ulas Bagci

Abstract: This work proposes a novel framework, Uncertainty-Guided Cross Attention Ensemble Mean Teacher (UG-CEMT), for achieving state-of-the-art performance in semi-supervised medical image segmentation. UG-CEMT leverages the strengths of co-training and knowledge distillation by combining a Cross-attention Ensemble Mean Teacher framework (CEMT) inspired by Vision Transformers (ViT) with uncertainty-guided consistency regularization and Sharpness-Aware Minimization emphasizing uncertainty. UG-CEMT improves semi-supervised performance while maintaining a consistent network architecture and task setting by fostering high disparity between sub-networks. Experiments demonstrate significant advantages over existing methods like Mean Teacher and Cross-pseudo Supervision in terms of disparity, domain generalization, and medical image segmentation performance. UG-CEMT achieves state-of-the-art results on multi-center prostate MRI and cardiac MRI datasets, where object segmentation is particularly challenging. Our results show that using only 10% labeled data, UG-CEMT approaches the performance of fully supervised methods, demonstrating its effectiveness in exploiting unlabeled data for robust medical image segmentation. The code is publicly available at https://github.com/Meghnak13/UG-CEMT

Submitted 19 December, 2024; originally announced December 2024.

Comments: Accepted in WACV 2025

arXiv:2411.04131 [pdf]

Data Processing Chain and Products of EOS-06 OCM-3 Payload From Signal Processing to Geometric Precision

Authors: Ankur Garg, Tushar Shukla, Sunita Arya, Ghansham Sangar, Sampa Roy, Meenakshi Sarkar, S. Manthira Moorthi, Debajyoti Dhar

Abstract: The Ocean Color Monitor-3, launched aboard Oceansat-3, represents a significant advancement in ocean observation technology, building upon the capabilities of its predecessors. With thirteen spectral bands, OCM-3 enhances feature identification and atmospheric correction, enabling precise data collection from a sun-synchronous orbit. Operating at an altitude of 732.5 km, the satellite achieves high signal-to-noise ratios (SNR) through sophisticated onboard and ground processing techniques, including advanced geometric modeling for pixel registration. The OCM-3 processing pipeline, consisting of multiple levels, ensures rigorous calibration and correction of radiometric and geometric data. This paper presents key methodologies, such as dark data modeling, photo response non-uniformity correction, and smear correction, that are employed to enhance data quality. The effective implementation of ground time delay integration (TDI) allows for the refinement of SNR, with evaluations demonstrating that performance specifications were exceeded. Geometric calibration procedures that further optimize data reliability, including band-to-band registration and geolocation accuracy assessments, are also presented in the paper. Advanced image registration techniques leveraging Ground Control Points (GCPs) and residual error analysis significantly reduce geolocation errors, achieving precision within specified thresholds. Overall, OCM-3's comprehensive calibration and processing strategies ensure high-quality, reliable data crucial for ocean monitoring and change detection applications, facilitating improved understanding of ocean dynamics and environmental changes.

Submitted 22 October, 2024; originally announced November 2024.

Comments: Preprint

arXiv:2411.00715 [pdf, other]

B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable

Authors: Shreyash Arya, Sukrut Rao, Moritz Böhle, Bernt Schiele

Abstract: B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vision transformers (ViTs), which primarily replace linear layers with B-cos transformations, perform competitively to their respective standard variants while also yielding explanations that are faithful by design. However, it has so far been necessary to train these models from scratch, which is increasingly infeasible in the era of large, pre-trained foundation models. In this work, inspired by the architectural similarities in standard DNNs and B-cos networks, we propose 'B-cosification', a novel approach to transform existing pre-trained models to become inherently interpretable. We perform a thorough study of design choices to perform this conversion, both for convolutional neural networks and vision transformers. We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability, while often outperforming them in terms of classification performance at a fraction of the training cost. Subsequently, we apply B-cosification to a pretrained CLIP model, and show that, even with limited data and compute cost, we obtain a B-cosified version that is highly interpretable and competitive on zero shot performance across a variety of datasets. We release our code and pre-trained model weights at https://github.com/shrebox/B-cosification.

Submitted 24 January, 2025; v1 submitted 1 November, 2024; originally announced November 2024.

Comments: 31 pages, 9 figures, 12 tables, Neural Information Processing Systems (NeurIPS) 2024; added references, corrected typos

arXiv:2410.09684 [pdf, other]

Technical Design Review of Duke Robotics Club's Oogway: An AUV for RoboSub 2024

Authors: Will Denton, Michael Bryant, Lilly Chiavetta, Vedarsh Shah, Rico Zhu, Philip Xue, Vincent Chen, Maxwell Lin, Hung Le, Austin Camacho, Raul Galvez, Nathan Yang, Nathanael Ren, Tyler Rose, Mathew Chu, Amir Ergashev, Saagar Arya, Kaelyn Pieter, Ethan Horowitz, Maanav Allampallam, Patrick Zheng, Mia Kaarls, June Wood

Abstract: The Duke Robotics Club is proud to present our robot for the 2024 RoboSub Competition: Oogway. Now in its second year, Oogway has been dramatically upgraded in both its capabilities and reliability. Oogway was built on the principle of independent, well-integrated, and reliable subsystems. Individual components and subsystems were tested and designed separately. Oogway's most advanced capabilities are a result of the tight integration between these subsystems. Such examples include a re-envisioned controls system, an entirely new electrical stack, advanced sonar integration, additional cameras and system monitoring, a new marker dropper, and a watertight capsule mechanism. These additions enabled Oogway to prequalify for RoboSub 2024.

Submitted 12 October, 2024; originally announced October 2024.

arXiv:2409.10849 [pdf, other]

SIFToM: Robust Spoken Instruction Following through Theory of Mind

Authors: Lance Ying, Jason Xinyu Liu, Shivam Aarya, Yizirui Fang, Stefanie Tellex, Joshua B. Tenenbaum, Tianmin Shu

Abstract: Spoken language instructions are ubiquitous in agent collaboration. However, in human-robot collaboration, recognition accuracy for human speech is often influenced by various speech and environmental factors, such as background noise, the speaker's accents, and mispronunciation. When faced with noisy or unfamiliar auditory inputs, humans use context and prior knowledge to disambiguate the stimulus and take pragmatic actions, a process referred to as top-down processing in cognitive science. We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions by inferring the human's goal and joint plan as prior for speech perception and understanding. We test SIFToM in simulated home experiments (VirtualHome 2). Results show that the SIFToM model outperforms state-of-the-art speech and language models, approaching human-level accuracy on challenging speech instruction following tasks. We then demonstrate its ability at the task planning level on a mobile manipulator for breakfast preparation tasks.

Submitted 16 September, 2024; originally announced September 2024.

Comments: 7 pages, 4 figures

arXiv:2408.14995 [pdf, other]

Decomposing the Persistent Homology Transform of Star-Shaped Objects

Authors: Shreya Arya, Barbara Giunti, Abigail Hickok, Lida Kanari, Sarah McGuire, Katharine Turner

Abstract: In this paper, we study the geometric decomposition of the degree-$0$ Persistent Homology Transform (PHT) as viewed as a persistence diagram bundle. We focus on star-shaped objects as they can be segmented into smaller, simpler regions known as "sectors". Algebraically, we demonstrate that the degree-$0$ persistence diagram of a star-shaped object in $\mathbb{R}^2$ can be derived from the degree-$0$ persistence diagrams of its sectors. Using this, we then establish sufficient conditions for star-shaped objects in $\mathbb{R}^2$ so that they have "trivial geometric monodromy". Consequently, the PHT of such a shape can be decomposed as a union of curves parameterized by $S^1$, where the curves are given by the continuous movement of each point in the persistence diagrams that are parameterized by $S^{1}$. Finally, we discuss the current challenges of generalizing these results to higher dimensions.

Submitted 27 August, 2024; originally announced August 2024.

Comments: 26 pages, 6 Figures

arXiv:2404.11667 [pdf, other]

Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

Authors: Shivvrat Arya, Yu Xiang, Vibhav Gogate

Abstract: We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with deep learning architectures, they provide an intuitive, easy-to-use loss function for multi-label classification. A drawback of DDNs compared to Markov networks is their lack of advanced inference schemes, necessitating the use of Gibbs sampling. To address this challenge, we propose novel inference schemes based on local search and integer linear programming for computing the most likely assignment to the labels given observations. We evaluate our novel methods on three video datasets (Charades, TACoS, Wetlab) and three image datasets (MS-COCO, PASCAL VOC, NUS-WIDE), comparing their performance with (a) basic neural architectures and (b) neural architectures combined with Markov networks equipped with advanced inference and learning techniques. Our results demonstrate the superiority of our new DDN methods over the two competing approaches.

Submitted 17 April, 2024; originally announced April 2024.

Comments: Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633

arXiv:2404.11606 [pdf, other]

Learning to Solve the Constrained Most Probable Explanation Task in Probabilistic Graphical Models

Authors: Shivvrat Arya, Tahrima Rahman, Vibhav Gogate

Abstract: We propose a self-supervised learning approach for solving the following constrained optimization task in log-linear models or Markov networks. Let $f$ and $g$ be two log-linear models defined over the sets $\mathbf{X}$ and $\mathbf{Y}$ of random variables respectively. Given an assignment $\mathbf{x}$ to all variables in $\mathbf{X}$ (evidence) and a real number $q$, the constrained most-probable explanation (CMPE) task seeks to find an assignment $\mathbf{y}$ to all variables in $\mathbf{Y}$ such that $f(\mathbf{x}, \mathbf{y})$ is maximized and $g(\mathbf{x}, \mathbf{y})\leq q$. In our proposed self-supervised approach, given assignments $\mathbf{x}$ to $\mathbf{X}$ (data), we train a deep neural network that learns to output near-optimal solutions to the CMPE problem without requiring access to any pre-computed solutions. The key idea in our approach is to use first principles and approximate inference methods for CMPE to derive novel loss functions that seek to push infeasible solutions towards feasible ones and feasible solutions towards optimal ones. We analyze the properties of our proposed method and experimentally demonstrate its efficacy on several benchmark problems.

Submitted 17 April, 2024; originally announced April 2024.

Comments: Will appear in AISTATS 2024
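
For reference, the CMPE task described in the abstract above can be written as a single constrained optimization problem. This restates the abstract's own definition using its symbols $f$, $g$, $\mathbf{x}$, $\mathbf{y}$ and $q$; nothing beyond that is assumed.

```latex
\[
  \max_{\mathbf{y}} \; f(\mathbf{x}, \mathbf{y})
  \quad \text{subject to} \quad
  g(\mathbf{x}, \mathbf{y}) \le q
\]
```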

arXiv:2404.09432 [pdf, other]

The 8th AI City Challenge

Authors: Shuo Wang, David C. Anastasiu, Zheng Tang, Ming-Ching Chang, Yue Yao, Liang Zheng, Mohammed Shaiqur Rahman, Meenakshi S. Arya, Anuj Sharma, Pranamesh Chakraborty, Sanjita Prajapati, Quan Kong, Norimasa Kobori, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Fady Alnajjar, Ganzorig Batnasan, Ping-Yang Chen, Jun-Wei Hsieh, Xunlei Wu, Sameer Satish Pusegaonkar, Yizhou Wang, Sujit Biswas, Rama Chellappa

Abstract: The eighth AI City Challenge highlighted the convergence of computer vision and artificial intelligence in areas like retail, warehouse settings, and Intelligent Traffic Systems (ITS), presenting significant research opportunities. The 2024 edition featured five tracks, attracting unprecedented interest from 726 teams in 47 countries and regions. Track 1 dealt with multi-target multi-camera (MTMC) people tracking, highlighting significant enhancements in camera count, character number, 3D annotation, and camera matrices, alongside new rules for 3D tracking and online tracking algorithm encouragement. Track 2 introduced dense video captioning for traffic safety, focusing on pedestrian accidents using multi-camera feeds to improve insights for insurance and prevention. Track 3 required teams to classify driver actions in a naturalistic driving analysis. Track 4 explored fish-eye camera analytics using the FishEye8K dataset. Track 5 focused on motorcycle helmet rule violation detection. The challenge utilized two leaderboards to showcase methods, with participants setting new benchmarks, some surpassing existing state-of-the-art achievements.

Submitted 14 April, 2024; originally announced April 2024.

Comments: Summary of the 8th AI City Challenge Workshop in conjunction with CVPR 2024

arXiv:2403.11075 [pdf, other]

GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment

Authors: Lance Ying, Kunal Jha, Shivam Aarya, Joshua B. Tenenbaum, Antonio Torralba, Tianmin Shu

Abstract: Verbal communication plays a crucial role in human cooperation, particularly when the partners only have incomplete information about the task, environment, and each other's mental state. In this paper, we propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA). GOMA formulates verbal communication as a planning problem that minimizes the misalignment between the parts of agents' mental states that are relevant to the goals. This approach enables an embodied assistant to reason about when and how to proactively initialize communication with humans verbally using natural language to help achieve better cooperation. We evaluate our approach against strong baselines in two challenging environments, Overcooked (a multiplayer game) and VirtualHome (a household simulator). Our experimental results demonstrate that large language models struggle with generating meaningful communication that is grounded in the social and physical context. In contrast, our approach can successfully generate concise verbal communication for the embodied assistant to effectively boost the performance of the cooperation as well as human users' perception of the assistant.

Submitted 14 January, 2025; v1 submitted 16 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures

arXiv:2402.03621 [pdf, other]

Neural Network Approximators for Marginal MAP in Probabilistic Circuits

Authors: Shivvrat Arya, Tahrima Rahman, Vibhav Gogate

Abstract: Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multi-variate probability distributions. They are preferred in practice over other probabilistic representations such as Bayesian and Markov networks because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the maximum-a-posteriori (MAP) and marginal MAP (MMAP) tasks remain NP-hard in these models. Inspired by the recent work on using neural networks for generating near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate (M)MAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function, and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear time approximations, max-product inference, max-marginal inference and sequential estimation, which are used in practice to solve MMAP tasks in PCs.

Submitted 5 February, 2024; originally announced February 2024.

Comments: Will appear in AAAI 2024

arXiv:2401.15216 [pdf, other]

Quantum-Assisted Adaptive Beamforming in UASs Network: Enhancing Airborne Communication via Collaborative UASs for NextG IoT

Authors: Sudhanshu Arya, Ying Wang

Abstract: This paper introduces a novel quantum-based method for dynamic beamforming and re-forming in Unmanned Aircraft Systems (UASs), specifically addressing the critical challenges posed by the unavoidable hovering characteristics of UAVs. Hovering creates significant beam path distortions, impacting the reliability and quality of distributed beamforming in airborne networks. To overcome these challenges, our Quantum Search for UAS Beamforming (QSUB) employs quantum superposition, entanglement, and amplitude amplification. It adaptively reconfigures beams, enhancing beam quality and maintaining robust communication links in the face of rapid UAS state changes due to hovering. Furthermore, we propose an optimized framework, Quantum-Position-Locked Loop (Q-P-LL), that is based on the principle of the Nelder-Mead optimization method for adaptive search to reduce prediction error and improve resilience against angle-of-arrival estimation errors, crucial under dynamic hovering conditions. We also demonstrate the scalability of the system performance and computation complexity by comparing various numbers of active UASs. Importantly, QSUB and Q-P-LL can be applied to both classical and quantum computing architectures. Comparative analyses with conventional Maximum Ratio Transmission (MRT) schemes demonstrate the superior performance and scalability of our quantum approaches, marking significant advancements in the next-generation Internet of Things (IoT) applications requiring reliable airborne communication networks.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.14556 [pdf, other]

CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities

Authors: Rohith Peddi, Shivvrat Arya, Bharath Challa, Likhitha Pallapothula, Akshay Vyas, Bhavya Gouripeddi, Jikai Wang, Qifan Zhang, Vasundhara Komaragiri, Eric Ragan, Nicholas Ruozzi, Yu Xiang, Vibhav Gogate

Abstract: Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and the ability to reason about the structure of the activity. To this end, we collect a new egocentric 4D dataset, CaptainCook4D, comprising 384 recordings (94.5 hours) of people performing recipes in real kitchen environments. This dataset consists of two distinct types of activity: one in which participants adhere to the provided recipe instructions and another in which they deviate and induce errors. We provide 5.3K step annotations and 10K fine-grained action annotations and benchmark the dataset for the following tasks: supervised error recognition, multistep localization, and procedure learning.

Submitted 8 December, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Accepted to the 2024 Neural Information Processing Systems Datasets and Benchmarks Track, Project Page: https://rohithpeddi.github.io/#/captaincook

arXiv:2310.03959 [pdf, other]

Towards Increasing the Robustness of Predictive Steering-Control Autonomous Navigation Systems Against Dash Cam Image Angle Perturbations Due to Pothole Encounters

Authors: Shivam Aarya

Abstract: Vehicle manufacturers are racing to create autonomous navigation and steering control algorithms for their vehicles. This software is made to handle various real-life scenarios such as obstacle avoidance and lane maneuvering. There is some ongoing research to incorporate pothole avoidance into these autonomous systems. However, there is very little research on the effect of hitting a pothole on the autonomous navigation software that uses cameras to make driving decisions. Perturbations in the camera angle when hitting a pothole can cause errors in the predicted steering angle. In this paper, we present a new model to compensate for such angle perturbations and reduce any errors in steering control prediction algorithms. We evaluate our model on perturbations of publicly available datasets and show our model can reduce the errors in the estimated steering angle from perturbed images to 2.3%, making autonomous steering control robust against the dash cam image angle perturbations induced when one wheel of a car goes over a pothole.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 7 pages, 6 figures

ACM Class: I.2.10; I.4; I.5

arXiv:2308.03277 [pdf, other]

From Ambiguity to Explicitness: NLP-Assisted 5G Specification Abstraction for Formal Analysis

Authors: Shiyu Yuan, Jingda Yang, Sudhanshu Arya, Carlo Lipizzi, Ying Wang

Abstract: Formal method-based analysis of the 5G Wireless Communication Protocol is crucial for identifying logical vulnerabilities and facilitating an all-encompassing security assessment, especially in the design phase. Natural Language Processing (NLP) assisted techniques and most of the tools are not widely adopted by the industry and research community. Traditional formal verification through a mathematical approach relies heavily on manual logical abstraction, which is time-consuming and error-prone. NLP-assisted methods may not have been adopted in industrial research because the ambiguity inherent in the natural language of protocol designs conflicts with the explicitness of formal verification. To address the challenge of adopting formal methods in protocol designs, targeting 3GPP protocols that are written in natural language, in this study we propose a hybrid approach to streamline the analysis of protocols. We introduce a two-step pipeline that first uses NLP tools to construct data and then uses the constructed data to extract identifiers and formal properties by using the NLP model. The identifiers and formal properties are further used for formal analysis. We implemented three models that take different dependencies between identifiers and formal properties as criteria. Our results of the optimal model reach valid accuracy of 39% for identifier extraction and 42% for formal properties predictions.
Our work is a proof of concept for an efficient procedure for performing formal analysis of large-scale, complicated specifications and protocols, especially for 5G and nextG communications.

Submitted 6 August, 2023; originally announced August 2023.

arXiv:2307.11247 [pdf, other]

Formal-Guided Fuzz Testing: Targeting Security Assurance from Specification to Implementation for 5G and Beyond

Authors: Jingda Yang, Sudhanshu Arya, Ying Wang

Abstract: Softwarization and virtualization in 5G and beyond necessitate thorough testing to ensure the security of critical infrastructure and networks, requiring the identification of vulnerabilities and unintended emergent behaviors from protocol designs to their software stack implementation. To provide an efficient and comprehensive solution, we propose a novel and first-of-its-kind approach that connects the strengths and coverage of formal and fuzzing methods to efficiently detect vulnerabilities across protocol logic and implementation stacks in a hierarchical manner. We design and implement formal verification to detect attack traces in critical protocols, which are used to guide subsequent fuzz testing and incorporate feedback from fuzz testing to broaden the scope of formal verification. This innovative approach significantly improves efficiency and enables the auto-discovery of vulnerabilities and unintended emergent behaviors from the 3GPP protocols to software stacks. Following this approach, we discover one identifier leakage model, one DoS attack model, and two eavesdrop attack models due to the absence of rudimentary MITM protection within the protocol, despite the existence of a Transport Layer Security (TLS) solution to this issue for over a decade. More remarkably, guided by the identified formal analysis and attack models, we exploit 61 vulnerabilities using fuzz testing demonstrated on srsRAN platforms. These identified vulnerabilities contribute to fortifying protocol-level assumptions and refining the search space.
Compared to state-of-the-art fuzz testing, our united formal and fuzzing methodology enables auto-assurance by systematically discovering vulnerabilities. It significantly reduces computational complexity, transforming the non-practical exponential growth in computational cost into linear growth.

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.09325 [pdf, other]

Distributed 3D-Beam Reforming for Hovering-Tolerant UAVs Communication over Coexistence: A Deep-Q Learning for Intelligent Space-Air-Ground Integrated Networks

Authors: Sudhanshu Arya, Yifeng Peng, Jingda Yang, Ying Wang

Abstract: In this paper, we present a novel distributed UAVs beam reforming approach to dynamically form and reform a space-selective beam path in addressing the coexistence with satellite and terrestrial communications. Despite the unique advantage to support wider coverage in UAV-enabled cellular communications, the challenges reside in the array responses' sensitivity to random rotational motion and the hovering nature of the UAVs. A model-free reinforcement learning (RL) based unified UAV beam selection and tracking approach is presented to effectively realize the dynamic distributed and collaborative beamforming. The combined impact of the UAVs' hovering and rotational motions is considered while addressing the impairment due to the interference from the orbiting satellites and neighboring networks. The main objectives of this work are two-fold: first, to acquire the channel awareness to uncover its impairments; second, to overcome the beam distortion to meet the quality of service (QoS) requirements. To overcome the impact of the interference and to maximize the beamforming gain, we define and apply a new optimal UAV selection algorithm based on the brute force criteria. Results demonstrate that the detrimental effects of the channel fading and the interference from the orbiting satellites and neighboring networks can be overcome using the proposed approach. Subsequently, an RL algorithm based on Deep Q-Network (DQN) is developed for real-time beam tracking.
By augmenting the system with the impairments due to hovering and rotational motion, we show that the proposed DQN algorithm can reform the beam in real-time with negligible error. It is demonstrated that the proposed DQN algorithm attains an exceptional performance improvement. We show that it requires a few iterations only for fine-tuning its parameters without observing any plateaus irrespective of the hovering tolerance.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.04715 [pdf, other]

CVPR MultiEarth 2023 Deforestation Estimation Challenge: SpaceVision4Amazon

Authors: Sunita Arya, S Manthira Moorthi, Debajyoti Dhar

Abstract: In this paper, we present a deforestation estimation method based on an attention-guided UNet architecture using Electro-Optical (EO) and Synthetic Aperture Radar (SAR) satellite imagery. For optical images, Landsat-8 data, and for SAR imagery, Sentinel-1 data have been used to train and validate the proposed model. Due to the unavailability of temporally and spatially collocated data, an individual model has been trained for each sensor. During training, the Landsat-8 model achieved training and validation pixel accuracy of 93.45% and the Sentinel-2 model achieved 83.87% pixel accuracy. During the test set evaluation, the model achieved pixel accuracy of 84.70% with F1-Score of 0.79 and IoU of 0.69.

Submitted 10 July, 2023; originally announced July 2023.
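
The three evaluation metrics quoted in this abstract (pixel accuracy, F1-score, IoU) have standard definitions for a binary mask. The sketch below shows how they are commonly computed from pixel counts; the function name and the counts are hypothetical and are not taken from the paper, which may average over classes or images differently.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Return (pixel accuracy, F1, IoU) for the positive class from pixel counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total          # fraction of correctly labeled pixels
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    iou = tp / (tp + fp + fn)             # intersection over union
    return accuracy, f1, iou


# Hypothetical counts for illustration only.
acc, f1, iou = segmentation_metrics(tp=6, fp=2, fn=2, tn=10)
print(acc, f1, iou)
```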

arXiv:2306.17329 [pdf, ps, other]

Kernel $ε$-Greedy for Multi-Armed Bandits with Covariates

Authors: Sakshi Arya, Bharath K. Sriperumbudur

Abstract: We consider the $ε$-greedy strategy for the multi-arm bandit with covariates (MABC) problem, where the mean reward functions are assumed to lie in a reproducing kernel Hilbert space (RKHS). We propose to estimate the unknown mean reward functions using an online weighted kernel ridge regression estimator, and show the resultant estimator to be consistent under appropriate decay rates of the exploration probability sequence, $\{ε_t\}_t$, and regularization parameter, $\{λ_t\}_t$. Moreover, we show that for any choice of kernel and the corresponding RKHS, we achieve a sub-linear regret rate depending on the intrinsic dimensionality of the RKHS. Furthermore, we achieve the optimal regret rate of $\sqrt{T}$ under a margin condition for finite-dimensional RKHS.

Submitted 1 June, 2025; v1 submitted 29 June, 2023; originally announced June 2023.

MSC Class: 62L10; 62G05; 68T05
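
As a rough illustration of the strategy this abstract describes, the sketch below plays a two-armed covariate bandit with a decaying exploration probability and a kernel ridge regression estimate per arm. Everything specific here is an assumption made for illustration: the Gaussian kernel, the schedules `eps_t = t^(-1/3)` and `lam_t = t^(-1/2)`, the two reward functions, and the use of plain (unweighted) kernel ridge regression in place of the paper's weighted estimator.

```python
import math
import random


def rbf(x, y, bandwidth=0.5):
    """Gaussian kernel on scalar covariates (kernel choice is an assumption)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def krr_predict(xs, ys, x_new, lam):
    """Unweighted kernel ridge regression prediction at x_new (0.0 with no data)."""
    if not xs:
        return 0.0
    n = len(xs)
    gram = [[rbf(xs[i], xs[j]) + (lam if i == j else 0.0) for j in range(n)]
            for i in range(n)]
    alpha = solve(gram, ys)
    return sum(a * rbf(xi, x_new) for a, xi in zip(alpha, xs))


def run_bandit(horizon=40, seed=0):
    """Play a two-armed covariate bandit with decaying exploration."""
    rng = random.Random(seed)
    mean_rewards = [lambda x: x, lambda x: 1.0 - x]  # hypothetical arms
    history = [([], []), ([], [])]  # per-arm (covariates, rewards)
    for t in range(1, horizon + 1):
        x = rng.random()                 # observe a covariate in [0, 1)
        eps_t = t ** (-1.0 / 3.0)        # assumed exploration schedule
        lam_t = t ** (-0.5)              # assumed regularization schedule
        if rng.random() < eps_t:
            arm = rng.randrange(2)       # explore uniformly at random
        else:
            preds = [krr_predict(xs, ys, x, lam_t) for xs, ys in history]
            arm = max(range(2), key=lambda k: preds[k])  # exploit the estimates
        reward = mean_rewards[arm](x) + rng.gauss(0.0, 0.1)
        history[arm][0].append(x)
        history[arm][1].append(reward)
    return history
```

The linear solve is written out by hand only to keep the sketch free of dependencies; a numerical library would normally be used, and refitting from scratch at every round is far from an efficient online update.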

arXiv:2306.15648 [pdf, other]

Optimal Area-Sensitive Bounds for Polytope Approximation

Authors: Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Given a convex body $K$ of diameter $Δ$ in $\mathbb{R}^d$ for fixed $d$, the objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. The best known uniform bound, due to Dudley (1974), shows that $O((Δ/\varepsilon)^{(d-1)/2})$ facets suffice. While this bound is optimal in the case of a Euclidean ball, it is far from optimal for ``skinny'' convex bodies. A natural way to characterize a convex object's skinniness is in terms of its relationship to the Euclidean ball. Given a convex body $K$, define its surface diameter $Δ_{d-1}$ to be the diameter of a Euclidean ball of the same surface area as $K$. It follows from generalizations of the isoperimetric inequality that $Δ\geq Δ_{d-1}$. We show that, under the assumption that the width of the body in any direction is at least $\varepsilon$, it is possible to approximate a convex body using $O((Δ_{d-1}/\varepsilon)^{(d-1)/2})$ facets. This bound is never worse than the previous bound and may be significantly better for skinny bodies. The bound is tight, in the sense that for any value of $Δ_{d-1}$, there exist convex bodies that, up to constant factors, require this many facets. The improvement arises from a novel approach to sampling points on the boundary of a convex body. We employ a classical concept from convexity, called Macbeath regions.
We demonstrate that Macbeath regions in $K$ and $K$'s polar behave much like polar pairs. We then apply known results on the Mahler volume to bound their number.

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2306.15621 [pdf, other]

Approximate Nearest Neighbor Searching with Non-Euclidean and Weighted Distances

Authors: Ahmed Abdelkader, Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set $S$ of $n$ points in $\mathbb{R}^d$, an approximation parameter $\varepsilon > 0$, and a distance function that satisfies certain smoothness and growth-rate assumptions. The objective is to preprocess $S$ into a data structure so that for any query point $q$ in $\mathbb{R}^d$, it is possible to efficiently report any point of $S$ whose distance from $q$ is within a factor of $1+\varepsilon$ of the actual closest point. Prior to this work, the most efficient data structures for approximate nearest-neighbor searching in spaces of constant dimensionality applied only to the Euclidean metric. This paper overcomes this limitation through a method called convexification. For admissible distance functions, the proposed data structures answer queries in logarithmic time using $O(n \log (1 / \varepsilon) / \varepsilon^{d/2})$ space, nearly matching the best known bounds for the Euclidean metric. These results apply to both convex scaling distance functions (including the Mahalanobis distance and weighted Minkowski metrics) and Bregman divergences (including the Kullback-Leibler divergence and the Itakura-Saito distance).

Submitted 27 June, 2023; originally announced June 2023.
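
The query guarantee stated in this abstract can be made concrete with a brute-force check. The sketch below is not the paper's data structure: it scans all points to find the exact nearest neighbor and then tests whether a candidate answer is within a factor of $1+\varepsilon$ of it. The function names, the particular form of the weighted Minkowski distance, and the sample points are assumptions for illustration.

```python
def weighted_minkowski(p, q, weights, order=2.0):
    """Weighted Minkowski distance: (sum_i w_i * |p_i - q_i|^order)^(1/order)."""
    total = sum(w * abs(a - b) ** order for w, a, b in zip(weights, p, q))
    return total ** (1.0 / order)


def exact_nearest(points, query, dist):
    """Linear scan for the exact nearest neighbor of query."""
    return min(points, key=lambda p: dist(p, query))


def is_valid_ann_answer(points, query, candidate, eps, dist):
    """True if candidate is a (1 + eps)-approximate nearest neighbor of query."""
    best = dist(exact_nearest(points, query, dist), query)
    return dist(candidate, query) <= (1.0 + eps) * best


# Hypothetical data: three points in the plane, second axis weighted more heavily.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
query = (0.9, 0.0)
dist = lambda p, q: weighted_minkowski(p, q, weights=(1.0, 4.0))

print(exact_nearest(points, query, dist))
print(is_valid_ann_answer(points, query, (0.0, 0.0), eps=0.5, dist=dist))
```

Any point passing `is_valid_ann_answer` would be an acceptable output of an approximate query; the data structures in the paper aim to find such a point in logarithmic time rather than by a linear scan.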

arXiv:2306.10586 [pdf, other]

The Gromov-Wasserstein distance between spheres

Authors: Shreya Arya, Arnab Auddy, Ranthony Edmonds, Sunhyuk Lim, Facundo Memoli, Daniel Packer

Abstract: In this paper we consider a two-parameter family $\{d_{\mathrm{GW}p,q}\}_{p,q}$ of Gromov-Wasserstein distances between metric measure spaces. By exploiting a suitable interaction between specific values of the parameters $p$ and $q$ and the metric of the underlying spaces, we determine the exact value of the distance $d_{\mathrm{GW}4,2}$ between all pairs of unit spheres of different dimension endowed with their Euclidean distance and their uniform measure.

Submitted 12 July, 2024; v1 submitted 18 June, 2023; originally announced June 2023.

Comments: 1. Added a Section 4, Section 5, and Appendix C; 2. Swapped Sections 2 and 3; 3. "Relagated Proofs" section (Section 4 in the old version) is now in appendix

arXiv:2305.02451 [pdf, other]

Ground-to-UAV Integrated Network: Low Latency Communication over Interference Channel

Authors: Sudhanshu Arya, Ying Wang

Abstract: We present a novel and first-of-its-kind information-theoretic framework for the key design consideration and implementation of a ground-to-UAV (G2U) communication network to minimize end-to-end transmission delay in the presence of interference. The proposed framework is useful as it describes the minimum transmission latency that an uplink ground-to-UAV communication must satisfy while achieving a given level of reliability. To characterize the transmission delay, we utilize Fano's inequality and derive the tight upper bound for the capacity for the G2U uplink channel in the presence of interference, noise, and potential jamming. Subsequently, given the reliability constraint, the error exponent is obtained for the given channel. Furthermore, a relay UAV in the dual-hop relay mode, with amplify-and-forward (AF) protocol, is considered, for which we jointly obtain the optimal positions of the relay and the receiver UAVs in the presence of interference. Interestingly, in our study, we find that for both the point-to-point and relayed links, increasing the transmit power may not always be an optimal solution for delay minimization problems. Moreover, we prove that there exists an optimal height that minimizes the end-to-end transmission delay in the presence of interference. The proposed framework can be used in practice by a network controller as a system parameter selection criterion, where among a set of parameters, the parameters leading to the lowest transmission latency can be incorporated into the transmission.
The analysis further sets the baseline assessment when applying Command and Control (C2) standards to mission-critical G2U and UAV-to-UAV (U2U) services.

Submitted 3 May, 2023; originally announced May 2023.

Comments: 12 pages, 11 Figures

arXiv:2304.07500 [pdf, other]

The 7th AI City Challenge

Authors: Milind Naphade, Shuo Wang, David C. Anastasiu, Zheng Tang, Ming-Ching Chang, Yue Yao, Liang Zheng, Mohammed Shaiqur Rahman, Meenakshi S. Arya, Anuj Sharma, Qi Feng, Vitaly Ablavsky, Stan Sclaroff, Pranamesh Chakraborty, Sanjita Prajapati, Alice Li, Shangru Li, Krishna Kunadharaju, Shenxin Jiang, Rama Chellappa

Abstract: The AI City Challenge's seventh edition emphasizes two domains at the intersection of computer vision and artificial intelligence - retail business and Intelligent Traffic Systems (ITS) - that have considerable untapped potential. The 2023 challenge had five tracks, which drew a record-breaking number of participation requests from 508 teams across 46 countries. Track 1 was a brand new track that focused on multi-target multi-camera (MTMC) people tracking, where teams trained and evaluated using both real and highly realistic synthetic data. Track 2 centered around natural-language-based vehicle track retrieval. Track 3 required teams to classify driver actions in naturalistic driving analysis. Track 4 aimed to develop an automated checkout system for retail stores using a single view camera. Track 5, another new addition, tasked teams with detecting violations of the helmet rule for motorcyclists. Two leader boards were released for submissions based on different methods: a public leader board for the contest where external private data wasn't allowed and a general leader board for all results submitted. The participating teams' top performances established strong baselines and even outperformed the state-of-the-art in the proposed challenge tracks.

Submitted 15 April, 2023; originally announced April 2023.

Comments: Summary of the 7th AI City Challenge Workshop in conjunction with CVPR 2023

arXiv:2303.09586 [pdf, other]

Optimal Volume-Sensitive Bounds for Polytope Approximation

Authors: Sunil Arya, David M. Mount

Abstract: Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Consider a convex body $K$ of diameter $Δ$ in $\textbf{R}^d$ for fixed $d$. The objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. It is known from classical results of Dudley (1974) and Bronshteyn and Ivanov (1976) that $Θ((Δ/\varepsilon)^{(d-1)/2})$ vertices (alternatively, facets) are both necessary and sufficient. While this bound is tight in the worst case, that of Euclidean balls, it is far from optimal for skinny convex bodies. A natural way to characterize a convex object's skinniness is in terms of its relationship to the Euclidean ball. Given a convex body $K$, define its \emph{volume diameter} $Δ_d$ to be the diameter of a Euclidean ball of the same volume as $K$, and define its \emph{surface diameter} $Δ_{d-1}$ analogously for surface area. It follows from generalizations of the isoperimetric inequality that $Δ\geq Δ_{d-1} \geq Δ_d$. Arya, da Fonseca, and Mount (SoCG 2012) demonstrated that the diameter-based bound could be made surface-area sensitive, improving the above bound to $O((Δ_{d-1}/\varepsilon)^{(d-1)/2})$. In this paper, we strengthen this by proving the existence of an approximation with $O((Δ_d/\varepsilon)^{(d-1)/2})$ facets.

Submitted 16 March, 2023; originally announced March 2023.

Comments: To appear in the 39th International Symposium on Computational Geometry (SoCG 2023)

arXiv:2303.08349 [pdf, other]

Economical Convex Coverings and Applications

Authors: Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: Coverings of convex bodies have emerged as a central component in the design of efficient solutions to approximation problems involving convex bodies. Intuitively, given a convex body $K$ and $ε> 0$, a covering is a collection of convex bodies whose union covers $K$ such that a constant factor expansion of each body lies within an $ε$ expansion of $K$. Coverings have been employed in many applications, such as approximations for diameter, width, and $ε$-kernels of point sets, approximate nearest neighbor searching, polytope approximations, and approximations to the Closest Vector Problem (CVP). It is known how to construct coverings of size $n^{O(n)} / ε^{(n-1)/2}$ for general convex bodies in $\textbf{R}^n$. In special cases, such as when the convex body is the $\ell_p$ unit ball, this bound has been improved to $2^{O(n)} / ε^{(n-1)/2}$. This raises the question of whether such a bound generally holds. In this paper we answer the question in the affirmative. We demonstrate the power and versatility of our coverings by applying them to the problem of approximating a convex body by a polytope, under the Banach-Mazur metric. Given a well-centered convex body $K$ and an approximation parameter $ε> 0$, we show that there exists a polytope $P$ consisting of $2^{O(n)} / ε^{(n-1)/2}$ vertices (facets) such that $K \subset P \subset K(1+ε)$. This bound is optimal in the worst case up to factors of $2^{O(n)}$.
As an additional consequence, we obtain the fastest $(1+ε)$-approximate CVP algorithm that works in any norm, with a running time of $2^{O(n)} / ε^{(n-1)/2}$ up to polynomial factors in the input size, and we obtain the fastest $(1+ε)$-approximation algorithm for integer programming. We also present a framework for constructing coverings of optimal size for any convex body (up to factors of $2^{O(n)}$).

Submitted 15 March, 2023; originally announced March 2023.

Comments: Preliminary version appeared in Proc. 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1834-1861, 2023 (https://doi.org/10.1137/1.9781611977554.ch70)

arXiv:2302.11817 [pdf, ps, other]

doi 10.1063/5.0147638

Effect of channel dimensions and Reynolds numbers on the turbulence modulation for particle-laden turbulent channel flows

Authors: Naveen Rohilla, Siddhi Arya, Partha Sarathi Goswami

Abstract: The addition of particles to turbulent flows changes the underlying mechanism of turbulence and leads to turbulence modulation. Different temporal and spatial scales for both phases make it challenging to understand turbulence modulation via one parameter. The important parameters are particle Stokes number, mass loading, particle Reynolds number, fluid bulk Reynolds number, etc., that act together and affect the fluid phase turbulence intensities. In the present study, we have carried out the large eddy simulations for different system sizes (2δ/dp = 54, 81, and 117) and fluid bulk Reynolds numbers (Re_b = 5600 and 13750) to quantify the extent of turbulence attenuation. Here, δ is the half-channel width, dp is the particle diameter, and Re_b is the fluid Reynolds number based on the fluid bulk velocity and channel width. The point particles are tracked with the Lagrangian approach. The scaling analysis of the feedback force shows that system size and fluid bulk Reynolds number are the two crucial parameters that affect the turbulence modulation more significantly than the other. The streamwise turbulent structures are observed to become lengthier and fewer with an increase in system size for the same volume fraction and fixed bulk Reynolds number. However, the streamwise high-speed streaks are smaller, thinner, and closely spaced for higher Reynolds numbers than the lower ones for the same volume fraction.
In particle statistics, it is observed that the scaled particle fluctuations increase with the increase in system size while keeping the Reynolds number fixed. However, the scaled particle fluctuations decrease with the increase in fluid bulk Reynolds number for the same volume fraction and fixed system size. The present study highlights the scaling issue for designing industrial equipment for particle-laden turbulent flows.

Submitted 23 February, 2023; originally announced February 2023.

arXiv:2302.00633 [pdf, other]

Deep Dependency Networks for Multi-Label Classification

Authors: Shivvrat Arya, Yu Xiang, Vibhav Gogate

Abstract: We propose a simple approach which combines the strengths of probabilistic graphical models and deep learning architectures for solving the multi-label classification task, focusing specifically on image and video data. First, we show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved by leveraging more powerful methods such as iterative join graph propagation, integer linear programming, and $\ell_1$ regularization-based structure learning. Then we propose a new modeling framework called deep dependency networks, which augments a dependency network, a model that is easy to train and learns more accurate dependencies but is limited to Gibbs sampling for inference, to the output layer of a neural network. We show that despite its simplicity, jointly learning this new architecture yields significant improvements in performance over the baseline neural network. In particular, our experimental evaluation on three video activity classification datasets: Charades, Textually Annotated Cooking Scenes (TACoS), and Wetlab, and three multi-label image classification datasets: MS-COCO, PASCAL VOC, and NUS-WIDE show that deep dependency networks are almost always superior to pure neural architectures that do not use dependency networks.

Submitted 6 February, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

arXiv:2208.07899 [pdf, other]

Predictions of damages from Atlantic tropical cyclones: a hierarchical Bayesian study on extremes

Authors: Lindsey Dietz, Sakshi Arya, Snigdhansu Chatterjee

Abstract: Bayesian hierarchical models are proposed for modeling tropical cyclone characteristics and their damage potential in the Atlantic basin. We model the joint probability distribution of tropical cyclone characteristics and their damage potential at two different temporal scales, while taking several climate indices into account. First, a predictive model for an entire season is developed that forecasts the number of cyclone events that will take place, the probability of each cyclone causing some amount of damage, and the monetized value of damages. Then, specific characteristics of individual cyclones are considered to predict the monetized value of the damage each will cause. Robustness studies are conducted and excellent prediction power is demonstrated across different data science models and evaluation techniques.

Submitted 16 August, 2022; originally announced August 2022.

arXiv:2204.09020 [pdf, other]

A Sheaf-Theoretic Construction of Shape Space

Authors: Shreya Arya, Justin Curry, Sayan Mukherjee

Abstract: We present a sheaf-theoretic construction of shape space -- the space of all shapes. We do this by describing a homotopy sheaf on the poset category of constructible sets, where each set is mapped to its Persistent Homology Transform (PHT). Recent results that build on fundamental work of Schapira have shown that this transform is injective, thus making the PHT a good summary object for each shape. Our homotopy sheaf result allows us to "glue" PHTs of different shapes together to build up the PHT of a larger shape. In the case where our shape is a polyhedron we prove a generalized nerve lemma for the PHT. Finally, by re-examining the sampling result of Smale-Niyogi-Weinberger, we show that we can reliably approximate the PHT of a manifold by a polyhedron up to arbitrary precision.

Submitted 23 June, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: Version 3 has 45 pages and 11 figures. Substantially revised to include more background material and explicit computations of PHT distances between point clouds. A new appendix presents our gluing result for the PHT from the infinity category point of view

arXiv:2110.05897 [pdf, ps, other]

Dimensionality Reduction for $k$-Distance Applied to Persistent Homology

Authors: Shreya Arya, Jean-Daniel Boissonnat, Kunal Dutta, Martin Lotz

Abstract: Given a set P of n points and a constant k, we are interested in computing the persistent homology of the Cech filtration of P for the k-distance, and investigate the effectiveness of dimensionality reduction for this problem, answering an open question of Sheehy [Proc. SoCG, 2014]. We show that any linear transformation that preserves pairwise distances up to a (1 +/- e) multiplicative factor must preserve the persistent homology of the Cech filtration up to a factor of (1-e)^(-1). Our results also show that the Vietoris-Rips and Delaunay filtrations for the k-distance, as well as the Cech filtration for the approximate k-distance of Buchet et al. [J. Comput. Geom., 2016] are preserved up to a (1 +/- e) factor. We also prove extensions of our main theorem, for point sets (i) lying in a region of bounded Gaussian width or (ii) on a low-dimensional submanifold, obtaining embeddings having the dimension bounds of Lotz [Proc. Roy. Soc., 2019] and Clarkson [Proc. SoCG, 2008] respectively. Our results also work in the terminal dimensionality reduction setting, where the distance of any point in the original ambient space, to any point in P, needs to be approximately preserved.

Submitted 12 October, 2021; originally announced October 2021.

Comments: 18 pages

MSC Class: 55-08; 68W27; 68W20 ACM Class: I.3.5

arXiv:2108.05191 [pdf]

Microcontroller Based Load Monitoring System

Authors: A N Madhavanunni, S S Arya, Renjith Kumar D

Abstract: The demand for power has increased exponentially over the last century. One avenue through which today's energy problems can be addressed is through the reduction of energy usage in households. This has increased the emphasis on the need for accurate and economic methods of power measurement. The goal of providing such data is to help households optimize and reduce their power consumption. In view of this, the present manuscript focuses on the design and implementation of a precise and reliable load monitoring system using a PIC microcontroller chip (PIC16F877A). This involves accurate sensing of the voltage, current and power factor of the load. A clever utilization of the in-built ADC and timers of the microcontroller reduces the design complexity of the system. The proposed system monitors the load continuously on a real-time basis and displays parameters such as voltage, current, power factor, and active, reactive and apparent powers on an LCD module. The use of a microcontroller reduces the cost and makes the device compact. The proposed system has been implemented and tested in the laboratory for single-phase loads.

Submitted 11 August, 2021; originally announced August 2021.

Comments: Paper presented in International Conference on Control, Calibration and Testing (ICCCT '15), Feb 13-14, 2015, PSG College of Technology, Coimbatore, India

Journal ref: Proceedings of International Conference on Control, Calibration and Testing (ICCCT '15), Feb 13-14, 2015, Coimbatore, India, 6-10

arXiv:2105.06508

Internet of Things (IoT) Based Video Analytics: a use case of Smart Doorbell

Authors: Shailesh Arya

Abstract: The vision of the internet of things (IoT) is a reality now. IoT devices are getting cheaper and smaller. They are becoming more and more computationally and energy-efficient. The global market of IoT-based video analytics has seen significant growth in recent years and it is expected to be a growing market segment. For any IoT-based video analytics application, a few key points are required, such as cost-effectiveness, widespread use, flexible design, accurate scene detection, and reusability of the framework. A video-based smart doorbell system is one such application domain for video analytics where many commercial offerings are available in the consumer market. However, such existing offerings are costly, monolithic, and proprietary. Also, there will be a trade-off between accuracy and portability. To address the foreseen problems, I propose a distributed framework for video analytics with a use case of a smart doorbell system. The proposed framework uses AWS cloud services as a base platform and, to meet the price affordability constraint, the system was implemented on an affordable Raspberry Pi. The smart doorbell will be able to recognize known and unknown persons with utmost accuracy. The smart doorbell system also has additional detection functionalities such as harmful weapon detection, noteworthy vehicle detection, and animal/pet detection. An iOS application was developed specifically for this implementation, which can receive notifications from the smart doorbell in real time. Finally, the paper also covers the classical approaches for video analytics and their feasibility for this use case, and carries out a comparative analysis in terms of accuracy and the time required to detect an object in the frame. The results indicate that the AWS cloud-based approach is worthwhile for this smart doorbell use case.

Submitted 13 September, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

Comments: Need to derive more results on different IoT devices!

arXiv:2103.13187 [pdf, ps, other]

doi 10.13140/RG.2.2.18060.54408/1

The Influence of Social Networks on Human Society

Authors: Shreyash Arya

Abstract: This report gives a brief overview of the origin of social networks and their most popular manifestation in the modern era - the Online Social Networks (OSNs) or social media. It further discusses the positive and negative implications of OSNs on human society. The coupling of Data Science and social media (social media mining) is then put forward as a powerful tool to overcome the current challenges and pave the path for futuristic advancements.

Submitted 24 March, 2021; originally announced March 2021.

arXiv:2011.06887 [pdf, ps, other]

Adaptive estimation of a function from its Exponential Radon Transform in presence of noise

Authors: Anuj Abhishek, Sakshi Arya

Abstract: In this article we propose a locally adaptive strategy for estimating a function from its Exponential Radon Transform (ERT) data, without prior knowledge of the smoothness of functions that are to be estimated. We build a non-parametric kernel type estimator and show that for a class of functions comprising a wide Sobolev regularity scale, our proposed strategy follows the minimax optimal rate up to a $\log{n}$ factor. We also show that there does not exist an optimal adaptive estimator on the Sobolev scale when the pointwise risk is used and in fact the rate achieved by the proposed estimator is the adaptive rate of convergence.

Submitted 13 November, 2020; originally announced November 2020.

arXiv:2009.09065 [pdf, other]

A Distributed Framework to Orchestrate Video Analytics Applications

Authors: Tapan Pathak, Vatsal Patel, Sarth Kanani, Shailesh Arya, Pankesh Patel, Muhammad Intizar Ali, John Breslin

Abstract: The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyone's attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in scientific literature and many video-based doorbell solutions are commercially available in the market. However, contemporary offerings have two drawbacks. First, they are bespoke, offering limited composability and reusability of a smart doorbell framework. Second, they are monolithic and proprietary, which means that the implementation details remain hidden from the users. We believe that a transparent design can greatly aid in the development of a smart doorbell, enabling its use in multiple application domains. To address the above-mentioned challenges, we propose a distributed framework to orchestrate video analytics across Edge and Cloud resources. We investigate trade-offs in the distribution of different software components over a bespoke/full system, where components over Edge and Cloud are treated generically. This paper evaluates the proposed framework as well as the state-of-the-art models and presents a comparative analysis of them on various metrics (such as overall model accuracy, latency, memory, and CPU usage). The evaluation results support our intuition, showing that the AWS-based approach exhibits reasonably high object-detection accuracy and low memory and CPU usage when compared to the state-of-the-art approaches, but high latency.

Submitted 17 September, 2020; originally announced September 2020.

Comments: 9

arXiv:2009.03168 [pdf, other]

Effect of lockdown interventions to control the COVID-19 epidemic in India

Authors: Ankit Sharma, Shreyash Arya, Shashee Kumari, Arnab Chatterjee

Abstract: The pandemic caused by the novel Coronavirus SARS-CoV2 has been responsible for life-threatening health complications and extreme pressure on healthcare systems. While preventive and definite curative medical interventions are yet to arrive, Non-Pharmaceutical Interventions (NPIs) like physical isolation, quarantine and drastic social measures imposed by governing agencies are effective in arresting the spread of infections in a population. In densely populated countries like India, lockdown interventions are partially effective due to social and administrative complexities. Using detailed demographic data, we present an agent-based model to imitate the behavior of the population and its mobility features, even under intervention. We demonstrate the effectiveness of contact tracing policies and how our model efficiently relates to empirical findings on testing efficiency. We also present various lockdown intervention strategies for mitigation - using the bare number of infections, the effective reproduction rate, as well as using reinforcement learning. Our analysis can help assess the socio-economic consequences of such interventions, and provide useful ideas and insights to policy makers for better decision making.

Submitted 7 September, 2020; originally announced September 2020.

Comments: 21 pages, 9 figures

arXiv:2006.08220 [pdf]

Implementation of Google Assistant & Amazon Alexa on Raspberry Pi

Authors: Shailesh D. Arya, Samir Patel

Abstract: This paper investigates the implementation of voice-enabled Google Assistant and Amazon Alexa on Raspberry Pi. Virtual assistants are becoming a new trend in how we interact with, or do computations on, physical devices. A voice-enabled system essentially means a system that processes voice as an input, decodes or understands the meaning of that input, and generates an appropriate voice output. In this paper, we develop a smart speaker prototype that has the functionalities of both assistants on the same Raspberry Pi. Users can invoke a virtual assistant by saying the hot words and can leverage the best services of both ecosystems. This paper also explains the complex architecture of Google Assistant and Amazon Alexa and the working of both assistants. Later, this system can be used to control smart home IoT devices.

Submitted 15 June, 2020; originally announced June 2020.

Comments: 5 Pages, 5 Figures

arXiv:2005.13078 [pdf, other]

To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation

Authors: Sakshi Arya, Yuhong Yang

Abstract: The delayed rewards problem in contextual bandits has been of interest in various practical settings. We study randomized allocation strategies and provide an understanding of how the exploration-exploitation tradeoff is affected by delays in observing the rewards. In randomized strategies, the extent of exploration-exploitation is controlled by a user-determined exploration probability sequence. In the presence of delayed rewards, one may choose between using the original exploration sequence that updates at every time point or updating the sequence only when a new reward is observed, leading to two competing strategies. In this work, we show that while both strategies may lead to strong consistency in allocation, the property holds for a wider scope of situations for the latter. However, for finite sample performance, we illustrate that both strategies have their own advantages and disadvantages, depending on the severity of the delay and underlying reward generating mechanisms.

Submitted 26 May, 2020; originally announced May 2020.

arXiv:2004.14289 [pdf]

Smart Attendance System Usign CNN

Authors: Shailesh Arya, Hrithik Mesariya, Vishal Parekh

Abstract: Research on attendance systems has been going on for a very long time. Numerous arrangements have been proposed in the last decade to make such systems efficient and less time consuming, but all of them have several flaws. In this paper, we introduce a smart and efficient system for attendance using face detection and face recognition. This system can be used to take attendance in colleges or offices using real-time face recognition with the help of a Convolutional Neural Network (CNN). Conventional methods like Eigenfaces and Fisherfaces are sensitive to lighting, noise, posture, obstruction, illumination, etc. Hence, we have used a CNN to recognize the face and overcome such difficulties. The attendance records are updated automatically and stored in an Excel sheet as well as in a database. We have used MongoDB as a backend database for attendance records.

Submitted 22 April, 2020; originally announced April 2020.

Comments: 4 Pages, 9 Figures

arXiv:1910.14459 [pdf, other]

doi 10.1145/3559106

Optimal Bound on the Combinatorial Complexity of Approximating Polytopes

Authors: Rahul Arya, Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: This paper considers the question of how to succinctly approximate a multidimensional convex body by a polytope. Given a convex body $K$ of unit diameter in Euclidean $d$-dimensional space (where $d$ is a constant) and an error parameter $\varepsilon > 0$, the objective is to determine a convex polytope of low combinatorial complexity whose Hausdorff distance from $K$ is at most $\varepsilon$. By combinatorial complexity we mean the total number of faces of all dimensions. Classical constructions by Dudley and Bronshteyn/Ivanov show that $O(1/\varepsilon^{(d-1)/2})$ facets or vertices are possible, respectively, but neither achieves both bounds simultaneously. In this paper, we show that it is possible to construct a polytope with $O(1/\varepsilon^{(d-1)/2})$ combinatorial complexity, which is optimal in the worst case. Our result is based on a new relationship between $\varepsilon$-width caps of a convex body and its polar body. Using this relationship, we are able to obtain a volume-sensitive bound on the number of approximating caps that are "essentially different." We achieve our main result by combining this with a variant of the witness-collector method and a novel variable-thickness layered construction of the economical cap covering.

Submitted 24 August, 2022; v1 submitted 30 October, 2019; originally announced October 2019.

Comments: To appear on the SODA 2020 special issue of ACM Transactions on Algorithms. arXiv admin note: text overlap with arXiv:1604.01175

Report number: 10.1137/1.9781611975994.48

Journal ref: SODA 2020, 786-805, 2020

arXiv:1906.02475 [pdf]

Discovery of Ionic Impact Ionization (I3) in Perovskites Triggered by a Single Photon

Authors: Zihan Xu, Yugang Yu, Iftikhar Ahmad Niaz, Yimu Chen, Shaurya Arya, Yusheng Lei, Mohammad Abu Raihan Miah, Jiayun Zhou, Alex Ce Zhang, Lujiang Yan, Sheng Xu, Kenji Nomura, Yu-Hwa Lo

Abstract: Organic-inorganic metal halide perovskite devices have generated significant interest for LED, photodetector, and solar cell applications due to their attractive optoelectronic properties and substrate-choice flexibility [1-4]. These devices exhibit slow time-scale response, which has been explained by point defect migration [5-6]. In this work, we report the discovery of a room temperature intrinsic amplification process in methylammonium lead iodide perovskite (MAPbI3) that can be triggered by few photons, down to a single photon. The electrical properties of the material, by way of photoresponse, are modified by an input energy as small as 0.19 attojoules, the energy of a single photon. These observations cannot be explained by photo-excited electronic band-to-band transitions or the prevailing model of photo-excited point defect migration, since neither can explain the observed macroscopic property change by absorption of single or few photons. The results suggest the existence of an avalanche-like collective motion of iodides and their accumulation near the anode, which we will call ionic impact ionization (I3 mechanism). The proposed I3 process is the ionic analog of electronic impact ionization, and has been considered impossible before because conventionally it takes far more energy to move ions out of their equilibrium position than electrons. We have performed first principle calculations to show that in MAPbI3 the activation energy for the I3 mechanism is appreciably lower than the literature value of the activation energy for electronic impact ionization. The discovery of the I3 process in perovskite material opens up possibilities for new classes of devices for photonic and electronic applications.

Submitted 12 June, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

arXiv:1902.00819 [pdf, other]

Randomized Allocation with Nonparametric Estimation for Contextual Multi-Armed Bandits with Delayed Rewards

Authors: Sakshi Arya, Yuhong Yang

Abstract: We study a multi-armed bandit problem with covariates in a setting where there is a possible delay in observing the rewards. Under some mild assumptions on the probability distributions for the delays and using an appropriate randomization to select the arms, the proposed strategy is shown to be strongly consistent.

Submitted 4 September, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

Comments: Added simulations and some minor typographical changes

arXiv:1807.00484 [pdf, other]

doi 10.4230/LIPIcs.ESA.2018.3

Approximate Convex Intersection Detection with Applications to Width and Minkowski Sums

Authors: Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: Approximation problems involving a single convex body in $d$-dimensional space have received a great deal of attention in the computational geometry community. In contrast, works involving multiple convex bodies are generally limited to dimensions $d \leq 3$ and/or do not consider approximation. In this paper, we consider approximations to two natural problems involving multiple convex bodies: detecting whether two polytopes intersect and computing their Minkowski sum. Given an approximation parameter $\varepsilon > 0$, we show how to independently preprocess two polytopes $A,B$ into data structures of size $O(1/\varepsilon^{(d-1)/2})$ such that we can answer in polylogarithmic time whether $A$ and $B$ intersect approximately. More generally, we can answer this for the images of $A$ and $B$ under affine transformations. Next, we show how to $\varepsilon$-approximate the Minkowski sum of two given polytopes defined as the intersection of $n$ halfspaces in $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2 + \alpha})$ time, for any constant $\alpha > 0$. Finally, we present a surprising impact of these results to a well studied problem that considers a single convex body. We show how to $\varepsilon$-approximate the width of a set of $n$ points in $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2 + \alpha})$ time, for any constant $\alpha > 0$, a major improvement over the previous bound of roughly $O(n + 1/\varepsilon^{d-1})$ time.

Submitted 2 July, 2018; originally announced July 2018.

Journal ref: ESA 2018

arXiv:1703.10868 [pdf, other]

doi 10.4230/LIPIcs.SoCG.2017.10

Near-Optimal $\varepsilon$-Kernel Construction and Related Problems

Authors: Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: The computation of (i) $\varepsilon$-kernels, (ii) approximate diameter, and (iii) approximate bichromatic closest pair are fundamental problems in geometric approximation. In this paper, we describe new algorithms that offer significant improvements to their running times. In each case the input is a set of $n$ points in $R^d$ for a constant dimension $d \geq 3$ and an approximation parameter $\varepsilon > 0$. We reduce the respective running times (i) from $O((n + 1/\varepsilon^{d-2})\log(1/\varepsilon))$ to $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2+\alpha})$, (ii) from $O((n + 1/\varepsilon^{d-2})\log(1/\varepsilon))$ to $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2+\alpha})$, and (iii) from $O(n / \varepsilon^{d/3})$ to $O(n / \varepsilon^{d/4+\alpha})$, for an arbitrarily small constant $\alpha > 0$. Result (i) is nearly optimal since the size of the output $\varepsilon$-kernel is $\Theta(1/\varepsilon^{(d-1)/2})$ in the worst case. These results are all based on an efficient decomposition of a convex body using a hierarchy of Macbeath regions, and contrast to previous solutions that decompose space using quadtrees and grids. By further application of these techniques, we also show that it is possible to obtain near-optimal preprocessing time for the most efficient data structures to approximately answer queries for (iv) nearest-neighbor searching, (v) directional width, and (vi) polytope membership.

Submitted 31 March, 2017; originally announced March 2017.

ACM Class: F.2.2

Journal ref: Proc. 33rd International Symposium on Computational Geometry (SoCG 2017), pages 10:1-15, 2017

arXiv:1612.01696 [pdf, other]

doi 10.1137/1.9781611974782.18

Optimal Approximate Polytope Membership

Authors: Sunil Arya, Guilherme D. da Fonseca, David M. Mount

Abstract: In the polytope membership problem, a convex polytope $K$ in $R^d$ is given, and the objective is to preprocess $K$ into a data structure so that, given a query point $q \in R^d$, it is possible to determine efficiently whether $q \in K$. We consider this problem in an approximate setting and assume that $d$ is a constant. Given an approximation parameter $\varepsilon > 0$, the query can be answered either way if the distance from $q$ to $K$'s boundary is at most $\varepsilon$ times $K$'s diameter. Previous solutions to the problem were in the form of a space-time trade-off, where logarithmic query time demands $O(1/\varepsilon^{d-1})$ storage, whereas storage $O(1/\varepsilon^{(d-1)/2})$ admits roughly $O(1/\varepsilon^{(d-1)/8})$ query time. In this paper, we present a data structure that achieves logarithmic query time with storage of only $O(1/\varepsilon^{(d-1)/2})$, which matches the worst-case lower bound on the complexity of any $\varepsilon$-approximating polytope. Our data structure is based on a new technique, a hierarchy of ellipsoids defined as approximations to Macbeath regions. As an application, we obtain major improvements to approximate Euclidean nearest neighbor searching. Notably, the storage needed to answer $\varepsilon$-approximate nearest neighbor queries for a set of $n$ points in $O(\log \frac{n}{\varepsilon})$ time is reduced to $O(n/\varepsilon^{d/2})$. This halves the exponent in the $\varepsilon$-dependency of the existing space bound of roughly $O(n/\varepsilon^d)$, which has stood for 15 years (Har-Peled, 2001).

Submitted 6 December, 2016; originally announced December 2016.

Comments: SODA 2017

Showing 1–50 of 58 results for author: Arya, S