-
PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation
Authors:
Yifan Yin,
Zhengtao Han,
Shivam Aarya,
Jianxin Wang,
Shuhang Xu,
Jiawei Peng,
Angtian Wang,
Alan Yuille,
Tianmin Shu
Abstract:
Fine-grained robot manipulation, such as lifting and rotating a bottle to display the label on the cap, requires robust reasoning about object parts and their relationships with intended tasks. Despite recent advances in training general-purpose robot manipulation policies guided by language instructions, there is a notable lack of large-scale datasets for fine-grained manipulation tasks with part…
▽ More
Fine-grained robot manipulation, such as lifting and rotating a bottle to display the label on the cap, requires robust reasoning about object parts and their relationships with intended tasks. Despite recent advances in training general-purpose robot manipulation policies guided by language instructions, there is a notable lack of large-scale datasets for fine-grained manipulation tasks with part-level instructions and diverse 3D object instances annotated with part-level labels. In this work, we introduce PartInstruct, the first large-scale benchmark for training and evaluating fine-grained robot manipulation models using part-level instructions. PartInstruct comprises 513 object instances across 14 categories, each annotated with part-level information, and 1302 fine-grained manipulation tasks organized into 16 task classes. Our training set consists of over 10,000 expert demonstrations synthesized in a 3D simulator, where each demonstration is paired with a high-level task instruction, a chain of base part-based skill instructions, and ground-truth 3D information about the object and its parts. Additionally, we designed a comprehensive test suite to evaluate the generalizability of learned policies across new states, objects, and tasks. We evaluated several state-of-the-art robot manipulation approaches, including end-to-end vision-language policy learning and bi-level planning models for robot manipulation on our benchmark. The experimental results reveal that current models struggle to robustly ground part concepts and predict actions in 3D space, and face challenges when manipulating object parts in long-horizon tasks.
△ Less
Submitted 27 May, 2025;
originally announced May 2025.
-
Batched Nonparametric Bandits via k-Nearest Neighbor UCB
Authors:
Sakshi Arya
Abstract:
We study sequential decision-making in batched nonparametric contextual bandits, where actions are selected over a finite horizon divided into a small number of batches. Motivated by constraints in domains such as medicine and marketing -- where online feedback is limited -- we propose a nonparametric algorithm that combines adaptive k-nearest neighbor (k-NN) regression with the upper confidence b…
▽ More
We study sequential decision-making in batched nonparametric contextual bandits, where actions are selected over a finite horizon divided into a small number of batches. Motivated by constraints in domains such as medicine and marketing -- where online feedback is limited -- we propose a nonparametric algorithm that combines adaptive k-nearest neighbor (k-NN) regression with the upper confidence bound (UCB) principle. Our method, BaNk-UCB, is fully nonparametric, adapts to the context dimension, and is simple to implement. Unlike prior work relying on parametric or binning-based estimators, BaNk-UCB uses local geometry to estimate rewards and adaptively balances exploration and exploitation. We provide near-optimal regret guarantees under standard Lipschitz smoothness and margin assumptions, using a theoretically motivated batch schedule that balances regret across batches and achieves minimax-optimal rates. Empirical evaluations on synthetic and real-world datasets demonstrate that BaNk-UCB consistently outperforms binning-based baselines.
△ Less
Submitted 15 May, 2025;
originally announced May 2025.
-
Can Third-parties Read Our Emotions?
Authors:
Jiayi Li,
Yingfan Zhou,
Pranav Narayanan Venkit,
Halima Binte Islam,
Sneha Arya,
Shomir Wilson,
Sarah Rajtmajer
Abstract:
Natural Language Processing tasks that aim to infer an author's private states, e.g., emotions and opinions, from their written text, typically rely on datasets annotated by third-party annotators. However, the assumption that third-party annotators can accurately capture authors' private states remains largely unexamined. In this study, we present human subjects experiments on emotion recognition…
▽ More
Natural Language Processing tasks that aim to infer an author's private states, e.g., emotions and opinions, from their written text, typically rely on datasets annotated by third-party annotators. However, the assumption that third-party annotators can accurately capture authors' private states remains largely unexamined. In this study, we present human subjects experiments on emotion recognition tasks that directly compare third-party annotations with first-party (author-provided) emotion labels. Our findings reveal significant limitations in third-party annotations-whether provided by human annotators or large language models (LLMs)-in faithfully representing authors' private states. However, LLMs outperform human annotators nearly across the board. We further explore methods to improve third-party annotation quality. We find that demographic similarity between first-party authors and third-party human annotators enhances annotation performance. While incorporating first-party demographic information into prompts leads to a marginal but statistically significant improvement in LLMs' performance. We introduce a framework for evaluating the limitations of third-party annotations and call for refined annotation practices to accurately represent and model authors' private states.
△ Less
Submitted 25 April, 2025;
originally announced April 2025.
-
Semi-Parametric Batched Global Multi-Armed Bandits with Covariates
Authors:
Sakshi Arya,
Hyebin Song
Abstract:
The multi-armed bandits (MAB) framework is a widely used approach for sequential decision-making, where a decision-maker selects an arm in each round with the goal of maximizing long-term rewards. Moreover, in many practical applications, such as personalized medicine and recommendation systems, feedback is provided in batches, contextual information is available at the time of decision-making, an…
▽ More
The multi-armed bandits (MAB) framework is a widely used approach for sequential decision-making, where a decision-maker selects an arm in each round with the goal of maximizing long-term rewards. Moreover, in many practical applications, such as personalized medicine and recommendation systems, feedback is provided in batches, contextual information is available at the time of decision-making, and rewards from different arms are related rather than independent. We propose a novel semi-parametric framework for batched bandits with covariates and a shared parameter across arms, leveraging the single-index regression (SIR) model to capture relationships between arm rewards while balancing interpretability and flexibility. Our algorithm, Batched single-Index Dynamic binning and Successive arm elimination (BIDS), employs a batched successive arm elimination strategy with a dynamic binning mechanism guided by the single-index direction. We consider two settings: one where a pilot direction is available and another where the direction is estimated from data, deriving theoretical regret bounds for both cases. When a pilot direction is available with sufficient accuracy, our approach achieves minimax-optimal rates (with $d = 1$) for nonparametric batched bandits, circumventing the curse of dimensionality. Extensive experiments on simulated and real-world datasets demonstrate the effectiveness of our algorithm compared to the nonparametric batched bandit method introduced by \cite{jiang2024batched}.
△ Less
Submitted 1 March, 2025;
originally announced March 2025.
-
Uncertainty-Guided Cross Attention Ensemble Mean Teacher for Semi-supervised Medical Image Segmentation
Authors:
Meghana Karri,
Amit Soni Arya,
Koushik Biswas,
Nicol`o Gennaro,
Vedat Cicek,
Gorkem Durak,
Yuri S. Velichko,
Ulas Bagci
Abstract:
This work proposes a novel framework, Uncertainty-Guided Cross Attention Ensemble Mean Teacher (UG-CEMT), for achieving state-of-the-art performance in semi-supervised medical image segmentation. UG-CEMT leverages the strengths of co-training and knowledge distillation by combining a Cross-attention Ensemble Mean Teacher framework (CEMT) inspired by Vision Transformers (ViT) with uncertainty-guide…
▽ More
This work proposes a novel framework, Uncertainty-Guided Cross Attention Ensemble Mean Teacher (UG-CEMT), for achieving state-of-the-art performance in semi-supervised medical image segmentation. UG-CEMT leverages the strengths of co-training and knowledge distillation by combining a Cross-attention Ensemble Mean Teacher framework (CEMT) inspired by Vision Transformers (ViT) with uncertainty-guided consistency regularization and Sharpness-Aware Minimization emphasizing uncertainty. UG-CEMT improves semi-supervised performance while maintaining a consistent network architecture and task setting by fostering high disparity between sub-networks. Experiments demonstrate significant advantages over existing methods like Mean Teacher and Cross-pseudo Supervision in terms of disparity, domain generalization, and medical image segmentation performance. UG-CEMT achieves state-of-the-art results on multi-center prostate MRI and cardiac MRI datasets, where object segmentation is particularly challenging. Our results show that using only 10\% labeled data, UG-CEMT approaches the performance of fully supervised methods, demonstrating its effectiveness in exploiting unlabeled data for robust medical image segmentation. The code is publicly available at \url{https://github.com/Meghnak13/UG-CEMT}
△ Less
Submitted 19 December, 2024;
originally announced December 2024.
-
Data Processing Chain and Products of EOS-06 OCM-3 Payload From Signal Processing to Geometric Precision
Authors:
Ankur Garg,
Tushar Shukla,
Sunita Arya,
Ghansham Sangar,
Sampa Roy,
Meenakshi Sarkar,
S. Manthira Moorthi,
Debajyoti Dhar
Abstract:
The Ocean Color Monitor-3, launched aboard Oceansat-3, represents a significant advancement in ocean observation technology, building upon the capabilities of its predecessors. With thirteen spectral bands, OCM-3 enhances feature identification and atmospheric correction, enabling precise data collection from a sun-synchronous orbit. With thirteen spectral bands, OCM-3 enhances feature identificat…
▽ More
The Ocean Color Monitor-3, launched aboard Oceansat-3, represents a significant advancement in ocean observation technology, building upon the capabilities of its predecessors. With thirteen spectral bands, OCM-3 enhances feature identification and atmospheric correction, enabling precise data collection from a sun-synchronous orbit. With thirteen spectral bands, OCM-3 enhances feature identification and atmospheric correction, enabling precise data collection from a sunsynchronous orbit. Operating at an altitude of 732.5 km, the satellite achieves high signal-to-noise ratios (SNR) through sophisticated onboard and ground processing techniques, including advanced geometric modeling for pixel registration.The OCM-3 processing pipeline, consisting of multiple levels, ensures rigorous calibration and correction of radiometric and geometric data. This paper presents key methodologies such as dark data modeling, photo response non-uniformity correction, and smear correction, are employed to enhance data quality. The effective implementation of ground time delay integration (TDI) allows for the refinement of SNR, with evaluations demonstrating that performance specifications were exceeded. Geometric calibration procedures, including band-to-band registration and geolocation accuracy assessments, which further optimize data reliability are presented in the paper. Advanced image registration techniques leveraging Ground Control Points (GCPs) and residual error analysis significantly reduce geolocation errors, achieving precision within specified thresholds. Overall, OCM-3 comprehensive calibration and processing strategies ensure high-quality, reliable data crucial for ocean monitoring and change detection applications, facilitating improved understanding of ocean dynamics and environmental changes.
△ Less
Submitted 22 October, 2024;
originally announced November 2024.
-
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
Authors:
Shreyash Arya,
Sukrut Rao,
Moritz Böhle,
Bernt Schiele
Abstract:
B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vision transformers (ViTs), which primarily replace linear layers with B-cos transformations, perform competitively to their respective standard variants…
▽ More
B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vision transformers (ViTs), which primarily replace linear layers with B-cos transformations, perform competitively to their respective standard variants while also yielding explanations that are faithful by design. However, it has so far been necessary to train these models from scratch, which is increasingly infeasible in the era of large, pre-trained foundation models. In this work, inspired by the architectural similarities in standard DNNs and B-cos networks, we propose 'B-cosification', a novel approach to transform existing pre-trained models to become inherently interpretable. We perform a thorough study of design choices to perform this conversion, both for convolutional neural networks and vision transformers. We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability, while often outperforming them in terms of classification performance at a fraction of the training cost. Subsequently, we apply B-cosification to a pretrained CLIP model, and show that, even with limited data and compute cost, we obtain a B-cosified version that is highly interpretable and competitive on zero shot performance across a variety of datasets. We release our code and pre-trained model weights at https://github.com/shrebox/B-cosification.
△ Less
Submitted 24 January, 2025; v1 submitted 1 November, 2024;
originally announced November 2024.
-
Technical Design Review of Duke Robotics Club's Oogway: An AUV for RoboSub 2024
Authors:
Will Denton,
Michael Bryant,
Lilly Chiavetta,
Vedarsh Shah,
Rico Zhu,
Philip Xue,
Vincent Chen,
Maxwell Lin,
Hung Le,
Austin Camacho,
Raul Galvez,
Nathan Yang,
Nathanael Ren,
Tyler Rose,
Mathew Chu,
Amir Ergashev,
Saagar Arya,
Kaelyn Pieter,
Ethan Horowitz,
Maanav Allampallam,
Patrick Zheng,
Mia Kaarls,
June Wood
Abstract:
The Duke Robotics Club is proud to present our robot for the 2024 RoboSub Competition: Oogway. Now in its second year, Oogway has been dramatically upgraded in both its capabilities and reliability. Oogway was built on the principle of independent, well-integrated, and reliable subsystems. Individual components and subsystems were tested and designed separately. Oogway's most advanced capabilities…
▽ More
The Duke Robotics Club is proud to present our robot for the 2024 RoboSub Competition: Oogway. Now in its second year, Oogway has been dramatically upgraded in both its capabilities and reliability. Oogway was built on the principle of independent, well-integrated, and reliable subsystems. Individual components and subsystems were tested and designed separately. Oogway's most advanced capabilities are a result of the tight integration between these subsystems. Such examples include a re-envisioned controls system, an entirely new electrical stack, advanced sonar integration, additional cameras and system monitoring, a new marker dropper, and a watertight capsule mechanism. These additions enabled Oogway to prequalify for Robosub 2024.
△ Less
Submitted 12 October, 2024;
originally announced October 2024.
-
SIFToM: Robust Spoken Instruction Following through Theory of Mind
Authors:
Lance Ying,
Jason Xinyu Liu,
Shivam Aarya,
Yizirui Fang,
Stefanie Tellex,
Joshua B. Tenenbaum,
Tianmin Shu
Abstract:
Spoken language instructions are ubiquitous in agent collaboration. However, in human-robot collaboration, recognition accuracy for human speech is often influenced by various speech and environmental factors, such as background noise, the speaker's accents, and mispronunciation. When faced with noisy or unfamiliar auditory inputs, humans use context and prior knowledge to disambiguate the stimulu…
▽ More
Spoken language instructions are ubiquitous in agent collaboration. However, in human-robot collaboration, recognition accuracy for human speech is often influenced by various speech and environmental factors, such as background noise, the speaker's accents, and mispronunciation. When faced with noisy or unfamiliar auditory inputs, humans use context and prior knowledge to disambiguate the stimulus and take pragmatic actions, a process referred to as top-down processing in cognitive science. We present a cognitively inspired model, Speech Instruction Following through Theory of Mind (SIFToM), to enable robots to pragmatically follow human instructions under diverse speech conditions by inferring the human's goal and joint plan as prior for speech perception and understanding. We test SIFToM in simulated home experiments (VirtualHome 2). Results show that the SIFToM model outperforms state-of-the-art speech and language models, approaching human-level accuracy on challenging speech instruction following tasks. We then demonstrate its ability at the task planning level on a mobile manipulator for breakfast preparation tasks.
△ Less
Submitted 16 September, 2024;
originally announced September 2024.
-
Decomposing the Persistent Homology Transform of Star-Shaped Objects
Authors:
Shreya Arya,
Barbara Giunti,
Abigail Hickok,
Lida Kanari,
Sarah McGuire,
Katharine Turner
Abstract:
In this paper, we study the geometric decomposition of the degree-$0$ Persistent Homology Transform (PHT) as viewed as a persistence diagram bundle. We focus on star-shaped objects as they can be segmented into smaller, simpler regions known as ``sectors''. Algebraically, we demonstrate that the degree-$0$ persistence diagram of a star-shaped object in $\mathbb{R}^2$ can be derived from the degree…
▽ More
In this paper, we study the geometric decomposition of the degree-$0$ Persistent Homology Transform (PHT) as viewed as a persistence diagram bundle. We focus on star-shaped objects as they can be segmented into smaller, simpler regions known as ``sectors''. Algebraically, we demonstrate that the degree-$0$ persistence diagram of a star-shaped object in $\mathbb{R}^2$ can be derived from the degree-$0$ persistence diagrams of its sectors. Using this, we then establish sufficient conditions for star-shaped objects in $\mathbb{R}^2$ so that they have ``trivial geometric monodromy''. Consequently, the PHT of such a shape can be decomposed as a union of curves parameterized by $S^1$, where the curves are given by the continuous movement of each point in the persistence diagrams that are parameterized by $S^{1}$. Finally, we discuss the current challenges of generalizing these results to higher dimensions.
△ Less
Submitted 27 August, 2024;
originally announced August 2024.
-
Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification
Authors:
Shivvrat Arya,
Yu Xiang,
Vibhav Gogate
Abstract:
We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with…
▽ More
We present a unified framework called deep dependency networks (DDNs) that combines dependency networks and deep learning architectures for multi-label classification, with a particular emphasis on image and video data. The primary advantage of dependency networks is their ease of training, in contrast to other probabilistic graphical models like Markov networks. In particular, when combined with deep learning architectures, they provide an intuitive, easy-to-use loss function for multi-label classification. A drawback of DDNs compared to Markov networks is their lack of advanced inference schemes, necessitating the use of Gibbs sampling. To address this challenge, we propose novel inference schemes based on local search and integer linear programming for computing the most likely assignment to the labels given observations. We evaluate our novel methods on three video datasets (Charades, TACoS, Wetlab) and three image datasets (MS-COCO, PASCAL VOC, NUS-WIDE), comparing their performance with (a) basic neural architectures and (b) neural architectures combined with Markov networks equipped with advanced inference and learning techniques. Our results demonstrate the superiority of our new DDN methods over the two competing approaches.
△ Less
Submitted 17 April, 2024;
originally announced April 2024.
-
Learning to Solve the Constrained Most Probable Explanation Task in Probabilistic Graphical Models
Authors:
Shivvrat Arya,
Tahrima Rahman,
Vibhav Gogate
Abstract:
We propose a self-supervised learning approach for solving the following constrained optimization task in log-linear models or Markov networks. Let $f$ and $g$ be two log-linear models defined over the sets $\mathbf{X}$ and $\mathbf{Y}$ of random variables respectively. Given an assignment $\mathbf{x}$ to all variables in $\mathbf{X}$ (evidence) and a real number $q$, the constrained most-probable…
▽ More
We propose a self-supervised learning approach for solving the following constrained optimization task in log-linear models or Markov networks. Let $f$ and $g$ be two log-linear models defined over the sets $\mathbf{X}$ and $\mathbf{Y}$ of random variables respectively. Given an assignment $\mathbf{x}$ to all variables in $\mathbf{X}$ (evidence) and a real number $q$, the constrained most-probable explanation (CMPE) task seeks to find an assignment $\mathbf{y}$ to all variables in $\mathbf{Y}$ such that $f(\mathbf{x}, \mathbf{y})$ is maximized and $g(\mathbf{x}, \mathbf{y})\leq q$. In our proposed self-supervised approach, given assignments $\mathbf{x}$ to $\mathbf{X}$ (data), we train a deep neural network that learns to output near-optimal solutions to the CMPE problem without requiring access to any pre-computed solutions. The key idea in our approach is to use first principles and approximate inference methods for CMPE to derive novel loss functions that seek to push infeasible solutions towards feasible ones and feasible solutions towards optimal ones. We analyze the properties of our proposed method and experimentally demonstrate its efficacy on several benchmark problems.
△ Less
Submitted 17 April, 2024;
originally announced April 2024.
-
The 8th AI City Challenge
Authors:
Shuo Wang,
David C. Anastasiu,
Zheng Tang,
Ming-Ching Chang,
Yue Yao,
Liang Zheng,
Mohammed Shaiqur Rahman,
Meenakshi S. Arya,
Anuj Sharma,
Pranamesh Chakraborty,
Sanjita Prajapati,
Quan Kong,
Norimasa Kobori,
Munkhjargal Gochoo,
Munkh-Erdene Otgonbold,
Fady Alnajjar,
Ganzorig Batnasan,
Ping-Yang Chen,
Jun-Wei Hsieh,
Xunlei Wu,
Sameer Satish Pusegaonkar,
Yizhou Wang,
Sujit Biswas,
Rama Chellappa
Abstract:
The eighth AI City Challenge highlighted the convergence of computer vision and artificial intelligence in areas like retail, warehouse settings, and Intelligent Traffic Systems (ITS), presenting significant research opportunities. The 2024 edition featured five tracks, attracting unprecedented interest from 726 teams in 47 countries and regions. Track 1 dealt with multi-target multi-camera (MTMC)…
▽ More
The eighth AI City Challenge highlighted the convergence of computer vision and artificial intelligence in areas like retail, warehouse settings, and Intelligent Traffic Systems (ITS), presenting significant research opportunities. The 2024 edition featured five tracks, attracting unprecedented interest from 726 teams in 47 countries and regions. Track 1 dealt with multi-target multi-camera (MTMC) people tracking, highlighting significant enhancements in camera count, character number, 3D annotation, and camera matrices, alongside new rules for 3D tracking and online tracking algorithm encouragement. Track 2 introduced dense video captioning for traffic safety, focusing on pedestrian accidents using multi-camera feeds to improve insights for insurance and prevention. Track 3 required teams to classify driver actions in a naturalistic driving analysis. Track 4 explored fish-eye camera analytics using the FishEye8K dataset. Track 5 focused on motorcycle helmet rule violation detection. The challenge utilized two leaderboards to showcase methods, with participants setting new benchmarks, some surpassing existing state-of-the-art achievements.
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment
Authors:
Lance Ying,
Kunal Jha,
Shivam Aarya,
Joshua B. Tenenbaum,
Antonio Torralba,
Tianmin Shu
Abstract:
Verbal communication plays a crucial role in human cooperation, particularly when the partners only have incomplete information about the task, environment, and each other's mental state. In this paper, we propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA). GOMA formulates verbal communication as a planning problem that minimizes the misalignment between the…
▽ More
Verbal communication plays a crucial role in human cooperation, particularly when the partners only have incomplete information about the task, environment, and each other's mental state. In this paper, we propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA). GOMA formulates verbal communication as a planning problem that minimizes the misalignment between the parts of agents' mental states that are relevant to the goals. This approach enables an embodied assistant to reason about when and how to proactively initialize communication with humans verbally using natural language to help achieve better cooperation. We evaluate our approach against strong baselines in two challenging environments, Overcooked (a multiplayer game) and VirtualHome (a household simulator). Our experimental results demonstrate that large language models struggle with generating meaningful communication that is grounded in the social and physical context. In contrast, our approach can successfully generate concise verbal communication for the embodied assistant to effectively boost the performance of the cooperation as well as human users' perception of the assistant.
△ Less
Submitted 14 January, 2025; v1 submitted 16 March, 2024;
originally announced March 2024.
-
Neural Network Approximators for Marginal MAP in Probabilistic Circuits
Authors:
Shivvrat Arya,
Tahrima Rahman,
Vibhav Gogate
Abstract:
Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multi-variate probability distributions. They are preferred in practice over other probabilistic representations such as Bayesian and Markov networks because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the maximum-a-posteriori (MAP) and marg…
▽ More
Probabilistic circuits (PCs) such as sum-product networks efficiently represent large multi-variate probability distributions. They are preferred in practice over other probabilistic representations such as Bayesian and Markov networks because PCs can solve marginal inference (MAR) tasks in time that scales linearly in the size of the network. Unfortunately, the maximum-a-posteriori (MAP) and marginal MAP (MMAP) tasks remain NP-hard in these models. Inspired by the recent work on using neural networks for generating near-optimal solutions to optimization problems such as integer linear programming, we propose an approach that uses neural networks to approximate (M)MAP inference in PCs. The key idea in our approach is to approximate the cost of an assignment to the query variables using a continuous multilinear function, and then use the latter as a loss function. The two main benefits of our new method are that it is self-supervised and after the neural network is learned, it requires only linear time to output a solution. We evaluate our new approach on several benchmark datasets and show that it outperforms three competing linear time approximations, max-product inference, max-marginal inference and sequential estimation, which are used in practice to solve MMAP tasks in PCs.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Quantum-Assisted Adaptive Beamforming in UASs Network: Enhancing Airborne Communication via Collaborative UASs for NextG IoT
Authors:
Sudhanshu Arya,
Ying Wang
Abstract:
This paper introduces a novel quantum-based method for dynamic beamforming and re-forming in Unmanned Aircraft Systems (UASs), specifically addressing the critical challenges posed by the unavoidable hovering characteristics of UAVs. Hovering creates significant beam path distortions, impacting the reliability and quality of distributed beamforming in airborne networks. To overcome these challenge…
▽ More
This paper introduces a novel quantum-based method for dynamic beamforming and re-forming in Unmanned Aircraft Systems (UASs), specifically addressing the critical challenges posed by the unavoidable hovering characteristics of UAVs. Hovering creates significant beam path distortions, impacting the reliability and quality of distributed beamforming in airborne networks. To overcome these challenges, our Quantum Search for UAS Beamforming (QSUB) employs quantum superposition, entanglement, and amplitude amplification. It adaptively reconfigures beams, enhancing beam quality and maintaining robust communication links in the face of rapid UAS state changes due to hovering. Furthermore, we propose an optimized framework, Quantum-Position-Locked Loop (Q-P-LL), that is based on the principle of the Nelder-Mead optimization method for adaptive search to reduce prediction error and improve resilience against angle-of-arrival estimation errors, crucial under dynamic hovering conditions. We also demonstrate the scalability of the system performance and computation complexity by comparing various numbers of active UASs. Importantly, QSUB and Q-P-LL can be applied to both classical and quantum computing architectures. Comparative analyses with conventional Maximum Ratio Transmission (MRT) schemes demonstrate the superior performance and scalability of our quantum approaches, marking significant advancements in the next-generation Internet of Things (IoT) applications requiring reliable airborne communication networks.
△ Less
Submitted 26 January, 2024;
originally announced January 2024.
-
CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities
Authors:
Rohith Peddi,
Shivvrat Arya,
Bharath Challa,
Likhitha Pallapothula,
Akshay Vyas,
Bhavya Gouripeddi,
Jikai Wang,
Qifan Zhang,
Vasundhara Komaragiri,
Eric Ragan,
Nicholas Ruozzi,
Yu Xiang,
Vibhav Gogate
Abstract:
Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understan…
▽ More
Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and the ability to reason about the structure of the activity. To this end, we collect a new egocentric 4D dataset, CaptainCook4D, comprising 384 recordings (94.5 hours) of people performing recipes in real kitchen environments. This dataset consists of two distinct types of activity: one in which participants adhere to the provided recipe instructions and another in which they deviate and induce errors. We provide 5.3K step annotations and 10K fine-grained action annotations and benchmark the dataset for the following tasks: supervised error recognition, multistep localization, and procedure learning
△ Less
Submitted 8 December, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Towards Increasing the Robustness of Predictive Steering-Control Autonomous Navigation Systems Against Dash Cam Image Angle Perturbations Due to Pothole Encounters
Authors:
Shivam Aarya
Abstract:
Vehicle manufacturers are racing to create autonomous navigation and steering control algorithms for their vehicles. These software are made to handle various real-life scenarios such as obstacle avoidance and lane maneuvering. There is some ongoing research to incorporate pothole avoidance into these autonomous systems. However, there is very little research on the effect of hitting a pothole on…
▽ More
Vehicle manufacturers are racing to create autonomous navigation and steering control algorithms for their vehicles. These software are made to handle various real-life scenarios such as obstacle avoidance and lane maneuvering. There is some ongoing research to incorporate pothole avoidance into these autonomous systems. However, there is very little research on the effect of hitting a pothole on the autonomous navigation software that uses cameras to make driving decisions. Perturbations in the camera angle when hitting a pothole can cause errors in the predicted steering angle. In this paper, we present a new model to compensate for such angle perturbations and reduce any errors in steering control prediction algorithms. We evaluate our model on perturbations of publicly available datasets and show our model can reduce the errors in the estimated steering angle from perturbed images to 2.3%, making autonomous steering control robust against the dash cam image angle perturbations induced when one wheel of a car goes over a pothole.
△ Less
Submitted 5 October, 2023;
originally announced October 2023.
-
From Ambiguity to Explicitness: NLP-Assisted 5G Specification Abstraction for Formal Analysis
Authors:
Shiyu Yuan,
Jingda Yang,
Sudhanshu Arya,
Carlo Lipizzi,
Ying Wang
Abstract:
Formal method-based analysis of the 5G Wireless Communication Protocol is crucial for identifying logical vulnerabilities and facilitating an all-encompassing security assessment, especially in the design phase. Natural Language Processing (NLP) assisted techniques and most of the tools are not widely adopted by the industry and research community. Traditional formal verification through a mathema…
▽ More
Formal method-based analysis of the 5G Wireless Communication Protocol is crucial for identifying logical vulnerabilities and facilitating an all-encompassing security assessment, especially in the design phase. Natural Language Processing (NLP) assisted techniques and most of the tools are not widely adopted by the industry and research community. Traditional formal verification through a mathematics approach heavily relied on manual logical abstraction prone to being time-consuming, and error-prone. The reason that the NLP-assisted method did not apply in industrial research may be due to the ambiguity in the natural language of the protocol designs nature is controversial to the explicitness of formal verification. To address the challenge of adopting the formal methods in protocol designs, targeting (3GPP) protocols that are written in natural language, in this study, we propose a hybrid approach to streamline the analysis of protocols. We introduce a two-step pipeline that first uses NLP tools to construct data and then uses constructed data to extract identifiers and formal properties by using the NLP model. The identifiers and formal properties are further used for formal analysis. We implemented three models that take different dependencies between identifiers and formal properties as criteria. Our results of the optimal model reach valid accuracy of 39% for identifier extraction and 42% for formal properties predictions. Our work is proof of concept for an efficient procedure in performing formal analysis for largescale complicate specification and protocol analysis, especially for 5G and nextG communications.
△ Less
Submitted 6 August, 2023;
originally announced August 2023.
-
Formal-Guided Fuzz Testing: Targeting Security Assurance from Specification to Implementation for 5G and Beyond
Authors:
Jingda Yang,
Sudhanshu Arya,
Ying Wang
Abstract:
Softwarization and virtualization in 5G and beyond necessitate thorough testing to ensure the security of critical infrastructure and networks, requiring the identification of vulnerabilities and unintended emergent behaviors from protocol designs to their software stack implementation. To provide an efficient and comprehensive solution, we propose a novel and first-of-its-kind approach that conne…
▽ More
Softwarization and virtualization in 5G and beyond necessitate thorough testing to ensure the security of critical infrastructure and networks, requiring the identification of vulnerabilities and unintended emergent behaviors from protocol designs to their software stack implementation. To provide an efficient and comprehensive solution, we propose a novel and first-of-its-kind approach that connects the strengths and coverage of formal and fuzzing methods to efficiently detect vulnerabilities across protocol logic and implementation stacks in a hierarchical manner. We design and implement formal verification to detect attack traces in critical protocols, which are used to guide subsequent fuzz testing and incorporate feedback from fuzz testing to broaden the scope of formal verification. This innovative approach significantly improves efficiency and enables the auto-discovery of vulnerabilities and unintended emergent behaviors from the 3GPP protocols to software stacks. Following this approach, we discover one identifier leakage model, one DoS attack model, and two eavesdrop attack models due to the absence of rudimentary MITM protection within the protocol, despite the existence of a Transport Layer Security (TLS) solution to this issue for over a decade. More remarkably, guided by the identified formal analysis and attack models, we exploit 61 vulnerabilities using fuzz testing demonstrated on srsRAN platforms. These identified vulnerabilities contribute to fortifying protocol-level assumptions and refining the search space. Compared to state-of-the-art fuzz testing, our united formal and fuzzing methodology enables auto-assurance by systematically discovering vulnerabilities. It significantly reduces computational complexity, transforming the non-practical exponential growth in computational cost into linear growth.
△ Less
Submitted 20 July, 2023;
originally announced July 2023.
-
Distributed 3D-Beam Reforming for Hovering-Tolerant UAVs Communication over Coexistence: A Deep-Q Learning for Intelligent Space-Air-Ground Integrated Networks
Authors:
Sudhanshu Arya,
Yifeng Peng,
Jingda Yang,
Ying Wang
Abstract:
In this paper, we present a novel distributed UAVs beam reforming approach to dynamically form and reform a space-selective beam path in addressing the coexistence with satellite and terrestrial communications. Despite the unique advantage to support wider coverage in UAV-enabled cellular communications, the challenges reside in the array responses' sensitivity to random rotational motion and the…
▽ More
In this paper, we present a novel distributed UAVs beam reforming approach to dynamically form and reform a space-selective beam path in addressing the coexistence with satellite and terrestrial communications. Despite the unique advantage to support wider coverage in UAV-enabled cellular communications, the challenges reside in the array responses' sensitivity to random rotational motion and the hovering nature of the UAVs. A model-free reinforcement learning (RL) based unified UAV beam selection and tracking approach is presented to effectively realize the dynamic distributed and collaborative beamforming. The combined impact of the UAVs' hovering and rotational motions is considered while addressing the impairment due to the interference from the orbiting satellites and neighboring networks. The main objectives of this work are two-fold: first, to acquire the channel awareness to uncover its impairments; second, to overcome the beam distortion to meet the quality of service (QoS) requirements. To overcome the impact of the interference and to maximize the beamforming gain, we define and apply a new optimal UAV selection algorithm based on the brute force criteria. Results demonstrate that the detrimental effects of the channel fading and the interference from the orbiting satellites and neighboring networks can be overcome using the proposed approach. Subsequently, an RL algorithm based on Deep Q-Network (DQN) is developed for real-time beam tracking. By augmenting the system with the impairments due to hovering and rotational motion, we show that the proposed DQN algorithm can reform the beam in real-time with negligible error. It is demonstrated that the proposed DQN algorithm attains an exceptional performance improvement. We show that it requires a few iterations only for fine-tuning its parameters without observing any plateaus irrespective of the hovering tolerance.
△ Less
Submitted 18 July, 2023;
originally announced July 2023.
-
CVPR MultiEarth 2023 Deforestation Estimation Challenge:SpaceVision4Amazon
Authors:
Sunita Arya,
S Manthira Moorthi,
Debajyoti Dhar
Abstract:
In this paper, we present a deforestation estimation method based on attention guided UNet architecture using Electro-Optical (EO) and Synthetic Aperture Radar (SAR) satellite imagery. For optical images, Landsat-8 and for SAR imagery, Sentinel-1 data have been used to train and validate the proposed model. Due to the unavailability of temporally and spatially collocated data, individual model has…
▽ More
In this paper, we present a deforestation estimation method based on attention guided UNet architecture using Electro-Optical (EO) and Synthetic Aperture Radar (SAR) satellite imagery. For optical images, Landsat-8 and for SAR imagery, Sentinel-1 data have been used to train and validate the proposed model. Due to the unavailability of temporally and spatially collocated data, individual model has been trained for each sensor. During training time Landsat-8 model achieved training and validation pixel accuracy of 93.45% and Sentinel-2 model achieved 83.87% pixel accuracy. During the test set evaluation, the model achieved pixel accuracy of 84.70% with F1-Score of 0.79 and IoU of 0.69.
△ Less
Submitted 10 July, 2023;
originally announced July 2023.
-
Kernel $ε$-Greedy for Multi-Armed Bandits with Covariates
Authors:
Sakshi Arya,
Bharath K. Sriperumbudur
Abstract:
We consider the $ε$-greedy strategy for the multi-arm bandit with covariates (MABC) problem, where the mean reward functions are assumed to lie in a reproducing kernel Hilbert space (RKHS). We propose to estimate the unknown mean reward functions using an online weighted kernel ridge regression estimator, and show the resultant estimator to be consistent under appropriate decay rates of the explor…
▽ More
We consider the $ε$-greedy strategy for the multi-arm bandit with covariates (MABC) problem, where the mean reward functions are assumed to lie in a reproducing kernel Hilbert space (RKHS). We propose to estimate the unknown mean reward functions using an online weighted kernel ridge regression estimator, and show the resultant estimator to be consistent under appropriate decay rates of the exploration probability sequence, $\{ε_t\}_t$, and regularization parameter, $\{λ_t\}_t$. Moreover, we show that for any choice of kernel and the corresponding RKHS, we achieve a sub-linear regret rate depending on the intrinsic dimensionality of the RKHS. Furthermore, we achieve the optimal regret rate of $\sqrt{T}$ under a margin condition for finite-dimensional RKHS.
△ Less
Submitted 1 June, 2025; v1 submitted 29 June, 2023;
originally announced June 2023.
-
Optimal Area-Sensitive Bounds for Polytope Approximation
Authors:
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Given a convex body $K$ of diameter $Δ$ in $\mathbb{R}^d$ for fixed $d$, the objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. The best known uniform bound, due to Dudley (1974), shows t…
▽ More
Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Given a convex body $K$ of diameter $Δ$ in $\mathbb{R}^d$ for fixed $d$, the objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. The best known uniform bound, due to Dudley (1974), shows that $O((Δ/\varepsilon)^{(d-1)/2})$ facets suffice. While this bound is optimal in the case of a Euclidean ball, it is far from optimal for ``skinny'' convex bodies.
A natural way to characterize a convex object's skinniness is in terms of its relationship to the Euclidean ball. Given a convex body $K$, define its surface diameter $Δ_{d-1}$ to be the diameter of a Euclidean ball of the same surface area as $K$. It follows from generalizations of the isoperimetric inequality that $Δ\geq Δ_{d-1}$.
We show that, under the assumption that the width of the body in any direction is at least $\varepsilon$, it is possible to approximate a convex body using $O((Δ_{d-1}/\varepsilon)^{(d-1)/2})$ facets. This bound is never worse than the previous bound and may be significantly better for skinny bodies. The bound is tight, in the sense that for any value of $Δ_{d-1}$, there exist convex bodies that, up to constant factors, require this many facets.
The improvement arises from a novel approach to sampling points on the boundary of a convex body. We employ a classical concept from convexity, called Macbeath regions. We demonstrate that Macbeath regions in $K$ and $K$'s polar behave much like polar pairs. We then apply known results on the Mahler volume to bound their number.
△ Less
Submitted 27 June, 2023;
originally announced June 2023.
-
Approximate Nearest Neighbor Searching with Non-Euclidean and Weighted Distances
Authors:
Ahmed Abdelkader,
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set $S$ of $n$ points in $\mathbb{R}^d$, an approximation parameter $\varepsilon > 0$, and a distance function that satisfies certain smoothness and growth-rate assumptions. The objective is to preprocess $S$ into a data structure so that for any query poi…
▽ More
We present a new approach to approximate nearest-neighbor queries in fixed dimension under a variety of non-Euclidean distances. We are given a set $S$ of $n$ points in $\mathbb{R}^d$, an approximation parameter $\varepsilon > 0$, and a distance function that satisfies certain smoothness and growth-rate assumptions. The objective is to preprocess $S$ into a data structure so that for any query point $q$ in $\mathbb{R}^d$, it is possible to efficiently report any point of $S$ whose distance from $q$ is within a factor of $1+\varepsilon$ of the actual closest point.
Prior to this work, the most efficient data structures for approximate nearest-neighbor searching in spaces of constant dimensionality applied only to the Euclidean metric. This paper overcomes this limitation through a method called convexification. For admissible distance functions, the proposed data structures answer queries in logarithmic time using $O(n \log (1 / \varepsilon) / \varepsilon^{d/2})$ space, nearly matching the best known bounds for the Euclidean metric. These results apply to both convex scaling distance functions (including the Mahalanobis distance and weighted Minkowski metrics) and Bregman divergences (including the Kullback-Leibler divergence and the Itakura-Saito distance).
△ Less
Submitted 27 June, 2023;
originally announced June 2023.
-
The Gromov-Wasserstein distance between spheres
Authors:
Shreya Arya,
Arnab Auddy,
Ranthony Edmonds,
Sunhyuk Lim,
Facundo Memoli,
Daniel Packer
Abstract:
In this paper we consider a two-parameter family {dGWp,q}p,q of Gromov- Wasserstein distances between metric measure spaces. By exploiting a suitable interaction between specific values of the parameters p and q and the metric of the underlying spaces, we determine the exact value of the distance dGW4,2 between all pairs of unit spheres of different dimension endowed with their Euclidean distance…
▽ More
In this paper we consider a two-parameter family {dGWp,q}p,q of Gromov- Wasserstein distances between metric measure spaces. By exploiting a suitable interaction between specific values of the parameters p and q and the metric of the underlying spaces, we determine the exact value of the distance dGW4,2 between all pairs of unit spheres of different dimension endowed with their Euclidean distance and their uniform measure.
△ Less
Submitted 12 July, 2024; v1 submitted 18 June, 2023;
originally announced June 2023.
-
Ground-to-UAV Integrated Network: Low Latency Communication over Interference Channel
Authors:
Sudhanshu Arya,
Ying Wang
Abstract:
We present a novel and first-of-its-kind information-theoretic framework for the key design consideration and implementation of a ground-to-UAV (G2U) communication network to minimize end-to-end transmission delay in the presence of interference. The proposed framework is useful as it describes the minimum transmission latency for an uplink ground-to-UAV communication must satisfy while achieving…
▽ More
We present a novel and first-of-its-kind information-theoretic framework for the key design consideration and implementation of a ground-to-UAV (G2U) communication network to minimize end-to-end transmission delay in the presence of interference. The proposed framework is useful as it describes the minimum transmission latency for an uplink ground-to-UAV communication must satisfy while achieving a given level of reliability. To characterize the transmission delay, we utilize Fano's inequality and derive the tight upper bound for the capacity for the G2U uplink channel in the presence of interference, noise, and potential jamming. Subsequently, given the reliability constraint, the error exponent is obtained for the given channel. Furthermore, a relay UAV in the dual-hop relay mode, with amplify-and-forward (AF) protocol, is considered, for which we jointly obtain the optimal positions of the relay and the receiver UAVs in the presence of interference. Interestingly, in our study, we find that for both the point-to-point and relayed links, increasing the transmit power may not always be an optimal solution for delay minimization problems. Moreover, we prove that there exists an optimal height that minimizes the end-to-end transmission delay in the presence of interference. The proposed framework can be used in practice by a network controller as a system parameters selection criteria, where among a set of parameters, the parameters leading to the lowest transmission latency can be incorporated into the transmission. The based analysis further set the baseline assessment when applying Command and Control (C2) standards to mission-critical G2U and UAV-to-UAV(U2U) services.
△ Less
Submitted 3 May, 2023;
originally announced May 2023.
-
The 7th AI City Challenge
Authors:
Milind Naphade,
Shuo Wang,
David C. Anastasiu,
Zheng Tang,
Ming-Ching Chang,
Yue Yao,
Liang Zheng,
Mohammed Shaiqur Rahman,
Meenakshi S. Arya,
Anuj Sharma,
Qi Feng,
Vitaly Ablavsky,
Stan Sclaroff,
Pranamesh Chakraborty,
Sanjita Prajapati,
Alice Li,
Shangru Li,
Krishna Kunadharaju,
Shenxin Jiang,
Rama Chellappa
Abstract:
The AI City Challenge's seventh edition emphasizes two domains at the intersection of computer vision and artificial intelligence - retail business and Intelligent Traffic Systems (ITS) - that have considerable untapped potential. The 2023 challenge had five tracks, which drew a record-breaking number of participation requests from 508 teams across 46 countries. Track 1 was a brand new track that…
▽ More
The AI City Challenge's seventh edition emphasizes two domains at the intersection of computer vision and artificial intelligence - retail business and Intelligent Traffic Systems (ITS) - that have considerable untapped potential. The 2023 challenge had five tracks, which drew a record-breaking number of participation requests from 508 teams across 46 countries. Track 1 was a brand new track that focused on multi-target multi-camera (MTMC) people tracking, where teams trained and evaluated using both real and highly realistic synthetic data. Track 2 centered around natural-language-based vehicle track retrieval. Track 3 required teams to classify driver actions in naturalistic driving analysis. Track 4 aimed to develop an automated checkout system for retail stores using a single view camera. Track 5, another new addition, tasked teams with detecting violations of the helmet rule for motorcyclists. Two leader boards were released for submissions based on different methods: a public leader board for the contest where external private data wasn't allowed and a general leader board for all results submitted. The participating teams' top performances established strong baselines and even outperformed the state-of-the-art in the proposed challenge tracks.
△ Less
Submitted 15 April, 2023;
originally announced April 2023.
-
Optimal Volume-Sensitive Bounds for Polytope Approximation
Authors:
Sunil Arya,
David M. Mount
Abstract:
Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Consider a convex body $K$ of diameter $Δ$ in $\textbf{R}^d$ for fixed $d$. The objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. It is known from classical results of Dudley (1974) and…
▽ More
Approximating convex bodies is a fundamental question in geometry and has a wide variety of applications. Consider a convex body $K$ of diameter $Δ$ in $\textbf{R}^d$ for fixed $d$. The objective is to minimize the number of vertices (alternatively, the number of facets) of an approximating polytope for a given Hausdorff error $\varepsilon$. It is known from classical results of Dudley (1974) and Bronshteyn and Ivanov (1976) that $Θ((Δ/\varepsilon)^{(d-1)/2})$ vertices (alternatively, facets) are both necessary and sufficient. While this bound is tight in the worst case, that of Euclidean balls, it is far from optimal for skinny convex bodies.
A natural way to characterize a convex object's skinniness is in terms of its relationship to the Euclidean ball. Given a convex body $K$, define its \emph{volume diameter} $Δ_d$ to be the diameter of a Euclidean ball of the same volume as $K$, and define its \emph{surface diameter} $Δ_{d-1}$ analogously for surface area. It follows from generalizations of the isoperimetric inequality that $Δ\geq Δ_{d-1} \geq Δ_d$.
Arya, da Fonseca, and Mount (SoCG 2012) demonstrated that the diameter-based bound could be made surface-area sensitive, improving the above bound to $O((Δ_{d-1}/\varepsilon)^{(d-1)/2})$. In this paper, we strengthen this by proving the existence of an approximation with $O((Δ_d/\varepsilon)^{(d-1)/2})$ facets.
△ Less
Submitted 16 March, 2023;
originally announced March 2023.
-
Economical Convex Coverings and Applications
Authors:
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
Coverings of convex bodies have emerged as a central component in the design of efficient solutions to approximation problems involving convex bodies. Intuitively, given a convex body $K$ and $ε> 0$, a covering is a collection of convex bodies whose union covers $K$ such that a constant factor expansion of each body lies within an $ε$ expansion of $K$. Coverings have been employed in many applicat…
▽ More
Coverings of convex bodies have emerged as a central component in the design of efficient solutions to approximation problems involving convex bodies. Intuitively, given a convex body $K$ and $ε> 0$, a covering is a collection of convex bodies whose union covers $K$ such that a constant factor expansion of each body lies within an $ε$ expansion of $K$. Coverings have been employed in many applications, such as approximations for diameter, width, and $ε$-kernels of point sets, approximate nearest neighbor searching, polytope approximations, and approximations to the Closest Vector Problem (CVP).
It is known how to construct coverings of size $n^{O(n)} / ε^{(n-1)/2}$ for general convex bodies in $\textbf{R}^n$. In special cases, such as when the convex body is the $\ell_p$ unit ball, this bound has been improved to $2^{O(n)} / ε^{(n-1)/2}$. This raises the question of whether such a bound generally holds. In this paper we answer the question in the affirmative.
We demonstrate the power and versatility of our coverings by applying them to the problem of approximating a convex body by a polytope, under the Banach-Mazur metric. Given a well-centered convex body $K$ and an approximation parameter $ε> 0$, we show that there exists a polytope $P$ consisting of $2^{O(n)} / ε^{(n-1)/2}$ vertices (facets) such that $K \subset P \subset K(1+ε)$. This bound is optimal in the worst case up to factors of $2^{O(n)}$. As an additional consequence, we obtain the fastest $(1+ε)$-approximate CVP algorithm that works in any norm, with a running time of $2^{O(n)} / ε^{(n-1)/2}$ up to polynomial factors in the input size, and we obtain the fastest $(1+ε)$-approximation algorithm for integer programming. We also present a framework for constructing coverings of optimal size for any convex body (up to factors of $2^{O(n)}$).
△ Less
Submitted 15 March, 2023;
originally announced March 2023.
-
Effect of channel dimensions and Reynolds numbers on the turbulence modulation for particle-laden turbulent channel flows
Authors:
Naveen Rohilla,
Siddhi Arya,
Partha Sarathi Goswami
Abstract:
The addition of particles to turbulent flows changes the underlying mechanism of turbulence and leads to turbulence modulation. Different temporal and spatial scales for both phases make it challenging to understand turbulence modulation via one parameter. The important parameters are particle Stokes number, mass loading, particle Reynolds number, fluid bulk Reynolds number, etc., that act togethe…
▽ More
The addition of particles to turbulent flows changes the underlying mechanism of turbulence and leads to turbulence modulation. Different temporal and spatial scales for both phases make it challenging to understand turbulence modulation via one parameter. The important parameters are particle Stokes number, mass loading, particle Reynolds number, fluid bulk Reynolds number, etc., that act together and affect the fluid phase turbulence intensities. In the present study, we have carried out the large eddy simulations for different system sizes (2δ/dp = 54, 81, and 117) and fluid bulk Reynolds numbers (Re_b = 5600 and 13750) to quantify the extent of turbulence attenuation. Here, δ is the half-channel width, dp is the particle diameter, and Re_b is the fluid Reynolds number based on the fluid bulk velocity and channel width. The point particles are tracked with the Lagrangian approach. The scaling analysis of the feedback force shows that system size and fluid bulk Reynolds number are the two crucial parameters that affect the turbulence modulation more significantly than the other. The streamwise turbulent structures are observed to become lengthier and fewer with an increase in system size for the same volume fraction and fixed bulk Reynolds number. However, the streamwise high-speed streaks are smaller, thinner, and closely spaced for higher Reynolds numbers than the lower ones for the same volume fraction. In particle statistics, it is observed that the scaled particle fluctuations increase with the increase in system size while keeping the Reynolds number fixed. However, the scaled particle fluctuations decrease with the increase in fluid bulk Reynolds number for the same volume fraction and fixed system size. The present study highlights the scaling issue for designing industrial equipment for particle-laden turbulent flows.
△ Less
Submitted 23 February, 2023;
originally announced February 2023.
-
Deep Dependency Networks for Multi-Label Classification
Authors:
Shivvrat Arya,
Yu Xiang,
Vibhav Gogate
Abstract:
We propose a simple approach which combines the strengths of probabilistic graphical models and deep learning architectures for solving the multi-label classification task, focusing specifically on image and video data. First, we show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved by leveraging more powerful methods such…
▽ More
We propose a simple approach which combines the strengths of probabilistic graphical models and deep learning architectures for solving the multi-label classification task, focusing specifically on image and video data. First, we show that the performance of previous approaches that combine Markov Random Fields with neural networks can be modestly improved by leveraging more powerful methods such as iterative join graph propagation, integer linear programming, and $\ell_1$ regularization-based structure learning. Then we propose a new modeling framework called deep dependency networks, which augments a dependency network, a model that is easy to train and learns more accurate dependencies but is limited to Gibbs sampling for inference, to the output layer of a neural network. We show that despite its simplicity, jointly learning this new architecture yields significant improvements in performance over the baseline neural network. In particular, our experimental evaluation on three video activity classification datasets: Charades, Textually Annotated Cooking Scenes (TACoS), and Wetlab, and three multi-label image classification datasets: MS-COCO, PASCAL VOC, and NUS-WIDE show that deep dependency networks are almost always superior to pure neural architectures that do not use dependency networks.
△ Less
Submitted 6 February, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Predictions of damages from Atlantic tropical cyclones: a hierarchical Bayesian study on extremes
Authors:
Lindsey Dietz,
Sakshi Arya,
Snigdhansu Chatterjee
Abstract:
Bayesian hierarchical models are proposed for modeling tropical cyclone characteristics and their damage potential in the Atlantic basin. We model the joint probability distribution of tropical cyclone characteristics and their damage potential at two different temporal scales, while taking several climate indices into account. First, a predictive model for an entire season is developed that forec…
▽ More
Bayesian hierarchical models are proposed for modeling tropical cyclone characteristics and their damage potential in the Atlantic basin. We model the joint probability distribution of tropical cyclone characteristics and their damage potential at two different temporal scales, while taking several climate indices into account. First, a predictive model for an entire season is developed that forecasts the number of cyclone events that will take place, the probability of each cyclone causing some amount of damage, and the monetized value of damages. Then, specific characteristics of individual cyclones are considered to predict the monetized value of the damage it will cause. Robustness studies are conducted and excellent prediction power is demonstrated across different data science models and evaluation techniques.
△ Less
Submitted 16 August, 2022;
originally announced August 2022.
-
A Sheaf-Theoretic Construction of Shape Space
Authors:
Shreya Arya,
Justin Curry,
Sayan Mukherjee
Abstract:
We present a sheaf-theoretic construction of shape space -- the space of all shapes. We do this by describing a homotopy sheaf on the poset category of constructible sets, where each set is mapped to its Persistent Homology Transform (PHT). Recent results that build on fundamental work of Schapira have shown that this transform is injective, thus making the PHT a good summary object for each shape…
▽ More
We present a sheaf-theoretic construction of shape space -- the space of all shapes. We do this by describing a homotopy sheaf on the poset category of constructible sets, where each set is mapped to its Persistent Homology Transform (PHT). Recent results that build on fundamental work of Schapira have shown that this transform is injective, thus making the PHT a good summary object for each shape. Our homotopy sheaf result allows us to "glue" PHTs of different shapes together to build up the PHT of a larger shape. In the case where our shape is a polyhedron we prove a generalized nerve lemma for the PHT. Finally, by re-examining the sampling result of Smale-Niyogi-Weinberger, we show that we can reliably approximate the PHT of a manifold by a polyhedron up to arbitrary precision.
△ Less
Submitted 23 June, 2023; v1 submitted 19 April, 2022;
originally announced April 2022.
-
Dimensionality Reduction for $k$-Distance Applied to Persistent Homology
Authors:
Shreya Arya,
Jean-Daniel Boissonnat,
Kunal Dutta,
Martin Lotz
Abstract:
Given a set P of n points and a constant k, we are interested in computing the persistent homology of the Cech filtration of P for the k-distance, and investigate the effectiveness of dimensionality reduction for this problem, answering an open question of Sheehy [Proc. SoCG, 2014]. We show that any linear transformation that preserves pairwise distances up to a (1 +/- e) multiplicative factor, mu…
▽ More
Given a set P of n points and a constant k, we are interested in computing the persistent homology of the Cech filtration of P for the k-distance, and investigate the effectiveness of dimensionality reduction for this problem, answering an open question of Sheehy [Proc. SoCG, 2014]. We show that any linear transformation that preserves pairwise distances up to a (1 +/- e) multiplicative factor, must preserve the persistent homology of the Cech filtration up to a factor of (1-e)^(-1). Our results also show that the Vietoris-Rips and Delaunay filtrations for the k-distance, as well as the Cech filtration for the approximate k-distance of Buchet et al. [J. Comput. Geom., 2016] are preserved up to a (1 +/- e) factor. We also prove extensions of our main theorem, for point sets (i) lying in a region of bounded Gaussian width or (ii) on a low-dimensional submanifold, obtaining embeddings having the dimension bounds of Lotz [Proc. Roy. Soc., 2019] and Clarkson [Proc. SoCG, 2008] respectively. Our results also work in the terminal dimensionality reduction setting, where the distance of any point in the original ambient space, to any point in P, needs to be approximately preserved.
△ Less
Submitted 12 October, 2021;
originally announced October 2021.
-
Microcontroller Based Load Monitoring System
Authors:
A N Madhavanunni,
S S Arya,
Renjith Kumar D
Abstract:
The demand for power has increased exponentially over the last century. One avenue through which today's energy problems can be addressed is through the reduction of energy usage in households. This has increased the emphasis on the need for accurate and economic methods of power measurement. The goal of providing such data is to optimize and reduce their power consumption. In view of this, the pr…
▽ More
The demand for power has increased exponentially over the last century. One avenue through which today's energy problems can be addressed is through the reduction of energy usage in households. This has increased the emphasis on the need for accurate and economic methods of power measurement. The goal of providing such data is to optimize and reduce their power consumption. In view of this, the present manuscript focuses on the design and implementation of precise and reliable load monitoring system using PIC microcontroller chip (PIC16F877A). This involves an accurate sensing of voltage, current and power factor of the load. A clever utilization of in-built ADC and timers of the microcontroller reduces the design complexity of the system. The proposed system monitors the load continuously on a real time basis and displays the parameters such as voltage, current, power factor, active, reactive and apparent powers in an LCD module. The use of microcontroller reduces the cost and makes the device compact. The proposed system has been implemented and tested in the laboratory for single phase loads.
△ Less
Submitted 11 August, 2021;
originally announced August 2021.
-
Internet of Things (IoT) Based Video Analytics: a use case of Smart Doorbell
Authors:
Shailesh Arya
Abstract:
The vision of the internet of things (IoT) is a reality now. IoT devices are getting cheaper, smaller. They are becoming more and more computationally and energy-efficient. The global market of IoT-based video analytics has seen significant growth in recent years and it is expected to be a growing market segment. For any IoT-based video analytics application, few key points required, such as cost-…
▽ More
The vision of the internet of things (IoT) is a reality now. IoT devices are getting cheaper, smaller. They are becoming more and more computationally and energy-efficient. The global market of IoT-based video analytics has seen significant growth in recent years and it is expected to be a growing market segment. For any IoT-based video analytics application, few key points required, such as cost-effectiveness, widespread use, flexible design, accurate scene detection, reusability of the framework. Video-based smart doorbell system is one such application domain for video analytics where many commercial offerings are available in the consumer market. However, such existing offerings are costly, monolithic, and proprietary. Also, there will be a trade-off between accuracy and portability. To address the foreseen problems, I'm proposing a distributed framework for video analytics with a use case of a smart doorbell system. The proposed framework uses AWS cloud services as a base platform and to meet the price affordability constraint, the system was implemented on affordable Raspberry Pi. The smart doorbell will be able to recognize the known/unknown person with at most accuracy. The smart doorbell system is also having additional detection functionalities such as harmful weapon detection, noteworthy vehicle detection, animal/pet detection. An iOS application is specifically developed for this implementation which can receive the notification from the smart doorbell in real-time. Finally, the paper also mentions the classical approaches for video analytics, their feasibility in implementing with this use-case, and comparative analysis in terms of accuracy and time required to detect an object in the frame is carried out. Results conclude that AWS cloud-based approach is worthy for this smart doorbell use case.
△ Less
Submitted 13 September, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
-
The Influence of Social Networks on Human Society
Authors:
Shreyash Arya
Abstract:
This report gives a brief overview of the origin of social networks and their most popular manifestation in the modern era - the Online Social Networks (OSNs) or social media. It further discusses the positive and negative implications of OSNs on human society. The coupling of Data Science and social media (social media mining) is then put forward as a powerful tool to overcome the current challen…
▽ More
This report gives a brief overview of the origin of social networks and their most popular manifestation in the modern era - the Online Social Networks (OSNs) or social media. It further discusses the positive and negative implications of OSNs on human society. The coupling of Data Science and social media (social media mining) is then put forward as a powerful tool to overcome the current challenges and pave the path for futuristic advancements
△ Less
Submitted 24 March, 2021;
originally announced March 2021.
-
Adaptive estimation of a function from its Exponential Radon Transform in presence of noise
Authors:
Anuj Abhishek,
Sakshi Arya
Abstract:
In this article we propose a locally adaptive strategy for estimating a function from its Exponential Radon Transform (ERT) data, without prior knowledge of the smoothness of functions that are to be estimated. We build a non-parametric kernel type estimator and show that for a class of functions comprising a wide Sobolev regularity scale, our proposed strategy follows the minimax optimal rate up…
▽ More
In this article we propose a locally adaptive strategy for estimating a function from its Exponential Radon Transform (ERT) data, without prior knowledge of the smoothness of functions that are to be estimated. We build a non-parametric kernel type estimator and show that for a class of functions comprising a wide Sobolev regularity scale, our proposed strategy follows the minimax optimal rate up to a $\log{n}$ factor. We also show that there does not exist an optimal adaptive estimator on the Sobolev scale when the pointwise risk is used and in fact the rate achieved by the proposed estimator is the adaptive rate of convergence.
△ Less
Submitted 13 November, 2020;
originally announced November 2020.
-
A Distributed Framework to Orchestrate Video Analytics Applications
Authors:
Tapan Pathak,
Vatsal Patel,
Sarth Kanani,
Shailesh Arya,
Pankesh Patel,
Muhammad Intizar Ali,
John Breslin
Abstract:
The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyones attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in scientific literature and many video-based doorbell solutions are commercially available in the market. However, contemporary of…
▽ More
The concept of the Internet of Things (IoT) is a reality now. This paradigm shift has caught everyones attention in a large class of applications, including IoT-based video analytics using smart doorbells. Due to its growing application segments, various efforts exist in scientific literature and many video-based doorbell solutions are commercially available in the market. However, contemporary offerings are bespoke, offering limited composability and reusability of a smart doorbell framework. Second, they are monolithic and proprietary, which means that the implementation details remain hidden from the users. We believe that a transparent design can greatly aid in the development of a smart doorbell, enabling its use in multiple application domains.
To address the above-mentioned challenges, we propose a distributed framework to orchestrate video analytics across Edge and Cloud resources. We investigate trade-offs in the distribution of different software components over a bespoke/full system, where components over Edge and Cloud are treated generically. This paper evaluates the proposed framework as well as the state-of-the-art models and presents comparative analysis of them on various metrics (such as overall model accuracy, latency, memory, and CPU usage). The evaluation result demonstrates our intuition very well, showcasing that the AWS-based approach exhibits reasonably high object-detection accuracy, low memory, and CPU usage when compared to the state-of-the-art approaches, but high latency.
△ Less
Submitted 17 September, 2020;
originally announced September 2020.
-
Effect of lockdown interventions to control the COVID-19 epidemic in India
Authors:
Ankit Sharma,
Shreyash Arya,
Shashee Kumari,
Arnab Chatterjee
Abstract:
The pandemic caused by the novel Coronavirus SARS-CoV2 has been responsible for life threatening health complications, and extreme pressure on healthcare systems. While preventive and definite curative medical interventions are yet to arrive, Non-Pharmaceutical Interventions (NPIs) like physical isolation, quarantine and drastic social measures imposed by governing agencies are effective in arrest…
▽ More
The pandemic caused by the novel Coronavirus SARS-CoV2 has been responsible for life threatening health complications, and extreme pressure on healthcare systems. While preventive and definite curative medical interventions are yet to arrive, Non-Pharmaceutical Interventions (NPIs) like physical isolation, quarantine and drastic social measures imposed by governing agencies are effective in arresting the spread of infections in a population. In densely populated countries like India, lockdown interventions are partially effective due to social and administrative complexities. Using detailed demographic data, we present an agent based model to imitate the behavior of the population and its mobility features, even under intervention. We demonstrate the effectiveness of contact tracing policies and how our model efficiently relates to empirical findings on testing efficiency. We also present various lockdown intervention strategies for mitigation - using the bare number of infections, the effective reproduction rate, as well as using reinforcement learning. Our analysis can help assess the socio-economic consequences of such interventions, and provide useful ideas and insights to policy makers for better decision making.
△ Less
Submitted 7 September, 2020;
originally announced September 2020.
-
Implementation of Google Assistant & Amazon Alexa on Raspberry Pi
Authors:
Shailesh D. Arya,
Samir Patel
Abstract:
This paper investigates the implementation of voice-enabled Google Assistant and Amazon Alexa on Raspberry Pi. Virtual Assistants are being a new trend in how we interact or do computations with physical devices. A voice-enabled system essentially means a system that processes voice as an input, decodes, or understands the meaning of that input and generates an appropriate voice output. In this pa…
▽ More
This paper investigates the implementation of voice-enabled Google Assistant and Amazon Alexa on Raspberry Pi. Virtual Assistants are being a new trend in how we interact or do computations with physical devices. A voice-enabled system essentially means a system that processes voice as an input, decodes, or understands the meaning of that input and generates an appropriate voice output. In this paper, we are developing a smart speaker prototype that has the functionalities of both in the same Raspberry Pi. Users can invoke a virtual assistant by saying the hot words and can leverage the best services of both eco-systems. This paper also explains the complex architecture of Google Assistant and Amazon Alexa and the working of both assistants as well. Later, this system can be used to control the smart home IoT devices.
△ Less
Submitted 15 June, 2020;
originally announced June 2020.
-
To update or not to update? Delayed Nonparametric Bandits with Randomized Allocation
Authors:
Sakshi Arya,
Yuhong Yang
Abstract:
Delayed rewards problem in contextual bandits has been of interest in various practical settings. We study randomized allocation strategies and provide an understanding on how the exploration-exploitation tradeoff is affected by delays in observing the rewards. In randomized strategies, the extent of exploration-exploitation is controlled by a user-determined exploration probability sequence. In t…
▽ More
Delayed rewards problem in contextual bandits has been of interest in various practical settings. We study randomized allocation strategies and provide an understanding on how the exploration-exploitation tradeoff is affected by delays in observing the rewards. In randomized strategies, the extent of exploration-exploitation is controlled by a user-determined exploration probability sequence. In the presence of delayed rewards, one may choose between using the original exploration sequence that updates at every time point or update the sequence only when a new reward is observed, leading to two competing strategies. In this work, we show that while both strategies may lead to strong consistency in allocation, the property holds for a wider scope of situations for the latter. However, for finite sample performance, we illustrate that both strategies have their own advantages and disadvantages, depending on the severity of the delay and underlying reward generating mechanisms.
△ Less
Submitted 26 May, 2020;
originally announced May 2020.
-
Smart Attendance System Usign CNN
Authors:
Shailesh Arya,
Hrithik Mesariya,
Vishal Parekh
Abstract:
The research on the attendance system has been going for a very long time, numerous arrangements have been proposed in the last decade to make this system efficient and less time consuming, but all those systems have several flaws. In this paper, we are introducing a smart and efficient system for attendance using face detection and face recognition. This system can be used to take attendance in c…
▽ More
The research on the attendance system has been going for a very long time, numerous arrangements have been proposed in the last decade to make this system efficient and less time consuming, but all those systems have several flaws. In this paper, we are introducing a smart and efficient system for attendance using face detection and face recognition. This system can be used to take attendance in colleges or offices using real-time face recognition with the help of the Convolution Neural Network(CNN). The conventional methods like Eigenfaces and Fisher faces are sensitive to lighting, noise, posture, obstruction, illumination etc. Hence, we have used CNN to recognize the face and overcome such difficulties. The attendance records will be updated automatically and stored in an excel sheet as well as in a database. We have used MongoDB as a backend database for attendance records.
△ Less
Submitted 22 April, 2020;
originally announced April 2020.
-
Optimal Bound on the Combinatorial Complexity of Approximating Polytopes
Authors:
Rahul Arya,
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
This paper considers the question of how to succinctly approximate a multidimensional convex body by a polytope. Given a convex body $K$ of unit diameter in Euclidean $d$-dimensional space (where $d$ is a constant) and an error parameter $\varepsilon > 0$, the objective is to determine a convex polytope of low combinatorial complexity whose Hausdorff distance from $K$ is at most $\varepsilon$. By…
▽ More
This paper considers the question of how to succinctly approximate a multidimensional convex body by a polytope. Given a convex body $K$ of unit diameter in Euclidean $d$-dimensional space (where $d$ is a constant) and an error parameter $\varepsilon > 0$, the objective is to determine a convex polytope of low combinatorial complexity whose Hausdorff distance from $K$ is at most $\varepsilon$. By combinatorial complexity we mean the total number of faces of all dimensions. Classical constructions by Dudley and Bronshteyn/Ivanov show that $O(1/\varepsilon^{(d-1)/2})$ facets or vertices are possible, respectively, but neither achieves both bounds simultaneously. In this paper, we show that it is possible to construct a polytope with $O(1/\varepsilon^{(d-1)/2})$ combinatorial complexity, which is optimal in the worst case.
Our result is based on a new relationship between $\varepsilon$-width caps of a convex body and its polar body. Using this relationship, we are able to obtain a volume-sensitive bound on the number of approximating caps that are "essentially different." We achieve our main result by combining this with a variant of the witness-collector method and a novel variable-thickness layered construction of the economical cap covering.
△ Less
Submitted 24 August, 2022; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Discovery of Ionic Impact Ionization (I3) in Perovskites Triggered by a Single Photon
Authors:
Zihan Xu,
Yugang Yu,
Iftikhar Ahmad Niaz,
Yimu Chen,
Shaurya Arya,
Yusheng Lei,
Mohammad Abu Raihan Miah,
Jiayun Zhou,
Alex Ce Zhang,
Lujiang Yan,
Sheng Xu,
Kenji Nomura,
Yu-Hwa Lo
Abstract:
Organic-inorganic metal halide perovskite devices have generated significant interest for LED, photodetector, and solar cell applications due to their attractive optoelectronic properties and substrate-choice flexibility1-4. These devices exhibit slow time-scale response, which have been explained by point defect migration5-6. In this work, we report the discovery of a room temperature intrinsic a…
▽ More
Organic-inorganic metal halide perovskite devices have generated significant interest for LED, photodetector, and solar cell applications due to their attractive optoelectronic properties and substrate-choice flexibility1-4. These devices exhibit slow time-scale response, which have been explained by point defect migration5-6. In this work, we report the discovery of a room temperature intrinsic amplification process in methylammonium lead iodide perovskite (MAPbI3) that can be triggered by few photons, down to a single photon. The electrical properties of the material, by way of photoresponse, are modified by an input energy as small as 0.19 attojoules, the energy of a single photon. These observations cannot be explained by photo-excited electronic band-to-band transitions or prevailing model of photo-excited point defect migration since none of the above can explain the observed macroscopic property change by absorption of single or few photons. The results suggest the existence of an avalanche-like collective motion of iodides and their accumulation near the anode, which we will call ionic impact ionization (I3 mechanism). The proposed I3 process is the ionic analog of the electronic impact ionization, and has been considered impossible before because conventionally it takes far more energy to move ions out of their equilibrium position than electrons. We have performed first principle calculations to show that in MAPbI3 the activation energy for the I3 mechanism is appreciably lower than the literature value of the activation energy for the electronic impact ionization. The discovery of I3 process in perovskite material opens up possibilities for new classes of devices for photonic and electronic applications.
△ Less
Submitted 12 June, 2019; v1 submitted 6 June, 2019;
originally announced June 2019.
-
Randomized Allocation with Nonparametric Estimation for Contextual Multi-Armed Bandits with Delayed Rewards
Authors:
Sakshi Arya,
Yuhong Yang
Abstract:
We study a multi-armed bandit problem with covariates in a setting where there is a possible delay in observing the rewards. Under some mild assumptions on the probability distributions for the delays and using an appropriate randomization to select the arms, the proposed strategy is shown to be strongly consistent.
We study a multi-armed bandit problem with covariates in a setting where there is a possible delay in observing the rewards. Under some mild assumptions on the probability distributions for the delays and using an appropriate randomization to select the arms, the proposed strategy is shown to be strongly consistent.
△ Less
Submitted 4 September, 2019; v1 submitted 2 February, 2019;
originally announced February 2019.
-
Approximate Convex Intersection Detection with Applications to Width and Minkowski Sums
Authors:
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
Approximation problems involving a single convex body in $d$-dimensional space have received a great deal of attention in the computational geometry community. In contrast, works involving multiple convex bodies are generally limited to dimensions $d \leq 3$ and/or do not consider approximation. In this paper, we consider approximations to two natural problems involving multiple convex bodies: det…
▽ More
Approximation problems involving a single convex body in $d$-dimensional space have received a great deal of attention in the computational geometry community. In contrast, works involving multiple convex bodies are generally limited to dimensions $d \leq 3$ and/or do not consider approximation. In this paper, we consider approximations to two natural problems involving multiple convex bodies: detecting whether two polytopes intersect and computing their Minkowski sum. Given an approximation parameter $\varepsilon > 0$, we show how to independently preprocess two polytopes $A,B$ into data structures of size $O(1/\varepsilon^{(d-1)/2})$ such that we can answer in polylogarithmic time whether $A$ and $B$ intersect approximately. More generally, we can answer this for the images of $A$ and $B$ under affine transformations. Next, we show how to $\varepsilon$-approximate the Minkowski sum of two given polytopes defined as the intersection of $n$ halfspaces in $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2 + α})$ time, for any constant $α> 0$. Finally, we present a surprising impact of these results to a well studied problem that considers a single convex body. We show how to $\varepsilon$-approximate the width of a set of $n$ points in $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2 + α})$ time, for any constant $α> 0$, a major improvement over the previous bound of roughly $O(n + 1/\varepsilon^{d-1})$ time.
△ Less
Submitted 2 July, 2018;
originally announced July 2018.
-
Near-Optimal $\varepsilon$-Kernel Construction and Related Problems
Authors:
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
The computation of (i) $\varepsilon$-kernels, (ii) approximate diameter, and (iii) approximate bichromatic closest pair are fundamental problems in geometric approximation. In this paper, we describe new algorithms that offer significant improvements to their running times. In each case the input is a set of $n$ points in $R^d$ for a constant dimension $d \geq 3$ and an approximation parameter…
▽ More
The computation of (i) $\varepsilon$-kernels, (ii) approximate diameter, and (iii) approximate bichromatic closest pair are fundamental problems in geometric approximation. In this paper, we describe new algorithms that offer significant improvements to their running times. In each case the input is a set of $n$ points in $R^d$ for a constant dimension $d \geq 3$ and an approximation parameter $\varepsilon > 0$. We reduce the respective running times (i) from $O((n + 1/\varepsilon^{d-2})\log(1/\varepsilon))$ to $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2+α})$, (ii) from $O((n + 1/\varepsilon^{d-2})\log(1/\varepsilon))$ to $O(n \log(1/\varepsilon) + 1/\varepsilon^{(d-1)/2+α})$, and (iii) from $O(n / \varepsilon^{d/3})$ to $O(n / \varepsilon^{d/4+α}),$ for an arbitrarily small constant $α> 0$. Result (i) is nearly optimal since the size of the output $\varepsilon$-kernel is $Θ(1/\varepsilon^{(d-1)/2})$ in the worst case.
These results are all based on an efficient decomposition of a convex body using a hierarchy of Macbeath regions, and contrast to previous solutions that decompose space using quadtrees and grids. By further application of these techniques, we also show that it is possible to obtain near-optimal preprocessing time for the most efficient data structures to approximately answer queries for (iv) nearest-neighbor searching, (v) directional width, and (vi) polytope membership.
△ Less
Submitted 31 March, 2017;
originally announced March 2017.
-
Optimal Approximate Polytope Membership
Authors:
Sunil Arya,
Guilherme D. da Fonseca,
David M. Mount
Abstract:
In the polytope membership problem, a convex polytope $K$ in $R^d$ is given, and the objective is to preprocess $K$ into a data structure so that, given a query point $q \in R^d$, it is possible to determine efficiently whether $q \in K$. We consider this problem in an approximate setting and assume that $d$ is a constant. Given an approximation parameter $\varepsilon > 0$, the query can be answer…
▽ More
In the polytope membership problem, a convex polytope $K$ in $R^d$ is given, and the objective is to preprocess $K$ into a data structure so that, given a query point $q \in R^d$, it is possible to determine efficiently whether $q \in K$. We consider this problem in an approximate setting and assume that $d$ is a constant. Given an approximation parameter $\varepsilon > 0$, the query can be answered either way if the distance from $q$ to $K$'s boundary is at most $\varepsilon$ times $K$'s diameter. Previous solutions to the problem were on the form of a space-time trade-off, where logarithmic query time demands $O(1/\varepsilon^{d-1})$ storage, whereas storage $O(1/\varepsilon^{(d-1)/2})$ admits roughly $O(1/\varepsilon^{(d-1)/8})$ query time. In this paper, we present a data structure that achieves logarithmic query time with storage of only $O(1/\varepsilon^{(d-1)/2})$, which matches the worst-case lower bound on the complexity of any $\varepsilon$-approximating polytope. Our data structure is based on a new technique, a hierarchy of ellipsoids defined as approximations to Macbeath regions.
As an application, we obtain major improvements to approximate Euclidean nearest neighbor searching. Notably, the storage needed to answer $\varepsilon$-approximate nearest neighbor queries for a set of $n$ points in $O(\log \frac{n}{\varepsilon})$ time is reduced to $O(n/\varepsilon^{d/2})$. This halves the exponent in the $\varepsilon$-dependency of the existing space bound of roughly $O(n/\varepsilon^d)$, which has stood for 15 years (Har-Peled, 2001).
△ Less
Submitted 6 December, 2016;
originally announced December 2016.