Search | arXiv e-print repository

Cost for research -- how cost data of research can be included in open metadata to be reused and evaluated

Authors: Julia Bartlewski, Christoph Broschinski, Gernot Deinzer, Cornelia Lang, Dirk Pieper, Bianca Schweighofer, Colin Sippl, Lisa-Marie Stein, Alexander Wagner, Silke Weisheit

Abstract: The openCost project aims to enhance transparency in research funding by making publication-related costs publicly accessible, following FAIR principles. It introduces a metadata schema for cost data, allowing aggregation and analysis across institutions. The project promotes open access and cost-efficient models, benefiting academic institutions, funders, and policymakers.

Submitted 23 June, 2025; originally announced June 2025.

arXiv:2506.17842 [pdf, ps, other]

Generative Grasp Detection and Estimation with Concept Learning-based Safety Criteria

Authors: Al-Harith Farhad, Khalil Abuibaid, Christiane Plociennik, Achim Wagner, Martin Ruskowski

Abstract: Neural networks are often regarded as universal equations that can estimate any function. This flexibility, however, comes with the drawback of high complexity, rendering these networks into black box models, which is especially relevant in safety-centric applications. To that end, we propose a pipeline for a collaborative robot (Cobot) grasping algorithm that detects relevant tools and generates the optimal grasp. To increase the transparency and reliability of this approach, we integrate an explainable AI method that provides an explanation for the underlying prediction of a model by extracting the learned features and correlating them to corresponding classes from the input. These concepts are then used as additional criteria to ensure the safe handling of work tools. In this paper, we show the consistency of this approach and the criterion for improving the handover position. This approach was tested in an industrial environment, where a camera system was set up to enable a robot to pick up certain tools and objects.

Submitted 21 June, 2025; originally announced June 2025.

Comments: RAAD 2025: 34th International Conference on Robotics in Alpe-Adria-Danube Region

arXiv:2506.13131 [pdf, ps, other]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Authors: Alexander Novikov, Ngân Vũ, Marvin Eisenberger, Emilien Dupont, Po-Sen Huang, Adam Zsolt Wagner, Sergey Shirobokov, Borislav Kozlovskii, Francisco J. R. Ruiz, Abbas Mehrabian, M. Pawan Kumar, Abigail See, Swarat Chaudhuri, George Holland, Alex Davies, Sebastian Nowozin, Pushmeet Kohli, Matej Balog

Abstract: In this white paper, we present AlphaEvolve, an evolutionary coding agent that substantially enhances capabilities of state-of-the-art LLMs on highly challenging tasks such as tackling open scientific problems or optimizing critical pieces of computational infrastructure. AlphaEvolve orchestrates an autonomous pipeline of LLMs, whose task is to improve an algorithm by making direct changes to the code. Using an evolutionary approach, continuously receiving feedback from one or more evaluators, AlphaEvolve iteratively improves the algorithm, potentially leading to new scientific and practical discoveries. We demonstrate the broad applicability of this approach by applying it to a number of important computational problems. When applied to optimizing critical components of large-scale computational stacks at Google, AlphaEvolve developed a more efficient scheduling algorithm for data centers, found a functionally equivalent simplification in the circuit design of hardware accelerators, and accelerated the training of the LLM underpinning AlphaEvolve itself. Furthermore, AlphaEvolve discovered novel, provably correct algorithms that surpass state-of-the-art solutions on a spectrum of problems in mathematics and computer science, significantly expanding the scope of prior automated discovery methods (Romera-Paredes et al., 2023). Notably, AlphaEvolve developed a search algorithm that found a procedure to multiply two $4 \times 4$ complex-valued matrices using $48$ scalar multiplications; offering the first improvement, after 56 years, over Strassen's algorithm in this setting. We believe AlphaEvolve and coding agents like it can have a significant impact in improving solutions of problems across many areas of science and computation.

Submitted 16 June, 2025; originally announced June 2025.

arXiv:2505.14862 [pdf, ps, other]

Replay Attacks Against Audio Deepfake Detection

Authors: Nicolas Müller, Piotr Kawa, Wei-Herng Choong, Adriana Stan, Aditya Tirumala Bukkapatnam, Karla Pizzi, Alexander Wagner, Philip Sperl

Abstract: We show how replay attacks undermine audio deepfake detection: By playing and re-recording deepfake audio through various speakers and microphones, we make spoofed samples appear authentic to the detection model. To study this phenomenon in more detail, we introduce ReplayDF, a dataset of recordings derived from M-AILABS and MLAAD, featuring 109 speaker-microphone combinations across six languages and four TTS models. It includes diverse acoustic conditions, some highly challenging for detection. Our analysis of six open-source detection models across five datasets reveals significant vulnerability, with the top-performing W2V2-AASIST model's Equal Error Rate (EER) surging from 4.7% to 18.2%. Even with adaptive Room Impulse Response (RIR) retraining, performance remains compromised with an 11.0% EER. We release ReplayDF for non-commercial research use.

Submitted 1 June, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

Journal ref: Interspeech 2025

arXiv:2503.10724 [pdf, other]

Real-time Pollutant Identification through Optical PM Micro-Sensor

Authors: Elie Azeraf, Audrey Wagner, Emilie Bialic, Samia Mellah, Ludovic Lelandais

Abstract: Air pollution remains one of the most pressing environmental challenges of the modern era, significantly impacting human health, ecosystems, and climate. While traditional air quality monitoring systems provide critical data, their high costs and limited spatial coverage hinder effective real-time pollutant identification. Recent advancements in micro-sensor technology have improved data collection but still lack efficient methods for source identification. This paper explores the innovative application of machine learning (ML) models to classify pollutants in real-time using only data from optical micro-sensors. We propose a novel classification framework capable of distinguishing between four pollutant scenarios: Background Pollution, Ash, Sand, and Candle. Three Machine Learning (ML) approaches - XGBoost, Long Short-Term Memory networks, and Hidden Markov Chains - are evaluated for their effectiveness in sequence modeling and pollutant identification. Our results demonstrate the potential of leveraging micro-sensors and ML techniques to enhance air quality monitoring, offering actionable insights for urban planning and environmental protection.

Submitted 13 March, 2025; originally announced March 2025.

Comments: 11 pages, 4 figures

arXiv:2503.09919 [pdf, ps, other]

Drums of high width

Authors: Alex Davies, Prateek Gupta, Sebastien Racaniere, Grzegorz Swirszcz, Adam Zsolt Wagner, Theophane Weber, Geordie Williamson

Abstract: We provide a family of $5$-dimensional prismatoids whose width grows linearly in the number of vertices. This provides a new infinite family of counter-examples to the Hirsch conjecture whose excess width grows linearly in the number of vertices, and answers a question of Matschke, Santos and Weibel.

Submitted 12 March, 2025; originally announced March 2025.

Comments: 31 pages

arXiv:2502.05199 [pdf, other]

Advancing Geometry with AI: Multi-agent Generation of Polytopes

Authors: Grzegorz Swirszcz, Adam Zsolt Wagner, Geordie Williamson, Sam Blackwell, Bogdan Georgiev, Alex Davies, Ali Eslami, Sebastien Racaniere, Theophane Weber, Pushmeet Kohli

Abstract: Polytopes are one of the most primitive concepts underlying geometry. Discovery and study of polytopes with complex structures provides a means of advancing scientific knowledge. Construction of polytopes with specific extremal structure is very difficult and time-consuming. Having an automated tool for the generation of such extremal examples is therefore of great value. We present an Artificial Intelligence system capable of generating novel polytopes with very high complexity, whose abilities we demonstrate in three different and challenging scenarios: the Hirsch Conjecture, the k-neighbourly problem and the longest monotone paths problem. For each of these three problems the system was able to generate novel examples, which match or surpass the best previously known bounds. Our main focus was the Hirsch Conjecture, which had remained an open problem for over 50 years. The highly parallel A.I. system presented in this paper was able to generate millions of examples, with many of them surpassing best known previous results and possessing properties not present in the earlier human-constructed examples. For comparison, it took leading human experts over 50 years to handcraft the first example of a polytope exceeding the bound conjectured by Hirsch, and in the decade since humans were able to construct only a scarce few families of such counterexample polytopes. With the adoption of computer-aided methods, the creation of new examples of mathematical objects stops being a domain reserved only for human expertise. Advances in A.I. provide mathematicians with yet another powerful tool in advancing mathematical knowledge. The results presented demonstrate that A.I. is capable of addressing problems in geometry recognized as extremely hard, and also to produce extremal examples different in nature from the ones constructed by humans.

Submitted 30 January, 2025; originally announced February 2025.

Comments: 18 pages, 5 figures

MSC Class: 52B05; 52B55; 68T20

arXiv:2501.14053 [pdf, ps, other]

The Redundancy of Non-Singular Channel Simulation

Authors: Gergely Flamich, Sharang M. Sriramu, Aaron B. Wagner

Abstract: Channel simulation is an alternative to quantization and entropy coding for performing lossy source coding. Recently, channel simulation has gained significant traction in both the machine learning and information theory communities, as it integrates better with machine learning-based data compression algorithms and has better rate-distortion-perception properties than quantization. As the practical importance of channel simulation increases, it is vital to understand its fundamental limitations. Recently, Sriramu and Wagner provided an almost complete characterisation of the redundancy of channel simulation algorithms. In this paper, we complete this characterisation. First, we significantly extend a result of Li and El Gamal, and show that the redundancy of any instance of a channel simulation problem is lower bounded by the channel simulation divergence. Second, we give two proofs that the asymptotic redundancy of simulating iid non-singular channels is lower-bounded by $1/2$: one using a direct approach based on the asymptotic expansion of the channel simulation divergence and one using large deviations theory.

Submitted 1 May, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

arXiv:2501.10953 [pdf, other]

Channel Coding for Gaussian Channels with Mean and Variance Constraints

Authors: Adeel Mahmood, Aaron B. Wagner

Abstract: We consider channel coding for Gaussian channels with the recently introduced mean and variance cost constraints. Through matching converse and achievability bounds, we characterize the optimal first- and second-order performance. The main technical contribution of this paper is an achievability scheme which uses random codewords drawn from a mixture of three uniform distributions on $(n-1)$-spheres of radii $R_1, R_2$ and $R_3$, where $R_i = O(\sqrt{n})$ and $|R_i - R_j| = O(1)$. To analyze such a mixture distribution, we prove a lemma giving a uniform $O(\log n)$ bound, which holds with high probability, on the log ratio of the output distributions $Q_i^{cc}$ and $Q_j^{cc}$, where $Q_i^{cc}$ is induced by a random channel input uniformly distributed on an $(n-1)$-sphere of radius $R_i$. To facilitate the application of the usual central limit theorem, we also give a uniform $O(\log n)$ bound, which holds with high probability, on the log ratio of the output distributions $Q_i^{cc}$ and $Q^*_i$, where $Q_i^*$ is induced by a random channel input with i.i.d. components.

Submitted 19 January, 2025; originally announced January 2025.

arXiv:2411.19734 [pdf, other]

A Note on Small Percolating Sets on Hypercubes via Generative AI

Authors: Gergely Bérczi, Adam Zsolt Wagner

Abstract: We apply a generative AI pattern-recognition technique called PatternBoost to study bootstrap percolation on hypercubes. With this, we slightly improve the best existing upper bound for the size of percolating subsets of the hypercube.

Submitted 29 November, 2024; originally announced November 2024.

MSC Class: 68R10; 68T07; 05C35

arXiv:2411.00566 [pdf, other]

PatternBoost: Constructions in Mathematics with a Little Help from AI

Authors: François Charton, Jordan S. Ellenberg, Adam Zsolt Wagner, Geordie Williamson

Abstract: We introduce PatternBoost, a flexible method for finding interesting constructions in mathematics. Our algorithm alternates between two phases. In the first ``local'' phase, a classical search algorithm is used to produce many desirable constructions. In the second ``global'' phase, a transformer neural network is trained on the best such constructions. Samples from the trained transformer are then used as seeds for the first phase, and the process is repeated. We give a detailed introduction to this technique, and discuss the results of its application to several problems in extremal combinatorics. The performance of PatternBoost varies across different problems, but there are many situations where its performance is quite impressive. Using our technique, we find the best known solutions to several long-standing problems, including the construction of a counterexample to a conjecture that had remained open for 30 years.

Submitted 1 November, 2024; originally announced November 2024.

Comments: 32 pages

MSC Class: 05C35; 05D99; 68T20; 68V99

arXiv:2410.20256 [pdf]

doi 10.1007/s10514-022-10074-5

That was not what I was aiming at! Differentiating human intent and outcome in a physically dynamic throwing task

Authors: Vidullan Surendran, Alan R. Wagner

Abstract: Recognising intent in collaborative human robot tasks can improve team performance and human perception of robots. Intent can differ from the observed outcome in the presence of mistakes which are likely in physically dynamic tasks. We created a dataset of 1227 throws of a ball at a target from 10 participants and observed that 47% of throws were mistakes with 16% completely missing the target. Our research leverages facial images capturing the person's reaction to the outcome of a throw to predict when the resulting throw is a mistake and then we determine the actual intent of the throw. The approach we propose for outcome prediction performs 38% better than the two-stream architecture used previously for this task on front-on videos. In addition, we propose a 1-D CNN model which is used in conjunction with priors learned from the frequency of mistakes to provide an end-to-end pipeline for outcome and intent recognition in this throwing task.

Submitted 26 October, 2024; originally announced October 2024.

Comments: Accepted October 2022 in Autonomous Robots. Published December 2022

Journal ref: Auton Robot 47, 249-265 (2023)

arXiv:2409.20409 [pdf, other]

Physics-Regularized Multi-Modal Image Assimilation for Brain Tumor Localization

Authors: Michal Balcerak, Tamaz Amiranashvili, Andreas Wagner, Jonas Weidner, Petr Karnakov, Johannes C. Paetzold, Ivan Ezhov, Petros Koumoutsakos, Benedikt Wiestler, Bjoern Menze

Abstract: Physical models in the form of partial differential equations serve as important priors for many under-constrained problems. One such application is tumor treatment planning, which relies on accurately estimating the spatial distribution of tumor cells within a patient's anatomy. While medical imaging can detect the bulk of a tumor, it cannot capture the full extent of its spread, as low-concentration tumor cells often remain undetectable, particularly in glioblastoma, the most common primary brain tumor. Machine learning approaches struggle to estimate the complete tumor cell distribution due to a lack of appropriate training data. Consequently, most existing methods rely on physics-based simulations to generate anatomically and physiologically plausible estimations. However, these approaches face challenges with complex and unknown initial conditions and are constrained by overly rigid physical models. In this work, we introduce a novel method that integrates data-driven and physics-based cost functions, akin to Physics-Informed Neural Networks (PINNs). However, our approach parametrizes the solution directly on a dynamic discrete mesh, allowing for the effective modeling of complex biomechanical behaviors. Specifically, we propose a unique discretization scheme that quantifies how well the learned spatiotemporal distributions of tumor and brain tissues adhere to their respective growth and elasticity equations. This quantification acts as a regularization term, offering greater flexibility and improved integration of patient data compared to existing models. We demonstrate enhanced coverage of tumor recurrence areas using real-world data from a patient cohort, highlighting the potential of our method to improve model-driven treatment planning for glioblastoma in clinical practice.

Submitted 30 October, 2024; v1 submitted 30 September, 2024; originally announced September 2024.

Comments: Accepted to NeurIPS 2024

arXiv:2409.18589 [pdf, ps, other]

Towards Event-Triggered NMPC for Efficient 6G Communications: Experimental Results and Open Problems

Authors: Jens Püttschneider, Julian Golembiewski, Niklas A. Wagner, Christian Wietfeld, Timm Faulwasser

Abstract: Networked control systems enable real-time control and coordination of distributed systems, leveraging the low latency, high reliability, and massive connectivity offered by 5G and future 6G networks. Applications include autonomous vehicles, robotics, industrial automation, and smart grids. Despite networked control algorithms admitting nominal stability guarantees even in the presence of delays and packet dropouts, their practical performance still heavily depends on the specific characteristics and conditions of the underlying network. To achieve the desired performance while efficiently using communication resources, co-design of control and communication is pivotal. Although periodic schemes, where communication instances are fixed, can provide reliable control performance, unnecessary transmissions, when updates are not needed, result in inefficient usage of network resources. In this paper, we investigate the potential for co-design of model predictive control and network communication. To this end, we design and implement an event-triggered nonlinear model predictive controller for stabilizing a Furuta pendulum communicating over a tailored open radio access network 6G research platform. We analyze the control performance as well as network utilization under varying channel conditions and event-triggering criteria. Additionally, we analyze the network-induced delay pattern and its interaction with the event-triggered controller. Our results show that the event-triggered control scheme achieves similar performance to periodic control with reduced communication demand.

Submitted 26 June, 2025; v1 submitted 27 September, 2024; originally announced September 2024.

Comments: Accepted for presentation at IEEE ICCA 2025

arXiv:2409.07558 [pdf, other]

Unsupervised Point Cloud Registration with Self-Distillation

Authors: Christian Löwens, Thorben Funke, André Wagner, Alexandru Paul Condurache

Abstract: Rigid point cloud registration is a fundamental problem and highly relevant in robotics and autonomous driving. Nowadays deep learning methods can be trained to match a pair of point clouds, given the transformation between them. However, this training is often not scalable due to the high cost of collecting ground truth poses. Therefore, we present a self-distillation approach to learn point cloud registration in an unsupervised fashion. Here, each sample is passed to a teacher network and an augmented view is passed to a student network. The teacher includes a trainable feature extractor and a learning-free robust solver such as RANSAC. The solver forces consistency among correspondences and optimizes for the unsupervised inlier ratio, eliminating the need for ground truth labels. Our approach simplifies the training procedure by removing the need for initial hand-crafted features or consecutive point cloud frames as seen in related methods. We show that our method not only surpasses them on the RGB-D benchmark 3DMatch but also generalizes well to automotive radar, where classical features adopted by others fail. The code is available at https://github.com/boschresearch/direg .

Submitted 11 September, 2024; originally announced September 2024.

Comments: Oral at BMVC 2024

arXiv:2407.07015 [pdf, other]

A Framework for Multimodal Medical Image Interaction

Authors: Laura Schütz, Sasan Matinfar, Gideon Schafroth, Navid Navab, Merle Fairhurst, Arthur Wagner, Benedikt Wiestler, Ulrich Eck, Nassir Navab

Abstract: Medical doctors rely on images of the human anatomy, such as magnetic resonance imaging (MRI), to localize regions of interest in the patient during diagnosis and treatment. Despite advances in medical imaging technology, the information conveyance remains unimodal. This visual representation fails to capture the complexity of the real, multisensory interaction with human tissue. However, perceiving multimodal information about the patient's anatomy and disease in real-time is critical for the success of medical procedures and patient outcome. We introduce a Multimodal Medical Image Interaction (MMII) framework to allow medical experts a dynamic, audiovisual interaction with human tissue in three-dimensional space. In a virtual reality environment, the user receives physically informed audiovisual feedback to improve the spatial perception of anatomical structures. MMII uses a model-based sonification approach to generate sounds derived from the geometry and physical properties of tissue, thereby eliminating the need for hand-crafted sound design. Two user studies involving 34 general and nine clinical experts were conducted to evaluate the proposed interaction framework's learnability, usability, and accuracy. Our results showed excellent learnability of audiovisual correspondence as the rate of correct associations significantly improved (p < 0.001) over the course of the study. MMII resulted in superior brain tumor localization accuracy (p < 0.05) compared to conventional medical image interaction. Our findings substantiate the potential of this novel framework to enhance interaction with medical images, for example, during surgical procedures where immediate and precise feedback is needed.

Submitted 9 July, 2024; originally announced July 2024.

Comments: Accepted for publication in IEEE TVCG; presentation at IEEE ISMAR 2024

ACM Class: H.5.2; H.5.5; H.5.1; J.3

arXiv:2407.05260 [pdf, other]

Improved Channel Coding Performance Through Cost Variability

Authors: Adeel Mahmood, Aaron B. Wagner

Abstract: Channel coding for discrete memoryless channels (DMCs) with mean and variance cost constraints has been recently introduced. We show that there is an improvement in coding performance due to cost variability, both with and without feedback. We demonstrate this improvement over the traditional almost-sure cost constraint (also called the peak-power constraint) that prohibits any cost variation above a fixed threshold. Our result simultaneously shows that feedback does not improve the second-order coding rate of simple-dispersion DMCs under the peak-power constraint. This finding parallels similar results for unconstrained simple-dispersion DMCs, additive white Gaussian noise (AWGN) channels and parallel Gaussian channels.

Submitted 17 September, 2024; v1 submitted 7 July, 2024; originally announced July 2024.

arXiv:2405.13604 [pdf, other]

Skills Composition Framework for Reconfigurable Cyber-Physical Production Modules

Authors: Aleksandr Sidorenko, Achim Wagner, Martin Ruskowski

Abstract: While the benefits of reconfigurable manufacturing systems (RMS) are well-known, there are still challenges to their development, including, among others, a modular software architecture that enables rapid reconfiguration without much reprogramming effort. Skill-based engineering improves software modularity and increases the reconfiguration potential of RMS. Nevertheless, a skills' composition framework with a focus on frequent and rapid software changes is still missing. The Behavior trees (BTs) framework is a novel approach, which enables intuitive design of modular hierarchical control structures. BTs have been mostly explored from the AI and robotics perspectives, and little work has been done in investigating their potential for composing skills in the manufacturing domain. This paper proposes a framework for skills' composition and execution in skill-based reconfigurable cyber-physical production modules (RCPPMs). It is based on distributed BTs and provides good integration between low-level devices' specific code and AI-based task-oriented frameworks. We have implemented the provided models for the IEC 61499-based distributed automation controllers to show the instantiation of the proposed framework with the specific industrial technology and enable its evaluation by the automation community.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2404.14030 [pdf, other]

Towards Using Behavior Trees in Industrial Automation Controllers

Authors: Aleksandr Sidorenko, Mahdi Rezapour, Achim Wagner, Martin Ruskowski

Abstract: The Industry 4.0 paradigm manifests the shift towards mass customization and cyber-physical production systems (CPPS) and sets new requirements for industrial automation software in terms of modularity, flexibility, and short development cycles of control programs. Though programmable logic controllers (PLCs) have been evolving into versatile and powerful edge devices, there is a lack of PLC software flexibility and integration between low-level programs and high-level task-oriented control frameworks. Behavior trees (BTs) is a novel framework, which enables rapid design of modular hierarchical control structures. It combines improved modularity with a simple and intuitive design of control logic. This paper proposes an approach for improving the industrial control software design by integrating BTs into PLC programs and separating hardware related functionalities from the coordination logic. Several strategies for integration of BTs into PLCs are shown. The first two integrate BTs with the IEC 61131 based PLCs and are based on the use of the PLCopen Common Behavior Model. The last one utilizes event-based BTs and shows the integration with the IEC 61499 based controllers. An application example demonstrates the approach. The paper contributes in the following ways. First, we propose a new PLC software design, which improves modularity, supports better separation of concerns, and enables rapid development and reconfiguration of the control software. Second, we show and evaluate the integration of the BT framework into both IEC 61131 and IEC 61499 based PLCs, as well as the integration of the PLCopen function blocks with the external BT library. This leads to better integration of the low-level PLC code and the AI-based task-oriented frameworks. It also improves the skill-based programming approach for PLCs by using BTs for skills composition.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.01111 [pdf, other]

The Rate-Distortion-Perception Trade-off: The Role of Private Randomness

Authors: Yassine Hamdi, Aaron B. Wagner, Deniz Gündüz

Abstract: In image compression, with recent advances in generative modeling, the existence of a trade-off between the rate and the perceptual quality (realism) has been brought to light, where the realism is measured by the closeness of the output distribution to the source. It has been shown that randomized codes can be strictly better under a number of formulations. In particular, the role of common randomness has been well studied. We elucidate the role of private randomness in the compression of a memoryless source $X^n=(X_1,...,X_n)$ under two kinds of realism constraints. The near-perfect realism constraint requires the joint distribution of output symbols $(Y_1,...,Y_n)$ to be arbitrarily close to the distribution of the source in total variation distance (TVD). The per-symbol near-perfect realism constraint requires that the TVD between the distribution of output symbol $Y_t$ and the source distribution be arbitrarily small, uniformly in the index $t$. We characterize the corresponding asymptotic rate-distortion trade-off and show that encoder private randomness is not useful if the compression rate is lower than the entropy of the source, however limited the resources in terms of common randomness and decoder private randomness may be.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Submitted to IEEE ISIT 2024
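
For readers scanning this entry, the two realism constraints named in the abstract above can be written out explicitly. This is only a sketch in notation assumed here, not taken from the paper: $p_X$ is the per-symbol source distribution, $P_{Y^n}$ the joint distribution of the output, $P_{Y_t}$ its $t$-th marginal, and $\varepsilon > 0$ a tolerance that is driven to zero. The summation form of the distance applies to discrete alphabets.

```latex
% Total variation distance between distributions P and Q
\[
  \| P - Q \|_{\mathrm{TV}}
  \;=\; \sup_{A} \bigl| P(A) - Q(A) \bigr|
  \;=\; \tfrac{1}{2} \sum_{x} \bigl| P(x) - Q(x) \bigr| .
\]

% Near-perfect realism: the joint output law is close to the i.i.d. source law
\[
  \bigl\| P_{Y^n} - p_X^{\otimes n} \bigr\|_{\mathrm{TV}} \;\le\; \varepsilon .
\]

% Per-symbol near-perfect realism: every output marginal is close to p_X,
% uniformly in the index t
\[
  \max_{1 \le t \le n} \bigl\| P_{Y_t} - p_X \bigr\|_{\mathrm{TV}} \;\le\; \varepsilon .
\]
```

Because total variation distance cannot increase under marginalization, the joint constraint implies the per-symbol one for the same $\varepsilon$; the converse does not hold in general.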

arXiv:2401.16858 [pdf, other]

Low-Rate, Low-Distortion Compression with Wasserstein Distortion

Authors: Yang Qiu, Aaron B. Wagner

Abstract: Wasserstein distortion is a one-parameter family of distortion measures that was recently proposed to unify fidelity and realism constraints. After establishing continuity results for Wasserstein in the extreme cases of pure fidelity and pure realism, we prove the first coding theorems for compression under Wasserstein distortion focusing on the regime in which both the rate and the distortion are small.

Submitted 30 January, 2024; originally announced January 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.03629

MSC Class: 94A34 ACM Class: E.4

arXiv:2401.16707 [pdf, ps, other]

Optimal Redundancy in Exact Channel Synthesis

Authors: Sharang M. Sriramu, Aaron B. Wagner

Abstract: We consider the redundancy of the exact channel synthesis problem under an i.i.d. assumption. Existing results provide an upper bound on the unnormalized redundancy that is logarithmic in the block length. We show, via an improved scheme, that the logarithmic term can be halved for most channels and eliminated for all others. For full-support discrete memoryless channels, we show that this is the best possible.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.16417 [pdf, other]

Channel Coding with Mean and Variance Cost Constraints

Authors: Adeel Mahmood, Aaron B. Wagner

Abstract: We consider channel coding for discrete memoryless channels (DMCs) with a novel cost constraint that constrains both the mean and the variance of the cost of the codewords. We show that the maximum (asymptotically) achievable rate under the new cost formulation is equal to the capacity-cost function; in particular, the strong converse holds. We further characterize the optimal second-order coding rate of these cost-constrained codes; in particular, the optimal second-order coding rate is finite. We then show that the second-order coding performance is strictly improved with feedback using a new variation of timid/bold coding, significantly broadening the applicability of timid/bold coding schemes from unconstrained compound-dispersion channels to all cost-constrained channels. Equivalent results on the minimum average probability of error are also given.

Submitted 22 January, 2025; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.03708 [pdf, other]

Data assimilation and parameter identification for water waves using the nonlinear Schrödinger equation and physics-informed neural networks

Authors: Svenja Ehlers, Niklas A. Wagner, Annamaria Scherzl, Marco Klein, Norbert Hoffmann, Merten Stender

Abstract: The measurement of deep water gravity wave elevations using in-situ devices, such as wave gauges, typically yields spatially sparse data. This sparsity arises from the deployment of a limited number of gauges due to their installation effort and high operational costs. The reconstruction of the spatio-temporal extent of surface elevation poses an ill-posed data assimilation problem, challenging to solve with conventional numerical techniques. To address this issue, we propose the application of a physics-informed neural network (PINN), aiming to reconstruct physically consistent wave fields between two designated measurement locations several meters apart. Our method ensures this physical consistency by integrating residuals of the hydrodynamic nonlinear Schrödinger equation (NLSE) into the PINN's loss function. Using synthetic wave elevation time series from distinct locations within a wave tank, we initially achieve successful reconstruction quality by employing constant, predetermined NLSE coefficients. However, the reconstruction quality is further improved by introducing NLSE coefficients as additional identifiable variables during PINN training. The results not only showcase a technically relevant application of the PINN method but also represent a pioneering step towards improving the initialization of deterministic wave prediction methods.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 16 pages with 11 figures

arXiv:2311.15505 [pdf]

Exploring Trust and Risk during Online Bartering Interactions

Authors: Kalyani Lakkanige, Lamar Cooley-Russ, Alan R. Wagner, Sarah Rajtmajer

Abstract: This paper investigates how risk influences the way people barter. We used Minecraft to create an experimental environment in which people bartered to earn a monetary bonus. Our findings reveal that subjects exhibit risk-aversion to competitive bartering environments and deliberate over their trades longer when compared to cooperative environments. These initial experiments lay groundwork for development of agents capable of strategically trading with human counterparts in different environments.

Submitted 26 November, 2023; originally announced November 2023.

Comments: Paper accepted into Multittrust 2.0 @ HAI 2023 (https://multittrust.github.io/2ed/)

arXiv:2311.03583 [pdf, other]

Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search

Authors: Abbas Mehrabian, Ankit Anand, Hyunjik Kim, Nicolas Sonnerat, Matej Balog, Gheorghe Comanici, Tudor Berariu, Andrew Lee, Anian Ruoss, Anna Bulanova, Daniel Toyama, Sam Blackwell, Bernardino Romera Paredes, Petar Veličković, Laurent Orseau, Joonkyung Lee, Anurag Murty Naredla, Doina Precup, Adam Zsolt Wagner

Abstract: This work studies a central extremal graph theory problem inspired by a 1975 conjecture of Erdős, which aims to find graphs with a given size (number of nodes) that maximize the number of edges without having 3- or 4-cycles. We formulate this problem as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method. Using either method, by introducing a curriculum -- jump-starting the search for larger graphs using good graphs found at smaller sizes -- we improve the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.

Submitted 29 July, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: To appear in the proceedings of IJCAI 2024. First three authors contributed equally, last two authors made equal senior contribution
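
The constraint in the abstract above (no 3-cycles and no 4-cycles) is easy to check for a candidate graph, which may help when reading the problem statement. The sketch below is an illustration only and is not the paper's environment or search code; the function names `is_c3_c4_free` and `petersen_edges` are hypothetical, and the input is assumed to be a simple undirected graph without self-loops. It relies on a standard observation: a graph has a triangle exactly when two adjacent vertices share a neighbour, and a 4-cycle exactly when some pair of vertices shares two or more neighbours.

```python
from itertools import combinations


def is_c3_c4_free(n, edges):
    """Return True if the simple undirected graph on vertices 0..n-1
    contains neither a 3-cycle nor a 4-cycle."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for u, v in combinations(range(n), 2):
        common = len(adj[u] & adj[v])
        # Adjacent vertices with a shared neighbour close a triangle.
        if v in adj[u] and common >= 1:
            return False
        # Any pair with two shared neighbours closes a 4-cycle.
        if common >= 2:
            return False
    return True


def petersen_edges():
    """Edge list of the Petersen graph (10 vertices, girth 5)."""
    outer = [(i, (i + 1) % 5) for i in range(5)]           # outer 5-cycle
    spokes = [(i, i + 5) for i in range(5)]                # outer-to-inner spokes
    inner = [(5 + i, 5 + (i + 2) % 5) for i in range(5)]   # inner pentagram
    return outer + spokes + inner


if __name__ == "__main__":
    edges = petersen_edges()
    print(len(edges), is_c3_c4_free(10, edges))
```

A search procedure of the kind the abstract describes would try to maximize the edge count subject to this check passing; the check itself costs on the order of n² set intersections.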

arXiv:2310.03629 [pdf, other]

Wasserstein Distortion: Unifying Fidelity and Realism

Authors: Yang Qiu, Aaron B. Wagner, Johannes Ballé, Lucas Theis

Abstract: We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We show how Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint under different parameter choices and discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random textures that have high fidelity to a reference texture in one location of the image and smoothly transition to an independent realization of the texture as one moves away from this point. Wasserstein distortion attempts to generalize and unify prior work on texture generation, image realism and distortion, and models of the early human visual system, in the form of an optimizable metric in the mathematical sense.

Submitted 28 March, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.15054 [pdf, other]

Near Real-Time Position Tracking for Robot-Guided Evacuation

Authors: Mollik Nayyar, Alan Wagner

Abstract: During the evacuation of a building, the rapid and accurate tracking of human evacuees can be used by a guide robot to increase the effectiveness of the evacuation [1],[2]. This paper introduces a near real-time human position tracking solution tailored for evacuation robots. Using a pose detector, our system first identifies human joints in the camera frame in near real-time and then translates the position of these pixels into real-world coordinates via a simple calibration process. We run multiple trials of the system in action in an indoor lab environment and show that the system can achieve an accuracy of 0.55 meters when compared to ground truth. The system can also achieve an average of 3 frames per second (FPS) which was sufficient for our study on robot-guided human evacuation. The potential of our approach extends beyond mere tracking, paving the way for evacuee motion prediction, allowing the robot to proactively respond to human movements during an evacuation.

Submitted 26 September, 2023; originally announced September 2023.

Comments: The 2nd Workshop on Social Robot Navigation: Advances and Evaluation. In conjunction with: IEEE International Conference on Intelligent Robots and Systems (IROS 2023)

arXiv:2309.15045 [pdf, other]

Modeling Evacuee Behavior for Robot-Guided Emergency Evacuation

Authors: Mollik Nayyar, Alan Wagner

Abstract: This paper considers the problem of developing suitable behavior models of human evacuees during a robot-guided emergency evacuation. We describe our recent research developing behavior models of evacuees and potential future uses of these models. This paper considers how behavior models can contribute to the development and design of emergency evacuation simulations in order to improve social navigation during an evacuation.

Submitted 26 September, 2023; originally announced September 2023.

Comments: Presented at Social Robot Navigation: Advances and Evaluation. In conjunction with: IEEE International Conference on Robotics and Automation, ICRA 2022

arXiv:2308.00199 [pdf, other]

doi 10.1109/TCDS.2023.3299755

CBCL-PR: A Cognitively Inspired Model for Class-Incremental Learning in Robotics

Authors: Ali Ayub, Alan R. Wagner

Abstract: For most real-world applications, robots need to adapt and learn continually with limited data in their environments. In this paper, we consider the problem of Few-Shot class Incremental Learning (FSIL), in which an AI agent is required to learn incrementally from a few data samples without forgetting the data it has previously learned. To solve this problem, we present a novel framework inspired by theories of concept learning in the hippocampus and the neocortex. Our framework represents object classes in the form of sets of clusters and stores them in memory. The framework replays data generated by the clusters of the old classes, to avoid forgetting when learning new classes. Our approach is evaluated on two object classification datasets resulting in state-of-the-art (SOTA) performance for class-incremental learning and FSIL. We also evaluate our framework for FSIL on a robot demonstrating that the robot can continually learn to classify a large set of household objects with limited human assistance.

Submitted 31 July, 2023; originally announced August 2023.

Comments: Accepted to IEEE Transactions on Cognitive and Developmental Systems

arXiv:2307.13267 [pdf, other]

Federated K-Means Clustering via Dual Decomposition-based Distributed Optimization

Authors: Vassilios Yfantis, Achim Wagner, Martin Ruskowski

Abstract: The use of distributed optimization in machine learning can be motivated either by the resulting preservation of privacy or the increase in computational efficiency. On the one hand, training data might be stored across multiple devices. Training a global model within a network where each node only has access to its confidential data requires the use of distributed algorithms. Even if the data is not confidential, sharing it might be prohibitive due to bandwidth limitations. On the other hand, the ever-increasing amount of available data leads to large-scale machine learning problems. By splitting the training process across multiple nodes its efficiency can be significantly increased. This paper aims to demonstrate how dual decomposition can be applied for distributed training of $K$-means clustering problems. After an overview of distributed and federated machine learning, the mixed-integer quadratically constrained programming-based formulation of the $K$-means clustering training problem is presented. The training can be performed in a distributed manner by splitting the data across different nodes and linking these nodes through consensus constraints. Finally, the performance of the subgradient method, the bundle trust method, and the quasi-Newton dual ascent algorithm are evaluated on a set of benchmark problems. While the mixed-integer programming-based formulation of the clustering problems suffers from weak integer relaxations, the presented approach can potentially be used to enable an efficient solution in the future, both in a central and distributed setting.

Submitted 25 July, 2023; originally announced July 2023.

MSC Class: 90

arXiv:2307.02641 [pdf, other]

Active Class Selection for Few-Shot Class-Incremental Learning

Authors: Christopher McClurg, Ali Ayub, Harsh Tyagi, Sarah M. Rajtmajer, Alan R. Wagner

Abstract: For real-world applications, robots will need to continually learn in their environments through limited interactions with their users. Toward this, previous works in few-shot class incremental learning (FSCIL) and active class selection (ACS) have achieved promising results but were tested in constrained setups. Therefore, in this paper, we combine ideas from FSCIL and ACS to develop a novel framework that can allow an autonomous agent to continually learn new objects by asking its users to label only a few of the most informative objects in the environment. To this end, we build on a state-of-the-art (SOTA) FSCIL model and extend it with techniques from ACS literature. We term this model Few-shot Incremental Active class SeleCtiOn (FIASco). We further integrate a potential field-based navigation technique with our model to develop a complete framework that can allow an agent to process and reason on its sensory data through the FIASco model, navigate towards the most informative object in the environment, gather data about the object through its sensors and incrementally update the FIASco model. Experimental results on a simulated agent and a real robot show the significance of our approach for long-term real-world robotics applications.

Submitted 5 July, 2023; originally announced July 2023.

Comments: Accepted at the Conference on Lifelong Learning Agents (CoLLAs), 2023

arXiv:2306.17824 [pdf, other]

Learning Evacuee Models from Robot-Guided Emergency Evacuation Experiments

Authors: Mollik Nayyar, Ghanghoon Paik, Zhenyuan Yuan, Tongjia Zheng, Minghui Zhu, Hai Lin, Alan R. Wagner

Abstract: Recent research has examined the possibility of using robots to guide evacuees to safe exits during emergencies. Yet, there are many factors that can impact a person's decision to follow a robot. Being able to model how an evacuee follows an emergency robot guide could be crucial for designing robots that effectively guide evacuees during an emergency. This paper presents a method for developing realistic and predictive human evacuee models from physical human evacuation experiments. The paper analyzes the behavior of 14 human subjects during physical robot-guided evacuation. We then use the video data to create evacuee motion models that predict the person's future positions during the emergency. Finally, we validate the resulting models by running a k-fold cross-validation on the data collected during physical human subject experiments. We also present performance results of the model using data from a similar simulated emergency evacuation experiment demonstrating that these models can serve as a tool to predict evacuee behavior in novel evacuation simulations.

Submitted 30 June, 2023; originally announced June 2023.
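
The k-fold cross-validation mentioned in the abstract above can be illustrated with a small index splitter. This is a generic sketch, not the authors' validation code: the helper name `k_fold_indices` is hypothetical, the choice of k = 7 is an assumption made purely for the example (the abstract does not state k), and splitting by subject rather than by trajectory sample is likewise assumed.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds over
    range(n_samples). Fold sizes differ by at most one."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # 14 matches the number of subjects in the abstract; k = 7 is assumed.
    for train, test in k_fold_indices(14, 7):
        print("held out:", test, "trained on", len(train), "subjects")
```

Each sample is held out exactly once, so a model fitted on the training indices of a fold is always scored on data it has not seen. In practice the order is usually shuffled before splitting.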

arXiv:2306.06325 [pdf, other]

Explaining a machine learning decision to physicians via counterfactuals

Authors: Supriya Nagesh, Nina Mishra, Yonatan Naamad, James M. Rehg, Mehul A. Shah, Alexei Wagner

Abstract: Machine learning models perform well on several healthcare tasks and can help reduce the burden on the healthcare system. However, the lack of explainability is a major roadblock to their adoption in hospitals. How can the decision of an ML model be explained to a physician? The explanations considered in this paper are counterfactuals (CFs), hypothetical scenarios that would have resulted in the opposite outcome. Specifically, time-series CFs are investigated, inspired by the way physicians converse and reason out decisions: "I would have given the patient a vasopressor if their blood pressure was lower and falling." Key properties of CFs that are particularly meaningful in clinical settings are outlined: physiological plausibility, relevance to the task and sparse perturbations. Past work on CF generation does not satisfy these properties, specifically plausibility in that realistic time-series CFs are not generated. A variational autoencoder (VAE)-based approach is proposed that captures these desired properties. The method produces CFs that improve on prior approaches quantitatively (more plausible CFs as evaluated by their likelihood w.r.t. original data distribution, and 100$\times$ faster at generating CFs) and qualitatively (2$\times$ more plausible and relevant) as evaluated by three physicians.

Submitted 9 June, 2023; originally announced June 2023.

arXiv:2302.14752 [pdf, other]

Multi-Robot-Guided Crowd Evacuation: Two-Scale Modeling and Control

Authors: Tongjia Zheng, Zhenyuan Yuan, Mollik Nayyar, Alan R. Wagner, Minghui Zhu, Hai Lin

Abstract: Emergency evacuation describes a complex situation involving time-critical decision-making by evacuees. Mobile robots are being actively explored as a potential solution to provide timely guidance. In this work, we study a robot-guided crowd evacuation problem where a small group of robots is used to guide a large human crowd to safe locations. The challenge lies in how to use micro-level human-robot interactions to indirectly influence a population that significantly outnumbers the robots to achieve the collective evacuation objective. To address the challenge, we follow a two-scale modeling strategy and explore hydrodynamic models, which consist of a family of microscopic social force models that describe how human movements are locally affected by other humans, the environment, and robots, and associated macroscopic equations for the temporal and spatial evolution of the crowd density and flow velocity. We design controllers for the robots such that they not only automatically explore the environment (with unknown dynamic obstacles) to cover it as much as possible, but also dynamically adjust the directions of their local navigation force fields based on the real-time macrostates of the crowd to guide the crowd to a safe location. We prove the stability of the proposed evacuation algorithm and conduct extensive simulations to investigate the performance of the algorithm with different combinations of human numbers, robot numbers, and obstacle settings.

Submitted 11 January, 2024; v1 submitted 28 February, 2023; originally announced February 2023.

arXiv:2210.08414 [pdf, other]

Using Virtual Reality to Simulate Human-Robot Emergency Evacuation Scenarios

Authors: Alan R. Wagner, Colin Holbrook, Daniel Holman, Brett Sheeran, Vidullan Surendran, Jared Armagost, Savanna Spazak, Yinxuan Yin

Abstract: This paper describes our recent effort to use virtual reality to simulate threatening emergency evacuation scenarios in which a robot guides a person to an exit. Our prior work has demonstrated that people will follow a robot's guidance, even when the robot is faulty, during an emergency evacuation. Yet, because physical in-person emergency evacuation experiments are difficult and costly to conduct and because we would like to evaluate many different factors, we are motivated to develop a system that immerses people in the simulation environment to encourage genuine subject reactions. We are working to complete experiments verifying the validity of our approach.

Submitted 15 October, 2022; originally announced October 2022.

Comments: Accepted at AAAI Fall Symposium AI-HRI Workshop

Report number: AIHRI/2022/8997

arXiv:2209.09795 [pdf, other]

Multi-Robot-Assisted Human Crowd Evacuation using Navigation Velocity Fields

Authors: Tongjia Zheng, Zhenyuan Yuan, Mollik Nayyar, Alan R. Wagner, Minghui Zhu, Hai Lin

Abstract: This work studies a robot-assisted crowd evacuation problem where we control a small group of robots to guide a large human crowd to safe locations. The challenge lies in how to model human-robot interactions and design robot controls to indirectly control a human population that significantly outnumbers the robots. To address the challenge, we treat the crowd as a continuum and formulate the evacuation objective as driving the crowd density to target locations. We propose a novel mean-field model which consists of a family of microscopic equations that explicitly model how human motions are locally guided by the robots and an associated macroscopic equation that describes how the crowd density is controlled by the navigation velocity fields generated by all robots. Then, we design density feedback controllers for the robots to dynamically adjust their states such that the generated navigation velocity fields drive the crowd density to a target density. Stability guarantees of the proposed controllers are proven. Agent-based simulations are included to evaluate the proposed evacuation algorithms.

Submitted 20 September, 2022; originally announced September 2022.

arXiv:2207.14114 [pdf]

Classification of FIB/SEM-tomography images for highly porous multiphase materials using random forest classifiers

Authors: Markus Osenberg, André Hilger, Matthias Neumann, Amalia Wagner, Nicole Bohn, Joachim R. Binder, Volker Schmidt, John Banhart, Ingo Manke

Abstract: FIB/SEM tomography represents an indispensable tool for the characterization of three-dimensional nanostructures in battery research and many other fields. However, contrast and 3D classification/reconstruction problems occur in many cases, which strongly limits the applicability of the technique, especially on porous materials like those used for electrode materials in batteries or fuel cells. Distinguishing the different components, such as active Li storage particles and carbon/binder materials, is difficult and often prevents a reliable quantitative analysis of image data, or may even lead to wrong conclusions about structure-property relationships. In this contribution, we present a novel approach for data classification in three-dimensional image data obtained by FIB/SEM tomography and its applications to NMC battery electrode materials. We use two different image signals, namely the signal of the angled SE2 chamber detector and the Inlens detector signal, combine both signals and train a random forest, i.e. a particular machine learning algorithm. We demonstrate that this approach can overcome current limitations of existing techniques suitable for multi-phase measurements and that it allows for quantitative data reconstruction even where current state-of-the-art techniques fail or demand large training sets. This approach may serve as a guideline for future research using FIB/SEM tomography.

Submitted 28 July, 2022; originally announced July 2022.

arXiv:2206.10727 [pdf]

Toward Ethical Robotic Behavior in Human-Robot Interaction Scenarios

Authors: Shengkang Chen, Vidullan Surendran, Alan R. Wagner, Jason Borenstein, Ronald C. Arkin

Abstract: This paper describes current progress on developing an ethical architecture for robots that are designed to follow human ethical decision-making processes. We surveyed both regular adults (folks) and ethics experts (experts) on what they consider to be ethical behavior in two specific scenarios: pill-sorting with an older adult and game playing with a child. A key goal of the surveys is to better understand human ethical decision-making. In the first survey, folk responses were based on the subject's ethical choices ("folk morality"); in the second survey, expert responses were based on the expert's application of different formal ethical frameworks to each scenario. We observed that most of the formal ethical frameworks we included in the survey (Utilitarianism, Kantian Ethics, Ethics of Care and Virtue Ethics) and "folk morality" were conservative toward deception in the high-risk task with an older adult when both the adult and the child had significant performance deficiencies.

Submitted 21 June, 2022; originally announced June 2022.

Comments: in TRAITS Workshop Proceedings (arXiv:2206.08270) held in conjunction with Companion of the 2022 ACM/IEEE International Conference on Human-Robot Interaction, March 2022, Pages 1284-1286

Report number: TRAITS/2022/02

arXiv:2205.08518 [pdf, other]

Do Neural Networks Compress Manifolds Optimally?

Authors: Sourbh Bhadane, Aaron B. Wagner, Johannes Ballé

Abstract: Artificial Neural-Network-based (ANN-based) lossy compressors have recently obtained striking results on several sources. Their success may be ascribed to an ability to identify the structure of low-dimensional manifolds in high-dimensional ambient spaces. Indeed, prior work has shown that ANN-based compressors can achieve the optimal entropy-distortion curve for some such sources. In contrast, we determine the optimal entropy-distortion tradeoffs for two low-dimensional manifolds with circular structure and show that state-of-the-art ANN-based compressors fail to optimally compress them.

Submitted 9 September, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

arXiv:2204.09188 [pdf, other]

Functional Covering of Point Processes

Authors: Nirmal V. Shende, Aaron B. Wagner

Abstract: We introduce a new distortion measure for point processes called functional-covering distortion. It is inspired by intensity theory and is related to both the covering of point processes and logarithmic loss distortion. We obtain the distortion-rate function with feedforward under this distortion measure for a large class of point processes. For Poisson processes, the rate-distortion function is obtained under a general condition called constrained functional-covering distortion, of which both covering and functional-covering are special cases. Also for Poisson processes, we characterize the rate-distortion region for a two-encoder CEO problem and show that feedforward does not enlarge this region.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2202.13158 [pdf, ps, other]

Semantic Soundness for Language Interoperability

Authors: Daniel Patterson, Noble Mushtak, Andrew Wagner, Amal Ahmed

Abstract: Programs are rarely implemented in a single language, and thus questions of type soundness should address not only the semantics of a single language, but how it interacts with others. Even between type-safe languages, disparate features frustrate interoperability, as invariants from one language can easily be violated in the other. In their seminal 2007 paper, Matthews and Findler proposed a multi-language construction that augments the interoperating languages with a pair of boundaries that allow code from one language to be embedded in the other. While the technique has been widely applied, their syntactic source-level interoperability doesn't reflect practical implementations, where behavior of interaction is defined after compilation to a common target, and any safety must be ensured by target invariants or inserted target-level "glue code." In this paper, we present a framework for the design and verification of sound language interoperability that follows an interoperation-after-compilation strategy. Language designers specify what data can be converted between types of the languages via a relation $τ_A \sim τ_B$ and specify target glue code implementing conversions. Then, by giving a semantic model of source types as sets of target terms, we can establish soundness of conversions: i.e., whenever $τ_A \sim τ_B$, the corresponding pair of conversions convert target terms that behave as $τ_A$ to target terms that behave as $τ_B$, and vice versa. We can then prove semantic type soundness for the entire system.
We illustrate our framework via a series of case studies that demonstrate how our semantic interoperation-after-compilation approach allows us both to account for complex differences in language semantics and make efficiency trade-offs based on particularities of compilers or targets.

Submitted 11 April, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

Comments: revised version with more exposition, typos fixed, etc

arXiv:2202.05292 [pdf, other]

On One-Bit Quantization

Authors: Sourbh Bhadane, Aaron B. Wagner

Abstract: We consider the one-bit quantizer that minimizes the mean squared error for a source living in a real Hilbert space. The optimal quantizer is a projection followed by a thresholding operation, and we provide methods for identifying the optimal direction along which to project. As an application of our methods, we characterize the optimal one-bit quantizer for a continuous-time random process that exhibits low-dimensional structure. We numerically show that this optimal quantizer is found by a neural-network-based compressor trained via stochastic gradient descent.

Submitted 10 February, 2022; originally announced February 2022.

arXiv:2202.04481 [pdf, ps, other]

Minimax Rate-Distortion

Authors: Adeel Mahmood, Aaron B. Wagner

Abstract: We show the existence of variable-rate rate-distortion codes that meet the distortion constraint almost surely and are minimax, i.e., strongly, universal with respect to an unknown source distribution and a distortion measure that is revealed only to the encoder and only at runtime. If we only require minimax universality with respect to the source distribution and not the distortion measure, then we provide an achievable $\tilde{O}(1/\sqrt{n})$ redundancy rate, which we show is optimal. This is in contrast to prior work on universal lossy compression, which provides $O(\log n/n)$ redundancy guarantees for weakly universal codes under various regularity conditions. We show that either eliminating the regularity conditions or upgrading to strong universality while keeping these regularity conditions entails an inevitable increase in the redundancy to $\tilde{O}(1/\sqrt{n})$. Our construction involves random coding with non-i.i.d. codewords and a zero-rate uncoded transmission scheme. The proof uses exact asymptotics from large deviations, acceptance-rejection sampling, and the VC dimension of distortion measures.

Submitted 27 November, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

arXiv:2202.04267 [pdf, other]

Efficiently Computable Converses for Finite-Blocklength Communication

Authors: Felipe Areces, Dan Song, Richard Wesel, Aaron B. Wagner

Abstract: This paper presents a method for computing a finite-blocklength converse for the rate of fixed-length codes with feedback used on discrete memoryless channels (DMCs). The new converse is expressed in terms of a stochastic control problem whose solution can be efficiently computed using dynamic programming and Fourier methods. For channels such as the binary symmetric channel (BSC) and binary erasure channel (BEC), the accuracy of the proposed converse is similar to that of existing special-purpose converse bounds, but the new converse technique can be applied to arbitrary DMCs. We provide example applications of the new converse technique to the binary asymmetric channel (BAC) and the quantized amplitude-constrained AWGN channel.

Submitted 8 February, 2022; originally announced February 2022.

Comments: 6 pages, 7 figures

arXiv:2202.04147 [pdf, other]

The Rate-Distortion-Perception Tradeoff: The Role of Common Randomness

Authors: Aaron B. Wagner

Abstract: A rate-distortion-perception (RDP) tradeoff has recently been proposed by Blau and Michaeli and also Matsumoto. Focusing on the case of perfect realism, which coincides with the problem of distribution-preserving lossy compression studied by Li et al., a coding theorem for the RDP tradeoff that allows for a specified amount of common randomness between the encoder and decoder is provided. The existing RDP tradeoff is recovered by allowing for the amount of common randomness to be infinite. The quadratic Gaussian case is examined in detail.

Submitted 8 February, 2022; originally announced February 2022.

arXiv:2201.11630 [pdf, other]

Automatic Classification of Neuromuscular Diseases in Children Using Photoacoustic Imaging

Authors: Maja Schlereth, Daniel Stromer, Katharina Breininger, Alexandra Wagner, Lina Tan, Andreas Maier, Ferdinand Knieling

Abstract: Neuromuscular diseases (NMDs) cause a significant burden for both healthcare systems and society. They can lead to severe progressive muscle weakness, muscle degeneration, contracture, deformity and progressive disability. The NMDs evaluated in this study often manifest in early childhood. As subtypes of disease, e.g. Duchenne Muscular Dystrophy (DMD) and Spinal Muscular Atrophy (SMA), are difficult to differentiate at the beginning and worsen quickly, fast and reliable differential diagnosis is crucial. Photoacoustic and ultrasound imaging has shown great potential to visualize and quantify the extent of different diseases. The addition of automatic classification of such image data could further improve standard diagnostic procedures. We compare deep learning-based 2-class and 3-class classifiers based on VGG16 for differentiating healthy from diseased muscular tissue. This work shows promising results with high accuracies above 0.86 for the 3-class problem and can be used as a proof of concept for future approaches for earlier diagnosis and therapeutic monitoring of NMDs.

Submitted 27 January, 2022; originally announced January 2022.

Comments: accepted by BVM conference proceedings 2022

arXiv:2112.07051 [pdf]

doi 10.1093/database/baac035

A Simple Standard for Sharing Ontological Mappings (SSSOM)

Authors: Nicolas Matentzoglu, James P. Balhoff, Susan M. Bello, Chris Bizon, Matthew Brush, Tiffany J. Callahan, Christopher G Chute, William D. Duncan, Chris T. Evelo, Davera Gabriel, John Graybeal, Alasdair Gray, Benjamin M. Gyori, Melissa Haendel, Henriette Harmse, Nomi L. Harris, Ian Harrow, Harshad Hegde, Amelia L. Hoyt, Charles T. Hoyt, Dazhi Jiao, Ernesto Jiménez-Ruiz, Simon Jupp, Hyeongsik Kim, Sebastian Koehler , et al. (19 additional authors not shown)

Abstract: Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Are they associated in some other way? Such relationships between the mapped terms are often not documented, leading to incorrect assumptions and making them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Also, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. The Simple Standard for Sharing Ontological Mappings (SSSOM) addresses these problems by: 1. Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. 2. Defining an easy-to-use table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data standards. 3. Implementing open and community-driven collaborative workflows designed to evolve the standard continuously to address changing requirements and mapping practices. 4. Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases, and survey some existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable, and Reusable (FAIR). The SSSOM specification is at http://w3id.org/sssom/spec.

Submitted 13 December, 2021; originally announced December 2021.

Comments: Corresponding author: Christopher J. Mungall

arXiv:2110.07022 [pdf, ps, other]

Lossy Compression with Universal Distortion

Authors: Adeel Mahmood, Aaron B. Wagner

Abstract: We consider a novel variant of $d$-semifaithful lossy coding in which the distortion measure is revealed only to the encoder and only at run-time, as well as an extension of it in which the distortion constraint $d$ is also revealed at run-time. Two forms of rate redundancy are used to analyze the performance, and achievability results of both a pointwise and minimax nature are demonstrated. The first coding scheme uses ideas from VC dimension and growth functions, the second uses appropriate quantization of the space of distortion measures, and the third relies on a random coding argument.

Submitted 4 October, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

arXiv:2107.00123 [pdf, other]

If you Cheat, I Cheat: Cheating on a Collaborative Task with a Social Robot

Authors: Ali Ayub, Huiqing Hu, Guangwei Zhou, Carter Fendley, Crystal Ramsay, Kathy Lou Jackson, Alan R. Wagner

Abstract: Robots may soon play a role in higher education by augmenting learning environments and managing interactions between instructors and learners. Little, however, is known about how the presence of robots in the learning environment will influence academic integrity. This study therefore investigates if and how college students cheat while engaged in a collaborative sorting task with a robot. We employed a 2x2 factorial design to examine the effects of cheating exposure (exposure to cheating or no exposure) and task clarity (clear or vague rules) on college student cheating behaviors while interacting with a robot. Our study finds that prior exposure to cheating on the task significantly increases the likelihood of cheating. Yet, the tendency to cheat was not impacted by the clarity of the task rules. These results suggest that normative behavior by classmates may strongly influence the decision to cheat while engaged in an instructional experience with a robot.

Submitted 30 June, 2021; originally announced July 2021.

Comments: Accepted at IEEE International Conference on Robot and Human Interactive Communication (ROMAN), 2021

Showing 1–50 of 122 results for author: Wagner, A