Search | arXiv e-print repository

Scaling Multi Agent Reinforcement Learning for Underwater Acoustic Tracking via Autonomous Vehicles

Authors: Matteo Gallici, Ivan Masmitja, Mario Martín

Abstract: Autonomous vehicles (AV) offer a cost-effective solution for scientific missions such as underwater tracking. Recently, reinforcement learning (RL) has emerged as a powerful method for controlling AVs in complex marine environments. However, scaling these techniques to a fleet--essential for multi-target tracking or targets with rapid, unpredictable motion--presents significant computational challenges. Multi-Agent Reinforcement Learning (MARL) is notoriously sample-inefficient, and while high-fidelity simulators like Gazebo's LRAUV provide 100x faster-than-real-time single-robot simulations, they offer no significant speedup for multi-vehicle scenarios, making MARL training impractical. To address these limitations, we propose an iterative distillation method that transfers high-fidelity simulations into a simplified, GPU-accelerated environment while preserving high-level dynamics. This approach achieves up to a 30,000x speedup over Gazebo through parallelization, enabling efficient training via end-to-end GPU acceleration. Additionally, we introduce a novel Transformer-based architecture (TransfMAPPO) that learns multi-agent policies invariant to the number of agents and targets, significantly improving sample efficiency. Following large-scale curriculum learning conducted entirely on GPU, we perform extensive evaluations in Gazebo, demonstrating that our method maintains tracking errors below 5 meters over extended durations, even in the presence of multiple fast-moving targets. This work bridges the gap between large-scale MARL training and high-fidelity deployment, providing a scalable framework for autonomous fleet control in real-world sea missions.

Submitted 13 May, 2025; originally announced May 2025.

arXiv:2504.20067 [pdf, other]

Scalable and Performant Data Loading

Authors: Moto Hira, Christian Puhrsch, Valentin Andrei, Roman Malinovskyy, Gael Le Lan, Abhinandan Krishnan, Joseph Cummings, Miguel Martin, Gokul Gunasekaran, Yuta Inoue, Alex J Turner, Raghuraman Krishnamoorthi

Abstract: We present SPDL (Scalable and Performant Data Loading), an open-source, framework-agnostic library designed for efficiently loading array data to GPU. Data loading is often a bottleneck in AI applications, and is challenging to optimize because it requires coordination of network calls, CPU-bound tasks, and GPU device transfer. On top of that, Python's GIL (Global Interpreter Lock) makes it difficult to gain performance improvement from multi-threading. We found that when data preprocessing functions release the GIL entirely, it is possible to execute them concurrently in a thread pool, thereby improving the workflow performance. Our benchmark shows that compared to the PyTorch DataLoader, SPDL can iterate through the ImageNet dataset 74% faster while using 38% less CPU and 50GB less memory. When training ViT-B/16 model, SPDL can send data to the GPU at a speed that does not starve the training. Additionally, when using SPDL on Python 3.13t, without changing any code, the throughput is further improved by 33%, thanks to the disabled GIL. SPDL can improve the performance of current AI model training, and receives further performance improvements when Free-Threaded Python is adopted in production systems. SPDL is available at https://github.com/facebookresearch/spdl.

Submitted 23 April, 2025; originally announced April 2025.

Comments: For the latest version of the software please visit https://facebookresearch.github.io/spdl/main/
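
The thread-pool idea in the abstract above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the SPDL API: `preprocess` and `load_batch` are hypothetical names, and `zlib.compress` merely stands in for a preprocessing step that releases the GIL while it works (CPython's zlib module does so during compression).

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def preprocess(payload: bytes) -> int:
    # Hypothetical stand-in for a decode/resize step.
    # zlib.compress releases the GIL while compressing, so several
    # calls can make progress at the same time in different threads.
    return len(zlib.compress(payload))


def load_batch(payloads, num_threads=4):
    # Run the GIL-releasing function concurrently in a thread pool.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(preprocess, payloads))


if __name__ == "__main__":
    data = [bytes([i]) * 1_000_000 for i in range(8)]
    print(load_batch(data))
```

If `preprocess` held the GIL for its whole duration, the threads would run one at a time and the pool would give no speedup, which is the limitation the abstract describes.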

arXiv:2504.13180 [pdf, other]

PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding

Authors: Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi, Triantafyllos Afouras, Tushar Nagarajan, Muhammad Maaz, Yale Song, Tengyu Ma, Shuming Hu, Suyog Jain, Miguel Martin, Huiyu Wang, Hanoona Rasheed, Peize Sun, Po-Yao Huang, Daniel Bolya, Nikhila Ravi, Shashank Jain, Tammy Stark, Shane Moon, Babak Damavandi, Vivian Lee, Andrew Westbury, Salman Khan, Philipp Krähenbühl , et al. (4 additional authors not shown)

Abstract: Vision-language models are integral to computer vision research, yet many high-performing models remain closed-source, obscuring their data, design and training recipe. The research community has responded by using distillation from black-box models to label training data, achieving strong benchmark results, at the cost of measurable scientific progress. However, without knowing the details of the teacher model and its data sources, scientific progress remains difficult to measure. In this paper, we study building a Perception Language Model (PLM) in a fully open and reproducible framework for transparent research in image and video understanding. We analyze standard training pipelines without distillation from proprietary models and explore large-scale synthetic data to identify critical data gaps, particularly in detailed video understanding. To bridge these gaps, we release 2.8M human-labeled instances of fine-grained video question-answer pairs and spatio-temporally grounded video captions. Additionally, we introduce PLM-VideoBench, a suite for evaluating challenging video understanding tasks focusing on the ability to reason about "what", "where", "when", and "how" of a video. We make our work fully reproducible by providing data, training recipes, code & models.

Submitted 17 April, 2025; originally announced April 2025.

Comments: Technical report

arXiv:2504.12352 [pdf, other]

Deep Generative Model-Based Generation of Synthetic Individual-Specific Brain MRI Segmentations

Authors: Ruijie Wang, Luca Rossetto, Susan Mérillat, Christina Röcke, Mike Martin, Abraham Bernstein

Abstract: To the best of our knowledge, all existing methods that can generate synthetic brain magnetic resonance imaging (MRI) scans for a specific individual require detailed structural or volumetric information about the individual's brain. However, such brain information is often scarce, expensive, and difficult to obtain. In this paper, we propose the first approach capable of generating synthetic brain MRI segmentations -- specifically, 3D white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) segmentations -- for individuals using their easily obtainable and often readily available demographic, interview, and cognitive test information. Our approach features a novel deep generative model, CSegSynth, which outperforms existing prominent generative models, including conditional variational autoencoder (C-VAE), conditional generative adversarial network (C-GAN), and conditional latent diffusion model (C-LDM). We demonstrate the high quality of our synthetic segmentations through extensive evaluations. Also, in assessing the effectiveness of the individual-specific generation, we achieve superior volume prediction, with mean absolute errors of only 36.44mL, 29.20mL, and 35.51mL between the ground-truth WM, GM, and CSF volumes of test individuals and those volumes predicted based on generated individual-specific segmentations, respectively.

Submitted 23 April, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

arXiv:2502.19190 [pdf, ps, other]

Provocations from the Humanities for Generative AI Research

Authors: Lauren Klein, Meredith Martin, André Brock, Maria Antoniak, Melanie Walsh, Jessica Marie Johnson, Lauren Tilton, David Mimno

Abstract: This paper presents a set of provocations for considering the uses, impact, and harms of generative AI from the perspective of humanities researchers. We provide a working definition of humanities research, summarize some of its most salient theories and methods, and apply these theories and methods to the current landscape of AI. Drawing from foundational work in critical data studies, along with relevant humanities scholarship, we elaborate eight claims with broad applicability to current conversations about generative AI: 1) Models make words, but people make meaning; 2) Generative AI requires an expanded definition of culture; 3) Generative AI can never be representative; 4) Bigger models are not always better models; 5) Not all training data is equivalent; 6) Openness is not an easy fix; 7) Limited access to compute enables corporate capture; and 8) AI universalism creates narrow human subjects. We conclude with a discussion of the importance of resisting the extraction of humanities research by computer science and related fields.

Submitted 26 February, 2025; originally announced February 2025.

Comments: working draft; final draft in preparation

ACM Class: I.2.0; K.4.0

arXiv:2501.10601 [pdf]

Understanding Computational Science and Domain Science Skills Development in National Laboratory Graduate Internships

Authors: Morgan M. Fong, Hilary Egan, Marc Day, Kristin Potter, Michael J. Martin

Abstract: Contribution: This study presents an evaluation of federally-funded graduate internship outcomes in computational science at a national laboratory. Additionally, we present a survey instrument that may be used for other internship programs with a similar focus. Background: There is ongoing demand for computational scientists to grapple with large-scale problems such as climate change. Internships may help provide additional training and access to greater compute capabilities for graduate students. However, little work has been done to quantify the learning outcomes of such internships. Research Questions: What computational skills, research skills, and professional skills do graduate students improve through their internships at NREL, the national laboratory selected for the study? What sustainability and renewable energy topics do graduate students gain more familiarity with through their internships at NREL? Do graduate students' career interests change after their internships at NREL? Methodology: We developed a survey and collected responses from past participants of five federally-funded internship programs and compared participant ratings of their prior experience to their internship experience. Findings: Our results indicate participants improve their computational skills, familiarity with sustainability and renewable energy topics, and are more interested in working at national labs. Additionally, participants go on to degree programs and positions related to sustainability and renewable energy after their internships.

Submitted 17 January, 2025; originally announced January 2025.

Comments: Submission to IEEE Transactions on Education pending

MSC Class: 97 ACM Class: K.3

arXiv:2412.15618 [pdf, other]

3D Shape Tokenization via Latent Flow Matching

Authors: Jen-Hao Rick Chang, Yuyang Wang, Miguel Angel Bautista Martin, Jiatao Gu, Xiaoming Zhao, Josh Susskind, Oncel Tuzel

Abstract: We introduce a latent 3D representation that models 3D surfaces as probability density functions in 3D, i.e., p(x,y,z), with flow-matching. Our representation is specifically designed for consumption by machine learning models, offering continuity and compactness by construction while requiring only point clouds and minimal data preprocessing. Despite being a data-driven method, our use of flow matching in the 3D space enables interesting geometry properties, including the capabilities to perform zero-shot estimation of surface normal and deformation field. We evaluate with several machine learning tasks, including 3D-CLIP, unconditional generative models, single-image conditioned generative model, and intersection-point estimation. Across all experiments, our models achieve competitive performance to existing baselines, while requiring less preprocessing and auxiliary information from training data.

Submitted 24 March, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

arXiv:2412.12355 [pdf]

doi 10.1109/MCSE.2025.3549359

Integrating Energy-Efficient Computing Research to Accelerate Energy Technology

Authors: Michael James Martin, Aaron Andersen, Charles Tripp, David Sickinger, Kristin Munch

Abstract: NREL's computational sciences center hosts the largest high-performance computing (HPC) capabilities dedicated to energy research while functioning as a living laboratory for energy-efficient computing. NREL's HPC capabilities support the research needs of the Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE). In ten years of operation, HPC use in EERE-sponsored research has grown by a factor of 30, including work in electricity generation, energy efficiency, transportation, and energy system modeling. This paper analyzes this research portfolio, providing examples of individual use cases. The paper documents NREL's history of operating one of the world's most energy-efficient data centers while examining pathways to reduce economic and environmental impact beyond reduction of Power Usage Efficiency (PUE). This paper concludes by examining the unique opportunities created for accelerating improvements in data center efficiency created by combining an HPC system dedicated to energy research and a research program in energy-efficient computing.

Submitted 28 March, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

Comments: Invited submission to IEEE Computing in Science and Engineering

MSC Class: 00-02 ACM Class: K.4; J.2

arXiv:2412.06095 [pdf, other]

Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance

Authors: Fermin Moscoso del Prado Martin

Abstract: In many fields, such as language acquisition, neuropsychology of language, the study of aging, and historical linguistics, corpora are used for estimating the diversity of grammatical structures that are produced during a period by an individual, community, or type of speakers. In these cases, treebanks are taken as representative samples of the syntactic structures that might be encountered. Generalizing the potential syntactic diversity from the structures documented in a small corpus requires careful extrapolation whose accuracy is constrained by the limited size of representative sub-corpora. In this article, I demonstrate -- theoretically, and empirically -- that a grammar's derivational entropy and the mean length of the utterances (MLU) it generates are fundamentally linked, giving rise to a new measure, the derivational entropy rate. The mean length of utterances becomes the most practical index of syntactic complexity; I demonstrate that MLU is not a mere proxy, but a fundamental measure of syntactic diversity. In combination with the new derivational entropy rate measure, it provides a theory-free assessment of grammatical complexity. The derivational entropy rate indexes the rate at which different grammatical annotation frameworks determine the grammatical complexity of treebanks. I introduce the Smoothed Induced Treebank Entropy (SITE) as a tool for estimating these measures accurately, even from very small treebanks. I conclude by discussing important implications of these results for both NLP and human language processing.

Submitted 8 December, 2024; originally announced December 2024.

arXiv:2411.09772 [pdf, other]

Beyond Static Tools: Evaluating Large Language Models for Cryptographic Misuse Detection

Authors: Zohaib Masood, Miguel Vargas Martin

Abstract: The use of Large Language Models (LLMs) in software development is rapidly growing, with developers increasingly relying on these models for coding assistance, including security-critical tasks. Our work presents a comprehensive comparison between traditional static analysis tools for cryptographic API misuse detection (CryptoGuard, CogniCrypt, and Snyk Code) and the LLMs GPT and Gemini. Using benchmark datasets (OWASP, CryptoAPI, and MASC), we evaluate the effectiveness of each tool in identifying cryptographic misuses. Our findings show that GPT 4-o-mini surpasses current state-of-the-art static analysis tools on the CryptoAPI and MASC datasets, though it lags on the OWASP dataset. Additionally, we assess the quality of LLM responses to determine which models provide actionable and accurate advice, giving developers insights into their practical utility for secure coding. This study highlights the comparative strengths and limitations of static analysis versus LLM-driven approaches, offering valuable insights into the evolving role of AI in advancing software security practices.

Submitted 14 November, 2024; originally announced November 2024.

arXiv:2407.04811 [pdf, other]

Simplifying Deep Temporal Difference Learning

Authors: Matteo Gallici, Mattie Fellows, Benjamin Ellis, Bartomeu Pou, Ivan Masmitja, Jakob Nicolaus Foerster, Mario Martin

Abstract: Q-learning played a foundational role in the field of reinforcement learning (RL). However, TD algorithms with off-policy data, such as Q-learning, or nonlinear function approximation like deep neural networks require several additional tricks to stabilise training, primarily a large replay buffer and target networks. Unfortunately, the delayed updating of frozen network parameters in the target network harms the sample efficiency and, similarly, the large replay buffer introduces memory and implementation overheads. In this paper, we investigate whether it is possible to accelerate and simplify off-policy TD training while maintaining its stability. Our key theoretical result demonstrates for the first time that regularisation techniques such as LayerNorm can yield provably convergent TD algorithms without the need for a target network or replay buffer, even with off-policy data. Empirically, we find that online, parallelised sampling enabled by vectorised environments stabilises training without the need for a large replay buffer. Motivated by these findings, we propose PQN, our simplified deep online Q-Learning algorithm. Surprisingly, this simple algorithm is competitive with more complex methods like: Rainbow in Atari, PPO-RNN in Craftax, QMix in Smax, and can be up to 50x faster than traditional DQN without sacrificing sample efficiency. In an era where PPO has become the go-to RL algorithm, PQN reestablishes off-policy Q-learning as a viable alternative.

Submitted 21 April, 2025; v1 submitted 5 July, 2024; originally announced July 2024.
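
As a rough illustration of the ingredients the abstract above names (a LayerNorm-regularised Q-network, no target network, and a batch that comes straight from parallel environments instead of a replay buffer), the following NumPy sketch computes one-step Q-learning targets. It is not the PQN implementation: the network shape, the function names, and the plain one-step target are all assumptions made for this example.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalise each row to zero mean and unit variance (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def q_values(obs, w_hidden, w_out):
    # Hypothetical two-layer Q-network: linear -> LayerNorm -> ReLU -> linear.
    hidden = np.maximum(layer_norm(obs @ w_hidden), 0.0)
    return hidden @ w_out


def td_targets(rewards, dones, next_obs, w_hidden, w_out, gamma=0.99):
    # One-step Q-learning target, bootstrapped from the *same* weights
    # that are being trained (no frozen target network).
    next_q = q_values(next_obs, w_hidden, w_out)
    return rewards + gamma * (1.0 - dones) * next_q.max(axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_envs, obs_dim, hidden_dim, num_actions = 4, 3, 8, 2
    w_hidden = rng.normal(size=(obs_dim, hidden_dim))
    w_out = rng.normal(size=(hidden_dim, num_actions))
    # One transition per parallel environment; no replay buffer involved.
    next_obs = rng.normal(size=(num_envs, obs_dim))
    rewards = np.ones(num_envs)
    dones = np.zeros(num_envs)
    print(td_targets(rewards, dones, next_obs, w_hidden, w_out))
```

Each row of the batch corresponds to one vectorised environment, so the targets are computed and consumed in the same step they were sampled.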

arXiv:2407.04184 [pdf, other]

QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024

Authors: Zeyun Zhong, Manuel Martin, Frederik Diederichs, Juergen Beyerer

Abstract: This report presents a novel Mamba-based encoder-decoder architecture, QueryMamba, featuring an integrated verb-noun interaction module that utilizes a statistical verb-noun co-occurrence matrix to enhance video action forecasting. This architecture not only predicts verbs and nouns likely to occur based on historical data but also considers their joint occurrence to improve forecast accuracy. The efficacy of this approach is substantiated by experimental results, with the method achieving second place in the Ego4D LTA challenge and ranking first in noun prediction accuracy.

Submitted 4 July, 2024; originally announced July 2024.
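
The abstract above does not say how the co-occurrence matrix enters the model, so the sketch below shows just one plausible use: counting (verb, noun) pairs in training annotations and using the smoothed counts as a prior that reweights independently predicted verb and noun probabilities. The function names, the additive smoothing, and the multiplicative reweighting are assumptions for illustration, not the QueryMamba module.

```python
import numpy as np


def cooccurrence_matrix(pairs, num_verbs, num_nouns):
    # Count how often each (verb index, noun index) pair was annotated.
    counts = np.zeros((num_verbs, num_nouns))
    for verb, noun in pairs:
        counts[verb, noun] += 1
    return counts


def joint_scores(verb_probs, noun_probs, counts, smoothing=1.0):
    # Turn smoothed counts into a prior over (verb, noun) pairs, then
    # reweight the independent prediction verb_probs x noun_probs with it.
    prior = counts + smoothing
    prior = prior / prior.sum()
    joint = np.outer(verb_probs, noun_probs) * prior
    return joint / joint.sum()


if __name__ == "__main__":
    # Toy vocabulary: 2 verbs, 2 nouns; pair (0, 1) is the most frequent.
    counts = cooccurrence_matrix([(0, 1), (0, 1), (1, 0)], num_verbs=2, num_nouns=2)
    uniform = np.array([0.5, 0.5])
    print(joint_scores(uniform, uniform, counts))
```

With uniform verb and noun predictions, the reweighted scores simply follow the prior, so frequently co-occurring pairs rank above pairs that were never observed together.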

arXiv:2406.00225 [pdf, other]

Kinematic Model of Magnetic Domain Wall Motion for Fast, High-Accuracy Simulations

Authors: Kristi Doleh, Leonard Humphrey, Chandler M. Linseisen, Michael D. Kitcher, Joanna M. Martin, Can Cui, Jean Anne C. Incorvia, Felipe Garcia-Sanchez, Naimul Hassan, Alexander J. Edwards, Joseph S. Friedman

Abstract: Domain wall (DW) devices have garnered recent interest for diverse applications including memory, logic, and neuromorphic primitives; fast, accurate device models are therefore imperative for large-scale system design and verification. Extant DW motion models are sub-optimal for large-scale system design either over-consuming compute resources with physics-heavy equations or oversimplifying the physics, drastically reducing model accuracy. We propose a DW model inspired by the phenomenological similarities between motions of a DW and a classical object being acted on by forces like air resistance or static friction. Our proposed phenomenological model predicts DW motion within 1.2% on average compared with micromagnetic simulations that are 400 times slower. Additionally our model is seven times faster than extant collective coordinate models and 14 times more accurate than extant hyper-reduced models making it an essential tool for large-scale DW circuit design and simulation. The model is publicly posted along with scripts that automatically extract model parameters from user-provided simulation or experimental data to extend the model to alternative micromagnetic parameters.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.09733 [pdf, other]

SCI 3.0: A Web-based Schema Curation Interface for Graphical Event Representations

Authors: Reece Suchocki, Mary Martin, Martha Palmer, Susan Brown

Abstract: To understand the complexity of global events, one must navigate a web of interwoven sub-events, identifying those most impactful elements within the larger, abstract macro-event framework at play. This concept can be extended to the field of natural language processing (NLP) through the creation of structured event schemas which can serve as representations of these abstract events. Central to our approach is the Schema Curation Interface 3.0 (SCI 3.0), a web application that facilitates real-time editing of event schema properties within a generated graph e.g., adding, removing, or editing sub-events, entities, and relations directly through an interface.

Submitted 16 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2403.07632 [pdf]

CardioGenAI: A Machine Learning-Based Framework for Re-Engineering Drugs for Reduced hERG Liability

Authors: Gregory W. Kyro, Matthew T. Martin, Eric D. Watt, Victor S. Batista

Abstract: The link between in vitro hERG ion channel inhibition and subsequent in vivo QT interval prolongation, a critical risk factor for the development of arrhythmias such as Torsade de Pointes, is so well established that in vitro hERG activity alone is often sufficient to end the development of an otherwise promising drug candidate. It is therefore of tremendous interest to develop advanced methods for identifying hERG-active compounds in the early stages of drug development, as well as for proposing redesigned compounds with reduced hERG liability and preserved on-target potency. In this work, we present CardioGenAI, a machine learning-based framework for re-engineering both developmental and commercially available drugs for reduced hERG activity while preserving their pharmacological activity. The framework incorporates novel state-of-the-art discriminative models for predicting hERG channel activity, as well as activity against the voltage-gated NaV1.5 and CaV1.2 channels due to their potential implications in modulating the arrhythmogenic potential induced by hERG channel blockade. We applied the complete framework to pimozide, an FDA-approved antipsychotic agent that demonstrates high affinity to the hERG channel, and generated 100 refined candidates. Remarkably, among the candidates is fluspirilene, a compound which is of the same class of drugs (diphenylmethanes) as pimozide and therefore has similar pharmacological activity, yet exhibits over 700-fold weaker binding to hERG. We envision that this method can effectively be applied to developmental compounds exhibiting hERG liabilities to provide a means of rescuing drug development programs that have stalled due to hERG-related safety concerns. We have made all of our software open-source to facilitate integration of the CardioGenAI framework for molecular hypothesis generation into drug discovery workflows.

Submitted 6 August, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.15552 [pdf, other]

doi 10.1177/02783649241282422

Morphological Symmetries in Robotics

Authors: Daniel Ordoñez-Apraez, Giulio Turrisi, Vladimir Kostic, Mario Martin, Antonio Agudo, Francesc Moreno-Noguer, Massimiliano Pontil, Claudio Semini, Carlos Mastalli

Abstract: We present a comprehensive framework for studying and leveraging morphological symmetries in robotic systems. These are intrinsic properties of the robot's morphology, frequently observed in animal biology and robotics, which stem from the replication of kinematic structures and the symmetrical distribution of mass. We illustrate how these symmetries extend to the robot's state space and both proprioceptive and exteroceptive sensor measurements, resulting in the equivariance of the robot's equations of motion and optimal control policies. Thus, we recognize morphological symmetries as a relevant and previously unexplored physics-informed geometric prior, with significant implications for both data-driven and analytical methods used in modeling, control, estimation and design in robotics. For data-driven methods, we demonstrate that morphological symmetries can enhance the sample efficiency and generalization of machine learning models through data augmentation, or by applying equivariant/invariant constraints on the model's architecture. In the context of analytical methods, we employ abstract harmonic analysis to decompose the robot's dynamics into a superposition of lower-dimensional, independent dynamics. We substantiate our claims with both synthetic and real-world experiments conducted on bipedal and quadrupedal robots. Lastly, we introduce the repository MorphoSymm to facilitate the practical use of the theory and applications outlined in this work.

Submitted 24 March, 2025; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: 18 pages, 11 figures

MSC Class: 68T40 ACM Class: I.2.9

Journal ref: International Journal of Robotics Research, vol. 0, no. 0, pp. 1-22, 2025

arXiv:2401.06821 [pdf, other]

Surrogate Neural Networks Local Stability for Aircraft Predictive Maintenance

Authors: Mélanie Ducoffe, Guillaume Povéda, Audrey Galametz, Ryma Boumazouza, Marion-Cécile Martin, Julien Baris, Derk Daverschot, Eugene O'Higgins

Abstract: Surrogate Neural Networks are nowadays routinely used in industry as substitutes for computationally demanding engineering simulations (e.g., in structural analysis). They allow faster predictions, and thus faster analyses, in industrial applications, e.g., during product design, testing, or monitoring phases. Due to their performance and time-efficiency, these surrogate models are now being developed for use in safety-critical applications. Neural network verification, and in particular the assessment of their robustness (e.g., to perturbations), is the next critical step to allow their inclusion in real-life applications and certification. We assess the applicability and scalability of empirical and formal methods in the context of aircraft predictive maintenance for surrogate neural networks designed to predict the stress sustained by an aircraft part from external loads. The case study covers a high-dimensional input and output space and the verification process thus accommodates multi-objective constraints. We explore the complementarity of verification methods in assessing the local stability property of such surrogate models to input noise. We showcase the effectiveness of sequentially combining methods in one verification 'pipeline' and demonstrate the subsequent gain in runtime required to assess the targeted property.

Submitted 24 July, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Comments: Peer-reviewed and accepted at the 29th International Conference on Formal Methods for Industrial Critical Systems (FMICS 2024) - 15 pages

arXiv:2312.07787 [pdf, other]

doi 10.1109/ACCESS.2020.2987483

INRISCO: INcident monitoRing In Smart COmmunities

Authors: Mónica Aguilar Igartua, Florina Almenares, Rebeca P. Díaz Redondo, Manuela I. Martín, Jordi Forné, Celeste Campo, Ana Fernández, Luis J. de la Cruz, Carlos García-Rubio, Andrés Marínn, Ahmad Mohamad Mezher, Daniel Díaz, Héctor Cerezo, David Rebollo-Monedero, Patricia Arias, Francisco Rico

Abstract: Major advances in information and communication technologies (ICTs) allow citizens to be considered as sensors in motion. Carrying their mobile devices, moving in their connected vehicles or actively participating in social networks, citizens provide a wealth of information that, after proper processing, can support numerous applications for the benefit of the community. In the context of smart communities, the INRISCO proposal aims at (i) the early detection of abnormal situations in cities (i.e., incidents), (ii) the analysis of whether, according to their impact, those incidents are really adverse for the community; and (iii) the automatic actuation by dissemination of appropriate information to citizens and authorities. Thus, INRISCO will identify and report on incidents in traffic (jam, accident) or public infrastructure (e.g., works, street cut), the occurrence of specific events that affect other citizens' lives (e.g., demonstrations, concerts), or environmental problems (e.g., pollution, bad weather). Of particular interest to this proposal is the identification of incidents with a social and economic impact, which affect the quality of life of citizens.

Submitted 12 December, 2023; originally announced December 2023.

Journal ref: IEEE Access, vol. 8, 2020

arXiv:2311.18259 [pdf, other]

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Authors: Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain , et al. (76 additional authors not shown)

Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from 1 to 42 minutes each and 1,286 hours of video combined. The multimodal nature of the dataset is unprecedented: the video is accompanied by multichannel audio, eye gaze, 3D point clouds, camera poses, IMU, and multiple paired language descriptions -- including a novel "expert commentary" done by coaches and teachers and tailored to the skilled-activity domain. To push the frontier of first-person video understanding of skilled human activity, we also present a suite of benchmark tasks and their annotations, including fine-grained activity understanding, proficiency estimation, cross-view translation, and 3D hand/body pose. All resources are open sourced to fuel new research in the community. Project page: http://ego-exo4d-data.org/

Submitted 25 September, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

Comments: Expanded manuscript (compared to arxiv v1 from Nov 2023 and CVPR 2024 paper from June 2024) for more comprehensive dataset and benchmark presentation, plus new results on v2 data release

arXiv:2311.15991 [pdf, other]

DiffAnt: Diffusion Models for Action Anticipation

Authors: Zeyun Zhong, Chengzhi Wu, Manuel Martin, Michael Voit, Juergen Gall, Jürgen Beyerer

Abstract: Anticipating future actions is inherently uncertain. Given an observed video segment containing ongoing actions, multiple subsequent actions can plausibly follow. This uncertainty becomes even larger when predicting far into the future. However, the majority of existing action anticipation models adhere to a deterministic approach, neglecting to account for future uncertainties. In this work, we rethink action anticipation from a generative view, employing diffusion models to capture different possible future actions. In this framework, future actions are iteratively generated from standard Gaussian noise in the latent space, conditioned on the observed video, and subsequently transitioned into the action space. Extensive experiments on four benchmark datasets, i.e., Breakfast, 50Salads, EpicKitchens, and EGTEA Gaze+, are performed and the proposed method achieves superior or comparable results to state-of-the-art methods, showing the effectiveness of a generative approach for action anticipation. Our code and trained models will be published on GitHub.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.12841 [pdf, other]

Tool Wear Segmentation in Blanking Processes with Fully Convolutional Networks based Digital Image Processing

Authors: Clemens Schlegel, Dirk Alexander Molitor, Christian Kubik, Daniel Michael Martin, Peter Groche

Abstract: The extent of tool wear significantly affects blanking processes and has a decisive impact on product quality and productivity. For this reason, numerous scientists have addressed their research to wear monitoring systems in order to identify or even predict critical wear at an early stage. Existing approaches are mainly based on indirect monitoring using time series, which are used to detect critical wear states via thresholds or machine learning models. Nevertheless, differentiation between types of wear phenomena affecting the tool during blanking as well as quantification of worn surfaces is still limited in practice. While time series data provides partial insights into wear occurrence and evolution, direct monitoring techniques utilizing image data offer a more comprehensive perspective and increased robustness when dealing with varying process parameters. However, acquiring and processing this data in real-time is challenging. In particular, high dynamics combined with increasing stroke rates as well as the high dimensionality of image data have so far prevented the development of direct image-based monitoring systems. For this reason, this paper demonstrates how high-resolution images of tools at 600 spm can be captured and subsequently processed using semantic segmentation deep learning algorithms, more precisely Fully Convolutional Networks (FCN). 125,000 images of the tool are taken from successive strokes, and microscope images are captured to investigate the worn surfaces. Based on findings from the microscope images, selected images are labeled pixel by pixel according to their wear condition and used to train an FCN (U-Net).

Submitted 6 October, 2023; originally announced November 2023.

Report number: PtU-23-10

arXiv:2310.20254 [pdf]

Artificial Intelligence for reverse engineering: application to detergents using Raman spectroscopy

Authors: Pedro Marote, Marie Martin, Anne Bonhomme, Pierre Lantéri, Yohann Clément

Abstract: The reverse engineering of a complex mixture, regardless of its nature, has become significant today. Being able to quickly assess the potential toxicity of new commercial products in relation to the environment presents a genuine analytical challenge. The development of digital tools (databases, chemometrics, machine learning, etc.) and analytical techniques (Raman spectroscopy, NIR spectroscopy, mass spectrometry, etc.) will allow for the identification of potential toxic molecules. In this article, we use the example of detergent products, whose composition can prove dangerous to humans or the environment, necessitating precise identification and quantification for quality control and regulation purposes. The combination of various digital tools (spectral database, mixture database, experimental design, chemometrics / machine learning algorithms, ...) together with different sample preparation methods (raw sample, or several concentrated / diluted samples) and Raman spectroscopy has enabled the identification of the mixture's constituents and an estimation of its composition. Implementing such strategies across different analytical tools can result in time savings for pollutant identification and contamination assessment in various matrices. This strategy is also applicable in the industrial sector for product or raw material control, as well as for quality control purposes.

Submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.16909 [pdf, other]

doi 10.1038/s41928-024-01303-z

Neuromorphic weighted sum with magnetic skyrmions

Authors: Tristan da Câmara Santa Clara Gomes, Yanis Sassi, Dédalo Sanz-Hernández, Sachin Krishnia, Sophie Collin, Marie-Blandine Martin, Pierre Seneor, Vincent Cros, Julie Grollier, Nicolas Reyren

Abstract: Integrating magnetic skyrmion properties into neuromorphic computing promises advancements in hardware efficiency and computational power. However, a scalable implementation of the weighted sum of neuron signals, a core operation in neural networks, has yet to be demonstrated. In this study, we exploit the non-volatile and particle-like characteristics of magnetic skyrmions, akin to synaptic vesicles and neurotransmitters, to perform this weighted sum operation in a compact, biologically-inspired manner. To this aim, skyrmions are electrically generated in numbers proportional to the input with an efficiency given by a non-volatile weight. These chiral particles are then directed using localized current injections to a location where their presence is quantified through non-perturbative electrical measurements. Our experimental demonstration, currently with two inputs, can be scaled to accommodate multiple inputs and outputs using a crossbar array design, potentially nearing the energy efficiency observed in biological systems.

Submitted 25 October, 2023; originally announced October 2023.

Comments: 12 pages, 5 figures

Journal ref: Nat. Electron. (2025)

arXiv:2310.13341 [pdf, ps, other]

Regular packing of rooted hyperforests with root constraints in hypergraphs

Authors: Pierre Hoppenot, Mathis Martin, Zoltán Szigeti

Abstract: The seminal papers of Edmonds \cite{Egy}, Nash-Williams \cite{NW} and Tutte \cite{Tu} have laid the foundations of the theories of packing arborescences and packing trees. The directed version has been extensively investigated, resulting in a great number of generalizations. In contrast, the undirected version has been marginally considered. The aim of this paper is to further develop the theories of packing trees and forests. Our main result on graphs characterizes the existence of a packing of $k$ forests, $F_1, \ldots, F_k$, in a graph $G$ such that each vertex of $G$ belongs to exactly $h$ of the forests, and in addition, each $F_i$ has between $\ell(i)$ and $\ell'(i)$ connected components and the total number of connected components in the packing is between $α$ and $β$. Finally, we extend this result to hypergraphs and dypergraphs, the latter giving a generalization of a theorem of Bérczi and Frank \cite{BF3}.

Submitted 25 November, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: 18 pages

arXiv:2309.17257 [pdf, other]

A Survey on Deep Learning Techniques for Action Anticipation

Authors: Zeyun Zhong, Manuel Martin, Michael Voit, Juergen Gall, Jürgen Beyerer

Abstract: The ability to anticipate possible future human actions is essential for a wide range of applications, including autonomous driving and human-robot interaction. Consequently, numerous methods have been introduced for action anticipation in recent years, with deep learning-based approaches being particularly popular. In this work, we review the recent advances of action anticipation algorithms with a particular focus on daily-living scenarios. Additionally, we classify these methods according to their primary contributions and summarize them in tabular form, allowing readers to grasp the details at a glance. Furthermore, we delve into the common evaluation metrics and datasets used for action anticipation and provide future directions with systematic discussions.

Submitted 29 September, 2023; originally announced September 2023.

Comments: Submitted to TPAMI

arXiv:2308.16622 [pdf, other]

Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering

Authors: Lars-Peter Meyer, Johannes Frey, Kurt Junghanns, Felix Brei, Kirill Bulert, Sabine Gründer-Fahrer, Michael Martin

Abstract: As the field of Large Language Models (LLMs) evolves at an accelerated pace, the critical need to assess and monitor their performance emerges. We introduce a benchmarking framework focused on knowledge graph engineering (KGE) accompanied by three challenges addressing syntax and error correction, facts extraction and dataset generation. We show that while being a useful tool, LLMs are yet unfit to assist in knowledge graph generation with zero-shot prompting. Consequently, our LLM-KG-Bench framework provides automatic evaluation and storage of LLM responses as well as statistical data and visualization tools to support tracking of prompt engineering and model performance.

Submitted 31 August, 2023; originally announced August 2023.

Comments: To be published in SEMANTICS 2023 poster track proceedings. SEMANTICS 2023 EU: 19th International Conference on Semantic Systems, September 20-22, 2023, Leipzig, Germany

arXiv:2308.04965 [pdf, other]

doi 10.1111/itor.13358

Comparative analysis of mathematical formulations for the two-dimensional guillotine cutting problem

Authors: Henrique Becker, Mateus Martin, Olinto Araujo, Luciana S. Buriol, Reinaldo Morabito

Abstract: About ten years ago, a paper proposed the first integer linear programming formulation for the constrained two-dimensional guillotine cutting problem (with unlimited cutting stages). Since then, six other formulations followed, five of them in the last two years. This spike of interest gave no opportunity for a comprehensive comparison between the formulations. We review each formulation and compare their empirical results over instance datasets of the literature. We adapt most formulations to allow for piece rotation. The possibility of adaptation was already predicted but not realized by the prior work. The results show the dominance of pseudo-polynomial formulations until the point instances become intractable by them, while more compact formulations keep achieving good primal solutions. Our study also reveals a small but consistent advantage of the Gurobi solver over the CPLEX solver in our context; that the choice of solver hardly benefits one formulation over another; and a mistake in the generation of the T instances, which should have the same optima with or without guillotine cuts. Our study also proposes hybridising the most recent formulation with a prior formulation for a restricted version of the problem. The hybridisations show a reduction of about 20% of the branch-and-bound time thanks to the symmetries broken by the hybridisation.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 23 pages, 7 tables, 3 figures

MSC Class: 90-02 ACM Class: G.2.0

arXiv:2307.06917 [pdf, ps, other]

doi 10.1007/978-3-658-43705-3_8

LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT

Authors: Lars-Peter Meyer, Claus Stadler, Johannes Frey, Norman Radtke, Kurt Junghanns, Roy Meissner, Gordian Dziwis, Kirill Bulert, Michael Martin

Abstract: Knowledge Graphs (KG) provide us with a structured, flexible, transparent, cross-system, and collaborative way of organizing our knowledge and data across various domains in society and industrial as well as scientific disciplines. KGs surpass any other form of representation in terms of effectiveness. However, Knowledge Graph Engineering (KGE) requires in-depth experience with graph structures, web technologies, existing models and vocabularies, rule sets, logic, as well as best practices. It also demands a significant amount of work. Considering the advancements in large language models (LLMs) and their interfaces and applications in recent years, we have conducted comprehensive experiments with ChatGPT to explore its potential in supporting KGE. In this paper, we present a selection of these experiments and their results to demonstrate how ChatGPT can assist us in the development and management of KGs.

Submitted 13 July, 2023; originally announced July 2023.

Comments: to appear in conference proceedings of AI-Tomorrow-23, 29.+30.6.2023 in Leipzig, Germany

Journal ref: Informatik aktuell. First Working Conference on Artificial Intelligence Development for a Resilient and Sustainable Tomorrow 2023. AIDRST 2023. p. 103-115

arXiv:2305.01971 [pdf, other]

doi 10.1038/s41597-023-02749-0

District-scale surface temperatures generated from high-resolution longitudinal thermal infrared images

Authors: Subin Lin, Vasantha Ramani, Miguel Martin, Pandarasamy Arjunan, Adrian Chong, Filip Biljecki, Marcel Ignatius, Kameshwar Poolla, Clayton Miller

Abstract: The paper describes a dataset that was collected by infrared thermography, which is a non-contact, non-intrusive technique to collect data and analyze the built environment in various aspects. While most studies focus on the city and building scales, the rooftop observatory provides high temporal and spatial resolution observations with dynamic interactions on the district scale. The rooftop infrared thermography observatory with a multi-modal platform that is capable of assessing a wide range of dynamic processes in urban systems was deployed in Singapore. It was placed on the top of two buildings that overlook the outdoor context of the campus of the National University of Singapore. The platform collects remote sensing data from tropical areas on a temporal scale, allowing users to determine the temperature trend of individual features such as buildings, roads, and vegetation. The dataset includes 1,365,921 thermal images collected on average at approximately 10-second intervals from two locations over ten months.

Submitted 12 December, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

Journal ref: Sci Data 10, 859 (2023)

arXiv:2304.09919 [pdf, other]

The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages

Authors: Vesa Akerman, David Baines, Damien Daspit, Ulf Hermjakob, Taeho Jang, Colin Leong, Michael Martin, Joel Mathew, Jonathan Robie, Marcus Schwarting

Abstract: Efficiently and accurately translating a corpus into a low-resource language remains a challenge, regardless of the strategies employed, whether manual, automated, or a combination of the two. Many Christian organizations are dedicated to the task of translating the Holy Bible into languages that lack a modern translation. Bible translation (BT) work is currently underway for over 3000 extremely low resource languages. We introduce the eBible corpus: a dataset containing 1009 translations of portions of the Bible with data in 833 different languages across 75 language families. In addition to a BT benchmarking dataset, we introduce model performance benchmarks built on the No Language Left Behind (NLLB) neural machine translation (NMT) models. Finally, we describe several problems specific to the domain of BT and consider how the established data and model benchmarks might be used for future translation efforts. For a BT task trained with NLLB, Austronesian and Trans-New Guinea language families achieve 35.1 and 31.6 BLEU scores respectively, which spurs future innovations for NMT for low-resource languages in Papua New Guinea.

Submitted 19 April, 2023; originally announced April 2023.

arXiv:2304.03103 [pdf, other]

doi 10.1145/3583780.3615497

Retention Is All You Need

Authors: Karishma Mohiuddin, Mirza Ariful Alam, Mirza Mohtashim Alam, Pascal Welke, Michael Martin, Jens Lehmann, Sahar Vahdati

Abstract: Skilled employees are the most important pillars of an organization. Despite this, most organizations face high attrition and turnover rates. While several machine learning models have been developed to analyze attrition and its causal factors, the interpretations of those models remain opaque. In this paper, we propose the HR-DSS approach, which stands for Human Resource (HR) Decision Support System, and uses explainable AI for employee attrition problems. The system is designed to assist HR departments in interpreting the predictions provided by machine learning models. In our experiments, we employ eight machine learning models to provide predictions. We further process the results achieved by the best-performing model by the SHAP explainability process and use the SHAP values to generate natural language explanations which can be valuable for HR. Furthermore, using "What-if-analysis", we aim to observe plausible causes for attrition of an individual employee. The results show that by adjusting the specific dominant features of each individual, employee attrition can turn into employee retention through informative business decisions.

Submitted 26 August, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

Comments: Accepted at CIKM 2023 Applied Research Track

arXiv:2303.00795 [pdf, other]

Improved Segmentation of Deep Sulci in Cortical Gray Matter Using a Deep Learning Framework Incorporating Laplace's Equation

Authors: Sadhana Ravikumar, Ranjit Ittyerah, Sydney Lim, Long Xie, Sandhitsu Das, Pulkit Khandelwal, Laura E. M. Wisse, Madigan L. Bedard, John L. Robinson, Terry Schuck, Murray Grossman, John Q. Trojanowski, Edward B. Lee, M. Dylan Tisdall, Karthik Prabhakaran, John A. Detre, David J. Irwin, Winifred Trotman, Gabor Mizsei, Emilio Artacho-Pérula, Maria Mercedes Iñiguez de Onzono Martin, Maria del Mar Arroyo Jiménez, Monica Muñoz, Francisco Javier Molina Romero, Maria del Pilar Marcos Rabal , et al. (7 additional authors not shown)

Abstract: When developing tools for automated cortical segmentation, the ability to produce topologically correct segmentations is important in order to compute geometrically valid morphometry measures. In practice, accurate cortical segmentation is challenged by image artifacts and the highly convoluted anatomy of the cortex itself. To address this, we propose a novel deep learning-based cortical segmentation method in which prior knowledge about the geometry of the cortex is incorporated into the network during the training process. We design a loss function which uses the theory of Laplace's equation applied to the cortex to locally penalize unresolved boundaries between tightly folded sulci. Using an ex vivo MRI dataset of human medial temporal lobe specimens, we demonstrate that our approach outperforms baseline segmentation networks, both quantitatively and qualitatively.

Submitted 3 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: Accepted at the 28th biennial international conference on Information Processing in Medical Imaging (IPMI 2023)

arXiv:2302.12120 [pdf, other]

Sequential Counterfactual Risk Minimization

Authors: Houssam Zenati, Eustache Diemert, Matthieu Martin, Julien Mairal, Pierre Gaillard

Abstract: Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data. In this paper, we explore the case where it is possible to deploy learned policies multiple times and acquire new data. We extend the CRM principle and its theory to this scenario, which we call "Sequential Counterfactual Risk Minimization (SCRM)." We introduce a novel counterfactual estimator and identify conditions that can improve the performance of CRM in terms of excess risk and regret rates, by using an analysis similar to restart strategies in accelerated optimization methods. We also provide an empirical evaluation of our method in both discrete and continuous action settings, and demonstrate the benefits of multiple deployments of CRM.

Submitted 25 May, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: To appear at ICML23

arXiv:2302.10433 [pdf, other]

doi 10.15607/RSS.2023.XIX.053

On discrete symmetries of robotics systems: A group-theoretic and data-driven analysis

Authors: Daniel Ordonez-Apraez, Mario Martin, Antonio Agudo, Francesc Moreno-Noguer

Abstract: We present a comprehensive study on discrete morphological symmetries of dynamical systems, which are commonly observed in biological and artificial locomoting systems, such as legged, swimming, and flying animals/robots/virtual characters. These symmetries arise from the presence of one or more planes/axes of symmetry in the system's morphology, resulting in harmonious duplication and distribution of body parts. Significantly, we characterize how morphological symmetries extend to symmetries in the system's dynamics, optimal control policies, and in all proprioceptive and exteroceptive measurements related to the system's dynamics evolution. In the context of data-driven methods, symmetry represents an inductive bias that justifies the use of data augmentation or symmetric function approximators. To tackle this, we present a theoretical and practical framework for identifying the system's morphological symmetry group $\G$ and characterizing the symmetries in proprioceptive and exteroceptive data measurements. We then exploit these symmetries using data augmentation and $\G$-equivariant neural networks. Our experiments on both synthetic and real-world applications provide empirical evidence of the advantageous outcomes resulting from the exploitation of these symmetries, including improved sample efficiency, enhanced generalization, and reduction of trainable parameters.

Submitted 7 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: 8 pages, 4 figures, 7 optional appendix pages, 4 appendix figures

MSC Class: 37J15; ACM Class: J.2

Journal ref: Robotics: Science and Systems 2023

arXiv:2302.00129 [pdf, other]

Universal Topological Regularities of Syntactic Structures: Decoupling Efficiency from Optimization

Authors: Fermín Moscoso del Prado Martín

Abstract: Human syntactic structures are usually represented as graphs. Much research has focused on the mapping between such graphs and linguistic sequences, but less attention has been paid to the shapes of the graphs themselves: their topologies. This study investigates how the topologies of syntactic graphs reveal traces of the processes that led to their emergence. I report a new universal regularity in syntactic structures: Their topology is communicatively efficient above chance. The pattern holds, without exception, for all 124 languages studied, across linguistic families and modalities (spoken, written, and signed). This pattern can arise from a process optimizing for communicative efficiency or, alternatively, by construction, as a by-effect of a sublinear preferential attachment process reflecting language production mechanisms known from psycholinguistics. This dual explanation shows how communicative efficiency, per se, does not require optimization. Among the two options, efficiency without optimization offers the better explanation for the new pattern.

Submitted 31 January, 2023; originally announced February 2023.

Comments: 30 pages, 7 figures

arXiv:2301.06863 [pdf, other]

doi 10.1109/CASE49997.2022.9926499

A reinforcement learning path planning approach for range-only underwater target localization with autonomous vehicles

Authors: Ivan Masmitja, Mario Martin, Kakani Katija, Spartacus Gomariz, Joan Navarro

Abstract: Underwater target localization using range-only and single-beacon (ROSB) techniques with autonomous vehicles has been used recently to overcome the limitations of more complex methods, such as long baseline and ultra-short baseline systems. Nonetheless, in ROSB target localization methods, the trajectory of the tracking vehicle near the localized target plays an important role in obtaining the best accuracy of the predicted target position. Here, we investigate a Reinforcement Learning (RL) approach to find the optimal path that an autonomous vehicle should follow in order to increase and optimize the overall accuracy of the predicted target localization, while reducing time and power consumption. To accomplish this objective, different experimental tests have been designed using state-of-the-art deep RL algorithms. Our study also compares the results obtained with the analytical Fisher information matrix approach used in previous studies. The results revealed that the policy learned by the RL agent outperforms trajectories based on these analytical solutions, e.g. the median predicted error at the beginning of the target's localisation is 17% less. These findings suggest that using deep RL for localizing acoustic targets could be successfully applied to in-water applications that include tracking of acoustically tagged marine animals by autonomous underwater vehicles. This is envisioned as a first necessary step to validate the use of RL to tackle such problems, which could be used later on in more complex scenarios.

Submitted 17 January, 2023; originally announced January 2023.

Comments: Accepted at CASE2022. Code at this Github repository https://github.com/imasmitja/RLforUTracking

Journal ref: IEEE 18th International Conference on Automation Science and Engineering (CASE), Mexico City, Mexico, 2022, pp. 675-682

arXiv:2301.05334 [pdf]

TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems

Authors: Matteo Gallici, Mario Martin, Ivan Masmitja

Abstract: Coordination is one of the most difficult aspects of multi-agent reinforcement learning (MARL). One reason is that agents normally choose their actions independently of one another. In order to see coordination strategies emerging from the combination of independent policies, recent research has focused on the use of a centralized function (CF) that learns each agent's contribution to the team reward. However, the structure in which the environment is presented to the agents and to the CF is typically overlooked. We have observed that the features used to describe the coordination problem can be represented as vertex features of a latent graph structure. Here, we present TransfQMix, a new approach that uses transformers to leverage this latent structure and learn better coordination policies. Our transformer agents perform graph reasoning over the state of the observable entities. Our transformer Q-mixer learns a monotonic mixing function from a larger graph that includes the internal and external states of the agents. TransfQMix is designed to be entirely transferable, meaning that the same parameters can be used to control and train larger or smaller teams of agents. This makes it possible to deploy promising approaches to save training time and derive general policies in MARL, such as transfer learning, zero-shot transfer, and curriculum learning. We report TransfQMix's performance in the Spread and StarCraft II environments. In both settings, it outperforms state-of-the-art Q-Learning models, and it demonstrates effectiveness in solving problems that other methods cannot solve.

Submitted 12 January, 2023; originally announced January 2023.

Comments: Accepted at AAMAS 2023. Code at https://github.com/mttga/pymarl_transformers

arXiv:2211.09288 [pdf, other]

doi 10.1016/j.enbuild.2023.112997

Longitudinal thermal imaging for scalable non-residential HVAC and occupant behaviour characterization

Authors: Vasantha Ramani, Miguel Martin, Pandarasamy Arjunan, Adrian Chong, Kameshwar Poolla, Clayton Miller

Abstract: This work presents a study on the characterization of the air-conditioning (AC) usage pattern of non-residential buildings from thermal images collected from an urban-scale infrared (IR) observatory. To achieve this, an image processing scheme for cleaning and extracting the temperature time series from the thermal images is first implemented. To test the accuracy of the thermal measurements using an IR camera, the extracted temperature is compared against the ground truth surface temperature measurements. It is observed that the detrended thermal measurements match well with the ground truth surface temperature measurements. Subsequently, the operational patterns of the water-cooled systems and window AC units are extracted from the analysis of the thermal signature. It is observed that for the water-cooled system, the difference between the rate of change of the window and wall can be used to extract the operational pattern. In the case of the window AC units, by contrast, a wavelet transform of the AC unit temperature is used to extract the frequency and time domain information of the AC unit operation. The results of the analysis are compared against the indoor temperature sensors installed in the office spaces of the building. It is found that the accuracy in the prediction of the operational pattern is highest between 8 pm and 10 am, and it reduces during the day because of solar radiation and high daytime temperature. Subsequently, a characterization study is conducted for eight window/split AC units from the thermal images collected during the nighttime. This forms one of the first studies on the operational behavior of HVAC systems for non-residential buildings using the longitudinal thermal imaging technique. The output from this study can be used to better understand the operational and occupant behavior, without requiring the deployment of a large array of sensors in the building space.

Submitted 20 March, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Journal ref: Energy and Buildings, Volume 287, 2023, 112997

arXiv:2211.04545 [pdf, ps, other]

doi 10.1090/conm/795/15967

Voting on Cyclic Orders, Group Theory, and Ballots

Authors: Karl-Dieter Crisman, Abraham Holleran, Micah Martin, Josephine Noonan

Abstract: A cyclic order may be thought of informally as a way to seat people around a table, perhaps for a game of chance or for dinner. Given a set of agents such as $\{A,B,C\}$, we can formalize this by defining a cyclic order as a permutation or linear order on this finite set, under the equivalence relation where $A\succ B\succ C$ is identified with both $B\succ C\succ A$ and $C\succ A\succ B$. As with other collections of sets with some structure, we might want to aggregate preferences of a (possibly different) set of voters on the set of possible ways to choose a cyclic order. However, given the combinatorial explosion of the number of full rankings of cyclic orders, one may not wish to use the usual voting machinery. This raises the question of what sort of ballots may be appropriate; a single cyclic order, a set of them, or some other ballot type? Further, there is a natural action of the group of permutations on the set of agents. A reasonable requirement for a choice procedure would be to respect this symmetry (the equivalent of neutrality in normal voting theory). In this paper we will exploit the representation theory of the symmetric group to analyze several natural types of ballots for voting on cyclic orders, and points-based procedures using such ballots. We provide a full characterization of such procedures for two quite different ballot types for $n=4$, along with the most important observations for $n=5$.

Submitted 8 November, 2022; originally announced November 2022.

Comments: 29 pages, to be published in conference proceedings from AMS Special Session on The Mathematics of Decisions, Elections and Games, 2022

MSC Class: 91B12 (Primary) 91B14; 20C05 (Secondary)
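
Illustrative note on the entry above: the equivalence relation described in the abstract (seatings that differ only by rotation are the same cyclic order) can be sketched in a few lines of Python. This is a minimal sketch, not code from the paper; the function names `canonical_cyclic_order` and `cyclic_orders` are hypothetical, and agents are assumed to be distinct and comparable.

```python
from itertools import permutations


def canonical_cyclic_order(seating):
    """Rotate a seating so its smallest agent comes first.

    Two seatings that are rotations of one another get the same
    canonical form, so the form identifies the cyclic order.
    Assumes the agents are distinct and comparable (hypothetical helper).
    """
    seating = tuple(seating)
    start = seating.index(min(seating))
    return seating[start:] + seating[:start]


def cyclic_orders(agents):
    """Return the set of distinct cyclic orders on the given agents."""
    return {canonical_cyclic_order(p) for p in permutations(agents)}


# A > B > C, B > C > A and C > A > B all collapse to one cyclic order.
same = {canonical_cyclic_order(s) for s in ("ABC", "BCA", "CAB")}

orders_3 = cyclic_orders("ABC")
orders_4 = cyclic_orders("ABCD")
```

Under these assumptions, n agents yield (n-1)! cyclic orders, so there are 2 for three agents and 6 for four. A full ranking of the 6 cyclic orders for n=4 would already be one of 6! = 720 possibilities, which is the combinatorial explosion the abstract mentions.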

arXiv:2210.11663 [pdf, other]

doi 10.1016/j.enbuild.2024.113973

InfraRed Investigation in Singapore (IRIS) Observatory: Urban heat island contributors and mitigators analysis using neighborhood-scale thermal imaging

Authors: Miguel Martin, Vasantha Ramani, Clayton Miller

Abstract: This paper studies heat fluxes from contributors and mitigators of urban heat islands using thermal images and weather data. Thermal images were collected from an observatory operating on the rooftop of a building between November 2021 and April 2022. Over the same period, an automatic weather station network was used to measure weather conditions at several locations on a university campus in Singapore. From data collected by the observatory and the automatic weather station network, a method was developed to estimate the heat emitted by building facades, vegetation, and traffic. Before performing the analysis of urban heat fluxes, it was observed that the surface temperature collected from the observatory is sensitive to some variables. After the sensitivity analysis, thermal images were calibrated against measurements of the surface temperature in an outdoor environment. Finally, several contributors and mitigators of urban heat islands were analyzed from heat fluxes assessed with thermal images and weather data. According to thermal images collected by the rooftop observatory, concrete walls are an important contributor to urban heat islands due to the longwave radiation they emit at night. Vegetation, on the other hand, seems to be an effective mitigator because of latent heat fluxes generated by evapotranspiration. Traffic looks to be a negligible source of heat if considered over a small portion of a road. In the future, more efforts can be made to estimate the magnitude of the heat released by an air-conditioning system from thermal images.

Submitted 8 February, 2024; v1 submitted 20 October, 2022; originally announced October 2022.

Journal ref: Energy Build. 2024;307: 113973

arXiv:2210.10474 [pdf, other]

doi 10.1364/OE.478308

Video super-resolution for single-photon LIDAR

Authors: Germán Mora Martín, Stirling Scholes, Alice Ruget, Robert K. Henderson, Jonathan Leach, Istvan Gyongy

Abstract: 3D Time-of-Flight (ToF) image sensors are used widely in applications such as self-driving cars, Augmented Reality (AR) and robotics. When implemented with Single-Photon Avalanche Diodes (SPADs), compact, array format sensors can be made that offer accurate depth maps over long distances, without the need for mechanical scanning. However, array sizes tend to be small, leading to low lateral resolution, which combined with low Signal-to-Noise Ratio (SNR) levels under high ambient illumination, may lead to difficulties in scene interpretation. In this paper, we use synthetic depth sequences to train a 3D Convolutional Neural Network (CNN) for denoising and upscaling (x4) depth data. Experimental results, based on synthetic as well as real ToF data, are used to demonstrate the effectiveness of the scheme. With GPU acceleration, frames are processed at >30 frames per second, making the approach suitable for low-latency imaging, as required for obstacle avoidance.

Submitted 19 October, 2022; originally announced October 2022.

Comments: 18 pages, 10 figures, 3 tables

arXiv:2210.02544 [pdf, other]

Deep learning for ECoG brain-computer interface: end-to-end vs. hand-crafted features

Authors: Maciej Śliwowski, Matthieu Martin, Antoine Souloumiac, Pierre Blanchart, Tetiana Aksenova

Abstract: In brain signal processing, deep learning (DL) models have become commonly used. However, the performance gain from using end-to-end DL models compared to conventional ML approaches is usually significant but moderate, typically at the cost of increased computational load and deteriorated explainability. The core idea behind deep learning approaches is scaling the performance with bigger datasets. However, brain signals are temporal data with a low signal-to-noise ratio, uncertain labels, and nonstationary data in time. Those factors may influence the training process and slow down the models' performance improvement. These factors' influence may differ for an end-to-end DL model and one using hand-crafted features. As this has not been studied before, this paper compares models that use raw ECoG signal and time-frequency features for BCI motor imagery decoding. We investigate whether the current dataset size is a stronger limitation for any of the models. Finally, the obtained filters were compared to identify differences between hand-crafted features and those optimized with backpropagation. To compare the effectiveness of both strategies, we used a multilayer perceptron and a mix of convolutional and LSTM layers that had already proved effective in this task. The analysis was performed on the long-term clinical trial database (almost 600 minutes of recordings) of a tetraplegic patient executing motor imagery tasks for 3D hand translation. For a given dataset, the results showed that end-to-end training might not be significantly better than the hand-crafted features-based model. The performance gap is reduced with bigger datasets, but considering the increased computational load, end-to-end training may not be profitable for this application.

Submitted 12 October, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

Comments: Replaced duplicated plot in figure 7

arXiv:2209.11772 [pdf, other]

A direct time-of-flight image sensor with in-pixel surface detection and dynamic vision

Authors: Istvan Gyongy, Ahmet T. Erdogan, Neale A. W. Dutton, Germán Mora Martín, Alistair Gorman, Hanning Mai, Francesco Mattioli Della Rocca, Robert K. Henderson

Abstract: 3D flash LIDAR is an alternative to the traditional scanning LIDAR systems, promising precise depth imaging in a compact form factor, and free of moving parts, for applications such as self-driving cars, robotics and augmented reality (AR). Typically implemented using single-photon, direct time-of-flight (dToF) receivers in image sensor format, the operation of the devices can be hindered by the large number of photon events needing to be processed and compressed in outdoor scenarios, limiting frame rates and scalability to larger arrays. We here present a 64x32 pixel (256x128 SPAD) dToF imager that overcomes these limitations by using pixels with embedded histogramming, which lock onto and track the return signal. This reduces the size of output data frames considerably, enabling maximum frame rates in the 10 kFPS range or 100 kFPS for direct depth readings. The sensor offers selective readout of pixels detecting surfaces, or those sensing motion, leading to reduced power consumption and off-chip processing requirements. We demonstrate the application of the sensor in mid-range LIDAR.

Submitted 23 September, 2022; originally announced September 2022.

Comments: 24 pages, 16 figures. The visualisations may be viewed by clicking on the hyperlinks in the text

arXiv:2209.03789 [pdf, other]

Impact of dataset size and long-term ECoG-based BCI usage on deep learning decoders performance

Authors: Maciej Śliwowski, Matthieu Martin, Antoine Souloumiac, Pierre Blanchart, Tetiana Aksenova

Abstract: In brain-computer interfaces (BCI) research, recording data is time-consuming and expensive, which limits access to big datasets. This may influence the BCI system performance as machine learning methods depend strongly on the training dataset size. Important questions arise: taking into account neuronal signal characteristics (e.g., non-stationarity), can we achieve higher decoding performance with more data to train decoders? What is the perspective for further improvement with time in the case of long-term BCI studies? In this study, we investigated the impact of long-term recordings on motor imagery decoding from two main perspectives: model requirements regarding dataset size and potential for patient adaptation. We evaluated the multilinear model and two deep learning (DL) models on a long-term BCI and Tetraplegia NCT02550522 clinical trial dataset containing 43 sessions of ECoG recordings performed with a tetraplegic patient. In the experiment, a participant executed 3D virtual hand translation using motor imagery patterns. We designed multiple computational experiments in which training datasets were increased or translated to investigate the relationship between models' performance and different factors influencing recordings. Our analysis showed that adding more data to the training dataset may not instantly increase performance for datasets already containing 40 minutes of the signal. DL decoders showed similar requirements regarding the dataset size compared to the multilinear model while demonstrating higher decoding performance. Moreover, high decoding performance was obtained with relatively small datasets recorded later in the experiment, suggesting motor imagery patterns improvement and patient adaptation. Finally, we proposed UMAP embeddings and local intrinsic dimensionality as a way to visualize the data and potentially evaluate data quality.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2208.06946 [pdf, other]

Targeted Honeyword Generation with Language Models

Authors: Fangyi Yu, Miguel Vargas Martin

Abstract: Honeywords are fictitious passwords inserted into databases in order to identify password breaches. The major difficulty is how to produce honeywords that are difficult to distinguish from real passwords. Although the generation of honeywords has been widely investigated in the past, the majority of existing research assumes attackers have no knowledge of the users. These honeyword generating techniques (HGTs) may utterly fail if attackers exploit users' personally identifiable information (PII) and the real passwords include users' PII. In this paper, we propose to build a more secure and trustworthy authentication system that employs off-the-shelf pre-trained language models which require no further training on real passwords to produce honeywords while retaining the PII of the associated real password, therefore significantly raising the bar for attackers. We conducted a pilot experiment in which individuals are asked to distinguish between authentic passwords and honeywords when the username is provided for GPT-3 and a tweaking technique. Results show that it is extremely difficult to distinguish the real passwords from the artificial ones for both techniques. We speculate that a larger sample size could reveal a significant difference between the two HGT techniques, favouring our proposed approach.

Submitted 23 August, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

Comments: 8 pages, 7 tables, 2 figures

ACM Class: I.2

arXiv:2208.06943 [pdf, other]

doi 10.1109/EuroSPW55150.2022.00009

GNPassGAN: Improved Generative Adversarial Networks For Trawling Offline Password Guessing

Authors: Fangyi Yu, Miguel Vargas Martin

Abstract: The security of passwords depends on a thorough understanding of the strategies used by attackers. Unfortunately, real-world adversaries use pragmatic guessing tactics like dictionary attacks, which are difficult to simulate in password security research. Dictionary attacks must be carefully configured and modified to represent an actual threat. This approach, however, needs domain-specific knowledge and expertise that are difficult to duplicate. This paper reviews various deep learning-based password guessing approaches that do not require domain knowledge or assumptions about users' password structures and combinations. It also introduces GNPassGAN, a password guessing tool built on generative adversarial networks for trawling offline attacks. In comparison to the state-of-the-art PassGAN model, GNPassGAN is capable of guessing 88.03% more passwords and generating 31.69% fewer duplicates.

Submitted 14 August, 2022; originally announced August 2022.

Comments: 9 pages, 8 tables, 3 figures

ACM Class: I.2

Journal ref: 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2022, pp. 10-18

arXiv:2206.09348 [pdf, other]

Nested bandits

Authors: Matthieu Martin, Panayotis Mertikopoulos, Thibaud Rahier, Houssam Zenati

Abstract: In many online decision processes, the optimizing agent is called to choose between large numbers of alternatives with many inherent similarities; in turn, these similarities imply closely correlated losses that may confound standard discrete choice models and bandit algorithms. We study this question in the context of nested bandits, a class of adversarial multi-armed bandit problems where the learner seeks to minimize their regret in the presence of a large number of distinct alternatives with a hierarchy of embedded (non-combinatorial) similarities. In this setting, optimal algorithms based on the exponential weights blueprint (like Hedge, EXP3, and their variants) may incur significant regret because they tend to spend excessive amounts of time exploring irrelevant alternatives with similar, suboptimal costs. To account for this, we propose a nested exponential weights (NEW) algorithm that performs a layered exploration of the learner's set of alternatives based on a nested, step-by-step selection method. In so doing, we obtain a series of tight bounds for the learner's regret showing that online learning problems with a high degree of similarity between alternatives can be resolved efficiently, without a red bus / blue bus paradox occurring.

Submitted 19 June, 2022; originally announced June 2022.

Comments: 35 pages, 14 figures; to appear in ICML 2022

MSC Class: Primary 68Q32; secondary 91B06

arXiv:2206.02848 [pdf]

Plagiarism deterrence for introductory programming

Authors: Simon J. Cohen, Michael J. Martin, Chance A. Shipley, Abhishek Kumar, Andrew R. Cohen

Abstract: Plagiarism in introductory programming courses is an enormous challenge for both students and institutions. For students, relying on the work of others too early in their academic development can make it impossible to acquire necessary skills for independent success in the future. For institutions, widespread student cheating can dilute the quality of the educational experience being offered. Currently available solutions consider only pairwise comparisons between student submissions and focus on punitive deterrence. Our approach instead relies on a class-wide statistical characterization that can be clearly and securely shared with students via an intuitive new p-value representing independence of student effort. A pairwise, compression-based similarity detection algorithm captures relationships between assignments more accurately. An automated deterrence system is used to warn students that their behavior is being closely monitored. High-confidence instances are made directly available for instructor review using our open-source toolkit. An unbiased scoring system aids students and the instructor in understanding true independence of effort. Preliminary results indicate that the system can provide meaningful measurements of independence from week one, improving the efficacy of technical education.

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2202.05638 [pdf, other]

Efficient Kernel UCB for Contextual Bandits

Authors: Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard

Abstract: In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a O(CT^3) complexity where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nystrom approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2) where m is the number of Nystrom points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of order of the effective dimension of the problem, which is at most O(\sqrt{T}) and nearly constant in some cases.

Submitted 11 February, 2022; originally announced February 2022.

Comments: To appear at AISTATS2022

arXiv:2201.03331 [pdf, other]

Fiuncho: a program for any-order epistasis detection in CPU clusters

Authors: Christian Ponte-Fernández, Jorge González-Domínguez, María J. Martín

Abstract: Epistasis can be defined as the statistical interaction of genes during the expression of a phenotype. It is believed that it plays a fundamental role in gene expression, as individual genetic variants have reported a very small increase in disease risk in previous Genome-Wide Association Studies. The most successful approach to epistasis detection is the exhaustive method, although its exponential time complexity requires a highly parallel implementation in order to be used. This work presents Fiuncho, a program that exploits all levels of parallelism present in x86_64 CPU clusters in order to mitigate the complexity of this approach. It supports epistasis interactions of any order, and when compared with other exhaustive methods, it is on average 358, 7 and 3 times faster than MDR, MPI3SNP and BitEpi, respectively.

Submitted 8 March, 2022; v1 submitted 10 January, 2022; originally announced January 2022.

Comments: Submitted to The Journal of Supercomputing. Source code available at https://github.com/UDC-GAC/fiuncho

Showing 1–50 of 93 results for author: Martín, M