Showing 1–50 of 95 results for author: Martin, P

Searching in archive cs.
  1. arXiv:2505.04388  [pdf, ps, other]

    cs.CL cs.AI

    The Aloe Family Recipe for Open and Specialized Healthcare LLMs

    Authors: Dario Garcia-Gasulla, Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés

    Abstract: Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which in…

    Submitted 28 May, 2025; v1 submitted 7 May, 2025; originally announced May 2025.

    Comments: Follow-up work from arXiv:2405.01886

  2. arXiv:2504.10826  [pdf, other]

    cs.SD cs.MM eess.AS

    SteerMusic: Enhanced Musical Consistency for Zero-shot Text-Guided and Personalized Music Editing

    Authors: Xinlei Niu, Kin Wai Cheuk, Jing Zhang, Naoki Murata, Chieh-Hsin Lai, Michele Mancusi, Woosung Choi, Giorgio Fabbro, Wei-Hsiang Liao, Charles Patrick Martin, Yuki Mitsufuji

    Abstract: Music editing is an important step in music production, which has broad applications, including game development and film production. Most existing zero-shot text-guided methods rely on pretrained diffusion models by involving forward-backward diffusion processes for editing. However, these methods often struggle to maintain the music content consistency. Additionally, text instructions alone usua…

    Submitted 14 April, 2025; originally announced April 2025.

  3. Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis

    Authors: Alexandre Bazin, Alain Gutierrez, Marianne Huchard, Pierre Martin, Yulin Zhang

    Abstract: A widely used Agile practice for requirements is to produce a set of user stories (also called "agile product backlog"), which roughly includes a list of pairs (role, feature), where the role handles the feature for a certain purpose. In the context of Software Product Lines, the requirements for a family of similar systems is thus a family of user-story sets, one per system, leading to a 3-dime…

    Submitted 11 April, 2025; originally announced April 2025.

    Comments: 20th International Conference on Evaluation of Novel Approaches to Software Engineering April 4-6, 2025, in Porto, Portugal

    Journal ref: Proceedings of ENASE 2025; SciTePress, pages 618-625 (2025)

  4. arXiv:2504.00012  [pdf, other]

    cs.CR cs.CY cs.LG

    I'm Sorry Dave: How the old world of personnel security can inform the new world of AI insider risk

    Authors: Paul Martin, Sarah Mercer

    Abstract: Organisations are rapidly adopting artificial intelligence (AI) tools to perform tasks previously undertaken by people. The potential benefits are enormous. Separately, some organisations deploy personnel security measures to mitigate the security risks arising from trusted human insiders. Unfortunately, there is no meaningful interplay between the rapidly evolving domain of AI and the traditional…

    Submitted 5 April, 2025; v1 submitted 26 March, 2025; originally announced April 2025.

  5. arXiv:2502.13196  [pdf]

    cs.MM cs.CV

    GS-QA: Comprehensive Quality Assessment Benchmark for Gaussian Splatting View Synthesis

    Authors: Pedro Martin, António Rodrigues, João Ascenso, Maria Paula Queluz

    Abstract: Gaussian Splatting (GS) offers a promising alternative to Neural Radiance Fields (NeRF) for real-time 3D scene rendering. Using a set of 3D Gaussians to represent complex geometry and appearance, GS achieves faster rendering times and reduced memory consumption compared to the neural network approach used in NeRF. However, quality assessment of GS-generated static content is not yet explored in-de…

    Submitted 16 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

  6. arXiv:2502.06666  [pdf, other]

    cs.CL cs.AI

    Automatic Evaluation of Healthcare LLMs Beyond Question-Answering

    Authors: Anna Arias-Duart, Pablo Agustin Martin-Torres, Daniel Hinjos, Pablo Bernabeu-Perez, Lucia Urcelay Ganzabal, Marta Gonzalez Mallo, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Sergio Alvarez-Napagao, Dario Garcia-Gasulla

    Abstract: Current Large Language Models (LLMs) benchmarks are often based on open-ended or close-ended QA evaluations, avoiding the requirement of human labor. Close-ended measurements evaluate the factuality of responses but lack expressiveness. Open-ended capture the model's capacity to produce discourse responses but are harder to assess for correctness. These two approaches are commonly used, either ind…

    Submitted 10 February, 2025; originally announced February 2025.

  7. arXiv:2502.02523  [pdf, other]

    cs.LG

    Brief analysis of DeepSeek R1 and its implications for Generative AI

    Authors: Sarah Mercer, Samuel Spillard, Daniel P. Martin

    Abstract: In late January 2025, DeepSeek released their new reasoning model (DeepSeek R1); which was developed at a fraction of the cost yet remains competitive with OpenAI's models, despite the US's GPU export ban. This report discusses the model, and what its release means for the field of Generative AI more widely. We briefly discuss other models released from China in recent weeks, their similarities; i…

    Submitted 7 February, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

  8. arXiv:2412.06095  [pdf, other]

    cs.CL cs.FL cs.IT

    Measuring Grammatical Diversity from Small Corpora: Derivational Entropy Rates, Mean Length of Utterances, and Annotation Invariance

    Authors: Fermin Moscoso del Prado Martin

    Abstract: In many fields, such as language acquisition, neuropsychology of language, the study of aging, and historical linguistics, corpora are used for estimating the diversity of grammatical structures that are produced during a period by an individual, community, or type of speakers. In these cases, treebanks are taken as representative samples of the syntactic structures that might be encountered. Gene…

    Submitted 8 December, 2024; originally announced December 2024.

  9. arXiv:2412.04820  [pdf, other]

    cs.RO cs.HC

    Assessing Similarity Measures for the Evaluation of Human-Robot Motion Correspondence

    Authors: Charles Dietzel, Patrick J. Martin

    Abstract: One key area of research in Human-Robot Interaction is solving the human-robot correspondence problem, which asks how a robot can learn to reproduce a human motion demonstration when the human and robot have different dynamics and kinematic structures. Evaluating these correspondence problem solutions often requires the use of qualitative surveys that can be time consuming to design and administer…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: 8 pages, 4 figures

  10. arXiv:2410.19459  [pdf]

    cs.MM cs.CV eess.IV

    Evaluation of strategies for efficient rate-distortion NeRF streaming

    Authors: Pedro Martin, António Rodrigues, João Ascenso, Maria Paula Queluz

    Abstract: Neural Radiance Fields (NeRF) have revolutionized the field of 3D visual representation by enabling highly realistic and detailed scene reconstructions from a sparse set of images. NeRF uses a volumetric functional representation that maps 3D points to their corresponding colors and opacities, allowing for photorealistic view synthesis from arbitrary viewpoints. Despite its advancements, the effic…

    Submitted 25 October, 2024; originally announced October 2024.

  11. arXiv:2410.02144  [pdf, other]

    cs.SD cs.LG eess.AS

    SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model

    Authors: Xinlei Niu, Jing Zhang, Charles Patrick Martin

    Abstract: We present SoundMorpher, an open-world sound morphing method designed to generate perceptually uniform morphing trajectories. Traditional sound morphing techniques typically assume a linear relationship between the morphing factor and sound perception, achieving smooth transitions by linearly interpolating the semantic features of source and target sounds while gradually adjusting the morphing fac…

    Submitted 16 December, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

  12. Tonal Cognition in Sonification: Exploring the Needs of Practitioners in Sonic Interaction Design

    Authors: Minsik Choi, Josh Andres, Charles Patrick Martin

    Abstract: Research into tonal music examines the structural relationships among sounds and how they align with our auditory perception. The exploration of integrating tonal cognition into sonic interaction design, particularly for practitioners lacking extensive musical knowledge, and developing an accessible software tool, remains limited. We report on a study of designers to understand the sound creation…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: To be published in: Proceedings of the 19th Audio Mostly Conference: A Conference on Explorations in Sonic Cultures, Milan, Italy, 2024

  13. arXiv:2406.18571  [pdf, other]

    cs.CV

    UltraCortex: Submillimeter Ultra-High Field 9.4 T Brain MR Image Collection and Manual Cortical Segmentations

    Authors: Lucas Mahler, Julius Steiglechner, Benjamin Bender, Tobias Lindig, Dana Ramadan, Jonas Bause, Florian Birk, Rahel Heule, Edyta Charyasz, Michael Erb, Vinod Jangir Kumar, Gisela E Hagberg, Pascal Martin, Gabriele Lohmann, Klaus Scheffler

    Abstract: The UltraCortex repository (https://www.ultracortex.org) houses magnetic resonance imaging data of the human brain obtained at an ultra-high field strength of 9.4 T. It contains 86 structural MR images with spatial resolutions ranging from 0.6 to 0.8 mm. Additionally, the repository includes segmentations of 12 brains into gray and white matter compartments. These segmentations have been independe…

    Submitted 9 January, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

  14. arXiv:2405.20078  [pdf]

    cs.MM

    NeRF View Synthesis: Subjective Quality Assessment and Objective Metrics Evaluation

    Authors: Pedro Martin, Antonio Rodrigues, Joao Ascenso, Maria Paula Queluz

    Abstract: Neural radiance fields (NeRF) are a groundbreaking computer vision technology that enables the generation of high-quality, immersive visual content from multiple viewpoints. This capability has significant advantages for applications such as virtual/augmented reality, 3D modelling, and content creation for the film and entertainment industry. However, the evaluation of NeRF methods poses several c…

    Submitted 27 September, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  15. arXiv:2405.15338  [pdf, other]

    cs.SD eess.AS

    SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation

    Authors: Xinlei Niu, Jing Zhang, Christian Walder, Charles Patrick Martin

    Abstract: We present SoundLoCD, a novel text-to-sound generation framework, which incorporates a LoRA-based conditional discrete contrastive latent diffusion model. Unlike recent large-scale sound generation models, our model can be efficiently trained under limited computational resources. The integration of a contrastive learning strategy further enhances the connection between text conditions and the gen…

    Submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.04592  [pdf]

    cs.LG

    Integrating knowledge-guided symbolic regression and model-based design of experiments to automate process flow diagram development

    Authors: Alexander W. Rogers, Amanda Lane, Cesar Mendoza, Simon Watson, Adam Kowalski, Philip Martin, Dongda Zhang

    Abstract: New products must be formulated rapidly to succeed in the global formulated product market; however, key product indicators (KPIs) can be complex, poorly understood functions of the chemical composition and processing history. Consequently, scale-up must currently undergo expensive trial-and-error campaigns. To accelerate process flow diagram (PFD) optimisation and knowledge discovery, this work p…

    Submitted 7 May, 2024; originally announced May 2024.

  17. arXiv:2405.01886  [pdf, other]

    cs.CL cs.AI

    Aloe: A Family of Fine-tuned Open Healthcare LLMs

    Authors: Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Jordi Bayarri-Planas, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Lucia Urcelay-Ganzabal, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés, Dario Garcia-Gasulla

    Abstract: As the capabilities of Large Language Models (LLMs) in healthcare and medicine continue to advance, there is a growing need for competitive open-source models that can safeguard public interest. With the increasing availability of highly competitive open base models, the impact of continued pre-training is increasingly uncertain. In this work, we explore the role of instruct tuning, model merging,…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Five appendices

  18. HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts

    Authors: Xinlei Niu, Jing Zhang, Charles Patrick Martin

    Abstract: We introduce HybridVC, a voice conversion (VC) framework built upon a pre-trained conditional variational autoencoder (CVAE) that combines the strengths of a latent model with contrastive learning. HybridVC supports text and audio prompts, enabling more flexible voice style conversion. HybridVC models a latent distribution conditioned on speaker embeddings acquired by a pretrained speaker encoder…

    Submitted 24 September, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: Proceedings of Interspeech

    Journal ref: Proc. Interspeech 2024, 4368-4372

  19. arXiv:2404.12356  [pdf, other]

    stat.ML cs.LG cs.SI

    Improving the interpretability of GNN predictions through conformal-based graph sparsification

    Authors: Pablo Sanchez-Martin, Kinaan Aamir Khan, Isabel Valera

    Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in solving graph classification tasks. However, most GNN architectures aggregate information from all nodes and edges in a graph, regardless of their relevance to the task at hand, thus hindering the interpretability of their predictions. In contrast to prior work, in this paper we propose a GNN training approach that j…

    Submitted 18 April, 2024; originally announced April 2024.

  20. arXiv:2403.17776  [pdf, other]

    cs.SI cs.HC stat.ME

    Exploring the Boundaries of Ambient Awareness in Twitter

    Authors: Pablo Sanchez-Martin, Sonja Utz, Isabel Valera

    Abstract: Ambient awareness refers to the ability of social media users to obtain knowledge about who knows what (i.e., users' expertise) in their network, by simply being exposed to other users' content (e.g., tweets on Twitter). Previous work, based on user surveys, reveals that individuals self-report ambient awareness only for parts of their networks. However, it is unclear whether it is their limited co…

    Submitted 26 March, 2024; originally announced March 2024.

  21. arXiv:2309.08048  [pdf, other]

    cs.CV cs.AI

    Padding Aware Neurons

    Authors: Dario Garcia-Gasulla, Victor Gimenez-Abalos, Pablo Martin-Torres

    Abstract: Convolutional layers are a fundamental component of most image-related models. These layers often implement by default a static padding policy (e.g. zero padding), to control the scale of the internal representations, and to allow kernel activations centered on the border regions. In this work we identify Padding Aware Neurons (PANs), a type of filter that is found in most (if not all) convolutiona…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: In 4th Visual Inductive Priors for Data-Efficient Deep Learning Workshop, ICCV 2023

  22. arXiv:2309.03671  [pdf, other]

    cs.CV cs.AI cs.LG

    Dataset Generation and Bonobo Classification from Weakly Labelled Videos

    Authors: Pierre-Etienne Martin

    Abstract: This paper presents a bonobo detection and classification pipeline built from the commonly used machine learning methods. Such application is motivated by the need to test bonobos in their enclosure using touch screen devices without human assistance. This work introduces a newly acquired dataset based on bonobo recordings generated semi-automatically. The recordings are weakly labelled and fed to…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: IntelliSys 2023 paper

  23. arXiv:2308.02534  [pdf, other]

    cs.CV cs.AI

    Exploring the Role of Explainability in AI-Assisted Embryo Selection

    Authors: Lucia Urcelay, Daniel Hinjos, Pablo A. Martin-Torres, Marta Gonzalez, Marta Mendez, Salva Cívico, Sergio Álvarez-Napagao, Dario Garcia-Gasulla

    Abstract: In Vitro Fertilization is among the most widespread treatments for infertility. One of its main challenges is the evaluation and selection of embryo for implantation, a process with large inter- and intra-clinician variability. Deep learning based methods are gaining attention, but their opaque nature compromises their acceptance in the clinical context, where transparency in the decision making i…

    Submitted 1 August, 2023; originally announced August 2023.

  24. arXiv:2306.11840  [pdf, ps, other]

    cs.DC

    A C++20 Interface for MPI 4.0

    Authors: Ali Can Demiralp, Philipp Martin, Niko Sakic, Marcel Krüger, Tim Gerrits

    Abstract: We present a modern C++20 interface for MPI 4.0. The interface utilizes recent language features to ease development of MPI applications. An aggregate reflection system enables generation of MPI data types from user-defined classes automatically. Immediate and persistent operations are mapped to futures, which can be chained to describe sequential asynchronous operations and task graphs in a conci…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: To appear in SC '22: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

  25. arXiv:2306.02568  [pdf, other]

    stat.ML cs.LG

    Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming

    Authors: Xinlei Niu, Christian Walder, Jing Zhang, Charles Patrick Martin

    Abstract: We propose the stochastic optimal path which solves the classical optimal path problem by a probability-softening solution. This unified approach transforms a wide range of DP problems into directed acyclic graphs in which all paths follow a Gibbs distribution. We show the equivalence of the Gibbs distribution to a message-passing algorithm by the properties of the Gumbel distribution and give all…

    Submitted 25 June, 2024; v1 submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted by ICML 2024

  26. arXiv:2305.03176  [pdf]

    cs.MM

    NeRF-QA: Neural Radiance Fields Quality Assessment Database

    Authors: Pedro Martin, António Rodrigues, João Ascenso, Maria Paula Queluz

    Abstract: This short paper proposes a new database - NeRF-QA - containing 48 videos synthesized with seven NeRF based methods, along with their perceived quality scores, resulting from subjective assessment tests; for the videos selection, both real and synthetic, 360 degrees scenes were considered. This database will allow to evaluate the suitability, to NeRF based synthesized views, of existing objective…

    Submitted 16 June, 2025; v1 submitted 4 May, 2023; originally announced May 2023.

  27. arXiv:2303.16960  [pdf, ps, other]

    math.PR cs.DM math.CO

    Boltzmann Distribution on "Short" Integer Partitions with Power Parts: Limit Laws and Sampling

    Authors: Jean C. Peyen, Leonid V. Bogachev, Paul P. Martin

    Abstract: The paper is concerned with the asymptotic analysis of a family of Boltzmann (multiplicative) distributions over the set $\check{\varLambda}^{q}$ of strict integer partitions (i.e., with unequal parts) into perfect $q$-th powers. A combinatorial link is provided via a suitable conditioning by fixing the partition weight (the sum of parts) and length (the number of parts), leading to uniform distri…

    Submitted 23 July, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: 62 pages, 5 figures, 4 tables

    MSC Class: 05A17 (Primary); 05A16; 60C05; 68Q87; 82B10 (Secondary)

    Journal ref: Advances in Applied Mathematics, Volume 159, August 2024, 102739

  28. arXiv:2302.02755  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Fine-Grained Action Detection with RGB and Pose Information using Two Stream Convolutional Networks

    Authors: Leonard Hacker, Finn Bartels, Pierre-Etienne Martin

    Abstract: As participants of the MediaEval 2022 Sport Task, we propose a two-stream network approach for the classification and detection of table tennis strokes. Each stream is a succession of 3D Convolutional Neural Network (CNN) blocks using attention mechanisms. Each stream processes different 4D inputs. Our method utilizes raw RGB data and pose information computed from MMPose toolbox. The pose informa…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Working note paper of the sport task of MediaEval 2022 in Bergen, Norway, 12-13 Jan 2023

  29. arXiv:2302.02752  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Baseline Method for the Sport Task of MediaEval 2022 with 3D CNNs using Attention Mechanisms

    Authors: Pierre-Etienne Martin

    Abstract: This paper presents the baseline method proposed for the Sports Video task part of the MediaEval 2022 benchmark. This task proposes two subtasks: stroke classification from trimmed videos, and stroke detection from untrimmed videos. This baseline addresses both subtasks. We propose two types of 3D-CNN architectures to solve the two subtasks. Both 3D-CNNs use Spatio-temporal convolutions and attent…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Baseline paper for the sport Task of MediaEval 2022

  30. arXiv:2302.00129  [pdf, other]

    cs.CL q-bio.NC

    Universal Topological Regularities of Syntactic Structures: Decoupling Efficiency from Optimization

    Authors: Fermín Moscoso del Prado Martín

    Abstract: Human syntactic structures are usually represented as graphs. Much research has focused on the mapping between such graphs and linguistic sequences, but less attention has been paid to the shapes of the graphs themselves: their topologies. This study investigates how the topologies of syntactic graphs reveal traces of the processes that led to their emergence. I report a new universal regularity i…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: 30 pages, 7 figures

  31. arXiv:2301.13576  [pdf, other]

    cs.AI cs.CV cs.HC cs.LG cs.MM

    Sport Task: Fine Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2022

    Authors: Pierre-Etienne Martin, Jordan Calandre, Boris Mansencal, Jenny Benois-Pineau, Renaud Péteri, Laurent Mascarilla, Julien Morlier

    Abstract: Sports video analysis is a widespread research topic. Its applications are very diverse, like events detection during a match, video summary, or fine-grained movement analysis of athletes. As part of the MediaEval 2022 benchmarking initiative, this task aims at detecting and classifying subtle movements from sport videos. We focus on recordings of table tennis matches. Conducted since 2019, this t…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: MediaEval 2022 Workshop, Jan 2023, Bergen, Norway. arXiv admin note: substantial text overlap with arXiv:2112.11384

  32. arXiv:2212.08484  [pdf, other]

    cs.NE cs.MA

    Emergent communication enhances foraging behaviour in evolved swarms controlled by Spiking Neural Networks

    Authors: Cristian Jimenez Romero, Alper Yegenoglu, Aarón Pérez Martín, Sandra Diaz-Pier, Abigail Morrison

    Abstract: Social insects such as ants communicate via pheromones which allows them to coordinate their activity and solve complex tasks as a swarm, e.g. foraging for food. This behavior was shaped through evolutionary processes. In computational models, self-coordination in swarms has been implemented using probabilistic or simple action rules to shape the decision of each agent and the collective behavior.…

    Submitted 8 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

    Comments: 27 pages, 16 figures

  33. arXiv:2210.09291  [pdf, other]

    cs.HC

    Embodying the Glitch: Perspectives on Generative AI in Dance Practice

    Authors: Benedikte Wallace, Charles P. Martin

    Abstract: What role does the break from realism play in the potential for generative artificial intelligence as a creative tool? Through exploration of glitch, we examine the prospective value of these artefacts in creative practice. This paper describes findings from an exploration of AI-generated "mistakes" when using movement produced by a generative deep learning model as an inspiration source in dance…

    Submitted 5 October, 2022; originally announced October 2022.

  34. arXiv:2209.14030  [pdf, other]

    cs.RO cs.CL cs.FL

    Monitoring ROS2: from Requirements to Autonomous Robots

    Authors: Ivan Perez, Anastasia Mavridou, Tom Pressburger, Alexander Will, Patrick J. Martin

    Abstract: Runtime verification (RV) has the potential to enable the safe operation of safety-critical systems that are too complex to formally verify, such as Robot Operating System 2 (ROS2) applications. Writing correct monitors can itself be complex, and errors in the monitoring subsystem threaten the mission as a whole. This paper provides an overview of a formal approach to generating runtime monitors f…

    Submitted 28 September, 2022; originally announced September 2022.

    Comments: In Proceedings FMAS2022 ASYDE2022, arXiv:2209.13181

    ACM Class: D.2.1; D.2.4; I.2.9;

    Journal ref: EPTCS 371, 2022, pp. 208-216

  35. arXiv:2208.02758  [pdf, other]

    cs.LG cs.MA math.DS math.NA

    Learning Interaction Variables and Kernels from Observations of Agent-Based Systems

    Authors: Jinchao Feng, Mauro Maggioni, Patrick Martin, Ming Zhong

    Abstract: Dynamical systems across many disciplines are modeled as interacting particles or agents, with interaction rules that depend on a very small number of variables (e.g. pairwise distances, pairwise differences of phases, etc...), functions of the state of pairs of agents. Yet, these interaction rules can generate self-organized dynamics, with complex emergent behaviors (clustering, flocking, swarmin…

    Submitted 4 August, 2022; originally announced August 2022.

  36. arXiv:2204.08460  [pdf, other]

    cs.CV cs.LG eess.IV

    3D Convolutional Networks for Action Recognition: Application to Sport Gesture Recognition

    Authors: Pierre-Etienne Martin, J Benois-Pineau, R Péteri, A Zemmari, J Morlier

    Abstract: 3D convolutional networks is a good means to perform tasks such as video segmentation into coherent spatio-temporal chunks and classification of them with regard to a target taxonomy. In the chapter we are interested in the classification of continuous video takes with repeatable actions, such as strokes of table tennis. Filmed in a free marker less ecological environment, these videos represent a…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: Multi-faceted Deep Learning, 2021

  37. Exploring hyper-parameter spaces of neuroscience models on high performance computers with Learning to Learn

    Authors: Alper Yegenoglu, Anand Subramoney, Thorsten Hater, Cristian Jimenez-Romero, Wouter Klijn, Aaron Perez Martin, Michiel van der Vlag, Michael Herty, Abigail Morrison, Sandra Diaz-Pier

    Abstract: Neuroscience models commonly have a high number of degrees of freedom and only specific regions within the parameter space are able to produce dynamics of interest. This makes the development of tools and strategies to efficiently find these regions of high importance to advance brain research. Exploring the high dimensional parameter space using numerical simulations has been a frequently used te…

    Submitted 28 February, 2022; originally announced February 2022.

  38. arXiv:2202.09977  [pdf, other]

    cs.LG

    RTGNN: A Novel Approach to Model Stochastic Traffic Dynamics

    Authors: Ke Sun, Stephen Chaves, Paul Martin, Vijay Kumar

    Abstract: Modeling stochastic traffic dynamics is critical to developing self-driving cars. Because it is difficult to develop first principle models of cars driven by humans, there is great potential for using data driven approaches in developing traffic dynamical models. While there is extensive literature on this subject, previous works mainly address the prediction accuracy of data-driven models. Moreov…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: Accepted by ICRA 2022

  39. arXiv:2112.12074  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Spatio-Temporal CNN baseline method for the Sports Video Task of MediaEval 2021 benchmark

    Authors: Pierre-Etienne Martin

    Abstract: This paper presents the baseline method proposed for the Sports Video task part of the MediaEval 2021 benchmark. This task proposes a stroke detection and a stroke classification subtasks. This baseline addresses both subtasks. The spatio-temporal CNN architecture and the training process of the model are tailored according to the addressed subtask. The method has the purpose of helping the partic…

    Submitted 16 December, 2021; originally announced December 2021.

    Journal ref: MediaEval 2021, Dec 2021, Online, Germany

  40. arXiv:2112.12073  [pdf, other]

    cs.CV cs.LG cs.MM

    Two Stream Network for Stroke Detection in Table Tennis

    Authors: Anam Zahra, Pierre-Etienne Martin

    Abstract: This paper presents a table tennis stroke detection method from videos. The method relies on a two-stream Convolutional Neural Network processing in parallel the RGB Stream and its computed optical flow. The method has been developed as part of the MediaEval 2021 benchmark for the Sport task. Our contribution did not outperform the provided baseline on the test set but has performed the best among…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: MediaEval 2021, Dec 2021, Online, Germany

  41. arXiv:2112.11384  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Sports Video: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2021

    Authors: Pierre-Etienne Martin, Jordan Calandre, Boris Mansencal, Jenny Benois-Pineau, Renaud Péteri, Laurent Mascarilla, Julien Morlier

    Abstract: Sports video analysis is a prevalent research topic due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests up to analysis of athletes' performance. The Sports Video task is part of the MediaEval 2021 benchmark. This task tackles fine-grained action detection and classification from videos. The focus is on recordings of table tennis games. Ru…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: MediaEval 2021, Dec 2021, Online, Germany

  42. arXiv:2110.14690  [pdf, other]

    stat.ML cs.LG

    VACA: Design of Variational Graph Autoencoders for Interventional and Counterfactual Queries

    Authors: Pablo Sanchez-Martin, Miriam Rateike, Isabel Valera

    Abstract: In this paper, we introduce VACA, a novel class of variational graph autoencoders for causal inference in the absence of hidden confounders, when only observational data and the causal graph are available. Without making any parametric assumptions, VACA mimics the necessary properties of a Structural Causal Model (SCM) to provide a flexible and practical framework for approximating interventions (…

    Submitted 27 October, 2021; originally announced October 2021.

  43. arXiv:2109.14306  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG cs.MM

    Three-Stream 3D/1D CNN for Fine-Grained Action Classification and Segmentation in Table Tennis

    Authors: Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier

    Abstract: This paper proposes a fusion method of modalities extracted from video through a three-stream network with spatio-temporal and temporal convolutions for fine-grained action classification in sport. It is applied to TTStroke-21 dataset which consists of untrimmed videos of table tennis games. The goal is to detect and classify table tennis strokes in the videos, the first step of a bigger scheme ai…

    Submitted 29 September, 2021; originally announced September 2021.

    Comments: MMSports '21, October 20, 2021, Virtual Event, Oct 2021, Chengdu, China

  44. arXiv:2107.13386  [pdf, other]

    cs.AR

    SPOTS: An Accelerator for Sparse Convolutional Networks Leveraging Systolic General Matrix-Matrix Multiplication

    Authors: Mohammadreza Soltaniyeh, Richard P. Martin, Santosh Nagarakatte

    Abstract: This paper proposes a new hardware accelerator for sparse convolutional neural networks (CNNs) by building a hardware unit to perform the Image to Column (IM2COL) transformation of the input feature map coupled with a systolic array-based general matrix-matrix multiplication (GEMM) unit. Our design carefully overlaps the IM2COL transformation with the GEMM computation to maximize parallelism. We p…

    Submitted 24 November, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

    Comments: 24 pages

    Report number: Rutgers Department of Computer Science Technical Report DCS-TR-756

  45. arXiv:2105.06166  [pdf, ps, other]

    cs.DS

    The Dynamic k-Mismatch Problem

    Authors: Raphaël Clifford, Paweł Gawrychowski, Tomasz Kociumaka, Daniel P. Martin, Przemysław Uznański

    Abstract: The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length $m$ and all length-$m$ substrings of a given text of length $n\ge m$. We focus on the $k$-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold $k$. We assume $n\le 2m$ (in general, one can partition the text into overlapping…

    Submitted 28 March, 2022; v1 submitted 13 May, 2021; originally announced May 2021.

  46. arXiv:2012.05342  [pdf, other]

    cs.CV cs.HC cs.LG cs.MM

    3D attention mechanism for fine-grained classification of table tennis strokes using a Twin Spatio-Temporal Convolutional Neural Networks

    Authors: Pierre-Etienne Martin, Jenny Benois-Pineau, Renaud Péteri, Julien Morlier

    Abstract: The paper addresses the problem of recognition of actions in video with low inter-class variability such as Table Tennis strokes. Two stream, "twin" convolutional neural networks are used with 3D convolutions both on RGB data and optical flow. Actions are recognized by classification of temporal windows. We introduce 3D attention modules and examine their impact on classification efficiency. In th…

    Submitted 20 November, 2020; originally announced December 2020.

    Journal ref: 25th International Conference on Pattern Recognition (ICPR2020), Jan 2021, Milano, Italy

  47. Composing an Ensemble Standstill Work for Myo and Bela

    Authors: Charles Patrick Martin, Alexander Refsum Jensenius, Jim Torresen

    Abstract: This paper describes the process of developing a standstill performance work using the Myo gesture control armband and the Bela embedded computing platform. The combination of Myo and Bela allows a portable and extensible version of the standstill performance concept while introducing muscle tension as an additional control parameter. We describe the technical details of our setup and introduce My…

    Submitted 4 December, 2020; originally announced December 2020.

    ACM Class: H.5.5

    Journal ref: Proceedings of the International Conference on New Interfaces for Musical Expression, 2018, pp. 196-197

  48. arXiv:2012.02322  [pdf, other]

    cs.HC cs.SD eess.AS

    A Laptop Ensemble Performance System using Recurrent Neural Networks

    Authors: Rohan Proctor, Charles Patrick Martin

    Abstract: The popularity of applying machine learning techniques in musical domains has created an inherent availability of freely accessible pre-trained neural network (NN) models ready for use in creative applications. This work outlines the implementation of one such application in the form of an assistance tool designed for live improvisational performances by laptop ensembles. The primary intention was…

    Submitted 3 December, 2020; originally announced December 2020.

    ACM Class: H.5.5; H.5.3

    Journal ref: Proceedings of the International Conference on New Interfaces for Musical Expression, 2020, pp. 43-48

  49. arXiv:2012.02311  [pdf, other]

    cs.HC cs.SD eess.AS

    Sonic Sculpture: Activating Engagement with Head-Mounted Augmented Reality

    Authors: Charles Patrick Martin, Zeruo Liu, Yichen Wang, Wennan He, Henry Gardner

    Abstract: This work examines how head-mounted AR can be used to build an interactive sonic landscape to engage with a public sculpture. We describe a sonic artwork, "Listening To Listening", that has been designed to accompany a real-world sculpture with two prototype interaction schemes. Our artwork is created for the HoloLens platform so that users can have an individual experience in a mixed reality cont…

    Submitted 3 December, 2020; originally announced December 2020.

    ACM Class: H.5.5; H.5.1

    Journal ref: Proceedings of the International Conference on New Interfaces for Musical Expression, 2020, pp. 48-52

  50. arXiv:2011.13453  [pdf, other]

    cs.SD eess.AS

    Towards Movement Generation with Audio Features

    Authors: Benedikte Wallace, Charles P. Martin, Jim Torresen, Kristian Nymoen

    Abstract: Sound and movement are closely coupled, particularly in dance. Certain audio features have been found to affect the way we move to music. Is this relationship between sound and movement something which can be modelled using machine learning? This work presents initial experiments wherein high-level audio features calculated from a set of music pieces are included in a movement generation model tra…

    Submitted 26 November, 2020; originally announced November 2020.