-
Crossing Borders Without Crossing Boundaries: How Sociolinguistic Awareness Can Optimize User Engagement with Localized Spanish AI Models Across Hispanophone Countries
Authors:
Martin Capdevila,
Esteban Villa Turek,
Ellen Karina Chumbe Fernandez,
Luis Felipe Polo Galvez,
Luis Cadavid,
Andrea Marroquin,
Rebeca Vargas Quesada,
Johanna Crew,
Nicole Vallejo Galarraga,
Christopher Rodriguez,
Diego Gutierrez,
Radhi Datla
Abstract:
Large language models are, by definition, based on language. In an effort to underscore the critical need for regional localized models, this paper examines primary differences between variants of written Spanish across Latin America and Spain, with an in-depth sociocultural and linguistic contextualization therein. We argue that these differences effectively constitute significant gaps in the quotidian use of Spanish among dialectal groups by creating sociolinguistic dissonances, to the extent that locale-sensitive AI models would play a pivotal role in bridging these divides. In doing so, this approach informs better and more efficient localization strategies that also serve to more adequately meet inclusivity goals, while securing sustainable active daily user growth in a major low-risk investment geographic area. Therefore, implementing at least the proposed five sub variants of Spanish addresses two lines of action: to foment user trust and reliance on AI language models while also demonstrating a level of cultural, historical, and sociolinguistic awareness that reflects positively on any internationalization strategy.
Submitted 14 May, 2025;
originally announced May 2025.
-
Optimizing Large Language Models for Detecting Symptoms of Comorbid Depression or Anxiety in Chronic Diseases: Insights from Patient Messages
Authors:
Jiyeong Kim,
Stephen P. Ma,
Michael L. Chen,
Isaac R. Galatzer-Levy,
John Torous,
Peter J. van Roessel,
Christopher Sharp,
Michael A. Pfeffer,
Carolyn I. Rodriguez,
Eleni Linos,
Jonathan H. Chen
Abstract:
Patients with diabetes are at increased risk of comorbid depression or anxiety, complicating their management. This study evaluated the performance of large language models (LLMs) in detecting these symptoms from secure patient messages. We applied multiple approaches, including engineered prompts, systemic persona, temperature adjustments, and zero-shot and few-shot learning, to identify the best-performing model and enhance performance. Three out of five LLMs demonstrated excellent performance (over 90% in both F-1 and accuracy), with Llama 3.1 405B achieving 93% in both F-1 and accuracy using a zero-shot approach. While LLMs showed promise in binary classification and handling complex metrics like Patient Health Questionnaire-4, inconsistencies in challenging cases warrant further real-life assessment. The findings highlight the potential of LLMs to assist in timely screening and referrals, providing valuable empirical knowledge for real-world triage systems that could improve mental health care for patients with chronic diseases.
Submitted 14 March, 2025;
originally announced March 2025.
-
Can We Detect Failures Without Failure Data? Uncertainty-Aware Runtime Failure Detection for Imitation Learning Policies
Authors:
Chen Xu,
Tony Khuong Nguyen,
Emma Dixon,
Christopher Rodriguez,
Patrick Miller,
Robert Lee,
Paarth Shah,
Rares Ambrus,
Haruki Nishimura,
Masha Itkina
Abstract:
Recent years have witnessed impressive robotic manipulation systems driven by advances in imitation learning and generative modeling, such as diffusion- and flow-based approaches. As robot policy performance increases, so does the complexity and time horizon of achievable tasks, inducing unexpected and diverse failure modes that are difficult to predict a priori. To enable trustworthy policy deployment in safety-critical human environments, reliable runtime failure detection becomes important during policy inference. However, most existing failure detection approaches rely on prior knowledge of failure modes and require failure data during training, which imposes a significant challenge in practicality and scalability. In response to these limitations, we present FAIL-Detect, a modular two-stage approach for failure detection in imitation learning-based robotic manipulation. To accurately identify failures from successful training data alone, we frame the problem as sequential out-of-distribution (OOD) detection. We first distill policy inputs and outputs into scalar signals that correlate with policy failures and capture epistemic uncertainty. FAIL-Detect then employs conformal prediction (CP) as a versatile framework for uncertainty quantification with statistical guarantees. Empirically, we thoroughly investigate both learned and post-hoc scalar signal candidates on diverse robotic manipulation tasks. Our experiments show learned signals to be mostly consistently effective, particularly when using our novel flow-based density estimator. Furthermore, our method detects failures more accurately and faster than state-of-the-art (SOTA) failure detection baselines. These results highlight the potential of FAIL-Detect to enhance the safety and reliability of imitation learning-based robotic systems as they progress toward real-world deployment.
Submitted 20 June, 2025; v1 submitted 11 March, 2025;
originally announced March 2025.
-
Manual-PA: Learning 3D Part Assembly from Instruction Diagrams
Authors:
Jiahao Zhang,
Anoop Cherian,
Cristian Rodriguez,
Weijian Deng,
Stephen Gould
Abstract:
Assembling furniture amounts to solving the discrete-continuous optimization task of selecting the furniture parts to assemble and estimating their connecting poses in a physically realistic manner. The problem is hampered by its combinatorially large yet sparse solution space, thus making learning to assemble a challenging task for current machine learning models. In this paper, we attempt to solve this task by leveraging the assembly instructions provided in diagrammatic manuals that typically accompany the furniture parts. Our key insight is to use the cues in these diagrams to split the problem into discrete and continuous phases. Specifically, we present Manual-PA, a transformer-based instruction Manual-guided 3D Part Assembly framework that learns to semantically align 3D parts with their illustrations in the manuals using a contrastive learning backbone towards predicting the assembly order and infers the 6D pose of each part via relating it to the final furniture depicted in the manual. To validate the efficacy of our method, we conduct experiments on the benchmark PartNet dataset. Our results show that using the diagrams and the order of the parts leads to significant improvements in assembly performance against the state of the art. Further, Manual-PA demonstrates strong generalization to real-world IKEA furniture assembly on the IKEA-Manual dataset.
Submitted 26 November, 2024;
originally announced November 2024.
-
Analysis of Classifier Training on Synthetic Data for Cross-Domain Datasets
Authors:
Andoni Cortés,
Clemente Rodríguez,
Gorka Velez,
Javier Barandiarán,
Marcos Nieto
Abstract:
A major challenge of deep learning (DL) is the necessity to collect huge amounts of training data. Often, the lack of a sufficiently large dataset discourages the use of DL in certain applications. Typically, acquiring the required amounts of data costs considerable time, material and effort. To mitigate this problem, the use of synthetic images combined with real data is a popular approach, widely adopted in the scientific community to effectively train various detectors. In this study, we examined the potential of synthetic data-based training in the field of intelligent transportation systems. Our focus is on camera-based traffic sign recognition applications for advanced driver assistance systems and autonomous driving. The proposed augmentation pipeline of synthetic datasets includes novel augmentation processes such as structured shadows and Gaussian specular highlights. A well-known DL model was trained with different datasets to compare the performance of synthetic and real image-based trained models. Additionally, a new, detailed method to objectively compare these models is proposed. Synthetic images are generated using a semi-supervised errors-guide method, which is also described. Our experiments showed that a synthetic image-based approach outperforms real image-based training in most cases when applied to cross-domain test datasets (+10% precision for the GTSRB dataset) and, consequently, the generalization of the model is improved while decreasing the cost of acquiring images.
Submitted 30 October, 2024;
originally announced October 2024.
-
Cross-lingual Transfer of Reward Models in Multilingual Alignment
Authors:
Jiwoo Hong,
Noah Lee,
Rodrigo Martínez-Castaño,
César Rodríguez,
James Thorne
Abstract:
Reinforcement learning with human feedback (RLHF) is shown to largely benefit from precise reward models (RMs). However, recent studies in reward modeling schemes are skewed towards English, limiting the applicability of RLHF in multilingual alignments. In this work, we investigate the cross-lingual transfer of RMs trained in diverse languages, primarily from English. Our experimental results demonstrate the strong cross-lingual transfer of English RMs, exceeding target language RMs by 3~4% average increase in Multilingual RewardBench. Furthermore, we analyze the cross-lingual transfer of RMs through the representation shifts. Finally, we perform multilingual alignment to exemplify how cross-lingual transfer in RM propagates to enhanced multilingual instruction-following capability, along with extensive analyses on off-the-shelf RMs. We release the code, model, and data.
Submitted 23 January, 2025; v1 submitted 23 October, 2024;
originally announced October 2024.
-
Fast Object Detection with a Machine Learning Edge Device
Authors:
Richard C. Rodriguez,
Jonah Elijah P. Bardos
Abstract:
This machine learning study investigates a low-cost edge device integrated with an embedded system with computer vision, resulting in improved inference time and precision of object detection and classification. A primary aim of this study was to reduce inference time and achieve low power consumption, to enable an embedded device for a competition-ready autonomous humanoid robot, and to support real-time object recognition, scene understanding, visual navigation, motion planning, and autonomous navigation of the robot. This study compares inference time across a central processing unit (CPU), a graphical processing unit (GPU), and a tensor processing unit (TPU). CPUs, GPUs, and TPUs are all processors that can be used for machine learning tasks. Related to the aim of supporting an autonomous humanoid robot, there was an additional effort to observe whether there was a significant difference between using a camera with monocular vision and one with stereo vision capability. TPU inference time results for this study reflect a 25% reduction in time over the GPU, and an 87.5% reduction in inference time compared to the CPU. Much of the information in this paper contributed to the final selection of Google's Coral brand Edge TPU device. The Arduino Nano 33 BLE Sense Tiny ML Kit was also considered for comparison, but due to initial incompatibilities and in the interest of time to complete this study, a decision was made to review the kit in a future experiment.
Submitted 5 October, 2024;
originally announced October 2024.
-
Evaluating the Energy Consumption of Machine Learning: Systematic Literature Review and Experiments
Authors:
Charlotte Rodriguez,
Laura Degioanni,
Laetitia Kameni,
Richard Vidal,
Giovanni Neglia
Abstract:
Monitoring, understanding, and optimizing the energy consumption of Machine Learning (ML) are various reasons why it is necessary to evaluate the energy usage of ML. However, there exists no universal tool that can answer this question for all use cases, and there may even be disagreement on how to evaluate energy consumption for a specific use case. Tools and methods are based on different approaches, each with their own advantages and drawbacks, and they need to be mapped out and explained in order to select the most suitable one for a given situation. We address this challenge through two approaches. First, we conduct a systematic literature review of all tools and methods that allow evaluation of the energy consumption of ML (both at training and at inference), irrespective of whether they were originally designed for machine learning or general software. Second, we develop and use an experimental protocol to compare a selection of these tools and methods. The comparison is both qualitative and quantitative on a range of ML tasks of different natures (vision, language) and computational complexity. The systematic literature review serves as a comprehensive guide for understanding the array of tools and methods used in evaluating energy consumption of ML, for various use cases ranging from basic energy monitoring to consumption optimization. Two open-source repositories are provided for further exploration. The first one contains tools that can be used to replicate this work or extend the current review. The second repository houses the experimental protocol, allowing users to augment the protocol with new ML computing tasks and additional energy evaluation tools.
Submitted 27 August, 2024;
originally announced August 2024.
-
Temporally Grounding Instructional Diagrams in Unconstrained Videos
Authors:
Jiahao Zhang,
Frederic Z. Zhang,
Cristian Rodriguez,
Yizhak Ben-Shabat,
Anoop Cherian,
Stephen Gould
Abstract:
We study the challenging problem of simultaneously localizing a sequence of queries in the form of instructional diagrams in a video. This requires understanding not only the individual queries but also their interrelationships. However, most existing methods focus on grounding one query at a time, ignoring the inherent structures among queries such as the general mutual exclusiveness and the temporal order. Consequently, the predicted timespans of different step diagrams may overlap considerably or violate the temporal order, thus harming the accuracy. In this paper, we tackle this issue by simultaneously grounding a sequence of step diagrams. Specifically, we propose composite queries, constructed by exhaustively pairing up the visual content features of the step diagrams and a fixed number of learnable positional embeddings. Our insight is that, through self-attention, composite queries carrying different content features suppress each other to reduce timespan overlaps in predictions, while cross-attention corrects temporal misalignment via joint content and position guidance. We demonstrate the effectiveness of our approach on the IAW dataset for grounding step diagrams and the YouCook2 benchmark for grounding natural language queries, significantly outperforming existing methods while simultaneously grounding multiple queries.
Submitted 1 December, 2024; v1 submitted 16 July, 2024;
originally announced July 2024.
-
Temporal True and Surrogate Fitness Landscape Analysis for Expensive Bi-Objective Optimisation
Authors:
C. J. Rodriguez,
S. L. Thomson,
T. Alderliesten,
P. A. N. Bosman
Abstract:
Many real-world problems have expensive-to-compute fitness functions and are multi-objective in nature. Surrogate-assisted evolutionary algorithms are often used to tackle such problems. Despite this, literature about analysing the fitness landscapes induced by surrogate models is limited, and even non-existent for multi-objective problems. This study addresses this critical gap by comparing landscapes of the true fitness function with those of surrogate models for multi-objective functions. Moreover, it does so temporally by examining landscape features at different points in time during optimisation, in the vicinity of the population at that point in time. We consider the BBOB bi-objective benchmark functions in our experiments. The results of the fitness landscape analysis reveal significant differences between true and surrogate features at different time points during optimisation. Despite these differences, the true and surrogate landscape features still show high correlations with each other. Furthermore, this study identifies which landscape features are related to search and demonstrates that both surrogate and true landscape features are capable of predicting algorithm performance. These findings indicate that temporal analysis of the landscape features may help to facilitate the design of surrogate switching approaches to improve performance in multi-objective optimisation.
Submitted 9 April, 2024;
originally announced April 2024.
-
Virtual Reality for Understanding Artificial-Intelligence-driven Scientific Discovery with an Application in Quantum Optics
Authors:
Philipp Schmidt,
Sören Arlt,
Carlos Ruiz-Gonzalez,
Xuemei Gu,
Carla Rodríguez,
Mario Krenn
Abstract:
Generative Artificial Intelligence (AI) models can propose solutions to scientific problems beyond human capability. To truly make conceptual contributions, researchers need to be capable of understanding the AI-generated structures and extracting the underlying concepts and ideas. When algorithms provide little explanatory reasoning alongside the output, scientists have to reverse-engineer the fundamental insights behind proposals based solely on examples. This task can be challenging as the output is often highly complex and thus not immediately accessible to humans. In this work we show how transferring part of the analysis process into an immersive Virtual Reality (VR) environment can assist researchers in developing an understanding of AI-generated solutions. We demonstrate the usefulness of VR in finding interpretable configurations of abstract graphs representing Quantum Optics experiments. Thereby, we can manually discover new generalizations of AI-discoveries as well as new understanding in experimental quantum optics. Furthermore, it allows us to customize the search space in an informed way, as a human-in-the-loop, to achieve significantly faster subsequent discovery iterations. As concrete examples, with this technology, we discover a new resource-efficient 3-dimensional entanglement swapping scheme, as well as a 3-dimensional 4-particle Greenberger-Horne-Zeilinger-state analyzer. Our results show the potential of VR for increasing a human researcher's ability to derive knowledge from graph-based generative AI, where graphs are a common abstract data representation used in diverse fields of science.
Submitted 20 February, 2024;
originally announced March 2024.
-
MolecularWebXR: Multiuser discussions about chemistry and biology in immersive and inclusive VR
Authors:
Fabio J. Cortes Rodriguez,
Gianfranco Frattini,
Fernando Teixeira Pinto Meireles,
Danae A. Terrien,
Sergio Cruz-Leon,
Matteo Dal Peraro,
Eva Schier,
Diego M. Moreno,
Luciano A. Abriata
Abstract:
MolecularWebXR is our new website for education, science communication and scientific peer discussion in chemistry and biology built on WebXR. It democratizes multi-user, inclusive virtual reality (VR) experiences that are deeply immersive for users wearing high-end headsets, yet allow participation by users with consumer devices such as smartphones, possibly inserted into cardboard goggles for immersivity, or even computers or tablets. With no installs as it is all web-served, MolecularWebXR enables multiple users to simultaneously explore, communicate and discuss chemistry and biology concepts in immersive 3D environments, manipulating objects with their bare hands, either present in the same real space or scattered throughout the globe thanks to built-in audio features. A series of preset rooms cover educational material on chemistry and structural biology, and an empty room can be populated with material prepared ad hoc using moleculARweb's VMD-based PDB2AR tool. We verified ease of use and versatility by users aged 12-80 in entirely virtual sessions or mixed real-virtual sessions at science outreach events, student instruction, scientific collaborations, and conference lectures. MolecularWebXR is available for free use without registration at https://molecularwebxr.org, and a blog post version of this preprint with embedded videos is available at https://go.epfl.ch/molecularwebxr-blog-post.
Submitted 1 November, 2023;
originally announced November 2023.
-
MAVIS: Multi-Camera Augmented Visual-Inertial SLAM using SE2(3) Based Exact IMU Pre-integration
Authors:
Yifu Wang,
Yonhon Ng,
Inkyu Sa,
Alvaro Parra,
Cristian Rodriguez,
Tao Jun Lin,
Hongdong Li
Abstract:
We present a novel optimization-based Visual-Inertial SLAM system designed for multiple partially overlapped camera systems, named MAVIS. Our framework fully exploits the benefits of the wide field-of-view from multi-camera systems, and the metric scale measurements provided by an inertial measurement unit (IMU). We introduce an improved IMU pre-integration formulation based on the exponential function of an automorphism of SE_2(3), which can effectively enhance tracking performance under fast rotational motion and extended integration time. Furthermore, we extend conventional front-end tracking and back-end optimization modules designed for monocular or stereo setups towards multi-camera systems, and introduce implementation details that contribute to the performance of our system in challenging scenarios. The practical validity of our approach is supported by our experiments on public datasets. MAVIS won first place in all the vision-IMU tracks (single and multi-session SLAM) of the Hilti SLAM Challenge 2023, with 1.7 times the score of the second-place entry.
Submitted 16 July, 2024; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Directed Scattering for Knowledge Graph-based Cellular Signaling Analysis
Authors:
Aarthi Venkat,
Joyce Chew,
Ferran Cardoso Rodriguez,
Christopher J. Tape,
Michael Perlmutter,
Smita Krishnaswamy
Abstract:
Directed graphs are a natural model for many phenomena, in particular scientific knowledge graphs such as molecular interaction or chemical reaction networks that define cellular signaling relationships. In these situations, source nodes typically have distinct biophysical properties from sinks. Due to their ordered and unidirectional relationships, many such networks also have hierarchical and multiscale structure. However, the majority of methods performing node- and edge-level tasks in machine learning do not take these properties into account, and thus have not been leveraged effectively for scientific tasks such as cellular signaling network inference. We propose a new framework called Directed Scattering Autoencoder (DSAE) which uses a directed version of a geometric scattering transform, combined with the non-linear dimensionality reduction properties of an autoencoder and the geometric properties of the hyperbolic space to learn latent hierarchies. We show this method outperforms numerous others on tasks such as embedding directed graphs and learning cellular signaling networks.
Submitted 14 September, 2023;
originally announced September 2023.
-
SafeLS: Toward Building a Lockstep NOEL-V Core
Authors:
Marcel Sarraseca,
Sergi Alcaide,
Francisco Fuentes,
Juan Carlos Rodriguez,
Feng Chang,
Ilham Lasfar,
Ramon Canal,
Francisco J. Cazorla,
Jaume Abella
Abstract:
Safety-critical systems such as those in automotive, avionics and space, require appropriate safety measures to avoid silent data corruption upon random hardware errors such as those caused by radiation and other types of electromagnetic interference. Those safety measures must be able to prevent faults from causing the so-called common cause failures (CCFs), which occur when a fault produces identical errors in redundant elements so that comparison fails to detect the errors and a failure arises. The usual solution to avoid CCFs in CPU cores is using lockstep cores, so that two cores execute the same flow of instructions, but with some time staggering so that their state is never identical and faults can only lead to different errors, which are then detectable by means of comparison. This paper extends Gaisler's RISC-V NOEL-V core with lockstep; and presents future prospects for its use and distribution.
Submitted 28 July, 2023;
originally announced July 2023.
-
Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
Authors:
Jiahao Zhang,
Anoop Cherian,
Yanbin Liu,
Yizhak Ben-Shabat,
Cristian Rodriguez,
Stephen Gould
Abstract:
Multimodal alignment facilitates the retrieval of instances from one modality when queried using another. In this paper, we consider a novel setting where such an alignment is between (i) instruction steps that are depicted as assembly diagrams (commonly seen in Ikea assembly manuals) and (ii) video segments from in-the-wild videos; these videos comprise an enactment of the assembly actions in the real world. To learn this alignment, we introduce a novel supervised contrastive learning method that learns to align videos with the subtle details in the assembly diagrams, guided by a set of novel losses. To study this problem and demonstrate the effectiveness of our method, we introduce a novel dataset: IAW (Ikea assembly in the wild), consisting of 183 hours of videos from diverse furniture assembly collections and nearly 8,300 illustrations from their associated instruction manuals, annotated with their ground truth alignments. We define two tasks on this dataset: first, nearest neighbor retrieval between video segments and illustrations, and, second, alignment of instruction steps and the segments for each video. Extensive experiments on IAW demonstrate the superior performance of our approach against alternatives.
Submitted 1 December, 2024; v1 submitted 24 March, 2023;
originally announced March 2023.
-
UTC Time, Formally Verified
Authors:
Ana de Almeida Borges,
Mireia González Bedmar,
Juan Conejero Rodríguez,
Eduardo Hermo Reyes,
Joaquim Casals Buñuel,
Joost J. Joosten
Abstract:
FV Time is a small-scale verification project developed in the Coq proof assistant using the Mathematical Components libraries. It is a library for managing conversions between time formats (UTC and timestamps), as well as commonly used functions for time arithmetic. As a library for time conversions, its novelty is the implementation of leap seconds, which are part of the UTC standard but usually not implemented in existing libraries. Since the verified functions of FV Time are reasonably simple yet non-trivial, it nicely illustrates our methodology for verifying software with Coq.
In this paper we present a description of the project, emphasizing the main problems faced while developing the library, as well as some general-purpose solutions that were produced as by-products and may be used in other verification projects. These include a refinement package between proof-oriented MathComp numbers and computation-oriented primitive numbers from the Coq standard library, as well as a set of tactics to automatically prove certain decidable statements over finite ranges through brute-force computation.
Submitted 13 December, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Recognition of Unseen Bird Species by Learning from Field Guides
Authors:
Andrés C. Rodríguez,
Stefano D'Aronco,
Rodrigo Caye Daudt,
Jan D. Wegner,
Konrad Schindler
Abstract:
We exploit field guides to learn bird species recognition, in particular zero-shot recognition of unseen species. Illustrations contained in field guides deliberately focus on discriminative properties of each species, and can serve as side information to transfer knowledge from seen to unseen bird species. We study two approaches: (1) a contrastive encoding of illustrations, which can be fed into standard zero-shot learning schemes; and (2) a novel method that leverages the fact that illustrations are also images and as such structurally more similar to photographs than other kinds of side information. Our results show that illustrations from field guides, which are readily available for a wide range of species, are indeed a competitive source of side information for zero-shot learning. On a subset of the iNaturalist2021 dataset with 749 seen and 739 unseen species, we obtain a classification accuracy of unseen bird species of $12\%$ @top-1 and $38\%$ @top-10, which shows the potential of field guides for challenging real-world scenarios with many species. Our code is available at https://github.com/ac-rodriguez/zsl_billow
Submitted 2 November, 2023; v1 submitted 3 June, 2022;
originally announced June 2022.
-
Improving the repeatability of deep learning models with Monte Carlo dropout
Authors:
Andreanne Lemay,
Katharina Hoebel,
Christopher P. Bridge,
Brian Befano,
Silvia De Sanjosé,
Didem Egemen,
Ana Cecilia Rodriguez,
Mark Schiffman,
John Peter Campbell,
Jayashree Kalpathy-Cramer
Abstract:
The integration of artificial intelligence into clinical workflows requires reliable and robust models. Repeatability is a key attribute of model robustness. Repeatable models output predictions with low variation during independent tests carried out under similar conditions. During model development and evaluation, much attention is given to classification performance while model repeatability is rarely assessed, leading to the development of models that are unusable in clinical practice. In this work, we evaluate the repeatability of four model types (binary classification, multi-class classification, ordinal classification, and regression) on images that were acquired from the same patient during the same visit. We study the performance of binary, multi-class, ordinal, and regression models on four medical image classification tasks from public and private datasets: knee osteoarthritis, cervical cancer screening, breast density estimation, and retinopathy of prematurity. Repeatability is measured and compared on ResNet and DenseNet architectures. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increased repeatability for all tasks on the binary, multi-class, and ordinal models, leading to an average reduction of the 95% limits of agreement by 16% points and of the disagreement rate by 7% points. The classification accuracy improved in most settings along with the repeatability. Our results suggest that beyond about 20 Monte Carlo iterations, there is no further gain in repeatability. In addition to the higher test-retest agreement, Monte Carlo predictions were better calibrated, which leads to output probabilities reflecting more accurately the true likelihood of being correctly classified.
Submitted 15 February, 2022;
originally announced February 2022.
-
Monte Carlo dropout increases model repeatability
Authors:
Andreanne Lemay,
Katharina Hoebel,
Christopher P. Bridge,
Didem Egemen,
Ana Cecilia Rodriguez,
Mark Schiffman,
John Peter Campbell,
Jayashree Kalpathy-Cramer
Abstract:
The integration of artificial intelligence into clinical workflows requires reliable and robust models. Among the main features of robustness is repeatability. Much attention is given to classification performance without assessing the model repeatability, leading to the development of models that turn out to be unusable in practice. In this work, we evaluate the repeatability of four model types on images from the same patient that were acquired during the same visit. We study the performance of binary, multi-class, ordinal, and regression models on three medical image analysis tasks: cervical cancer screening, breast density estimation, and retinopathy of prematurity classification. Moreover, we assess the impact of sampling Monte Carlo dropout predictions at test time on classification performance and repeatability. Leveraging Monte Carlo predictions significantly increased repeatability for all tasks on the binary, multi-class, and ordinal models leading to an average reduction of the 95% limits of agreement by 17% points.
Submitted 12 November, 2021;
originally announced November 2021.
-
MARS: Nano-Power Battery-free Wireless Interfaces for Touch, Swipe and Speech Input
Authors:
Nivedita Arora,
Ali Mirzazadeh,
Injoo Moon,
Charles Ramey,
Yuhui Zhao,
Daniela C. Rodriguez,
Gregory D. Abowd,
Thad E. Starner
Abstract:
Augmenting everyday surfaces with interaction sensing capability that is maintenance-free, low-cost (about $1), and in an appropriate form factor is a challenge with current technologies. MARS (Multi-channel Ambiently-powered Realtime Sensing) enables battery-free sensing and wireless communication of touch, swipe, and speech interactions by combining a nanowatt programmable oscillator with frequency-shifted analog backscatter communication. A zero-threshold voltage field-effect transistor (FET) is used to create an oscillator with a low startup voltage (about 500 mV) and current (< 2 µA), whose frequency can be affected through changes in inductance or capacitance from the user interactions. Multiple MARS systems can operate in the same environment by tuning each oscillator circuit to a different frequency range. The nanowatt power budget allows the system to be powered directly through ambient energy sources like photodiodes or thermoelectric generators. We differentiate MARS from previous systems based on power requirements, cost, and part count and explore different interaction and activity sensing scenarios suitable for indoor environments.
Submitted 20 August, 2021;
originally announced August 2021.
-
A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing
Authors:
Mark Anderson,
Mathieu Dehouck,
Carlos Gómez Rodríguez
Abstract:
We evaluate the efficacy of predicted UPOS tags as input features for dependency parsers in lower resource settings to evaluate how treebank size affects the impact tagging accuracy has on parsing performance. We do this for real low resource universal dependency treebanks, artificially low resource data with varying treebank sizes, and for very small treebanks with varying amounts of augmented data. We find that predicted UPOS tags are somewhat helpful for low resource treebanks, especially when fewer fully-annotated trees are available. We also find that this positive impact diminishes as the amount of data increases.
Submitted 8 June, 2021;
originally announced June 2021.
-
A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021
Authors:
Mark Anderson,
Carlos Gómez Rodríguez
Abstract:
We evaluate three leading dependency parser systems from different paradigms on a small yet diverse subset of languages in terms of their accuracy-efficiency Pareto front. As we are interested in efficiency, we evaluate core parsers without pretrained language models (as these are typically huge networks and would constitute most of the compute time) or other augmentations that can be transversally applied to any of them. Biaffine parsing emerges as a well-balanced default choice, with sequence-labelling parsing being preferable if inference speed (but not training energy cost) is the priority.
Submitted 9 June, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance
Authors:
Mark Anderson,
Anders Søgaard,
Carlos Gómez Rodríguez
Abstract:
Søgaard (2020) obtained results suggesting the fraction of trees occurring in the test data isomorphic to trees in the training set accounts for a non-trivial variation in parser performance. Similar to other statistical analyses in NLP, the results were based on evaluating linear regressions. However, the study had methodological issues and was undertaken using a small sample size leading to unreliable results. We present a replication study in which we also bin sentences by length and find that only a small subset of sentences vary in performance with respect to graph isomorphism. Further, the correlation observed between parser performance and graph isomorphism in the wild disappears when controlling for covariants. However, in a controlled experiment, where covariants are kept fixed, we do observe a strong correlation. We suggest that conclusions drawn from statistical analyses like this need to be tempered and that controlled experiments can complement them by more readily teasing factors apart.
Submitted 2 June, 2021; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Mapping oil palm density at country scale: An active learning approach
Authors:
Andrés C. Rodríguez,
Stefano D'Aronco,
Konrad Schindler,
Jan D. Wegner
Abstract:
Accurate mapping of oil palm is important for understanding its past and future impact on the environment. We propose to map and count oil palms by estimating tree densities per pixel for large-scale analysis. This allows for fine-grained analysis, for example regarding different planting patterns. To that end, we propose a new, active deep learning method to estimate oil palm density at large scale from Sentinel-2 satellite images, and apply it to generate complete maps for Malaysia and Indonesia. What makes the regression of oil palm density challenging is the need for representative reference data that covers all relevant geographical conditions across a large territory. Specifically for density estimation, generating reference data involves counting individual trees. To keep the associated labelling effort low we propose an active learning (AL) approach that automatically chooses the most relevant samples to be labelled. Our method relies on estimates of the epistemic model uncertainty and of the diversity among samples, making it possible to retrieve an entire batch of relevant samples in a single iteration. Moreover, our algorithm has linear computational complexity and is easily parallelisable to cover large areas. We use our method to compute the first oil palm density map with $10\,$m Ground Sampling Distance (GSD), for all of Indonesia and Malaysia and for two different years, 2017 and 2019. The maps have a mean absolute error of $\pm 7.3$ trees/ha, estimated from an independent validation set. We also analyse density variations between different states within a country and compare them to official estimates. According to our estimates there are, in total, $>1.2$ billion oil palms in Indonesia covering $>15$ million ha, and $>0.5$ billion oil palms in Malaysia covering $>6$ million ha.
Submitted 24 May, 2021;
originally announced May 2021.
-
Model-predictive control and reinforcement learning in multi-energy system case studies
Authors:
Glenn Ceusters,
Román Cantú Rodríguez,
Alberte Bouso García,
Rüdiger Franke,
Geert Deconinck,
Lieve Helsen,
Ann Nowé,
Maarten Messagie,
Luis Ramirez Camargo
Abstract:
Model-predictive control (MPC) offers an optimal control technique to establish and ensure that the total operation cost of multi-energy systems remains at a minimum while fulfilling all system constraints. However, this method presumes an adequate model of the underlying system dynamics, which is prone to modelling errors and is not necessarily adaptive. This has an associated initial and ongoing project-specific engineering cost. In this paper, we present an on- and off-policy multi-objective reinforcement learning (RL) approach that does not assume a model a priori, and benchmark it against a linear MPC (LMPC, chosen to reflect current practice, though non-linear MPC performs better). Both are derived from the general optimal control problem, highlighting their differences and similarities. In a simple multi-energy system (MES) configuration case study, we show that a twin delayed deep deterministic policy gradient (TD3) RL agent offers potential to match and outperform the perfect foresight LMPC benchmark (101.5%), while the realistic LMPC, i.e. with imperfect predictions, only achieves 98%. In a more complex MES configuration, the RL agent's performance is generally lower (94.6%), yet still better than the realistic LMPC (88.9%). In both case studies, the RL agents outperformed the realistic LMPC after a training period of 2 years using quarterly interactions with the environment. We conclude that reinforcement learning is a viable optimal control technique for multi-energy systems given adequate constraint handling and pre-training, to avoid unsafe interactions and long training periods, as is proposed in fundamental future work.
Submitted 9 September, 2021; v1 submitted 20 April, 2021;
originally announced April 2021.
-
A Survey on Future Railway Radio Communications Services: Challenges and Opportunities
Authors:
Juan Moreno Garcia-Loygorri,
Jose Manuel Riera,
Leandro de Haro,
Carlos Rodriguez
Abstract:
Radio communications is one of the most disruptive technologies in railways, enabling a huge set of value-added services that greatly improve many aspects of railways, making them more efficient, safer, and profitable. Lately, some major technologies like ERTMS for high-speed railways and CBTC for subways have made possible a reduction of headway and increased safety never before seen in this field. The railway industry is now looking at wireless communications with great interest, and this can be seen in many projects around the world. Thus, railway radio communications is again a flourishing field, with a lot of research and many things to be done. This survey article explains both opportunities and challenges to be addressed by the railway sector in order to obtain all the possible benefits of the latest radio technologies.
Submitted 5 October, 2020;
originally announced October 2020.
-
Symbolic Partial-Order Execution for Testing Multi-Threaded Programs
Authors:
Daniel Schemmel,
Julian Büning,
César Rodríguez,
David Laprell,
Klaus Wehrle
Abstract:
We describe a technique for systematic testing of multi-threaded programs. We combine Quasi-Optimal Partial-Order Reduction, a state-of-the-art technique that tackles path explosion due to interleaving non-determinism, with symbolic execution to handle data non-determinism. Our technique iteratively and exhaustively finds all executions of the program. It represents program executions using partial orders and finds the next execution using an underlying unfolding semantics. We avoid the exploration of redundant program traces using cutoff events. We implemented our technique as an extension of KLEE and evaluated it on a set of large multi-threaded C programs. Our experiments found several previously undiscovered bugs and undefined behaviors in memcached and GNU sort, showing that the new method is capable of finding bugs in industrial-size benchmarks.
Submitted 22 July, 2020; v1 submitted 13 May, 2020;
originally announced May 2020.
-
Fine-grained Species Recognition with Privileged Pooling: Better Sample Efficiency Through Supervised Attention
Authors:
Andres C. Rodriguez,
Stefano D'Aronco,
Konrad Schindler,
Jan Dirk Wegner
Abstract:
We propose a scheme for supervised image classification that uses privileged information, in the form of keypoint annotations for the training data, to learn strong models from small and/or biased training sets. Our main motivation is the recognition of animal species for ecological applications such as biodiversity modelling, which is challenging because of long-tailed species distributions due to rare species, and strong dataset biases such as repetitive scene background in camera traps. To counteract these challenges, we propose a visual attention mechanism that is supervised via keypoint annotations that highlight important object parts. This privileged information, implemented as a novel privileged pooling operation, is only required during training and helps the model to focus on regions that are discriminative. In experiments with three different animal species datasets, we show that deep networks with privileged pooling can use small training sets more efficiently and generalize better.
Submitted 4 August, 2023; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Always Look on the Bright Side of the Field: Merging Pose and Contextual Data to Estimate Orientation of Soccer Players
Authors:
Adrià Arbués-Sangüesa,
Adrián Martín,
Javier Fernández,
Carlos Rodríguez,
Gloria Haro,
Coloma Ballester
Abstract:
Although orientation has proven to be a key skill of soccer players in order to succeed in a broad spectrum of plays, body orientation is a still little-explored area in sports analytics research. Despite being an inherently ambiguous concept, player orientation can be defined as the projection (2D) of the normal vector placed in the center of the upper-torso of players (3D). This research presents a novel technique to obtain player orientation from monocular video recordings by mapping pose parts (shoulders and hips) in a 2D field by combining OpenPose with a super-resolution network, and merging the obtained estimation with contextual information (ball position). Results have been validated with player-held EPTS devices, obtaining a median error of 27 degrees/player. Moreover, three novel types of orientation maps are proposed in order to make raw orientation data easy to visualize and understand, thus allowing further analysis at team- or player-level.
Submitted 18 May, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Performance Evaluation of a Substrate Integrated Waveguide Antenna for Vehicular Networks
Authors:
Kevin Delgadillo,
Cesar Rodriguez,
Wilder Castellanos,
Hector Guarnizo
Abstract:
This paper describes the design and evaluation of a Substrate Integrated Waveguide (SIW) antenna operating at 2.4 GHz. This antenna was designed as a possible solution for the implementation of vehicular networks (VANETs). The main advantages of SIW antennas, such as their simplicity, small size and low cost, make this kind of antenna particularly suitable for use in wireless nodes where size is a critical factor, for example in VANETs, which are among the most promising wireless communication systems. We present in this paper the design of the SIW antenna using the software ANSYS HFSS as well as the results of the assessment of some parameters like the radiation pattern, operation frequency and bandwidth. Also, we present the simulation of a VANET that includes the designed parameters of the SIW antenna in order to evaluate its integration into the vehicular nodes. This simulation was performed in the NS2 simulator and some network performance metrics, like packet delay and the packet loss rate, were evaluated. Additional software, for example MOVE and SUMO, was also used to generate the routes and the movement of the vehicles.
Submitted 26 December, 2019;
originally announced January 2020.
-
Staff dimensioning in homecare services with uncertain demands
Authors:
C. Rodriguez,
Thierry Garaix,
X. Xie,
V. Augusto
Abstract:
The problem addressed in this paper is how to calculate the number of personnel required to ensure the activity of a home health care (HHC) center on a tactical horizon. Designing quantitative approaches for this question is challenging. The number of caregivers has to be determined for each profession in order to balance the coverage of patients in a region and the workforce cost over several months. Unknown demand in care and spatial dimensions, the combination of skills needed to cover a care, and individual trips to visit patients make the underlying optimization problem very hard. Few studies are dedicated to staff dimensioning for HHC compared to patient-to-nurse assignment/sequencing and center location problems. We propose an original two-stage approach based on integer linear stochastic programming that exploits historical medical data. The first stage calculates (near-)optimal levels of resources for possible demand scenarios, while the second stage computes the optimal number of caregivers for each profession to meet a target coverage indicator. For decision-makers, our algorithm gives the number of employees for each category required to satisfy the demand without any recourse (overtime, external resources) with fixed probability and confidence interval. The approach has been tested on various instances built from data of the French agency of hospitalization data (ATIH).
Submitted 29 October, 2018;
originally announced November 2018.
-
When logic lays down the law
Authors:
Bjørn Jespersen,
Ana de Almeida Borges,
Jorge del Castillo Tierz,
Juan José Conejero Rodríguez,
Eric Sancho Adamson,
Aleix Solé Sánchez,
Nika Pona,
Joost J. Joosten
Abstract:
We analyse so-called computable laws, i.e., laws that can be enforced by automatic procedures. These laws should be logically perfect and unambiguous, but sometimes they are not. We use a regulation on road transport to illustrate this issue, and show what some fragments of this regulation would look like if rewritten in the image of logic. We further propose desiderata to be fulfilled by computable laws, and provide a critical platform from which to assess existing laws and a guideline for composing future ones.
Submitted 6 October, 2018;
originally announced October 2018.
-
Counting the uncountable: deep semantic density estimation from Space
Authors:
Andres C. Rodriguez,
Jan D. Wegner
Abstract:
We propose a new method to count objects of specific categories that are significantly smaller than the ground sampling distance of a satellite image. This task is hard due to the cluttered nature of scenes where different object categories occur. Target objects can be partially occluded, vary in appearance within the same class, and resemble objects of other categories. Since traditional object detection is infeasible due to the small size of objects with respect to the pixel size, we cast object counting as a density estimation problem. To distinguish objects of different classes, our approach combines density estimation with semantic segmentation in an end-to-end learnable convolutional neural network (CNN). Experiments show that deep semantic density estimation can robustly count objects of various classes in cluttered scenes. Experiments also suggest that we need specific CNN architectures in remote sensing instead of blindly applying existing ones from computer vision.
Submitted 20 September, 2018; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Action Anticipation By Predicting Future Dynamic Images
Authors:
Cristian Rodriguez,
Basura Fernando,
Hongdong Li
Abstract:
Human action-anticipation methods predict the future action after observing only a small portion of an action in progress. This is critical for applications where computers have to react to human actions as early as possible, such as autonomous driving, human-robot interaction, and assistive robotics, among others. In this paper, we present a method for human action anticipation by predicting the most plausible future human motion. We represent human motion using Dynamic Images and make use of tailored loss functions to encourage a generative model to produce accurate future motion prediction. Our method outperforms the currently best-performing action-anticipation methods by 4% on JHMDB-21, 5.2% on UT-Interaction and 5.1% on UCF 101-24 benchmarks.
Submitted 31 July, 2018;
originally announced August 2018.
-
Apache Calcite: A Foundational Framework for Optimized Query Processing Over Heterogeneous Data Sources
Authors:
Edmon Begoli,
Jesús Camacho Rodríguez,
Julian Hyde,
Michael J. Mior,
Daniel Lemire
Abstract:
Apache Calcite is a foundational software framework that provides query processing, optimization, and query language support to many popular open-source data processing systems such as Apache Hive, Apache Storm, Apache Flink, Druid, and MapD. Calcite's architecture consists of a modular and extensible query optimizer with hundreds of built-in optimization rules, a query processor capable of processing a variety of query languages, an adapter architecture designed for extensibility, and support for heterogeneous data models and stores (relational, semi-structured, streaming, and geospatial). This flexible, embeddable, and extensible architecture is what makes Calcite an attractive choice for adoption in big-data frameworks. It is an active project that continues to introduce support for new types of data sources, query languages, and approaches to query processing and optimization.
Submitted 27 February, 2018;
originally announced February 2018.
-
Quasi-Optimal Partial Order Reduction
Authors:
Huyen T. T Nguyen,
César Rodríguez,
Marcelo Sousa,
Camille Coti,
Laure Petrucci
Abstract:
A dynamic partial order reduction (DPOR) algorithm is optimal when it always explores at most one representative per Mazurkiewicz trace. Existing literature suggests that the reduction obtained by the non-optimal, state-of-the-art Source-DPOR (SDPOR) algorithm is comparable to optimal DPOR. We show the first program with $\mathop{\mathcal{O}}(n)$ Mazurkiewicz traces where SDPOR explores $\mathop{\mathcal{O}}(2^n)$ redundant schedules and identify the cause of the blow-up as an NP-hard problem. Our main contribution is a new approach, called Quasi-Optimal POR, that can arbitrarily approximate an optimal exploration using a provided constant k. We present an implementation of our method in a new tool called Dpu using specialised data structures. Experiments with Dpu, including Debian packages, show that optimality is achieved with low values of k, outperforming state-of-the-art tools.
Submitted 20 April, 2018; v1 submitted 12 February, 2018;
originally announced February 2018.
-
Programming Bots by Synthesizing Natural Language Expressions into API Invocations
Authors:
Shayan Zamanirad,
Boualem Benatallah,
Moshe Chai Barukh,
Fabio Casati,
Carlos Rodriguez
Abstract:
At present, bots are still in their preliminary stages of development. Many are relatively simple, or developed ad hoc for a very specific use case. For this reason, they are typically programmed manually, or utilize machine-learning classifiers to interpret a fixed set of user utterances. In reality, real-world conversations with humans require support for dynamically capturing users' expressions. Moreover, bots will derive immeasurable value from being programmed to invoke APIs for their results. Today, within the Web and Mobile development community, complex applications are being strung together with a few lines of code, all made possible by APIs. Yet developers today are not as empowered to program bots in much the same way. To overcome this, we introduce BotBase, a bot programming platform that dynamically synthesizes natural language user expressions into API invocations. Our solution is two-faceted: first, we construct an API knowledge graph to encode and evolve APIs; second, leveraging this graph, we apply techniques in NLP, ML and Entity Recognition to perform the required synthesis from natural language user expressions into API calls.
Submitted 15 November, 2017;
originally announced November 2017.
-
Algorithms and Architecture for Real-time Recommendations at News UK
Authors:
Dion Bailey,
Tom Pajak,
Daoud Clarke,
Carlos Rodriguez
Abstract:
Recommendation systems are recognised as being hugely important in industry, and the area is now well understood. At News UK, there is a requirement to be able to quickly generate recommendations for users on news items as they are published. However, little has been published about systems that can generate recommendations in response to changes in recommendable items and user behaviour in a very short space of time. In this paper we describe a new algorithm for updating collaborative filtering models incrementally, and demonstrate its effectiveness on clickstream data from The Times. We also describe the architecture that allows recommendations to be generated on the fly, and how we have made each component scalable. The system is currently being used in production at News UK.
Submitted 15 September, 2017;
originally announced September 2017.
-
Ephemeral Context to Support Robust and Diverse Music Recommendations
Authors:
Pavel Kucherbaev,
Nava Tintarev,
Carlos Rodriguez
Abstract:
While prior work on context-based music recommendation focused on a fixed set of contexts (e.g. walking, driving, jogging), we propose to use multiple sensors and external data sources to describe momentary (ephemeral) context in a rich way, with a very large number of possible states (e.g. jogging fast in downtown Sydney in heavy rain at night while tired and angry). With our approach, we address two problems that current approaches face: 1) a limited ability to infer context from missing or faulty sensor data; 2) an inability to use contextual information to support novel content discovery.
Submitted 9 August, 2017;
originally announced August 2017.
-
Abstract Interpretation with Unfoldings
Authors:
Marcelo Sousa,
César Rodríguez,
Vijay D'Silva,
Daniel Kroening
Abstract:
We present and evaluate a technique for computing path-sensitive interference conditions during abstract interpretation of concurrent programs. In lieu of fixed point computation, we use prime event structures to compactly represent causal dependence and interference between sequences of transformers. Our main contribution is an unfolding algorithm that uses a new notion of independence to avoid redundant transformer application, thread-local fixed points to reduce the size of the unfolding, and a novel cutoff criterion based on subsumption to guarantee termination of the analysis. Our experiments show that the abstract unfolding produces an order of magnitude fewer false alarms than a mature abstract interpreter, while being several orders of magnitude faster than solver-based tools that have the same precision.
Submitted 1 May, 2017;
originally announced May 2017.
-
Unfolding-Based Process Discovery
Authors:
Hernán Ponce-de-León,
César Rodríguez,
Josep Carmona,
Keijo Heljanko,
Stefan Haar
Abstract:
This paper presents a novel technique for process discovery. In contrast to the current trend, which only considers an event log for discovering a process model, we assume two additional inputs: an independence relation on the set of logged activities, and a collection of negative traces. After deriving an intermediate net unfolding from them, we perform a controlled folding giving rise to a Petri net which contains both the input log and all independence-equivalent traces arising from it. Remarkably, the derived Petri net cannot execute any trace from the negative collection. The entire chain of transformations is fully automated. A tool has been developed and experimental results are provided that witness the significance of the contribution of this paper.
Submitted 9 July, 2015;
originally announced July 2015.
-
Unfolding-based Partial Order Reduction
Authors:
César Rodríguez,
Marcelo Sousa,
Subodh Sharma,
Daniel Kroening
Abstract:
Partial order reduction (POR) and net unfoldings are two alternative methods to tackle state-space explosion caused by concurrency. In this paper, we propose the combination of both approaches in an effort to combine their strengths. We first define, for an abstract execution model, unfolding semantics parameterized over an arbitrary independence relation. Based on it, our main contribution is a novel stateless POR algorithm that explores at most one execution per Mazurkiewicz trace, and in general, can explore exponentially fewer, thus achieving a form of super-optimality. Furthermore, our unfolding-based POR copes with non-terminating executions and incorporates state-caching. Over benchmarks with busy-waits, among others, our experiments show a dramatic reduction in the number of executions when compared to a state-of-the-art DPOR.
Submitted 3 July, 2015;
originally announced July 2015.
-
Towards a New Paradigm for Privacy and Security in Cloud Services
Authors:
Thomas Loruenser,
Charles Bastos Rodriguez,
Denise Demirel,
Simone Fischer-Huebner,
Thomas Gross,
Thomas Langer,
Mathieu des Noes,
Henrich C. Poehls,
Boris Rozenberg,
Daniel Slamanig
Abstract:
The market for cloud computing can be considered as the major growth area in ICT. However, big companies and public authorities are reluctant to entrust their most sensitive data to external parties for storage and processing. The reason for their hesitation is clear: There exist no satisfactory approaches to adequately protect the data during its lifetime in the cloud. The EU Project Prismacloud (Horizon 2020 programme; duration 2/2015-7/2018) addresses these challenges and yields a portfolio of novel technologies to build security enabled cloud services, guaranteeing the required security with the strongest notion possible, namely by means of cryptography. We present a new approach towards a next generation of security and privacy enabled services to be deployed in only partially trusted cloud infrastructures.
Submitted 21 July, 2015; v1 submitted 19 June, 2015;
originally announced June 2015.
-
Model Checking Contest @ Petri Nets, Report on the 2013 edition
Authors:
Fabrice Kordon,
Alban Linard,
Marco Beccuti,
Didier Buchs,
Łukasz Fronc,
Lom-Messan Hillah,
Francis Hulin-Hubard,
Fabrice Legond-Aubry,
Niels Lohmann,
Alexis Marechal,
Emmanuel Paviot-Adet,
Franck Pommereau,
César Rodríguez,
Christian Rohr,
Yann Thierry-Mieg,
Harro Wimmel,
Karsten Wolf
Abstract:
This document presents the results of the Model Checking Contest held at Petri Nets 2013 in Milano. This contest aimed at a fair and experimental evaluation of the performances of model checking techniques applied to Petri nets. This is the third edition after two successful editions in 2011 and 2012.
The participating tools were compared on several examinations (state space generation and evaluation of several types of formulæ -- reachability, LTL, CTL for various classes of atomic propositions) run on a set of common models (Place/Transition and Symmetric Petri nets).
After a short overview of the contest, this paper provides the raw results from the contest, model per model and examination per examination. An HTML version of this report is also provided (http://mcc.lip6.fr).
Submitted 10 September, 2013;
originally announced September 2013.
-
High-Rate Short-Block LDPC Codes for Iterative Decoding with Applications to High-Density Magnetic Recording Channels
Authors:
Damian A. Morero,
Graciela Corral-Briones,
Carmen Rodriguez,
Mario R. Hueda
Abstract:
This paper investigates the Triangle Single Parity Check (T/SPC) code, a novel class of high-rate low-complexity LDPC codes. T/SPC is a regular, soft decodable, linear-time encodable/decodable code. Compared to previous high-rate and low-complexity LDPC codes, such as the well-known Turbo Product Code / Single Parity Check (TPC/SPC), T/SPC provides higher code rates, shorter code words, and lower complexity. This makes T/SPC very attractive for practical implementation on integrated circuits.
In addition, we analyze the performance of iterative decoders based on a soft-input soft-output (SISO) equalizer using T/SPC over high-density perpendicular magnetic recording channels. Computer simulations show that the proposed scheme is able to achieve a gain of up to 0.3 dB over TPC/SPC codes with a significant reduction of implementation complexity.
Submitted 7 April, 2011;
originally announced April 2011.
-
Metalinguistic Information Extraction for Terminology
Authors:
Carlos Rodriguez
Abstract:
This paper describes and evaluates the Metalinguistic Operation Processor (MOP) system for automatic compilation of metalinguistic information from technical and scientific documents. This system is designed to extract non-standard terminological resources that we have called Metalinguistic Information Databases (or MIDs), in order to help update changing glossaries, knowledge bases and ontologies, as well as to reflect the metastable dynamics of special-domain knowledge.
Submitted 15 April, 2005;
originally announced April 2005.