-
BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
Authors:
Dujian Ding,
Ankur Mallick,
Shaokun Zhang,
Chi Wang,
Daniel Madrigal,
Mirian Del Carmen Hipolito Garcia,
Menglin Xia,
Laks V. S. Lakshmanan,
Qingyun Wu,
Victor Rühle
Abstract:
Large language models (LLMs) are powerful tools but are often expensive to deploy at scale. LLM query routing mitigates this by dynamically assigning queries to models of varying cost and quality to obtain a desired trade-off. Prior query routing approaches generate only one response from the selected model. Because a single response from a small (inexpensive) model is often not good enough to beat a response from a large (expensive) model, these approaches end up overusing the large model and missing out on potential cost savings. However, it is well known that for small models, generating multiple responses and selecting the best can enhance quality while remaining cheaper than a single large-model response. We leverage this idea to propose BEST-Route, a novel routing framework that chooses a model and the number of responses to sample from it based on query difficulty and the quality thresholds. Experiments on real-world datasets demonstrate that our method reduces costs by up to 60% with less than 1% performance drop.
Submitted 27 June, 2025;
originally announced June 2025.
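The routing idea in the BEST-Route abstract above (choose a model and a number of responses to sample, then keep the best response) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Option` record, the `route` and `best_of_n` helpers, the model names, and all cost and predicted-quality numbers are hypothetical, and a real router would have to estimate quality per query.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Option:
    """One routing choice: a model plus how many responses to sample from it."""
    model: str           # hypothetical model name
    n_samples: int       # number of responses to sample (best-of-n)
    cost: float          # assumed total cost of sampling n_samples responses
    pred_quality: float  # assumed predicted quality of the best response


def route(options: Sequence[Option], quality_threshold: float) -> Option:
    """Return the cheapest option whose predicted quality meets the threshold.

    If no option qualifies, fall back to the highest-quality option.
    """
    qualifying = [o for o in options if o.pred_quality >= quality_threshold]
    if qualifying:
        return min(qualifying, key=lambda o: o.cost)
    return max(options, key=lambda o: o.pred_quality)


def best_of_n(responses: Sequence[str], score: Callable[[str], float]) -> str:
    """Select the highest-scoring response among the sampled candidates."""
    return max(responses, key=score)


# Made-up numbers for a single query, for illustration only.
OPTIONS = [
    Option("small", 1, cost=1.0, pred_quality=0.60),
    Option("small", 4, cost=4.0, pred_quality=0.80),
    Option("large", 1, cost=10.0, pred_quality=0.90),
]
```

With these illustrative numbers, a threshold of 0.75 selects four samples from the small model, while a threshold of 0.85 selects the large model.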
-
MDC-R: The Minecraft Dialogue Corpus with Reference
Authors:
Chris Madge,
Maris Camilleri,
Paloma Carretero Garcia,
Mladen Karan,
Juexi Shao,
Prashant Jayannavar,
Julian Hough,
Benjamin Roth,
Massimo Poesio
Abstract:
We introduce the Minecraft Dialogue Corpus with Reference (MDC-R). MDC-R is a new language resource that supplements the original Minecraft Dialogue Corpus (MDC) with expert annotations of anaphoric and deictic reference. MDC's task-orientated, multi-turn, situated dialogue in a dynamic environment has motivated multiple annotation efforts, owing to the interesting linguistic phenomena that this setting gives rise to. We believe it can serve as a valuable resource when annotated with reference, too. Here, we discuss our method of annotation and the resulting corpus, and provide both a quantitative and a qualitative analysis of the data. Furthermore, we carry out a short experiment demonstrating the usefulness of our corpus for referring expression comprehension.
Submitted 27 June, 2025;
originally announced June 2025.
-
Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search
Authors:
Dongge Han,
Menglin Xia,
Daniel Madrigal Diaz,
Samuel Kessler,
Ankur Mallick,
Xuchao Zhang,
Mirian Del Carmen Hipolito Garcia,
Jin Xu,
Victor Rühle,
Saravan Rajmohan
Abstract:
Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs). However, SLMs' limited capacity restricts their reasoning capabilities and makes them sensitive to prompt variations. To address these challenges, we propose a novel framework that enhances SLM reasoning capabilities through LLM generated blueprints. The blueprints provide structured, high-level reasoning guides that help SLMs systematically tackle related problems. Furthermore, our framework integrates a prompt template search mechanism to mitigate the SLMs' sensitivity to prompt variations. Our framework demonstrates improved SLM performance across various tasks, including math (GSM8K), coding (MBPP), and logic reasoning (BBH). Our approach improves the reasoning capabilities of SLMs without increasing model size or requiring additional training, offering a lightweight and deployment-friendly solution for on-device or resource-constrained environments.
Submitted 10 June, 2025;
originally announced June 2025.
-
Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios
Authors:
Gerard I. Gállego,
Oriol Pareras,
Martí Cortada Garcia,
Lucas Takanori,
Javier Hernando
Abstract:
We propose a Speech-to-Text Translation (S2TT) approach that integrates phoneme representations into a Chain-of-Thought (CoT) framework to improve translation in low-resource and zero-resource settings. By introducing phoneme recognition as an intermediate step, we enhance cross-lingual transfer, enabling translation even for languages with no labeled speech data. Our system builds on a multilingual LLM, which we extend to process speech and phonemes. Training follows a curriculum learning strategy that progressively introduces more complex tasks. Experiments on multilingual S2TT benchmarks show that phoneme-augmented CoT improves translation quality in low-resource conditions and enables zero-resource translation, while slightly impacting high-resource performance. Despite this trade-off, our findings demonstrate that phoneme-based CoT is a promising step toward making S2TT more accessible across diverse languages.
Submitted 30 May, 2025;
originally announced May 2025.
-
Higher-Order Convolution Improves Neural Predictivity in the Retina
Authors:
Simone Azeglio,
Victor Calbiague Garcia,
Guilhem Glaziou,
Peter Neri,
Olivier Marre,
Ulisse Ferrari
Abstract:
We present a novel approach to neural response prediction that incorporates higher-order operations directly within convolutional neural networks (CNNs). Our model extends traditional 3D CNNs by embedding higher-order operations within the convolutional operator itself, enabling direct modeling of multiplicative interactions between neighboring pixels across space and time. Our model increases the representational power of CNNs without increasing their depth, therefore addressing the architectural disparity between deep artificial networks and the relatively shallow processing hierarchy of biological visual systems. We evaluate our approach on two distinct datasets: salamander retinal ganglion cell (RGC) responses to natural scenes, and a new dataset of mouse RGC responses to controlled geometric transformations. Our higher-order CNN (HoCNN) achieves superior performance while requiring only half the training data compared to standard architectures, demonstrating correlation coefficients up to 0.75 with neural responses (against 0.80$\pm$0.02 retinal reliability). When integrated into state-of-the-art architectures, our approach consistently improves performance across different species and stimulus conditions. Analysis of the learned representations reveals that our network naturally encodes fundamental geometric transformations, particularly scaling parameters that characterize object expansion and contraction. This capability is especially relevant for specific cell types, such as transient OFF-alpha and transient ON cells, which are known to detect looming objects and object motion respectively, and where our model shows marked improvement in response prediction. The correlation coefficients for scaling parameters are more than twice as high in HoCNN (0.72) compared to baseline models (0.32).
Submitted 12 May, 2025;
originally announced May 2025.
-
Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification
Authors:
Daniel Strick,
Carlos Garcia,
Anthony Huang
Abstract:
Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become a standard practice in modern medicine. On the publicly available NIH ChestX-ray14 dataset, containing X-ray images that are classified by the presence or absence of 14 different diseases, we reproduced an algorithm known as CheXNet, as well as explored other algorithms that outperform CheXNet's baseline metrics. Model performance was primarily evaluated using the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC score of 0.85 and an average F1 score of 0.39 across all 14 disease classifications present in the dataset.
Submitted 10 May, 2025;
originally announced May 2025.
-
Safe Autonomous Environmental Contact for Soft Robots using Control Barrier Functions
Authors:
Akua K. Dickson,
Juan C. Pacheco Garcia,
Meredith L. Anderson,
Ran Jing,
Sarah Alizadeh-Shabdiz,
Audrey X. Wang,
Charles DeLorey,
Zach J. Patterson,
Andrew P. Sabelhaus
Abstract:
Robots built from soft materials will inherently apply lower environmental forces than their rigid counterparts, and therefore may be more suitable in sensitive settings with unintended contact. However, these robots' applied forces result from both their design and their control system in closed-loop, and therefore, ensuring bounds on these forces requires controller synthesis for safety as well. This article introduces the first feedback controller for a soft manipulator that formally meets a safety specification with respect to environmental contact. In our proof-of-concept setting, the robot's environment has known geometry and is deformable with a known elastic modulus. Our approach maps a bound on applied forces to a safe set of positions of the robot's tip via predicted deformations of the environment. Then, a quadratic program with Control Barrier Functions in its constraints is used to supervise a nominal feedback signal, verifiably maintaining the robot's tip within this safe set. Hardware experiments on a multi-segment soft pneumatic robot demonstrate that the proposed framework successfully constrains its environmental contact forces. This framework represents a fundamental shift in perspective on control and safety for soft robots, defining and implementing a formally verifiable logic specification on their pose and contact forces.
Submitted 20 April, 2025;
originally announced April 2025.
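The supervision step described in the abstract above (a quadratic program with a Control Barrier Function constraint that minimally modifies a nominal control signal) can be sketched on a toy problem. The sketch below is not the paper's controller: it assumes a hypothetical one-dimensional single integrator x' = u and a safe set x <= x_max, for which the quadratic program has a closed-form solution. The names `cbf_filter`, `simulate`, `x_max`, and `alpha` are illustrative.

```python
def cbf_filter(u_nom: float, x: float, x_max: float, alpha: float = 1.0) -> float:
    """Minimally modify u_nom so that the barrier h(x) = x_max - x stays non-negative.

    For x' = u the CBF condition dh/dt + alpha * h >= 0 reads
        -u + alpha * (x_max - x) >= 0,  i.e.  u <= alpha * (x_max - x).
    The QP  min_u (u - u_nom)^2  subject to that single bound is solved
    in closed form by clipping u_nom at the bound.
    """
    u_bound = alpha * (x_max - x)
    return min(u_nom, u_bound)


def simulate(x0: float, x_max: float, u_nom: float,
             dt: float = 0.01, steps: int = 1000, alpha: float = 1.0) -> float:
    """Forward-Euler simulation of x' = u under the filtered control."""
    x = x0
    for _ in range(steps):
        x += dt * cbf_filter(u_nom, x, x_max, alpha)
    return x


# An aggressive nominal input pushes toward the boundary; the filter keeps
# the state inside the safe set (for dt * alpha <= 1 in this discretization).
final_x = simulate(x0=0.0, x_max=1.0, u_nom=5.0)
```

With several constraints or a multi-dimensional input, the clipping step would be replaced by a call to a QP solver; the single-constraint case is used here only because it needs no external dependency.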
-
TheBlueScrubs-v1, a comprehensive curated medical dataset derived from the internet
Authors:
Luis Felipe,
Carlos Garcia,
Issam El Naqa,
Monique Shotande,
Aakash Tripathi,
Vivek Rudrapatna,
Ghulam Rasool,
Danielle Bitterman,
Gilmer Valdes
Abstract:
The need for robust and diverse data sets to train clinical large language models (cLLMs) is critical given that currently available public repositories often prove too limited in size or scope for comprehensive medical use. While resources like PubMed provide foundational medical literature, they capture only a narrow range of formal publications and omit the broader medical discourse on the internet. To address these deficits, we introduce TheBlueScrubs-v1, a curated dataset of over 25 billion medical tokens - nearly three times larger than PubMed - drawn from a broad-scale internet corpus. Our two-stage filtering pipeline employs a Logistic Regression model for document screening (achieving an AUC of approximately 0.95 on external validation), followed by verification via a 70B-parameter Llama 3.1 instruct model. Each text is assigned three LLM-based quality scores encompassing medical relevance, precision and factual detail, and safety and ethical standards. Clinician reviews confirm high concordance with these automated evaluations, and a specialized cancer classifier further labels approximately 11 billion oncology tokens. Two demonstration tasks highlight the dataset's practical value: first, we distill the safety evaluations to a smaller BERT-style model that reaches an AUC near 0.96 on unseen data; second, we fine-tune a compact LLM on a filtered subset, showing measurable improvements over standard baselines in medical benchmarks as well as private ones. This Data Descriptor details the dataset's creation and validation, underscoring its potential utility for medical AI research.
Submitted 1 April, 2025;
originally announced April 2025.
-
Escaping The Big Data Paradigm in Self-Supervised Representation Learning
Authors:
Carlos Vélez García,
Miguel Cazorla,
Jorge Pomares
Abstract:
The reliance on large-scale datasets and extensive computational resources has become a major barrier to advancing representation learning in vision, especially in data-scarce domains. In this paper, we address the critical question: Can we escape the big data paradigm in self-supervised representation learning from images? We introduce SCOTT (Sparse Convolutional Tokenizer for Transformers), a shallow tokenization architecture that is compatible with Masked Image Modeling (MIM) tasks. SCOTT injects convolutional inductive biases into Vision Transformers (ViTs), enhancing their efficacy in small-scale data regimes. Alongside, we propose to train on a Joint-Embedding Predictive Architecture within a MIM framework (MIM-JEPA), operating in latent representation space to capture more semantic features. Our approach enables ViTs to be trained from scratch on datasets orders of magnitude smaller than traditionally required, without relying on massive external datasets for pretraining. We validate our method on three small-size, standard-resolution, fine-grained datasets: Oxford Flowers-102, Oxford IIIT Pets-37, and ImageNet-100. Despite the challenges of limited data and high intra-class similarity, frozen SCOTT models pretrained with MIM-JEPA significantly outperform fully supervised methods and achieve competitive results with SOTA approaches that rely on large-scale pretraining, complex image augmentations and bigger model sizes. By demonstrating that robust off-the-shelf representations can be learned with limited data, compute, and model sizes, our work paves the way for computer applications in resource constrained environments such as medical imaging or robotics. Our findings challenge the prevailing notion that vast amounts of data are indispensable for effective representation learning in vision, offering a new pathway toward more accessible and inclusive advancements in the field.
Submitted 25 February, 2025;
originally announced February 2025.
-
KPIs 2024 Challenge: Advancing Glomerular Segmentation from Patch- to Slide-Level
Authors:
Ruining Deng,
Tianyuan Yao,
Yucheng Tang,
Junlin Guo,
Siqi Lu,
Juming Xiong,
Lining Yu,
Quan Huu Cap,
Pengzhou Cai,
Libin Lan,
Ze Zhao,
Adrian Galdran,
Amit Kumar,
Gunjan Deotale,
Dev Kumar Das,
Inyoung Paik,
Joonho Lee,
Geongyu Lee,
Yujia Chen,
Wangkai Li,
Zhaoyang Li,
Xuege Hou,
Zeyuan Wu,
Shengjin Wang,
Maximilian Fischer
, et al. (22 additional authors not shown)
Abstract:
Chronic kidney disease (CKD) is a major global health issue, affecting over 10% of the population and causing significant mortality. While kidney biopsy remains the gold standard for CKD diagnosis and treatment, the lack of comprehensive benchmarks for kidney pathology segmentation hinders progress in the field. To address this, we organized the Kidney Pathology Image Segmentation (KPIs) Challenge, introducing a dataset that incorporates preclinical rodent models of CKD with over 10,000 annotated glomeruli from 60+ Periodic Acid Schiff (PAS)-stained whole slide images. The challenge includes two tasks, patch-level segmentation and whole slide image segmentation and detection, evaluated using the Dice Similarity Coefficient (DSC) and F1-score. By encouraging innovative segmentation methods that adapt to diverse CKD models and tissue conditions, the KPIs Challenge aims to advance kidney pathology analysis, establish new benchmarks, and enable precise, large-scale quantification for disease research and diagnosis.
Submitted 11 February, 2025;
originally announced February 2025.
-
Real-Time Trajectory Generation for Soft Robot Manipulators Using Differential Flatness
Authors:
Akua Dickson,
Juan C. Pacheco Garcia,
Ran Jing,
Meredith L. Anderson,
Andrew P. Sabelhaus
Abstract:
Soft robots have the potential to interact with sensitive environments and perform complex tasks effectively. However, motion plans and trajectories for soft manipulators are challenging to calculate due to their deformable nature and nonlinear dynamics. This article introduces a fast real-time trajectory generation approach for soft robot manipulators, which creates dynamically-feasible motions for arbitrary kinematically-feasible paths of the robot's end effector. Our insight is that piecewise constant curvature (PCC) dynamics models of soft robots can be differentially flat, so control inputs can be calculated algebraically rather than through a nonlinear differential equation. We prove this flatness under certain conditions, with the curvatures of the robot as the flat outputs. Our two-step trajectory generation approach first uses an inverse kinematics procedure to calculate a motion plan of robot curvatures per end-effector position; our flatness diffeomorphism then generates corresponding control inputs that respect velocity. We validate our approach through simulations of our representative soft robot manipulator along three different trajectories, demonstrating a margin of 23x faster than real-time at a frequency of 100 Hz. This approach could allow fast verifiable replanning of soft robots' motions in safety-critical physical environments, crucial for deployment in the real world.
Submitted 11 December, 2024;
originally announced December 2024.
-
Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems
Authors:
Alejandro Castañeda Garcia,
Jan van Gemert,
Daan Brinks,
Nergis Tömen
Abstract:
Extracting physical dynamical system parameters from recorded observations is key in natural science. Current methods for automatic parameter estimation from video train supervised deep networks on large datasets. Such datasets require labels, which are difficult to acquire. While some unsupervised techniques (which depend on frame prediction) exist, they suffer from long training times and initialization instabilities, only consider motion-based dynamical systems, and are evaluated mainly on synthetic data. In this work, we propose an unsupervised method to estimate the physical parameters of known, continuous governing equations from single videos, suitable for different dynamical systems beyond motion and robust to initialization. Moreover, we remove the need for frame prediction by implementing a KL-divergence-based loss function in the latent space, which avoids convergence to trivial solutions and reduces model size and compute. We first evaluate our model on synthetic data, as commonly done. We then take the field closer to reality by recording Delfys75: our own real-world dataset of 75 videos for five different types of dynamical systems to evaluate our method and others. Our method compares favorably to others. Code and data are available online: https://github.com/Alejandro-neuro/Learning_physics_from_video.
Submitted 24 March, 2025; v1 submitted 2 October, 2024;
originally announced October 2024.
-
Self-Sensing for Proprioception and Contact Detection in Soft Robots Using Shape Memory Alloy Artificial Muscles
Authors:
Ran Jing,
Meredith L. Anderson,
Juan C. Pacheco Garcia,
Andrew P. Sabelhaus
Abstract:
Estimating a soft robot's pose and applied forces, also called proprioception, is crucial for safe interaction of the robot with its environment. However, most solutions for soft robot proprioception use dedicated sensors, particularly for external forces, which introduce design trade-offs, rigidity, and risk of failure. This work presents an approach for pose estimation and contact detection for soft robots actuated by shape memory alloy (SMA) artificial muscles, using no dedicated force sensors. Our framework uses the unique material properties of SMAs to self-sense their internal stress, via offboard measurements of their electrical resistance and in-situ temperature readings, in an existing fully-soft limb design. We demonstrate that a simple polynomial regression model on these measurements is sufficient to predict the robot's pose, under no-contact conditions. Then, we show that if an additional measurement of the true pose is available (e.g. from an already-in-place bending sensor), it is possible to predict a binary contact/no-contact state using multiple combinations of self-sensing signals. Our hardware tests verify our hypothesis via a contact detection test with a human operator. This proof-of-concept validates that self-sensing signals in SMA-actuated soft robots can be used for proprioception and contact detection, and suggests a direction for integrating proprioception into soft robots without design compromises. Future work could employ machine learning for enhanced accuracy.
Submitted 25 September, 2024;
originally announced September 2024.
-
Perfect codes over non-prime power alphabets: an approach based on Diophantine equations
Authors:
Pedro-José Cazorla García
Abstract:
Perfect error correcting codes allow for an optimal transmission of information while guaranteeing error correction. For this reason, proving their existence has been a classical problem in both pure mathematics and information theory. Indeed, the classification of the parameters of $e$-error correcting perfect codes over $q$-ary alphabets was a very active topic of research in the late 20th century. Consequently, all parameters of perfect $e$-error correcting codes were found if $e \ge 3$, and it was conjectured that no perfect $2$-error correcting codes exist over any $q$-ary alphabet, where $q > 3$. In the 1970s, this was proved for $q$ a prime power, for $q = 2^r3^s$ and for only $7$ other values of $q$. Almost $50$ years later, it is surprising to note that there have been no new results in this regard and the classification of $2$-error correcting codes over non-prime power alphabets remains an open problem. In this paper, we use techniques from the resolution of the generalised Ramanujan--Nagell equation and from modern computational number theory to show that perfect $2$-error correcting codes do not exist for $172$ new values of $q$ which are not prime powers, substantially increasing the values of $q$ which are now classified. In addition, we prove that, for any fixed value of $q$, there can be at most finitely many perfect $2$-error correcting codes over an alphabet of size $q$.
Submitted 24 May, 2024; v1 submitted 6 May, 2024;
originally announced May 2024.
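As background for the perfect-codes abstract above, the standard sphere-packing (Hamming bound) condition can be checked numerically. A perfect $e$-error correcting code of length $n$ over a $q$-ary alphabet partitions the space into Hamming balls of radius $e$, so the ball size must divide $q^n$. The sketch below tests only this necessary condition and is not the Diophantine technique of the paper; the function names are illustrative.

```python
from math import comb


def sphere_size(n: int, q: int, e: int) -> int:
    """Number of words within Hamming distance e of a fixed word of length n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))


def satisfies_sphere_packing(n: int, q: int, e: int) -> bool:
    """Necessary condition for a perfect code: the ball size divides q**n."""
    return q ** n % sphere_size(n, q, e) == 0


# Parameters of classical perfect codes pass the check:
#   binary Hamming code  (n=7,  q=2, e=1): ball size 8
#   binary Golay code    (n=23, q=2, e=3): ball size 2048
#   ternary Golay code   (n=11, q=3, e=2): ball size 243
KNOWN_PERFECT = [(7, 2, 1), (23, 2, 3), (11, 3, 2)]
all_pass = all(satisfies_sphere_packing(n, q, e) for n, q, e in KNOWN_PERFECT)
```

Passing this check does not imply that a perfect code exists; it only rules out parameter triples that fail it.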
-
Parallel-in-Time Integration of Transient Phenomena in No-Insulation Superconducting Coils Using Parareal
Authors:
Erik Schnaubelt,
Mariusz Wozniak,
Julien Dular,
Idoia Cortes Garcia,
Arjan Verweij,
Sebastian Schöps
Abstract:
High-temperature superconductors (HTS) have the potential to enable magnetic fields beyond the current limits of low-temperature superconductors in applications like accelerator magnets. However, the design of HTS-based magnets requires computationally demanding transient multi-physics simulations with highly non-linear material properties. To reduce the solution time, we propose using Parareal (PR) for parallel-in-time magneto-thermal simulation of magnets based on HTS, particularly, no-insulation coils without turn-to-turn insulation. We propose extending the classical PR method to automatically find a time partitioning using a first coarse adaptive propagator. The proposed PR method is shown to reduce the computing time when fine engineering tolerances are required despite the highly nonlinear character of the problem. The full software stack used is open-source.
Submitted 20 April, 2024;
originally announced April 2024.
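The classical Parareal iteration mentioned in the abstract above can be sketched for a scalar ODE. This is a minimal illustration with a fixed, uniform time partition; it does not implement the adaptive partitioning the authors propose. The forward-Euler propagators, the test equation, and all function and parameter names are assumptions made for the example.

```python
def euler(f, y, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with forward Euler."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, n_intervals, n_iters, fine_steps=100):
    """Classical Parareal on a uniform partition of [t0, t1].

    Coarse propagator G: one Euler step per interval.
    Fine propagator F: `fine_steps` Euler steps per interval.
    Update: U[i+1] <- G(U_new[i]) + F(U_old[i]) - G(U_old[i]).
    """
    ts = [t0 + (t1 - t0) * i / n_intervals for i in range(n_intervals + 1)]

    def coarse(y, i):
        return euler(f, y, ts[i], ts[i + 1], 1)

    def fine(y, i):
        return euler(f, y, ts[i], ts[i + 1], fine_steps)

    # Initial guess: one sequential coarse sweep.
    U = [y0]
    for i in range(n_intervals):
        U.append(coarse(U[i], i))

    for _ in range(n_iters):
        # The fine solves depend only on the previous iterate, so in a
        # real implementation they would run in parallel.
        fine_old = [fine(U[i], i) for i in range(n_intervals)]
        coarse_old = [coarse(U[i], i) for i in range(n_intervals)]
        U_new = [y0]
        for i in range(n_intervals):
            U_new.append(coarse(U_new[i], i) + fine_old[i] - coarse_old[i])
        U = U_new
    return ts, U


def decay(t, y):
    """Right-hand side of the test equation y' = -y."""
    return -y


ts, U = parareal(decay, y0=1.0, t0=0.0, t1=2.0, n_intervals=4, n_iters=4)
```

After as many iterations as there are intervals, the result matches the sequential fine solution up to rounding, so any speed-up in practice depends on converging in fewer iterations.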
-
Voice EHR: Introducing Multimodal Audio Data for Health
Authors:
James Anibal,
Hannah Huth,
Ming Li,
Lindsey Hazen,
Veronica Daoud,
Dominique Ebedes,
Yen Minh Lam,
Hang Nguyen,
Phuc Hong,
Michael Kleinman,
Shelley Ost,
Christopher Jackson,
Laura Sprabery,
Cheran Elangovan,
Balaji Krishnaiah,
Lee Akst,
Ioan Lina,
Iqbal Elyazar,
Lenny Ekwati,
Stefan Jansen,
Richard Nduwayezu,
Charisse Garcia,
Jeffrey Plum,
Jacqueline Brenner,
Miranda Song
, et al. (5 additional authors not shown)
Abstract:
Artificial intelligence (AI) models trained on audio data may have the potential to rapidly perform clinical tasks, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive recording equipment in high-income countries, which challenges deployment in resource-constrained, high-volume settings where audio data may have a profound impact on health equity. This report introduces a novel data type and a corresponding collection system that captures health data through guided questions using only a mobile/web application. The app facilitates the collection of an audio electronic health record (Voice EHR) which may contain complex biomarkers of health from conventional voice/respiratory features, speech patterns, and spoken language with semantic meaning and longitudinal context, potentially compensating for the typical limitations of unimodal clinical datasets. This report presents the application used for data collection, initial experiments on data quality, and case studies which demonstrate the potential of voice EHR to advance the scalability/diversity of audio AI.
Submitted 9 November, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams
Authors:
Cristiano Mesquita Garcia,
Alessandro Lameiras Koerich,
Alceu de Souza Britto Jr,
Jean Paul Barddal
Abstract:
The proliferation of textual data on the Internet presents a unique opportunity for institutions and companies to monitor public opinion about their services and products. Given the rapid generation of such data, the text stream mining setting, which handles sequentially arriving, potentially infinite text streams, is often more suitable than traditional batch learning. While pre-trained language models are commonly employed for their high-quality text vectorization capabilities in streaming contexts, they face challenges adapting to concept drift - the phenomenon where the data distribution changes over time, adversely affecting model performance. Addressing the issue of concept drift, this study explores the efficacy of seven text sampling methods designed to selectively fine-tune language models, thereby mitigating performance degradation. We precisely assess the impact of these methods on fine-tuning the SBERT model using four different loss functions. Our evaluation, focused on Macro F1-score and elapsed time, employs two text stream datasets and an incremental SVM classifier to benchmark performance. Our findings indicate that Softmax loss and Batch All Triplets loss are particularly effective for text stream classification, demonstrating that larger sample sizes generally correlate with improved macro F1-scores. Notably, our proposed WordPieceToken ratio sampling method significantly enhances performance with the identified loss functions, surpassing baseline results.
Submitted 16 August, 2024; v1 submitted 18 March, 2024;
originally announced March 2024.
-
Methods for Generating Drift in Text Streams
Authors:
Cristiano Mesquita Garcia,
Alessandro Lameiras Koerich,
Alceu de Souza Britto Jr,
Jean Paul Barddal
Abstract:
Systems and individuals produce data continuously. On the Internet, people share their knowledge, sentiments, and opinions, provide reviews about services and products, and so on. Automatically learning from these textual data can provide insights to organizations and institutions, thus preventing financial impacts, for example. To learn from textual data over time, the machine learning system must account for concept drift. Concept drift is a frequent phenomenon in real-world datasets and corresponds to changes in data distribution over time. For instance, a concept drift occurs when sentiments change or a word's meaning is adjusted over time. Although concept drift is frequent in real-world applications, benchmark datasets with labeled drifts are rare in the literature. To bridge this gap, this paper provides four textual drift generation methods to ease the production of datasets with labeled drifts. These methods were applied to Yelp and Airbnb datasets and tested using incremental classifiers respecting the stream mining paradigm to evaluate their ability to recover from the drifts. Results show that all methods have their performance degraded right after the drifts, and the incremental SVM is the fastest to run and recover the previous performance levels regarding accuracy and Macro F1-Score.
Submitted 18 March, 2024;
originally announced March 2024.
-
Advancing dermatological diagnosis: Development of a hyperspectral dermatoscope for enhanced skin imaging
Authors:
Martin J. Hetz,
Carina Nogueira Garcia,
Sarah Haggenmüller,
Titus J. Brinker
Abstract:
Clinical dermatology necessitates precision and innovation for efficient diagnosis and treatment of various skin conditions. This paper introduces the development of a cutting-edge hyperspectral dermatoscope (the Hyperscope) tailored for human skin analysis. We detail the requirements for such a device and the design considerations, from optical configurations to sensor selection, necessary to capture a wide spectral range with high fidelity. Preliminary results from 15 individuals and 160 recorded skin images demonstrate the potential of the Hyperscope in identifying and characterizing various skin conditions, offering a promising avenue for non-invasive skin evaluation and a platform for future research in dermatology-related hyperspectral imaging.
Submitted 25 June, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
Multi-organ Self-supervised Contrastive Learning for Breast Lesion Segmentation
Authors:
Hugo Figueiras,
Helena Aidos,
Nuno Cruz Garcia
Abstract:
Self-supervised learning has proven to be an effective way to learn representations in domains where annotated labels are scarce, such as medical imaging. A widely adopted framework for this purpose is contrastive learning and it has been applied to different scenarios. This paper seeks to advance our understanding of the contrastive learning framework by exploring a novel perspective: employing multi-organ datasets for pre-training models tailored to specific organ-related target tasks. More specifically, our target task is breast tumour segmentation in ultrasound images. The pre-training datasets include ultrasound images from other organs, such as the lungs and heart, and large datasets of natural images. Our results show that conventional contrastive learning pre-training improves performance compared to supervised baseline approaches. Furthermore, our pre-trained models achieve comparable performance when fine-tuned with only half of the available labelled data. Our findings also show the advantages of pre-training on diverse organ data for improving performance in the downstream task.
Submitted 21 February, 2024;
originally announced February 2024.
-
Temporal Analysis of Drifting Hashtags in Textual Data Streams: A Graph-Based Application
Authors:
Cristiano M. Garcia,
Alceu de Souza Britto Jr,
Jean Paul Barddal
Abstract:
Initially supported by Twitter, hashtags are now used on several social media platforms. Hashtags are helpful for tagging, tracking, and grouping posts on similar topics. In this paper, based on a hashtag stream regarding the hashtag #mybodymychoice, we analyze hashtag drifts over time using concepts from graph analysis and textual data streams using the Girvan-Newman method to uncover hashtag communities in annual snapshots between 2018 and 2022. In addition, we offer insights about some correlated hashtags found in the study. Our approach can be useful for monitoring changes over time in opinions and sentiment patterns about an entity on social media. Even though the hashtag #mybodymychoice was initially coupled with women's rights, abortion, and bodily autonomy, we observe that it suffered drifts during the studied period across topics such as drug legalization, vaccination, political protests, war, and civil rights. The year 2021 was the most significant drifting year, in which the communities detected and their respective sizes suggest that #mybodymychoice had a significant drift to vaccination and Covid-19-related topics.
Submitted 16 August, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Maximizing Consistent Force Output for Shape Memory Alloy Artificial Muscles in Soft Robots
Authors:
Meredith L. Anderson,
Ran Jing,
Juan C. Pacheco Garcia,
Ilyoung Yang,
Sarah Alizadeh-Shabdiz,
Charles DeLorey,
Andrew P. Sabelhaus
Abstract:
Soft robots have immense potential given their inherent safety and adaptability, but challenges in soft actuator forces and design constraints have limited scaling up soft robots to larger sizes. Electrothermal shape memory alloy (SMA) artificial muscles have the potential to create these large forces and high displacements, but consistently using these muscles under a well-defined model, in-situ in a soft robot, remains an open challenge. This article provides a system for maintaining the highest-possible consistent SMA forces, over long lifetimes, by combining a fatigue testing protocol with a supervisory control system for the muscles' internal temperature state. We propose a design of a soft limb with swap-able SMA muscles, and deploy the limb in a blocked-force test to quantify the relationship between the measured maximum force at different temperatures over different lifetimes. Then, by applying an invariance-based control system to maintain temperatures under our long-life limit, we demonstrate consistent high forces in a practical task over hundreds of cycles. The method we developed allows for practical implementation of SMAs in soft robots through characterizing and controlling their behavior in-situ, and provides a method to impose limits that maximize their consistent, repeatable behavior.
Submitted 9 February, 2024;
originally announced February 2024.
-
Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures
Authors:
Alberto Corpas,
Luis Costero,
Guillermo Botella,
Francisco D. Igual,
Carlos García,
Manuel Rodríguez
Abstract:
This paper proposes a mechanism to accelerate and optimize the energy consumption of a face detection software based on Haar-like cascading classifiers, taking advantage of the features of low-cost Asymmetric Multicore Processors (AMPs) with limited power budget. A modelling and task scheduling/allocation is proposed in order to efficiently make use of the existing features on big.LITTLE ARM processors, including: (I) source-code adaptation for parallel computing, which enables code acceleration by applying the OmpSs programming model, a task-based programming model that handles data-dependencies between tasks in a transparent fashion; (II) different OmpSs task allocation policies which take into account the processor asymmetry and can dynamically set processing resources in a more efficient way based on their particular features. The proposed mechanism can be efficiently applied to take advantage of the processing elements existing on low-cost and low-energy multi-core embedded devices executing object detection algorithms based on cascading classifiers. Although these classifiers yield the best results for detection algorithms in the field of computer vision, their high computational requirements prevent them from being used on these devices under real-time requirements. Finally, we compare the energy efficiency of a heterogeneous architecture based on asymmetric multicore processors with a suitable task scheduling, with that of a homogeneous symmetric architecture.
Submitted 6 February, 2024;
originally announced February 2024.
-
Simultaneous Calibration and Navigation (SCAN) of Multiple Ultrasonic Local Positioning Systems
Authors:
David Gualda,
Jesus Urena,
Juan C. Garcia,
Enrique Garcia,
Jose Alcala
Abstract:
This paper proposes a Simultaneous Calibration and Navigation (SCAN) algorithm for multiple Ultrasonic Local Positioning Systems (ULPSs) that cover an extensive indoor area. The idea extends the concept of SLAM (Simultaneous Localization and Mapping), in which a Mobile Robot (MR) estimates the map while it is navigating: here, the MR calibrates the beacons of several ULPSs while it moves inside the localization area. Calibration means estimating the positions of the beacons with reference to a known map. The scenario comprises some calibrated ULPSs, denoted Globally Referenced Ultrasonic Local Positioning Systems (GRULPSs), located at strategic points such as entrances, covering the start and the end of a possible trajectory in the environment. Additionally, several non-calibrated ULPSs, named Locally Referenced Ultrasonic Local Positioning Systems (LRULPSs), are placed around the localization area. The proposal uses an MR with an odometer to calibrate the beacons of the LRULPSs while it navigates through their coverage areas and goes from one GRULPS to another. The algorithm is based on multiple filters running in parallel (one filter for each LRULPS and another one for the GRULPSs) that estimate the global and local trajectories of the MR (one trajectory for each local reference system of the LRULPSs) by fusing the information from the Ultrasound Signals (US) and the odometer of the MR. The positions of the beacons of the LRULPSs are obtained through a transformation vector for each LRULPS that converts the local coordinates to the global reference system. The Extended Kalman Filter (EKF), Unscented Kalman Filter (UKF), and H-Inf Filter have been tested, in simulations and real experiments, to compare their performance in this setting.
Submitted 3 February, 2024;
originally announced February 2024.
-
PhenoLinker: Phenotype-Gene Link Prediction and Explanation using Heterogeneous Graph Neural Networks
Authors:
Jose L. Mellina Andreu,
Luis Bernal,
Antonio F. Skarmeta,
Mina Ryten,
Sara Álvarez,
Alejandro Cisterna García,
Juan A. Botía
Abstract:
The association of a given human phenotype to a genetic variant remains a critical challenge for biology. We present a novel system called PhenoLinker capable of associating a score to a phenotype-gene relationship by using heterogeneous information networks and a convolutional neural network-based model for graphs, which can provide an explanation for the predictions. This system can aid in the discovery of new associations and in the understanding of the consequences of human genetic variation.
Submitted 2 February, 2024;
originally announced February 2024.
-
AVELA -- A Vision for Engineering Literacy & Access: Understanding Why Technology Alone Is Not Enough
Authors:
Kyle Johnson,
Vicente Arroyos,
Celeste Garcia,
Liban Hussein,
Aisha Cora,
Tsewone Melaku,
Jay L. Cunningham,
R. Benjamin Shapiro,
Vikram Iyer
Abstract:
Unequal technology access for Black and Latine communities has been a persistent economic, social justice, and human rights issue despite increased technology accessibility due to advancements in consumer electronics like phones, tablets, and computers. We contextualize socio-technical access inequalities for Black and Latine urban communities and find that many students are hesitant to engage with available technologies due to a lack of engaging support systems. We present a holistic student-led STEM engagement model through AVELA - A Vision for Engineering Literacy and Access leveraging culturally responsive lessons, mentor embodied community representation, and service learning. To evaluate the model's impact after 4 years of mentoring 200+ university student instructors in teaching to 2,500+ secondary school students in 100+ classrooms, we conducted 24 semi-structured interviews with college AnonymizedOrganization members. We identify access barriers and provide principled recommendations for designing future STEM education programs.
Submitted 29 January, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Concept Drift Adaptation in Text Stream Mining Settings: A Systematic Review
Authors:
Cristiano Mesquita Garcia,
Ramon Simoes Abilio,
Alessandro Lameiras Koerich,
Alceu de Souza Britto Jr.,
Jean Paul Barddal
Abstract:
Society produces textual data online in several ways, e.g., via reviews and social media posts. Therefore, numerous researchers have been working on discovering patterns in textual data that can indicate people's opinions, interests, etc. Most tasks regarding natural language processing are addressed using traditional machine learning methods and static datasets. This setting can lead to several problems, e.g., outdated datasets and models, which degrade in performance over time. This is particularly true regarding concept drift, in which the data distribution changes over time. Furthermore, text streaming scenarios also exhibit further challenges, such as the high speed at which data arrives over time. Models for stream scenarios must adhere to the aforementioned constraints while learning from the stream, thus storing texts for limited periods and consuming low memory. This study presents a systematic literature review regarding concept drift adaptation in text stream scenarios. Considering well-defined criteria, we selected 48 papers published between 2018 and August 2024 to unravel aspects such as text drift categories, detection types, model update mechanisms, stream mining tasks addressed, and text representation methods and their update mechanisms. Furthermore, we discussed drift visualization and simulation and listed real-world datasets used in the selected papers. Finally, we brought forward a discussion on existing works in the area, also highlighting open challenges and future research directions for the community.
Submitted 25 November, 2024; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Long-Range Transformer Architectures for Document Understanding
Authors:
Thibault Douzon,
Stefan Duffner,
Christophe Garcia,
Jérémy Espinas
Abstract:
Since their release, Transformers have revolutionized many fields from Natural Language Understanding to Computer Vision. Document Understanding (DU) was not left behind, with the first Transformer-based models for DU dating from late 2019. However, the computational complexity of the self-attention operation limits their capabilities to small sequences. In this paper we explore multiple strategies to apply Transformer-based models to long multi-page documents. We introduce two new multi-modal (text + layout) long-range models for DU. They are based on efficient implementations of Transformers for long sequences. Long-range models can process whole documents at once effectively and are less impaired by the document's length. We compare them to LayoutLM, a classical Transformer adapted for DU and pre-trained on millions of documents. We further propose 2D relative attention bias to guide self-attention towards relevant tokens without harming model efficiency. We observe improvements on multi-page business documents on Information Retrieval at a small performance cost on smaller sequences. Relative 2D attention proved effective on dense text for both normal and long-range models.
Submitted 11 September, 2023;
originally announced September 2023.
-
Improving Information Extraction on Business Documents with Specific Pre-Training Tasks
Authors:
Thibault Douzon,
Stefan Duffner,
Christophe Garcia,
Jérémy Espinas
Abstract:
Transformer-based Language Models are widely used in Natural Language Processing related tasks. Thanks to their pre-training, they have been successfully adapted to Information Extraction in business documents. However, most pre-training tasks proposed in the literature for business documents are too generic and not sufficient to learn more complex structures. In this paper, we use LayoutLM, a language model pre-trained on a collection of business documents, and introduce two new pre-training tasks that further improve its capacity to extract relevant information. The first is aimed at better understanding the complex layout of documents, and the second focuses on numeric values and their order of magnitude. These tasks force the model to learn better-contextualized representations of the scanned documents. We further introduce a new post-processing algorithm to decode BIESO tags in Information Extraction that performs better with complex entities. Our method significantly improves extraction performance on both public (from 93.88 to 95.50 F1 score) and private (from 84.35 to 84.84 F1 score) datasets composed of expense receipts, invoices, and purchase orders.
Submitted 11 September, 2023;
originally announced September 2023.
-
Index-aware learning of circuits
Authors:
Idoia Cortes Garcia,
Peter Förster,
Lennart Jansen,
Wil Schilders,
Sebastian Schöps
Abstract:
Electrical circuits are present in a variety of technologies, making their design an important part of computer aided engineering. The growing number of parameters that affect the final design leads to a need for new approaches to quantify their impact. Machine learning may play a key role in this regard, however current approaches often make suboptimal use of existing knowledge about the system at hand. In terms of circuits, their description via modified nodal analysis is well-understood. This particular formulation leads to systems of differential-algebraic equations (DAEs) which bring with them a number of peculiarities, e.g. hidden constraints that the solution needs to fulfill. We use the recently introduced dissection index that can decouple a given system of DAEs into ordinary differential equations, only depending on differential variables, and purely algebraic equations, that describe the relations between differential and algebraic variables. The idea is to then only learn the differential variables and reconstruct the algebraic ones using the relations from the decoupling. This approach guarantees that the algebraic constraints are fulfilled up to the accuracy of the nonlinear system solver, and it may also reduce the learning effort as only the differential variables need to be learned.
Submitted 9 March, 2024; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Deep learning-based interactive segmentation in remote sensing
Authors:
Zhe Wang,
Shoukun Sun,
Xiang Que,
Xiaogang Ma,
Carmen Galaz Garcia
Abstract:
Interactive segmentation, a computer vision technique where a user provides guidance to help an algorithm segment a feature of interest in an image, has achieved outstanding accuracy and efficient human-computer interaction. However, few studies have discussed its application to remote sensing imagery, where click-based interactive segmentation could greatly facilitate the analysis of complicated landscapes. This study aims to bridge the gap between click-based interactive segmentation and remote sensing image analysis by conducting a benchmark study on various click-based interactive segmentation models. We assessed the performance of five state-of-the-art interactive segmentation methods (Reviving Iterative Training with Mask Guidance for Interactive Segmentation (RITM), FocalClick, SimpleClick, Iterative Click Loss (ICL), and Segment Anything (SAM)) on two high-resolution aerial imagery datasets. The Cascade-Forward Refinement (CFR) approach, an innovative inference strategy for interactive segmentation, was also introduced to enhance the segmentation results without requiring manual efforts. We further integrated CFR into all models for comparison. The performance of these methods on various land cover types, different object sizes, and multiple band combinations in the datasets was evaluated. The SimpleClick-CFR model consistently outperformed the other methods in our experiments. Building upon these findings, we developed a dedicated online tool called SegMap for interactive segmentation of remote sensing data. SegMap incorporates a well-performing interactive model that is fine-tuned with remote sensing data. Unlike existing interactive segmentation tools, SegMap offers robust interactivity, modifiability, and adaptability to analyze remote sensing imagery.
Submitted 12 May, 2025; v1 submitted 25 August, 2023;
originally announced August 2023.
-
Non-invasive Diabetes Detection using Gabor Filter: A Comparative Analysis of Different Cameras
Authors:
Christina A. Garcia,
Patricia Angela R. Abu,
Rosula SJ. Reyes
Abstract:
This paper compares the performance of mobile device cameras and a laptop camera as convenient tools for capturing images for non-invasive detection of Diabetes Mellitus (DM) using facial block texture features. Participants aged 20 to 79 years were chosen for the dataset. Mobile cameras of 12 MP and 7 MP and a laptop camera were used to take the photos under normal lighting conditions. Extracted facial blocks were classified using k-Nearest Neighbors (k-NN) and Support Vector Machine (SVM). A total of 100 images were captured, preprocessed, filtered using Gabor, and iterated. System performance was measured in terms of accuracy, specificity, and sensitivity. The best performance, 96.7% accuracy, 100% sensitivity, and 93% specificity, was achieved with the 12 MP back camera using SVM with 100 images.
Submitted 28 July, 2023;
originally announced July 2023.
-
Development of Authenticated Clients and Applications for ICICLE CI Services -- Final Report for the REHS Program, June-August, 2022
Authors:
Sahil Samar,
Mia Chen,
Jack Karpinski,
Michael Ray,
Archita Sarin,
Christian Garcia,
Matthew Lange,
Joe Stubbs,
Mary Thomas
Abstract:
The Artificial Intelligence (AI) institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE) is funded by the NSF to build the next generation of Cyberinfrastructure to render AI more accessible to everyone and drive its further democratization in the larger society. We describe our efforts to develop Jupyter Notebooks and Python command line clients that would access these ICICLE resources and services using ICICLE authentication mechanisms. To connect our clients, we used Tapis, which is a framework that supports computational research to enable scientists to access, utilize, and manage multi-institution resources and services. We used Neo4j to organize data into a knowledge graph (KG). We then hosted the KG on a Tapis Pod, which offers persistent data storage with a template made specifically for Neo4j KGs. In order to demonstrate the capabilities of our software, we developed several clients: Jupyter notebooks authentication, Neural Networks (NN) notebook, and command line applications that provide a convenient frontend to the Tapis API. In addition, we developed a data processing notebook that can manipulate KGs on the Tapis servers, including creations of a KG, data upload and modification. In this report we present the software architecture, design and approach, the successfulness of our client software, and future work.
Submitted 16 April, 2023;
originally announced April 2023.
-
Dermatologist-like explainable AI enhances trust and confidence in diagnosing melanoma
Authors:
Tirtha Chanda,
Katja Hauser,
Sarah Hobelsberger,
Tabea-Clara Bucher,
Carina Nogueira Garcia,
Christoph Wies,
Harald Kittler,
Philipp Tschandl,
Cristian Navarrete-Dechent,
Sebastian Podlipnik,
Emmanouil Chousakos,
Iva Crnaric,
Jovana Majstorovic,
Linda Alhajwan,
Tanya Foreman,
Sandra Peternel,
Sergei Sarap,
İrem Özdemir,
Raymond L. Barnhill,
Mar Llamas Velasco,
Gabriela Poch,
Sören Korsing,
Wiebke Sondermann,
Frank Friedrich Gellrich,
Markus V. Heppt
, et al. (10 additional authors not shown)
Abstract:
Although artificial intelligence (AI) systems have been shown to improve the accuracy of initial melanoma diagnosis, the lack of transparency in how these systems identify melanoma poses severe obstacles to user acceptance. Explainable artificial intelligence (XAI) methods can help to increase transparency, but most XAI methods are unable to produce precisely located domain-specific explanations, making the explanations difficult to interpret. Moreover, the impact of XAI methods on dermatologists has not yet been evaluated. Building on two existing classifiers, we developed an XAI system that produces text- and region-based explanations that are easily interpretable by dermatologists alongside its differential diagnoses of melanomas and nevi. To evaluate this system, we conducted a three-part reader study to assess its impact on clinicians' diagnostic accuracy, confidence, and trust in the XAI support. We showed that our XAI's explanations were highly aligned with clinicians' explanations and that both the clinicians' trust in the support system and their confidence in their diagnoses were significantly increased when using our XAI compared to using a conventional AI system. The clinicians' diagnostic accuracy was numerically, albeit not significantly, increased. This work demonstrates that clinicians are willing to adopt such an XAI system, motivating its future use in the clinic.
Submitted 17 March, 2023;
originally announced March 2023.
-
Multi-Environment based Meta-Learning with CSI Fingerprints for Radio Based Positioning
Authors:
Anastasios Foliadis,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher,
Reiner S. Thomä
Abstract:
Radio-based positioning of a user equipment (UE) with deep learning (DL) methods using channel state information (CSI) fingerprints has shown promising results. DL models are able to capture complex properties embedded in the CSI about a particular environment and map a UE's CSI to the UE's position. However, the CSI fingerprints and the DL models trained on such fingerprints are highly dependent on a particular propagation environment, which generally limits the transfer of knowledge of the DL models from one environment to another. In this paper, we propose a DL model consisting of two parts: the first part aims to learn environment-independent features, while the second part combines those features depending on the particular environment. To improve transfer learning, we propose a meta-learning scheme for training the first part over multiple environments. We show that for positioning in a new environment, initializing a DL model with the meta-learned environment-independent function achieves higher UE positioning accuracy compared to regular transfer learning from one environment to the new environment, or compared to training the DL model from scratch with only fingerprints from the new environment. Our proposed scheme is able to create an environment-independent function which can embed knowledge from multiple environments and more effectively learn from a new environment.
Submitted 26 October, 2022;
originally announced October 2022.
-
IQUAFLOW: A new framework to measure image quality
Authors:
P. Gallés,
K. Takats,
M. Hernández-Cabronero,
D. Berga,
L. Pega,
L. Riordan-Chen,
C. Garcia,
G. Becker,
A. Garriga,
A. Bukva,
J. Serra-Sagristà,
D. Vilaseca,
J. Marín
Abstract:
IQUAFLOW is a new image quality framework that provides a set of tools to assess image quality. The user can add custom metrics that can be easily integrated. Furthermore, iquaflow allows quality to be measured by using the performance of AI models trained on the images as a proxy. This also makes it easy to study the performance degradation caused by several modifications of the original dataset, for instance, with images reconstructed after different levels of lossy compression; satellite images are an example use case, since they are commonly compressed before downloading to the ground. In this situation, the optimization problem consists of finding the smallest images that still provide sufficient quality to meet the required performance of the deep learning algorithms. Thus, a study with iquaflow is suitable for such a case. All this development is wrapped in MLflow, an interactive tool used to visualize and summarize the results. This document describes different use cases and provides links to their respective repositories. To ease the creation of new studies, we include a cookie-cutter repository. The source code, issue tracker and aforementioned repositories are all hosted on GitHub at https://github.com/satellogic/iquaflow.
Submitted 24 October, 2022;
originally announced October 2022.
-
Aggregating Crowdsourced and Automatic Judgments to Scale Up a Corpus of Anaphoric Reference for Fiction and Wikipedia Texts
Authors:
Juntao Yu,
Silviu Paun,
Maris Camilleri,
Paloma Carretero Garcia,
Jon Chamberlain,
Udo Kruschwitz,
Massimo Poesio
Abstract:
Although several datasets annotated for anaphoric reference/coreference exist, even the largest such datasets have limitations in terms of size, range of domains, coverage of anaphoric phenomena, and size of documents included. Yet, the approaches proposed to scale up anaphoric annotation haven't so far resulted in datasets overcoming these limitations. In this paper, we introduce a new release of a corpus for anaphoric reference labelled via a game-with-a-purpose. This new release is comparable in size to the largest existing corpora for anaphoric reference due in part to substantial activity by the players, in part thanks to the use of a new resolve-and-aggregate paradigm to 'complete' markable annotations through the combination of an anaphoric resolver and an aggregation method for anaphoric reference. The proposed method could be adopted to greatly speed up annotation time in other projects involving games-with-a-purpose. In addition, the corpus covers genres for which no comparable size datasets exist (Fiction and Wikipedia); it covers singletons and non-referring expressions; and it includes a substantial number of long documents (> 2K in length).
Submitted 11 October, 2022;
originally announced October 2022.
-
A deep learning model for brain vessel segmentation in 3DRA with arteriovenous malformations
Authors:
Camila García,
Yibin Fang,
Jianmin Liu,
Ana Paula Narata,
José Ignacio Orlando,
Ignacio Larrabide
Abstract:
Segmentation of brain arterio-venous malformations (bAVMs) in 3D rotational angiographies (3DRA) is still an open problem in the literature, with high relevance for clinical practice. While deep learning models have been applied for segmenting the brain vasculature in these images, they have never been used in cases with bAVMs. This is likely caused by the difficulty of obtaining sufficient annotated data to train these approaches. In this paper we introduce a first deep learning model for blood vessel segmentation in 3DRA images of patients with bAVMs. To this end, we densely annotated five 3DRA volumes of bAVM cases and used these to train two alternative 3DUNet-based architectures with different segmentation objectives. Our results show that the networks reach a comprehensive coverage of relevant structures for bAVM analysis, much better than what is obtained using standard methods. This is promising for achieving a better topological and morphological characterisation of the bAVM structures of interest. Furthermore, the models have the ability to segment venous structures even when missing in the ground truth labelling, which is relevant for planning interventional treatments. Ultimately, these results could be used as more reliable initial guesses, alleviating the cumbersome task of creating manual labels.
Submitted 5 October, 2022;
originally announced October 2022.
-
Low-complexity Approximate Convolutional Neural Networks
Authors:
R. J. Cintra,
S. Duffner,
C. Garcia,
A. Leite
Abstract:
In this paper, we present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNet). The idea is to approximate all elements of a given ConvNet and replace the original convolutional filters and parameters (pooling and bias coefficients; and activation function) with efficient approximations capable of extreme reductions in computational complexity. Low-complexity convolution filters are obtained through a binary (zero-one) linear programming scheme based on the Frobenius norm over sets of dyadic rationals. The resulting matrices allow for multiplication-free computations requiring only addition and bit-shifting operations. Such low-complexity structures pave the way for low-power, efficient hardware designs. We applied our approach to three use cases of different complexity: (i) a "light" but efficient ConvNet for face detection (with around 1000 parameters); (ii) another one for hand-written digit classification (with more than 180000 parameters); and (iii) a significantly larger ConvNet: AlexNet with $\approx$1.2 million matrices. We evaluated the overall performance on the respective tasks for different levels of approximation. In all considered applications, very low-complexity approximations have been derived maintaining an almost equal classification performance.
Submitted 29 July, 2022;
originally announced August 2022.
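To illustrate the multiplication-free arithmetic mentioned in the abstract above, here is a minimal sketch of multiplying by a dyadic rational (a number of the form n / 2^k) using only shifts and additions. It is not the authors' method: plain rounding stands in for their binary linear programming scheme, and the helper names `to_dyadic` and `shift_add_multiply` are hypothetical.

```python
def to_dyadic(x: float, bits: int) -> int:
    """Return an integer n such that n / 2**bits approximates x.

    Assumption: simple rounding (Python's round, ties to even) replaces
    the optimisation-based selection described in the abstract.
    """
    return round(x * (1 << bits))


def shift_add_multiply(value: int, numerator: int, bits: int) -> int:
    """Compute floor(value * numerator / 2**bits) with shifts and adds only."""
    magnitude = abs(numerator)
    acc = 0
    shift = 0
    while magnitude:
        if magnitude & 1:           # this power of two is set in the coefficient
            acc += value << shift   # add value * 2**shift
        magnitude >>= 1
        shift += 1
    if numerator < 0:
        acc = -acc
    return acc >> bits              # right shift divides by 2**bits (floor)


n = to_dyadic(0.3, bits=3)          # 0.3 is approximated by n / 8
approx = shift_add_multiply(100, n, bits=3)
```

With three fractional bits, 0.3 is approximated by 2/8, so the product with 100 comes out as 25 rather than 30; using more bits narrows that gap at the cost of more additions.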
-
Run-of-Mine Stockyard Recovery Scheduling and Optimisation for Multiple Reclaimers
Authors:
Hirad Assimi,
Ben Koch,
Chris Garcia,
Markus Wagner,
Frank Neumann
Abstract:
Stockpiles are essential in the mining value chain, assisting in maximising value and production. Quality control of minerals taken from the stockpiles is a major concern for stockpile managers, where failure to meet some requirements can lead to financial losses. This problem was recently investigated using a single reclaimer and basic assumptions. This study extends the approach to consider multiple reclaimers in preparing for short- and long-term deliveries. The engagement of multiple reclaimers complicates the problem in terms of their interaction in preparing a delivery simultaneously and the safety distancing of reclaimers. We also consider more realistic settings, such as handling different minerals with different types of reclaimers. We propose methods that construct a solution step by step to meet precedence constraints for all reclaimers in the stockyard. We study various instances of the problem using greedy algorithms and Ant Colony Optimisation (ACO), and propose an integrated local search method for determining an efficient schedule. We fine-tune and compare the algorithms and show that ACO combined with local search can yield efficient solutions.
Submitted 22 December, 2021;
originally announced December 2021.
-
Reliable Deep Learning based Localization with CSI Fingerprints and Multiple Base Stations
Authors:
Anastasios Foliadis,
Mario H. Castañeda Garcia,
Richard A. Stirling-Gallacher,
Reiner S. Thomä
Abstract:
Deep learning (DL) methods have been recently proposed for user equipment (UE) localization in wireless communication networks, based on the channel state information (CSI) between a UE and each base station (BS) in the uplink. With the CSI from the available BSs, UE localization can be performed in different ways. On the one hand, a single neural network (NN) can be trained for the UE localization by considering the CSI from all the available BSs as one overall fingerprint of the user's location. On the other hand, the CSI at each BS can be used to obtain an estimate of the UE's position with a separate NN at each BS, and then the position estimates of all BSs are combined to obtain an overall estimate of the UE position. In this work, we show that UE localization with the latter approach can achieve a higher positioning accuracy. We propose to consider the uncertainty in the UE localization at each BS, such that the overall UE position is determined by combining the position estimates of the different BSs based on the uncertainty at each BS. With this approach, a more reliable position estimate can be obtained in case of variations in the channel.
Submitted 23 November, 2021;
originally announced November 2021.
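The abstract above does not say how the per-BS estimates are combined. As one plausible illustration only, the sketch below uses inverse-variance weighting, a standard way to fuse independent estimates so that more uncertain ones count less. The function name `fuse_estimates` and the use of a single scalar variance per BS are assumptions, not details from the paper.

```python
def fuse_estimates(estimates, variances):
    """Fuse 2-D position estimates by inverse-variance weighting.

    estimates: list of (x, y) tuples, one per base station.
    variances: list of positive scalars, the assumed uncertainty of each
               base station's estimate (hypothetical representation).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x = sum(w * e[0] for w, e in zip(weights, estimates)) / total
    y = sum(w * e[1] for w, e in zip(weights, estimates)) / total
    return (x, y)


# Two base stations; the second one is three times as uncertain,
# so the fused position is pulled towards the first estimate.
fused = fuse_estimates([(0.0, 0.0), (4.0, 4.0)], [1.0, 3.0])
```

With equal variances this reduces to a plain average; a BS whose channel has changed, and which therefore reports a large variance, contributes little to the fused position.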
-
Equinox: neural networks in JAX via callable PyTrees and filtered transformations
Authors:
Patrick Kidger,
Cristian Garcia
Abstract:
JAX and PyTorch are two popular Python autodifferentiation frameworks. JAX is based around pure functions and functional programming. PyTorch has popularised the use of an object-oriented (OO) class-based syntax for defining parameterised functions, such as neural networks. That this seems like a fundamental difference means current libraries for building parameterised functions in JAX have either rejected the OO approach entirely (Stax) or have introduced OO-to-functional transformations, multiple new abstractions, and been limited in the extent to which they integrate with JAX (Flax, Haiku, Objax). Either way this OO/functional difference has been a source of tension. Here, we introduce 'Equinox', a small neural network library showing how a PyTorch-like class-based approach may be admitted without sacrificing JAX-like functional programming. We provide two main ideas. One: parameterised functions are themselves represented as 'PyTrees', which means that the parameterisation of a function is transparent to the JAX framework. Two: we filter a PyTree to isolate just those components that should be treated when transforming ('jit', 'grad' or 'vmap'-ing) a higher-order function of a parameterised function -- such as a loss function applied to a model. Overall Equinox resolves the above tension without introducing any new programmatic abstractions: only PyTrees and transformations, just as with regular JAX. Equinox is available at https://github.com/patrick-kidger/equinox.
Submitted 30 October, 2021;
originally announced November 2021.
-
Signaling Design for Cooperative Resource Allocation and its Impact to Reliability
Authors:
Rasmus Liborius Bruun,
C. Santiago Morejón García,
Troels B. Sørensen,
Nuno K. Pratas,
Tatiana Kozlova Madsen,
Preben Mogensen
Abstract:
Decentralized cooperative resource allocation schemes for robotic swarms are essential to enable high reliability in high throughput data exchanges. These cooperative schemes require control signaling with the aim to avoid half-duplex problems at the receiver and mitigate interference. We propose two cooperative resource allocation schemes, device sequential and group scheduling, and introduce a control signaling design. We observe that failure in the reception of these control signals leads to non-cooperative behavior and to significant performance degradation. The causes of these failures are identified and specific countermeasures are proposed and evaluated. We compare the proposed resource allocation schemes against the NR sidelink mode 2 resource allocation and show that even though signaling has an important impact on the resource allocation performance, our proposed device sequential and group scheduling resource allocation schemes improve reliability by an order of magnitude compared to sidelink mode 2.
Submitted 15 September, 2022; v1 submitted 15 September, 2021;
originally announced September 2021.
-
Testing a Battery Management System via Criticality-based Rare Event Simulation
Authors:
Daniel Grujic,
Tabea Henning,
Emilio José Calleja García,
Andre Bergmann
Abstract:
For the validation of safety-critical systems regarding safety and comfort, e.g., in the context of automated driving, engineers often have to cope with large (parametric) test spaces for which it is infeasible to test through all possible parameter configurations. At the same time, critical behavior of a well-engineered system with respect to prescribed safety and comfort requirements tends to be extremely rare, speaking of probabilities of order $10^{-6}$ or less, but clearly has to be examined carefully for valid argumentation. Hence, common approaches such as boundary value analysis are insufficient while methods based on random sampling from the parameter space (simple Monte Carlo) lack the ability to detect these rare critical events efficiently, i.e., with appropriate simulation budget. For this reason, a more sophisticated simulation-based approach is proposed which employs optimistic optimization on an objective function called "criticality" in order to identify effectively the set of critical parameter configurations. Within the scope of the ITEA 3 TESTOMAT project (http://www.testomatproject.eu/) the collaboration partners OFFIS e.V. and AKKA Germany GmbH conducted a case study on applying criticality-based rare event simulation to the charging process of an automotive battery management system given as a model. The present technical report documents the industrial use case, the approach, application and experimental results, as well as lessons learned from the case study.
Submitted 1 July, 2021;
originally announced July 2021.
-
Sugestões de Rotas Personalizadas para Carrinheiros na Coleta Seletiva de Materiais Recicláveis
Authors:
Maria Vitória R. Oliveira,
Islene C. Garcia
Abstract:
Carrinheiros are collectors of recyclable materials who use human-powered vehicles. Carrinheiros' collection routes can be tiring depending on the paths chosen. Therefore, this work proposes an algorithm for suggesting customizable routes based on three edge costing policies: Less Work Policy, Less Impedance Policy, and Short Distance Policy. This work used the tools osmnx and networkx to construct graphs, geographic data from OpenStreetMap, and elevations from Topodata. The simulations performed in Simulation of Urban MObility (SUMO) demonstrated that the proposed algorithm could minimize the power applied to push the vehicle, the distance, and the travel time, according to the policy used.
Submitted 22 May, 2021;
originally announced May 2021.
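The idea of swapping edge costing policies in the abstract above can be sketched with a shortest-path search that takes the cost function as a parameter. This is an illustration only: it uses a plain Dijkstra search from the standard library rather than osmnx or networkx, and the two cost functions (`distance_cost`, `uphill_penalty_cost`) and the edge attributes `length` and `climb` are hypothetical stand-ins, not the paper's Less Work, Less Impedance, or Short Distance formulas.

```python
import heapq


def shortest_route(graph, source, target, cost):
    """Dijkstra over graph[u] = {v: edge_attrs}; cost(edge_attrs) must be >= 0."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > best.get(u, float("inf")):
            continue                          # stale heap entry
        for v, attrs in graph.get(u, {}).items():
            nd = d + cost(attrs)
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in best:
        return None                           # target unreachable
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def distance_cost(attrs):
    """Hypothetical policy: edge cost is its length in metres."""
    return attrs["length"]


def uphill_penalty_cost(attrs, penalty=10.0):
    """Hypothetical policy: length plus a penalty per metre climbed."""
    return attrs["length"] + penalty * max(attrs["climb"], 0.0)


# Toy street graph: the short route A-B-D climbs 20 m, the long route A-C-D is flat.
graph = {
    "A": {"B": {"length": 100.0, "climb": 20.0}, "C": {"length": 150.0, "climb": 0.0}},
    "B": {"D": {"length": 100.0, "climb": 0.0}},
    "C": {"D": {"length": 150.0, "climb": 0.0}},
}
by_distance = shortest_route(graph, "A", "D", distance_cost)
by_effort = shortest_route(graph, "A", "D", uphill_penalty_cost)
```

The same graph yields different suggested routes depending on the policy passed in, which is the behaviour the abstract describes at a high level.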
-
Integração e Entrega Contínua para aplicações móveis desenvolvidas em React Native
Authors:
Pedro José de Souza Neto,
Vinicius Cardoso Garcia
Abstract:
Continuous integration and continuous delivery are not new for developers who create web applications. In the development of mobile applications, however, this practice is still not very common, mainly because of the challenges in the process of distributing the application. In the face of the growing number of applications, a greater requirement for quality, and ever-shorter delivery times, delivering healthy code frequently is extremely important to keep up with the competition. The purpose of this work is to implement a continuous integration and delivery pipeline for mobile applications developed in React Native. It intends to automate the process of building and delivering applications developed with this technology.
Submitted 30 March, 2021;
originally announced March 2021.
-
Bluejay: A Cross-Tooling Audit Framework For Agile Software Teams
Authors:
Cesar Garcia,
Alejandro Guerrero,
Joshua Zeitsoff,
Srujay Korlakunta,
Pablo Fernandez,
Armando Fox,
Antonio Ruiz-Cortes
Abstract:
Agile software teams are expected to follow a number of specific Team Practices (TPs) during each iteration, such as estimating the effort ("points") required to complete user stories and coordinating the management of the codebase with the delivery of features. For software engineering instructors trying to teach such TPs to student teams, manually auditing whether teams are following the TPs and improving over time is tedious, time-consuming and error-prone. It is even more difficult when those TPs involve two or more tools. For example, starting work on a feature in a project-management tool such as Pivotal Tracker should usually be followed relatively quickly by the creation of a feature branch on GitHub. Merging a feature branch on GitHub should usually be followed relatively quickly by deploying the new feature to a staging server for customer feedback. Few systems are designed specifically to audit such TPs, and existing ones, as far as we know, are limited to a single specific tool.
We present Bluejay, an open-source extensible platform that uses the APIs of multiple tools to collect raw data, synthesize it into TP measurements, and present dashboards to audit the TPs. A key insight in Bluejay's design is that TPs can be expressed in terminology similar to that used for modeling and auditing Service Level Agreement (SLA) compliance. Bluejay therefore builds on mature tools used in that ecosystem and adapts them for describing, auditing, and reporting on TPs. Bluejay currently consumes data from five different widely-used development tools, and can be customized by connecting it to any service with a REST API. Video showcase available at governify.io/showcase/bluejay
Submitted 11 March, 2021;
originally announced March 2021.
-
Semantic Segmentation with Labeling Uncertainty and Class Imbalance
Authors:
Patrik Olã Bressan,
José Marcato Junior,
José Augusto Correa Martins,
Diogo Nunes Gonçalves,
Daniel Matte Freitas,
Lucas Prado Osco,
Jonathan de Andrade Silva,
Zhipeng Luo,
Jonathan Li,
Raymundo Cordero Garcia,
Wesley Nunes Gonçalves
Abstract:
Recently, methods based on Convolutional Neural Networks (CNN) have achieved impressive success in semantic segmentation tasks. However, challenges such as class imbalance and uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness.
Submitted 8 February, 2021;
originally announced February 2021.
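The role of pixel-wise weights during training, as described in the abstract above, can be illustrated with a weighted cross-entropy loss. This sketch only shows how given weights enter the loss; how the authors derive each weight from class and labeling uncertainty is not stated in the abstract, so the weights here are plain inputs and the function name `weighted_pixel_loss` is hypothetical.

```python
import math


def weighted_pixel_loss(probs, labels, weights):
    """Weighted mean of per-pixel negative log-likelihood.

    probs:   per-pixel lists of predicted class probabilities.
    labels:  per-pixel integer class labels.
    weights: per-pixel non-negative weights (assumed to be given; at
             least one must be positive).
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        total += -w * math.log(p[y])
    return total / sum(weights)


# Two pixels, two classes. The second pixel is poorly predicted, but a
# weight of 0 (for example, a label judged unreliable) removes it from the loss.
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 0]
loss_all = weighted_pixel_loss(probs, labels, [1.0, 1.0])
loss_masked = weighted_pixel_loss(probs, labels, [1.0, 0.0])
```

Raising a pixel's weight makes its error count more in the average, and lowering it towards zero makes the loss ignore that pixel, which is the increase-or-decrease effect the abstract refers to.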
-
A Tutorial on 5G NR V2X Communications
Authors:
Mario H. Castañeda Garcia,
Alejandro Molina-Galan,
Mate Boban,
Javier Gozalvez,
Baldomero Coll-Perales,
Taylan Şahin,
Apostolos Kousaridas
Abstract:
The Third Generation Partnership Project (3GPP) has recently published its Release 16, which includes the first Vehicle-to-Everything (V2X) standard based on the 5G New Radio (NR) air interface. 5G NR V2X introduces advanced functionalities on top of the 5G NR air interface to support connected and automated driving use cases with stringent requirements. This paper presents an in-depth tutorial of the 3GPP Release 16 5G NR V2X standard for V2X communications, with a particular focus on the sidelink, since it is the most significant part of 5G NR V2X. The main part of the paper is an in-depth treatment of the key aspects of 5G NR V2X: the physical layer, the resource allocation, the quality of service management, the enhancements introduced to the Uu interface and the mobility management for V2N (Vehicle to Network) communications, as well as the co-existence mechanisms between 5G NR V2X and LTE V2X. We also review the use cases and the system architecture, and describe the evaluation methodology and simulation assumptions for 5G NR V2X. Finally, we provide an outlook on possible 5G NR V2X enhancements, including those identified within Release 17.
Submitted 8 February, 2021;
originally announced February 2021.
-
Reinforcement Learning with Probabilistic Boolean Network Models of Smart Grid Devices
Authors:
Pedro J. Rivera Torres,
Carlos Gershenson García,
Samir Kanaan Izquierdo
Abstract:
The area of Smart Power Grids needs to constantly improve its efficiency and resilience, to provide high quality electrical power, in a resistant grid, managing faults and avoiding failures. Achieving this requires high component reliability, adequate maintenance, and a studied failure occurrence. Correct system operation involves those activities, and novel methodologies to detect, classify, and isolate faults and failures, model and simulate processes with predictive algorithms and analytics (using data analysis and asset condition to plan and perform activities). We showcase the application of a complex-adaptive, self-organizing modeling method, Probabilistic Boolean Networks (PBN), as a way towards the understanding of the dynamics of smart grid devices, and to model and characterize their behavior. This work demonstrates that PBNs are equivalent to the standard Reinforcement Learning Cycle, in which the agent/model has an interaction with its environment and receives feedback from it in the form of a reward signal. Different reward structures were created in order to characterize preferred behavior. This information can be used to guide the PBN to avoid fault conditions and failures.
Submitted 1 February, 2021;
originally announced February 2021.