Search | arXiv e-print repository

Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search

Authors: Dongge Han, Menglin Xia, Daniel Madrigal Diaz, Samuel Kessler, Ankur Mallick, Xuchao Zhang, Mirian Del Carmen Hipolito Garcia, Jin Xu, Victor Rühle, Saravan Rajmohan

Abstract: Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs). However, SLMs' limited capacity restricts their reasoning capabilities and makes them sensitive to prompt variations. To address these challenges, we propose a novel framework that enhances SLM reasoning capabilities through LLM-generated blueprints. The blueprints provide structured, high-level reasoning guides that help SLMs systematically tackle related problems. Furthermore, our framework integrates a prompt template search mechanism to mitigate the SLMs' sensitivity to prompt variations. Our framework demonstrates improved SLM performance across various tasks, including math (GSM8K), coding (MBPP), and logic reasoning (BBH). Our approach improves the reasoning capabilities of SLMs without increasing model size or requiring additional training, offering a lightweight and deployment-friendly solution for on-device or resource-constrained environments.

Submitted 10 June, 2025; originally announced June 2025.

Comments: TTODLer-FM Workshop@ICML'25 (Tiny Titans: The next wave of On-Device Learning for Foundational Models)

arXiv:2505.23599 [pdf, ps, other]

On Transferring Transferability: Towards a Theory for Size Generalization

Authors: Eitan Levin, Yuxin Ma, Mateo Díaz, Soledad Villar

Abstract: Many modern learning tasks require models that can take inputs of varying sizes. Consequently, dimension-independent architectures have been proposed for domains where the inputs are graphs, sets, and point clouds. Recent work on graph neural networks has explored whether a model trained on low-dimensional data can transfer its performance to higher-dimensional inputs. We extend this body of work by introducing a general framework for transferability across dimensions. We show that transferability corresponds precisely to continuity in a limit space formed by identifying small problem instances with equivalent large ones. This identification is driven by the data and the learning task. We instantiate our framework on existing architectures, and implement the necessary changes to ensure their transferability. Finally, we provide design principles for designing new transferable models. Numerical experiments support our findings.

Submitted 29 May, 2025; originally announced May 2025.

Comments: 69 pages, 8 figures

arXiv:2505.09004 [pdf, ps, other]

Lower Bounds on the MMSE of Adversarially Inferring Sensitive Features

Authors: Monica Welfert, Nathan Stromberg, Mario Diaz, Lalitha Sankar

Abstract: We propose an adversarial evaluation framework for sensitive feature inference based on minimum mean-squared error (MMSE) estimation with a finite sample size and linear predictive models. Our approach establishes theoretical lower bounds on the true MMSE of inferring sensitive features from noisy observations of other correlated features. These bounds are expressed in terms of the empirical MMSE under a restricted hypothesis class and a non-negative error term. The error term captures both the estimation error due to a finite number of samples and the approximation error from using a restricted hypothesis class. For linear predictive models, we derive closed-form bounds, which are order optimal in terms of the noise variance, on the approximation error for several classes of relationships between the sensitive and non-sensitive features, including linear mappings, binary symmetric channels, and class-conditional multi-variate Gaussian distributions. We also present a new lower bound that relies on the MSE computed on a hold-out validation dataset of the MMSE estimator learned on finite samples and a restricted hypothesis class. Through empirical evaluation, we demonstrate that our framework serves as an effective tool for MMSE-based adversarial evaluation of sensitive feature inference that balances theoretical guarantees with practical efficiency.

Submitted 13 May, 2025; originally announced May 2025.

Comments: submitted to IEEE Transactions on Information Theory

arXiv:2505.05660 [pdf, ps, other]

doi 10.1145/3715275.3732045

Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs

Authors: Jeffrey Basoah, Daniel Chechelnitsky, Tao Long, Katharina Reinecke, Chrysoula Zerva, Kaitlyn Zhou, Mark Díaz, Maarten Sap

Abstract: As large language models (LLMs) increasingly adapt and personalize to diverse sets of users, there is an increased risk of systems appropriating sociolects, i.e., language styles or dialects that are associated with specific minoritized lived experiences (e.g., African American English, Queer slang). In this work, we examine whether sociolect usage by an LLM agent affects user reliance on its outputs and user perception (satisfaction, frustration, trust, and social presence). We designed and conducted user studies where 498 African American English (AAE) speakers and 487 Queer slang speakers performed a set of question-answering tasks with LLM-based suggestions in either standard American English (SAE) or their self-identified sociolect. Our findings showed that sociolect usage by LLMs influenced both reliance and perceptions, though in some surprising ways. Results suggest that both AAE and Queer slang speakers relied more on the SAE agent, and had more positive perceptions of the SAE agent. Yet, only Queer slang speakers felt more social presence from the Queer slang agent over the SAE one, whereas only AAE speakers preferred and trusted the SAE agent over the AAE one. These findings emphasize the need to test for behavioral outcomes rather than simply assume that personalization would lead to a better and safer reliance outcome. They also highlight the nuanced dynamics of minoritized language in machine interactions, underscoring the need for LLMs to be carefully designed to respect cultural and linguistic boundaries while fostering genuine user engagement and trust.

Submitted 9 June, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

Comments: accepted to FAccT 2025

arXiv:2505.05197 [pdf, other]

Societal and technological progress as sewing an ever-growing, ever-changing, patchy, and polychrome quilt

Authors: Joel Z. Leibo, Alexander Sasha Vezhnevets, William A. Cunningham, Sébastien Krier, Manfred Diaz, Simon Osindero

Abstract: Artificial Intelligence (AI) systems are increasingly placed in positions where their decisions have real consequences, e.g., moderating online spaces, conducting research, and advising on policy. Ensuring they operate in a safe and ethically acceptable fashion is thus critical. However, most solutions have been a form of one-size-fits-all "alignment". We are worried that such systems, which overlook enduring moral diversity, will spark resistance, erode trust, and destabilize our institutions. This paper traces the underlying problem to an often-unstated Axiom of Rational Convergence: the idea that under ideal conditions, rational agents will converge in the limit of conversation on a single ethics. Treating that premise as both optional and doubtful, we propose what we call the appropriateness framework: an alternative approach grounded in conflict theory, cultural evolution, multi-agent systems, and institutional economics. The appropriateness framework treats persistent disagreement as the normal case and designs for it by applying four principles: (1) contextual grounding, (2) community customization, (3) continual adaptation, and (4) polycentric governance. We argue here that adopting these design principles is a good way to shift the main alignment metaphor from moral unification to a more productive metaphor of conflict management, and that taking this step is both desirable and urgent.

Submitted 8 May, 2025; originally announced May 2025.

Comments: 16 pages

arXiv:2504.16871 [pdf, other]

Exploring How LLMs Capture and Represent Domain-Specific Knowledge

Authors: Mirian Hipolito Garcia, Camille Couturier, Daniel Madrigal Diaz, Ankur Mallick, Anastasios Kyrillidis, Robert Sim, Victor Ruhle, Saravan Rajmohan

Abstract: We study whether Large Language Models (LLMs) inherently capture domain-specific nuances in natural language. Our experiments probe the domain sensitivity of LLMs by examining their ability to distinguish queries from different domains using hidden states generated during the prefill phase. We reveal latent domain-related trajectories that indicate the model's internal recognition of query domains. We also study the robustness of these domain representations to variations in prompt styles and sources. Our approach leverages these representations for model selection, mapping the LLM that best matches the domain trace of the input query (i.e., the model with the highest performance on similar traces). Our findings show that LLMs can differentiate queries for related domains, and that the fine-tuned model is not always the most accurate. Unlike previous work, our interpretations apply to both closed and open-ended generative tasks.

Submitted 24 April, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.11146 [pdf, other]

Exploring Student Behaviors and Motivations using AI TAs with Optional Guardrails

Authors: Amanpreet Kapoor, Marc Diaz, Stephen MacNeil, Leo Porter, Paul Denny

Abstract: AI-powered chatbots and digital teaching assistants (AI TAs) are gaining popularity in programming education, offering students timely and personalized feedback. Despite their potential benefits, concerns about student over-reliance and academic misconduct have prompted the introduction of "guardrails" into AI TAs - features that provide scaffolded support rather than direct solutions. However, overly restrictive guardrails may lead students to bypass these tools and use unconstrained AI models, where interactions are not observable, thus limiting our understanding of students' help-seeking behaviors. To investigate this, we designed and deployed a novel AI TA tool with optional guardrails in one lab of a large introductory programming course. As students completed three code writing and debugging tasks, they had the option to receive guardrailed help or use a "See Solution" feature which disabled the guardrails and generated a verbatim response from the underlying model. We investigate students' motivations and use of this feature and examine the association between usage and their course performance. We found that 50% of the 885 students used the "See Solution" feature for at least one problem and 14% used it for all three problems. Additionally, low-performing students were more likely to use this feature and use it close to the deadline as they started assignments later. The predominant factors that motivated students to disable the guardrails were assistance in solving problems, time pressure, lack of self-regulation, and curiosity. Our work provides insights into students' solution-seeking motivations and behaviors, which has implications for the design of AI TAs that balance pedagogical goals with student preferences.

Submitted 15 April, 2025; originally announced April 2025.

arXiv:2503.19075 [pdf, ps, other]

The Case for "Thick Evaluations" of Cultural Representation in AI

Authors: Rida Qadri, Mark Diaz, Ding Wang, Michael Madaio

Abstract: Generative AI image models have been increasingly evaluated for their (in)ability to represent non-Western cultures. We argue that these evaluations operate through reductive ideals of representation, abstracted from how people define their own representation and neglecting the inherently interpretive and contextual nature of cultural representation. In contrast to these 'thin' evaluations, we introduce the idea of 'thick evaluations': a more granular, situated, and discursive measurement framework for evaluating representations of social worlds in AI images, steeped in communities' own understandings of representation. We develop this evaluation framework through workshops in South Asia, by studying the 'thick' ways in which people interpret and assign meaning to images of their own cultures. We introduce practices for thicker evaluations of representation that expand the understanding of representation underpinning AI evaluations and that, by co-constructing metrics with communities, bring measurement in line with the experiences of communities on the ground.

Submitted 24 March, 2025; originally announced March 2025.

Comments: 14 pages

arXiv:2503.13573 [pdf]

doi 10.1016/j.patcog.2025.111581

Online Signature Verification based on the Lagrange formulation with 2D and 3D robotic models

Authors: Moises Diaz, Miguel A. Ferrer, Juan M. Gil, Rafael Rodriguez, Peirong Zhang, Lianwen Jin

Abstract: Online Signature Verification commonly relies on function-based features, such as time-sampled horizontal and vertical coordinates, as well as the pressure exerted by the writer, obtained through a digitizer. Although inferring additional information about the writer's arm pose, kinematics, and dynamics based on digitizer data can be useful, it constitutes a challenge. In this paper, we tackle this challenge by proposing a new set of features based on the dynamics of online signatures. These new features are inferred through a Lagrangian formulation, obtaining the sequences of generalized coordinates and torques for 2D and 3D robotic arm models. By combining kinematic and dynamic robotic features, our results demonstrate their significant effectiveness for online automatic signature verification and achieve state-of-the-art results when integrated into deep learning models.

Submitted 17 March, 2025; originally announced March 2025.

Journal ref: Science direct, March 17 2025

arXiv:2503.05609 [pdf, ps, other]

Decoding Safety Feedback from Diverse Raters: A Data-driven Lens on Responsiveness to Severity

Authors: Pushkar Mishra, Charvi Rastogi, Stephen R. Pfohl, Alicia Parrish, Tian Huey Teh, Roma Patel, Mark Diaz, Ding Wang, Michela Paganini, Vinodkumar Prabhakaran, Lora Aroyo, Verena Rieser

Abstract: Ensuring the safety of Generative AI requires a nuanced understanding of pluralistic viewpoints. In this paper, we introduce a novel data-driven approach for interpreting granular ratings in pluralistic datasets. Specifically, we address the challenge of analyzing nuanced differences in safety feedback from a diverse population expressed via ordinal scales (e.g., a Likert scale). We distill non-parametric responsiveness metrics that quantify the consistency of raters in scoring varying levels of the severity of safety violations. Leveraging a publicly available pluralistic dataset of safety feedback on AI-generated content as our case study, we investigate how raters from different demographic groups (age, gender, ethnicity) use an ordinal scale to express their perceptions of the severity of violations. We apply our metrics across violation types, demonstrating their utility in extracting nuanced insights that are crucial for aligning AI systems reliably in multi-cultural contexts. We show that our approach can inform rater selection and feedback interpretation by capturing nuanced viewpoints across different demographic groups, hence improving the quality of pluralistic data collection and in turn contributing to more robust AI development.

Submitted 7 July, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

arXiv:2501.09048 [pdf]

doi 10.1109/TPAMI.2018.2869163

Anthropomorphic Features for On-Line Signatures

Authors: Moises Diaz, Miguel A. Ferrer, Jose J. Quintana

Abstract: Many features have been proposed in on-line signature verification. Generally, these features rely on the position of the on-line signature samples and their dynamic properties, as recorded by a tablet. This paper proposes a novel feature space to efficiently describe on-line signatures. Since producing a signature requires a skeletal arm system and its associated muscles, the new feature space is based on characterizing the movement of the shoulder, the elbow and the wrist joints when signing. As this motion is not directly obtained from a digital tablet, the new features are calculated by means of a virtual skeletal arm (VSA) model, which simulates the architecture of a real arm and forearm. Specifically, the VSA motion is described by its 3D joint position and its joint angles. These anthropomorphic features are worked out from both pen position and orientation through the VSA forward and direct kinematic model. The anthropomorphic features' robustness is proved by achieving state-of-the-art performance with several verifiers and multiple benchmarks on third party signature databases, which were collected with different devices and in different languages and scripts.

Submitted 15 January, 2025; originally announced January 2025.

Journal ref: IEEE,Volume 41, Issue 12 (2019),Page(s):2807 - 2819

arXiv:2412.19010 [pdf, other]

A theory of appropriateness with applications to generative artificial intelligence

Authors: Joel Z. Leibo, Alexander Sasha Vezhnevets, Manfred Diaz, John P. Agapiou, William A. Cunningham, Peter Sunehag, Julia Haas, Raphael Koster, Edgar A. Duéñez-Guzmán, William S. Isaac, Georgios Piliouras, Stanley M. Bileschi, Iyad Rahwan, Simon Osindero

Abstract: What is appropriateness? Humans navigate a multi-scale mosaic of interlocking notions of what is appropriate for different situations. We act one way with our friends, another with our family, and yet another in the office. Likewise for AI, appropriate behavior for a comedy-writing assistant is not the same as appropriate behavior for a customer-service representative. What determines which actions are appropriate in which contexts? And what causes these standards to change over time? Since all judgments of AI appropriateness are ultimately made by humans, we need to understand how appropriateness guides human decision making in order to properly evaluate AI decision making and improve it. This paper presents a theory of appropriateness: how it functions in human society, how it may be implemented in the brain, and what it means for responsible deployment of generative AI technology.

Submitted 25 December, 2024; originally announced December 2024.

Comments: 115 pages, 2 figures

arXiv:2411.17506 [pdf]

Neural network modelling of kinematic and dynamic features for signature verification

Authors: Moises Diaz, Miguel A. Ferrer, Jose Juan Quintana, Adam Wolniakowski, Roman Trochimczuk, Konstantsin Miatliuk, Giovanna Castellano, Gennaro Vessio

Abstract: Online signature parameters, which are based on human characteristics, broaden the applicability of an automatic signature verifier. Although kinematic and dynamic features have previously been suggested, accurately measuring features such as arm and forearm torques remains challenging. We present two approaches for estimating angular velocities, angular positions, and force torques. The first approach involves using a physical UR5e robotic arm to reproduce a signature while capturing those parameters over time. The second method, a cost-effective approach, uses a neural network to estimate the same parameters. Our findings demonstrate that a simple neural network model can extract effective parameters for signature verification. Training the neural network with the MCYT300 dataset and cross-validating with other databases, namely, BiosecurID, Visual, Blind, OnOffSigDevanagari 75 and OnOffSigBengali 75, confirms the model's generalization capability.

Submitted 26 November, 2024; originally announced November 2024.

Journal ref: Procedia Computer Science, Volume 3, 2011, Pages 155-161

arXiv:2411.08791 [pdf, other]

Locally Private Sampling with Public Data

Authors: Behnoosh Zamanlooy, Mario Diaz, Shahab Asoodeh

Abstract: Local differential privacy (LDP) is increasingly employed in privacy-preserving machine learning to protect user data before sharing it with an untrusted aggregator. Most LDP methods assume that users possess only a single data record, which is a significant limitation since users often gather extensive datasets (e.g., images, text, time-series data) and frequently have access to public datasets. To address this limitation, we propose a locally private sampling framework that leverages both the private and public datasets of each user. Specifically, we assume each user has two distributions: $p$ and $q$ that represent their private dataset and the public dataset, respectively. The objective is to design a mechanism that generates a private sample approximating $p$ while simultaneously preserving $q$. We frame this objective as a minimax optimization problem using $f$-divergence as the utility measure. We fully characterize the minimax optimal mechanisms for general $f$-divergences provided that $p$ and $q$ are discrete distributions. Remarkably, we demonstrate that this optimal mechanism is universal across all $f$-divergences. Experiments validate the effectiveness of our minimax optimal sampler compared to the state-of-the-art locally private sampler.

Submitted 2 May, 2025; v1 submitted 13 November, 2024; originally announced November 2024.

arXiv:2411.04128 [pdf]

doi 10.1007/978-3-031-45461-5_4

On the analysis of saturated pressure to detect fatigue

Authors: Marcos Faundez-Zanuy, Josep Lopez-Xarbau, Moises Diaz, Manuel Garnacho-Castaño

Abstract: This paper examines the saturation of pressure signals during various handwriting tasks, including drawings, cursive text, capital words text, and signature, under different levels of fatigue. Experimental results demonstrate a significant rise in the proportion of saturated samples following strenuous exercise in tasks performed without resting the wrist. The analysis of saturation highlights significant differences when comparing the results to the baseline situation and strenuous fatigue.

Submitted 22 October, 2024; originally announced November 2024.

Comments: 12 pages. arXiv admin note: substantial text overlap with arXiv:2203.14782

Journal ref: In: Parziale, A., Diaz, M., Melo, F. (eds) Graphonomics in Human Body Movement. IGS 2023. Lecture Notes in Computer Science, vol 14285. Springer, Cham

arXiv:2411.00179 [pdf, other]

What Makes An Expert? Reviewing How ML Researchers Define "Expert"

Authors: Mark Díaz, Angela DR Smith

Abstract: Human experts are often engaged in the development of machine learning systems to collect and validate data, consult on algorithm development, and evaluate system performance. At the same time, who counts as an 'expert' and what constitutes 'expertise' is not always explicitly defined. In this work, we review 112 academic publications that explicitly reference 'expert' and 'expertise' and that describe the development of machine learning (ML) systems to survey how expertise is characterized and the role experts play. We find that expertise is often undefined and forms of knowledge outside of formal education and professional certification are rarely sought, which has implications for the kinds of knowledge that are recognized and legitimized in ML development. Moreover, we find that expert knowledge tends to be utilized in ways focused on mining textbook knowledge, such as through data annotation. We discuss the ways experts are engaged in ML development in relation to deskilling, the social construction of expertise, and implications for responsible AI development. We point to a need for reflection and specificity in justifications of domain expert engagement, both as a matter of documentation and reproducibility, as well as a matter of broadening the range of recognized expertise.

Submitted 31 October, 2024; originally announced November 2024.

arXiv:2411.00119 [pdf, ps, other]

Soft Condorcet Optimization for Ranking of General Agents

Authors: Marc Lanctot, Kate Larson, Michael Kaisers, Quentin Berthet, Ian Gemp, Manfred Diaz, Roberto-Rafael Maura-Rivero, Yoram Bachrach, Anna Koop, Doina Precup

Abstract: Driving progress of AI models and agents requires comparing their performance on standardized benchmarks; for general agents, individual performances must be aggregated across a potentially wide variety of different tasks. In this paper, we describe a novel ranking scheme inspired by social choice frameworks, called Soft Condorcet Optimization (SCO), to compute the optimal ranking of agents: the one that makes the fewest mistakes in predicting the agent comparisons in the evaluation data. This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria. SCO ratings are maximal for Condorcet winners when they exist, which we show is not necessarily true for the classical rating system Elo. We propose three optimization algorithms to compute SCO ratings and evaluate their empirical performance. When serving as an approximation to the Kemeny-Young voting method, SCO rankings are on average 0 to 0.043 away from the optimal ranking in normalized Kendall-tau distance across 865 preference profiles from the PrefLib open ranking archive. In a simulated noisy tournament setting, SCO achieves accurate approximations to the ground truth ranking and the best among several baselines when 59% or more of the preference data is missing. Finally, SCO ranking provides the best approximation to the optimal ranking, measured on held-out test sets, in a problem containing 52,958 human players across 31,049 games of the classic seven-player game of Diplomacy.

Submitted 27 June, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

Journal ref: AAMAS 2025

arXiv:2410.17032 [pdf, other]

Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Authors: Charvi Rastogi, Tian Huey Teh, Pushkar Mishra, Roma Patel, Zoe Ashwood, Aida Mostafazadeh Davani, Mark Diaz, Michela Paganini, Alicia Parrish, Ding Wang, Vinodkumar Prabhakaran, Lora Aroyo, Verena Rieser

Abstract: AI systems crucially rely on human ratings, but these ratings are often aggregated, obscuring the inherent diversity of perspectives in real-world phenomena. This is particularly concerning when evaluating the safety of generative AI, where perceptions and associated harms can vary significantly across socio-cultural contexts. While recent research has studied the impact of demographic differences on annotating text, there is limited understanding of how these subjective variations affect multimodal safety in generative AI. To address this, we conduct a large-scale study employing highly-parallel safety ratings of about 1000 text-to-image (T2I) generations from a demographically diverse rater pool of 630 raters balanced across 30 intersectional groups across age, gender, and ethnicity. Our study shows that (1) there are significant differences across demographic groups (including intersectional groups) on how severe they assess the harm to be, and that these differences vary across different types of safety violations, (2) the diverse rater pool captures annotation patterns that are substantially different from expert raters trained on a specific set of safety policies, and (3) the differences we observe in T2I safety are distinct from previously documented group level differences in text-based safety tasks. To further understand these varying perspectives, we conduct a qualitative analysis of the open-ended explanations provided by raters. This analysis reveals core differences in the reasons why different groups perceive harms in T2I generations.
Our findings underscore the critical need for incorporating diverse perspectives into safety evaluation of generative AI, ensuring these systems are truly inclusive and reflect the values of all users.

Submitted 22 October, 2024; originally announced October 2024.

Comments: 20 pages, 7 figures

arXiv:2410.11539 [pdf, other]

doi 10.1016/j.inffus.2025.103247

Transfer Learning with Foundational Models for Time Series Forecasting using Low-Rank Adaptations

Authors: M. Germán-Morales, A. J. Rivera-Rivas, M. J. del Jesus Díaz, C. J. Carmona

Abstract: Foundational Models are an emerging, widely used technique of GenAI. These models are distinguished by their scalability and the ease with which they can be adapted through the exploitation of Transfer Learning. The availability of high computational power and large datasets has supported their development, achieving a high generalization capacity due to the enormous and heterogeneous amounts of data used in their initial training. These characteristics contribute to a solid base that can be adapted or adjusted to a wide range of tasks, increasing their applicability. This study proposes the methodology LLIAM, a straightforward adaptation of a kind of FM, Large Language Models, for the Time Series Forecasting task. An adequate time-series prompting schema and Low-Rank Adaptations are used to enhance the knowledge of the model with diverse time series datasets, known as the fine-tuning phase. A study divided into two stages has been performed for evaluating the effectiveness of the proposed methodology. Initially, a comparison was made between the performance of LLIAM and different state-of-the-art DL algorithms, including Recurrent Neural Networks and Temporal Convolutional Networks, as well as an LLM-based method, TimeLLM. Following this, a zero-shot study is presented in order to evaluate the generalization capacity of the proposed methodology with time series datasets from unknown domains not considered in the model training.
The outcomes of this investigation demonstrate the efficacy of LLIAM, highlighting that this straightforward and general approach can attain competent results without the necessity for applying complex modifications. This work also encourages the use of available resources (such as these pre-trained models) and efficient fine-tuning techniques to avoid unnecessary and costly training, narrowing the gap between the goals of traditional AI and Green AI.

Submitted 12 May, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

Journal ref: Information Fusion, Volume 123, November 2025, 103247

arXiv:2410.05892 [pdf, other]

Towards an Autonomous Surface Vehicle Prototype for Artificial Intelligence Applications of Water Quality Monitoring

Authors: Luis Miguel Díaz, Samuel Yanes Luis, Alejandro Mendoza Barrionuevo, Dame Seck Diop, Manuel Perales, Alejandro Casado, Sergio Toral, Daniel Gutiérrez

Abstract: The use of Autonomous Surface Vehicles, equipped with water quality sensors and artificial vision systems, allows for a smart and adaptive deployment in water resources environmental monitoring. This paper presents a real implementation of a vehicle prototype that addresses the use of Artificial Intelligence algorithms and enhanced sensing techniques for water quality monitoring. The vehicle is fully equipped with high-quality sensors to measure water quality parameters and water depth. Furthermore, by means of a stereo-camera, it can also detect and locate macro-plastics in real environments using deep visual models, such as YOLOv5. In this paper, experimental results, carried out in Lago Mayor (Sevilla), are presented as proof of the capabilities of the proposed architecture. The overall system, and the early results obtained, are expected to provide a solid example of a real platform useful for the water resource monitoring task, and to serve as a real case scenario for deploying Artificial Intelligence algorithms, such as path planning, artificial vision, etc.

Submitted 8 October, 2024; originally announced October 2024.

arXiv:2409.17114 [pdf]

doi 10.1109/ICCST49569.2021.9717393

Towards human-like kinematics in industrial robotic arms: a case study on a UR3 robot

Authors: Adam Wolniakowski, Kanstantsin Miatliuk, Jose J. Quintana, Miguel A. Ferrer, Moises Diaz

Abstract: Safety in industrial robotic environments is a hot research topic in the area of human-robot interaction (HRI). Up to now, a robotic arm on an assembly line has interacted with other machines away from human workers. Nowadays, robotic arm manufacturers aim for their robots to increasingly perform tasks in collaboration with humans. One of the ways to improve this collaboration is by making the movement of robots more humanlike. This way, it would be easier for a human to foresee the movement of the robot and approach it without fear of contact. The main difference between the movement of a human and of a robotic arm is that the former has a bell-shaped speed profile while the latter has a uniform speed one. To generate this speed profile, the kinematic theory of rapid human movements and its Sigma-Lognormal model have been used. This model is widely used to explain most of the basic phenomena related to the control of human movements. Both human-like and robotic-like movements are transferred to the UR3 robot. In this paper we detail how the UR3 robot was programmed to produce both kinds of movement. The resulting dissimilarities between the input motion and the output motion of the robot confirm the possibility of developing human-like velocities in the UR3 robot.

Submitted 25 September, 2024; originally announced September 2024.

Comments: 6 pages, 5 figures

Journal ref: 2021 International Carnahan Conference on Security Technology (ICCST). IEEE, 2021

arXiv:2409.06147 [pdf, other]

doi 10.1109/ICASSP49660.2025.10889502

Multiclass Arrhythmia Classification using Smartwatch Photoplethysmography Signals Collected in Real-life Settings

Authors: Dong Han, Jihye Moon, Luís Roberto Mercado Díaz, Darren Chen, Devan Williams, Eric Y. Ding, Khanh-Van Tran, David D. McManus, Ki H. Chon

Abstract: Most deep learning models of multiclass arrhythmia classification are tested on fingertip photoplethysmographic (PPG) data, which has higher signal-to-noise ratios compared to smartwatch-derived PPG, and the best reported sensitivity value for premature atrial/ventricular contraction (PAC/PVC) detection is only 75%. To improve upon PAC/PVC detection sensitivity while maintaining high AF detection, we use multi-modal data which incorporates 1D PPG, accelerometers, and heart rate data as the inputs to a computationally efficient 1D bi-directional Gated Recurrent Unit (1D-Bi-GRU) model to detect three arrhythmia classes. We used motion-artifact prone smartwatch PPG data from the NIH-funded Pulsewatch clinical trial. Our multimodal model tested on 72 subjects achieved an unprecedented 83% sensitivity for PAC/PVC detection while maintaining a high accuracy of 97.31% for AF detection. These results outperformed the best state-of-the-art model by 20.81% for PAC/PVC and 2.55% for AF detection even while our model was computationally more efficient (14 times lighter and 2.7 times faster).

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2409.01596 [pdf, ps, other]

Synthesizing Late-Stage Contrast Enhancement in Breast MRI: A Comprehensive Pipeline Leveraging Temporal Contrast Enhancement Dynamics

Authors: Ruben D. Fonnegra, Maria Liliana Hernández, Juan C. Caicedo, Gloria M. Díaz

Abstract: Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is essential for breast cancer diagnosis due to its ability to characterize tissue through contrast agent kinetics. However, traditional DCE-MRI protocols require multiple imaging phases, including early and late post-contrast acquisitions, leading to prolonged scan times, patient discomfort, motion artifacts, high costs, and limited accessibility. To overcome these limitations, this study presents a pipeline for synthesizing late-phase DCE-MRI images from early-phase data, replicating the time-intensity (TI) curve behavior in enhanced regions while maintaining visual fidelity across the entire image. The proposed approach introduces a novel loss function, Time Intensity Loss (TI-loss), leveraging the temporal behavior of contrast agents to guide the training of a generative model. Additionally, a new normalization strategy, TI-norm, preserves the contrast enhancement pattern across multiple image sequences at various timestamps, addressing limitations of conventional normalization methods. Two metrics are proposed to evaluate image quality: the Contrast Agent Pattern Score ($\mathcal{CP}_{s}$), which validates enhancement patterns in annotated regions, and the Average Difference in Enhancement ($\mathcal{ED}$), measuring differences between real and generated enhancements.
Using a public DCE-MRI dataset with 1.5T and 3T scanners, the proposed method demonstrates accurate synthesis of late-phase images that outperform existing models in replicating the TI curve's behavior in regions of interest while preserving overall image quality. This advancement shows a potential to optimize DCE-MRI protocols by reducing scanning time without compromising diagnostic accuracy, and bringing generative models closer to practical implementation in clinical scenarios to enhance efficiency in breast cancer imaging.

Submitted 24 January, 2025; v1 submitted 3 September, 2024; originally announced September 2024.

arXiv:2408.01852 [pdf, other]

Sólo Escúchame: Spanish Emotional Accompaniment Chatbot

Authors: Bruno Gil Ramírez, Jessica López Espejel, María del Carmen Santiago Díaz, Gustavo Trinidad Rubín Linares

Abstract: According to the World Health Organization (WHO), suicide was the fourth leading cause of death in the world for individuals aged 15 to 29 in 2019. Given the rapid increase in mental health issues, providing psychological support is both crucial and urgent. In this paper: (1) we propose Sólo Escúchame, the first open-source Spanish emotional assistance chatbot, based on LLaMA-2-7b-Chat. (2) We introduced the HEAR (Hispanic Emotional Accompaniment Responses) dataset, compiled from multiple English sources translated into Spanish, as well as generic data generated using ChatGPT-3.5-Turbo. Finally, (3) we propose an evaluation metric based on two semi-automatic assessment methods. Our system outperforms a range of state-of-the-art models in providing psychological assistance in Spanish. Our models and datasets are publicly available to facilitate reproducibility.

Submitted 7 August, 2024; v1 submitted 3 August, 2024; originally announced August 2024.

Comments: Accepted at the 23rd Mexican International Conference on Artificial Intelligence (MICAI) 2024

arXiv:2407.20773 [pdf]

UpDown: Programmable fine-grained Events for Scalable Performance on Irregular Applications

Authors: Andronicus Rajasukumar, Jiya Su, Yuqing Wang, Tianshuo Su, Marziyeh Nourian, Jose M Monsalve Diaz, Tianchi Zhang, Jianru Ding, Wenyi Wang, Ziyi Zhang, Moubarak Jeje, Henry Hoffmann, Yanjing Li, Andrew A. Chien

Abstract: Applications with irregular data structures, data-dependent control flows and fine-grained data transfers (e.g., real-world graph computations) perform poorly on cache-based systems. We propose the UpDown accelerator that supports fine-grained execution with novel architecture mechanisms - lightweight threading, event-driven scheduling, efficient ultra-short threads, and split-transaction DRAM access with software-controlled synchronization. These hardware primitives support software programmable events, enabling high performance on diverse data structures and algorithms. UpDown also supports scalable performance; hardware replication enables programs to scale up performance. Evaluation results show UpDown's flexibility and scalability enable it to outperform CPUs on graph mining and analytics computations by up to 116-195x geomean speedup and more than 4x speedup over prior accelerators. We show that UpDown generates high memory parallelism (~4.6x over CPU) required for memory intensive graph computations. We present measurements that attribute the performance of UpDown (23x architectural advantage) to its individual architectural mechanisms. Finally, we also analyze the area and power cost of UpDown's mechanisms for software programmability.

Submitted 30 July, 2024; originally announced July 2024.

Comments: 14 pages, 23 figures

arXiv:2407.19922 [pdf, other]

Monetizing Currency Pair Sentiments through LLM Explainability

Authors: Lior Limonad, Fabiana Fournier, Juan Manuel Vera Díaz, Inna Skarbovsky, Shlomit Gur, Raquel Lazcano

Abstract: Large language models (LLMs) play a vital role in almost every domain in today's organizations. In the context of this work, we highlight the use of LLMs for sentiment analysis (SA) and explainability. Specifically, we contribute a novel technique to leverage LLMs as a post-hoc model-independent tool for the explainability of SA. We applied our technique in the financial domain for currency-pair price predictions using open news feed data merged with market prices. Our application shows that the developed technique is not only a viable alternative to using conventional eXplainable AI but can also be fed back to enrich the input to the machine learning (ML) model to better predict future currency-pair values. We envision our results could be generalized to employing explainability as a conventional enrichment for ML input for better ML predictions in general.

Submitted 29 July, 2024; originally announced July 2024.

Comments: 7 pages, 3 figures, AIFin@ECAI 2024

MSC Class: 68T50

Journal ref: AIFin workshop at ECAI 2024

arXiv:2406.11757 [pdf, other]

STAR: SocioTechnical Approach to Red Teaming Language Models

Authors: Laura Weidinger, John Mellor, Bernat Guillen Pegueroles, Nahema Marchal, Ravin Kumar, Kristian Lum, Canfer Akbulut, Mark Diaz, Stevie Bergman, Mikel Rodriguez, Verena Rieser, William Isaac

Abstract: This research introduces STAR, a sociotechnical framework that improves on current best practices for red teaming safety of large language models. STAR makes two key contributions: it enhances steerability by generating parameterised instructions for human red teamers, leading to improved coverage of the risk surface. Parameterised instructions also provide more detailed insights into model failures at no increased cost. Second, STAR improves signal quality by matching demographics to assess harms for specific groups, resulting in more sensitive annotations. STAR further employs a novel step of arbitration to leverage diverse viewpoints and improve label reliability, treating disagreement not as noise but as a valuable contribution to signal quality.

Submitted 23 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: 8 pages, 5 figures, 5 pages appendix. * denotes equal contribution

arXiv:2406.04818 [pdf]

doi 10.1007/978-3-031-45461-5_1

A short review on graphonometric evaluation tools in children

Authors: Belen Esther Aleman, Moises Diaz, Miguel Angel Ferrer

Abstract: Handwriting is a complex task that involves the coordination of motor, perceptual and cognitive skills. It is a fundamental skill for the cognitive and academic development of children. However, the technological and educational changes in recent decades have affected both the teaching and assessment of handwriting. This paper presents a literature review of handwriting analysis in children, including a bibliometric analysis of published articles, the study participants, and the methods of evaluating the graphonometric state of children. The aim is to synthesize the state of the art and provide an overview of the main study trends over the last decade. The review concludes that handwriting remains a fundamental tool for early estimation of cognitive problems and early intervention. The article analyzes graphonometric evaluation tools. Likewise, it reflects on the importance of graphonometric evaluation as a means to detect possible difficulties or disorders in learning to write. The article concludes by highlighting the need to agree on an evaluation methodology and to combine databases.

Submitted 7 June, 2024; originally announced June 2024.

Journal ref: Computer Science, vol 14285. Springer, Cham, 2024

arXiv:2406.03859 [pdf]

doi 10.1016/j.compag.2020.105531

From operculum and body tail movements to different coupling of physical activity and respiratory frequency in farmed gilthead sea bream and European sea bass. Insights on aquaculture biosensing

Authors: Miguel A. Ferrer, Josep A. Calduch-Giner, Moises Díaz, Javier Sosa, Enrique Rosell-Moll, Judith Santana Abril, Graciela Santana Sosa, Tomás Bautista Delgado, Cristina Carmona, Juan Antonio Martos-Sitcha, Enric Cabruja, Juan Manuel Afonso, Aurelio Vega, Manuel Lozano, Juan Antonio Montiel-Nelson, Jaume Pérez-Sánchez

Abstract: The AEFishBIT tri-axial accelerometer was externally attached to the operculum to assess the divergent activity and respiratory patterns of two marine farmed fish, the gilthead sea bream (Sparus aurata) and European sea bass (Dicentrarchus labrax). Analysis of raw data from exercised fish highlighted the large amplitude of operculum aperture and body tail movements in European sea bass, which were overall more stable at low-medium exercise intensity levels. Cosinor analysis in free-swimming fish (on-board data processing) highlighted a pronounced daily rhythmicity of locomotor activity and respiratory frequency in both gilthead sea bream and European sea bass. Acrophases of activity and respiration were coupled in gilthead sea bream, with feeding time (once daily at 11:00 h) acting as a main synchronizing factor. By contrast, locomotor activity and respiratory frequency were out of phase in European sea bass, with the activity acrophase in the early morning and the respiration acrophase in the afternoon. The daily range of activity and respiration variation was also higher in European sea bass, probably as part of the adaptation of this fish species to act as a fast swimming predator. In any case, lower locomotor activity and enhanced respiration were associated with larger body weight in both fish species. This agrees with the notion that selection for fast growth in farming conditions is accompanied by a lower activity profile, which may favor an efficient feed conversion for growth purposes.
Therefore, the use of behavioral monitoring is becoming a reliable and promising large-scale tool for selecting more efficient farmed fish, allowing researchers and farmers to establish stricter criteria of welfare for more sustainable and ethical fish production.

Submitted 6 June, 2024; originally announced June 2024.

Journal ref: Computers and Electronics in Agriculture, vol. 175, pp. 105531, 2020

arXiv:2406.03194 [pdf]

doi 10.9781/ijimai.2021.04.003

Writing Order Recovery in Complex and Long Static Handwriting

Authors: Moises Diaz, Gioele Crispo, Antonio Parziale, Angelo Marcelli, Miguel A. Ferrer

Abstract: The order in which the trajectory is executed is a powerful source of information for recognizers. However, there is still no general approach for recovering the trajectory of complex and long handwriting from static images. Complex specimens can result in multiple pen-downs and in a high number of trajectory crossings yielding agglomerations of pixels (also known as clusters). While the scientific literature describes a wide range of approaches for recovering the writing order in handwriting, these approaches nevertheless lack a common evaluation metric. In this paper, we introduce a new system to estimate the order recovery of thinned static trajectories, which allows the clusters to be resolved effectively and the order of the executed pen-downs to be selected. We evaluate how knowing the starting points of the pen-downs affects the quality of the recovered writing. Once the stability and sensitivity of the system are analyzed, we describe a series of experiments with three publicly available databases, showing competitive results in all cases. We expect the proposed system, whose code is made publicly available to the research community, to reduce potential confusion when the order of complex trajectories is recovered, and this will in turn make the recovered trajectories viable for further applications, such as velocity estimation.

Submitted 5 June, 2024; originally announced June 2024.

Journal ref: International Journal of Interactive Multimedia and Artificial Intelligence, Volume 7, number 4, Pages 171-184, 2022

arXiv:2406.00512 [pdf]

doi 10.1007/978-3-031-43085-5_36

On the use of first and second derivative approximations for biometric online signature recognition

Authors: Marcos Faundez-Zanuy, Moises Diaz

Abstract: This paper investigates the impact of different approximation methods in feature extraction for pattern recognition applications, specifically focused on delta and delta-delta parameters. Using the MCYT330 online signature database, our experiments show that 11-point approximation outperforms 1-point approximation, resulting in a 1.4% improvement in identification rate, 36.8% reduction in random forgeries and 2.4% reduction in skilled forgeries.

Submitted 1 June, 2024; originally announced June 2024.

Comments: Advances in Computational Intelligence. IWANN 2023. pp 461 to 472

Journal ref: Lecture Notes in Computer Science, vol 14134, 2023

arXiv:2405.19081 [pdf]

doi 10.3390/app122312045

Uniform vs. Lognormal Kinematics in Robots: Perceptual Preferences for Robotic Movements

Authors: Jose J. Quintana, Miguel A. Ferrer, Moises Diaz, Jose J. Feo, Adam Wolniakowski, Konstantsin Miatliuk

Abstract: Collaborative robots or cobots interact with humans in a common work environment. In cobots, one under-investigated but important issue is related to their movement and how it is perceived by humans. This paper tries to analyze whether humans prefer a robot moving in a human or in a robotic fashion. To this end, the present work lays out what differentiates the movement performed by an industrial robotic arm from that performed by a human one. The main difference lies in the fact that the robotic movement has a trapezoidal speed profile, while for the human arm, the speed profile is bell-shaped and during complex movements, it can be considered as a sum of superimposed bell-shaped movements. Based on the lognormality principle, a procedure was developed for a robotic arm to perform human-like movements. Both speed profiles were implemented in two industrial robots, namely, an ABB IRB 120 and a Universal Robot UR3. Three tests were used to study the subjects' preference when seeing both movements and another analyzed the same when interacting with the robot by touching its ends with their fingers.

Submitted 29 May, 2024; originally announced May 2024.

Journal ref: Applied Sciences Volume 12 Issue 23 (2022)

arXiv:2405.18924 [pdf]

doi 10.1007/s12559-023-10193-w

MDIW-13: a New Multi-Lingual and Multi-Script Database and Benchmark for Script Identification

Authors: Miguel A. Ferrer, Abhijit Das, Moises Diaz, Aythami Morales, Cristina Carmona-Duarte, Umapada Pal

Abstract: Script identification plays a vital role in applications that involve handwriting and document analysis within a multi-script and multi-lingual environment. Moreover, it exhibits a profound connection with human cognition. This paper provides a new database for benchmarking script identification algorithms, which contains both printed and handwritten documents collected from a wide variety of scripts, such as Arabic, Bengali (Bangla), Gujarati, Gurmukhi, Devanagari, Japanese, Kannada, Malayalam, Oriya, Roman, Tamil, Telugu, and Thai. The dataset consists of 1,135 documents scanned from local newspapers and handwritten letters as well as notes from different native writers. Further, these documents are segmented into lines and words, comprising a total of 13,979 and 86,655 lines and words, respectively, in the dataset. Easy-to-go benchmarks are proposed with handcrafted and deep learning methods. The benchmark includes results at the document, line, and word levels with printed and handwritten documents. Results of script identification independent of the document/line/word level and independent of the printed/handwritten letters are also given. The new multi-lingual database is expected to create new script identifiers, present various challenges, including identifying handwritten and printed samples and serve as a foundation for future research in script identification based on the reported results of the three benchmarks.

Submitted 29 May, 2024; originally announced May 2024.

Journal ref: Cognitive Computation, Volume 16, pages 131 to 157 (2024)

arXiv:2405.17886 [pdf]

doi 10.1080/19404158.2024.2326686

Graphomotor and Handwriting Disabilities Rating Scale (GHDRS): towards complex and objective assessment

Authors: Jiri Mekyska, Katarina Safarova, Tomas Urbanek, Jirina Bednarova, Vojtech Zvoncak, Jana Marie Havigerova, Lukas Cunek, Zoltan Galaz, Jan Mucha, Christine Klauszova, Marcos Faundez-Zanuy, Miguel A. Ferrer, Moises Diaz

Abstract: Graphomotor and handwriting disabilities (GD and HD, respectively) could significantly reduce children's quality of life. Effective remediation depends on proper diagnosis; however, current approaches to diagnosis and assessment of GD and HD have several limitations and knowledge gaps, e.g. they are subjective, they do not facilitate identification of specific manifestations, etc. The aim of this work is to introduce a new scale (GHDRS: Graphomotor and Handwriting Disabilities Rating Scale) that will enable experts to perform objective and complex computer-aided diagnosis and assessment of GD and HD. The scale supports quantification of 17 manifestations associated with the process/product of drawing/handwriting. The whole methodology of GHDRS design is made maximally transparent so that it could be adapted for other languages.

Submitted 28 May, 2024; originally announced May 2024.

Journal ref: Australian Journal of Learning Difficulties, Routledge, 1-34, 2024

arXiv:2405.16959 [pdf, other]

doi 10.1007/978-3-031-45461-5_8

A Machine Learning Approach to Analyze the Effects of Alzheimer's Disease on Handwriting through Lognormal Features

Authors: Tiziana D'Alessandro, Cristina Carmona-Duarte, Claudio De Stefano, Moises Diaz, Miguel A. Ferrer, Francesco Fontanella

Abstract: Alzheimer's disease is one of the most incisive illnesses among the neurodegenerative ones, and it causes a progressive decline in cognitive abilities that, in the worst cases, becomes severe enough to interfere with daily life. Currently, there is no cure, so an early diagnosis is strongly needed to try and slow its progression through medical treatments. Handwriting analysis is considered a potential tool for detecting and understanding certain neurological conditions, including Alzheimer's disease. While handwriting analysis alone cannot provide a definitive diagnosis of Alzheimer's, it may offer some insights and be used for a comprehensive assessment. The Sigma-lognormal model is conceived for movement analysis and can also be applied to handwriting. This model returns a set of lognormal parameters as output, which forms the basis for the computation of novel and significant features. This paper presents a machine learning approach applied to handwriting features extracted through the sigma-lognormal model. The aim is to develop a support system to help doctors in the diagnosis and study of Alzheimer's, evaluate the effectiveness of the extracted features, and finally study the relation among them.

Submitted 27 May, 2024; originally announced May 2024.

Journal ref: IGS 2023. Lecture Notes in Computer Science, vol 14285. Springer (2023)

arXiv:2405.15550 [pdf]

doi 10.1016/j.compag.2023.108500

CowScreeningDB: A public benchmark dataset for lameness detection in dairy cows

Authors: Shahid Ismail, Moises Diaz, Cristina Carmona-Duarte, Jose Manuel Vilar, Miguel A. Ferrer

Abstract: Lameness is one of the costliest pathological problems affecting dairy animals. It is usually assessed by trained veterinary clinicians who observe features such as gait symmetry or gait parameters such as step counts in real time. With the development of artificial intelligence, various modular systems have been proposed to minimize subjectivity in lameness assessment. However, the major limitation in their development is the unavailability of a public dataset; existing datasets are either commercial or privately held. To tackle this limitation, we have introduced CowScreeningDB, which was created using sensory data. This dataset was sourced from 43 cows at a dairy located in Gran Canaria, Spain. It consists of a multi-sensor dataset built on data collected using an Apple Watch 6 during the normal daily routine of a dairy cow. The collection environment, sampling technique, information regarding the sensors, and the applications used for data conversion and storage make the dataset a transparent one. This transparency of data can thus be used for further development of techniques for lameness detection for dairy cows which can be objectively compared. Aside from the public sharing of the dataset, we have also shared a machine-learning technique which classifies the cows as healthy or lame by using the raw sensory data. This validates the major objective, which is to establish the relationship between sensor data and lameness.

Submitted 24 May, 2024; originally announced May 2024.

Journal ref: Computers and Electronics in Agriculture, vol.216, pp.108500, 2024

arXiv:2405.14409 [pdf]

doi 10.1109/TIFS.2019.2924195

Investigating the Common Authorship of Signatures by Off-Line Automatic Signature Verification Without the Use of Reference Signatures

Authors: Moises Diaz, Miguel A. Ferrer, Soodamani Ramalingam, Richard Guest

Abstract: In automatic signature verification, questioned specimens are usually compared with reference signatures. In writer-dependent schemes, a number of reference signatures are required to build up the individual signer model while a writer-independent system requires a set of reference signatures from several signers to develop the model of the system. This paper addresses the problem of automatic signature verification when no reference signatures are available. The scenario we explore consists of a set of signatures, which could be signed by the same author or by multiple signers. As such, we discuss three methods which estimate automatically the common authorship of a set of off-line signatures. The first method develops a score similarity matrix, worked out with the assistance of duplicated signatures; the second uses a feature-distance matrix for each pair of signatures; and the last method introduces pre-classification based on the complexity of each signature. Publicly available signatures were used in the experiments, which gave encouraging results. As a baseline for the performance obtained by our approaches, we carried out a visual Turing Test where forensic and non-forensic human volunteers, carrying out the same task, performed less well than the automatic schemes.

Submitted 23 May, 2024; originally announced May 2024.

Journal ref: IEEE Transactions on Information Forensics and Security, vol.15, no.1, pp. 487 to 499 (2019)

arXiv:2405.13555 [pdf]

doi 10.1145/3274658

A Perspective Analysis of Handwritten Signature Technology

Authors: Moises Diaz, Miguel A. Ferrer, Donato Impedovo, Muhammad Imran Malik, Giuseppe Pirlo, Rejean Plamondon

Abstract: Handwritten signatures are biometric traits at the center of debate in the scientific community. Over the last 40 years, the interest in signature studies has grown steadily, having as its main reference the application of automatic signature verification, as previously published reviews in 1989, 2000, and 2008 bear witness. Ever since, and over the last 10 years, the application of handwritten signature technology has strongly evolved, and much research has focused on the possibility of applying systems based on handwritten signature analysis and processing to a multitude of new fields. After several years of haphazard growth of this research area, it is time to assess its current developments for their applicability in order to draw a structured way forward. This perspective reports a systematic review of the last 10 years of the literature on handwritten signatures with respect to the new scenario, focusing on the most promising domains of research and trying to elicit possible future research directions in this subject.

Submitted 22 May, 2024; originally announced May 2024.

Journal ref: ACM Computing Surveys (CSUR), vol.51, no 6, pp. 117:1-117:39 (2018)

arXiv:2405.13438 [pdf]

doi 10.1016/j.patrec.2019.08.018

Dynamically enhanced static handwriting representation for Parkinson's disease detection

Authors: Moises Diaz, Miguel Angel Ferrer, Donato Impedovo, Giuseppe Pirlo, Gennaro Vessio

Abstract: Computer aided diagnosis systems can provide non-invasive, low-cost tools to support clinicians. These systems have the potential to assist the diagnosis and monitoring of neurodegenerative disorders, in particular Parkinson's disease (PD). Handwriting plays a special role in the context of PD assessment. In this paper, the discriminating power of "dynamically enhanced" static images of handwriting is investigated. The enhanced images are synthetically generated by exploiting simultaneously the static and dynamic properties of handwriting. Specifically, we propose a static representation that embeds dynamic information based on: (i) drawing the points of the samples, instead of linking them, so as to retain temporal/velocity information; and (ii) adding pen-ups for the same purpose. To evaluate the effectiveness of the new handwriting representation, a fair comparison between this approach and state-of-the-art methods based on static and dynamic handwriting is conducted on the same dataset, i.e. PaHaW. The classification workflow employs transfer learning to extract meaningful features from multiple representations of the input data. An ensemble of different classifiers is used to achieve the final predictions. Dynamically enhanced static handwriting is able to outperform the results obtained by using static and dynamic handwriting separately.

Submitted 22 May, 2024; originally announced May 2024.

Journal ref: Pattern Recognition Letters, vol. 128, pp. 204-210 (2019)

arXiv:2405.12695 [pdf]

doi 10.1007/s00521-023-09192-7

Explainable offline automatic signature verifier to support forensic handwriting examiners

Authors: Moises Diaz, Miguel A. Ferrer, Gennaro Vessio

Abstract: Signature verification is a critical task in many applications, including forensic science, legal judgments, and financial markets. However, current signature verification systems are often difficult to explain, which can limit their acceptance in these applications. In this paper, we propose a novel explainable offline automatic signature verifier (ASV) to support forensic handwriting examiners. Our ASV is based on a universal background model (UBM) constructed from offline signature images. It allows us to assign a questioned signature to the UBM and to a reference set of known signatures using simple distance measures. This makes it possible to explain the verifier's decision in a way that is understandable to non-experts. We evaluated our ASV on publicly available databases and found that it achieves competitive performance with state-of-the-art ASVs, even when challenging 1 versus 1 comparisons are considered. Our results demonstrate that it is possible to develop an explainable ASV that is also competitive in terms of performance. We believe that our ASV has the potential to improve the acceptance of signature verification in critical applications such as forensic science and legal judgments.

Submitted 21 May, 2024; originally announced May 2024.

Journal ref: Neural Computing and Applications, Volume 36, pages 2411 to 2427 (2024)

arXiv:2405.12556 [pdf]

doi 10.1007/s12559-023-10205-9

Online Signature Recognition: A Biologically Inspired Feature Vector Splitting Approach

Authors: Marcos Faundez, Moises Diaz, Miguel Angel Ferrer

Abstract: This research introduces an innovative approach to explore the cognitive and biologically inspired underpinnings of feature vector splitting for analyzing the significance of different attributes in e-security biometric signature recognition applications. Departing from traditional methods of concatenating features into an extended set, we employ multiple splitting strategies, aligning with cognitive principles, to preserve control over the relative importance of each feature subset. Our methodology is applied to three diverse databases (MCYT100, MCYT300, and SVC) using two classifiers (vector quantization and dynamic time warping with one and five training samples). Experimentation demonstrates that the fusion of pressure data with spatial coordinates (x and y) consistently enhances performance. However, the inclusion of pen-tip angles in the same feature set yields mixed results, with performance improvements observed in select cases. This work delves into the cognitive aspects of feature fusion, shedding light on the cognitive relevance of feature vector splitting in e-security biometric applications.

Submitted 21 May, 2024; originally announced May 2024.

Journal ref: Cognitive Computation, vol. 16, pages 265 to 277 (2024)

arXiv:2405.11978 [pdf, other]

doi 10.1016/j.patrec.2018.07.029

SM-DTW: Stability Modulated Dynamic Time Warping for signature verification

Authors: Antonio Parziale, Moises Diaz, Miguel A. Ferrer, Angelo Marcelli

Abstract: Building upon findings in computational models of handwriting learning and execution, we introduce the concept of stability to explain the difference between the actual movements performed during multiple executions of the subject's signature, and conjecture that the most stable parts of the signature should play a paramount role in evaluating the similarity between a questioned signature and the reference ones during signature verification. We then introduce the Stability Modulated Dynamic Time Warping algorithm for incorporating the stability regions, i.e. the most similar parts between two signatures, into the distance measure between a pair of signatures computed by the Dynamic Time Warping for signature verification. Experiments were conducted on two datasets largely adopted for performance evaluation. Experimental results show that the proposed algorithm improves the performance of the baseline system and compares favourably with other top performing signature verification systems.

Submitted 20 May, 2024; originally announced May 2024.

Journal ref: Pattern Recognition Letters, Volume: 121, Pages 113-122 (2019)

arXiv:2404.10857 [pdf, other]

D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation

Authors: Aida Mostafazadeh Davani, Mark Díaz, Dylan Baker, Vinodkumar Prabhakaran

Abstract: While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context, and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within demographic groups may hold diverse values, which can influence their perceptions beyond their group norms. To effectively incorporate these considerations into NLP pipelines, we need datasets with extensive parallel annotations from various social and cultural groups. In this paper we introduce the D3CODE dataset: a large-scale cross-cultural dataset of parallel annotations for offensive language in over 4.5K sentences annotated by a pool of over 4k annotators, balanced across gender and age, from across 21 countries, representing eight geo-cultural regions. The dataset contains annotators' moral values captured along six moral foundations: care, equality, proportionality, authority, loyalty, and purity. Our analyses reveal substantial regional variations in annotators' perceptions that are shaped by individual moral values, offering crucial insights for building pluralistic, culturally sensitive NLP models.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.03084 [pdf, other]

Rethinking Teacher-Student Curriculum Learning through the Cooperative Mechanics of Experience

Authors: Manfred Diaz, Liam Paull, Andrea Tacchetti

Abstract: Teacher-Student Curriculum Learning (TSCL) is a curriculum learning framework that draws inspiration from human cultural transmission and learning. It involves a teacher algorithm shaping the learning process of a learner algorithm by exposing it to controlled experiences. Despite its success, understanding the conditions under which TSCL is effective remains challenging. In this paper, we propose a data-centric perspective to analyze the underlying mechanics of the teacher-student interactions in TSCL. We leverage cooperative game theory to describe how the composition of the set of experiences presented by the teacher to the learner, as well as their order, influences the performance of the curriculum that is found by TSCL approaches. To do so, we demonstrate that for every TSCL problem, an equivalent cooperative game exists, and several key components of the TSCL framework can be reinterpreted using game-theoretic principles. Through experiments covering supervised learning, reinforcement learning, and classical games, we estimate the cooperative values of experiences and use value-proportional curriculum mechanisms to construct curricula, even in cases where TSCL struggles. The framework and experimental setup we present in this work represent a novel foundation for a deeper exploration of TSCL, shedding light on its underlying mechanisms and providing insights into its broader applicability in machine learning.

Submitted 12 September, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: Accepted at TMLR (https://openreview.net/forum?id=qWh82br6KT)

arXiv:2402.06811 [pdf, ps, other]

Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation

Authors: Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin Van Liemt, Sonja Schmer-Galunder

Abstract: Data annotation remains the sine qua non of machine learning and AI. Recent empirical work on data annotation has begun to highlight the importance of rater diversity for fairness and model performance, and new lines of research have begun to examine the working conditions for data annotation workers, the impacts and role of annotator subjectivity on labels, and the potential psychological harms from aspects of annotation work. This paper outlines a critical genealogy of data annotation, starting with its psychological and perceptual aspects. We draw on similarities with critiques of the rise of computerized lab-based psychological experiments in the 1970s, which question whether these experiments permit the generalization of results beyond the laboratory settings within which these results are typically obtained. Do data annotations permit the generalization of results beyond the settings, or locations, in which they were obtained? Psychology is overly reliant on participants from Western, Educated, Industrialized, Rich, and Democratic societies (WEIRD). Many of the people who work as data annotation platform workers, however, are not from WEIRD countries; most data annotation workers are based in Global South countries. Social categorizations and classifications from WEIRD countries are imposed on non-WEIRD annotators through instructions and tasks, and through them, on data, which is then used to train or evaluate AI models in WEIRD countries.
We synthesize evidence from several recent lines of research and argue that data annotation is a form of automated social categorization that risks entrenching outdated and static social categories that are in reality dynamic and changing. We propose a framework for understanding the interplay of the global social conditions of data annotation with the subjective phenomenological experience of data annotation work.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 18 pages

arXiv:2402.01849 [pdf, other]

doi 10.1016/j.engappai.2020.104113

Capturing waste collection planning expert knowledge in a fitness function through preference learning

Authors: Laura Fernández Díaz, Miriam Fernández Díaz, José Ramón Quevedo, Elena Montañés

Abstract: This paper copes with the COGERSA waste collection process. Up to now, experts have manually designed the process using a trial and error mechanism. This process is not globally optimized, since it has been progressively and locally built as council demands appear. Planning optimization algorithms usually solve it, but they need a fitness function to evaluate a route planning quality. The drawback is that even experts are not able to propose one in a straightforward way due to the complexity of the process. Hence, the goal of this paper is to build a fitness function through a preference framework, taking advantage of the available expert knowledge and expertise. Several key performance indicators together with preference judgments are carefully established according to the experts for learning a promising fitness function. Particularly, the additivity property of them makes the task much more affordable, since it allows working with routes rather than with route plannings. Besides, a feature selection analysis is performed over such indicators, since the experts suspect a potential existing (but unknown) redundancy among them. The experiment results confirm this hypothesis, since the best C-index (98% against around 94%) is reached when 6 or 8 out of 21 indicators are taken. Particularly, truck load seems to be a highly promising key performance indicator, together with the travelled distance along non-main roads.
A comparison with other existing approaches shows that the proposed method clearly outperforms them, since the C-index goes from 72% or 90% to 98%.

Submitted 2 February, 2024; originally announced February 2024.

Journal ref: Engineering Applications of Artificial Intelligence 2021 Volume 99 104113

arXiv:2401.17026 [pdf]

doi 10.1109/TCYB.2017.2751740

Static and Dynamic Synthesis of Bengali and Devanagari Signatures

Authors: Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal

Abstract: Developing an automatic signature verification system is challenging and demands a large number of training samples. This is why synthetic handwriting generation is an emerging topic in document image analysis. Some handwriting synthesizers use the motor equivalence model, the well-established hypothesis from neuroscience, which analyses how a human being accomplishes movement. Specifically, a motor equivalence model divides human actions into two steps: 1) the effector independent step at cognitive level and 2) the effector dependent step at motor level. In fact, recent work reports the successful application to Western scripts of a handwriting synthesizer, based on this theory. This paper aims to adapt this scheme for the generation of synthetic signatures in two Indic scripts, Bengali (Bangla), and Devanagari (Hindi). For this purpose, we use two different online and offline databases for both Bengali and Devanagari signatures. This paper reports an effective synthesizer for static and dynamic signatures written in Devanagari or Bengali scripts. We obtain promising results with artificially generated signatures in terms of appearance and performance when we compare the results with those for real signatures.

Submitted 30 January, 2024; originally announced January 2024.

Comments: Accepted version. Published on IEEE Transactions on Cybernetics [ISSN 2168-2267], v. 48(10), p. 2896-2907

Journal ref: IEEE Transactions on Cybernetics, v. 48(10), p. 2896-2907, 2018

arXiv:2401.16519 [pdf, other]

doi 10.1016/j.patrec.2023.02.021

Extending the kinematic theory of rapid movements with new primitives

Authors: Miguel A. Ferrer, Moises Diaz, Jose J. Quintana, Cristina Carmona-Duarte

Abstract: The Kinematic Theory of rapid movements, and its associated Sigma-Lognormal, model 2D spatiotemporal trajectories. It is constructed mainly as a temporal overlap of curves between virtual target points. Specifically, it uses an arc and a lognormal as primitives for the representation of the trajectory and velocity, respectively. This paper proposes developing this model, in what we call the Kinematic Theory Transform, which establishes a mathematical framework that allows further primitives to be used. Mainly, we evaluate Euler curves to link virtual target points and Gaussian, Beta, Gamma, Double-bounded lognormal, and Generalized Extreme Value functions to model the bell-shaped velocity profile. Using these primitives, we report reconstruction results with spatiotemporal trajectories executed by human beings, animals, and anthropomorphic robots.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted version: published in Pattern Recognition Letters [ISSN 0167-8655], v. 167, p. 181-188 (March 2023)

Journal ref: Pattern Recognition Letters, 167, 181-188, 2023

arXiv:2401.16329 [pdf, other]

doi 10.1016/j.knosys.2023.110365

Synthesis of 3D on-air signatures with the Sigma-Lognormal model

Authors: Miguel A. Ferrer, Moises Diaz, Cristina Carmona-Duarte, Jose J. Quintana Hernandez, Rejean Plamondon

Abstract: Signature synthesis is a computation technique that generates artificial specimens which can support decision making in automatic signature verification. A lot of work has been dedicated to this subject, which centres on synthesizing dynamic and static two-dimensional handwriting on canvas. This paper proposes a framework to generate synthetic 3D on-air signatures exploiting the lognormality principle, which mimics the complex neuromotor control processes at play as the fingertip moves. Addressing the usual cases involving the development of artificial individuals and duplicated samples, this paper contributes to the synthesis of: (1) the trajectory and velocity of entirely 3D new signatures; (2) kinematic information when only the 3D trajectory of the signature is known, and (3) duplicate samples of 3D real signatures. Validation was conducted by generating synthetic 3D signature databases mimicking real ones and showing that automatic signature verifications of genuine and skilled forgeries report performances similar to those of real and synthetic databases. We also observed that training 3D automatic signature verifiers with duplicates can reduce errors. We further demonstrated that our proposal is also valid for synthesizing 3D air writing and gestures. Finally, a perception test confirmed the human likeness of the generated specimens. The databases generated are publicly available, only for research purposes.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted Version. Published on Knowledge-Based Systems

Journal ref: Knowledge-Based Systems, Vol. 265, 2023

arXiv:2401.15473 [pdf]

doi 10.1109/TPAMI.2018.2879312

iDeLog: Iterative Dual Spatial and Kinematic Extraction of Sigma-Lognormal Parameters

Authors: Miguel A. Ferrer, Moises Diaz, Cristina Carmona-Duarte, Rejean Plamondon

Abstract: The Kinematic Theory of rapid movements and its associated Sigma-Lognormal model have been extensively used in a large variety of applications. While the physical and biological meaning of the model have been widely tested and validated for rapid movements, some shortcomings have been detected when it is used with continuous long and complex movements. To alleviate such drawbacks, and inspired by the motor equivalence theory and a conceivable visual feedback, this paper proposes a novel framework to extract the Sigma-Lognormal parameters, namely iDeLog. Specifically, iDeLog consists of two steps. The first one, influenced by the motor equivalence model, separately derives an initial action plan defined by a set of virtual points and angles from the trajectory and a sequence of lognormals from the velocity. In the second step, based on a hypothetical visual feedback compatible with an open-loop motor control, the virtual target points of the action plan are iteratively moved to improve the matching between the observed and reconstructed trajectory and velocity. During experiments conducted with handwritten signatures, iDeLog obtained promising results as compared to the previous development of the Sigma-Lognormal.

Submitted 7 February, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

Comments: Accepted Version published by Transactions on Pattern Analysis and Machine Intelligence

Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1), pp. 114-125, 2020

Showing 1–50 of 124 results for author: Díaz, M