
Cooperation and the Design of Public Goods

Authors: J. Carlos Martínez Mori, Alejandro Toriello

Abstract: We consider the cooperative elements that arise in the design of public goods, such as transportation policies and infrastructure. These involve a variety of stakeholders: governments, businesses, advocates, and users. Their eventual deployment depends on the decision maker's ability to garner sufficient support from each of these groups; we formalize these strategic requirements from the perspective of cooperative game theory. Specifically, we introduce non-transferable utility, linear production (NTU LP) games, which combine the game-theoretic tensions inherent in public decision-making with the modeling flexibility of linear programming. We derive structural properties regarding the non-emptiness, representability and complexity of the core, a solution concept that models the viability of cooperation. In particular, we provide fairly general sufficient conditions under which the core of an NTU LP game is guaranteed to be non-empty, prove that determining membership in the core is co-NP-complete, and develop a cutting plane algorithm to optimize various social welfare objectives subject to core membership. Lastly, we apply these results in a data-driven case study on service plan optimization for the Chicago bus system. As our study illustrates, cooperation is necessary for the successful deployment of transportation service plans and similar public goods, but it may also have adverse or counterintuitive distributive implications.

Submitted 19 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

Comments: 26th ACM Conference on Economics and Computation (EC '25)

MSC Class: 91A12; 90C90; 90B06

arXiv:2504.06473 [pdf, other]

Membrane: Accelerating Database Analytics with Bank-Level DRAM-PIM Filtering

Authors: Akhil Shekar, Kevin Gaffney, Martin Prammer, Khyati Kiyawat, Lingxi Wu, Helena Caminal, Zhenxing Fan, Yimin Gao, Ashish Venkat, José F. Martínez, Jignesh Patel, Kevin Skadron

Abstract: In-memory database query processing frequently involves substantial data transfers between the CPU and memory, leading to inefficiencies due to Von Neumann bottleneck. Processing-in-Memory (PIM) architectures offer a viable solution to alleviate this bottleneck. In our study, we employ a commonly used software approach that streamlines JOIN operations into simpler selection or filtering tasks using pre-join denormalization which makes query processing workload more amenable to PIM acceleration. This research explores DRAM design landscape to evaluate how effectively these filtering tasks can be efficiently executed across DRAM hierarchy and their effect on overall application speedup. We also find that operations such as aggregates are more suitably executed on the CPU rather than PIM. Thus, we propose a cooperative query processing framework that capitalizes on both CPU and PIM strengths, where (i) the DRAM-based PIM block, with its massive parallelism, supports scan operations while (ii) CPU, with its flexible architecture, supports the rest of query execution. This allows us to utilize both PIM and CPU where appropriate and prevent dramatic changes to the overall system architecture. With these minimal modifications, our methodology enables us to faithfully perform end-to-end performance evaluations using established analytics benchmarks such as TPCH and star-schema benchmark (SSB).
Our findings show that this novel mapping approach improves performance, delivering a 5.92x/6.5x speedup compared to a traditional schema and 3.03-4.05x speedup compared to a denormalized schema with 9-17% memory overhead, depending on the degree of partial denormalization. Further, we provide insights into query selectivity, memory overheads, and software optimizations in the context of PIM-based filtering, which better explain the behavior and performance of these systems across the benchmarks.

Submitted 8 April, 2025; originally announced April 2025.

arXiv:2502.08603 [pdf, other]

Scalable Thermodynamic Second-order Optimization

Authors: Kaelan Donatella, Samuel Duffield, Denis Melanson, Maxwell Aifer, Phoebe Klett, Rajath Salegame, Zach Belateche, Gavin Crooks, Antonio J. Martinez, Patrick J. Coles

Abstract: Many hardware proposals have aimed to accelerate inference in AI workloads. Less attention has been paid to hardware acceleration of training, despite the enormous societal impact of rapid training of AI models. Physics-based computers, such as thermodynamic computers, offer an efficient means to solve key primitives in AI training algorithms. Optimizers that normally would be computationally out-of-reach (e.g., due to expensive matrix inversions) on digital hardware could be unlocked with physics-based hardware. In this work, we propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Our asymptotic complexity analysis predicts increasing advantage with our algorithm as $n$, the number of neurons per layer, increases. Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. Finally, we predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.

Submitted 12 February, 2025; originally announced February 2025.

Comments: 17 pages, 5 figures

arXiv:2502.07785 [pdf, other]

Pippo: High-Resolution Multi-View Humans from a Single Image

Authors: Yash Kant, Ethan Weber, Jin Kyu Kim, Rawal Khirodkar, Su Zhaoen, Julieta Martinez, Igor Gilitschenski, Shunsuke Saito, Timur Bagautdinov

Abstract: We present Pippo, a generative model capable of producing 1K resolution dense turnaround videos of a person from a single casually clicked photo. Pippo is a multi-view diffusion transformer and does not require any additional inputs - e.g., a fitted parametric model or camera parameters of the input image. We pre-train Pippo on 3B human images without captions, and conduct multi-view mid-training and post-training on studio captured humans. During mid-training, to quickly absorb the studio dataset, we denoise several (up to 48) views at low-resolution, and encode target cameras coarsely using a shallow MLP. During post-training, we denoise fewer views at high-resolution and use pixel-aligned controls (e.g., Spatial anchor and Plucker rays) to enable 3D consistent generations. At inference, we propose an attention biasing technique that allows Pippo to simultaneously generate greater than 5 times as many views as seen during training. Finally, we also introduce an improved metric to evaluate 3D consistency of multi-view generations, and show that Pippo outperforms existing works on multi-view human generation from a single image.

Submitted 11 February, 2025; originally announced February 2025.

Comments: Project Page - http://yashkant.github.io/pippo

arXiv:2501.13309 [pdf, other]

Representing Visualization Insights as a Dense Insight Network

Authors: Jane Hoffswell, Victor Soares Bursztyn, Shunan Guo, Jesse Martinez, Eunyee Koh

Abstract: We propose a dense insight network framework to encode the relationships between automatically generated insights from a complex dashboard based on their shared characteristics. Our insight network framework includes five high-level categories of relationships (e.g., type, topic, value, metadata, and compound scores). The goal of this insight network framework is to provide a foundation for implementing new insight interpretation and exploration strategies, including both user-driven and automated approaches. To illustrate the complexity and flexibility of our framework, we first describe a visualization playground to directly visualize key network characteristics; this playground also demonstrates potential interactive capabilities for decomposing the dense insight network. Then, we discuss a case study application for ranking insights based on the underlying network characteristics captured by our framework, before prompting a large language model to generate a concise, natural language summary. Finally, we reflect on next steps for leveraging our insight network framework to design and evaluate new systems.

Submitted 22 January, 2025; originally announced January 2025.

Comments: Currently Under Review

arXiv:2501.08104 [pdf, other]

doi 10.1109/ICASSP49660.2025.10889702

Loudspeaker Beamforming to Enhance Speech Recognition Performance of Voice Driven Applications

Authors: Dimme de Groot, Baturalp Karslioglu, Odette Scharenborg, Jorge Martinez

Abstract: In this paper we propose a robust loudspeaker beamforming algorithm which is used to enhance the performance of voice driven applications in scenarios where the loudspeakers introduce the majority of the noise, e.g. when music is playing loudly. The loudspeaker beamformer modifies the loudspeaker playback signals to create a low-acoustic-energy region around the device that implements automatic speech recognition for a voice driven application (VDA). The algorithm utilises a distortion measure based on human auditory perception to limit the distortion perceived by human listeners. Simulations and real-world experiments show that the proposed loudspeaker beamformer improves the speech recognition performance in all tested scenarios. Moreover, the algorithm allows to further reduce the acoustic energy around the VDA device at the expense of reduced objective audio quality at the listener's location.

Submitted 14 January, 2025; originally announced January 2025.

Comments: To appear at ICASSP 2025

Journal ref: ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2412.09483 [pdf]

Early Detection of At-Risk Students Using Machine Learning

Authors: Azucena L. Jimenez Martinez, Kanika Sood, Rakeshkumar Mahto

Abstract: This research presents preliminary work to address the challenge of identifying at-risk students using supervised machine learning and three unique data categories: engagement, demographics, and performance data collected from Fall 2023 using Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By focusing on previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and significantly boost student success across disciplines at the University. Pre-processing steps take place to establish a target variable, anonymize student information, manage missing data, and identify the most significant features. Given the mixed data types in the datasets and the binary classification nature of this study, this work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. These models predict at-risk students and identify critical periods of the semester when student performance is most vulnerable. We will use validation techniques such as train test split and k-fold cross-validation to ensure the reliability of the models. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall.

Submitted 12 December, 2024; originally announced December 2024.

arXiv:2412.08586 [pdf, ps, other]

Asymptotically good CSS-T codes and a new construction of triorthogonal codes

Authors: Elena Berardini, Reza Dastbasteh, Josu Etxezarreta Martinez, Shreyas Jain, Olatz Sanz Larrarte

Abstract: We propose a new systematic construction of CSS-T codes from any given CSS code using a map $φ$. When $φ$ is the identity map $I$, we retrieve the construction of [1] and use it to prove the existence of asymptotically good binary CSS-T codes, resolving a previously open problem in the literature, and of asymptotically good quantum LDPC CSS-T codes. We analyze the structure of the logical operators corresponding to certain non-Clifford gates supported by the quantum codes obtained from this construction ($φ= I$), concluding that they always result in the logical identity. An immediate application of these codes in dealing with coherent noise is discussed. We then develop a new doubling transformation for obtaining triorthogonal codes, which generalizes the doubling construction presented in [2]. Our approach permits using self-orthogonal codes, instead of only doubly-even codes, as building blocks for triorthogonal codes. This broadens the range of codes available for magic state distillation.

Submitted 20 June, 2025; v1 submitted 11 December, 2024; originally announced December 2024.

Comments: new results on triorthogonal codes; changes in the title and the presentation; accepted for publication in IEEE Journal on Selected Areas in Information Theory as a Special Issue on Quantum Error Correction and Fault Tolerance

arXiv:2410.06698 [pdf, other]

doi 10.1002/aisy.202400353

Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras

Authors: Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, Guillermo Gallego

Abstract: Event cameras are novel bio-inspired vision sensors that measure pixel-wise brightness changes asynchronously instead of images at a given frame rate. They offer promising advantages, namely a high dynamic range, low latency, and minimal motion blur. Modern computer vision algorithms often rely on artificial neural network approaches, which require image-like representations of the data and cannot fully exploit the characteristics of event data. We propose approaches to action recognition based on the Fourier Transform. The approaches are intended to recognize oscillating motion patterns commonly present in nature. In particular, we apply our approaches to a recent dataset of breeding penguins annotated for "ecstatic display", a behavior where the observed penguins flap their wings at a certain frequency. We find that our approaches are both simple and effective, producing slightly lower results than a deep neural network (DNN) while relying just on a tiny fraction of the parameters compared to the DNN (five orders of magnitude fewer parameters). They work well despite the uncontrolled, diverse data present in the dataset. We hope this work opens a new perspective on event-based processing and action recognition.

Submitted 9 October, 2024; originally announced October 2024.

Comments: 11 pages, 10 figures, 7 tables

Journal ref: Advanced Intelligent Systems, 2024

arXiv:2410.01793 [pdf, other]

Thermodynamic Bayesian Inference

Authors: Maxwell Aifer, Samuel Duffield, Kaelan Donatella, Denis Melanson, Phoebe Klett, Zach Belateche, Gavin Crooks, Antonio J. Martinez, Patrick J. Coles

Abstract: A fully Bayesian treatment of complicated predictive models (such as deep neural networks) would enable rigorous uncertainty quantification and the automation of higher-level tasks including model selection. However, the intractability of sampling Bayesian posteriors over many parameters inhibits the use of Bayesian methods where they are most needed. Thermodynamic computing has emerged as a paradigm for accelerating operations used in machine learning, such as matrix inversion, and is based on the mapping of Langevin equations to the dynamics of noisy physical systems. Hence, it is natural to consider the implementation of Langevin sampling algorithms on thermodynamic devices. In this work we propose electronic analog devices that sample from Bayesian posteriors by realizing Langevin dynamics physically. Circuit designs are given for sampling the posterior of a Gaussian-Gaussian model and for Bayesian logistic regression, and are validated by simulations. It is shown, under reasonable assumptions, that the Bayesian posteriors for these models can be sampled in time scaling with $\ln(d)$, where $d$ is dimension. For the Gaussian-Gaussian model, the energy cost is shown to scale with $d \ln(d)$. These results highlight the potential for fast, energy-efficient Bayesian inference using thermodynamic computing.

Submitted 2 October, 2024; originally announced October 2024.

Comments: 20 pages, 8 figures

arXiv:2409.17688 [pdf, other]

doi 10.1007/s11227-022-04574-5

HPC acceleration of large (min, +) matrix products to compute domination-type parameters in graphs

Authors: E. M. Garzón, J. A. Martínez, J. J. Moreno, M. L. Puertas

Abstract: The computation of the domination-type parameters is a challenging problem in Cartesian product graphs. We present an algorithmic method to compute the $2$-domination number of the Cartesian product of a path with small order and any cycle, involving the $(\min,+)$ matrix product. We establish some theoretical results that provide the algorithms necessary to compute that parameter, and the main challenge to run such algorithms comes from the large size of the matrices used, which makes it necessary to improve the techniques to handle these objects. We analyze the performance of the algorithms on modern multicore CPUs and on GPUs and we show the advantages over the sequential implementation. The use of these platforms allows us to compute the $2$-domination number of cylinders such that their paths have at most $12$ vertices.

Submitted 26 September, 2024; originally announced September 2024.

Journal ref: Journal of Supercomputing 78, pp. 17826-17843, 2022
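
The $(\min,+)$ matrix product named in the abstract above replaces the usual sum of products with a minimum of sums: $c_{ij} = \min_k (a_{ik} + b_{kj})$. The following is a minimal sequential sketch of that operation and of repeated squaring for powers. It is not the authors' multicore or GPU implementation, and the function names `min_plus_product` and `min_plus_power` are illustrative assumptions.

```python
import math


def min_plus_product(a, b):
    """Return the (min,+) product c of matrices a and b (lists of lists).

    c[i][j] = min over k of a[i][k] + b[k][j]; math.inf marks "no entry".
    """
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [min(a[i][k] + b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


def min_plus_power(a, exponent):
    """Return the (min,+) power of a square matrix by repeated squaring."""
    if exponent < 1:
        raise ValueError("exponent must be at least 1")
    result = None
    base = a
    while exponent:
        if exponent & 1:
            result = base if result is None else min_plus_product(result, base)
        exponent >>= 1
        if exponent:
            base = min_plus_product(base, base)
    return result


# Hypothetical 3x3 example: arc weights of a small directed graph.
INF = math.inf
weights = [
    [0, 1, INF],
    [INF, 0, 2],
    [4, INF, 0],
]
squared = min_plus_power(weights, 2)
```

With a zero diagonal, the $k$-th $(\min,+)$ power holds the cheapest walks using at most $k$ arcs, which is why the power stabilizes on this three-vertex example after squaring.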

arXiv:2409.17658 [pdf, other]

doi 10.1109/ACCESS.2021.3058738

Powers of large matrices on GPU platforms to compute the Roman domination number of cylindrical graphs

Authors: J. A. Martínez, E. M. Garzón, M. L. Puertas

Abstract: The Roman domination in a graph $G$ is a variant of the classical domination, defined by means of a so-called Roman domination function $f\colon V(G)\to \{0,1,2\}$ such that if $f(v)=0$ then, the vertex $v$ is adjacent to at least one vertex $w$ with $f(w)=2$. The weight $f(G)$ of a Roman dominating function of $G$ is the sum of the weights of all vertices of $G$, that is, $f(G)=\sum_{u\in V(G)}f(u)$. The Roman domination number $γ_R(G)$ is the minimum weight of a Roman dominating function of $G$. In this paper we propose algorithms to compute this parameter involving the $(\min,+)$ powers of large matrices with high computational requirements and the GPU (Graphics Processing Unit) allows us to accelerate such operations. Specific routines have been developed to efficiently compute the $(\min ,+)$ product on GPU architecture, taking advantage of its computational power. These algorithms allow us to compute the Roman domination number of cylindrical graphs $P_m\Box C_n$, i.e., the Cartesian product of a path and a cycle, in cases $m=7,8,9$, $n\geq 3$ and $m\geq 10$, $n\equiv 0\pmod 5$. Moreover, we provide a lower bound for the remaining cases $m\geq 10$, $n\not\equiv 0\pmod 5$.

Submitted 26 September, 2024; originally announced September 2024.

Journal ref: IEEE Access, vol. 9, pp. 29346-29355, 2021
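
The definition of a Roman dominating function in the abstract above can be checked mechanically on very small cylinders. The sketch below builds $P_m \Box C_n$ and finds the Roman domination number by exhaustive search. This brute force is a swapped-in technique for illustration only and is feasible only for a handful of vertices; the paper itself relies on $(\min,+)$ matrix powers on GPUs. The helper names are assumptions.

```python
from itertools import product


def cylinder_adjacency(m, n):
    """Adjacency sets of the cylinder P_m x C_n (Cartesian product), n >= 3.

    Vertex (i, j) has path coordinate i and cycle coordinate j.
    """
    adj = {}
    for i in range(m):
        for j in range(n):
            nbrs = {(i, (j - 1) % n), (i, (j + 1) % n)}
            if i > 0:
                nbrs.add((i - 1, j))
            if i < m - 1:
                nbrs.add((i + 1, j))
            adj[(i, j)] = nbrs
    return adj


def is_roman_dominating(adj, f):
    """True if every vertex with f(v) = 0 has a neighbor w with f(w) = 2."""
    return all(f[v] != 0 or any(f[w] == 2 for w in adj[v]) for v in adj)


def roman_domination_number(adj):
    """Minimum weight of a Roman dominating function, by exhaustive search."""
    vertices = sorted(adj)
    best = len(vertices)  # f = 1 everywhere is always Roman dominating
    for values in product((0, 1, 2), repeat=len(vertices)):
        weight = sum(values)
        if weight >= best:
            continue
        if is_roman_dominating(adj, dict(zip(vertices, values))):
            best = weight
    return best


# Small hypothetical instances: m = 1 gives a plain cycle, m = 2 a prism.
triangle = cylinder_adjacency(1, 3)
square = cylinder_adjacency(1, 4)
prism = cylinder_adjacency(2, 3)
```

The search visits $3^{|V|}$ labelings, which explains why larger cylinders call for the matrix-based approach described in the abstract.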

arXiv:2409.16703 [pdf, other]

doi 10.1007/s40314-022-02137-1

The 2-domination number of cylindrical graphs

Authors: José Antonio Martínez, Ana Belén Castaño-Fernández, María Luz Puertas

Abstract: A vertex subset S of a graph G is said to 2-dominate the graph if each vertex not in S has at least two neighbors in it. As usual, the associated parameter is the minimum cardinal of a 2-dominating set, which is called the 2-domination number of the graph G. We present both lower and upper bounds of the 2-domination number of cylinders, which are the Cartesian products of a path and a cycle. These bounds allow us to compute the exact value of the 2-domination number of cylinders where the path is arbitrary, and the order of the cycle is $n \equiv 0 \pmod 3$ and as large as desired. In the case of the lower bound, we adapt the technique of the wasted domination to this parameter and we use the so-called tropical matrix product to obtain the desired bound. Moreover, we provide a regular patterned construction of a minimum 2-dominating set in the cylinders having the mentioned cycle order.

Submitted 25 September, 2024; originally announced September 2024.

Comments: 19 pages, 4 figures

MSC Class: 05C69; 05C85; 15A80

Journal ref: Comp. Appl. Math. 41, 424 (2022)
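
The 2-domination condition stated in the abstract above (every vertex outside S has at least two neighbors in S) can be illustrated on tiny cylinders. The sketch below searches subsets in order of increasing size. It is an exhaustive check under assumed helper names, not the bounding technique or the patterned construction from the paper.

```python
from itertools import combinations


def cylinder_adjacency(m, n):
    """Adjacency sets of the cylinder P_m x C_n (Cartesian product), n >= 3."""
    adj = {}
    for i in range(m):
        for j in range(n):
            nbrs = {(i, (j - 1) % n), (i, (j + 1) % n)}
            if i > 0:
                nbrs.add((i - 1, j))
            if i < m - 1:
                nbrs.add((i + 1, j))
            adj[(i, j)] = nbrs
    return adj


def is_two_dominating(adj, subset):
    """True if each vertex outside subset has at least two neighbors in it."""
    chosen = set(subset)
    return all(v in chosen or len(adj[v] & chosen) >= 2 for v in adj)


def two_domination_number(adj):
    """Size of a smallest 2-dominating set, trying sizes 1, 2, ... in turn."""
    vertices = sorted(adj)
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_two_dominating(adj, subset):
                return size
    return len(vertices)


# Small hypothetical instances: m = 1 gives a plain cycle, m = 2 a prism.
cycle5 = cylinder_adjacency(1, 5)
prism = cylinder_adjacency(2, 3)
```

Because the whole vertex set is trivially 2-dominating, the search always terminates, but its cost grows exponentially with the number of vertices.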

arXiv:2409.15813 [pdf, other]

Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks

Authors: Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Jose M Martínez

Abstract: Merging parameters of multiple models has resurfaced as an effective strategy to enhance task performance and robustness, but prior work is limited by the high costs of ensemble creation and inference. In this paper, we leverage the abundance of freely accessible trained models to introduce a cost-free approach to model merging. It focuses on a layer-wise integration of merged models, aiming to maintain the distinctiveness of the task-specific final layers while unifying the initial layers, which are primarily associated with feature extraction. This approach ensures parameter consistency across all layers, essential for boosting performance. Moreover, it facilitates seamless integration of knowledge, enabling effective merging of models from different datasets and tasks. Specifically, we investigate its applicability in Unsupervised Domain Adaptation (UDA), an unexplored area for model merging, for Semantic and Panoptic Segmentation. Experimental results demonstrate substantial UDA improvements without additional costs for merging same-architecture models from distinct datasets ($\uparrow 2.6\%$ mIoU) and different-architecture models with a shared backbone ($\uparrow 6.8\%$ mIoU). Furthermore, merging Semantic and Panoptic Segmentation models increases mPQ by $\uparrow 7\%$. These findings are validated across a wide variety of UDA strategies, architectures, and datasets.

Submitted 24 September, 2024; originally announced September 2024.

arXiv:2408.12569 [pdf, other]

Sapiens: Foundation for Human Vision Models

Authors: Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez, Su Zhaoen, Austin James, Peter Selednik, Stuart Anderson, Shunsuke Saito

Abstract: We present Sapiens, a family of models for four fundamental human-centric vision tasks -- 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction. Our models natively support 1K high-resolution inference and are extremely easy to adapt for individual tasks by simply fine-tuning models pretrained on over 300 million in-the-wild human images. We observe that, given the same computational budget, self-supervised pretraining on a curated dataset of human images significantly boosts the performance for a diverse set of human-centric tasks. The resulting models exhibit remarkable generalization to in-the-wild data, even when labeled data is scarce or entirely synthetic. Our simple model design also brings scalability -- model performance across tasks improves as we scale the number of parameters from 0.3 to 2 billion. Sapiens consistently surpasses existing baselines across various human-centric benchmarks. We achieve significant improvements over the prior state-of-the-art on Humans-5K (pose) by 7.6 mAP, Humans-2K (part-seg) by 17.1 mIoU, Hi4D (depth) by 22.4% relative RMSE, and THuman2 (normal) by 53.5% relative angular error. Project page: https://about.meta.com/realitylabs/codecavatars/sapiens.

Submitted 26 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

Comments: ECCV 2024 (Oral)

arXiv:2407.07753 [pdf, other]

Quantum CSS Duadic and Triadic Codes: New Insights and Properties

Authors: Reza Dastbasteh, Olatz Sanz Larrarte, Josu Etxezarreta Martinez, Antonio deMarti iOlius, Javier Oliva del Moral, Pedro Crespo Bofill

Abstract: In this study, we investigate the construction of quantum CSS duadic codes with dimensions greater than one. We introduce a method for extending smaller splittings of quantum duadic codes to create larger, potentially degenerate quantum duadic codes. Furthermore, we present a technique for computing or bounding the minimum distances of quantum codes constructed through this approach. Additionally, we introduce quantum CSS triadic codes, a family of quantum codes with a rate of at least $\frac{1}{3}$.

Submitted 10 July, 2024; originally announced July 2024.

MSC Class: 94B05; 94B15

arXiv:2406.13264 [pdf, other]

WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks

Authors: Michael Wornow, Avanika Narayan, Ben Viggiano, Ishan S. Khare, Tathagat Verma, Tibor Thompson, Miguel Angel Fuentes Hernandez, Sudharsan Sundar, Chloe Trujillo, Krrish Chawla, Rongfei Lu, Justin Shen, Divya Nagaraj, Joshua Martinez, Vardhan Agrawal, Althea Hudson, Nigam H. Shah, Christopher Re

Abstract: Existing ML benchmarks lack the depth and diversity of annotations needed for evaluating models on business process management (BPM) tasks. BPM is the practice of documenting, measuring, improving, and automating enterprise workflows. However, research has focused almost exclusively on one task - full end-to-end automation using agents based on multimodal foundation models (FMs) like GPT-4. This focus on automation ignores the reality of how most BPM tools are applied today - simply documenting the relevant workflow takes 60% of the time of the typical process optimization project. To address this gap we present WONDERBREAD, the first benchmark for evaluating multimodal FMs on BPM tasks beyond automation. Our contributions are: (1) a dataset containing 2928 documented workflow demonstrations; (2) 6 novel BPM tasks sourced from real-world applications ranging from workflow documentation to knowledge transfer to process improvement; and (3) an automated evaluation harness. Our benchmark shows that while state-of-the-art FMs can automatically generate documentation (e.g. recalling 88% of the steps taken in a video demonstration of a workflow), they struggle to re-apply that knowledge towards finer-grained validation of workflow completion (F1 < 0.3). We hope WONDERBREAD encourages the development of more "human-centered" AI tooling for enterprise applications and furthers the exploration of multimodal FMs for the broader universe of BPM tasks. We publish our dataset and experiments here: https://github.com/HazyResearch/wonderbread

Submitted 10 October, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.00166 [pdf, other]

On complexity of colloid cellular automata

Authors: Andrew Adamatzky, Nic Roberts, Raphael Fortulan, Noushin Raeisi Kheirabadi, Panagiotis Mougkogiannis, Michail-Antisthenis Tsompanas, Genaro J. Martinez, Georgios Ch. Sirakoulis, Alessandro Chiolerio

Abstract: The colloid cellular automata do not imitate the physical structure of colloids but are governed by logical functions derived from the colloids. We analyse the space-time complexity of Boolean circuits derived from the electrical responses of colloids: ZnO (zinc oxide, an inorganic compound also known as calamine or zinc white, which naturally occurs as the mineral zincite), proteinoids (microspheres and crystals of thermal abiotic proteins), and combinations thereof to electrical stimulation. To extract Boolean circuits from colloids, we send all possible configurations of two-, four-, and eight-bit binary strings, encoded as electrical potential values, to the colloids, record their responses, and thereby infer the Boolean functions they implement. We map the discovered functions onto the cell-state transition rules of cellular automata (arrays of binary state machines that update their states synchronously according to the same rule) -- the colloid cellular automata. We then analyse the phenomenology of the space-time configurations of the automata and evaluate their complexity using measures such as compressibility, Shannon entropy, Simpson diversity, and expressivity. A hierarchy of phenomenological and measurable space-time complexity is constructed.

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2405.20204 [pdf, other]

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Authors: Andreas Koukounas, Georgios Mastrapas, Michael Günther, Bo Wang, Scott Martens, Isabelle Mohr, Saba Sturua, Mohammad Kalim Akram, Joan Fontanals Martínez, Saahil Ognawala, Susana Guzman, Maximilian Werk, Nan Wang, Han Xiao

Abstract: Contrastive Language-Image Pretraining (CLIP) is widely used to train models to align images and texts in a common embedding space by mapping them to fixed-sized vectors. These models are key to multimodal information retrieval and related tasks. However, CLIP models generally underperform in text-only tasks compared to specialized text models. This creates inefficiencies for information retrieval systems that keep separate embeddings and models for text-only and multimodal tasks. We propose a novel, multi-task contrastive training method to address this issue, which we use to train the jina-clip-v1 model to achieve state-of-the-art performance on both text-image and text-text retrieval tasks.

Submitted 26 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: 4 pages, MFM-EAI@ICML2024

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2405.18350 [pdf, other]

doi 10.1109/ACCESS.2019.2937505

A System for Automatic English Text Expansion

Authors: Silvia García Méndez, Milagros Fernández Gavilanes, Enrique Costa Montenegro, Jonathan Juncal Martínez, Francisco Javier González Castaño, Ehud Reiter

Abstract: We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches. Here, "automatic" means that the system can generate coherent and correct sentences from a minimum set of words. From its inception, the design is modular and adaptable to other languages. This adaptability is one of its greatest advantages. For English, we have created the highly precise aLexiE lexicon with wide coverage, which represents a contribution on its own. We have evaluated the resulting NLG library in an Augmentative and Alternative Communication (AAC) proof of concept, both directly (by regenerating corpus sentences) and manually (from annotations) using a popular corpus in the NLG field. We performed a second analysis by comparing the quality of text expansion in English to Spanish, using an ad-hoc Spanish-English parallel corpus. The system might also be applied to other domains such as report and news generation.

Submitted 28 May, 2024; originally announced May 2024.

Journal ref: (2019) IEEE Access, 7, 123320-123333

arXiv:2405.09546 [pdf, other]

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

Abstract: The systematic evaluation and understanding of computer vision models under varying conditions require large amounts of data with comprehensive and customized labels, which real-world vision datasets rarely satisfy. While current synthetic data generators offer a promising alternative, particularly for embodied AI tasks, they often fall short for computer vision tasks due to low asset and rendering quality, limited diversity, and unrealistic physical properties. We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models, based on the newly developed embodied AI benchmark, BEHAVIOR-1K. BVS supports a large number of adjustable parameters at the scene level (e.g., lighting, object placement), the object level (e.g., joint configuration, attributes such as "filled" and "folded"), and the camera level (e.g., field of view, focal length). Researchers can arbitrarily vary these parameters during data generation to perform controlled experiments. We showcase three example application scenarios: systematically evaluating the robustness of models across different continuous axes of domain shift, evaluating scene understanding models on the same set of images, and training and evaluating simulation-to-real transfer for a novel vision task: unary and binary state prediction. Project website: https://behavior-vision-suite.github.io/

Submitted 15 May, 2024; originally announced May 2024.

Comments: CVPR 2024 (Highlight). Project website: https://behavior-vision-suite.github.io/

arXiv:2403.14291 [pdf, other]

Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models

Authors: Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan C. SanMiguel, Jose M. Martínez

Abstract: Diffusion models represent a new paradigm in text-to-image generation. Beyond generating high-quality images from text prompts, models such as Stable Diffusion have been successfully extended to the joint generation of semantic segmentation pseudo-masks. However, current extensions primarily rely on extracting attentions linked to prompt words used for image synthesis. This approach limits the generation of segmentation masks derived from word tokens not contained in the text prompt. In this work, we introduce Open-Vocabulary Attention Maps (OVAM), a training-free method for text-to-image diffusion models that enables the generation of attention maps for any word. In addition, we propose a lightweight optimization process based on OVAM for finding tokens that generate accurate attention maps for an object class with a single annotation. We evaluate these tokens within existing state-of-the-art Stable Diffusion extensions. The best-performing model improves its mIoU from 52.1 to 86.6 for the synthetic images' pseudo-masks, demonstrating that our optimized tokens are an efficient way to improve the performance of existing methods without architectural changes or retraining.

Submitted 21 March, 2024; originally announced March 2024.

Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)

arXiv:2403.13014 [pdf]

General Line Coordinates in 3D

Authors: Joshua Martinez, Boris Kovalerchuk

Abstract: Interpretable interactive visual pattern discovery in lossless 3D visualization is a promising way to advance machine learning. It enables end users who are not data scientists to take control of the model development process as a self-service. It is conducted in 3D General Line Coordinates (GLC) visualization space, which preserves all n-D information in 3D. This paper presents a system which combines three types of GLC: Shifted Paired Coordinates (SPC), Shifted Tripled Coordinates (STC), and General Line Coordinates-Linear (GLC-L) for interactive visual pattern discovery. A transition from 2-D visualization to 3-D visualization allows for a more distinct visual pattern than in 2-D and it also allows for finding the best data viewing positions, which are not available in 2-D. It enables in-depth visual analysis of various class-specific data subsets comprehensible for end users in the original interpretable attributes. Controlling model overgeneralization by end users is an additional benefit of this approach.

Submitted 17 March, 2024; originally announced March 2024.

Comments: 8 pages, 25 figures

arXiv:2402.17016 [pdf, other]

Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

Authors: Isabelle Mohr, Markus Krimmel, Saba Sturua, Mohammad Kalim Akram, Andreas Koukounas, Michael Günther, Georgios Mastrapas, Vinit Ravishankar, Joan Fontanals Martínez, Feng Wang, Qi Liu, Ziniu Yu, Jie Fu, Saahil Ognawala, Susana Guzman, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao

Abstract: We introduce a novel suite of state-of-the-art bilingual text embedding models that are designed to support English and another target language. These models are capable of processing lengthy text inputs with up to 8192 tokens, making them highly versatile for a range of natural language processing tasks such as text retrieval, clustering, and semantic textual similarity (STS) calculations. By focusing on bilingual models and introducing a unique multi-task learning objective, we have significantly improved the model performance on STS tasks, which outperforms the capabilities of existing multilingual models in both target language understanding and cross-lingual evaluation tasks. Moreover, our bilingual models are more efficient, requiring fewer parameters and less memory due to their smaller vocabulary needs. Furthermore, we have expanded the Massive Text Embedding Benchmark (MTEB) to include benchmarks for German and Spanish embedding models. This integration aims to stimulate further research and advancement in text embedding technologies for these languages.

Submitted 26 February, 2024; originally announced February 2024.

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2402.05435 [pdf, other]

GPT-4 Generated Narratives of Life Events using a Structured Narrative Prompt: A Validation Study

Authors: Christopher J. Lynch, Erik Jensen, Madison H. Munro, Virginia Zamponi, Joseph Martinez, Kevin O'Brien, Brandon Feldhaus, Katherine Smith, Ann Marie Reinhold, Ross Gore

Abstract: Large Language Models (LLMs) play a pivotal role in generating vast arrays of narratives, facilitating a systematic exploration of their effectiveness for communicating life events in narrative form. In this study, we employ a zero-shot structured narrative prompt to generate 24,000 narratives using OpenAI's GPT-4. From this dataset, we manually classify 2,880 narratives and evaluate their validity in conveying birth, death, hiring, and firing events. Remarkably, 87.43% of the narratives sufficiently convey the intention of the structured prompt. To automate the identification of valid and invalid narratives, we train and validate nine Machine Learning models on the classified datasets. Leveraging these models, we extend our analysis to predict the classifications of the remaining 21,120 narratives. All the ML models excelled at classifying valid narratives as valid, but experienced challenges at simultaneously classifying invalid narratives as invalid. Our findings not only advance the study of LLM capabilities, limitations, and validity but also offer practical insights for narrative generation and natural language processing applications.

Submitted 12 July, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: 29 pages, 24 figures

ACM Class: I.2.7; I.6.4

arXiv:2401.03780 [pdf, other]

Cybersecurity in Critical Infrastructures: A Post-Quantum Cryptography Perspective

Authors: Javier Oliva del Moral, Antonio deMarti iOlius, Gerard Vidal, Pedro M. Crespo, Josu Etxezarreta Martinez

Abstract: The machinery of industrial environments was connected to the Internet years ago with the aim of increasing its performance. However, this change made such environments vulnerable to cyber-attacks that can compromise their correct functioning, resulting in economic or social problems. Moreover, implementing cryptosystems in the communications between operational technology (OT) devices is a more challenging task than for information technology (IT) environments, since OT networks are generally composed of legacy elements characterized by low computational capabilities. Consequently, implementing cryptosystems in industrial communication networks faces a trade-off between the security of the communications and the amortization of the industrial infrastructure. Critical Infrastructure (CI) refers to the industries which provide key resources for daily social and economic development, e.g. electricity. Furthermore, a new threat to cybersecurity has arisen with the theoretical proposal of quantum computers, due to their potential ability to break state-of-the-art cryptography protocols, such as RSA or ECC. Many global agents have become aware that transitioning their secure communications to a quantum-secure paradigm is a priority that should be established before the arrival of fault tolerance. In this paper, we aim to describe the problems of implementing post-quantum cryptography (PQC) in CI environments. To do so, we describe the requirements for these scenarios and how they differ from IT. We also introduce classical cryptography and how quantum computers pose a threat to such security protocols. Furthermore, we introduce state-of-the-art proposals of PQC protocols and present their characteristics. We conclude by discussing the problems of integrating PQC in industrial environments.

Submitted 11 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

Comments: 27 pages, 7 figures, 10 tables

arXiv:2401.03307 [pdf, other]

Modeling Processes of Neighborhood Change

Authors: J. Carlos Martínez Mori, Zhanzhan Zhao

Abstract: An urban planner might design the spatial layout of transportation amenities so as to improve accessibility for underserved communities -- a fairness objective. However, implementing such a design might trigger processes of neighborhood change that change who benefits from these amenities in the long term. If so, has the planner really achieved their fairness objective? Can algorithmic decision-making anticipate second order effects? In this paper, we take a step in this direction by formulating processes of neighborhood change as instances of no-regret dynamics; a collective learning process in which a set of strategic agents rapidly reach a state of approximate equilibrium. We mathematize concepts of neighborhood change to model the incentive structures impacting individual dwelling-site decision-making. Our model accounts for affordability, access to relevant transit amenities, community ties, and site upkeep. We showcase our model with computational experiments that provide semi-quantitative insights on the spatial economics of neighborhood change, particularly on the influence of residential zoning policy and the placement of transit amenities.

Submitted 9 February, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

MSC Class: 91D10; 91A80; 90B06

arXiv:2312.15849 [pdf]

FODT: Fast, Online, Distributed and Temporary Failure Recovery Approach for MEC

Authors: Xin Yuan, Ning Li, Jose Fernan Martinez

Abstract: Mobile edge computing (MEC) can successfully reduce the latency of cloud computing. However, an edge server may fail due to hardware or software issues. When an edge server fails, the users who offload tasks to this server are affected. Recovering the services for these affected users quickly and effectively is challenging. Moreover, considering that the server failure is continuous and temporary, and the failed server can be repaired, previous works cannot handle this problem effectively. Therefore, in this paper, we propose the fast, online, distributed, and temporary failure recovery algorithm (FODT) for MEC. In FODT, when an edge server fails, only the affected access points (APs) recalculate their user-server allocation strategies and the other APs do not change their strategies. For the affected APs, the strategies before server failure are reused to reduce complexity and latency. When the failed server is repaired, the influenced APs reuse the strategies before server failure to offload tasks to this server. Based on this approach, FODT can achieve better performance than previous works. To the best of our knowledge, FODT is the first failure recovery algorithm, and when compared with previous research, it has higher failure recovery efficiency and lower complexity with an acceptable approximation ratio.

Submitted 16 April, 2025; v1 submitted 25 December, 2023; originally announced December 2023.

Comments: 12 pages, 7 figures

arXiv:2312.06504 [pdf, ps, other]

An infinite class of quantum codes derived from duadic constacyclic codes

Authors: Reza Dastbasteh, Josu Etxezarreta Martinez, Andrew Nemec, Antonio deMarti iOlius, Pedro Crespo Bofill

Abstract: We present a family of quantum stabilizer codes using the structure of duadic constacyclic codes over $\mathbb{F}_4$. Within this family, quantum codes can possess varying dimensions, and their minimum distances are lower bounded by a square root bound. For each fixed dimension, this allows us to construct an infinite sequence of binary quantum codes with a growing minimum distance. Additionally, we prove that this family of quantum codes includes an infinite subclass of degenerate codes. We also introduce a technique for extending splittings of duadic constacyclic codes, providing new insights into the minimum distance and minimum odd-like weight of specific duadic constacyclic codes. Finally, we provide numerical examples of some quantum codes with short lengths within this family.

Submitted 27 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: 31 pages, 2 tables

MSC Class: 94B05; 94B15

arXiv:2312.04836 [pdf, other]

Thermodynamic Computing System for AI Applications

Authors: Denis Melanson, Mohammad Abu Khater, Maxwell Aifer, Kaelan Donatella, Max Hunter Gordon, Thomas Ahle, Gavin Crooks, Antonio J. Martinez, Faris Sbahi, Patrick J. Coles

Abstract: Recent breakthroughs in artificial intelligence (AI) algorithms have highlighted the need for novel computing hardware in order to truly unlock the potential for AI. Physics-based hardware, such as thermodynamic computing, has the potential to provide a fast, low-power means to accelerate AI primitives, especially generative AI and probabilistic AI. In this work, we present the first continuous-variable thermodynamic computer, which we call the stochastic processing unit (SPU). Our SPU is composed of RLC circuits, as unit cells, on a printed circuit board, with 8 unit cells that are all-to-all coupled via switched capacitances. It can be used for either sampling or linear algebra primitives, and we demonstrate Gaussian sampling and matrix inversion on our hardware. The latter represents the first thermodynamic linear algebra experiment. We also illustrate the applicability of the SPU to uncertainty quantification for neural network classification. We envision that this hardware, when scaled up in size, will have significant impact on accelerating various probabilistic AI applications.

Submitted 8 December, 2023; originally announced December 2023.

Comments: 26 pages, 22 figures

arXiv:2312.03799 [pdf, other]

doi 10.1109/CVPR52733.2024.01761

Low-power, Continuous Remote Behavioral Localization with Event Cameras

Authors: Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, Guillermo Gallego

Abstract: Researchers in natural science need reliable methods for quantifying animal behavior. Recently, numerous computer vision methods emerged to automate the process. However, observing wild species at remote locations remains a challenging task due to difficult lighting conditions and constraints on power supply and data storage. Event cameras offer unique advantages for battery-dependent remote monitoring due to their low power consumption and high dynamic range capabilities. We use this novel sensor to quantify a behavior in Chinstrap penguins called ecstatic display. We formulate the problem as a temporal action detection task, determining the start and end times of the behavior. For this purpose, we recorded a colony of breeding penguins in Antarctica for several weeks and labeled event data on 16 nests. The developed method consists of a generator of candidate time intervals (proposals) and a classifier of the actions within them. The experiments show that the event cameras' natural response to motion is effective for continuous behavior monitoring and detection, reaching a mean average precision (mAP) of 58% (which increases to 63% in good weather conditions). The results also demonstrate the robustness against various lighting conditions contained in the challenging dataset. The low-power capabilities of the event camera allow it to record significantly longer than with a conventional camera. This work pioneers the use of event cameras for remote wildlife observation, opening new interdisciplinary opportunities. https://tub-rip.github.io/eventpenguins/

Submitted 19 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

Comments: 13 pages, 8 figures, 12 tables, Project page: https://tub-rip.github.io/eventpenguins/

Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 2024

arXiv:2311.02986 [pdf, other]

doi 10.1145/3718349

Hacking Cryptographic Protocols with Advanced Variational Quantum Attacks

Authors: Borja Aizpurua, Pablo Bermejo, Josu Etxezarreta Martinez, Roman Orus

Abstract: Here we introduce an improved approach to Variational Quantum Attack Algorithms (VQAA) on cryptographic protocols. Our methods provide robust quantum attacks on well-known cryptographic algorithms, more efficiently and with remarkably fewer qubits than previous approaches. We implement simulations of our attacks for symmetric-key protocols such as S-DES, S-AES and Blowfish. For instance, we show how our attack allows a classical simulation of a small 8-qubit quantum computer to find the secret key of one 32-bit Blowfish instance with 24 times fewer iterations than a brute-force attack. Our work also shows improvements in attack success rates for lightweight ciphers such as S-DES and S-AES. Further applications beyond symmetric-key cryptography are also discussed, including asymmetric-key protocols and hash functions. In addition, we comment on potential future improvements of our methods. Our results bring us one step closer to assessing the vulnerability of large-size classical cryptographic protocols with Noisy Intermediate-Scale Quantum (NISQ) devices, and set the stage for future research in quantum cybersecurity.

Submitted 14 March, 2025; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: 16 pages, 9 figures

Journal ref: ACM Transactions on Quantum Computing, Volume 6, Issue 2, Article No.: 14, June 2025, Pages 1-24

arXiv:2310.20093 [pdf, other]

Evaluating Neural Language Models as Cognitive Models of Language Acquisition

Authors: Héctor Javier Vázquez Martínez, Annika Lea Heuser, Charles Yang, Jordan Kodner

Abstract: The success of neural language models (LMs) on many technological tasks has brought about their potential relevance as scientific theories of language despite some clear differences between LM training and child language acquisition. In this paper we argue that some of the most prominent benchmarks for evaluating the syntactic capacities of LMs may not be sufficiently rigorous. In particular, we show that the template-based benchmarks lack the structural diversity commonly found in the theoretical and psychological studies of language. When trained on small-scale data modeling child language acquisition, the LMs can be readily matched by simple baseline models. We advocate for the use of the readily available, carefully curated datasets that have been evaluated for gradient acceptability by large pools of native speakers and are designed to probe the structural basis of grammar specifically. On one such dataset, the LI-Adger dataset, LMs evaluate sentences in a way inconsistent with human language users. We conclude with suggestions for better connecting LMs with the empirical study of child language acquisition.

Submitted 30 October, 2023; originally announced October 2023.

Comments: To appear in the GenBench 2023 workshop proceedings, the first workshop on (benchmarking) generalisation in NLP. GenBench 2023 will be held at EMNLP 2023 on December 6, 2023

arXiv:2310.07803 [pdf]

doi 10.1515/humor-2023-0032

A general mechanism of humor: reformulating the semantic overlap

Authors: Javier Martínez

Abstract: This article proposes a cognitive mechanism of humour of general applicability, not restricted to verbal communication. It is indebted to Raskin's concept of script overlap, and conforms to the incongruity-resolution theoretical framework, but it is built on the notion of constraint, an abstract correspondence between sets of data. Under this view, script overlap is an outcome of a more abstractly described phenomenon, constraint overlap. The important concept of the overlooked argument is introduced to characterise the two overlapping constraints -- overt and covert. Their inputs and outputs are not directly encoded in utterances, but implicated by them, and their overlap results in another overlap at the level of the communicated utterances, that the incongruity reveals. Our hypothesis assumes as a given that the evocation of such constraints is a cognitive effect of the inferential process by which a hearer interprets utterances. We base this assumption on Hofstadter's theory of analogy-making as the essence of human thought. By substituting "stimuli" of any kind for "utterances" in this model, we obtain a mechanism as easily applicable to non-verbal communication -- slapstick, cartoons -- and we propose it describes the necessary and sufficient conditions for a communicative act in any modality to carry humour.

Submitted 11 October, 2023; originally announced October 2023.

Comments: 24 pages, 8 figures

ACM Class: I.2.m; J.5

Journal ref: HUMOR: International Journal of Humor Research, vol.36, no.4, 2023, pp. 529-565

arXiv:2310.06572 [pdf, other]

doi 10.1016/j.engappai.2024.107876

Deep Learning reconstruction with uncertainty estimation for $γ$ photon interaction in fast scintillator detectors

Authors: Geoffrey Daniel, Mohamed Bahi Yahiaoui, Claude Comtat, Sebastien Jan, Olga Kochebina, Jean-Marc Martinez, Viktoriya Sergeyeva, Viatcheslav Sharyy, Chi-Hsun Sung, Dominique Yvon

Abstract: This article presents a physics-informed deep learning method for the quantitative estimation of the spatial coordinates of gamma interactions within a monolithic scintillator, with a focus on Positron Emission Tomography (PET) imaging. A Density Neural Network approach is designed to estimate the 2-dimensional gamma photon interaction coordinates in a fast lead tungstate (PbWO4) monolithic scintillator detector. We introduce a custom loss function to estimate the inherent uncertainties associated with the reconstruction process and to incorporate the physical constraints of the detector. This unique combination allows for more robust and reliable position estimations, and the obtained results demonstrate the effectiveness of the proposed approach and highlight the significant benefits of the uncertainty estimation. We discuss its potential impact on improving PET imaging quality and show how the results can be used to improve the exploitation of the model, to bring benefits to the application, and to evaluate the validity of a given prediction and its associated uncertainties. Importantly, our proposed methodology extends beyond this specific use case, as it can be generalized to other applications beyond PET imaging.

Submitted 10 October, 2023; originally announced October 2023.

Comments: Submitted to Artificial Intelligence

Journal ref: Engineering Applications of Artificial Intelligence, Volume 131, 2024, 107876

arXiv:2309.12265 [pdf, ps, other]

doi 10.46298/dmtcs.13113

Cost-sharing in Parking Games

Authors: Jennifer Elder, Pamela E. Harris, Jan Kretschmann, J. Carlos Martínez Mori

Abstract: In this paper, we study the total displacement statistic of parking functions from the perspective of cooperative game theory. We introduce parking games, which are coalitional cost-sharing games in characteristic function form derived from the total displacement statistic. We show that parking games are supermodular cost-sharing games, indicating that cooperation is difficult (i.e., their core is empty). Next, we study their Shapley value, which formalizes a notion of "fair" cost-sharing and amounts to charging each car for its expected marginal displacement under a random arrival order. Our main contribution is a polynomial-time algorithm to compute the Shapley value of parking games, in contrast with known hardness results on computing the Shapley value of arbitrary games. The algorithm leverages the permutation-invariance of total displacement, combinatorial enumeration, and dynamic programming. We conclude with open questions around an alternative solution concept for supermodular cost-sharing games and connections to other areas in combinatorics.

Submitted 16 September, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: 16 pages

MSC Class: 05A05; 91A12; 91A46

Journal ref: Discrete Mathematics & Theoretical Computer Science, vol. 26:3, Combinatorics (November 4, 2024) dmtcs:13113

arXiv:2305.13452 [pdf, other]

Measuring and Modeling Physical Intrinsic Motivation

Authors: Julio Martinez, Felix Binder, Haoliang Wang, Nick Haber, Judith Fan, Daniel L. K. Yamins

Abstract: Humans are interactive agents driven to seek out situations with interesting physical dynamics. Here we formalize the functional form of physical intrinsic motivation. We first collect ratings of how interesting humans find a variety of physics scenarios. We then model human interestingness responses by implementing various hypotheses of intrinsic motivation, ranging from models that rely on simple scene features to models that depend on forward physics prediction. We find that the single best predictor of human responses is adversarial reward, a model derived from physical prediction loss. We also find that simple scene feature models do not generalize their prediction of human responses across all scenarios. Finally, linearly combining the adversarial model with the number of collisions in a scene leads to the greatest improvement in predictivity of human responses, suggesting humans are driven towards scenarios that result in high information gain and physical activity.

Submitted 7 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: 6 pages, 5 figures, accepted to CogSci 2023 with full paper publication in the proceedings

arXiv:2302.13961 [pdf, other]

Soft labelling for semantic segmentation: Bringing coherence to label down-sampling

Authors: Roberto Alcover-Couso, Marcos Escudero-Vinolo, Juan C. SanMiguel, Jose M. Martinez

Abstract: In semantic segmentation, training data down-sampling is commonly performed due to limited resources, the need to adapt image size to the model input, or to improve data augmentation. This down-sampling typically employs different strategies for the image data and the annotated labels. Such discrepancy leads to mismatches between the down-sampled color and label images. Hence, the training performance significantly decreases as the down-sampling factor increases. In this paper, we bring together the down-sampling strategies for the image data and the training labels. To that aim, we propose a novel framework for label down-sampling via soft-labeling that better conserves label information after down-sampling, fully aligning soft-labels with the image data to keep the distribution of the sampled pixels. This proposal also produces reliable annotations for under-represented semantic classes. Altogether, it allows training competitive models at lower resolutions. Experiments show that the proposal outperforms other down-sampling strategies. Moreover, state-of-the-art performance is achieved for reference benchmarks while employing significantly fewer computational resources than foremost approaches. This proposal enables competitive research for semantic segmentation under resource constraints.

Submitted 19 February, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

arXiv:2302.06584 [pdf, other]

Thermodynamic AI and the fluctuation frontier

Authors: Patrick J. Coles, Collin Szczepanski, Denis Melanson, Kaelan Donatella, Antonio J. Martinez, Faris Sbahi

Abstract: Many Artificial Intelligence (AI) algorithms are inspired by physics and employ stochastic fluctuations. We connect these physics-inspired AI algorithms by unifying them under a single mathematical framework that we call Thermodynamic AI. Seemingly disparate algorithmic classes can be described by this framework, for example, (1) Generative diffusion models, (2) Bayesian neural networks, (3) Monte Carlo sampling and (4) Simulated annealing. Such Thermodynamic AI algorithms are currently run on digital hardware, ultimately limiting their scalability and overall potential. Stochastic fluctuations naturally occur in physical thermodynamic systems, and such fluctuations can be viewed as a computational resource. Hence, we propose a novel computing paradigm, where software and hardware become inseparable. Our algorithmic unification allows us to identify a single full-stack paradigm, involving Thermodynamic AI hardware, that could accelerate such algorithms. We contrast Thermodynamic AI hardware with quantum computing where noise is a roadblock rather than a resource. Thermodynamic AI hardware can be viewed as a novel form of computing, since it uses a novel fundamental building block. We identify stochastic bits (s-bits) and stochastic modes (s-modes) as the respective building blocks for discrete and continuous Thermodynamic AI hardware. In addition to these stochastic units, Thermodynamic AI hardware employs a Maxwell's demon device that guides the system to produce non-trivial states.
We provide a few simple physical architectures for building these devices and we develop a formalism for programming the hardware via gate sequences. We hope to stimulate discussion around this new computing paradigm. Beyond acceleration, we believe it will impact the design of both hardware and algorithms, while also deepening our understanding of the connection between physics and intelligence.

Submitted 13 June, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

Comments: 47 pages, 18 figures, Updated authors

arXiv:2301.10132 [pdf, other]

doi 10.1103/PhysRevA.108.032602

The superadditivity effects of quantum capacity decrease with the dimension for qudit depolarizing channels

Authors: Josu Etxezarreta Martinez, Antonio deMarti iOlius, Pedro M. Crespo

Abstract: Quantum channel capacity is a fundamental quantity for understanding how well quantum information can be transmitted or corrected when subjected to noise. However, it is generally not known how to compute such quantities, since the quantum channel coherent information is not additive for all channels, implying that it must be maximized over an unbounded number of channel uses. This leads to the phenomenon known as superadditivity, which refers to the fact that the regularized coherent information of $n$ channel uses exceeds one-shot coherent information. In this article, we study how the gain in quantum capacity of qudit depolarizing channels relates to the dimension of the systems considered. We make use of an argument based on the no-cloning bound to prove that the possible superadditive effects decrease as a function of the dimension for this family of channels. In addition, we prove that the capacity of the qudit depolarizing channel coincides with the coherent information when $d\rightarrow\infty$. We also discuss the private classical capacity and obtain similar results. We conclude that when high dimensional qudits experiencing depolarizing noise are considered, the coherent information of the channel is not only an achievable rate but essentially the maximum possible rate for any quantum block code.

Submitted 31 August, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

Comments: 10 pages, 2 figures

Journal ref: Phys. Rev. A 108, 032602 (2023)

arXiv:2211.15538 [pdf, other]

Graph Convolutional Network for Multi-Target Multi-Camera Vehicle Tracking

Authors: Elena Luna, Juan Carlos San Miguel, José María Martínez, Marcos Escudero-Viñolo

Abstract: This letter focuses on the task of Multi-Target Multi-Camera vehicle tracking. We propose to associate single-camera trajectories into multi-camera global trajectories by training a Graph Convolutional Network. Our approach simultaneously processes all cameras, providing a global solution, and it is also robust to large camera unsynchronizations. Furthermore, we design a new loss function to deal with class imbalance. Our proposal outperforms the related work, showing better generalization without requiring ad-hoc manual annotations or thresholds, unlike the compared approaches.

Submitted 28 November, 2022; originally announced November 2022.

arXiv:2210.09184 [pdf, other]

Packed-Ensembles for Efficient Uncertainty Estimation

Authors: Olivier Laurent, Adrien Lafage, Enzo Tartaglione, Geoffrey Daniel, Jean-Marc Martinez, Andrei Bursuc, Gianni Franchi

Abstract: Deep Ensembles (DE) are a prominent approach for achieving excellent performance on key metrics such as accuracy, calibration, uncertainty estimation, and out-of-distribution detection. However, hardware limitations of real-world systems constrain to smaller ensembles and lower-capacity networks, significantly deteriorating their performance and properties. We introduce Packed-Ensembles (PE), a strategy to design and train lightweight structured ensembles by carefully modulating the dimension of their encoding space. We leverage grouped convolutions to parallelize the ensemble into a single shared backbone and forward pass to improve training and inference speeds. PE is designed to operate within the memory limits of a standard neural network. Our extensive research indicates that PE accurately preserves the properties of DE, such as diversity, and performs equally well in terms of accuracy, calibration, out-of-distribution detection, and robustness to distribution shift. We make our code available at https://github.com/ENSTA-U2IS/torch-uncertainty.

Submitted 27 April, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

Comments: Published as a conference paper at ICLR 2023 (notable 25%)

arXiv:2207.13991 [pdf]

CoNet: Borderless and decentralized server cooperation in edge computing

Authors: Ning Li, Xin Yuan, Zhaoxin Zhang, Jose Fernan Martinez

Abstract: In edge computing (EC), system performance can be improved greatly by offloading tasks to edge servers or the remote cloud. However, since the traffic distribution in EC is heterogeneous and dynamic, it is difficult for an individual edge server to provide satisfactory computation service anytime and anywhere. This issue has motivated researchers to study the cooperation between edge servers. Previous server cooperation algorithms have disadvantages because the cooperation region is limited to one hop. However, the performance of EC can be improved further by relaxing the restriction on the cooperation region. Even though some works have extended the cooperation region to multiple hops, they fail to support task offloading, which is one of the core issues of edge computing. Therefore, we propose a new decentralized and borderless server cooperation algorithm for edge computing which takes task offloading strategy into account, named CoNet. In CoNet, the cooperation region is not limited. Each server forms its own basic cooperation unit (BCU) and calculates its announced capability based on the BCU. The server's capability, the processing delay, and the task and calculation result forwarding delay are considered during the calculation. The task division strategy is based on the real capability of the host-server and the announced capability of the cooperation-servers. This cooperation process is recursive and terminates once the terminal condition is satisfied. The simulation results demonstrate the advantages of CoNet over previous works.

Submitted 28 July, 2022; originally announced July 2022.

arXiv:2207.01323 [pdf, other]

Computer vision application for improved product traceability in the granite manufacturing industry

Authors: Xurxo Rigueira, Javier Martinez, Maria Araujo, Antonio Recaman

Abstract: The traceability of granite blocks consists in identifying each block with a finite number of color bands which represent a numerical code. This code has to be read several times throughout the manufacturing process, but its accuracy is subject to human errors, causing faults in the traceability system. A computer vision system is presented to address this problem through color detection and the decryption of the associated code. The system developed makes use of color space transformations and several thresholds for the isolation of the colors. Computer vision methods are implemented, along with contour detection procedures for color identification. Lastly, the analysis of geometrical features is used to decrypt the color code captured. The proposed algorithm is trained on a set of 109 pictures taken in different environmental conditions and validated on a set of 21 images. The outcome shows promising results with an accuracy rate of 75.00% in the validation process. Therefore, the application presented can help employees reduce the number of mistakes in product tracking.

Submitted 4 July, 2022; originally announced July 2022.

MSC Class: 65D19 ACM Class: I.4

arXiv:2206.04663 [pdf, other]

Provably efficient variational generative modeling of quantum many-body systems via quantum-probabilistic information geometry

Authors: Faris M. Sbahi, Antonio J. Martinez, Sahil Patel, Dmitri Saberi, Jae Hyeon Yoo, Geoffrey Roeder, Guillaume Verdon

Abstract: The dual tasks of quantum Hamiltonian learning and quantum Gibbs sampling are relevant to many important problems in physics and chemistry. In the low temperature regime, algorithms for these tasks often suffer from intractabilities, for example from poor sample- or time-complexity. With the aim of addressing such intractabilities, we introduce a generalization of quantum natural gradient descent to parameterized mixed states, as well as provide a robust first-order approximating algorithm, Quantum-Probabilistic Mirror Descent. We prove data sample efficiency for the dual tasks using tools from information geometry and quantum metrology, thus generalizing the seminal result of classical Fisher efficiency to a variational quantum algorithm for the first time. Our approaches extend previously sample-efficient techniques to allow for flexibility in model choice, including to spectrally-decomposed models like Quantum Hamiltonian-Based Models, which may circumvent intractable time complexities. Our first-order algorithm is derived using a novel quantum generalization of the classical mirror descent duality. Both results require a special choice of metric, namely, the Bogoliubov-Kubo-Mori metric. To test our proposed algorithms numerically, we compare their performance to existing baselines on the task of quantum Gibbs sampling for the transverse field Ising model. Finally, we propose an initialization strategy leveraging geometric locality for the modelling of sequences of states such as those arising from quantum-stochastic processes.
We demonstrate its effectiveness empirically for both real and imaginary time evolution while defining a broader class of potential applications.

Submitted 9 June, 2022; originally announced June 2022.

Comments: 24 + 49 pages, 5 + 4 figures

arXiv:2204.12918 [pdf, other]

We're Not Gonna Break It! Consistency-Preserving Operators for Efficient Product Line Configuration

Authors: Jose-Miguel Horcas, Daniel Strüber, Alexandru Burdusel, Jabier Martinez, Steffen Zschaler

Abstract: When configuring a software product line, finding a good trade-off between multiple orthogonal quality concerns is a challenging multi-objective optimisation problem. State-of-the-art solutions based on search-based techniques create invalid configurations in intermediate steps, requiring additional repair actions that reduce the efficiency of the search. In this work, we introduce consistency-preserving configuration operators (CPCOs)--genetic operators that maintain valid configurations throughout the entire search. CPCOs bundle coherent sets of changes: the activation or deactivation of a particular feature together with other (de)activations that are needed to preserve validity. In our evaluation, our instantiation of the IBEA algorithm with CPCOs outperforms two state-of-the-art tools for optimal product line configuration in terms of both speed and solution quality. The improvements are especially pronounced in large product lines with thousands of features.

Submitted 27 April, 2022; originally announced April 2022.

Comments: Accepted for publication in IEEE Transactions on Software Engineering (TSE). 16 pages, 10 figures; includes an appendix with 8 additional pages and 4 additional figures

arXiv:2204.10476 [pdf]

doi 10.1016/j.jbi.2007.01.001

Global Mapping of Gene/Protein Interactions in PubMed Abstracts: A Framework and an Experiment with P53 Interactions

Authors: Xin Li, Hsinchun Chen, Zan Huang, Hua Su, Jesse D. Martinez

Abstract: Gene/protein interactions provide critical information for a thorough understanding of cellular processes. Recently, considerable interest and effort has been focused on the construction and analysis of genome-wide gene networks. The large body of biomedical literature is an important source of gene/protein interaction information. Recent advances in text mining tools have made it possible to automatically extract such documented interactions from free-text literature. In this paper, we propose a comprehensive framework for constructing and analyzing large-scale gene functional networks based on the gene/protein interactions extracted from biomedical literature repositories using text mining tools. Our proposed framework consists of analyses of the network topology, network topology-gene function relationship, and temporal network evolution to distill valuable information embedded in the gene functional interactions in literature. We demonstrate the application of the proposed framework using a testbed of P53-related PubMed abstracts, which shows that literature-based P53 networks exhibit small-world and scale-free properties. We also found that high degree genes in the literature-based networks have a high probability of appearing in the manually curated database and genes in the same pathway tend to form local clusters in our literature-based networks. Temporal analysis showed that genes interacting with many other genes tend to be involved in a large number of newly discovered interactions.

Submitted 21 April, 2022; originally announced April 2022.

Journal ref: Journal of biomedical informatics, 2007

arXiv:2202.07127 [pdf, other]

Computing with Modular Robots

Authors: Genaro J. Martinez, Andrew Adamatzky, Ricardo Q. Figueroa, Eric Schweikardt, Dmitry A. Zaitsev, Ivan Zelinka, Luz N. Oliva-Moreno

Abstract: Propagating patterns are used to transfer and process information in chemical and physical prototypes of unconventional computing devices. Logical values are represented by fronts of traveling diffusive, trigger or phase waves. We apply this concept of pattern based computation to develop experimental prototypes of computing circuits implemented in small modular robots. In the experimental prototypes the modular robots Cubelets are concatenated into channels and junctions. The structures developed by Cubelets propagate signals in parallel and asynchronously. The approach is illustrated with a working circuit of a one-bit full adder. Complementarily, a formalization of these constructions is developed across Sleptsov nets. Finally, a perspective to swarm dynamics is discussed.

Submitted 14 February, 2022; originally announced February 2022.

Comments: 33 pages, 23 figures, 5 tables

Journal ref: International Journal of Unconventional Computing, 17(1-2), 31-60, 2022

arXiv:2202.03212 [pdf, other]

Introducing explainable supervised machine learning into interactive feedback loops for statistical production system

Authors: Carlos Mougan, George Kanellos, Johannes Micheler, Jose Martinez, Thomas Gottron

Abstract: Statistical production systems cover multiple steps from the collection, aggregation, and integration of data to tasks like data quality assurance and dissemination. While the context of data quality assurance is one of the most promising fields for applying machine learning, the lack of curated and labeled training data is often a limiting factor. The statistical production system for the Centralised Securities Database features an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by data quality managers at National Central Banks. The quality assurance feedback loop is based on a set of rule-based checks for raising exceptions, upon which the user either confirms the data or corrects an actual error. In this paper we use the information received from this feedback loop to optimize the exceptions presented to the National Central Banks thereby improving the quality of exceptions generated and the time consumed on the system by the users authenticating those exceptions. For this approach we make use of explainable supervised machine learning to (a) identify the types of exceptions and (b) to prioritize which exceptions are more likely to require an intervention or correction by the NCBs. Furthermore, we provide an explainable AI taxonomy aiming to identify the different explainable AI needs that arose during the project.

Submitted 18 February, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: Irving Fisher Committee (IFC) - Bank of Italy workshop on Data science in central banking: Applications and tools. arXiv admin note: text overlap with arXiv:2107.08045

arXiv:2201.10985 [pdf, other]

Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data

Authors: Alexander Quevedo, Abraham Sánchez, Raul Nancláres, Diana P. Montoya, Juan Pacho, Jorge Martínez, E. Ulises Moya-Sánchez

Abstract: The understanding of global climate change, agriculture resilience, and deforestation control relies on timely observations of Land Use and Land Cover Change (LULCC). Recently, some deep learning (DL) methods have been adapted to automatically classify Land Cover (LC) for global and homogeneous data. However, most of these DL models cannot be applied effectively to real-world data, i.e., data with a large number of classes, multi-seasonal data, diverse climate regions, highly imbalanced labels, and low spatial resolution. In this work, we present our novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for LC classification and analysis, designed to handle these problems for the Jalisco region. In contrast to the global approaches, the regional data provide the context-specificity that policymakers require to plan land use and management, conservation areas, or ecosystem services. We combine three real-world open data sources to obtain 13 channels. Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar ones; as a result, the test accuracy increases from 73% to 83%. We hope that this research helps other regional groups with limited data sources or computational resources to attain the United Nations Sustainable Development Goal (SDG) concerning Life on Land.

Submitted 26 January, 2022; originally announced January 2022.

Comments: 12 pages

Showing 1–50 of 116 results for author: Martínez, J