-
Relational Graph Transformer
Authors:
Vijay Prakash Dwivedi,
Sri Jaladi,
Yangyi Shen,
Federico López,
Charilaos I. Kanatsoulis,
Rishi Puri,
Matthias Fey,
Jure Leskovec
Abstract:
Relational Deep Learning (RDL) is a promising approach for building state-of-the-art predictive models on multi-table relational data by representing it as a heterogeneous temporal graph. However, commonly used Graph Neural Network models suffer from fundamental limitations in capturing complex structural patterns and long-range dependencies that are inherent in relational data. While Graph Transformers have emerged as powerful alternatives to GNNs on general graphs, applying them to relational entity graphs presents unique challenges: (i) Traditional positional encodings fail to generalize to massive, heterogeneous graphs; (ii) existing architectures cannot model the temporal dynamics and schema constraints of relational data; (iii) existing tokenization schemes lose critical structural information. Here we introduce the Relational Graph Transformer (RelGT), the first graph transformer architecture designed specifically for relational tables. RelGT employs a novel multi-element tokenization strategy that decomposes each node into five components (features, type, hop distance, time, and local structure), enabling efficient encoding of heterogeneity, temporality, and topology without expensive precomputation. Our architecture combines local attention over sampled subgraphs with global attention to learnable centroids, incorporating both local and database-wide representations. Across 21 tasks from the RelBench benchmark, RelGT consistently matches or outperforms GNN baselines by up to 18%, establishing Graph Transformers as a powerful architecture for Relational Deep Learning.
Submitted 16 May, 2025;
originally announced May 2025.
-
REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites
Authors:
Divyansh Garg,
Shaun VanWeelden,
Diego Caples,
Andis Draguns,
Nikil Ravi,
Pranav Putta,
Naman Garg,
Tomas Abraham,
Michael Lara,
Federico Lopez,
James Liu,
Atharva Gundawar,
Prannay Hebbar,
Youngchul Joo,
Jindong Gu,
Charles London,
Christian Schroeder de Witt,
Sumeet Motwani
Abstract:
We introduce REAL, a benchmark and framework for multi-turn agent evaluations on deterministic simulations of real-world websites. REAL comprises high-fidelity, deterministic replicas of 11 widely-used websites across domains such as e-commerce, travel, communication, and professional networking. We also release a benchmark consisting of 112 practical tasks that mirror everyday complex user interactions requiring both accurate information retrieval and state-changing actions. All interactions occur within this fully controlled setting, eliminating safety risks and enabling robust, reproducible evaluation of agent capability and reliability. Our novel evaluation framework combines programmatic checks of website state for action-based tasks with rubric-guided LLM-based judgments for information retrieval. The framework supports both open-source and proprietary agent systems through a flexible evaluation harness that accommodates black-box commands within browser environments, allowing research labs to test agentic systems without modification. Our empirical results show that frontier language models achieve at most a 41% success rate on REAL, highlighting critical gaps in autonomous web navigation and task completion capabilities. Our framework supports easy integration of new tasks, reproducible evaluation, and scalable post-training data generation, marking a significant step forward in evaluating and advancing agent capabilities.
Submitted 17 April, 2025; v1 submitted 15 April, 2025;
originally announced April 2025.
-
A multitask transformer to sign language translation using motion gesture primitives
Authors:
Fredy Alejandro Mendoza López,
Jefferson Rodriguez,
Fabio Martínez
Abstract:
The absence of effective communication with the deaf population represents the main social gap in this community. Furthermore, sign language, the main communication tool of the deaf community, is unlettered, i.e., there is no formal written representation. Consequently, the main challenge today is the automatic translation between spatiotemporal sign representations and natural-language text. Recent approaches are based on encoder-decoder architectures, where the most relevant strategies integrate attention modules to enhance non-linear correspondences. Moreover, many of these approaches require complex training and architectural schemes to achieve reasonable predictions because of the absence of intermediate text projections. However, they are still limited by the redundant background information of the video sequences. This work introduces a multitask transformer architecture that includes a gloss learning representation to achieve a more suitable translation. The proposed approach also includes a dense motion representation that enhances gestures and includes kinematic information, a key component in sign language. This representation makes it possible to avoid background information and exploit the geometry of the signs. In addition, it includes spatiotemporal representations that facilitate the alignment between gestures and glosses as an intermediate textual representation. The proposed approach outperforms the state of the art on the CoL-SLTD dataset, achieving a BLEU-4 of 72.64% in split 1 and a BLEU-4 of 14.64% in split 2. Additionally, the strategy was validated on the RWTH-PHOENIX-Weather 2014 T dataset, achieving a competitive BLEU-4 of 11.58%.
Submitted 25 March, 2025;
originally announced March 2025.
-
Hierarchical Residuals Exploit Brain-Inspired Compositionality
Authors:
Francisco M. López,
Jochen Triesch
Abstract:
We present Hierarchical Residual Networks (HiResNets), deep convolutional neural networks with long-range residual connections between layers at different hierarchical levels. HiResNets draw inspiration from the organization of the mammalian brain by replicating the direct connections from subcortical areas to the entire cortical hierarchy. We show that the inclusion of hierarchical residuals in several architectures, including ResNets, results in a boost in accuracy and faster learning. A detailed analysis of our models reveals that they perform hierarchical compositionality by learning feature maps relative to the compressed representations provided by the skip connections.
Submitted 21 February, 2025;
originally announced February 2025.
-
Targeted incentives for social tipping in heterogeneous networked populations
Authors:
Dhruv Mittal,
Fátima González-Novo López,
Sara Constantino,
Shaul Shalvi,
Xiaojie Chen,
Vítor V. Vasconcelos
Abstract:
Many societal challenges, such as climate change or disease outbreaks, require coordinated behavioral changes. For many behaviors, the tendency of individuals to adhere to social norms can reinforce the status quo. However, these same social processes can also result in rapid, self-reinforcing change. Interventions may be strategically targeted to initiate endogenous social change processes, often referred to as social tipping. While recent research has considered how the size and targeting of such interventions impact their effectiveness at bringing about change, it tends to overlook constraints faced by policymakers, including the cost, speed, and distributional consequences of interventions. To address this complexity, we introduce a game-theoretic framework that includes heterogeneous agents and networks of local influence. We implement various targeting heuristics based on information about individual preferences and commonly used local network properties to identify individuals to incentivize. Analytical and simulation results suggest that there is a trade-off between preventing backsliding among targeted individuals and promoting change among non-targeted individuals. Thus, where the change is initiated in the population and the direction in which it propagates are essential to the effectiveness of interventions. We identify cost-optimal strategies under different scenarios, such as varying levels of resistance to change, preference heterogeneity, and homophily. These results provide insights that can be experimentally tested and help policymakers to better direct incentives.
Submitted 9 March, 2025; v1 submitted 23 January, 2025;
originally announced January 2025.
-
Digital twins to alleviate the need for real field data in vision-based vehicle speed detection systems
Authors:
Antonio Hernández Martínez,
Iván García Daza,
Carlos Fernández López,
David Fernández Llorca
Abstract:
Accurate vision-based speed estimation is much more cost-effective than traditional methods based on radar or LiDAR. However, it is also challenging due to the limitations of perspective projection on a discrete sensor, as well as the high sensitivity to calibration, lighting and weather conditions. Interestingly, deep learning approaches (which dominate the field of computer vision) are very limited in this context due to the lack of available data. Indeed, obtaining video sequences of real road traffic with accurate speed values associated with each vehicle is very complex and costly, and the number of available datasets is very limited. Recently, some approaches have focused on the use of synthetic data. However, it is still unclear how models trained on synthetic data can be effectively applied to real-world conditions. In this work, we propose the use of digital twins using the CARLA simulator to generate a large dataset representative of a specific real-world camera. The synthetic dataset contains a large variability of vehicle types, colours, speeds, lighting and weather conditions. A 3D CNN model is trained on the digital twin and tested on the real sequences. Unlike previous approaches that generate multi-camera sequences, we found that the gap between the real and the virtual conditions is a key factor in obtaining low speed estimation errors. Even with a preliminary approach, the mean absolute error obtained remains below 3 km/h.
Submitted 11 July, 2024;
originally announced July 2024.
-
"If the Machine Is As Good As Me, Then What Use Am I?" -- How the Use of ChatGPT Changes Young Professionals' Perception of Productivity and Accomplishment
Authors:
Charlotte Kobiella,
Yarhy Said Flores López,
Fiona Draxler,
Albrecht Schmidt
Abstract:
Large language models (LLMs) like ChatGPT have been widely adopted in work contexts. We explore the impact of ChatGPT on young professionals' perception of productivity and sense of accomplishment. We collected LLMs' main use cases in knowledge work through a preliminary study, which served as the basis for a two-week diary study with 21 young professionals reflecting on their ChatGPT use. Findings indicate that ChatGPT enhanced some participants' perceptions of productivity and accomplishment by enabling greater creative output and satisfaction from efficient tool utilization. Others experienced decreased perceived productivity and accomplishment, driven by a diminished sense of ownership, perceived lack of challenge, and mediocre results. We found that the suitability of task delegation to ChatGPT varies strongly depending on the task nature. It's especially suitable for comprehending broad subject domains, generating creative solutions, and uncovering new information. It's less suitable for research tasks due to hallucinations, which necessitate extensive validation.
Submitted 18 April, 2024;
originally announced April 2024.
-
Self-Supervised Learning of Color Constancy
Authors:
Markus R. Ernst,
Francisco M. López,
Arthur Aubret,
Roland W. Fleming,
Jochen Triesch
Abstract:
Color constancy (CC) describes the ability of the visual system to perceive an object as having a relatively constant color despite changes in lighting conditions. While CC and its limitations have been carefully characterized in humans, it is still unclear how the visual system acquires this ability during development. Here, we present a first study showing that CC develops in a neural network trained in a self-supervised manner through an invariance learning objective. During learning, objects are presented under changing illuminations, while the network aims to map subsequent views of the same object onto close-by latent representations. This gives rise to representations that are largely invariant to the illumination conditions, offering a plausible example of how CC could emerge during human cognitive development via a form of self-supervised learning.
Submitted 11 April, 2024;
originally announced April 2024.
-
MIMo: A Multi-Modal Infant Model for Studying Cognitive Development
Authors:
Dominik Mattern,
Pierre Schumacher,
Francisco M. López,
Marcel C. Raabe,
Markus R. Ernst,
Arthur Aubret,
Jochen Triesch
Abstract:
Human intelligence and human consciousness emerge gradually during the process of cognitive development. Understanding this development is an essential aspect of understanding the human mind and may facilitate the construction of artificial minds with similar properties. Importantly, human cognitive development relies on embodied interactions with the physical and social environment, which is perceived via complementary sensory modalities. These interactions allow the developing mind to probe the causal structure of the world. This is in stark contrast to common machine learning approaches, e.g., for large language models, which are merely passively "digesting" large amounts of training data, but are not in control of their sensory inputs. However, computational modeling of the kind of self-determined embodied interactions that lead to human intelligence and consciousness is a formidable challenge. Here we present MIMo, an open-source multi-modal infant model for studying early cognitive development through computer simulations. MIMo's body is modeled after an 18-month-old child with detailed five-fingered hands. MIMo perceives its surroundings via binocular vision, a vestibular system, proprioception, and touch perception through a full-body virtual skin, while two different actuation models allow control of its body. We describe the design and interfaces of MIMo and provide examples illustrating its use. All code is available at https://github.com/trieschlab/MIMo .
Submitted 7 December, 2023;
originally announced December 2023.
-
Robust Wake-Up Word Detection by Two-stage Multi-resolution Ensembles
Authors:
Fernando López,
Jordi Luque,
Carlos Segura,
Pablo Gómez
Abstract:
Voice-based interfaces rely on a wake-up word mechanism to initiate communication with devices. However, achieving a robust, energy-efficient, and fast detection remains a challenge. This paper addresses these real production needs by enhancing data with temporal alignments and using detection based on two phases with multi-resolution. It employs two models: a lightweight on-device model for real-time processing of the audio stream and a verification model on the server-side, which is an ensemble of heterogeneous architectures that refine detection. This scheme allows the optimization of two operating points. To protect privacy, audio features are sent to the cloud instead of raw audio. The study investigated different parametric configurations for feature extraction to select one for on-device detection and another for the verification model. Furthermore, thirteen different audio classifiers were compared in terms of performance and inference time. The proposed ensemble outperforms our stronger classifier in every noise condition.
Submitted 17 October, 2023;
originally announced October 2023.
-
Modeling Graphs Beyond Hyperbolic: Graph Neural Networks in Symmetric Positive Definite Matrices
Authors:
Wei Zhao,
Federico Lopez,
J. Maxwell Riestenberg,
Michael Strube,
Diaaeldin Taha,
Steve Trettel
Abstract:
Recent research has shown that alignment between the structure of graph data and the geometry of an embedding space is crucial for learning high-quality representations of the data. The uniform geometry of Euclidean and hyperbolic spaces allows for representing graphs with uniform geometric and topological features, such as grids and hierarchies, with minimal distortion. However, real-world graph data is characterized by multiple types of geometric and topological features, necessitating more sophisticated geometric embedding spaces. In this work, we utilize the Riemannian symmetric space of symmetric positive definite matrices (SPD) to construct graph neural networks that can robustly handle complex graphs. To do this, we develop an innovative library that leverages the SPD gyrocalculus tools \cite{lopez2021gyroSPD} to implement the building blocks of five popular graph neural networks in SPD. Experimental results demonstrate that our graph neural networks in SPD substantially outperform their counterparts in Euclidean and hyperbolic spaces, as well as the Cartesian product thereof, on complex graphs for node and graph classification tasks. We release the library and datasets at https://github.com/andyweizhao/SPD4GNNs .
Submitted 24 June, 2023;
originally announced June 2023.
-
MaskedFusion360: Reconstruct LiDAR Data by Querying Camera Features
Authors:
Royden Wagner,
Marvin Klemp,
Carlos Fernandez Lopez
Abstract:
In self-driving applications, LiDAR data provides accurate information about distances in 3D but lacks the semantic richness of camera data. Therefore, state-of-the-art methods for perception in urban scenes fuse data from both sensor types. In this work, we introduce a novel self-supervised method to fuse LiDAR and camera data for self-driving applications. We build upon masked autoencoders (MAEs) and train deep learning models to reconstruct masked LiDAR data from fused LiDAR and camera features. In contrast to related methods that use birds-eye-view representations, we fuse features from dense spherical LiDAR projections and features from fish-eye camera crops with a similar field of view. Therefore, we reduce the learned spatial transformations to moderate perspective transformations and do not require additional modules to generate dense LiDAR representations. Code is available at: https://github.com/KIT-MRT/masked-fusion-360
Submitted 12 June, 2023;
originally announced June 2023.
-
Differentiable Clustering with Perturbed Spanning Forests
Authors:
Lawrence Stewart,
Francis S Bach,
Felipe Llinares López,
Quentin Berthet
Abstract:
We introduce a differentiable clustering method based on stochastic perturbations of minimum-weight spanning forests. This allows us to include clustering in end-to-end trainable pipelines, with efficient gradients. We show that our method performs well even in difficult settings, such as data sets with high noise and challenging geometries. We also formulate an ad hoc loss to efficiently learn from partial clustering data using this operation. We demonstrate its performance on several data sets for supervised and semi-supervised tasks.
Submitted 6 November, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
On the Parenthesisations of Matrix Chains: All are Useful, Few Are Essential
Authors:
Francisco López,
Lars Karlsson,
Paolo Bientinesi
Abstract:
The product of a matrix chain consisting of $n$ matrices can be computed in $C_{n-1}$ (Catalan's number) different ways, each identified by a distinct parenthesisation of the chain. The best algorithm to select a parenthesisation that minimises the cost runs in $O(n \log n)$ time. Approximate algorithms run in $O(n)$ time and find solutions that are guaranteed to be within a certain factor from optimal; the best factor is currently $1.155$. In this article, we first prove two results that characterise different parenthesisations, and then use those results to improve on the best known approximation algorithms. Specifically, we show that (a) each parenthesisation is optimal somewhere in the problem domain, and (b) exactly $n + 1$ parenthesisations are essential in the sense that the removal of any one of them causes an unbounded penalty for an infinite number of problem instances. By focusing on essential parenthesisations, we improve on the best known approximation algorithm and show that the approximation factor is at most $1.143$.
Submitted 7 April, 2025; v1 submitted 30 March, 2023;
originally announced March 2023.
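To make the counting claim in the abstract above concrete, the following is a minimal brute-force sketch that enumerates every parenthesisation of a small chain and its cost. It is not the $O(n \log n)$ algorithm or the approximation algorithm mentioned in the abstract. The cost measure (scalar multiplications, $p \cdot q \cdot r$ for a $p \times q$ by $q \times r$ product) and the names `catalan` and `chain_costs` are assumptions made for illustration only.

```python
from functools import lru_cache
from math import comb


def catalan(k: int) -> int:
    """k-th Catalan number, C_k = binom(2k, k) / (k + 1)."""
    return comb(2 * k, k) // (k + 1)


def chain_costs(dims):
    """Cost of every parenthesisation of a matrix chain.

    dims has n + 1 entries; matrix i has shape dims[i] x dims[i + 1].
    Cost model (an assumption): multiplying a p x q matrix by a q x r
    matrix costs p * q * r scalar multiplications.
    """

    @lru_cache(maxsize=None)
    def costs(i, j):
        # All costs for the sub-chain of matrices i..j (inclusive).
        if i == j:
            return (0,)
        out = []
        for k in range(i, j):  # split between matrix k and matrix k + 1
            top = dims[i] * dims[k + 1] * dims[j + 1]
            for left in costs(i, k):
                for right in costs(k + 1, j):
                    out.append(left + right + top)
        return tuple(out)

    return list(costs(0, len(dims) - 2))


# Three matrices: 10x30, 30x5, 5x60.
all_costs = chain_costs((10, 30, 5, 60))
print(len(all_costs), catalan(2))  # number of parenthesisations vs. C_{n-1}
print(min(all_costs), max(all_costs))
```

The enumeration grows as fast as the Catalan numbers, so this sketch is only useful for very short chains; it is meant to show why the choice of parenthesisation matters, not how to choose one efficiently.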
-
Self-supervised pseudo-colorizing of masked cells
Authors:
Royden Wagner,
Carlos Fernandez Lopez,
Christoph Stiller
Abstract:
Self-supervised learning, which is strikingly referred to as the dark matter of intelligence, is gaining more attention in biomedical applications of deep learning. In this work, we introduce a novel self-supervision objective for the analysis of cells in biomedical microscopy images. We propose training deep learning models to pseudo-colorize masked cells. We use a physics-informed pseudo-spectral colormap that is well suited for colorizing cell topology. Our experiments reveal that approximating semantic segmentation by pseudo-colorization is beneficial for subsequent fine-tuning on cell detection. Inspired by the recent success of masked image modeling, we additionally mask out cell parts and train to reconstruct these parts to further enrich the learned representations. We compare our pre-training method with self-supervised frameworks including contrastive learning (SimCLR), masked autoencoders (MAEs), and edge-based self-supervision. We build upon our previous work and train hybrid models for cell detection, which contain both convolutional and vision transformer modules. Our pre-training method can outperform SimCLR, MAE-like masked image modeling, and edge-based self-supervision when pre-training on a diverse set of six fluorescence microscopy datasets. Code is available at: https://github.com/roydenwa/pseudo-colorize-masked-cells
Submitted 28 August, 2023; v1 submitted 12 February, 2023;
originally announced February 2023.
-
Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation
Authors:
Fernando López,
Jordi Luque
Abstract:
High-quality data labeling from specific domains is costly and time-consuming for humans. In this work, we propose a self-supervised domain adaptation method, based upon an iterative pseudo-forced alignment algorithm. The produced alignments are employed to customize an end-to-end Automatic Speech Recognition (ASR) system and are iteratively refined. The algorithm is fed with frame-wise character posteriors produced by a seed ASR, trained with out-of-domain data, and optimized through a Connectionist Temporal Classification (CTC) loss. The alignments are computed iteratively upon a corpus of broadcast TV. The process is repeated by reducing the quantity of text to be aligned or expanding the alignment window until finding the best possible audio-text alignment. The starting timestamps, or temporal anchors, are produced uniquely based on the confidence score of the last aligned utterance. This score is computed with the paths of the CTC-alignment matrix. With this methodology, no human-revised text references are required. Alignments from long audio files with low-quality transcriptions, like TV captions, are filtered out by confidence score and ready for further ASR adaptation. The obtained results, on both the Spanish RTVE2022 and CommonVoice databases, underpin the feasibility of using CTC-based systems to perform highly accurate audio-text alignments, domain adaptation, and semi-supervised training of end-to-end ASR.
Submitted 15 January, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Guiding the retraining of convolutional neural networks against adversarial inputs
Authors:
Francisco Durán López,
Silverio Martínez-Fernández,
Michael Felderer,
Xavier Franch
Abstract:
Background: When using deep learning models, there are many possible vulnerabilities and some of the most worrying are the adversarial inputs, which can cause wrong decisions with minor perturbations. Therefore, it becomes necessary to retrain these models against adversarial inputs, as part of the software testing process addressing the vulnerability to these inputs. Furthermore, for an energy ef…
▽ More
Background: When using deep learning models, there are many possible vulnerabilities and some of the most worrying are the adversarial inputs, which can cause wrong decisions with minor perturbations. Therefore, it becomes necessary to retrain these models against adversarial inputs, as part of the software testing process addressing the vulnerability to these inputs. Furthermore, for an energy efficient testing and retraining, data scientists need support on which are the best guidance metrics and optimal dataset configurations.
Aims: We examined four guidance metrics for retraining convolutional neural networks and three retraining configurations. Our goal is to improve the models against adversarial inputs regarding accuracy, resource utilization and time from the point of view of a data scientist in the context of image classification.
Method: We conducted an empirical study in two datasets for image classification. We explore: (a) the accuracy, resource utilization and time of retraining convolutional neural networks by ordering new training set by four different guidance metrics (neuron coverage, likelihood-based surprise adequacy, distance-based surprise adequacy and random), (b) the accuracy and resource utilization of retraining convolutional neural networks with three different configurations (from scratch and augmented dataset, using weights and augmented dataset, and using weights and only adversarial inputs).
Results: We reveal that retraining with adversarial inputs from original weights and by ordering with surprise adequacy metrics gives the best model w.r.t. the used metrics.
Conclusions: Although more studies are necessary, we recommend that data scientists use the above configuration and metrics to deal with the vulnerability of deep learning models to adversarial inputs, as they can improve their models against adversarial inputs without using many inputs.
Submitted 12 July, 2022; v1 submitted 8 July, 2022;
originally announced July 2022.
-
FLOPs as a Discriminant for Dense Linear Algebra Algorithms
Authors:
Francisco López,
Lars Karlsson,
Paolo Bientinesi
Abstract:
Expressions that involve matrices and vectors, known as linear algebra expressions, are commonly evaluated through a sequence of invocations to highly optimised kernels provided in libraries such as BLAS and LAPACK. A sequence of kernels represents an algorithm, and in general, because of associativity, algebraic identities, and multiple kernels, one expression can be evaluated via many different algorithms. These algorithms are all mathematically equivalent (i.e., in exact arithmetic, they all compute the same result), but often differ noticeably in terms of execution time. When faced with a decision, high-level languages, libraries, and tools such as Julia, Armadillo, and Linnea choose by selecting the algorithm that minimises the FLOP count. In this paper, we test the validity of the FLOP count as a discriminant for dense linear algebra algorithms, analysing "anomalies": problem instances for which the fastest algorithm does not perform the least number of FLOPs.
To do so, we focused on relatively simple expressions and analysed when and why anomalies occurred. We found that anomalies exist and tend to cluster into large contiguous regions. For one expression anomalies were rare, whereas for the other they were abundant. We conclude that FLOPs is not a sufficiently dependable discriminant even when building algorithms with highly optimised kernels. Plus, most of the anomalies remained as such even after filtering out the inter-kernel cache effects. We conjecture that combining FLOP counts with kernel performance models will significantly improve our ability to choose optimal algorithms.
Submitted 5 July, 2022;
originally announced July 2022.
-
Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors
Authors:
Naser Damer,
César Augusto Fontanillo López,
Meiling Fang,
Noémie Spiller,
Minh Vu Pham,
Fadi Boutros
Abstract:
The main question this work aims at answering is: "can morphing attack detection (MAD) solutions be successfully developed based on synthetic data?". Towards that, this work introduces the first synthetic-based MAD development dataset, namely the Synthetic Morphing Attack Detection Development dataset (SMDD). This dataset is utilized successfully to train three MAD backbones where it proved to lead to high MAD performance, even on completely unknown attack types. Additionally, an essential aspect of this work is the detailed legal analyses of the challenges of using and sharing real biometric data, rendering our proposed SMDD dataset extremely essential. The SMDD dataset, consisting of 30,000 attack and 50,000 bona fide samples, is publicly available for research purposes.
Submitted 19 April, 2022; v1 submitted 13 March, 2022;
originally announced March 2022.
-
iNaturalist citizen science community during City Nature Challenge: new computational approach for analysis of user activity
Authors:
Liubov Tupikina,
Frank Schlosser,
Vadim Voskresenskii,
Katharina Kloppenborg,
Florence Lopez,
Albrecht Mariz,
Anna Mogilevskaja,
Muki Haklay,
Bastian Greshake Tzovaras
Abstract:
Analysing patterns of engagement among citizen science participants can provide important insights into the organisation and practice of individual citizen science projects. In particular, methods from statistics and network science can be used to understand different types of user behaviour and user interactions to help the further implementation and organization of community efforts. Using publicly available data from the iNaturalist community and their yearly City Nature Challenges (CNC) from 2017-2020 as an example, we showcase computational methods to explore the spatio-temporal evolution of this citizen science community that typically interacts in a hybrid offline-online way. In particular, we investigate the user types present in the community along with their interactions, finding significant differences in usage behavior on both the level of engagement and the types of community tasks/roles and how they interact with the network of contributors. We expect that these computational analysis strategies will be useful to gain further understanding of other citizen science communities and projects.
Submitted 5 December, 2021;
originally announced December 2021.
-
Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices
Authors:
Federico López,
Beatrice Pozzetti,
Steve Trettel,
Michael Strube,
Anna Wienhard
Abstract:
We propose the use of the vector-valued distance to compute distances and extract geometric information from the manifold of symmetric positive definite matrices (SPD), and develop gyrovector calculus, constructing analogs of vector space operations in this curved space. We implement these operations and showcase their versatility in the tasks of knowledge graph completion, item recommendation, and question answering. In experiments, the SPD models outperform their equivalents in Euclidean and hyperbolic space. The vector-valued distance allows us to visualize embeddings, showing that the models learn to disentangle representations of positive samples from negative ones.
Submitted 26 October, 2021;
originally announced October 2021.
-
Augmenting the User-Item Graph with Textual Similarity Models
Authors:
Federico López,
Martin Scholz,
Jessica Yung,
Marie Pellat,
Michael Strube,
Lucas Dixon
Abstract:
This paper introduces a simple and effective form of data augmentation for recommender systems. A paraphrase similarity model is applied to widely available textual data, such as reviews and product descriptions, yielding new semantic relations that are added to the user-item graph. This increases the density of the graph without needing further labeled data. The data augmentation is evaluated on a variety of recommendation algorithms, using Euclidean, hyperbolic, and complex spaces, and over three categories of Amazon product reviews with differing characteristics. Results show that the data augmentation technique provides significant improvements to all types of models, with the most pronounced gains for knowledge graph-based recommenders, particularly in cold-start settings, leading to state-of-the-art performance.
Submitted 20 September, 2021;
originally announced September 2021.
-
Symmetric Spaces for Graph Embeddings: A Finsler-Riemannian Approach
Authors:
Federico López,
Beatrice Pozzetti,
Steve Trettel,
Michael Strube,
Anna Wienhard
Abstract:
Learning faithful graph representations as sets of vertex embeddings has become a fundamental intermediary step in a wide range of machine learning applications. We propose the systematic use of symmetric spaces in representation learning, a class encompassing many of the previously used embedding targets. This enables us to introduce a new method, the use of Finsler metrics integrated in a Riemannian optimization scheme, that better adapts to dissimilar structures in the graph. We develop a tool to analyze the embeddings and infer structural properties of the data sets. For implementation, we choose Siegel spaces, a versatile family of symmetric spaces. Our approach outperforms competitive baselines for graph reconstruction tasks on various synthetic and real-world datasets. We further demonstrate its applicability on two downstream tasks, recommender systems and node classification.
Submitted 9 June, 2021;
originally announced June 2021.
-
Hermitian Symmetric Spaces for Graph Embeddings
Authors:
Federico López,
Beatrice Pozzetti,
Steve Trettel,
Anna Wienhard
Abstract:
Learning faithful graph representations as sets of vertex embeddings has become a fundamental intermediary step in a wide range of machine learning applications. The quality of the embeddings is usually determined by how well the geometry of the target space matches the structure of the data. In this work we learn continuous representations of graphs in spaces of symmetric matrices over C. These spaces offer a rich geometry that simultaneously admits hyperbolic and Euclidean subspaces, and are amenable to analysis and explicit computations. We implement an efficient method to learn embeddings and compute distances, and develop the tools to operate with such spaces. The proposed models are able to automatically adapt to very dissimilar arrangements without any a priori estimates of graph features. On various datasets with very diverse structural properties and reconstruction measures, our model ties the results of competitive baselines for geometrically pure graphs and outperforms them for graphs with mixed geometric features, showcasing the versatility of our approach.
Submitted 11 May, 2021;
originally announced May 2021.
-
Assessing deep learning methods for the identification of kidney stones in endoscopic images
Authors:
Francisco Lopez,
Andres Varela,
Oscar Hinojosa,
Mauricio Mendez,
Dinh-Hoan Trinh,
Jonathan ElBeze,
Jacques Hubert,
Vincent Estrade,
Miguel Gonzalez,
Gilberto Ochoa,
Christian Daul
Abstract:
Knowing the type (i.e., the biochemical composition) of kidney stones is crucial to prevent relapses with an appropriate treatment. During ureteroscopies, kidney stones are fragmented, extracted from the urinary tract, and their composition is determined using a morpho-constitutional analysis. This procedure is time consuming (the morpho-constitutional analysis results are only available after some days) and tedious (the fragment extraction lasts up to an hour). Identifying the kidney stone type only with the in-vivo endoscopic images would allow for the dusting of the fragments, while the morpho-constitutional analysis could be avoided. Only a few contributions dealing with the in vivo identification of kidney stones have been published. This paper discusses and compares five classification methods including deep convolutional neural networks (DCNN)-based approaches and traditional (non DCNN-based) ones. Even if the best method is a DCNN approach with a precision and recall of 98% and 97% over four classes, this contribution shows that an XGBoost classifier exploiting well-chosen feature vectors can closely approach the performances of DCNN classifiers for a medical application with a limited number of annotated data.
Submitted 1 March, 2021;
originally announced March 2021.
-
Speech Enhancement for Wake-Up-Word detection in Voice Assistants
Authors:
David Bonet,
Guillermo Cámbara,
Fernando López,
Pablo Gómez,
Carlos Segura,
Jordi Luque
Abstract:
Keyword spotting and in particular Wake-Up-Word (WUW) detection is a very important task for voice assistants. A very common issue of voice assistants is that they get easily activated by background noise like music, TV or background speech that accidentally triggers the device. In this paper, we propose a Speech Enhancement (SE) model adapted to the task of WUW detection that aims at increasing the recognition rate and reducing the false alarms in the presence of these types of noises. The SE model is a fully-convolutional denoising auto-encoder at waveform level and is trained using a log-Mel Spectrogram and waveform reconstruction losses together with the BCE loss of a simple WUW classification network. A new database has been purposely prepared for the task of recognizing the WUW in challenging conditions containing negative samples that are very phonetically similar to the keyword. The database is extended with public databases and an exhaustive data augmentation to simulate different noises and environments. The results obtained by concatenating the SE with simple and state-of-the-art WUW detectors show that the SE does not have a negative impact on the recognition rate in quiet environments while increasing the performance in the presence of noise, especially when the SE and WUW detector are trained jointly end-to-end.
Submitted 29 January, 2021;
originally announced January 2021.
-
Integrating Deep Learning in Domain Sciences at Exascale
Authors:
Rick Archibald,
Edmond Chow,
Eduardo D'Azevedo,
Jack Dongarra,
Markus Eisenbach,
Rocco Febbo,
Florent Lopez,
Daniel Nichols,
Stanimire Tomov,
Kwai Wong,
Junqi Yin
Abstract:
This paper presents some of the current challenges in designing deep learning artificial intelligence (AI) and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
Submitted 22 November, 2020;
originally announced November 2020.
-
A Deep Learning Forecaster with Exogenous Variables for Day-Ahead Locational Marginal Price
Authors:
Dipanwita Saha,
Felipe Lopez
Abstract:
Several approaches have been proposed to forecast day-ahead locational marginal price (daLMP) in deregulated energy markets. The rise of deep learning has motivated its use in energy price forecasts, but most deep learning approaches fail to accommodate exogenous variables, which have significant influence on the peaks and valleys of the daLMP. Accurate forecasts of the daLMP valleys are of crucial importance for power generators since one of the most important decisions they face is whether to sell power at a loss to avoid incurring shutdown and start-up costs, or to bid at production cost and face the risk of shutting down. In this article we propose a deep learning model that incorporates both the history of daLMP and the effect of exogenous variables (e.g., forecasted load, weather data). A numerical study at the PJM independent system operator (ISO) illustrates how the proposed model outperforms traditional time series techniques while supporting risk-based analysis of shutdown decisions.
Submitted 13 October, 2020;
originally announced October 2020.
-
A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification
Authors:
Federico López,
Michael Strube
Abstract:
Label inventories for fine-grained entity typing have grown in size and complexity. Nonetheless, they exhibit a hierarchical structure. Hyperbolic spaces offer a mathematically appealing approach for learning hierarchical representations of symbolic data. However, it is not clear how to integrate hyperbolic components into downstream tasks. This is the first work that proposes a fully hyperbolic model for multi-class multi-label classification, which performs all operations in hyperbolic space. We evaluate the proposed model on two challenging datasets and compare to different baselines that operate under Euclidean assumptions. Our hyperbolic model infers the latent hierarchy from the class distribution, captures implicit hyponymic relations in the inventory, and shows performance on par with state-of-the-art methods on fine-grained classification with remarkable reduction of the parameter size. A thorough analysis sheds light on the impact of each component in the final prediction and showcases its ease of integration with Euclidean layers.
Submitted 5 October, 2020;
originally announced October 2020.
-
Risk-Aware High-level Decisions for Automated Driving at Occluded Intersections with Reinforcement Learning
Authors:
Danial Kamran,
Carlos Fernandez Lopez,
Martin Lauer,
Christoph Stiller
Abstract:
Reinforcement learning is nowadays a popular framework for solving different decision making problems in automated driving. However, there are still some remaining crucial challenges that need to be addressed for providing more reliable policies. In this paper, we propose a generic risk-aware DQN approach in order to learn high level actions for driving through unsignalized occluded intersections. The proposed state representation provides lane based information, which allows it to be used for multi-lane scenarios. Moreover, we propose a risk based reward function which punishes risky situations instead of only collision failures. Such a rewarding approach helps to incorporate risk prediction into our deep Q network and learn more reliable policies which are safer in challenging situations. The efficiency of the proposed approach is compared with a DQN learned with a conventional collision based rewarding scheme and also with a rule-based intersection navigation policy. Evaluation results show that the proposed approach outperforms both of these methods. It provides safer actions than the collision-aware DQN approach and is less overcautious than the rule-based policy.
Submitted 9 April, 2020;
originally announced April 2020.
-
Fine-Grained Entity Typing in Hyperbolic Space
Authors:
Federico López,
Benjamin Heinzerling,
Michael Strube
Abstract:
How can we represent hierarchical information present in large type inventories for entity typing? We study the ability of hyperbolic embeddings to capture hierarchical relations between mentions in context and their target types in a shared vector space. We evaluate on two datasets and investigate two different techniques for creating a large hierarchical entity type inventory: from an expert-generated ontology and by automatically mining type co-occurrences. We find that the hyperbolic model yields improvements over its Euclidean counterpart in some, but not all cases. Our analysis suggests that the adequacy of this geometry depends on the granularity of the type inventory and the way hierarchical relations are inferred.
Submitted 6 June, 2019;
originally announced June 2019.
-
Rucio - Scientific data management
Authors:
Martin Barisits,
Thomas Beermann,
Frank Berghaus,
Brian Bockelman,
Joaquin Bogado,
David Cameron,
Dimitrios Christidis,
Diego Ciangottini,
Gancho Dimitrov,
Markus Elsing,
Vincent Garonne,
Alessandro di Girolamo,
Luc Goossens,
Wen Guan,
Jaroslav Guenther,
Tomas Javurek,
Dietmar Kuhn,
Mario Lassnig,
Fernando Lopez,
Nicolo Magini,
Angelos Molfetas,
Armin Nairz,
Farid Ould-Saada,
Stefan Prenner,
Cedric Serfon
, et al. (5 additional authors not shown)
Abstract:
Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations. Rucio was originally developed to meet the requirements of the high-energy physics experiment ATLAS, and now is continuously extended to support the LHC experiments and other diverse scientific communities. In this article, we detail the fundamental concepts of Rucio, describe the architecture along with implementation details, and give operational experience from production usage.
Submitted 6 June, 2019; v1 submitted 26 February, 2019;
originally announced February 2019.
-
A Study of Delay Drifts on Massive MIMO Wideband Channel Models
Authors:
Carlos F. Lopez,
Cheng-Xiang Wang
Abstract:
In this paper, we study the effects of the variations of the propagation delay over large-scale antenna-arrays used in massive multiple-input multiple-output (MIMO) wideband communication systems on the statistical properties of the channel. Due to its simplicity and popularity, the Elliptical geometry-based stochastic channel model (GBSM) is employed to demonstrate new non-stationary properties of the channel in the frequency and spatial domains caused by the drift of delays. In addition, we show that the time of travel of multi-path components (MPCs) over large-scale arrays may result in overlooked frequency and spatial decorrelation effects. These are theoretically demonstrated by deriving the space-time-frequency correlation functions (STFCFs) of both narrowband and wideband Elliptical models. Closed-form expressions of the array-variant frequency correlation function (FCF), power delay profile (PDP), mean delay, and delay spread of single- and multi-confocal Elliptical models are derived when the angles of arrival (AOAs) are von Mises distributed. In such conditions, we find that the large dimensions of the antenna array may limit the narrowband characteristic of the single-ellipse model and alter the wideband characteristics (PDP and FCF) of the multi-confocal Elliptical channel model. Although we present and analyze numerical and simulation results for a particular GBSM, similar conclusions can be extended to other GBSMs.
Submitted 22 March, 2018;
originally announced March 2018.
-
A novel 2D non-stationary wideband massive MIMO channel model
Authors:
C. F. Lopez,
C. -X. Wang,
R. Feng
Abstract:
In this paper, a novel two-dimensional (2D) non-stationary wideband geometry-based stochastic model (GBSM) for massive multiple-input multiple-output (MIMO) communication systems is proposed. Key characteristics of massive MIMO channels such as near field effects and cluster evolution along the array are addressed in this model. Near field effects are modelled by a second-order approximation to spherical wavefronts, i.e., parabolic wavefronts, leading to linear drifts of the angles of multipath components (MPCs) and non-stationarity along the array. Cluster evolution along the array involving cluster (dis)appearance and smooth average power variations is considered. Cluster (dis)appearance is modeled by a two-state Markov process and smooth average power variations are modelled by a spatial lognormal process. Statistical properties of the channel model such as time autocorrelation function (ACF), spatial cross-correlation function (CCF), and cluster average power and Rician factor variations over the array are derived. Finally, simulation results are presented and analyzed, demonstrating that parabolic wavefronts and cluster soft evolution are good candidates to model important massive MIMO channel characteristics.
Submitted 2 November, 2016;
originally announced November 2016.
-
Variations of the Similarity Function of TextRank for Automated Summarization
Authors:
Federico Barrios,
Federico López,
Luis Argerich,
Rosa Wachenchauzer
Abstract:
This article presents new alternatives to the similarity function for the TextRank algorithm for automatic summarization of texts. We describe the generalities of the algorithm and the different functions we propose. Some of these variants achieve a significant improvement using the same metrics and dataset as the original publication.
Submitted 10 February, 2016;
originally announced February 2016.
-
Modeling emergence of norms in multi-agent systems by applying tipping points ideas
Authors:
Francisco Lopez
Abstract:
Norms are known to be a major factor determining humans behavior. It's also shown that norms can be quite effective tool for building agent-based societies. Various normative architectures have been proposed for designing normative multi-agent systems (NorMAS). Due to human nature of the concept norms, many of these architectures are built based on theories in social sciences. Tipping point theory…
▽ More
Norms are known to be a major factor determining humans behavior. It's also shown that norms can be quite effective tool for building agent-based societies. Various normative architectures have been proposed for designing normative multi-agent systems (NorMAS). Due to human nature of the concept norms, many of these architectures are built based on theories in social sciences. Tipping point theory, as is briefly discussed in this paper, seems to have a great potential to be used for designing normative architectures. This theory deals with the factors that affect social epidemics that arise in human societies. In this paper, we try to apply the main concepts of this theory to agent-based normative architectures. We show several ways to implement these concepts, and study their effects in an agent-based normative scenario.
Submitted 19 August, 2015;
originally announced August 2015.
-
Significant Subgraph Mining with Multiple Testing Correction
Authors:
Mahito Sugiyama,
Felipe Llinares López,
Niklas Kasenburg,
Karsten M. Borgwardt
Abstract:
The problem of finding itemsets that are statistically significantly enriched in a class of transactions is complicated by the need to correct for multiple hypothesis testing. Pruning untestable hypotheses was recently proposed as a strategy for this task of significant itemset mining. It was shown to lead to greater statistical power, the discovery of more truly significant itemsets, than the standard Bonferroni correction on real-world datasets. An open question, however, is whether this strategy of excluding untestable hypotheses also leads to greater statistical power in subgraph mining, in which the number of hypotheses is much larger than in itemset mining. Here we answer this question by an empirical investigation on eight popular graph benchmark datasets. We propose a new efficient search strategy, which always returns the same solution as the state-of-the-art approach and is approximately two orders of magnitude faster. Moreover, we exploit the dependence between subgraphs by considering the effective number of tests and thereby further increase the statistical power.
Submitted 30 January, 2015; v1 submitted 1 July, 2014;
originally announced July 2014.
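As a point of reference for the entry above, here is a minimal sketch of the standard Bonferroni correction that the abstract names as its baseline: with m hypotheses tested at family-wise level alpha, each p-value is compared against alpha / m. The function name `bonferroni_significant` and the example p-values are hypothetical, and the sketch does not implement the pruning of untestable hypotheses or the effective-number-of-tests idea described in the paper.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each hypothesis as significant under the Bonferroni correction.

    Every p-value is compared against alpha divided by the number of tests,
    which controls the family-wise error rate at level alpha.
    """
    num_tests = len(p_values)
    if num_tests == 0:
        return []
    threshold = alpha / num_tests
    return [p <= threshold for p in p_values]


# Hypothetical p-values for three candidate patterns.
flags = bonferroni_significant([0.001, 0.02, 0.04], alpha=0.05)
```

Because the threshold shrinks linearly with the number of hypotheses, the correction becomes very conservative when the hypothesis space is as large as in subgraph mining, which is the motivation the abstract gives for excluding untestable hypotheses.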
-
A High Quality/Low Computational Cost Technique for Block Matching Motion Estimation
Authors:
S. Lopez,
G. M. Callico,
J. F. Lopez,
R. Sarmiento
Abstract:
Motion estimation is the most critical process in video coding systems. First, it has a definitive impact on the rate-distortion performance of the video encoder. Second, it is the most computationally intensive process within the encoding loop. For these reasons, the design of high-performance, low-cost motion estimators is a crucial task in the video compression field. An adaptive cost block matching (ACBM) motion estimation technique is presented in this paper, featuring an excellent tradeoff between the quality of the reconstructed video sequences and the computational effort. Simulation results demonstrate that the ACBM algorithm achieves slightly better rate-distortion performance than the well-known full search block matching algorithm, with reductions of up to 95% in the computational load.
Submitted 25 October, 2007;
originally announced October 2007.
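To illustrate the baseline that the entry above compares against, the following is a minimal sketch of full search block matching using the sum of absolute differences (SAD) as the matching cost. It is not the ACBM technique, whose details the abstract does not give. Frames are assumed to be plain lists of rows of integer luma values, and the names `sad` and `full_search` as well as the toy 8x8 frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of the current
    frame at (bx, by) and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement in the search window and keep the cheapest."""
    height, width = len(ref), len(ref[0])
    best_vector, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost


# Toy frames: the current frame is the reference shifted by (2, 1).
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
vector, cost = full_search(cur, ref, bx=2, by=2, n=2, search_range=2)
```

The exhaustive loop evaluates (2 * search_range + 1) squared candidates per block, which is the computational load that faster estimators aim to reduce.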