Search | arXiv e-print repository

A numerical approach to particle creation in accelerating toy models

Authors: Pedro Duarte Baptista, Alex Vañó-Viñuales, Adrían del Río Vega

Abstract: The formation of black holes by the gravitational collapse of stars is known to spontaneously excite particle pairs out of the quantum vacuum. For the canonical vacuum state at past null infinity, the expected number of particles received at future null infinity can be obtained in full closed form at sufficiently late times. However, for intermediate times, or for more complicated astrophysical processes (e.g. binary black hole mergers), the problem is technically challenging and has not yet been resolved. We develop here a numerical approach to study scattering problems of massless quantum fields in asymptotically flat spacetimes, based on the hyperboloidal slice method used in numerical relativity and perturbation theory. This promising approach can reach both past and future null infinities, and therefore it has the potential to address the Hawking scattering problem more rigorously than evolution on the usual Cauchy slices. We test this approach with some dynamical toy models in Minkowski using effective potentials that mimic the effects of gravity, and compute the spectrum of particles received at future null infinity. We finally discuss future prospects for applying this framework in more relevant gravitational scenarios.

Submitted 23 June, 2025; originally announced June 2025.

Comments: 22 pages, 16 figures

arXiv:2506.13623 [pdf, ps, other]

Global fits and the 95 GeV diphoton excesses in the Supersymmetric Georgi-Machacek Model

Authors: Yingnan Xu, Dikai Li, Roberto Vega, Roberto Vega-Morales, Keping Xie

Abstract: Recently the ATLAS and CMS experiments have reported modest excesses in the diphoton channel at around 95 GeV. A number of recent studies have examined whether these could be due to an extended electroweak symmetry breaking (EWSB) sector, including the well known Georgi-Machacek (GM) model. Here we examine whether the excesses can be explained by a light exotic Higgs boson in the Supersymmetric GM (SGM) model which has the same scalar spectrum as the conventional GM model, but with a more constrained Higgs potential and the presence of custodial Higgsino fermions. We perform a global fit of the SGM model including all relevant production and decay channels, some of which have been neglected in previous studies, which severely constrain the parameter space. We find that the SGM model can fit the data if the LHC diphoton excesses at 95 GeV are due to the lightest custodial singlet Higgs boson which contributes $(5-7)\%$ to EWSB, but cannot accommodate the LEP $b\bar{b}$ excess, in contrast to other recent studies of the GM model. Since the SGM model has a highly constrained Higgs potential, the rest of the mass spectrum is sharply predicted, allowing for targeted searches at the LHC or future colliders. We also compare the SGM model with the non-supersymmetric GM model and identify how they can be distinguished at the LHC or future colliders.

Submitted 16 June, 2025; originally announced June 2025.

Comments: 27 pages, 14 figures

Report number: MSUHEP-24-014, SMU-PHY-24-02

arXiv:2504.08991 [pdf, other]

CMS RPC Non-Physics Event Data Automation Ideology

Authors: A. Dimitrov, M. Tytgat, K. Mota Amarilo, A. Samalan, K. Skovpen, G. A. Alves, E. Alves Coelho, F. Marujo da Silva, M. Barroso Ferreira Filho, E. M. Da Costa, D. De Jesus Damiao, S. Fonseca De Souza, R. Gomes De Souza, L. Mundim, H. Nogima, J. P. Pinheiro, A. Santoro, M. Thiel, A. Aleksandrov, R. Hadjiiska, P. Iaydjiev, M. Shopova, G. Sultanov, L. Litov, B. Pavlov, et al. (79 additional authors not shown)

Abstract: This paper presents a streamlined framework for real-time processing and analysis of condition data from the CMS experiment Resistive Plate Chambers (RPC). Leveraging data streaming, it uncovers correlations between RPC performance metrics, like currents and rates, and LHC luminosity or environmental conditions. The Java-based framework automates data handling and predictive modeling, integrating extensive datasets into synchronized, query-optimized tables. By segmenting LHC operations and analyzing larger virtual detector objects, the automation enhances monitoring precision, accelerates visualization, and provides predictive insights, revolutionizing RPC performance evaluation and future behavior modeling.

Submitted 11 April, 2025; originally announced April 2025.

Comments: CMS RPC Condition Data Automation, Java framework, 12 pages, 23 figures, CMS Condition database

arXiv:2502.15937 [pdf, other]

Discovery and Deployment of Emergent Robot Swarm Behaviors via Representation Learning and Real2Sim2Real Transfer

Authors: Connor Mattson, Varun Raveendra, Ricardo Vega, Cameron Nowzari, Daniel S. Drew, Daniel S. Brown

Abstract: Given a swarm of limited-capability robots, we seek to automatically discover the set of possible emergent behaviors. Prior approaches to behavior discovery rely on human feedback or hand-crafted behavior metrics to represent and evolve behaviors and only discover behaviors in simulation, without testing or considering the deployment of these new behaviors on real robot swarms. In this work, we present Real2Sim2Real Behavior Discovery via Self-Supervised Representation Learning, which combines representation learning and novelty search to discover possible emergent behaviors automatically in simulation and enable direct controller transfer to real robots. First, we evaluate our method in simulation and show that our proposed self-supervised representation learning approach outperforms previous hand-crafted metrics by more accurately representing the space of possible emergent behaviors. Then, we address the reality gap by incorporating recent work in sim2real transfer for swarms into our lightweight simulator design, enabling direct robot deployment of all behaviors discovered in simulation on an open-source and low-cost robot platform.

Submitted 21 February, 2025; originally announced February 2025.

Comments: 10 pages, 5 figures. To be included in Proc. of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2025)

arXiv:2411.15156 [pdf, other]

doi 10.1109/ARGENCON62399.2024.10735948

Using Spatial Diffusions for Optoacoustic Tomography Image Reconstruction

Authors: Martin G. Gonzalez, Matias Vera, Leonardo Rey Vega

Abstract: Optoacoustic tomography image reconstruction has been a problem of interest in recent years. By exploiting the exceptional generative power of the recently proposed diffusion models, we consider a scheme based on a conditional diffusion process. Starting from a simple initial image reconstruction method such as Delay and Sum, we use a specially designed autoencoder architecture to generate a latent representation that serves as conditional information in the generative diffusion process. Numerical results show the merits of our proposal in terms of quality metrics such as PSNR and SSIM, showing that the conditional information generated from the initial reconstructed image is able to bias the generative process of the diffusion model in order to enhance the image, correct artifacts and even recover some finer details that the initial reconstruction method is not able to obtain.

Submitted 8 November, 2024; originally announced November 2024.

Comments: Published in 2024 IEEE Biennial Congress of Argentina. arXiv admin note: substantial text overlap with arXiv:2404.10239

arXiv:2410.16444 [pdf, other]

Agent-Based Emulation for Deploying Robot Swarm Behaviors

Authors: Ricardo Vega, Kevin Zhu, Connor Mattson, Daniel S. Brown, Cameron Nowzari

Abstract: Despite significant research, robotic swarms have yet to be useful in solving real-world problems, largely due to the difficulty of creating and controlling swarming behaviors in multi-agent systems. Traditional top-down approaches in which a desired emergent behavior is produced often require complex, resource-heavy robots, limiting their practicality. This paper introduces a bottom-up approach by employing an Embodied Agent-Based Modeling and Simulation approach, emphasizing the use of simple robots and identifying conditions that naturally lead to self-organized collective behaviors. Using the Reality-to-Simulation-to-Reality for Swarms (RSRS) process, we tightly integrate real-world experiments with simulations to reproduce known swarm behaviors as well as to discover a novel emergent behavior without aiming to eliminate or even reduce the sim2real gap. This paper presents the development of an Agent-Based Embodiment and Emulation process that balances the importance of running physical swarming experiments against the prohibitively time-consuming process of even setting up and running a single experiment with 20+ robots, by leveraging low-fidelity lightweight simulations to enable hypothesis formation to guide physical experiments. We demonstrate the usefulness of our methods by emulating two known behaviors from the literature and show a third behavior 'discovered' by accident.

Submitted 21 October, 2024; originally announced October 2024.

Comments: 8 pages, 6 figures, submitted to ICRA 2025

arXiv:2410.16175 [pdf, other]

Spiking Neural Networks as a Controller for Emergent Swarm Agents

Authors: Kevin Zhu, Connor Mattson, Shay Snyder, Ricardo Vega, Daniel S. Brown, Maryam Parsa, Cameron Nowzari

Abstract: Drones which can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors which do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. In this paper, we focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.

Submitted 21 October, 2024; originally announced October 2024.

Comments: 8 pages, 7 figures, presented at the 2024 International Conference on Neuromorphic Systems

arXiv:2404.10239 [pdf, other]

Diffusion assisted image reconstruction in optoacoustic tomography

Authors: M. G. González, M. Vera, A. Dreszman, L. J. Rey Vega

Abstract: In this paper we consider the problem of acoustic inversion in the context of the optoacoustic tomography image reconstruction problem. By leveraging the ability of the recently proposed diffusion models for image generative tasks among others, we devise an image reconstruction architecture based on a conditional diffusion process. The scheme makes use of an initial image reconstruction, which is preprocessed by an autoencoder to generate an adequate representation. This representation is used as conditional information in a generative diffusion process. Although the computational requirements for training and implementing the architecture are not low, several design choices discussed in the work were made to keep them manageable. Numerical results show that the conditional information makes it possible to properly bias the parameters of the diffusion model to improve the quality of the initial reconstructed image, eliminating artifacts or even reconstructing finer details of the ground-truth image that are not recoverable by the initial image reconstruction method. We also tested the proposal under experimental conditions and the obtained results were in line with those corresponding to the numerical simulations. Improvements in image quality of up to 17% in terms of peak signal-to-noise ratio were observed.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Paper accepted for publication in the journal Optics and Lasers in Engineering

MSC Class: 68T07; 78A46

arXiv:2311.03103 [pdf, other]

Parametric Resurgences of the Second Painlevé Equation and Minimal Superstrings

Authors: Roberto Vega

Abstract: The aim of this paper is to study the resurgent transseries structure of the inhomogeneous and $q$-deformed Painlevé II equations. These equations appear in a variety of physical systems; here we focus on their description of $(2,4)$-super minimal string theory with either D-branes or RR-flux backgrounds. In this context they appear as double scaled string equations of matrix models, and we relate the resurgent transseries structures appearing in this way with explicit matrix model computations. The main body of the paper is focused on studying the transseries structure of these equations as well as the corresponding resurgence analyses. Concretely, the aim will be to give a recursion relation for the transseries sectors and obtain the non-perturbative transmonomials, i.e., the instanton actions of the systems. From the resurgence point of view, the goal is to obtain Stokes data. These encode how the transseries parameters jump at the Stokes lines when turning around the complex plane in order to produce a global transseries solution. The main result will be a conjectured form for the transition functions of these transseries parameters. We explore how these equations are related to each other via the Miura map. In particular, we focus on how their resurgent properties can be translated into each other. We study the special solutions of the inhomogeneous Painlevé II equation and how these might be encoded in the transseries parameters. Specifically, we have a discussion on the Hastings–McLeod solution and some results on special function solutions. Finally, we discuss our results in the context of the matrix model and the (2,4)-minimal superstring theory.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2309.11408 [pdf, other]

Indirect Swarm Control: Characterization and Analysis of Emergent Swarm Behaviors

Authors: Ricardo Vega, Connor Mattson, Daniel S. Brown, Cameron Nowzari

Abstract: Emergence and emergent behaviors are often defined as cases where changes in local interactions between agents at a lower level effectively change what occurs in the higher level of the system (i.e., the whole swarm) and its properties. However, the manner in which these collective emergent behaviors self-organize is less understood. The focus of this paper is on presenting a new framework for characterizing the conditions that lead to different macrostates and how to predict/analyze their macroscopic properties, allowing us to indirectly engineer the same behaviors from the bottom up by tuning their environmental conditions rather than local interaction rules. We then apply this framework to a simple system of binary sensing and acting agents as an example to see if a re-framing of this swarm problem can help us push the state of the art forward. By first creating some working definitions of macrostates in a particular swarm system, we show how agent-based modeling may be combined with control theory to enable a generalized understanding of controllable emergent processes without needing to simulate everything. Whereas phase diagrams can generally only be created through Monte Carlo simulations or sweeping through ranges of parameters in a simulator, we develop closed-form functions that can immediately produce them, revealing an infinite set of swarm parameter combinations that can lead to a specifically chosen self-organized behavior. While the exact methods are still under development, we believe simply laying out a potential path towards solutions that have evaded our traditional methods using a novel method is worth considering. Our results are characterized through both simulations and real experiments on ground robots.

Submitted 28 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: 8 pages, 13 figures, submitted to IROS 2024 conference

arXiv:2306.17744 [pdf, other]

Zespol: A Lightweight Environment for Training Swarming Agents

Authors: Shay Snyder, Kevin Zhu, Ricardo Vega, Cameron Nowzari, Maryam Parsa

Abstract: Agent-based modeling (ABM) and simulation have emerged as important tools for studying emergent behaviors, especially in the context of swarming algorithms for robotic systems. Despite significant research in this area, there is a lack of standardized simulation environments, which hinders the development and deployment of real-world robotic swarms. To address this issue, we present Zespol, a modular, Python-based simulation environment that enables the development and testing of multi-agent control algorithms. Zespol provides a flexible and extensible sandbox for initial research, with the potential for scaling to real-world applications. We provide a topological overview of the system and detailed descriptions of its plug-and-play elements. We demonstrate the fidelity of Zespol in simulated and real-world robotics by replicating existing works highlighting the simulation to real gap with the milling behavior. We plan to leverage Zespol's plug-and-play feature for neuromorphic computing in swarming scenarios, which involves using the modules in Zespol to simulate the behavior of neurons and their connections as synapses. This will enable optimizing and studying the emergent behavior of swarm systems in complex environments. Our goal is to gain a better understanding of the interplay between environmental factors and neural-like computations in swarming systems.

Submitted 30 June, 2023; originally announced June 2023.

Comments: 5 pages, 4 figures, 1 table

arXiv:2305.00335 [pdf, other]

doi 10.1063/5.0139286

Invariant Representations in Deep Learning for Optoacoustic Imaging

Authors: Matias Vera, Martin G. Gonzalez, Leonardo Rey Vega

Abstract: Image reconstruction in optoacoustic tomography (OAT) is a trending learning task highly dependent on measured physical magnitudes present at sensing time. The large number of different settings, and also the presence of uncertainties or partial knowledge of parameters, can lead to reconstruction algorithms that are specifically tailored and designed to a particular configuration which may not be the one that will be ultimately faced in a final practical situation. Being able to learn reconstruction algorithms that are robust to different environments (e.g. the different OAT image reconstruction settings) or invariant to such environments is highly valuable because it allows one to focus on what truly matters for the application at hand and discard what are considered spurious features. In this work we explore the use of deep learning algorithms based on learning invariant and robust representations for the OAT inverse problem. In particular, we consider the application of the ANDMask scheme due to its easy adaptation to the OAT problem. Numerical experiments are conducted showing that, when out-of-distribution generalization (against variations in parameters such as the location of the sensors) is imposed, there is no degradation of the performance and, in some cases, it is even possible to achieve improvements with respect to standard deep learning approaches where invariance robustness is not explicitly considered.

Submitted 29 April, 2023; originally announced May 2023.

Comments: Paper accepted for publication in Review of Scientific Instruments

MSC Class: 68T07; 78A46

arXiv:2304.14898 [pdf, other]

doi 10.1109/TSIPN.2023.3341407

An Asymptotically Equivalent GLRT Test for Distributed Detection in Wireless Sensor Networks

Authors: Juan Augusto Maya, Leonardo Rey Vega, Andrea M. Tonello

Abstract: In this article, we consider the problem of distributed detection of a localized radio source emitting a signal. We consider that geographically distributed sensor nodes obtain energy measurements and compute cooperatively a statistic to decide if the source is present or absent. We model the radio source as a stochastic signal and deal with spatially statistically dependent measurements, whose probability density function (PDF) has unknown positive parameters when the radio source is active. Under the framework of the Generalized Likelihood Ratio Test (GLRT) theory, the positive constraint on the unknown multidimensional parameter makes the computation of the GLRT asymptotic performance (when the amount of sensor measurements tends to infinity) more involved. Nevertheless, we analytically characterize the asymptotic distribution of the statistic. Moreover, as the GLRT is not amenable to distributed settings because of the spatial statistical dependence of the measurements, we study a GLRT-like test where the joint PDF of the measurements is substituted by the product of its marginal PDFs, and therefore, the statistical dependence is completely discarded for building this test. Nevertheless, its asymptotic performance is proved to be identical to the original GLRT, showing that the statistical dependence of the measurements has no impact on the detection performance in the asymptotic scenario. Furthermore, the GLRT-like algorithm has a low computational complexity and demands low communication resources, as compared to the GLRT.

Submitted 19 December, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

arXiv:2303.08985 [pdf, other]

doi 10.1109/ARGENCON55245.2022.9940056

Cross-domain Sentiment Classification in Spanish

Authors: Lautaro Estienne, Matias Vera, Leonardo Rey Vega

Abstract: Sentiment Classification is a fundamental task in the field of Natural Language Processing, and has very important academic and commercial applications. It aims to automatically predict the degree of sentiment present in a text that contains opinions and subjectivity at some level, like product and movie reviews, or tweets. This can be really difficult to accomplish, in part, because different domains of text contain different words and expressions. In addition, this difficulty increases when text is written in a non-English language due to the lack of databases and resources. As a consequence, several cross-domain and cross-language techniques are often applied to this task in order to improve the results. In this work we perform a study on the ability of a classification system trained with a large database of product reviews to generalize to different Spanish domains. Reviews were collected from the MercadoLibre website from seven Latin American countries, allowing the creation of a large and balanced dataset. Results suggest that generalization across domains is feasible though very challenging when trained with these product reviews, and can be improved by pre-training and fine-tuning the classification model.

Submitted 15 March, 2023; originally announced March 2023.

arXiv:2302.04829 [pdf, other]

Modeling and Forecasting COVID-19 Cases using Latent Subpopulations

Authors: Roberto Vega, Zehra Shah, Pouria Ramazi, Russell Greiner

Abstract: Classical epidemiological models assume homogeneous populations. There have been important extensions to model heterogeneous populations, when the identity of the sub-populations is known, such as age group or geographical location. Here, we propose two new methods to model the number of people infected with COVID-19 over time, each as a linear combination of latent sub-populations, i.e., when we do not know which person is in which sub-population, and the only available observations are the aggregates across all sub-populations. Method #1 is a dictionary-based approach, which begins with a large number of pre-defined sub-population models (each with its own starting time, shape, etc.), then determines the (positive) weights of a small (learned) number of sub-populations. Method #2 is a mixture-of-$M$ fittable curves, where $M$, the number of sub-populations to use, is given by the user. Both methods are compatible with any parametric model; here we demonstrate their use with first (a) Gaussian curves and then (b) SIR trajectories. We empirically show the performance of the proposed methods, first in (i) modeling the observed data and then in (ii) forecasting the number of infected people 1 to 4 weeks in advance. Across 187 countries, we show that the dictionary approach had the lowest mean absolute percentage error and also the lowest variance when compared with classical SIR models and moreover, it was a strong baseline that outperformed many of the models developed for COVID-19 forecasting.

Submitted 9 February, 2023; originally announced February 2023.

Comments: 14 pages, 8 figures, submitted to Frontiers in Big Data

arXiv:2301.09018 [pdf, ps, other]

Simulate Less, Expect More: Bringing Robot Swarms to Life via Low-Fidelity Simulations

Authors: Ricardo Vega, Kevin Zhu, Sean Luke, Maryam Parsa, Cameron Nowzari

Abstract: This paper proposes a novel methodology for addressing the simulation-reality gap for multi-robot swarm systems. Rather than immediately try to shrink or `bridge the gap' anytime a real-world experiment failed that worked in simulation, we characterize conditions under which this is actually necessary. When these conditions are not satisfied, we show how very simple simulators can still be used to both (i) design new multi-robot systems, and (ii) guide real-world swarming experiments towards certain emergent behaviors when the gap is very large. The key ideas are an iterative simulator-in-the-design-loop in which real-world experiments, simulator modifications, and simulated experiments are intimately coupled in a way that minds the gap without needing to shrink it, as well as the use of minimally viable phase diagrams to guide real world experiments. We demonstrate the usefulness of our methods on deploying a real multi-robot swarm system to successfully exhibit an emergent milling behavior.

Submitted 21 January, 2023; originally announced January 2023.

Comments: 9 pages, 9 figures

arXiv:2211.04892 [pdf, other]

doi 10.1109/OJCOMS.2023.3332259

An Exponentially-Tight Approximate Factorization of the Joint PDF of Statistical Dependent Measurements in Wireless Sensor Networks

Authors: Juan Augusto Maya, Leonardo Rey Vega, Andrea M. Tonello

Abstract: We consider the distributed detection problem of a temporally correlated random radio source signal using a wireless sensor network capable of measuring the energy of the received signals. It is well-known that optimal tests in the Neyman-Pearson setting are based on likelihood ratio tests (LRT), which, in this set-up, evaluate the quotient between the probability density functions (PDF) of the measurements when the source signal is present and absent. When the source is present, the computation of the joint PDF of the energy measurements at the nodes is a challenging problem. This is due to the statistical dependence introduced to the received signals by the radio source propagated through fading channels. We deal with this problem using the characteristic function of the (intractable) joint PDF, and proposing an approximation to it. We derive bounds for the approximation error in two wireless propagation scenarios, slow and fast fading, and show that the proposed approximation is exponentially tight with the number of nodes when the time-bandwidth product is sufficiently high. The approximation is used as a substitute of the exact joint PDF for building an approximate LRT, which performs better than other well-known detectors, as verified by Monte Carlo simulations.

Submitted 19 December, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

arXiv:2210.08099 [pdf, other]

doi 10.1016/j.optlaseng.2022.107471

Combining band-frequency separation and deep neural networks for optoacoustic imaging

Authors: Martin G. Gonzalez, Matias Vera, Leonardo Rey Vega

Abstract: In this paper we consider the problem of image reconstruction in optoacoustic tomography. In particular, we devise a deep neural architecture that can explicitly take into account the band-frequency information contained in the sinogram. This is accomplished by two means. First, we jointly use a linear filtered back-projection method and a fully dense UNet for the generation of the images corresponding to each one of the frequency bands considered in the separation. Secondly, in order to train the model, we introduce a special loss function consisting of three terms: (i) a separating frequency bands term; (ii) a sinogram-based consistency term and (iii) a term that directly measures the quality of image reconstruction and which takes advantage of the presence of ground-truth images present in training dataset. Numerical experiments show that the proposed model, which can be easily trainable by standard optimization methods, presents an excellent generalization performance quantified by a number of metrics commonly used in practice. Also, in the testing phase, our solution has a comparable (in some cases lower) computational complexity, which is a desirable feature for real-time implementation of optoacoustic imaging.

Submitted 5 December, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

Comments: 10 pages, 5 figures, 1 table, submitted to Optics and Lasers in Engineering

arXiv:2209.07148 [pdf, ps, other]

Semi-supervised Batch Learning From Logged Data

Authors: Gholamali Aminian, Armin Behnamnia, Roberto Vega, Laura Toni, Chengchun Shi, Hamid R. Rabiee, Omar Rivasplata, Miguel R. D. Rodrigues

Abstract: Off-policy learning methods are intended to learn a policy from logged data, which includes context, action, and feedback (cost or reward) for each sample point. In this work, we build on the counterfactual risk minimization framework, which also assumes access to propensity scores. We propose learning methods for problems where feedback is missing for some samples, so there are samples with feedback and samples missing-feedback in the logged data. We refer to this type of learning as semi-supervised batch learning from logged data, which arises in a wide range of application domains. We derive a novel upper bound for the true risk under the inverse propensity score estimator to address this kind of learning problem. Using this bound, we propose a regularized semi-supervised batch learning method with logged data where the regularization term is feedback-independent and, as a result, can be evaluated using the logged missing-feedback data. Consequently, even though feedback is only present for some samples, a learning policy can be learned by leveraging the missing-feedback samples. The results of experiments derived from benchmark datasets indicate that these algorithms achieve policies with better performance in comparison with logging policies.

Submitted 18 February, 2024; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: 46 pages

arXiv:2203.16463 [pdf]

doi 10.1109/TDSC.2023.3326230

Perfectly Accurate Membership Inference by a Dishonest Central Server in Federated Learning

Authors: Georg Pichler, Marco Romanelli, Leonardo Rey Vega, Pablo Piantanida

Abstract: Federated Learning is expected to provide strong privacy guarantees, as only gradients or model parameters but no plain text training data is ever exchanged either between the clients or between the clients and the central server. In this paper, we challenge this claim by introducing a simple but still very effective membership inference attack algorithm, which relies only on a single training step. In contrast to the popular honest-but-curious model, we investigate a framework with a dishonest central server. Our strategy is applicable to models with ReLU activations and uses the properties of this activation function to achieve perfect accuracy. Empirical evaluation on visual classification tasks with MNIST, CIFAR10, CIFAR100 and CelebA datasets show that our method provides perfect accuracy in identifying one sample in a training set with thousands of samples. Occasional failures of our method lead us to discover duplicate images in the CIFAR100 and CelebA datasets.

Submitted 9 November, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: accepted for publication in IEEE Transactions on Dependable and Secure Computing

arXiv:2203.13726 [pdf, other]

Resurgent Stokes Data for Painleve Equations and Two-Dimensional Quantum (Super) Gravity

Authors: Salvatore Baldino, Ricardo Schiappa, Maximilian Schwick, Roberto Vega

Abstract: Resurgent-transseries solutions to Painleve equations may be recursively constructed out of these nonlinear differential-equations -- but require Stokes data to be globally defined over the complex plane. Stokes data explicitly construct connection-formulae which describe the nonlinear Stokes phenomena associated to these solutions, via implementation of Stokes transitions acting on the transseries. Nonlinear resurgent Stokes data lack, however, a first-principle computational approach, hence are hard to determine generically. In the Painleve I and Painleve II contexts, nonlinear Stokes data get further hindered as these equations are resonant, with non-trivial consequences for the interconnections between transseries sectors, bridge equations, and associated Stokes coefficients. In parallel to this, the Painleve I and Painleve II equations are string-equations for two-dimensional quantum (super) gravity and minimal string theories, where Stokes data have natural ZZ-brane interpretations. This work computes for the first time the complete, analytical, resurgent Stokes data for the first two Painleve equations, alongside their quantum gravity or minimal string incarnations. The method developed herein, dubbed "closed-form asymptotics", makes sole use of resurgent large-order asymptotics of transseries solutions -- alongside a careful analysis of the role resonance plays. Given its generality, it may be applicable to other distinct (nonlinear, resonant) problems.
Results for analytical Stokes coefficients have natural structures, which are described, and extensive high-precision numerical tests corroborate all analytical predictions. Connection-formulae are explicitly constructed, with rather simple and compact final results encoding the full Stokes data, and further allowing for exact monodromy checks -- hence for an analytical proof of our results.

Submitted 28 September, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

Comments: 122 pages, 51 plots in 40 figures, 19 tables, jheppub-nosort.sty; v2: small changes (additions, rewrites, typos); v3: more typos/corrections + more (iso)monodromy details in 7.3, 7.4

arXiv:2203.07622 [pdf, other]

The International Linear Collider: Report to Snowmass 2021

Authors: Alexander Aryshev, Ties Behnke, Mikael Berggren, James Brau, Nathaniel Craig, Ayres Freitas, Frank Gaede, Spencer Gessner, Stefania Gori, Christophe Grojean, Sven Heinemeyer, Daniel Jeans, Katja Kruger, Benno List, Jenny List, Zhen Liu, Shinichiro Michizono, David W. Miller, Ian Moult, Hitoshi Murayama, Tatsuya Nakada, Emilio Nanni, Mihoko Nojiri, Hasan Padamsee, Maxim Perelstein , et al. (487 additional authors not shown)

Abstract: The International Linear Collider (ILC) is on the table now as a new global energy-frontier accelerator laboratory taking data in the 2030s. The ILC addresses key questions for our current understanding of particle physics. It is based on a proven accelerator technology. Its experiments will challenge the Standard Model of particle physics and will provide a new window to look beyond it. This document brings the story of the ILC up to date, emphasizing its strong physics motivation, its readiness for construction, and the opportunity it presents to the US and the global particle physics community.

Submitted 16 January, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

Comments: 356 pages, Large pdf file (40 MB) submitted to Snowmass 2021; v2 references to Snowmass contributions added, additional authors; v3 references added, some updates, additional authors

Report number: DESY-22-045, IFT--UAM/CSIC--22-028, KEK Preprint 2021-61, PNNL-SA-160884, SLAC-PUB-17662

arXiv:2201.05282 [pdf, other]

Domain-shift adaptation via linear transformations

Authors: Roberto Vega, Russell Greiner

Abstract: A predictor, $f_A : X \to Y$, learned with data from a source domain (A) might not be accurate on a target domain (B) when their distributions are different. Domain adaptation aims to reduce the negative effects of this distribution mismatch. Here, we analyze the case where $P_A(Y\ |\ X) \neq P_B(Y\ |\ X)$, $P_A(X) \neq P_B(X)$ but $P_A(Y) = P_B(Y)$; where there are affine transformations of $X$ that makes all distributions equivalent. We propose an approach to project the source and target domains into a lower-dimensional, common space, by (1) projecting the domains into the eigenvectors of the empirical covariance matrices of each domain, then (2) finding an orthogonal matrix that minimizes the maximum mean discrepancy between the projections of both domains. For arbitrary affine transformations, there is an inherent unidentifiability problem when performing unsupervised domain adaptation that can be alleviated in the semi-supervised case. We show the effectiveness of our approach in simulated data and in binary digit classification tasks, obtaining improvements up to 48% accuracy when correcting for the domain shift in the data.

Submitted 13 January, 2022; originally announced January 2022.

arXiv:2112.05547 [pdf, other]

PACMAN: PAC-style bounds accounting for the Mismatch between Accuracy and Negative log-loss

Authors: Matias Vera, Leonardo Rey Vega, Pablo Piantanida

Abstract: The ultimate performance of machine learning algorithms for classification tasks is usually measured in terms of the empirical error probability (or accuracy) based on a testing dataset. Whereas, these algorithms are optimized through the minimization of a typically different--more convenient--loss function based on a training set. For classification tasks, this loss function is often the negative log-loss that leads to the well-known cross-entropy risk which is typically better behaved (from a numerical perspective) than the error probability. Conventional studies on the generalization error do not usually take into account the underlying mismatch between losses at training and testing phases. In this work, we introduce an analysis based on point-wise PAC approach over the generalization gap considering the mismatch of testing based on the accuracy metric and training on the negative log-loss. We label this analysis PACMAN. Building on the fact that the mentioned mismatch can be written as a likelihood ratio, concentration inequalities can be used to provide some insights for the generalization problem in terms of some point-wise PAC bounds depending on some meaningful information-theoretic quantities. An analysis of the obtained bounds and a comparison with available results in the literature are also provided.

Submitted 10 December, 2021; originally announced December 2021.

Comments: Submitted to be considered for publication in Information and Inference: a Journal of the IMA

arXiv:2109.14028 [pdf, other]

doi 10.1063/5.0065966

On the robustness of model-based algorithms for photoacoustic tomography: comparison between time and frequency domains

Authors: L. Hirsch, M. G. Gonzalez, L. Rey Vega

Abstract: For photoacoustic image reconstruction, certain parameters such as sensor positions and speed of sound have a major impact in the reconstruction process and must be carefully determined before data acquisition. Uncertainties in these parameters can lead to errors produced by a modeling mismatch, hindering the reconstruction process and severely affecting the resulting image quality. Therefore, in this work we study how modeling errors arising from uncertainty in sensor locations affect the images obtained by matrix model-based reconstruction algorithms based on time domain and frequency domain models of the photoacoustic problem. The effects on the reconstruction performance with respect to the uncertainty in the knowledge of the sensors location is compared and analyzed both in a qualitative and quantitative fashion for both time and frequency models. Ultimately, our study shows that the frequency domain approach is more sensitive to this kind of modeling errors. These conclusions are supported by numerical experiments and a theoretical sensitivity analysis of the mathematical operator for the direct problem.

Submitted 28 September, 2021; originally announced September 2021.

Comments: This work has been submitted to the Review Scientific Instruments for publication

arXiv:2106.01590 [pdf, other]

SIMLR: Machine Learning inside the SIR model for COVID-19 Forecasting

Authors: Roberto Vega, Leonardo Flores, Russell Greiner

Abstract: Accurate forecasts of the number of newly infected people during an epidemic are critical for making effective timely decisions. This paper addresses this challenge using the SIMLR model, which incorporates machine learning (ML) into the epidemiological SIR model. For each region, SIMLR tracks the changes in the policies implemented at the government level, which it uses to estimate the time-varying parameters of an SIR model for forecasting the number of new infections 1- to 4-weeks in advance. It also forecasts the probability of changes in those government policies at each of these future times, which is essential for the longer-range forecasts. We applied SIMLR to data from regions in Canada and in the United States, and show that its MAPE (mean average percentage error) performance is as good as SOTA forecasting models, with the added advantage of being an interpretable model. We expect that this approach will be useful not only for forecasting COVID-19 infections, but also in predicting the evolution of other infectious diseases.

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2105.09905 [pdf, ps, other]

The generalized Bolzano-Weierstrass property revisited

Authors: Ramiro de la Vega

Abstract: We investigate the question of when a topological space $X$ has the $\textit{Generalized Bolzano-Weierstrass property}$: every sequence of subsets of $X$ has a convergent subsequence (in the sense of Kuratowski).

Submitted 20 May, 2021; originally announced May 2021.

MSC Class: Primary 54A20; 54A25; Secondary 54G20; 54A35

arXiv:2104.00644 [pdf, other]

doi 10.3390/biomedicines9020224

Anomalous angiogenesis in retina

Authors: Rocío Vega, Manuel Carretero, Luis L. Bonilla

Abstract: Age-related macular degeneration (AMD) may cause severe loss of vision or blindness particularly in elderly people. Exudative AMD is characterized by angiogenesis of blood vessels growing from underneath the macula, crossing the blood-retina barrier (that comprise Bruch's membrane, BM, and the retinal pigmentation epithelium RPE), leaking blood and fluid into the retina and knocking off photoreceptors. Here, we simulate a computational model of angiogenesis from the choroid blood vessels via a cellular Potts model, as well as BM, RPE cells, drusen deposits and photoreceptors. Our results indicate that improving AMD may require fixing the impaired lateral adhesion between RPE cells and with BM, as well as diminishing Vessel Endothelial Growth Factor (VEGF) and Jagged proteins that affect the Notch signaling pathway. Our numerical simulations suggest that anti-VEGF and anti-Jagged therapies could temporarily halt exudative AMD while addressing impaired cellular adhesion could be more effective on a longer time span.

Submitted 1 April, 2021; originally announced April 2021.

Comments: 20 pages, 8 figures

Journal ref: Biomedicines 9, 224 (2021)

arXiv:2102.06164 [pdf, other]

Sample Efficient Learning of Image-Based Diagnostic Classifiers Using Probabilistic Labels

Authors: Roberto Vega, Pouneh Gorji, Zichen Zhang, Xuebin Qin, Abhilash Rakkunedeth Hareendranathan, Jeevesh Kapur, Jacob L. Jaremko, Russell Greiner

Abstract: Deep learning approaches often require huge datasets to achieve good generalization. This complicates its use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations. For such sensitive tasks it is also important to provide the confidence in the predictions. Here, we propose a way to learn and use probabilistic labels to train accurate and calibrated deep networks from relatively small datasets. We observe gains of up to 22% in the accuracy of models trained with these labels, as compared with traditional approaches, in three classification tasks: diagnosis of hip dysplasia, fatty liver, and glaucoma. The outputs of models trained with probabilistic labels are calibrated, allowing the interpretation of its predictions as proper probabilities. We anticipate this approach will apply to other tasks where few training instances are available and expert knowledge can be encoded as probabilities.

Submitted 11 February, 2021; originally announced February 2021.

Comments: To appear in the Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR: Volume 130

arXiv:2010.11642 [pdf, other]

The Role of Mutual Information in Variational Classifiers

Authors: Matias Vera, Leonardo Rey Vega, Pablo Piantanida

Abstract: Overfitting data is a well-known phenomenon related with the generation of a model that mimics too closely (or exactly) a particular instance of data, and may therefore fail to predict future observations reliably. In practice, this behaviour is controlled by various--sometimes heuristics--regularization techniques, which are motivated by developing upper bounds to the generalization error. In this work, we study the generalization error of classifiers relying on stochastic encodings trained on the cross-entropy loss, which is often used in deep learning for classification problems. We derive bounds to the generalization error showing that there exists a regime where the generalization error is bounded by the mutual information between input features and the corresponding representations in the latent space, which are randomly generated according to the encoding distribution. Our bounds provide an information-theoretic understanding of generalization in the so-called class of variational classifiers, which are regularized by a Kullback-Leibler (KL) divergence term. These results give theoretical grounds for the highly popular KL term in variational inference methods that was already recognized to act effectively as a regularization penalty. We further observe connections with well studied notions such as Variational Autoencoders, Information Dropout, Information Bottleneck and Boltzmann Machines.
Finally, we perform numerical experiments on MNIST and CIFAR datasets and show that mutual information is indeed highly representative of the behaviour of the generalization error.

Submitted 13 April, 2023; v1 submitted 22 October, 2020; originally announced October 2020.

Comments: Accepted for publication to Machine Learning Springer

arXiv:2004.13475 [pdf, other]

Efficient GPU Thread Mapping on Embedded 2D Fractals

Authors: Cristóbal A. Navarro, Felipe A. Quezada, Nancy Hitschfeld, Raimundo Vega, Benjamin Bustos

Abstract: This work proposes a new approach for mapping GPU threads onto a family of discrete embedded 2D fractals. A block-space map $λ: \mathbb{Z}_{\mathbb{E}}^{2} \mapsto \mathbb{Z}_{\mathbb{F}}^{2}$ is proposed, from Euclidean parallel space $\mathbb{E}$ to embedded fractal space $\mathbb{F}$, that maps in $\mathcal{O}(\log_2 \log_2(n))$ time and uses no more than $\mathcal{O}(n^\mathbb{H})$ threads with $\mathbb{H}$ being the Hausdorff dimension of the fractal, making it parallel space efficient. When compared to a bounding-box (BB) approach, $λ(ω)$ offers a sub-exponential improvement in parallel space and a monotonically increasing speedup $n \ge n_0$. The Sierpinski gasket fractal is used as a particular case study and the experimental performance results show that $λ(ω)$ reaches up to $9\times$ of speedup over the bounding-box approach. A tensor-core based implementation of $λ(ω)$ is also proposed for modern GPUs, providing up to $\sim40\%$ of extra performance. The results obtained in this work show that doing efficient GPU thread mapping on fractal domains can significantly improve the performance of several applications that work with this type of geometry.

Submitted 25 April, 2020; originally announced April 2020.

Comments: 20 Pages. arXiv admin note: text overlap with arXiv:1706.04552

ACM Class: C.1.4; G.2.0

arXiv:2003.02490 [pdf, other]

On fully-distributed composite tests with general parametric data distributions in sensor networks

Authors: Juan Maya, Leonardo Rey Vega

Abstract: We consider a distributed detection problem where measurements at each sensor follow a general parametric distribution. The network does not have a central processing unit or fusion center (FC). Thus, each node takes some measurements, does some processing, exchanges messages with its neighbors and finally makes a decision (typically the same for all nodes) about the phenomenon of interest. The problem can be formulated as a composite hypothesis test with unknown parameters where, in general, a uniformly most powerful test does not exist. This leads naturally to the use of the Generalized Likelihood Ratio (GLR) test. As the measurements follow a general parametric distribution (which could model spatial dependence of the data), the implementation of fully-distributed detection procedures could be demanding in network resources. For this reason, we study the use of a simpler test (referred as L-MP) which uses the product of the marginals of the measurements taken at each node, where the unknown parameters are easily estimated with only local measurements. Although this simple proposal still requires network-wide cooperation between nodes, the number of communications is significantly reduced with respect to the GLR test, making it a suitable choice in severely resource-constrained sensor networks. This simpler test does not exploit the full parametric model of data, so, it becomes important to analyze its statistical properties and its potential performance loss. This is done through the analysis of the L-MP asymptotic distribution.
Interestingly, despite the fact that the L-MP is simpler and more efficient to implement than the GLR test, we obtain some conditions under which the L-MP has superior asymptotic performance to the GLR test. Finally, we present numerical results for a fully-distributed spectrum sensing application for cognitive radios.

Submitted 6 June, 2021; v1 submitted 5 March, 2020; originally announced March 2020.

Comments: 13 pages, 6 figures. Submitted to the IEEE Transactions on Signal and Information Processing over Networks

arXiv:2001.07156 [pdf, ps, other]

Selective separability and $q^+$ on maximal spaces

Authors: Ramiro de la Vega, Javier Murgas, Carlos Uzcátegui

Abstract: Given a hereditarily meager ideal $\mathcal{I}$ on a countable set $X$ we use Martin's axiom for countable posets to produce a zero-dimensional maximal topology $τ^\mathcal{I}$ on $X$ such that $τ^\mathcal{I}\cap \mathcal{I}=\{\emptyset\}$ and, moreover, if $\mathcal{I}$ is $p^+$ then $τ^\mathcal{I}$ is selectively separable (SS) and if $\mathcal{I}$ is $q^+$, so is $τ^\mathcal{I}$. In particular, we obtain regular maximal spaces satisfying all boolean combinations of the properties SS and $q^+$.

Submitted 20 January, 2020; originally announced January 2020.

Comments: 17 pages

MSC Class: 54G05; 54A35; 03E57

arXiv:2001.05585 [pdf, ps, other]

GPU Tensor Cores for fast Arithmetic Reductions

Authors: Cristóbal A. Navarro, Roberto Carrasco, Ricardo J. Barrientos, Javier A. Riquelme, Raimundo Vega

Abstract: This work proposes a GPU tensor core approach that encodes the arithmetic reduction of $n$ numbers as a set of chained $m \times m$ matrix multiply accumulate (MMA) operations executed in parallel by GPU tensor cores. The asymptotic running time of the proposed chained tensor core approach is $T(n)=5 \log_{m^2}{n}$ and its speedup is $S=\dfrac{4}{5} \log_{2}{m^2}$ over the classic $O(n \log n)$ parallel reduction algorithm. Experimental performance results show that the proposed reduction method is $\sim 3.2 \times$ faster than a conventional GPU reduction implementation, and preserves the numerical precision because the sub-results of each chain of $R$ MMAs are kept as 32-bit floating point values before all being reduced into a final 32-bit result. The chained MMA design allows a flexible configuration of thread-blocks; small thread-blocks of 32 or 128 threads can still achieve maximum performance using a chain of $R=4,5$ MMAs per block, while large thread-blocks work best with $R=1$. The results obtained in this work show that tensor cores can indeed provide a significant performance improvement to non-Machine Learning applications such as the arithmetic reduction, which is an integration tool for studying many scientific phenomena.

Submitted 15 January, 2020; originally announced January 2020.

Comments: 14 pages, 11 figures
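
The two formulas quoted in the abstract above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, $m=16$ is an illustrative choice, and the baseline cost of $4\log_2 n$ steps is an assumption inferred from the ratio $S = \frac{4}{5}\log_2 m^2$ rather than something the abstract states.

```python
import math


def tensor_core_steps(n: int, m: int = 16) -> float:
    """T(n) = 5 * log_{m^2}(n), the step count quoted in the abstract."""
    return 5 * math.log2(n) / math.log2(m * m)


def classic_steps(n: int) -> float:
    """Assumed baseline of 4 * log2(n) steps (inferred, not from the abstract)."""
    return 4 * math.log2(n)


def speedup(m: int = 16) -> float:
    """S = (4/5) * log2(m^2), the speedup quoted in the abstract."""
    return 4 / 5 * math.log2(m * m)


if __name__ == "__main__":
    n = 2 ** 24
    print(tensor_core_steps(n))                     # steps under the simplified model
    print(classic_steps(n) / tensor_core_steps(n))  # ratio of the two step counts
    print(speedup())                                # closed-form speedup for m = 16
```

Under the assumed baseline, the ratio of the two step counts does not depend on $n$ and equals the closed-form speedup, which is why $S$ is a function of $m$ only.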

arXiv:1912.01772 [pdf, ps, other]

A Resource for Computational Experiments on Mapudungun

Authors: Mingjun Duan, Carlos Fasola, Sai Krishna Rallabandi, Rodolfo M. Vega, Antonios Anastasopoulos, Lori Levin, Alan W Black

Abstract: We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers. We provide 142 hours of culturally significant conversations in the domain of medical treatment. The conversations are fully transcribed and translated into Spanish. The transcriptions also include annotations for code-switching and non-standard pronunciations. We also provide baseline results on three core NLP tasks: speech recognition, speech synthesis, and machine translation between Spanish and Mapudungun. We further explore other applications for which the corpus will be suitable, including the study of code-switching, historical orthography change, linguistic structure, and sociological and anthropological studies.

Submitted 4 April, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

Comments: accepted at LREC 2020

arXiv:1908.02798 [pdf, other]

doi 10.1109/JIOT.2019.2959552

Extreme coverage in 5G Narrowband IoT: a LUT-based strategy to optimize shared channels

Authors: Emmanuel Luján, Juan A. Zuloaga Mellino, Alejandro D. Otero, Leonardo Rey Vega, Cecilia G. Galarza, Esteban E. Mocskos

Abstract: One of the main challenges in IoT is providing communication support to an increasing number of connected devices. In recent years, narrowband radio technology has emerged to address this situation: Narrowband Internet of Things (NB-IoT), which is now part of 5G. Supporting massive connectivity becomes particularly demanding in extreme coverage scenarios such as underground or deep inside building sites. We propose a novel strategy for these situations focused on optimizing NB-IoT shared channels through the selection of link parameters: modulation and coding scheme, as well as the number of repetitions. These parameters are established by the base station (BS) for each block transmitted until reaching a target block error rate (BLER_t). A wrong selection of these magnitudes leads to radio resource waste and a decrease in the number of possible concurrent connections. Specifically, our strategy is based on a look-up table (LUT) scheme which is used for rapidly delivering the optimal link parameters given a target QoS. To validate our proposal, we compare with alternative strategies using an open source NB-IoT uplink simulator. The experiments are based on transmitting blocks of 256 bits using an AWGN channel over the NPUSCH. Results show that, especially under extreme conditions, only a few options for link parameters are available, favoring robustness against measurement uncertainties. Our strategy minimizes resource usage in all scenarios of acknowledged mode and remarkably reduces losses in the unacknowledged mode, presenting also substantial gains in performance. We expect to influence future BS software design and implementation, favoring connection support under extreme environments.

Submitted 24 December, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

Comments: Paper accepted at IEEE IoT Journal

arXiv:1906.07909 [pdf, other]

doi 10.1051/0004-6361/201935934

Characterization of the continuum and kinematical properties of nearby NLS1

Authors: Gabriel A. Oio, Luis R. Vega, Eduardo O. Schmidt, Diego Ferreiro

Abstract: In order to study the slope and strength of the non-stellar continuum, we analyzed a sample of nearby Narrow Line Seyfert 1 (NLS1). Also, we re-examined the location of NLS1 galaxies on the M $-$ $σ$ relation using the stellar velocity dispersion and the [OIII]$λ$5007 emission line as surrogate of the former. We studied spectra of a sample of 131 NLS1 galaxies taken from the Sloan Digital Sky Survey (SDSS) DR7. We approached determining the non-stellar continuum by employing the spectral synthesis technique, which uses the code {\sc starlight}, and by adopting a power-law base to model the non-stellar continuum. Composite spectra of NLS1 galaxies were also obtained based on the sample. In addition, we obtained the stellar velocity dispersion from the code and by measuring Calcium II Triplet absorption lines and [OIII] emission lines. From Gaussian decomposition of the H$β$ profile we calculated the black hole mass. We obtained a median slope of $β$ = $-$1.6 with a median fraction of contribution of the non-stellar continuum to the total flux of 0.64. We determined black hole masses in the range of log(M$_{BH}$/M$_{\odot}$) = 5.6 $-$ 7.5 which is in agreement with previous works. We found a correlation between the luminosity of the broad component of H$β$ and black hole mass with the fraction of a power-law component. Finally, according to our results, NLS1 galaxies in our sample are located mostly underneath the MBH - $σ_{\star}$ relation, both considering the stellar velocity dispersion ($σ_{\star}$) and the core component of [OIII]$λ$5007.

Submitted 23 September, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

Comments: 12 pages, 12 figures. Accepted to be published in the A&A

Journal ref: A&A 629, A50 (2019)

arXiv:1905.11972 [pdf, other]

Understanding the Behaviour of the Empirical Cross-Entropy Beyond the Training Distribution

Authors: Matias Vera, Pablo Piantanida, Leonardo Rey Vega

Abstract: Machine learning theory has mostly focused on generalization to samples from the same distribution as the training data. However, a better understanding of generalization beyond the training distribution, where the observed distribution changes, is also fundamentally important to achieve a more powerful form of generalization. In this paper, we attempt to study through the lens of information measures how a particular architecture behaves when the true probability law of the samples is potentially different at training and testing times. Our main result is that the testing gap between the empirical cross-entropy and its statistical expectation (measured with respect to the testing probability law) can be bounded with high probability by the mutual information between the input testing samples and the corresponding representations, generated by the encoder obtained at training time. These theoretical results are supported by numerical simulations showing that this mutual information is representative of the testing gap, qualitatively capturing its dynamics in terms of the hyperparameters of the network.

Submitted 28 May, 2019; originally announced May 2019.

Comments: 18 pages, 6 Figures

arXiv:1904.13328 [pdf, other]

A Self-Adaptive Contractive Algorithm for Enhanced Dynamic Phasor Estimation

Authors: Francisco Messina, Pablo Marchi, Leonardo Rey Vega, Cecilia Galarza

Abstract: In this paper, a self-adaptive contractive (SAC) algorithm is proposed for enhanced dynamic phasor estimation in the diverse operating conditions of modern power systems. At a high level, the method is composed of three stages: parameter shifting, filtering and parameter unshifting. The goal of the first stage is to transform the input signal phasor so that it is approximately mapped to nominal conditions. The second stage provides estimates of the phasor, frequency, rate of change of frequency (ROCOF), damping and rate of change of damping (ROCOD) of the parameter shifted phasor by using a differentiator filter bank (DFB). The final stage recovers the original signal phasor parameters while rejecting misleading estimates. The most important features of the algorithm are that it offers convergence guarantees in a set of desired conditions, and also great harmonic rejection. Numerical examples, including the IEEE C37.118.1 standard tests with realistic noise levels, as well as fault conditions, validate the proposed algorithm.

Submitted 30 April, 2019; originally announced April 2019.

Comments: 8 pages, 4 figures. Submitted to IEEE Transactions on Smart Grid

arXiv:1904.06173 [pdf, other]

Design and performance analysis of a fully distributed source detection algorithm for WSNs

Authors: Juan Augusto Maya, Leonardo Rey Vega

Abstract: In this article, we consider the detection of a localized source emitting a signal using a wireless sensor network (WSN). We consider that geographically distributed sensor nodes obtain energy measurements and compute cooperatively and in a distributed manner a statistic to decide if the source is present or absent without the need of a central node or fusion center (FC). First, we start from the continuous-time signal sensed by the nodes and obtain an equivalent discrete-time hypothesis testing problem. Second, we propose a fully distributed scheme, based on the well-known generalized likelihood ratio (GLR) test, which is suitable for a WSN, where resources such as energy and communication bandwidth are typically scarce. Third, we consider the asymptotic performance of the proposed GLR test. The derived results closely match the scenario in which only a finite number of measurements are available at each sensor node. We finally show that the proposed distributed algorithm performs as well as the global GLR test in the considered scenarios, requiring only a small number of communication exchanges between nodes and a limited knowledge about the network structure and its connectivity.

Submitted 9 April, 2019; originally announced April 2019.

Comments: Submitted to TSP-IEEE

arXiv:1903.03640 [pdf, ps, other]

doi 10.29007/zlmg

Analyzing GPU Tensor Core Potential for Fast Reductions

Authors: Roberto Carrasco, Raimundo Vega, Cristóbal A. Navarro

Abstract: The Nvidia GPU architecture has introduced new computing elements such as the \textit{tensor cores}, which are special processing units dedicated to performing fast matrix-multiply-accumulate (MMA) operations and accelerating \textit{Deep Learning} applications. In this work we present the idea of using tensor cores for a different purpose, the parallel arithmetic reduction problem, and propose a new GPU tensor-core based algorithm as well as analyze its potential performance benefits in comparison to a traditional GPU-based one. The proposed method encodes the reduction of $n$ numbers as a set of $m\times m$ MMA tensor-core operations (for Nvidia's Volta architecture $m=16$) and takes advantage of the fact that each MMA operation takes just one GPU cycle. When analyzing the cost under a simplified GPU computing model, the result is that the new algorithm manages to reduce a problem of $n$ numbers in $T(n) = 5\log_{m^2}(n)$ steps with a speedup of $S = \frac{4}{5}\log_2(m^2)$.

Submitted 8 March, 2019; originally announced March 2019.

Comments: This paper was presented in the SCCC 2018 Conference, November 5

Journal ref: 37th International Conference of the Chilean Computer Science Society, SCCC 2018, November 5-9, Santiago, Chile, 2018

arXiv:1807.05133 [pdf, other]

Augmented Generator Sub-transient Model Using Dynamic Phasor Measurements

Authors: Pablo Marchi, Francisco Messina, Leonardo Rey Vega, Cecilia Galarza

Abstract: In this article, we present a new model for a synchronous generator based on phasor measurement units (PMUs) data. The proposed sub-transient model allows estimating the dynamic state variables as well as calibrating model parameters. The motivation for this new model is to use more efficiently the PMU measurements which are becoming widely available in power grids. The concept of phasor derivative is applied, which not only includes the signal phase derivative but also its amplitude derivative. Applying known non-linear estimation techniques, we study the merits of this new model. In particular, we test robustness by considering a generator with different mechanical power controls.

Submitted 13 July, 2018; originally announced July 2018.

Comments: 8 pages, 8 figures

arXiv:1805.01970 [pdf, other]

doi 10.1007/JHEP06(2018)137

Light (and darkness) from a light hidden Higgs

Authors: Roberto Vega, Roberto Vega-Morales, Keping Xie

Abstract: We examine light diphoton signals from extended Higgs sectors possessing (approximate) fermiophobia with Standard Model (SM) fermions as well as custodial symmetry. This class of Higgs sectors can be realized in various beyond the SM scenarios and is able to evade many experimental limits, even at light masses, which are otherwise strongly constraining. Below the $WW$ threshold, the most robust probes of the neutral component are di and multi-photon searches. Utilizing the dominant Drell-Yan Higgs pair production mechanism and combining it with updated LHC diphoton data, we derive robust upper bounds on the allowed branching ratio for masses between $45 - 160$ GeV. Furthermore, masses $\lesssim 110$ GeV are ruled out if the coupling to photons is dominated by $W$ boson loops. We then examine two simple ways to evade these bounds via cancellations between different loop contributions or by introducing decays into an invisible sector. This also opens up the possibility of future LHC diphoton signals from a light hidden Higgs sector. As explicit realizations, we consider the Georgi-Machacek (GM) and Supersymmetric GM (SGM) models which contain custodial (degenerate) Higgs bosons with suppressed couplings to SM fermions and, in the SGM model, a (neutralino) LSP. We also briefly examine the recent $\sim 3σ$ CMS diphoton excess at $\sim 95$ GeV.

Submitted 20 June, 2018; v1 submitted 4 May, 2018; originally announced May 2018.

Comments: references added, typos corrected, version to be published

Report number: UG-FT 327/18, CAFPE 197/18, SMU-HEP-18-08, FERMILAB-PUB-18-159-T

arXiv:1804.09103 [pdf, ps, other]

On the reflection of the countable chain condition

Authors: Ramiro de la Vega

Abstract: We study the question of when an uncountable ccc topological space $X$ contains a ccc subspace of size $\aleph_1$. We show that it does if $X$ is compact Hausdorff and more generally if $X$ is Hausdorff with $\mathrm{pct}(X) \leq \aleph_1$. For each regular cardinal $κ$, an example is constructed of a ccc Tychonoff space of size $κ$ and countable pseudocharacter but with no ccc subspace of size less than $κ$. We also give a ccc compact $T_1$ space of size $κ$ with no ccc subspace of size less than $κ$.

Submitted 24 April, 2018; originally announced April 2018.

Comments: 7 pages

MSC Class: 54A25; 54G20; 54D30 (Primary); 54A10 (Secondary)

arXiv:1802.05355 [pdf, other]

The Role of Information Complexity and Randomization in Representation Learning

Authors: Matías Vera, Pablo Piantanida, Leonardo Rey Vega

Abstract: A grand challenge in representation learning is to learn the different explanatory factors of variation behind the high dimensional data. Encoder models are often determined to optimize performance on training data when the real objective is to generalize well to unseen data. Although there is enough numerical evidence suggesting that noise injection (during training) at the representation level might improve the generalization ability of encoders, an information-theoretic understanding of this principle remains elusive. This paper presents a sample-dependent bound on the generalization gap of the cross-entropy loss that scales with the information complexity (IC) of the representations, meaning the mutual information between inputs and their representations. The IC is empirically investigated for standard multi-layer neural networks with SGD on MNIST and CIFAR-10 datasets; the behaviour of the gap and the IC appear to be in direct correlation, suggesting that SGD selects encoders to implicitly minimize the IC. We specialize the IC to study the role of Dropout on the generalization capacity of deep encoders which is shown to be directly related to the encoder capacity, being a measure of the distinguishability among samples from their representations. Our results support some recent regularization methods.

Submitted 14 February, 2018; originally announced February 2018.

Comments: 35 pages, 3 figures. Submitted for publication

arXiv:1711.07099 [pdf, other]

doi 10.1109/JSTSP.2018.2846218

Compression-Based Regularization with an Application to Multi-Task Learning

Authors: Matías Vera, Leonardo Rey Vega, Pablo Piantanida

Abstract: This paper investigates, from information theoretic grounds, a learning problem based on the principle that any regularity in a given dataset can be exploited to extract compact features from data, i.e., using fewer bits than needed to fully describe the data itself, in order to build meaningful representations of a relevant content (multiple labels). We begin by introducing the noisy lossy source coding paradigm with the log-loss fidelity criterion which provides the fundamental tradeoffs between the \emph{cross-entropy loss} (average risk) and the information rate of the features (model complexity). Our approach allows an information theoretic formulation of the \emph{multi-task learning} (MTL) problem which is a supervised learning framework in which the prediction models for several related tasks are learned jointly from common representations to achieve better generalization performance. Then, we present an iterative algorithm for computing the optimal tradeoffs and its global convergence is proven provided that some conditions hold. An important property of this algorithm is that it provides a natural safeguard against overfitting, because it minimizes the average risk taking into account a penalization induced by the model complexity. Remarkably, empirical results illustrate that there exists an optimal information rate minimizing the \emph{excess risk} which depends on the nature and the amount of available training data. An application to hierarchical text categorization is also investigated, extending previous works.

Submitted 19 November, 2017; originally announced November 2017.

Comments: 13 pages, 7 figures. Submitted for publication

arXiv:1711.05329 [pdf, other]

doi 10.1007/JHEP03(2018)168

The Supersymmetric Georgi-Machacek Model

Authors: Roberto Vega, Roberto Vega-Morales, Keping Xie

Abstract: We show that the well known Georgi-Machacek (GM) model can be realized as a limit of the recently constructed Supersymmetric Custodial Higgs Triplet Model (SCTM) which in general contains a significantly more complex scalar spectrum. We dub this limit of the SCTM, which gives a weakly coupled origin for the GM model at the electroweak scale, the Supersymmetric GM (SGM) model. We derive a mapping between the SGM and GM models using it to show how a supersymmetric origin implies constraints on the Higgs potential in conventional GM model constructions which would generically not be present. We then perform a simplified phenomenological study of diphoton and ZZ signals for a pair of benchmark scenarios to illustrate under what circumstances the GM model can mimic the SGM model and when they should be easily distinguishable.

Submitted 16 March, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

Comments: References added and figures updated. Version to be published in JHEP

Report number: UG-FT 325/17, CAFPE 195/17, SMU-HEP-17-09

arXiv:1710.08054 [pdf]

Consilience: A Holistic Measure of Goodness-of-Fit

Authors: William H. Neill, Ray H. Kamps, Scott J. Walker, Hsin-i Wu, T. Scott Brandes, Delbert M. Gatlin III, Tiffany L. Hopper, Robert R. Vega

Abstract: We describe an apparently new measure of multivariate goodness-of-fit between sets of quantitative results from a model (simulation, analytical, or multiple regression), paired with those observed under corresponding conditions from the system being modeled. Our approach returns a single, integrative measure, even though it can accommodate complex systems that produce responses of M types. For each response-type, the goodness-of-fit measure, which we label "Consilience" (C), is maximally 1, for perfect fit; near 0 for the large-sample case (number of pairs, N, more than about 25) in which the modeled series is a random sample from a quasi-normal distribution with the same mean and variance as that of the observed series (null model); and, less than 0, toward minus-infinity, for progressively worse fit. In addition, lack-of-fit for each response-type can be apportioned between systematic and non-systematic (unexplained) components of error. Finally, for statistical assessment of models relative to the equivalent null model, we offer provisional estimates of critical C vs. N, and of critical joint-C vs. N and M, at various levels of Pr(type-I error). Application of our proposed methodology requires only MS Excel (2003 or later); we provide Excel XLS and XLSX templates that afford semi-automatic computation for systems involving up to M = 5 response types, each represented by up to N = 1000 observed-and-modeled result pairs. N need not be equal, nor response pairs in complete overlap, over M.

Submitted 20 October, 2018; v1 submitted 22 October, 2017; originally announced October 2017.

Comments: This 3rd update of the ms. (permanent arXiv identifier 1710.08054, 23OCT2017) differs from the 2nd only in that the 3rd provides an additional, alternative pathway for retrieving files cited via people.tamu.edu-hyperlinks in the ms. <end>

arXiv:1706.04552 [pdf, ps, other]

Block-space GPU Mapping for Embedded Sierpiński Gasket Fractals

Authors: Cristóbal A. Navarro, Benjamín Bustos, Raimundo Vega, Nancy Hitschfeld

Abstract: This work studies the problem of GPU thread mapping for a Sierpiński gasket fractal embedded in a discrete Euclidean space of $n \times n$. A block-space map $λ: \mathbb{Z}_{\mathbb{E}}^{2} \mapsto \mathbb{Z}_{\mathbb{F}}^{2}$ is proposed, from Euclidean parallel space $\mathbb{E}$ to embedded fractal space $\mathbb{F}$, that maps in $\mathcal{O}(\log_2 \log_2(n))$ time and uses no more than $\mathcal{O}(n^\mathbb{H})$ threads with $\mathbb{H} \approx 1.58...$ being the Hausdorff dimension, making it parallel space efficient. When compared to a bounding-box map, $λ(ω)$ offers a sub-exponential improvement in parallel space and a monotonically increasing speedup once $n > n_0$. Experimental performance tests show that in practice $λ(ω)$ can produce performance improvement at any block-size once $n > n_0 = 2^8$, reaching approximately $10\times$ of speedup for $n=2^{16}$ under optimal block configurations.

Submitted 14 June, 2017; originally announced June 2017.

Comments: 7 pages, 8 Figures

arXiv:1604.01433 [pdf, other]

doi 10.1109/TIT.2018.2883295

Collaborative Information Bottleneck

Authors: Matías Vera, Leonardo Rey Vega, Pablo Piantanida

Abstract: This paper investigates a multi-terminal source coding problem under a logarithmic loss fidelity which does not necessarily lead to an additive distortion measure. The problem is motivated by an extension of the Information Bottleneck method to a multi-source scenario where several encoders have to build cooperatively rate-limited descriptions of their sources in order to maximize information with respect to other unobserved (hidden) sources. More precisely, we study fundamental information-theoretic limits of the so-called: (i) Two-way Collaborative Information Bottleneck (TW-CIB) and (ii) the Collaborative Distributed Information Bottleneck (CDIB) problems. The TW-CIB problem consists of two distant encoders that separately observe marginal (dependent) components $X_1$ and $X_2$ and can cooperate through multiple exchanges of limited information with the aim of extracting information about hidden variables $(Y_1,Y_2)$, which can be arbitrarily dependent on $(X_1,X_2)$. On the other hand, in CDIB there are two cooperating encoders which separately observe $X_1$ and $X_2$ and a third node which can listen to the exchanges between the two encoders in order to obtain information about a hidden variable $Y$. The relevance (figure-of-merit) is measured in terms of a normalized (per-sample) multi-letter mutual information metric (log-loss fidelity) and an interesting tradeoff arises by constraining the complexity of descriptions, measured in terms of the rates needed for the exchanges between the encoders and decoders involved.
Inner and outer bounds to the complexity-relevance region of these problems are derived from which optimality is characterized for several cases of interest. Our resulting theoretical complexity-relevance regions are finally evaluated for binary symmetric and Gaussian statistical models.

Submitted 24 November, 2021; v1 submitted 5 April, 2016; originally announced April 2016.

Comments: Submitted to IEEE Transactions on Information Theory (revised, 29 pages, 7 figures)

Journal ref: IEEE Transactions on Information Theory (Volume: 65, Issue: 2, Feb. 2019)

Showing 1–50 of 79 results for author: Vega, R