Showing 1–27 of 27 results for author: Silva, D

Searching in archive stat.
  1. arXiv:2509.09336  [pdf, ps, other]

    stat.AP stat.ME

    A Zero-Inflated Spatio-Temporal Model for Integrating Fishery-Dependent and Independent Data under Preferential Sampling

    Authors: Daniela Silva, Raquel Menezes, Gonçalo Araújo, Ana Machado, Renato Rosa, Ana Moreno, Alexandra Silva, Susana Garrido

    Abstract: Sustainable management of marine ecosystems is vital for maintaining healthy fishery resources, and benefits from advanced scientific tools to accurately assess species distribution patterns. In fisheries science, two primary data sources are used: fishery-independent data (FID), collected through systematic surveys, and fishery-dependent data (FDD), obtained from commercial fishing activities. Wh…

    Submitted 11 September, 2025; originally announced September 2025.

  2. arXiv:2503.24079  [pdf, other]

    stat.ME

    Joint model for zero-inflated data combining fishery-dependent and fishery-independent sources

    Authors: Daniela Silva, Raquel Menezes, Gonçalo Araújo, Renato Rosa, Ana Moreno, Alexandra Silva, Susana Garrido

    Abstract: Accurately identifying spatial patterns of species distribution is crucial for scientific insight and societal benefit, aiding our understanding of species fluctuations. The increasing quantity and quality of ecological datasets present heightened statistical challenges, complicating spatial species dynamics comprehension. Addressing the complex task of integrating multiple data sources to enhance…

    Submitted 31 March, 2025; originally announced March 2025.

  3. arXiv:2410.09355  [pdf, other]

    cs.LG stat.ML

    On Divergence Measures for Training GFlowNets

    Authors: Tiago da Silva, Eliezer de Souza da Silva, Diego Mesquita

    Abstract: Generative Flow Networks (GFlowNets) are amortized inference models designed to sample from unnormalized distributions over composable objects, with applications in generative modeling for tasks in fields such as causal discovery, NLP, and drug discovery. Traditionally, the training procedure for GFlowNets seeks to minimize the expected log-squared difference between a proposal (forward policy) an…

    Submitted 21 October, 2024; v1 submitted 11 October, 2024; originally announced October 2024.

    Comments: Accepted at NeurIPS 2024, https://openreview.net/forum?id=N5H4z0Pzvn

    MSC Class: 68T05; ACM Class: G.3; I.5.1; I.2.8; I.2.6

  4. arXiv:2405.15934  [pdf, other]

    cs.LG stat.ML

    Clustering Survival Data using a Mixture of Non-parametric Experts

    Authors: Gabriel Buginga, Edmundo de Souza e Silva

    Abstract: Survival analysis aims to predict the timing of future events across various fields, from medical outcomes to customer churn. However, the integration of clustering into survival analysis, particularly for precision medicine, remains underexplored. This study introduces SurvMixClust, a novel algorithm for survival analysis that integrates clustering with survival function prediction within a unifi…

    Submitted 24 May, 2024; originally announced May 2024.

  5. arXiv:2403.18219  [pdf]

    cs.LG cs.AI stat.CO

    From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries

    Authors: Ergon Cugler de Moraes Silva

    Abstract: Reinforcement learning (RL) algorithms have become indispensable tools in artificial intelligence, empowering agents to acquire optimal decision-making policies through interactions with their environment and feedback mechanisms. This study explores the performance of RL agents in both two-dimensional (2D) and three-dimensional (3D) environments, aiming to research the dynamics of learning across…

    Submitted 26 March, 2024; originally announced March 2024.

  6. arXiv:2402.16253  [pdf]

    math.NA stat.CO

    To be, or not to be, that is the Question: Exploring the pseudorandom generation of texts to write Hamlet from the perspective of the Infinite Monkey Theorem

    Authors: Ergon Cugler de Moraes Silva

    Abstract: This article explores the theoretical and computational aspects of the Infinite Monkey Theorem, investigating the number of attempts and the time required for a set of pseudorandom characters to assemble and recite Hamlet's iconic phrase, "To be, or not to be, that is the Question". Drawing inspiration from Emile Borel's original concept (1913), the study delves into the practical implications of pse…

    Submitted 25 February, 2024; originally announced February 2024.

  7. arXiv:2402.05867  [pdf]

    math.NA stat.CO

    Exploring pseudorandom value addition operations in datasets: A layered approach to escape from normal-Gaussian patterns

    Authors: Ergon Cugler de Moraes Silva

    Abstract: In the realm of statistical exploration, the manipulation of pseudo-random values to discern their impact on data distribution presents a compelling avenue of inquiry. This article investigates the question: Is it possible to add pseudo-random values without compelling a shift towards a normal distribution? Employing Python techniques, the study explores the nuances of pseudo-random value additio…

    Submitted 8 February, 2024; originally announced February 2024.

  8. Unsupervised Feature Based Algorithms for Time Series Extrinsic Regression

    Authors: David Guijo-Rubio, Matthew Middlehurst, Guilherme Arcencio, Diego Furtado Silva, Anthony Bagnall

    Abstract: Time Series Extrinsic Regression (TSER) involves using a set of training time series to form a predictive model of a continuous response variable that is not directly related to the regressor series. The TSER archive for comparing algorithms was released in 2022 with 19 problems. We increase the size of this archive to 63 problems and reproduce the previous comparison of baseline algorithms. We th…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 19 pages, 21 figures, 6 tables. Appendix included

  9. arXiv:2210.14411  [pdf, other]

    stat.ME

    Exact Bayesian Inference for Geostatistical Models under Preferential Sampling

    Authors: Douglas Mateus da Silva, Dani Gamerman

    Abstract: Preferential sampling is a common feature in geostatistics and occurs when the locations to be sampled are chosen based on information about the phenomena under study. In this case, point pattern models are commonly used as the probability law for the distribution of the locations. However, analytic intractability of the point process likelihood prevents its direct calculation. Many Bayesian (and…

    Submitted 25 October, 2022; originally announced October 2022.

  10. arXiv:2208.13980  [pdf, other]

    stat.ME stat.AP

    Model-robust Bayesian design through Generalised Additive Models for monitoring submerged shoals

    Authors: Dilishiya De Silva, Rebecca Fisher, Ben Radford, Helen Thompson, James McGree

    Abstract: Optimal sampling strategies are critical for surveys of deeper coral reef and shoal systems, due to the significant cost of accessing and field sampling these remote and poorly understood ecosystems. Additionally, well-established standard diver-based sampling techniques used in shallow reef systems cannot be deployed because of water depth. Here we develop a Bayesian design strategy to optimise s…

    Submitted 30 August, 2022; originally announced August 2022.

  11. arXiv:2108.01510  [pdf, other]

    stat.ME

    MCEM and SAEM Algorithms for Geostatistical Models under Preferential Sampling

    Authors: Douglas Mateus da Silva, Lourdes C. Contreras Montenegro

    Abstract: The problem of preferential sampling in geostatistics arises when the choice of location to be sampled is made with information about the phenomena in the study. The geostatistical model under preferential sampling deals with this problem, but parameter estimation is challenging because the likelihood function has no closed form. We developed an MCEM and an SAEM algorithm for finding the maximum l…

    Submitted 3 August, 2021; originally announced August 2021.

  12. The potential stickiness of pandemic-induced behavior changes in the United States

    Authors: Deborah Salon, Matthew Wigginton Conway, Denise Capasso da Silva, Rishabh Singh Chauhan, Sybil Derrible, Kouros Mohammadian, Sara Khoeini, Nathan Parker, Laura Mirtich, Ali Shamshiripour, Ehsan Rahimi, Ram Pendyala

    Abstract: Human behavior is notoriously difficult to change, but a disruption of the magnitude of the COVID-19 pandemic has the potential to bring about long-term behavioral changes. During the pandemic, people have been forced to experience new ways of interacting, working, learning, shopping, traveling, and eating meals. A critical question going forward is how these experiences have actually changed pref…

    Submitted 25 May, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: 6 pages and 2 figures in main text; 24 pages in supplementary materials

    Journal ref: Proceedings of the National Academy of Sciences Jul 2021, 118 (27) e2106499118; DOI: 10.1073/pnas.2106499118

  13. A database of travel-related behaviors and attitudes before, during, and after COVID-19 in the United States

    Authors: Rishabh Singh Chauhan, Matthew Wigginton Conway, Denise Capasso da Silva, Deborah Salon, Ali Shamshiripour, Ehsan Rahimi, Sara Khoeini, Abolfazl Mohammadian, Sybil Derrible, Ram Pendyala

    Abstract: The COVID-19 pandemic has impacted billions of people around the world. To capture some of these impacts in the United States, we are conducting a nationwide longitudinal survey collecting information about activity and travel-related behaviors and attitudes before, during, and after the COVID-19 pandemic. The survey questions cover a wide range of topics including commuting, daily travel, air tra…

    Submitted 9 October, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Journal ref: Sci Data 8, 245 (2021)

  14. arXiv:2005.11348  [pdf, other]

    eess.AS cs.LG eess.SP stat.ML

    Microphone Array Based Surveillance Audio Classification

    Authors: Dimitri Leandro de Oliveira Silva, Tito Spadini, Ricardo Suyama

    Abstract: The work assessed seven classical classifiers and two beamforming algorithms for detecting surveillance sound events. The tests included the use of AWGN with -10 dB to 30 dB SNR. Data Augmentation was also employed to improve algorithms' performance. The results showed that the combination of SVM and Delay-and-Sum (DaS) scored the best accuracy (up to 86.0%), but had high computational cost (…

    Submitted 22 May, 2020; originally announced May 2020.

  15. arXiv:2004.12554  [pdf, other]

    cs.LG cs.AI cs.CE stat.ML

    Forecasting in Non-stationary Environments with Fuzzy Time Series

    Authors: Petrônio Cândido de Lima e Silva, Carlos Alberto Severiano Junior, Marcos Antonio Alves, Rodrigo Silva, Miri Weiss Cohen, Frederico Gadelha Guimarães

    Abstract: In this paper we introduce a Non-Stationary Fuzzy Time Series (NSFTS) method with time varying parameters adapted from the distribution of the data. In this approach, we employ Non-Stationary Fuzzy Sets, in which perturbation functions are used to adapt the membership function parameters in the knowledge base in response to statistical changes in the time series. The proposed method is capable of…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: 21 pages, 7 figures, submitted to Applied Soft Computing

  16. arXiv:1910.12369  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Sound Event Recognition in a Smart City Surveillance Context

    Authors: Tito Spadini, Dimitri Leandro de Oliveira Silva, Ricardo Suyama

    Abstract: Due to the growing demand for improving surveillance capabilities in smart cities, systems need to be developed to provide better monitoring capabilities to competent authorities, agencies responsible for strategic resource management, and emergency call centers. This work assumes that, as a complementary monitoring solution, the use of a system capable of detecting the occurrence of sound events,…

    Submitted 1 February, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

  17. arXiv:1910.12263  [pdf, other]

    stat.ML cs.LG

    Prior Specification for Bayesian Matrix Factorization via Prior Predictive Matching

    Authors: Eliezer de Souza da Silva, Tomasz Kuśmierczyk, Marcelo Hartmann, Arto Klami

    Abstract: The behavior of many Bayesian models used in machine learning critically depends on the choice of prior distributions, controlled by some hyperparameters that are typically selected by Bayesian optimization or cross-validation. This requires repeated, costly, posterior inference. We provide an alternative for selecting good priors without carrying out posterior inference, building on the prior pre…

    Submitted 30 September, 2022; v1 submitted 27 October, 2019; originally announced October 2019.

    Journal ref: Journal of Machine Learning Research 24 (2023) 1-51

  18. arXiv:1909.01757  [pdf, other]

    cs.LG stat.ML

    Augmented Memory Networks for Streaming-Based Active One-Shot Learning

    Authors: Andreas Kvistad, Massimiliano Ruocco, Eliezer de Souza da Silva, Erlend Aune

    Abstract: One of the major challenges in training deep architectures for predictive tasks is the scarcity and cost of labeled training data. Active Learning (AL) is one way of addressing this challenge. In stream-based AL, observations are continuously made available to the learner, which has to decide whether to request a label or to make a prediction. The goal is to reduce the request rate while at the sam…

    Submitted 4 September, 2019; originally announced September 2019.

  19. arXiv:1807.01619  [pdf]

    cs.LG stat.ML

    Ensemble learning with Conformal Predictors: Targeting credible predictions of conversion from Mild Cognitive Impairment to Alzheimer's Disease

    Authors: Telma Pereira, Sandra Cardoso, Dina Silva, Manuela Guerreiro, Alexandre de Mendonça, Sara C. Madeira

    Abstract: Most machine learning classifiers give predictions for new examples accurately, yet without indicating how trustworthy predictions are. In the medical domain, this hampers their integration in decision support systems, which could be useful in the clinical practice. We use a supervised learning approach that combines Ensemble learning with Conformal Predictors to predict conversion from Mild Cogni…

    Submitted 5 July, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

    Comments: 4 pages, 1 figure, accepted for presentation at the KDD Workshop on Machine Learning for Medicine and Healthcare, London, UK, August 2018

    MSC Class: 68-06

  20. arXiv:1709.04851  [pdf, other]

    stat.ME

    Factor Analysis of Interval Data

    Authors: Paula Cheira, Paula Brito, A. Pedro Duarte Silva

    Abstract: This paper presents a factor analysis model for symbolic data, focusing on the particular case of interval-valued variables. The proposed method describes the correlation structure among the measured interval-valued variables in terms of a few underlying, but unobservable, uncorrelated interval-valued variables, called common factors. Uniform and Triangular distributions are considered wi…

    Submitted 14 September, 2017; originally announced September 2017.

  21. arXiv:1705.07941  [pdf, ps, other]

    stat.ME

    Prediction Measures in Nonlinear Beta Regression Models

    Authors: Patrícia Leone Espinheira, Luana C. Meireles da Silva, Alisson de Oliveira Silva, Raydonal Ospina

    Abstract: Nonlinear models are frequently applied to determine the optimal supply of natural gas to a given residential unit based on economical and technical factors, or used to fit biochemical and pharmaceutical assay nonlinear data. In this article we propose PRESS statistics and prediction coefficients for a class of nonlinear beta regression models, namely $P^2$ statistics. We aim at using both prediction…

    Submitted 22 May, 2017; originally announced May 2017.

    Comments: 10 figures, 20 pages. Submitted to Journal of the Royal Statistical Society, Series C (Applied Statistics)

  22. arXiv:1503.02577  [pdf, ps, other]

    cs.DM cs.DS eess.SP stat.ME

    New Algorithms for Computing a Single Component of the Discrete Fourier Transform

    Authors: G. Jerônimo da Silva Jr., R. M. Campello de Souza, H. M. de Oliveira

    Abstract: This paper introduces the theory and hardware implementation of two new algorithms for computing a single component of the discrete Fourier transform. In terms of multiplicative complexity, both algorithms are more efficient, in general, than the well known Goertzel Algorithm.

    Submitted 9 March, 2015; originally announced March 2015.

    Comments: 4 pages, 3 figures, 1 table. In: 10th International Symposium on Communication Theory and Applications, Ambleside, UK

  23. arXiv:1501.04830  [pdf, ps, other]

    stat.AP

    Prediction Measures in Beta Regression Models

    Authors: Patrícia L. Espinheira, Luana Cecília Meireles da Silva, Alisson de Oliveira Silva

    Abstract: We consider the issue of constructing PRESS statistics and coefficients of prediction for a class of beta regression models. We aim at displaying measures of predictive power of the model regardless of goodness-of-fit. Monte Carlo simulation results on the finite sample behavior of such measures are provided. We also present an application that relates to the distribution of natural gas for home usage…

    Submitted 20 January, 2015; originally announced January 2015.

    Comments: 15 pages, 1 figure

  24. arXiv:1309.6148  [pdf]

    cs.DM stat.ME

    An Integer Programming Formulation Applied to Optimum Allocation in Multivariate Stratified Sampling

    Authors: Jose Andre de Moura Brito, Gustavo Silva Semaan, Pedro Luis do Nascimento Silva, Nelson Maculan

    Abstract: The problem of optimal allocation of samples in surveys using a stratified sampling plan was first discussed by Neyman in 1934. Since then, many researchers have studied the problem of the sample allocation in multivariate surveys and several methods have been proposed. Basically, these methods are divided into two classes: the first involves forming a weighted average of the stratum variances and f…

    Submitted 24 September, 2013; originally announced September 2013.

  25. arXiv:1301.0974  [pdf, ps, other]

    q-bio.BM stat.AP

    Hierarchical Nystrom Methods for Constructing Markov State Models for Conformational Dynamics

    Authors: Yuan Yao, Raymond Z. Cui, Gregory R. Bowman, Daniel Silva, Jian Sun, Xuhui Huang

    Abstract: Markov state models (MSMs) have become a popular approach for investigating the conformational dynamics of proteins and other biomolecules. MSMs are typically built from numerous molecular dynamics simulations by dividing the sampled configurations into a large number of microstates based on geometric criteria. The resulting microstate model can then be coarse-grained into a more understandable ma…

    Submitted 5 January, 2013; originally announced January 2013.

  26. arXiv:1208.2242  [pdf, other]

    physics.med-ph stat.AP

    Dynamics of Snoring Sounds and Its Connection with Obstructive Sleep Apnea

    Authors: Adriano M. Alencar, Diego Greatti Vaz da Silva, Carolina Beatriz Oliveira, Andre P. Vieira, Henrique T. Moriya, Geraldo Lorenzi-Filho

    Abstract: Snoring is extremely common in the general population and when irregular may indicate the presence of obstructive sleep apnea. We analyze the overnight sequence of wave packets (the snore sound) recorded during full polysomnography in patients referred to the sleep laboratory due to suspected obstructive sleep apnea. We hypothesize that irregular snore, with duration in the range between 10…

    Submitted 10 August, 2012; originally announced August 2012.

  27. arXiv:1203.3082  [pdf, other]

    stat.AP q-bio.GN

    A novel algorithm for simultaneous SNP selection in high-dimensional genome-wide association studies

    Authors: Verena Zuber, A. Pedro Duarte Silva, Korbinian Strimmer

    Abstract: Background: Identification of causal SNPs in most genome wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account. Hence, increasingly modern computationally expensive regression methods are employed for SNP selection that consider all markers simultaneously and thus incorporate…

    Submitted 25 October, 2012; v1 submitted 14 March, 2012; originally announced March 2012.

    Comments: 15 pages, 2 figures, 4 tables

    Journal ref: BMC Bioinformatics 2012, Vol. 13, 284