
Showing 1–19 of 19 results for author: de Carvalho, M

  1. arXiv:2505.13370  [pdf, other]

    stat.ME stat.ML

    A Kolmogorov-Arnold Neural Model for Cascading Extremes

    Authors: Miguel de Carvalho, Clemente Ferrer, Ronny Vallejos

    Abstract: This paper addresses the growing concern of cascading extreme events, such as an extreme earthquake followed by a tsunami, by presenting a novel method for risk assessment focused on these domino effects. The proposed approach develops an extreme value theory framework within a Kolmogorov-Arnold network (KAN) to estimate the probability of one extreme event triggering another, conditionally on a f…

    Submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2505.13188  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    When a Reinforcement Learning Agent Encounters Unknown Unknowns

    Authors: Juntian Zhu, Miguel de Carvalho, Zhouwang Yang, Fengxiang He

    Abstract: An AI agent might surprisingly find she has reached an unknown state which she has never been aware of -- an unknown unknown. We mathematically ground this scenario in reinforcement learning: an agent, after taking an action calculated from value functions $Q$ and $V$ defined on the {\it {aware domain}}, reaches a state out of the domain. To enable the agent to handle this scenario, we propose an…

    Submitted 19 May, 2025; originally announced May 2025.

  3. arXiv:2504.12288  [pdf, other]

    stat.ME

    The underlap coefficient as a measure of a biomarker's discriminatory ability

    Authors: Zhaoxi Zhang, Vanda Inacio, Miguel de Carvalho

    Abstract: The first step in evaluating a potential diagnostic biomarker is to examine the variation in its values across different disease groups. In a three-class disease setting, the volume under the receiver operating characteristic surface and the three-class Youden index are commonly used summary measures of a biomarker's discriminatory ability. However, these measures rely on a stochastic ordering ass…

    Submitted 17 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

  4. arXiv:2404.08480  [pdf, other]

    cs.LG cs.CL stat.CO

    Decoding AI: The inside story of data analysis in ChatGPT

    Authors: Ozan Evkaya, Miguel de Carvalho

    Abstract: As a result of recent advancements in generative AI, the field of Data Science is prone to various changes. This review critically examines the Data Analysis (DA) capabilities of ChatGPT assessing its performance across a wide range of tasks. While DA provides researchers and practitioners with unprecedented analytical capabilities, it is far from being perfect, and it is important to recognize an…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 15 pages with figures and appendix

  5. arXiv:2212.04009  [pdf, other]

    stat.ML cs.LG stat.ME

    A parallelizable model-based approach for marginal and multivariate clustering

    Authors: Miguel de Carvalho, Gabriel Martos Venturini, Andrej Svetlošák

    Abstract: This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering, while attempting to mitigate some of its pitfalls. First, we note that standard model-based clustering likely leads to the same number of clusters per margin, which seems a rather artificial assumption for a variety of datasets. We tackle this issue by specifying a finite mixture model per mar…

    Submitted 7 December, 2022; originally announced December 2022.

  6. arXiv:2209.05569  [pdf, other]

    stat.ME stat.AP stat.ML

    Uncovering Regions of Maximum Dissimilarity on Random Process Data

    Authors: Miguel de Carvalho, Gabriel Martos Venturini

    Abstract: The comparison of local characteristics of two random processes can shed light on periods of time or space at which the processes differ the most. This paper proposes a method that learns about regions with a certain volume, where the marginal attributes of two processes are less similar. The proposed methods are devised in full generality for the setting where the data of interest are themselves…

    Submitted 12 September, 2022; originally announced September 2022.

    ACM Class: G.3

  7. arXiv:2011.05067  [pdf, other]

    stat.ME q-fin.ST

    Tracking change-points in multivariate extremes

    Authors: Miguel de Carvalho, Manuele Leonelli, Alex Rossi

    Abstract: In this paper we devise a statistical method for tracking and modeling change-points on the dependence structure of multivariate extremes. The methods are motivated by and illustrated on a case study on crypto-assets.

    Submitted 10 November, 2020; originally announced November 2020.

  8. arXiv:2011.03872  [pdf, other]

    stat.ME stat.AP

    Modeling Interval Trendlines: Symbolic Singular Spectrum Analysis for Interval Time Series

    Authors: Miguel de Carvalho, Gabriel Martos

    Abstract: In this article we propose an extension of singular spectrum analysis for interval-valued time series. The proposed methods can be used to decompose and forecast the dynamics governing a set-valued stochastic process. The resulting components on which the interval time series is decomposed can be understood as interval trendlines, cycles, or noise. Forecasting can be conducted through a linear rec…

    Submitted 7 November, 2020; originally announced November 2020.

  9. arXiv:2010.07164  [pdf, other]

    stat.ME

    An Extreme Value Bayesian Lasso for the Conditional Left and Right Tails

    Authors: Miguel de Carvalho, Soraia Pereira, Paula Pereira, Patrícia de Zea Bermudez

    Abstract: We introduce a novel regression model for the conditional left and right tail of a possibly heavy-tailed response. The proposed model can be used to learn the effect of covariates on an extreme value setting via a Lasso-type specification based on a Lagrangian restriction. Our model can be used to track if some covariates are significant for the lower values, but not for the (right) tail---and vic…

    Submitted 10 August, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

  10. arXiv:2007.06054  [pdf, other]

    stat.ME

    Robust and flexible inference for the covariate-specific ROC curve

    Authors: Vanda Inacio, Vanda M. Lourenco, Miguel de Carvalho, Richard A. Parker, Vincent Gnanapragasam

    Abstract: Diagnostic tests are of critical importance in health care and medical research. Motivated by the impact that atypical and outlying test outcomes might have on the assessment of the discriminatory ability of a diagnostic test, we develop a flexible and robust model for conducting inference about the covariate-specific receiver operating characteristic (ROC) curve that safeguards against outlying t…

    Submitted 27 July, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

  11. arXiv:1910.03434  [pdf, other]

    cs.LG stat.ML

    ATL: Autonomous Knowledge Transfer from Many Streaming Processes

    Authors: Mahardhika Pratama, Marcus de Carvalho, Renchunzi Xie, Edwin Lughofer, Jie Lu

    Abstract: Transferring knowledge across many streaming processes remains an uncharted territory in the existing literature and features unique characteristics: no labelled instance of the target domain, covariate shift of source and target domain, different period of drifts in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for…

    Submitted 19 October, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: This paper has been accepted for publication in CIKM 2019

  12. arXiv:1907.13070  [pdf, other]

    cs.LG stat.ML

    Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors

    Authors: Telma Pereira, Sofia Pires, Marta Gromicho, Susana Pinto, Mamede de Carvalho, Sara C. Madeira

    Abstract: Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by a rapid motor decline, leading to respiratory failure and subsequently to death. In this context, researchers have sought models to automatically predict disease progression to assisted ventilation in ALS patients. However, the clinical translation of such models is limited by the lack of insight 1) on the risk…

    Submitted 30 July, 2019; originally announced July 2019.

    Journal ref: KDD 2019 Workshop on Applied Data Science for Healthcare

  13. arXiv:1812.09607  [pdf, other]

    stat.ME

    Bayesian semiparametric modelling of phase-varying point processes

    Authors: Bastian Galasso, Yoav Zemel, Miguel de Carvalho

    Abstract: We propose a Bayesian semiparametric approach for registration of multiple point processes. Our approach entails modelling the mean measures of the phase-varying point processes with a Bernstein-Dirichlet prior, which induces a prior on the space of all warp functions. Theoretical results on the support of the induced priors are derived, and posterior consistency is obtained under mild conditions.…

    Submitted 11 December, 2020; v1 submitted 22 December, 2018; originally announced December 2018.

    Comments: 30 pages, 16 figures

    MSC Class: 60G55; 62G99; 62F15

  14. arXiv:1805.07622  [pdf, other]

    stat.ME

    Bayesian Bootstrap Inference for the ROC Surface

    Authors: Vanda Inacio de Carvalho, Miguel de Carvalho, Adam Branscum

    Abstract: Accurate diagnosis of disease is of great importance in clinical practice and medical research. The receiver operating characteristic (ROC) surface is a popular tool for evaluating the discriminatory ability of continuous diagnostic test outcomes when there exist three ordered disease classes (e.g., no disease, mild disease, advanced disease). We propose the Bayesian bootstrap, a fully nonparametr…

    Submitted 19 May, 2018; originally announced May 2018.

  15. arXiv:1712.09982  [pdf, other]

    stat.ME

    Affinity-based measures of medical diagnostic test accuracy

    Authors: Miguel de Carvalho, Bradley J. Barney, Garritt L. Page

    Abstract: We propose new summary measures of diagnostic test accuracy which can be used as companions to existing diagnostic accuracy measures. Conceptually, our summary measures are tantamount to the so-called Hellinger affinity and we show that they can be regarded as measures of agreement constructed from similar geometrical principles as Pearson correlation. A covariate-specific version of our summary i…

    Submitted 28 December, 2017; originally announced December 2017.

  16. arXiv:1704.08447  [pdf, other]

    stat.ME

    Regression Type Models for Extremal Dependence

    Authors: Linda Mhalla, Miguel de Carvalho, Valérie Chavez-Demoulin

    Abstract: We propose a vector generalized additive modeling framework for taking into account the effect of covariates on angular density functions in a multivariate extreme value context. The proposed methods are tailored for settings where the dependence between extreme values may change according to covariates. We devise a maximum penalized log-likelihood estimator, discuss details of the estimation proc…

    Submitted 27 November, 2017; v1 submitted 27 April, 2017; originally announced April 2017.

    Comments: 29 pages, 8 figures

  17. arXiv:1701.08994  [pdf, other]

    stat.ME

    On the geometry of Bayesian inference

    Authors: Miguel de Carvalho, Garritt L. Page, Bradley J. Barney

    Abstract: We provide a geometric interpretation to Bayesian inference that allows us to introduce a natural measure of the level of agreement between priors, likelihoods, and posteriors. The starting point for the construction of our geometry is the simple observation that the marginal likelihood can be regarded as an inner product between the prior and the likelihood. A key concept in our geometry is that…

    Submitted 23 May, 2018; v1 submitted 31 January, 2017; originally announced January 2017.

  18. Combining probability distributions: Extending the logarithmic pooling approach

    Authors: Luiz Max de Carvalho, Daniel A. M. Villela, Flavio Codeco Coelho, Leonardo Soares Bastos

    Abstract: Combining distributions is an important issue in decision theory and Bayesian inference. Logarithmic pooling is a popular method to aggregate expert opinions by using a set of weights that reflect the reliability of each information source. However, the resulting pooled distribution depends heavily on the set of weights given to each opinion/prior and thus careful consideration must be given to the ch…

    Submitted 30 December, 2020; v1 submitted 14 February, 2015; originally announced February 2015.

    Comments: Massively updated manuscript; submitted for publication

  19. arXiv:1204.3524  [pdf, other]

    stat.ME

    A Euclidean likelihood estimator for bivariate tail dependence

    Authors: Miguel de Carvalho, Boris Oumow, Johan Segers, Michał Warchoł

    Abstract: The spectral measure plays a key role in the statistical modeling of multivariate extremes. Estimation of the spectral measure is a complex issue, given the need to obey a certain moment condition. We propose a Euclidean likelihood-based estimator for the spectral measure which is simple and explicitly defined, with its expression being free of Lagrange multipliers. Our estimator is shown to have…

    Submitted 16 April, 2012; originally announced April 2012.

    Comments: 18 pages, 8 figures

    MSC Class: 62G32