FairDRL-ST: Disentangled Representation Learning for Fair Spatio-Temporal Mobility Prediction

Authors: Sichen Zhao, Wei Shao, Jeffrey Chan, Ziqi Xu, Flora Salim

Abstract: As deep spatio-temporal neural networks are increasingly utilised in urban computing contexts, the deployment of such methods can have a direct impact on users of critical urban infrastructure, such as public transport, emergency services, and traffic management systems. While many spatio-temporal methods focus on improving accuracy, fairness has recently gained attention due to growing evidence that biased predictions in spatio-temporal applications can disproportionately disadvantage certain demographic or geographic groups, thereby reinforcing existing socioeconomic inequalities and undermining the ethical deployment of AI in public services. In this paper, we propose a novel framework, FairDRL-ST, based on disentangled representation learning, to address fairness concerns in spatio-temporal prediction, with a particular focus on mobility demand forecasting. By leveraging adversarial learning and disentangled representation learning, our framework learns to separate attributes that contain sensitive information. Unlike existing methods that enforce fairness through supervised learning, which may lead to overcompensation and degraded performance, our framework achieves fairness in an unsupervised manner with minimal performance loss. We apply our framework to real-world urban mobility datasets and demonstrate its ability to close fairness gaps while delivering competitive predictive performance compared to state-of-the-art fairness-aware methods.

Submitted 10 August, 2025; originally announced August 2025.

Comments: Accepted as a Research Paper (short) at ACM SIGSPATIAL 2025. This arXiv version is the full version of the paper

arXiv:2508.06847 [pdf, ps, other]

MOCA-HESP: Meta High-dimensional Bayesian Optimization for Combinatorial and Mixed Spaces via Hyper-ellipsoid Partitioning

Authors: Lam Ngo, Huong Ha, Jeffrey Chan, Hongyu Zhang

Abstract: High-dimensional Bayesian Optimization (BO) has attracted significant attention in recent research. However, existing methods have mainly focused on optimizing in continuous domains, while combinatorial (ordinal and categorical) and mixed domains remain challenging. In this paper, we first propose MOCA-HESP, a novel high-dimensional BO method for combinatorial and mixed variables. The key idea is to leverage the hyper-ellipsoid space partitioning (HESP) technique with different categorical encoders to work with high-dimensional, combinatorial and mixed spaces, while adaptively selecting the optimal encoders for HESP using a multi-armed bandit technique. Our method, MOCA-HESP, is designed as a meta-algorithm such that it can incorporate other combinatorial and mixed BO optimizers to further enhance the optimizers' performance. Finally, we develop three practical BO methods by integrating MOCA-HESP with state-of-the-art BO optimizers for combinatorial and mixed variables: standard BO, CASMOPOLITAN, and Bounce. Our experimental results on various synthetic and real-world benchmarks show that our methods outperform existing baselines. Our code implementation can be found at https://github.com/LamNgo1/moca-hesp

Submitted 25 August, 2025; v1 submitted 9 August, 2025; originally announced August 2025.

Comments: Published at the 28th European Conference on Artificial Intelligence (ECAI-2025)

arXiv:2506.01422 [pdf, ps, other]

Large Bayesian VARs for Binary and Censored Variables

Authors: Joshua C. C. Chan, Michael Pfarrhofer

Abstract: We extend the standard VAR to jointly model the dynamics of binary, censored and continuous variables, and develop an efficient estimation approach that scales well to high-dimensional settings. In an out-of-sample forecasting exercise, we show that the proposed VARs forecast recessions and short-term interest rates well. We demonstrate the utility of the proposed framework using a wide range of empirical applications, including conditional forecasting and a structural analysis that examines the dynamic effects of a financial shock on recession probabilities.

Submitted 2 June, 2025; originally announced June 2025.

Comments: JEL: C34, C35, C53, E32, E47; keywords: macroeconomic forecasting, effective lower bound, financial shocks, shadow rate, recession

arXiv:2506.01044 [pdf, ps, other]

A novel stratified sampler with unbalanced refinement for network reliability assessment

Authors: Jianpeng Chan, Iason Papaioannou, Daniel Straub

Abstract: We investigate stratified sampling in the context of network reliability assessment. We propose an unbalanced stratum refinement procedure, which operates on a partition of network components into clusters and the number of failed components within each cluster. The size of each refined stratum and the associated conditional failure probability, collectively termed failure signatures, can be calculated and estimated using the conditional Bernoulli model. The estimator is further improved by determining the minimum number of component failures $i^*$ to reach system failure and then by considering only strata with at least $i^*$ failed components. We propose a heuristic but practicable approximation of the optimal sample size for all strata, assuming a coherent network performance function. The efficiency of the proposed stratified sampler with unbalanced refinement (SSuR) is demonstrated through two network reliability problems.

Submitted 1 June, 2025; originally announced June 2025.

arXiv:2412.12918 [pdf, other]

BOIDS: High-dimensional Bayesian Optimization via Incumbent-guided Direction Lines and Subspace Embeddings

Authors: Lam Ngo, Huong Ha, Jeffrey Chan, Hongyu Zhang

Abstract: When it comes to expensive black-box optimization problems, Bayesian Optimization (BO) is a well-known and powerful solution. Many real-world applications involve a large number of dimensions, hence scaling BO to high dimensions is of much interest. However, state-of-the-art high-dimensional BO methods still suffer from the curse of dimensionality, highlighting the need for further improvements. In this work, we introduce BOIDS, a novel high-dimensional BO algorithm that guides optimization by a sequence of one-dimensional direction lines using a novel tailored line-based optimization procedure. To improve the efficiency, we also propose an adaptive selection technique to identify the optimal lines for each round of line-based optimization. Additionally, we incorporate a subspace embedding technique for better scaling to high-dimensional spaces. We further provide theoretical analysis of our proposed method to analyze its convergence property. Our extensive experimental results show that BOIDS outperforms state-of-the-art baselines on various synthetic and real-world benchmark problems.

Submitted 17 December, 2024; originally announced December 2024.

Comments: Published at AAAI Conference on Artificial Intelligence, 2025

arXiv:2410.09741 [pdf, other]

Real-time Fuel Leakage Detection via Online Change Point Detection

Authors: Ruimin Chu, Li Chik, Yiliao Song, Jeffrey Chan, Xiaodong Li

Abstract: Early detection of fuel leakage at service stations with underground petroleum storage systems is a crucial task to prevent catastrophic hazards. Current data-driven fuel leakage detection methods employ offline statistical inventory reconciliation, leading to significant detection delays. Consequently, this can result in substantial financial loss and environmental impact on the surrounding community. In this paper, we propose a novel framework called Memory-based Online Change Point Detection (MOCPD) which operates in near real-time, enabling early detection of fuel leakage. MOCPD maintains a collection of representative historical data within a size-constrained memory, along with an adaptively computed threshold. Leaks are detected when the dissimilarity between the latest data and historical memory exceeds the current threshold. An update phase is incorporated in MOCPD to ensure diversity among historical samples in the memory. With this design, MOCPD is more robust and achieves a better recall rate while maintaining a reasonable precision score. We have conducted a variety of experiments comparing MOCPD to commonly used online change point detection (CPD) baselines on real-world fuel variance data with induced leakages, actual fuel leakage data and benchmark CPD datasets. Overall, MOCPD consistently outperforms the baseline methods in terms of detection accuracy, demonstrating its applicability to fuel leakage detection and CPD problems.

Submitted 13 October, 2024; originally announced October 2024.

arXiv:2407.13925 [pdf, other]

EggNet: An Evolving Graph-based Graph Attention Network for Particle Track Reconstruction

Authors: Paolo Calafiura, Jay Chan, Loic Delabrouille, Brandon Wang

Abstract: Track reconstruction is a crucial task in particle experiments and is traditionally very computationally expensive due to its combinatorial nature. Recently, graph neural networks (GNNs) have emerged as a promising approach that can improve scalability. Most of these GNN-based methods, including the edge classification (EC) and the object condensation (OC) approach, require an input graph that needs to be constructed beforehand. In this work, we consider a one-shot OC approach that reconstructs particle tracks directly from a set of hits (point cloud) by recursively applying graph attention networks with an evolving graph structure. This approach iteratively updates the graphs and can better facilitate the message passing across each graph. Preliminary studies on the TrackML dataset show better track performance compared to the methods that require a fixed input graph.

Submitted 18 July, 2024; originally announced July 2024.

Comments: 7 pages, 5 figures

arXiv:2402.03104 [pdf, other]

High-dimensional Bayesian Optimization via Covariance Matrix Adaptation Strategy

Authors: Lam Ngo, Huong Ha, Jeffrey Chan, Vu Nguyen, Hongyu Zhang

Abstract: Bayesian Optimization (BO) is an effective method for finding the global optimum of expensive black-box functions. However, it is well known that applying BO to high-dimensional optimization problems is challenging. To address this issue, a promising solution is to use a local search strategy that partitions the search domain into local regions with high likelihood of containing the global optimum, and then use BO to optimize the objective function within these regions. In this paper, we propose a novel technique for defining the local regions using the Covariance Matrix Adaptation (CMA) strategy. Specifically, we use CMA to learn a search distribution that can estimate the probabilities of data points being the global optimum of the objective function. Based on this search distribution, we then define the local regions consisting of data points with high probabilities of being the global optimum. Our approach serves as a meta-algorithm as it can incorporate existing black-box BO optimizers, such as BO, TuRBO, and BAxUS, to find the global optimum of the objective function within our derived local regions. We evaluate our proposed method on various benchmark synthetic and real-world problems. The results demonstrate that our method outperforms existing state-of-the-art techniques.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 31 pages, 17 figures

Journal ref: Transactions on Machine Learning Research 2024

arXiv:2312.02401 [pdf, other]

Enhancing Content Moderation with Culturally-Aware Models

Authors: Alex J. Chan, José Luis Redondo García, Fabrizio Silvestri, Colm O'Donnell, Konstantina Palla

Abstract: Content moderation on a global scale must navigate a complex array of local cultural distinctions, which can hinder effective enforcement. While global policies aim for consistency and broad applicability, they often miss the subtleties of regional language interpretation, cultural beliefs, and local legislation. This work introduces a flexible framework that enhances foundation language models with cultural knowledge. Our approach involves fine-tuning encoder-decoder models on media-diet data to capture cultural nuances, and applies a continued training regime to effectively integrate these models into a content moderation pipeline. We evaluate this framework in a case study of an online podcast platform with content spanning various regions. The results show that our culturally adapted models improve the accuracy of local violation detection and offer explanations that align more closely with regional cultural norms. Our findings reinforce the need for an adaptable content moderation approach that remains flexible in response to the diverse cultural landscapes it operates in and represents a step towards a more equitable and culturally sensitive framework for content moderation, demonstrating what is achievable in this domain.

Submitted 5 November, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: 7 pages, 7 Figures. Supplementary material

arXiv:2310.14438 [pdf, ps, other]

BVARs and Stochastic Volatility

Authors: Joshua Chan

Abstract: Bayesian vector autoregressions (BVARs) are the workhorse in macroeconomic forecasting. Research in the last decade has established the importance of allowing time-varying volatility to capture both secular and cyclical variations in macroeconomic uncertainty. This recognition, together with the growing availability of large datasets, has propelled a surge in recent research in building stochastic volatility models suitable for large BVARs. Some of these new models are also equipped with additional features that are especially desirable for large systems, such as order invariance -- i.e., estimates are not dependent on how the variables are ordered in the BVAR -- and robustness against COVID-19 outliers. Estimation of these large, flexible models is made possible by the recently developed equation-by-equation approach that drastically reduces the computational cost of estimating large systems. Despite these recent advances, there remains much ongoing work, such as the development of parsimonious approaches for time-varying coefficients and other types of nonlinearities in large BVARs.

Submitted 22 October, 2023; originally announced October 2023.

arXiv:2310.06808 [pdf, ps, other]

doi 10.1002/bimj.201700199

Odds are the sign is right

Authors: Brian Knaeble, Julian Chan

Abstract: This article introduces a new condition based on odds ratios for sensitivity analysis. The analysis involves the average effect of a treatment or exposure on a response or outcome with estimates adjusted for and conditional on a single, unmeasured, dichotomous covariate. Results of statistical simulations are displayed to show that the odds ratio condition is as reliable as other commonly used conditions for sensitivity analysis. Other conditions utilize quantities reflective of a mediating covariate. The odds ratio condition can be applied when the covariate is a confounding variable. As an example application we use the odds ratio condition to analyze and interpret a positive association observed between Zika virus infection and birth defects.

Submitted 10 October, 2023; originally announced October 2023.

Journal ref: Biometrical Journal, 60(6) 1164-1171 (2018)

arXiv:2309.12490 [pdf, other]

Bayesian improved cross entropy method with categorical mixture models

Authors: Jianpeng Chan, Iason Papaioannou, Daniel Straub

Abstract: We employ the Bayesian improved cross entropy (BiCE) method for rare event estimation in static networks and choose the categorical mixture as the parametric family to capture the dependence among network components. At each iteration of the BiCE method, the mixture parameters are updated through the weighted maximum a posteriori (MAP) estimate, which mitigates the overfitting issue of the standard improved cross entropy (iCE) method through a novel balanced prior, and we propose a generalized version of the expectation-maximization (EM) algorithm to approximate this weighted MAP estimate. The resulting importance sampling distribution is proved to be unbiased. For choosing a proper number of components $K$ in the mixture, we compute the Bayesian information criterion (BIC) of each candidate $K$ as a by-product of the generalized EM algorithm. The performance of the proposed method is investigated through a simple illustration, a benchmark study, and a practical application. In all these numerical examples, the BiCE method results in an efficient and accurate estimator that significantly outperforms the standard iCE method and the BiCE method with the independent categorical distribution.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2302.05390 [pdf, other]

doi 10.1103/PhysRevD.108.016002

Unbinned Profiled Unfolding

Authors: Jay Chan, Benjamin Nachman

Abstract: Unfolding is an important procedure in particle physics experiments which corrects for detector effects and provides differential cross section measurements that can be used for a number of downstream tasks, such as extracting fundamental physics parameters. Traditionally, unfolding is done by discretizing the target phase space into a finite number of bins and is limited in the number of unfolded variables. Recently, there have been a number of proposals to perform unbinned unfolding with machine learning. However, none of these methods (like most unfolding methods) allow for simultaneously constraining (profiling) nuisance parameters. We propose a new machine learning-based unfolding method that results in an unbinned differential cross section and can profile nuisance parameters. The machine learning loss function is the full likelihood function, based on binned inputs at detector-level. We first demonstrate the method with simple Gaussian examples and then show the impact on a simulated Higgs boson cross section measurement.

Submitted 7 July, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

Comments: Fixed a reference

arXiv:2302.03172 [pdf, ps, other]

High-Dimensional Conditionally Gaussian State Space Models with Missing Data

Authors: Joshua C. C. Chan, Aubrey Poon, Dan Zhu

Abstract: We develop an efficient sampling approach for handling complex missing data patterns and a large number of missing observations in conditionally Gaussian state space models. Two important examples are dynamic factor models with unbalanced datasets and large Bayesian VARs with variables in multiple frequencies. A key insight underlying the proposed approach is that the joint distribution of the missing data conditional on the observed data is Gaussian. Moreover, the inverse covariance or precision matrix of this conditional distribution is sparse, and this special structure can be exploited to substantially speed up computations. We illustrate the methodology using two empirical applications. The first application combines quarterly, monthly and weekly data using a large Bayesian VAR to produce weekly GDP estimates. In the second application, we extract latent factors from unbalanced datasets involving over a hundred monthly variables via a dynamic factor model with stochastic volatility.

Submitted 6 February, 2023; originally announced February 2023.
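
Illustration (not from the paper): the abstract above rests on the fact that, for a jointly Gaussian vector, the missing entries conditional on the observed ones are again Gaussian, with a precision matrix equal to the corresponding block of the joint precision matrix. The sketch below shows that textbook identity on a small tridiagonal example. The function names, the toy matrix and the use of dense NumPy routines are assumptions made for illustration only; the authors' sampler and its sparse-matrix implementation are not reproduced here.

```python
import numpy as np

def conditional_from_precision(mu, K, missing, x_obs):
    """Mean and precision of x[missing] given the other entries of x.

    x ~ N(mu, K^{-1}); x_obs holds the observed entries in increasing index order.
    """
    missing = np.asarray(missing)
    observed = np.setdiff1d(np.arange(len(mu)), missing)
    K_mm = K[np.ix_(missing, missing)]
    K_mo = K[np.ix_(missing, observed)]
    # Solve K_mm * delta = K_mo (x_o - mu_o) instead of inverting K_mm.
    delta = np.linalg.solve(K_mm, K_mo @ (x_obs - mu[observed]))
    return mu[missing] - delta, K_mm

def sample_from_precision(mean, K, rng):
    """Draw from N(mean, K^{-1}) using the Cholesky factor of the precision."""
    L = np.linalg.cholesky(K)                # K = L L^T
    z = rng.standard_normal(len(mean))
    return mean + np.linalg.solve(L.T, z)    # covariance is (L L^T)^{-1} = K^{-1}

# Toy joint precision: tridiagonal, hence sparse (hypothetical numbers).
n = 5
K = 2.0 * np.eye(n) - 0.9 * (np.eye(n, k=1) + np.eye(n, k=-1))
mu = np.arange(n, dtype=float)
missing = [1, 3]
x_obs = np.array([0.5, 1.5, 4.2])            # values at indices 0, 2, 4

cond_mean, cond_prec = conditional_from_precision(mu, K, missing, x_obs)
draw = sample_from_precision(cond_mean, cond_prec, np.random.default_rng(0))
```

In a large model the same two steps would typically use banded or sparse Cholesky routines rather than the dense calls shown here, which is where the speed-up described in the abstract would come from.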

arXiv:2211.09542 [pdf, other]

Bayesian improved cross entropy method for network reliability assessment

Authors: Jianpeng Chan, Iason Papaioannou, Daniel Straub

Abstract: We propose a modification of the improved cross entropy (iCE) method to enhance its performance for network reliability assessment. The iCE method performs a transition from the nominal density to the optimal importance sampling (IS) density via a parametric distribution model whose cross entropy with the optimal IS is minimized. The efficiency and accuracy of the iCE method are largely influenced by the choice of the parametric model. In the context of reliability of systems with independent multi-state components, the obvious choice of the parametric family is the categorical distribution. When updating this distribution model with standard iCE, the probability assigned to a certain category often converges to 0 due to lack of occurrence of samples from this category during the adaptive sampling process, resulting in a poor IS estimator with a strong negative bias. To circumvent this issue, we propose an algorithm termed Bayesian improved cross entropy method (BiCE). Thereby, the posterior predictive distribution is employed to update the parametric model instead of the weighted maximum likelihood estimation approach employed in the original iCE method. A set of numerical examples illustrates the efficiency and accuracy of the proposed method.

Submitted 17 November, 2022; originally announced November 2022.
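
Illustration (not from the paper): the zero-probability problem described in the abstract above can be seen in a few lines. The sketch compares a weighted maximum likelihood update of a categorical distribution with a posterior predictive update. The symmetric Dirichlet prior, the `alpha` parameter, the treatment of importance weights as fractional counts, and all function names are assumptions chosen for illustration; the prior and weighting actually used in BiCE may differ.

```python
import numpy as np

def weighted_mle(categories, weights, n_categories):
    """Weighted maximum likelihood estimate of categorical probabilities."""
    totals = np.bincount(categories, weights=weights, minlength=n_categories)
    return totals / totals.sum()

def posterior_predictive(categories, weights, n_categories, alpha=1.0):
    """Posterior predictive under an assumed symmetric Dirichlet(alpha) prior,
    treating the weights as fractional counts."""
    totals = np.bincount(categories, weights=weights, minlength=n_categories)
    return (totals + alpha) / (totals.sum() + alpha * n_categories)

# Hypothetical weighted samples in which category 2 never occurs.
samples = np.array([0, 0, 1, 0, 1, 0])
weights = np.array([0.5, 1.0, 2.0, 0.5, 1.0, 1.0])

p_mle = weighted_mle(samples, weights, n_categories=3)
p_bayes = posterior_predictive(samples, weights, n_categories=3)
```

Here the weighted maximum likelihood update assigns probability exactly zero to the unobserved category, so later sampling rounds can never visit it, while the posterior predictive update keeps a small positive mass on it.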

arXiv:2211.06138 [pdf, other]

Practical Approaches for Fair Learning with Multitype and Multivariate Sensitive Attributes

Authors: Tennison Liu, Alex J. Chan, Boris van Breugel, Mihaela van der Schaar

Abstract: It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences. Fair ML has largely focused on the protection of single attributes in the simpler setting where both attributes and target outcomes are binary. However, the practical application in many a real-world problem entails the simultaneous protection of multiple sensitive attributes, which are often not simply binary, but continuous or categorical. To address this more challenging task, we introduce FairCOCCO, a fairness measure built on cross-covariance operators on reproducing kernel Hilbert spaces. This leads to two practical tools: first, the FairCOCCO Score, a normalised metric that can quantify fairness in settings with single or multiple sensitive attributes of arbitrary type; and second, a subsequent regularisation term that can be incorporated into arbitrary learning objectives to obtain fair predictors. These contributions address crucial gaps in the algorithmic fairness literature, and we empirically demonstrate consistent improvements against state-of-the-art techniques in balancing predictive power and fairness on real-world datasets.

Submitted 11 November, 2022; originally announced November 2022.

arXiv:2208.13255 [pdf, ps, other]

Comparing Stochastic Volatility Specifications for Large Bayesian VARs

Authors: Joshua C. C. Chan

Abstract: Large Bayesian vector autoregressions with various forms of stochastic volatility have become increasingly popular in empirical macroeconomics. One main difficulty for practitioners is to choose the most suitable stochastic volatility specification for their particular application. We develop Bayesian model comparison methods -- based on marginal likelihood estimators that combine conditional Monte Carlo and adaptive importance sampling -- to choose among a variety of stochastic volatility specifications. The proposed methods can also be used to select an appropriate shrinkage prior on the VAR coefficients, which is a critical component for avoiding over-fitting in high-dimensional settings. Using US quarterly data of different dimensions, we find that both the Cholesky stochastic volatility and factor stochastic volatility outperform the common stochastic volatility specification. Their superior performance, however, can mostly be attributed to the more flexible priors that accommodate cross-variable shrinkage.

Submitted 28 August, 2022; originally announced August 2022.

arXiv:2207.03988 [pdf, ps, other]

Large Bayesian VARs with Factor Stochastic Volatility: Identification, Order Invariance and Structural Analysis

Authors: Joshua Chan, Eric Eisenstat, Xuewen Yu

Abstract: Vector autoregressions (VARs) with multivariate stochastic volatility are widely used for structural analysis. Often the structural model identified through economically meaningful restrictions--e.g., sign restrictions--is supposed to be independent of how the dependent variables are ordered. But since the reduced-form model is not order invariant, results from the structural analysis depend on the order of the variables. We consider a VAR based on the factor stochastic volatility that is constructed to be order invariant. We show that the presence of multivariate stochastic volatility allows for statistical identification of the model. We further prove that, with a suitable set of sign restrictions, the corresponding structural model is point-identified. An additional appeal of the proposed approach is that it can easily handle a large number of dependent variables as well as sign restrictions. We demonstrate the methodology through a structural analysis in which we use a 20-variable VAR with sign restrictions to identify 5 structural shocks.

Submitted 8 July, 2022; originally announced July 2022.

arXiv:2201.07303 [pdf, ps, other]

Large Hybrid Time-Varying Parameter VARs

Authors: Joshua C. C. Chan

Abstract: Time-varying parameter VARs with stochastic volatility are routinely used for structural analysis and forecasting in settings involving a few endogenous variables. Applying these models to high-dimensional datasets has proved to be challenging due to intensive computations and over-parameterization concerns. We develop an efficient Bayesian sparsification method for a class of models we call hybrid TVP-VARs--VARs with time-varying parameters in some equations but constant coefficients in others. Specifically, for each equation, the new method automatically decides whether the VAR coefficients and contemporaneous relations among variables are constant or time-varying. Using US datasets of various dimensions, we find evidence that the parameters in some, but not all, equations are time varying. The large hybrid TVP-VAR also forecasts better than many standard benchmarks.

Submitted 16 June, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

arXiv:2112.11315 [pdf, ps, other]

Efficient Estimation of State-Space Mixed-Frequency VARs: A Precision-Based Approach

Authors: Joshua C. C. Chan, Aubrey Poon, Dan Zhu

Abstract: State-space mixed-frequency vector autoregressions are now widely used for nowcasting. Despite their popularity, estimating such models can be computationally intensive, especially for large systems with stochastic volatility. To tackle the computational challenges, we propose two novel precision-based samplers to draw the missing observations of the low-frequency variables in these models, building on recent advances in the band and sparse matrix algorithms for state-space models. We show via a simulation study that the proposed methods are more numerically accurate and computationally efficient compared to standard Kalman-filter based methods. We demonstrate how the proposed method can be applied in two empirical macroeconomic applications: estimating the monthly output gap and studying the response of GDP to a monetary policy shock at the monthly frequency. Results from these two empirical applications highlight the importance of incorporating high-frequency indicators in macroeconomic models.

Submitted 21 December, 2021; originally announced December 2021.

arXiv:2111.07225 [pdf, ps, other]

Large Order-Invariant Bayesian VARs with Stochastic Volatility

Authors: Joshua C. C. Chan, Gary Koop, Xuewen Yu

Abstract: Many popular specifications for Vector Autoregressions (VARs) with multivariate stochastic volatility are not invariant to the way the variables are ordered due to the use of a Cholesky decomposition for the error covariance matrix. We show that the order invariance problem in existing approaches is likely to become more serious in large VARs. We propose the use of a specification which avoids the use of this Cholesky decomposition. We show that the presence of multivariate stochastic volatility allows for identification of the proposed model and prove that it is invariant to ordering. We develop a Markov Chain Monte Carlo algorithm which allows for Bayesian estimation and prediction. In exercises involving artificial and real macroeconomic data, we demonstrate that the choice of variable ordering can have non-negligible effects on empirical results. In a macroeconomic forecasting exercise involving VARs with 20 variables we find that our order-invariant approach leads to the best forecasts and that some choices of variable ordering can lead to poor forecasts using a conventional, non-order invariant, approach.

Submitted 13 November, 2021; originally announced November 2021.

arXiv:2111.07170 [pdf, ps, other]

Asymmetric Conjugate Priors for Large Bayesian VARs

Authors: Joshua C. C. Chan

Abstract: Large Bayesian VARs are now widely used in empirical macroeconomics. One popular shrinkage prior in this setting is the natural conjugate prior as it facilitates posterior simulation and leads to a range of useful analytical results. This is, however, at the expense of modeling flexibility, as it rules out cross-variable shrinkage -- i.e., shrinking coefficients on lags of other variables more aggressively than those on own lags. We develop a prior that has the best of both worlds: it can accommodate cross-variable shrinkage, while maintaining many useful analytical results, such as a closed-form expression of the marginal likelihood. This new prior also leads to fast posterior simulation -- for a BVAR with 100 variables and 4 lags, obtaining 10,000 posterior draws takes less than half a minute on a standard desktop. We demonstrate the usefulness of the new prior via a structural analysis using a 15-variable VAR with sign restrictions to identify 5 structural shocks.

Submitted 13 November, 2021; originally announced November 2021.

arXiv:2105.10590 [pdf, other]

Parallelizing Contextual Bandits

Authors: Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

Abstract: Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions. However, simultaneously proposing a batch of decisions, which leverages available resources for parallel experimentation, has the potential to rapidly accelerate exploration. We present a family of (parallel) contextual bandit algorithms applicable to problems with bounded eluder dimension whose regret is nearly identical to their perfectly sequential counterparts -- given access to the same total number of oracle queries -- up to a lower-order "burn-in" term. We further show these algorithms can be specialized to the class of linear reward functions where we introduce and analyze several new linear bandit algorithms which explicitly introduce diversity into their action selection. Finally, we also present an empirical evaluation of these parallel algorithms in several domains, including materials discovery and biological sequence design problems, to demonstrate the utility of parallelized bandits in practical settings.

Submitted 5 February, 2023; v1 submitted 21 May, 2021; originally announced May 2021.

arXiv:2012.00110 [pdf, other]

Representing and Denoising Wearable ECG Recordings

Authors: Jeffrey Chan, Andrew C. Miller, Emily B. Fox

Abstract: Modern wearable devices are embedded with a range of noninvasive biomarker sensors that hold promise for improving detection and treatment of disease. One such sensor is the single-lead electrocardiogram (ECG) which measures electrical signals in the heart. The benefits of the sheer volume of ECG measurements with rich longitudinal structure made possible by wearables come at the price of potentially noisier measurements compared to clinical ECGs, e.g., due to movement. In this work, we develop a statistical model to simulate a structured noise process in ECGs derived from a wearable sensor, design a beat-to-beat representation that is conducive for analyzing variation, and devise a factor analysis-based method to denoise the ECG. We study synthetic data generated using a realistic ECG simulator and a structured noise model. At varying levels of signal-to-noise, we quantitatively measure an upper bound on performance and compare estimates from linear and non-linear models. Finally, we apply our method to a set of ECGs collected by wearables in a mobile health study.

Submitted 30 November, 2020; originally announced December 2020.

Comments: ML for Mobile Health Workshop, NeurIPS 2020

arXiv:2007.12652 [pdf, other]

MurTree: Optimal Classification Trees via Dynamic Programming and Search

Authors: Emir Demirović, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, Peter J. Stuckey

Abstract: Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees.

Submitted 28 June, 2022; v1 submitted 24 July, 2020; originally announced July 2020.

Journal ref: Journal of Machine Learning Research 2022

arXiv:2006.14988 [pdf, other]

Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift

Authors: Alex J. Chan, Ahmed M. Alaa, Zhaozhi Qian, Mihaela van der Schaar

Abstract: Modern neural networks have proven to be powerful function approximators, providing state-of-the-art performance in a multitude of applications. They however fall short in their ability to quantify confidence in their predictions - this is crucial in high-stakes applications that involve critical decision-making. Bayesian neural networks (BNNs) aim at solving this problem by placing a prior distribution over the network's parameters, thereby inducing a posterior distribution that encapsulates predictive uncertainty. While existing variants of BNNs based on Monte Carlo dropout produce reliable (albeit approximate) uncertainty estimates over in-distribution data, they tend to exhibit over-confidence in predictions made on target data whose feature distribution differs from the training data, i.e., the covariate shift setup. In this paper, we develop an approximate Bayesian inference scheme based on posterior regularisation, wherein unlabelled target data are used as "pseudo-labels" of model confidence that are used to regularise the model's loss on labelled source data. We show that this approach significantly improves the accuracy of uncertainty quantification on covariate-shifted data sets, with minimal modification to the underlying model architecture. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.

Submitted 26 June, 2020; originally announced June 2020.

arXiv:2001.09404 [pdf, other]

doi 10.3390/econometrics11010008

Semi-metric portfolio optimization: a new algorithm reducing simultaneous asset shocks

Authors: Nick James, Max Menzies, Jennifer Chan

Abstract: This paper proposes a new method for financial portfolio optimization based on reducing simultaneous asset shocks across a collection of assets. This may be understood as an alternative approach to risk reduction in a portfolio based on a new mathematical quantity. First, we apply recently introduced semi-metrics between finite sets to determine the distance between time series' structural breaks. Then, we build on the classical portfolio optimization theory of Markowitz and use this distance between asset structural breaks for our penalty function, rather than portfolio variance. Our experiments are promising: on synthetic data, we show that our proposed method does indeed diversify among time series with highly similar structural breaks and enjoys advantages over existing metrics between sets. On real data, experiments illustrate that our proposed optimization method performs well relative to nine other commonly used options, producing the second-highest returns, the lowest volatility, and second-lowest drawdown. The main implication for this method in portfolio management is reducing simultaneous asset shocks and potentially sharp associated drawdowns during periods of highly similar structural breaks, such as a market crisis. Our method adds to a considerable literature of portfolio optimization techniques in econometrics and could complement these via portfolio averaging.

Submitted 8 March, 2023; v1 submitted 26 January, 2020; originally announced January 2020.

Comments: Accepted manuscript. Substantial additions since v2. Equal contribution from first two authors

Journal ref: Econometrics 11, 8 (2023)

arXiv:1911.00995 [pdf, other]

doi 10.1016/j.physd.2020.132636

Novel semi-metrics for multivariate change point analysis and anomaly detection

Authors: Nick James, Max Menzies, Lamiae Azizi, Jennifer Chan

Abstract: This paper proposes a new method for determining similarity and anomalies between time series, most practically effective in large collections of (likely related) time series, by measuring distances between structural breaks within such a collection. We introduce a class of semi-metric distance measures, which we term MJ distances. These semi-metrics provide an advantage over existing options such as the Hausdorff and Wasserstein metrics. We prove they have desirable properties, including better sensitivity to outliers, while experiments on simulated data demonstrate that they uncover similarity within collections of time series more effectively. Semi-metrics carry a potential disadvantage: without the triangle inequality, they may not satisfy a "transitivity property of closeness." We analyse this failure with proof and introduce a computational method to investigate, in which we demonstrate that our semi-metrics violate transitivity infrequently and mildly. Finally, we apply our methods to cryptocurrency and measles data, introducing a judicious application of eigenvalue analysis.

Submitted 3 July, 2020; v1 submitted 3 November, 2019; originally announced November 2019.

Comments: Accepted manuscript. Minor edits since v2. Equal contribution from first two authors

Journal ref: Physica D: Nonlinear Phenomena 412 (2020) 132636

arXiv:1908.02575 [pdf]

Alternative Blockmodelling

Authors: Oscar Correa, Jeffrey Chan, Vinh Nguyen

Abstract: Many approaches have been proposed to discover clusters within networks. The community finding field encompasses approaches which try to discover clusters where nodes are tightly related within them but loosely related with nodes of other clusters. However, a community network configuration is not the only possible latent structure in a graph. Core-periphery and hierarchical network configurations are valid structures to discover in a relational dataset. On the other hand, a network is not completely explained by only knowing the membership of each node. A high level view of the inter-cluster relationships is needed. Blockmodelling techniques deal with these two issues. Firstly, blockmodelling allows finding any network configuration besides the well-known community structure. Secondly, blockmodelling is a summary representation of a network which regards not only membership of nodes but also relations between clusters. Finally, a unique summary representation of a network is unlikely. Networks might hide more than one blockmodel. Therefore, our proposed problem aims to discover a secondary blockmodel representation of a network that is of good quality and dissimilar with respect to a given blockmodel. Our methodology is presented through two approaches, (a) inclusion of cannot-link constraints and (b) dissimilarity between image matrices. Both approaches are based on non-negative matrix factorisation (NMF), which fits the blockmodelling representation. The evaluation of these two approaches regards quality and dissimilarity of the discovered alternative blockmodel as these are the requirements of the problem.

Submitted 27 July, 2019; originally announced August 2019.

Comments: 56 pages, 23 figures

arXiv:1903.03348 [pdf, other]

Approximating Optimisation Solutions for Travelling Officer Problem with Customised Deep Learning Network

Authors: Wei Shao, Flora D. Salim, Jeffrey Chan, Sean Morrison, Fabio Zambetta

Abstract: Deep learning has been extended to a number of new domains with critical success, though some traditional orienteering problems such as the Travelling Salesman Problem (TSP) and its variants are not commonly solved using such techniques. Deep neural networks (DNNs) are a potentially promising and under-explored solution to solve these problems due to their powerful function approximation abilities, and their fast feed-forward computation. In this paper, we outline a method for converting an orienteering problem into a classification problem, and design a customised multi-layer deep learning network to approximate traditional optimisation solutions to this problem. We test the performance of the network on a real-world parking violation dataset, and conduct a generic study that empirically shows the critical architectural components that affect network performance for this problem.

Submitted 8 March, 2019; originally announced March 2019.

arXiv:1810.09317 [pdf, other]

Assessing the Impact of Gamification on Self-Directed Learning in Medical Students

Authors: De-Zhang Lee, Vik Gopal, Jia-Min Chan, Li-Shia Ng, Eng-Tat Ang

Abstract: Gamification refers to the process of adding game elements to a task. Of late, this process has been introduced in pedagogical settings to capture the attention and interest of students. In our study, we apply the process to Anatomy students and assess the impact on their learning behaviour. We apply a novel path analysis to assess the change in their learning behaviour after a semester of games-enhanced small group sessions. We find that too many games could reduce their enjoyment of the underlying learning. However, we also find that students appreciate a change in the traditional model of instruction - they embraced peer-to-peer learning in the classroom.

Submitted 22 October, 2018; originally announced October 2018.

arXiv:1802.06153 [pdf, other]

A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks

Authors: Jeffrey Chan, Valerio Perrone, Jeffrey P. Spence, Paul A. Jenkins, Sara Mathieson, Yun S. Song

Abstract: An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.

Submitted 5 November, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

Comments: 9 pages, 8 figures

arXiv:1706.05585 [pdf, other]

Accelerating Innovation Through Analogy Mining

Authors: Tom Hope, Joel Chan, Aniket Kittur, Dafna Shahaf

Abstract: The availability of large idea repositories (e.g., the U.S. patent database) could significantly accelerate innovation and discovery by providing people with inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for either human or automated methods. Previous approaches include costly hand-created databases that have high relational structure (e.g., predicate calculus representations) but are very sparse. Simpler machine-learning/information-retrieval similarity metrics can scale to large, natural-language datasets, but struggle to account for structural similarity, which is central to analogy. In this paper we explore the viability and value of learning simpler structural representations, specifically, "problem schemas", which specify the purpose of a product and the mechanisms by which it achieves that purpose. Our approach combines crowdsourcing and recurrent neural networks to extract purpose and mechanism vector representations from product descriptions. We demonstrate that these learned vectors allow us to find analogies with higher precision and recall than traditional information-retrieval methods. In an ideation experiment, analogies retrieved by our models significantly increased people's likelihood of generating creative ideas compared to analogies retrieved by traditional methods. Our results suggest a promising approach to enabling computational analogy at scale is to learn and leverage weaker structural representations.

Submitted 17 June, 2017; originally announced June 2017.

Comments: KDD 2017

arXiv:1606.05596 [pdf, other]

Ground Truth Bias in External Cluster Validity Indices

Authors: Yang Lei, James C. Bezdek, Simone Romano, Nguyen Xuan Vinh, Jeffrey Chan, James Bailey

Abstract: It has been noticed that some external CVIs exhibit a preferential bias towards a larger or smaller number of clusters which is monotonic (directly or inversely) in the number of clusters in candidate partitions. This type of bias is caused by the functional form of the CVI model. For example, the popular Rand index (RI) exhibits a monotone increasing (NCinc) bias, while the Jaccard Index (JI) suffers from a monotone decreasing (NCdec) bias. This type of bias has been previously recognized in the literature. In this work, we identify a new type of bias arising from the distribution of the ground truth (reference) partition against which candidate partitions are compared. We call this new type of bias ground truth (GT) bias. This type of bias occurs if a change in the reference partition causes a change in the bias status (e.g., NCinc, NCdec) of a CVI. For example, NCinc bias in the RI can be changed to NCdec bias by skewing the distribution of clusters in the ground truth partition. It is important for users to be aware of this new type of biased behaviour, since it may affect the interpretations of CVI results. The objective of this article is to study the empirical and theoretical implications of GT bias. To the best of our knowledge, this is the first extensive study of such a property for external cluster validity indices.

Submitted 17 June, 2016; originally announced June 2016.

arXiv:1602.01213 [pdf, other]

Maximum leave-one-out likelihood estimation for location parameter of unbounded densities

Authors: Thanakorn Nitithumbundit, Jennifer S. K. Chan

Abstract: Maximum likelihood estimation of a location parameter fails when the density has an unbounded mode. An alternative approach is considered by leaving out a data point to avoid the unbounded density in the full likelihood. This modification gives rise to the leave-one-out likelihood. We propose an ECM algorithm which maximises the leave-one-out likelihood. It was shown that the estimator which maximises the leave-one-out likelihood is consistent and super-efficient. However, other asymptotic properties such as the optimal rate of convergence and asymptotic distribution are still under question. We use simulations to investigate these asymptotic properties of the location estimator using our proposed algorithm.

Submitted 3 February, 2016; originally announced February 2016.

Comments: 20 pages, 6 figures

MSC Class: 62F10; 62F12

arXiv:1504.01239 [pdf, other]

An ECM algorithm for Skewed Multivariate Variance Gamma Distribution in Normal Mean-Variance Representation

Authors: Thanakorn Nitithumbundit, Jennifer S. K. Chan

Abstract: Normal mean-variance mixture distributions are widely applied to simplify a model's implementation and improve their computational efficiency under the Maximum Likelihood (ML) approach. Especially for distributions with normal mean-variance mixtures representation such as the multivariate skewed variance gamma (MSVG) distribution, it utilises the expectation-conditional-maximisation (ECM) algorithm to iteratively obtain the ML estimates. To facilitate application to financial time series, the mean is further extended to include autoregressive terms. Techniques are proposed to deal with the unbounded density for small shape parameter and to speed up the convergence. Simulation studies are conducted to demonstrate the applicability of this model and examine estimation properties. Finally, the MSVG model is applied to analyse the returns of five daily closing price market indices and standard errors for the estimated parameters are computed using Louis's method.

Submitted 16 June, 2015; v1 submitted 6 April, 2015; originally announced April 2015.

arXiv:1402.2492 [pdf, ps, other]

Risk Margin Quantile Function Via Parametric and Non-Parametric Bayesian Quantile Regression

Authors: Alice X. D. Dong, Jennifer S. K. Chan, Gareth W. Peters

Abstract: We develop quantile regression models in order to derive risk margin and to evaluate capital in non-life insurance applications. By utilizing the entire range of conditional quantile functions, especially higher quantile levels, we detail how quantile regression is capable of providing an accurate estimation of risk margin and an overview of implied capital based on the historical volatility of a general insurer's loss portfolio. Two modelling frameworks are considered, based on parametric and nonparametric quantile regression models which we develop specifically for this insurance setting. In the parametric quantile regression framework, several models, including the flexible generalized beta distribution family, the asymmetric Laplace (AL) distribution and the power Pareto distribution, are considered under a Bayesian regression framework. The Bayesian posterior quantile regression models in each case are studied via Markov chain Monte Carlo (MCMC) sampling strategies. In the nonparametric quantile regression framework, which we contrast with the parametric Bayesian models, we adopt an AL distribution as a proxy and, together with the parametric AL model, express the solution as a scale mixture of uniform distributions to facilitate implementation. The models are extended to adopt dynamic mean, variance and skewness, and are applied to two real loss reserve data sets to perform inference and discuss interesting features of quantile regression for risk margin calculations.

Submitted 11 February, 2014; originally announced February 2014.

Showing 1–37 of 37 results for author: Chan, J