Search | arXiv e-print repository

A Scalable Gradient-Based Optimization Framework for Sparse Minimum-Variance Portfolio Selection

Authors: Sarat Moka, Matias Quiroz, Vali Asimit, Samuel Muller

Abstract: Portfolio optimization involves selecting asset weights to minimize a risk-reward objective, such as the portfolio variance in the classical minimum-variance framework. Sparse portfolio selection extends this by imposing a cardinality constraint: only $k$ assets from a universe of $p$ may be included. The standard approach models this problem as a mixed-integer quadratic program and relies on commercial solvers to find the optimal solution. However, the computational costs of such methods increase exponentially with $k$ and $p$, making them too slow for problems of even moderate size. We propose a fast and scalable gradient-based approach that transforms the combinatorial sparse selection problem into a constrained continuous optimization task via Boolean relaxation, while preserving equivalence with the original problem on the set of binary points. Our algorithm employs a tunable parameter that transmutes the auxiliary objective from a convex to a concave function. This allows a stable convex starting point, followed by a controlled path toward a sparse binary solution as the tuning parameter increases and the objective moves toward concavity. In practice, our method matches commercial solvers in asset selection for most instances and, in rare instances, the solution differs by a few assets whilst showing a negligible error in portfolio variance.

Submitted 15 May, 2025; originally announced May 2025.

arXiv:2504.10530 [pdf, other]

Efficient Rare-Event Simulation for Random Geometric Graphs via Importance Sampling

Authors: Sarat Moka, Christian Hirsch, Volker Schmidt, Dirk Kroese

Abstract: Random geometric graphs defined on Euclidean subspaces, also called Gilbert graphs, are widely used to model spatially embedded networks across various domains. In such graphs, nodes are located at random in Euclidean space, and any two nodes are connected by an edge if they lie within a certain distance threshold. Accurately estimating rare-event probabilities related to key properties of these graphs, such as the number of edges and the size of the largest connected component, is important in the assessment of risk associated with catastrophic incidents, for example. However, this task is computationally challenging, especially for large networks. Importance sampling offers a viable solution by concentrating computational efforts on significant regions of the graph. This paper explores the application of an importance sampling method to estimate rare-event probabilities, highlighting its advantages in reducing variance and enhancing accuracy. Through asymptotic analysis and experiments, we demonstrate the effectiveness of our methodology, contributing to improved analysis of Gilbert graphs and showcasing the broader applicability of importance sampling in complex network analysis.

Submitted 15 April, 2025; v1 submitted 12 April, 2025; originally announced April 2025.

Comments: 29 Pages, 2 figures

arXiv:2409.05052 [pdf, other]

Rating Players of Counter-Strike: Global Offensive Based on Plus/Minus value

Authors: Hongyu Xu, Sarat Moka

Abstract: We propose a player rating mechanism for Counter-Strike: Global Offensive (CS:GO), a popular e-sport, by analyzing players' Plus/Minus values. The Plus/Minus value represents the average point difference between a player's team and the opponent's team across all matches the player has participated in. Using models such as regularized linear regression, logistic regression, and Bayesian linear models, we examine the relationship between player participation and team point differences. The most commonly used metric in the CS community is "Rating 2.0," which focuses solely on individual performance and does not account for indirect contributions to team success. Our approach introduces a new rating system that evaluates both direct and indirect contributions of players, prioritizing those who make a tangible impact on match outcomes rather than those with the highest individual scores. This rating system could help teams distribute rewards more fairly and improve player recruitment. We believe this methodology will positively influence not only the CS community but also the broader e-sports industry.

Submitted 8 September, 2024; originally announced September 2024.

Comments: 8 pages

arXiv:2407.03383 [pdf, other]

Continuous Optimization for Offline Change Point Detection and Estimation

Authors: Hans Reimann, Sarat Moka, Georgy Sofronov

Abstract: This work explores use of novel advances in best subset selection for regression modelling via continuous optimization for offline change point detection and estimation in univariate Gaussian data sequences. The approach exploits reformulating the normal mean multiple change point model into a regularized statistical inverse problem enforcing sparsity. After introducing the problem statement, criteria and previous investigations via Lasso-regularization, the recently developed framework of continuous optimization for best subset selection (COMBSS) is briefly introduced and related to the problem at hand. Supervised and unsupervised perspectives are explored with the latter testing different approaches for the choice of regularization penalty parameters via the discrepancy principle and a confidence bound. The main result is an adaptation and evaluation of the COMBSS approach for offline normal mean multiple change-point detection via experimental results on simulated data for different choices of regularisation parameters. Results and future directions are discussed.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2404.13339 [pdf, other]

Group COMBSS: Group Selection via Continuous Optimization

Authors: Anant Mathur, Sarat Moka, Benoit Liquet, Zdravko Botev

Abstract: We present a new optimization method for the group selection problem in linear regression. In this problem, predictors are assumed to have a natural group structure and the goal is to select a small set of groups that best fits the response. The incorporation of group structure in a predictor matrix is a key factor in obtaining better estimators and identifying associations between response and predictors. Such a discrete constrained problem is well-known to be hard, particularly in high-dimensional settings where the number of predictors is much larger than the number of observations. We propose to tackle this problem by framing the underlying discrete binary constrained problem into an unconstrained continuous optimization problem. The performance of our proposed approach is compared to state-of-the-art variable selection strategies on simulated data sets. We illustrate the effectiveness of our approach on a genetic dataset to identify grouping of markers across chromosomes.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2403.20007 [pdf, other]

Best Subset Solution Path for Linear Dimension Reduction Models using Continuous Optimization

Authors: Benoit Liquet, Sarat Moka, Samuel Muller

Abstract: The selection of best variables is a challenging problem in supervised and unsupervised learning, especially in high dimensional contexts where the number of variables is usually much larger than the number of observations. In this paper, we focus on two multivariate statistical methods: principal components analysis and partial least squares. Both approaches are popular linear dimension-reduction methods with numerous applications in several fields including in genomics, biology, environmental science, and engineering. In particular, these approaches build principal components, new variables that are combinations of all the original variables. A main drawback of principal components is the difficulty to interpret them when the number of variables is large. To define principal components from the most relevant variables, we propose to cast the best subset solution path method into principal component analysis and partial least squares frameworks. We offer a new alternative by exploiting a continuous optimization algorithm for best subset solution path. Empirical studies show the efficacy of our approach for providing the best subset solution path. The usage of our algorithm is further exposed through the analysis of two real datasets. The first dataset is analyzed using principal component analysis while the analysis of the second dataset is based on the partial least squares framework.

Submitted 29 March, 2024; originally announced March 2024.

Comments: Main paper 26 pages including references and 17 pages for the supplementary material

arXiv:2403.13076 [pdf, other]

Spatial Autoregressive Model on a Dirichlet Distribution

Authors: Teo Nguyen, Sarat Moka, Kerrie Mengersen, Benoit Liquet

Abstract: Compositional data find broad application across diverse fields due to their efficacy in representing proportions or percentages of various components within a whole. Spatial dependencies often exist in compositional data, particularly when the data represents different land uses or ecological variables. Ignoring the spatial autocorrelations in modelling of compositional data may lead to incorrect estimates of parameters. Hence, it is essential to incorporate spatial information into the statistical analysis of compositional data to obtain accurate and reliable results. However, traditional statistical methods are not directly applicable to compositional data due to the correlation between its observations, which are constrained to lie on a simplex. To address this challenge, the Dirichlet distribution is commonly employed, as its support aligns with the nature of compositional vectors. Specifically, the R package DirichletReg provides a regression model, termed Dirichlet regression, tailored for compositional data. However, this model fails to account for spatial dependencies, thereby restricting its utility in spatial contexts. In this study, we introduce a novel spatial autoregressive Dirichlet regression model for compositional data, adeptly integrating spatial dependencies among observations. We construct a maximum likelihood estimator for a Dirichlet density function augmented with a spatial lag term. We compare this spatial autoregressive model with the same model without spatial lag, where we test both models on synthetic data as well as two real datasets, using different metrics. By considering the spatial relationships among observations, our model provides more accurate and reliable results for the analysis of compositional data. The model is further evaluated against a spatial multinomial regression model for compositional data, and their relative effectiveness is discussed.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 33 pages, 2 figures, submitted to "Computational Statistics & Data Analysis"

arXiv:2311.11236 [pdf, other]

Generalized Linear Models via the Lasso: To Scale or Not to Scale?

Authors: Anant Mathur, Sarat Moka, Zdravko Botev

Abstract: The Lasso regression is a popular regularization method for feature selection in statistics. Prior to computing the Lasso estimator in both linear and generalized linear models, it is common to conduct a preliminary rescaling of the feature matrix to ensure that all the features are standardized. Without this standardization, it is argued, the Lasso estimate will unfortunately depend on the units used to measure the features. We propose a new type of iterative rescaling of the features in the context of generalized linear models. Whilst existing Lasso algorithms perform a single scaling as a preprocessing step, the proposed rescaling is applied iteratively throughout the Lasso computation until convergence. We provide numerical examples, with both real and simulated data, illustrating that the proposed iterative rescaling can significantly improve the statistical performance of the Lasso estimator without incurring any significant additional computational cost.

Submitted 19 November, 2023; originally announced November 2023.

arXiv:2304.09678 [pdf, other]

Column Subset Selection and Nyström Approximation via Continuous Optimization

Authors: Anant Mathur, Sarat Moka, Zdravko Botev

Abstract: We propose a continuous optimization algorithm for the Column Subset Selection Problem (CSSP) and Nyström approximation. The CSSP and Nyström method construct low-rank approximations of matrices based on a predetermined subset of columns. It is well known that choosing the best column subset of size $k$ is a difficult combinatorial problem. In this work, we show how one can approximate the optimal solution by defining a penalized continuous loss function which is minimized via stochastic gradient descent. We show that the gradients of this loss function can be estimated efficiently using matrix-vector products with a data matrix $X$ in the case of the CSSP or a kernel matrix $K$ in the case of the Nyström approximation. We provide numerical results for a number of real datasets showing that this continuous optimization is competitive against existing methods.

Submitted 19 April, 2023; originally announced April 2023.

arXiv:2205.02617 [pdf, other]

COMBSS: Best Subset Selection via Continuous Optimization

Authors: Sarat Moka, Benoit Liquet, Houying Zhu, Samuel Muller

Abstract: The problem of best subset selection in linear regression is considered with the aim to find a fixed size subset of features that best fits the response. This is particularly challenging when the total available number of features is very large compared to the number of data samples. Existing optimal methods for solving this problem tend to be slow while fast methods tend to have low accuracy. Ideally, new methods perform best subset selection faster than existing optimal methods but with comparable accuracy, or, being more accurate than methods of comparable computational speed. Here, we propose a novel continuous optimization method that identifies a subset solution path, a small set of models of varying size, that consists of candidates for the single best subset of features, that is optimal in a specific sense in linear regression. Our method turns out to be fast, making the best subset selection possible when the number of features is well in excess of thousands. Because of the outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction for a large variety of regression models.

Submitted 24 November, 2023; v1 submitted 5 May, 2022; originally announced May 2022.

arXiv:2106.14565

Variance Reduction for Matrix Computations with Applications to Gaussian Processes

Authors: Anant Mathur, Sarat Moka, Zdravko Botev

Abstract: In addition to recent developments in computing speed and memory, methodological advances have contributed to significant gains in the performance of stochastic simulation. In this paper, we focus on variance reduction for matrix computations via matrix factorization. We provide insights into existing variance reduction methods for estimating the entries of large matrices. Popular methods do not exploit the reduction in variance that is possible when the matrix is factorized. We show how computing the square root factorization of the matrix can achieve in some important cases arbitrarily better stochastic performance. In addition, we propose a factorized estimator for the trace of a product of matrices and numerically demonstrate that the estimator can be up to 1,000 times more efficient on certain problems of estimating the log-likelihood of a Gaussian process. Additionally, we provide a new estimator of the log-determinant of a positive semi-definite matrix where the log-determinant is treated as a normalizing constant of a probability density.

Submitted 26 March, 2023; v1 submitted 28 June, 2021; originally announced June 2021.

Comments: Unable to be updated

Showing 1–11 of 11 results for author: Moka, S