Showing 1–11 of 11 results for author: Eshragh, A

  1. arXiv:2411.03617  [pdf, ps, other]

    math.OC stat.CO

    Efficient Data-Driven Leverage Score Sampling Algorithm for the Minimum Volume Covering Ellipsoid Problem in Big Data

    Authors: Elizabeth Harris, Ali Eshragh, Bishnu Lamichhane, Jordan Shaw-Carmody, Elizabeth Stojanovski

    Abstract: The Minimum Volume Covering Ellipsoid (MVCE) problem, characterised by $n$ observations in $d$ dimensions where $n \gg d$, can be computationally very expensive in the big data regime. We apply methods from randomised numerical linear algebra to develop a data-driven leverage score sampling algorithm for solving MVCE, and establish theoretical error bounds and a convergence guarantee. Assuming the…

    Submitted 5 November, 2024; originally announced November 2024.

    Comments: 20 pages, 3 figures

    MSC Class: 62K05; 62-06; 90-08; 90C25; 90C59

  2. arXiv:2401.00122  [pdf, other]

    stat.ML cs.LG

    SALSA: Sequential Approximate Leverage-Score Algorithm with Application in Analyzing Big Time Series Data

    Authors: Ali Eshragh, Luke Yerbury, Asef Nazari, Fred Roosta, Michael W. Mahoney

    Abstract: We develop a new efficient sequential approximate leverage score algorithm, SALSA, using methods from randomized numerical linear algebra (RandNLA) for large matrices. We demonstrate that, with high probability, the accuracy of SALSA's approximations is within $(1 + O({\varepsilon}))$ of the true leverage scores. In addition, we show that the theoretical computational complexity and numerical accu…

    Submitted 29 December, 2023; originally announced January 2024.

    Comments: 42 pages, 7 figures

    MSC Class: 62M10

  3. arXiv:2212.02255  [pdf, other]

    cs.LG stat.AP

    A Hybrid Statistical-Machine Learning Approach for Analysing Online Customer Behavior: An Empirical Study

    Authors: Saed Alizamir, Kasun Bandara, Ali Eshragh, Foaad Iravani

    Abstract: We apply classical statistical methods in conjunction with the state-of-the-art machine learning techniques to develop a hybrid interpretable model to analyse 454,897 online customers' behavior for a particular product category at the largest online retailer in China, that is JD. While most mere machine learning methods are plagued by the lack of interpretability in practice, our novel hybrid appr…

    Submitted 1 December, 2022; originally announced December 2022.

  4. arXiv:2112.12994  [pdf, other]

    stat.ML cs.LG stat.CO

    Toeplitz Least Squares Problems, Fast Algorithms and Big Data

    Authors: Ali Eshragh, Oliver Di Pietro, Michael A. Saunders

    Abstract: In time series analysis, when fitting an autoregressive model, one must solve a Toeplitz ordinary least squares problem numerous times to find an appropriate model, which can severely affect computational times with large data sets. Two recent algorithms (LSAR and Repeated Halving) have applied randomized numerical linear algebra (RandNLA) techniques to fitting an autoregressive model to big time-…

    Submitted 24 December, 2021; originally announced December 2021.

    Comments: 28 pages, 11 figures

    MSC Class: 62M10; 68T09; 62R07

  5. arXiv:2103.09175  [pdf, other]

    stat.ME stat.CO

    Rollage: Efficient Rolling Average Algorithm to Estimate ARMA Models for Big Time Series Data

    Authors: Ali Eshragh, Glen Livingston, Thomas McCarthy McCann, Luke Yerbury

    Abstract: We develop a new efficient algorithm for the analysis of large-scale time series data. We firstly define rolling averages, derive their analytical properties, and establish their asymptotic distribution. These theoretical results are subsequently exploited to develop an efficient algorithm, called Rollage, for fitting an appropriate AR model to big time series data. When used in conjunction with t…

    Submitted 23 December, 2022; v1 submitted 16 March, 2021; originally announced March 2021.

    MSC Class: 62M10; 62R07

  6. arXiv:2005.12455  [pdf, other]

    stat.AP physics.soc-ph q-bio.PE

    Modeling the Dynamics of the COVID-19 Population in Australia: A Probabilistic Analysis

    Authors: Ali Eshragh, Saed Alizamir, Peter Howley, Elizabeth Stojanovski

    Abstract: The novel Corona Virus COVID-19 arrived on Australian shores around 25 January 2020. This paper presents a novel method of dynamically modeling and forecasting the COVID-19 pandemic in Australia with a high degree of accuracy and in a timely manner using limited data; a valuable resource that can be used to guide government decision-making on societal restrictions on a daily and/or weekly basis. T…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: 25 pages, 7 figures, 3 tables

    MSC Class: 92D30; 62M20; 60J28

    ACM Class: G.3

  7. arXiv:1911.12321  [pdf, other]

    stat.ME stat.ML

    LSAR: Efficient Leverage Score Sampling Algorithm for the Analysis of Big Time Series Data

    Authors: Ali Eshragh, Fred Roosta, Asef Nazari, Michael W. Mahoney

    Abstract: We apply methods from randomized numerical linear algebra (RandNLA) to develop improved algorithms for the analysis of large-scale time series data. We first develop a new fast algorithm to estimate the leverage scores of an autoregressive (AR) model in big data regimes. We show that the accuracy of approximations lies within $(1 + O(\varepsilon))$ of the true leverage scores with high probabili…

    Submitted 30 October, 2021; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: 38 pages, 8 figures

  8. The Importance of Environmental Factors in Forecasting Australian Power Demand

    Authors: Ali Eshragh, Benjamin Ganim, Terry Perkins, Kasun Bandara

    Abstract: We develop a time series model to forecast weekly peak power demand for three main states of Australia for a yearly time-scale, and show the crucial role of environmental factors in improving the forecasts. More precisely, we construct a seasonal autoregressive integrated moving average (SARIMA) model and reinforce it by employing the exogenous environmental variables including, maximum temperatur…

    Submitted 30 October, 2021; v1 submitted 2 November, 2019; originally announced November 2019.

    Comments: Keywords: Electricity power peak demand forecasting, Environmental factors, SARIMA-regression Model

    MSC Class: 62M10; 97K80

  9. arXiv:1909.02716  [pdf]

    stat.AP

    Demand Forecasting in the Presence of Systematic Events: Cases in Capturing Sales Promotions

    Authors: Mahdi Abolghasemi, Ali Eshragh, Jason Hurley, Behnam Fahimnia

    Abstract: Reliable demand forecasts are critical for the effective supply chain management. Several endogenous and exogenous variables can influence the dynamics of demand, and hence a single statistical model that only consists of historical sales data is often insufficient to produce accurate forecasts. In practice, the forecasts generated by baseline statistical models are often judgmentally adjusted by…

    Submitted 6 September, 2019; originally announced September 2019.

  10. arXiv:1901.10868  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Learning to Project in Multi-Objective Binary Linear Programming

    Authors: Alvaro Sierra-Altamiranda, Hadi Charkhgard, Iman Dayarian, Ali Eshragh, Sorna Javadi

    Abstract: In this paper, we investigate the possibility of improving the performance of multi-objective optimization solution approaches using machine learning techniques. Specifically, we focus on multi-objective binary linear programs and employ one of the most effective and recently developed criterion space search algorithms, the so-called KSA, during our study. This algorithm computes all nondominated…

    Submitted 30 January, 2019; originally announced January 2019.

  11. arXiv:1804.07935  [pdf, ps, other]

    stat.ME

    Best subset selection in linear regression via bi-objective mixed integer linear programming

    Authors: Hadi Charkhgard, Ali Eshragh

    Abstract: We study the problem of choosing the best subset of p features in linear regression given n observations. This problem naturally contains two objective functions including minimizing the amount of bias and minimizing the number of predictors. The existing approaches transform the problem into a single-objective optimization problem. We explain the main weaknesses of existing approaches, and to ove…

    Submitted 21 April, 2018; originally announced April 2018.

    Comments: 13 pages, 4 figures, 1 table

    MSC Class: 62J05; 90C29