-
Distributionally Robust Optimal Power Flow with Contextual Information
Authors:
Adrián Esteban-Pérez,
Juan M. Morales
Abstract:
In this paper, we develop a distributionally robust chance-constrained formulation of the Optimal Power Flow problem (OPF) whereby the system operator can leverage contextual information. For this purpose, we exploit an ambiguity set based on probability trimmings and optimal transport through which the dispatch solution is protected against the incomplete knowledge of the relationship between the OPF uncertainties and the context that is conveyed by a sample of their joint probability distribution. We provide a tractable reformulation of the proposed distributionally robust chance-constrained OPF problem under the popular conditional-value-at-risk approximation. By way of numerical experiments run on a modified IEEE-118 bus network with wind uncertainty, we show how the power system can substantially benefit from taking into account the well-known statistical dependence between the point forecast of wind power outputs and its associated prediction error. Furthermore, the experiments conducted also reveal that the distributional robustness conferred on the OPF solution by our probability-trimmings-based approach is superior to that bestowed by alternative approaches in terms of expected cost and system reliability.
Submitted 4 October, 2022; v1 submitted 16 September, 2021;
originally announced September 2021.
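As a rough illustration of the conditional-value-at-risk (CVaR) approximation mentioned in the abstract above: a chance constraint P(g <= 0) >= 1 - eps is conservatively enforced by requiring the CVaR of g at level 1 - eps to be non-positive. The sketch below evaluates the empirical CVaR of a sample through the Rockafellar-Uryasev formula. It is not the paper's distributionally robust reformulation; the function names, the sample values, and the use of a plain empirical distribution are all assumptions made for illustration.

```python
def empirical_cvar(losses, eps):
    """Empirical CVaR at level 1 - eps via the Rockafellar-Uryasev formula.

    CVaR = min over t of  t + E[max(Z - t, 0)] / eps.
    For an empirical distribution the objective is piecewise linear and
    convex in t, so a minimizer can be found among the sample points.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    n = len(losses)

    def objective(t):
        return t + sum(max(z - t, 0.0) for z in losses) / (eps * n)

    return min(objective(t) for t in losses)


def cvar_constraint_holds(constraint_values, eps):
    """Check the conservative surrogate CVaR_{1-eps}(g) <= 0 on a sample of g."""
    return empirical_cvar(constraint_values, eps) <= 0.0


# Hypothetical sample of constraint values g(x, xi_i), e.g. line flow minus capacity.
sample = [-3.0, -2.0, -1.0, -0.5]
print(cvar_constraint_holds(sample, eps=0.25))
```

Because CVaR upper-bounds the corresponding value-at-risk, satisfying the surrogate implies the original chance constraint holds under the same (here, empirical) distribution, at the price of some conservatism.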
-
Prescribing net demand for two-stage electricity generation scheduling
Authors:
Juan M. Morales,
Miguel Á. Muñoz,
Salvador Pineda
Abstract:
We consider a two-stage generation scheduling problem comprising a forward dispatch and a real-time re-dispatch. The former must be conducted facing an uncertain net demand that includes non-dispatchable electricity consumption and renewable power generation. The latter copes with the plausible deviations with respect to the forward schedule by making use of balancing power during the actual operation of the system. Standard industry practice deals with the uncertain net demand in the forward stage by replacing it with a good estimate of its conditional expectation (usually referred to as a point forecast), so as to minimize the need for balancing power in real time. However, it is well known that the cost structure of a power system is highly asymmetric and dependent on its operating point, with the result that minimizing the amount of power imbalances is not necessarily aligned with minimizing operating costs. In this paper, we propose a bilevel program to construct, from the available historical data, a prescription of the net demand that does account for the power system's cost asymmetry. Furthermore, to accommodate the strong dependence of this cost on the power system's operating point, we use clustering to tailor the proposed prescription to the foreseen net-demand regime. By way of an illustrative example and a more realistic case study based on the European power system, we show that our approach leads to substantial cost savings compared to customary practice.
Submitted 17 April, 2023; v1 submitted 2 August, 2021;
originally announced August 2021.
-
A Functional Data Analysis Approach to Evolution Outlier Detection for Grouped Smart Meters
Authors:
A. Elías,
J. M. Morales,
S. Pineda
Abstract:
Smart metering infrastructures collect data almost continuously in the form of fine-grained long time series. These massive data series often have common daily patterns that are repeated between similar days or seasons and shared among grouped meters. Within this context, we propose an unsupervised method to highlight individuals with abnormal daily dependency patterns, which we term evolution outliers. To this end, we approach the problem from the standpoint of Functional Data Analysis (FDA) and we use the concept of functional depth to exploit the dynamic group structure and isolate individual meters with a different evolution. The performance of the proposal is first evaluated empirically through a simulation exercise under different evolution scenarios. Subsequently, the importance of and need for an evolution outlier detection method are shown by using actual smart-metering data corresponding to photovoltaic energy generation and circuit voltage records. Here, our proposal detects outliers that might go unnoticed by other approaches in the literature that have proven effective at capturing magnitude and shape abnormalities.
Submitted 7 October, 2022; v1 submitted 2 July, 2021;
originally announced July 2021.
-
Hidden Markov and semi-Markov models: When and why are these models useful for classifying states in time series data?
Authors:
Sofia Ruiz-Suarez,
Vianey Leos-Barajas,
Juan Manuel Morales
Abstract:
Hidden Markov models (HMMs) and their extensions have proven to be powerful tools for classification of observations that stem from systems with temporal dependence, as they take into account that observations close in time are likely generated from the same state (i.e., class). When information on the classes of the observations is available in advance, supervised methods can be applied. In this paper, we provide details for the implementation of four models for classification in a supervised learning context: HMMs, hidden semi-Markov models (HSMMs), autoregressive-HMMs, and autoregressive-HSMMs. Using simulations, we study the classification performance under various degrees of model misspecification to characterize when it would be important to extend a basic HMM to an HSMM. As an application of these techniques, we use the models to classify accelerometer data from Merino sheep to distinguish between four different behaviors of interest. In the field of movement ecology in particular, collection of fine-scale animal movement data over time to identify behavioral states has become ubiquitous, necessitating models that can account for the dependence structure in the data. We demonstrate that when the aim is to conduct classification, various degrees of misspecification of the proposed model may not impede good classification performance unless there is high overlap between the state-dependent distributions, that is, unless the observation distributions of the different states are difficult to differentiate.
Submitted 19 November, 2021; v1 submitted 24 May, 2021;
originally announced May 2021.
-
A novel embedded min-max approach for feature selection in nonlinear support vector machine classification
Authors:
Asunción Jiménez-Cordero,
Juan Miguel Morales,
Salvador Pineda
Abstract:
In recent years, feature selection has become a challenging problem in several machine learning fields, such as classification problems. Support Vector Machine (SVM) is a well-known technique applied in classification tasks. Various methodologies have been proposed in the literature to select the most relevant features in SVM. Unfortunately, all of them either deal with the feature selection problem in the linear classification setting or propose ad-hoc approaches that are difficult to implement in practice. In contrast, we propose an embedded feature selection method based on a min-max optimization problem, where a trade-off between model complexity and classification accuracy is sought. By leveraging duality theory, we equivalently reformulate the min-max problem and solve it without further ado using off-the-shelf software for nonlinear optimization. The efficiency and usefulness of our approach are tested on several benchmark data sets in terms of accuracy, number of selected features and interpretability.
Submitted 15 January, 2021; v1 submitted 21 April, 2020;
originally announced April 2020.
-
Approximate Bayesian inference for a "steps and turns" continuous-time random walk observed at regular time intervals
Authors:
Sofia Ruiz-Suarez,
Vianey Leos-Barajas,
Ignacio Alvarez-Castro,
Juan M. Morales
Abstract:
The study of animal movement is challenging because it is a process modulated by many factors acting at different spatial and temporal scales. Several models have been proposed which differ primarily in the temporal conceptualization, namely continuous and discrete time formulations. Naturally, animal movement occurs in continuous time but we tend to observe it at fixed time intervals. To account for the temporal mismatch between observations and movement decisions, we used a state-space model where movement decisions (steps and turns) are made in continuous time. The movement process is then observed at regular time intervals. As the likelihood function of this state-space model turned out to be complex to calculate yet simulating data is straightforward, we conduct inference using a few variations of Approximate Bayesian Computation (ABC). We explore the applicability of these methods as a function of the discrepancy between the temporal scale of the observations and that of the movement process in a simulation study. We demonstrate the application of this model to a real trajectory of a sheep that was reconstructed in high resolution using information from magnetometer and GPS devices. Our results suggest that accurate estimates can be obtained when the time scale of the observations is less than 5 times the average time between changes in movement direction. The state-space model used here allowed us to connect the scales of the observations and movement decisions in an intuitive and easy-to-interpret way. Our findings underscore the idea that the time scale at which animal movement decisions are made needs to be considered when designing data collection protocols, and that sometimes high-frequency data may not be necessary to have good estimates of certain movement processes.
Submitted 23 July, 2019;
originally announced July 2019.
-
Feature-driven Improvement of Renewable Energy Forecasting and Trading
Authors:
Miguel Á. Muñoz,
Juan M. Morales,
Salvador Pineda
Abstract:
Inspired by recent insights into the common ground of machine learning, optimization and decision-making, this paper proposes an easy-to-implement yet effective procedure to enhance both the quality of renewable energy forecasts and the competitive edge of renewable energy producers in electricity markets with a dual-price settlement of imbalances. The quality and economic gains brought by the proposed procedure essentially stem from the utilization of valuable predictors (also known as features) in a data-driven newsvendor model that renders a computationally inexpensive linear program. We illustrate the proposed procedure and numerically assess its benefits on a realistic case study that considers the aggregate wind power production in the Danish DK1 bidding zone as the variable to be predicted and traded. Within this context, our procedure leverages, among others, spatial information in the form of wind power forecasts issued by transmission system operators (TSOs) in surrounding bidding zones and publicly available on online platforms. We show that our method is able to improve the quality of the wind power forecast issued by the Danish TSO by several percentage points (when measured in terms of the mean absolute or the root mean square error) and to significantly reduce the balancing costs incurred by the wind power producer.
Submitted 16 January, 2020; v1 submitted 17 July, 2019;
originally announced July 2019.
-
Running on empty: Recharge dynamics from animal movement data
Authors:
Mevin B. Hooten,
Henry R. Scharf,
Juan M. Morales
Abstract:
Vital rates such as survival and recruitment have always been important in the study of population and community ecology. At the individual level, physiological processes such as energetics are critical in understanding biomechanics and movement ecology and also scale up to influence food webs and trophic cascades. Although vital rates and population-level characteristics are tied with individual-level animal movement, most statistical models for telemetry data are not equipped to provide inference about these relationships because they lack the explicit, mechanistic connection to physiological dynamics. We present a framework for modeling telemetry data that explicitly includes an aggregated physiological process associated with decision making and movement in heterogeneous environments. Our framework accommodates a wide range of movement and physiological process specifications. We illustrate a specific model formulation in continuous time to provide direct inference about gains and losses associated with physiological processes based on movement. Our approach can also be extended to accommodate auxiliary data when available. We apply our model to infer the recharge dynamics of mountain lions (in Colorado, USA) and African buffalo (in Kruger National Park, South Africa).
Submitted 30 January, 2020; v1 submitted 20 July, 2018;
originally announced July 2018.
-
Spatio-Temporal Forecasting by Coupled Stochastic Differential Equations: Applications to Solar Power
Authors:
Emil B. Iversen,
Rune Juhl,
Jan K. Møller,
Jan Kleissl,
Henrik Madsen,
Juan M. Morales
Abstract:
Spatio-temporal problems exist in many areas of knowledge and disciplines ranging from biology to engineering and physics. However, solution strategies based on classical statistical techniques often fall short due to the large number of parameters that are to be estimated and the huge amount of data that need to be handled. In this paper, we apply known techniques in a novel way to provide a framework for spatio-temporal modeling which is both computationally efficient and has a low-dimensional parameter space. We present a micro-to-macro approach whereby the local dynamics are first modeled and subsequently combined to capture the global system behavior. The proposed methodology relies on coupled stochastic differential equations and is applied to produce spatio-temporal forecasts for a solar power plant for very short horizons, which essentially implies tracking clouds moving across the field of solar power inverters. We outperform simple and complex benchmarks while providing forecasts for 70 spatial dimensions and 24 lead times (i.e., for a total number of random variables equal to 1680). The resulting model can provide all sorts of forecast products, ranging from point forecasts and covariances to predictive densities, multi-horizon forecasts, and space-time trajectories.
Submitted 14 June, 2017;
originally announced June 2017.
-
Multi-scale modeling of animal movement and general behavior data using hidden Markov models with hierarchical structures
Authors:
Vianey Leos-Barajas,
Eric Gangloff,
Timo Adam,
Roland Langrock,
Floris M. van Beest,
Jacob Nabe-Nielsen,
Juan M. Morales
Abstract:
Hidden Markov models (HMMs) are commonly used to model animal movement data and infer aspects of animal behavior. An HMM assumes that each data point from a time series of observations stems from one of $N$ possible states. The states are loosely connected to behavioral modes that manifest themselves at the temporal resolution at which observations are made. However, due to advances in tag technology, data can be collected at increasingly fine temporal resolutions. Yet, inferences at time scales cruder than those at which data are collected, and which correspond to larger-scale behavioral processes, are not yet addressed via HMMs. We add hierarchical structures to the basic HMM framework in order to incorporate multiple Markov chains at various time scales. The hierarchically structured HMMs allow for behavioral inferences at multiple time scales and can also serve as a means to avoid coarsening data. Our proposed framework is one of the first that models animal behavior simultaneously at multiple time scales, opening new possibilities in the area of animal movement modeling. We illustrate the application of hierarchically structured HMMs in two real-data examples: (i) vertical movements of harbor porpoises observed in the field, and (ii) garter snake movement data collected as part of an experimental design.
Submitted 12 February, 2017;
originally announced February 2017.
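To make the basic HMM assumption in the abstract above concrete (each observation stems from one of $N$ hidden states that evolve as a Markov chain), here is a minimal sketch of the scaled forward algorithm for the log-likelihood of a single-chain HMM. It does not implement the hierarchical structure the paper proposes; the function name, argument layout, and toy numbers are assumptions made for illustration.

```python
import math


def hmm_log_likelihood(init, trans, emis_probs):
    """Log-likelihood of an observation sequence under an N-state HMM.

    init[j]          : probability of starting in state j
    trans[i][j]      : probability of moving from state i to state j
    emis_probs[t][j] : density/probability of observation t given state j
    Uses the forward recursion with per-step rescaling to avoid underflow.
    """
    n = len(init)
    log_lik = 0.0
    alpha = None
    for t, emis in enumerate(emis_probs):
        if t == 0:
            alpha = [init[j] * emis[j] for j in range(n)]
        else:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emis[j]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale <= 0.0:
            return float("-inf")  # sequence impossible under the model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy two-state example with hypothetical numbers.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emis_probs = [[0.7, 0.1], [0.6, 0.2], [0.1, 0.8]]
print(hmm_log_likelihood(init, trans, emis_probs))
```

The emission probabilities are passed in precomputed so the sketch stays agnostic about the state-dependent distributions, which depend on the data type.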
-
Probabilistic Forecasts of Solar Irradiance by Stochastic Differential Equations
Authors:
Emil B. Iversen,
Juan M. Morales,
Jan K. Møller,
Henrik Madsen
Abstract:
Probabilistic forecasts of renewable energy production provide users with valuable information about the uncertainty associated with the expected generation. Current state-of-the-art forecasts for solar irradiance have focused on producing reliable point forecasts. The additional information included in probabilistic forecasts may be paramount for decision makers to efficiently make use of this uncertain and variable generation. In this paper, a stochastic differential equation (SDE) framework for modeling the uncertainty associated with the solar irradiance point forecast is proposed. This modeling approach allows for characterizing both the interdependence structure of prediction errors of short-term solar irradiance and their predictive distribution. A series of different SDE models are fitted to a training set and subsequently evaluated on a one-year test set. The final model proposed is defined on a bounded and time-varying state space with zero probability almost surely of events outside this space.
Submitted 25 October, 2013;
originally announced October 2013.
-
Regularizers for Structured Sparsity
Authors:
Charles A. Micchelli,
Jean M. Morales,
Massimiliano Pontil
Abstract:
We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. This problem is relevant in machine learning, statistics and signal processing. It is well known that a linear regression can benefit from knowledge that the underlying regression vector is sparse. The combinatorial problem of selecting the nonzero components of this vector can be "relaxed" by regularizing the squared error with a convex penalty function like the $\ell_1$ norm. However, in many applications, additional conditions on the structure of the regression vector and its sparsity pattern are available. Incorporating this information into the learning method may lead to a significant decrease of the estimation error. In this paper, we present a family of convex penalty functions, which encode prior knowledge on the structure of the vector formed by the absolute values of the regression coefficients. This family subsumes the $\ell_1$ norm and is flexible enough to include different models of sparsity patterns, which are of practical and theoretical importance. We establish the basic properties of these penalty functions and discuss some examples where they can be computed explicitly. Moreover, we present a convergent optimization algorithm for solving regularized least squares with these penalty functions. Numerical simulations highlight the benefit of structured sparsity and the advantage offered by our approach over the Lasso method and other related methods.
Submitted 30 March, 2011; v1 submitted 4 October, 2010;
originally announced October 2010.
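For reference, the $\ell_1$ relaxation mentioned in the abstract above (the Lasso) can be written as follows. The symbols are assumptions, since the abstract fixes no notation: $y \in \mathbb{R}^m$ is the response vector, $X \in \mathbb{R}^{m \times d}$ the design matrix, $\beta$ the regression vector, and $\lambda > 0$ the regularization weight. The structured penalties proposed in the paper generalize the $\ell_1$ term and are not reproduced here.

```latex
\[
  \min_{\beta \in \mathbb{R}^d} \; \| y - X\beta \|_2^2 + \lambda \, \| \beta \|_1,
  \qquad \| \beta \|_1 = \sum_{i=1}^{d} |\beta_i| .
\]
```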