-
Multivariate Long-term Profile Monitoring with Application to the KW51 Railway Bridge
Authors:
Philipp Wittenberg,
Alexander Mendler,
Sven Knoth,
Jan Gertheiss
Abstract:
Structural Health Monitoring (SHM) plays a pivotal role in modern civil engineering, providing critical insights into the health and integrity of infrastructure systems. This work presents a novel multivariate long-term profile monitoring approach to eliminate fluctuations in the measured response quantities, e.g., caused by environmental influences or measurement error. Our methodology addresses critical challenges in SHM and combines supervised methods with unsupervised, principal component analysis-based approaches in a single overarching framework, offering both flexibility and robustness in handling real-world large and/or sparse sensor data streams. We propose a function-on-function regression framework, which leverages functional data analysis for multivariate sensor data and integrates nonlinear modeling techniques, mitigating covariate-induced variations that can obscure structural changes.
Submitted 25 June, 2025;
originally announced June 2025.
-
Confounder-adjusted Covariances of System Outputs and Applications to Structural Health Monitoring
Authors:
Lizzie Neumann,
Philipp Wittenberg,
Alexander Mendler,
Jan Gertheiss
Abstract:
Automated damage detection is an integral component of each structural health monitoring (SHM) system. Typically, measurements from various sensors are collected and reduced to damage-sensitive features, and diagnostic values are generated by statistically evaluating the features. Since changes in data do not only result from damage, it is necessary to determine the confounding factors (environmental or operational variables) and to remove their effects from the measurements or features. Many existing methods for correcting confounding effects are based on different types of mean regression. This neglects potential changes in higher-order statistical moments, but in particular, the output covariances are essential for generating reliable diagnostics for damage detection. This article presents an approach to explicitly quantify the changes in the covariance, using conditional covariance matrices based on a non-parametric, kernel-based estimator. The method is applied to the Munich Test Bridge and the KW51 Railway Bridge in Leuven, covering both raw sensor measurements (acceleration, strain, inclination) and extracted damage-sensitive features (natural frequencies). The results show that covariances between different vibration or inclination sensors can significantly change due to temperature changes, and the same is true for natural frequencies. To highlight the advantages, it is explained how conditional covariances can be combined with standard approaches for damage detection, such as the Mahalanobis distance and principal component analysis. As a result, more reliable diagnostic values can be generated with fewer false alarms.
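As a rough illustration of what a kernel-based conditional covariance can look like, the sketch below weights observations by how close their confounder value (e.g., temperature) is to a target value and computes a locally weighted covariance. The Gaussian kernel, the fixed bandwidth, and the function name `conditional_cov` are assumptions made here for illustration; this is not necessarily the estimator used in the article.

```python
import numpy as np

def conditional_cov(X, z, z0, bandwidth):
    """Kernel-weighted covariance of the rows of X near confounder value z0.

    X: (n, d) array of sensor outputs or features.
    z: (n,) array of confounder values (e.g., temperature).
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    # Gaussian kernel weights (assumed choice), normalized to sum to one
    w = np.exp(-0.5 * ((z - z0) / bandwidth) ** 2)
    w = w / w.sum()
    local_mean = w @ X                 # locally weighted mean, shape (d,)
    R = X - local_mean                 # residuals around the local mean
    return (R * w[:, None]).T @ R      # locally weighted covariance, (d, d)
```

Evaluating the function on a grid of `z0` values gives one covariance matrix per confounder level, which could then feed a Mahalanobis distance or a PCA in place of a single pooled covariance.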
Submitted 26 September, 2024;
originally announced September 2024.
-
Covariate-Adjusted Functional Data Analysis for Structural Health Monitoring
Authors:
Philipp Wittenberg,
Lizzie Neumann,
Alexander Mendler,
Jan Gertheiss
Abstract:
Structural Health Monitoring (SHM) is increasingly applied in civil engineering. One of its primary purposes is detecting and assessing changes in structure conditions to increase safety and reduce potential maintenance downtime. Recent advancements, especially in sensor technology, facilitate data measurements, collection, and process automation, leading to large data streams. We propose a function-on-function regression framework for (nonlinear) modeling the sensor data and adjusting for covariate-induced variation. Our approach is particularly suited for long-term monitoring when several months or years of training data are available. It combines highly flexible yet interpretable semi-parametric modeling with functional principal component analysis and uses the corresponding out-of-sample Phase-II scores for monitoring. The method proposed can also be described as a combination of an "input-output" and an "output-only" method.
Submitted 7 December, 2024; v1 submitted 4 August, 2024;
originally announced August 2024.
-
Structural Health Monitoring with Functional Data: Two Case Studies
Authors:
Philipp Wittenberg,
Sven Knoth,
Jan Gertheiss
Abstract:
Structural Health Monitoring (SHM) is increasingly used in civil engineering. One of its main purposes is to detect and assess changes in infrastructure conditions to reduce possible maintenance downtime and increase safety. Ideally, this process should be automated and implemented in real-time. Recent advances in sensor technology facilitate data collection and process automation, resulting in massive data streams. Functional data analysis (FDA) can be used to model and aggregate the data obtained transparently and interpretably. In two real-world case studies of bridges in Germany and Belgium, this paper demonstrates how a function-on-function regression approach, combined with profile monitoring, can be applied to SHM data to adjust sensor/system outputs for environmentally induced variation and detect changes in construction. Specifically, we consider the R package funcharts and discuss some challenges when using this software on real-world SHM data. For instance, we show that pre-smoothing of the data can improve and extend its usability.
Submitted 3 June, 2024;
originally announced June 2024.
-
Functional Data Analysis: An Introduction and Recent Developments
Authors:
Jan Gertheiss,
David Rügamer,
Bernard X. W. Liew,
Sonja Greven
Abstract:
Functional data analysis (FDA) is a statistical framework that allows for the analysis of curves, images, or functions on higher dimensional domains. The goals of FDA, such as descriptive analyses, classification, and regression, are generally the same as for statistical analyses of scalar-valued or multivariate data, but FDA brings additional challenges due to the high- and infinite dimensionality of observations and parameters, respectively. This paper provides an introduction to FDA, including a description of the most common statistical analysis techniques, their respective software implementations, and some recent developments in the field. The paper covers fundamental concepts such as descriptives and outliers, smoothing, amplitude and phase variation, and functional principal component analysis. It also discusses functional regression, statistical inference with functional data, functional classification and clustering, and machine learning approaches for functional data analysis. The methods discussed in this paper are widely applicable in fields such as medicine, biophysics, neuroscience, and chemistry, and are increasingly relevant due to the widespread use of technologies that allow for the collection of functional data. Sparse functional data methods are also relevant for longitudinal data analysis. All presented methods are demonstrated using available software in R by analyzing a data set on human motion and motor control. To facilitate the understanding of the methods, their implementation, and hands-on application, the code for these practical examples is made available on Github: https://github.com/davidruegamer/FDA_tutorial .
Submitted 9 December, 2023;
originally announced December 2023.
-
Regularization and Model Selection for Ordinal-on-Ordinal Regression with Applications to Food Products' Testing and Survey Data
Authors:
Aisouda Hoshiyar,
Laura H. Gertheiss,
Jan Gertheiss
Abstract:
Ordinal data are quite common in applied statistics. Although some model selection and regularization techniques for categorical predictors and ordinal response models have been developed over the past few years, less work has been done concerning ordinal-on-ordinal regression. Motivated by a consumer test and a survey on the willingness to pay for luxury food products consisting of Likert-type items, we propose a strategy for smoothing and selecting ordinally scaled predictors in the cumulative logit model. First, the group lasso is modified by the use of difference penalties on neighboring dummy coefficients, thus taking into account the predictors' ordinal structure. Second, a fused lasso-type penalty is presented for the fusion of predictor categories and factor selection. The performance of both approaches is evaluated in simulation studies and on real-world data.
Submitted 24 July, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Nonparametric Estimation of the Underlying Distribution of Binned Continuous Data
Authors:
Ejike R. Ugba,
Jan Gertheiss
Abstract:
The estimation of cumulative distribution functions (CDF) and probability density functions (PDF) is a fundamental practice in applied statistics. However, challenges often arise when dealing with data arranged in grouped intervals. In this paper, we discuss a suitable and highly flexible non-parametric density estimation approach for binned distributions, based on cubic monotonicity-preserving splines - known as cubic spline interpolation. Results from simulation studies demonstrate that this approach outperforms many widely used heuristic methods. Additionally, the application of this method to a dataset of train delays in Germany and micro census data on distance and travel time to work yields both meaningful and some questionable results.
Submitted 21 September, 2023;
originally announced September 2023.
-
A Modification of McFadden's $R^2$ for Binary and Ordinal Response Models
Authors:
Ejike R. Ugba,
Jan Gertheiss
Abstract:
A lot of studies on the summary measures of predictive strength of categorical response models consider the likelihood ratio index (LRI), also known as the McFadden-$R^2$, a better option than many other measures. We propose a simple modification of the LRI that adjusts for the effect of the number of response categories on the measure and that also rescales its values, mimicking an underlying latent measure. The modified measure is applicable to both binary and ordinal response models fitted by maximum likelihood. Results from simulation studies and a real data example on the olfactory perception of boar taint show that the proposed measure outperforms most of the widely used goodness-of-fit measures for binary and ordinal models. The proposed $R^2$ interestingly proves quite invariant to an increasing number of response categories of an ordinal model.
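For orientation, the standard (unmodified) likelihood ratio index is one minus the ratio of the fitted model's log-likelihood to that of the intercept-only model. The sketch below computes it for a binary response; it deliberately does not implement the modification proposed in the paper, and the function name `mcfadden_r2` is chosen here for illustration.

```python
import numpy as np

def mcfadden_r2(y, p_model):
    """Standard McFadden R^2 (likelihood ratio index) for a binary response.

    y: array of 0/1 outcomes.
    p_model: fitted probabilities P(y = 1), strictly between 0 and 1.
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p_model, dtype=float)
    # Bernoulli log-likelihood of the fitted model
    ll_model = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Intercept-only model: its ML fit is the sample proportion of ones
    p0 = y.mean()
    ll_null = np.sum(y * np.log(p0) + (1 - y) * np.log(1 - p0))
    return 1.0 - ll_model / ll_null
```

A model that predicts the sample proportion for every observation gets a value of 0, and the value moves toward 1 as the fitted probabilities approach the observed outcomes.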
Submitted 1 February, 2023; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Nonparametric Regression and Classification with Functional, Categorical, and Mixed Covariates
Authors:
Leonie Selk,
Jan Gertheiss
Abstract:
We consider nonparametric prediction with multiple covariates, in particular categorical or functional predictors, or a mixture of both. The method proposed is based on an extension of the Nadaraya-Watson estimator where a kernel function is applied on a linear combination of distance measures each calculated on single covariates, with weights being estimated from the training data. The dependent variable can be categorical (binary or multi-class) or continuous, thus we consider both classification and regression problems. The methodology presented is illustrated and evaluated on artificial and real-world data. In particular, it is observed that prediction accuracy can be increased, and irrelevant, noise variables can be identified/removed by 'downgrading' the corresponding distance measures in a completely data-driven way.
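The idea of applying one kernel to a weighted sum of per-covariate distances can be sketched as follows. The Gaussian kernel, the fixed bandwidth, and the hand-set weights are assumptions for illustration only; in the paper the weights are estimated from the training data, which this sketch does not attempt.

```python
import numpy as np

def nw_predict(dists, y_train, weights, bandwidth=1.0):
    """Nadaraya-Watson prediction from per-covariate distances.

    dists:   (p, n_train) array; row j holds the distance between the new
             observation and each training observation on covariate j.
    y_train: (n_train,) array of training responses.
    weights: (p,) array of non-negative weights (fixed by hand here).
    """
    dists = np.asarray(dists, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    weights = np.asarray(weights, dtype=float)
    combined = weights @ dists                       # one distance per training point
    k = np.exp(-0.5 * (combined / bandwidth) ** 2)   # Gaussian kernel (assumed)
    return float(k @ y_train / k.sum())              # kernel-weighted average
```

Setting a covariate's weight to zero removes its distance from the combination, which is the sense in which a noise variable is 'downgraded'. For classification, the same weighted average applied to class indicators would give estimated class probabilities.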
Submitted 4 August, 2022; v1 submitted 4 November, 2021;
originally announced November 2021.
-
Penalized Optimal Scaling for Ordinal Variables with an Application to International Classification of Functioning Core Sets
Authors:
Aisouda Hoshiyar,
Henk A. L. Kiers,
Jan Gertheiss
Abstract:
Ordinal data occur frequently in the social sciences. When applying principal component analysis (PCA), however, those data are often treated as numeric implying linear relationships between the variables at hand, or non-linear PCA is applied where the obtained quantifications are sometimes hard to interpret. Non-linear PCA for categorical data, also called optimal scoring/scaling, constructs new variables by assigning numerical values to categories such that the proportion of variance in those new variables that is explained by a predefined number of principal components is maximized. We propose a penalized version of non-linear PCA for ordinal variables that is a smoothed intermediate between standard PCA on category labels and non-linear PCA as used so far. The new approach is by no means limited to monotonic effects and offers both better interpretability of the non-linear transformation of the category labels as well as better performance on validation data than unpenalized non-linear PCA and/or standard linear PCA. In particular, an application of penalized optimal scaling to ordinal data as given with the International Classification of Functioning, Disability and Health (ICF) is provided.
Submitted 17 January, 2023; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Statistical Inference for Ordinal Predictors in Generalized Linear and Additive Models with Application to Bronchopulmonary Dysplasia
Authors:
Jan Gertheiss,
Fabian Scheipl,
Tina Lauer,
Harald Ehrhardt
Abstract:
Discrete but ordered covariates are quite common in applied statistics, and some regularized fitting procedures have been proposed for proper handling of ordinal predictors in statistical modeling. In this study, we show how quadratic penalties on adjacent dummy coefficients of ordinal predictors proposed in the literature can be incorporated in the framework of generalized additive models, making tools for statistical inference developed there available for ordinal predictors as well. Motivated by an application from neonatal medicine, we discuss whether results obtained when constructing confidence intervals and testing significance of smooth terms in generalized additive models are useful with ordinal predictors/penalties as well.
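A quadratic penalty on adjacent dummy coefficients can be written as lam * sum_j (beta_j - beta_{j-1})^2, i.e., lam * beta' D'D beta with D a first-difference matrix. The minimal sketch below uses it in a Gaussian (least-squares) fit with a single ordinal predictor; the full dummy coding without intercept, the function name `fit_ordinal_ridge`, and the fixed `lam` are simplifying assumptions, not the generalized additive model setup discussed in the paper.

```python
import numpy as np

def fit_ordinal_ridge(levels, y, n_levels, lam):
    """Penalized least squares with a difference penalty on ordinal dummies.

    levels: integer codes 0, ..., n_levels - 1 of the ordinal predictor.
    y: numeric response.
    lam: penalty strength (lam = 0 gives the per-level means, provided
         every level is observed at least once).
    """
    levels = np.asarray(levels, dtype=int)
    y = np.asarray(y, dtype=float)
    X = np.eye(n_levels)[levels]             # one-hot dummy matrix, (n, k)
    D = np.diff(np.eye(n_levels), axis=0)    # first differences, (k - 1, k)
    # Normal equations of ||y - X b||^2 + lam * ||D b||^2
    return np.linalg.solve(X.T @ X + lam * D.T @ D, X.T @ y)
```

Small `lam` leaves the level effects close to the raw level means, while large `lam` shrinks neighboring levels toward each other and, in the limit, toward a common value.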
Submitted 24 August, 2021; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Generalized Functional Additive Mixed Models
Authors:
Fabian Scheipl,
Jan Gertheiss,
Sonja Greven
Abstract:
We propose a comprehensive framework for additive regression models for non-Gaussian functional responses, allowing for multiple (partially) nested or crossed functional random effects with flexible correlation structures for, e.g., spatial, temporal, or longitudinal functional data as well as linear and nonlinear effects of functional and scalar covariates that may vary smoothly over the index of the functional response. Our implementation handles functional responses from any exponential family distribution as well as many others like Beta- or scaled non-central $t$-distributions. Development is motivated by and evaluated on an application to large-scale longitudinal feeding records of pigs. Results in extensive simulation studies as well as replications of two previously published simulation studies for generalized functional mixed models demonstrate the good performance of our proposal. The approach is implemented in well-documented open source software in the "pffr()" function in R-package "refund".
Submitted 6 May, 2016; v1 submitted 17 June, 2015;
originally announced June 2015.
-
Sparse modeling of categorial explanatory variables
Authors:
Jan Gertheiss,
Gerhard Tutz
Abstract:
Shrinking methods in regression analysis are usually designed for metric predictors. In this article, however, shrinkage methods for categorial predictors are proposed. As an application we consider data from the Munich rent standard, where, for example, urban districts are treated as a categorial predictor. If independent variables are categorial, some modifications to usual shrinking procedures are necessary. Two $L_1$-penalty based methods for factor selection and clustering of categories are presented and investigated. The first approach is designed for nominal scale levels, the second one for ordinal predictors. Besides applying them to the Munich rent standard, methods are illustrated and compared in simulation studies.
Submitted 7 January, 2011;
originally announced January 2011.