Search | arXiv e-print repository

Automated Assessment of Residual Plots with Computer Vision Models

Authors: Weihao Li, Dianne Cook, Emi Tanaka, Susan VanderPlas, Klaus Ackermann

Abstract: Plotting the residuals is a recommended procedure to diagnose deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineup protocol to do visual inference. There are a variety of conventional residual tests, but the lineup protocol, used as a statistical test, performs better for d… ▽ More Plotting the residuals is a recommended procedure to diagnose deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineup protocol to do visual inference. There are a variety of conventional residual tests, but the lineup protocol, used as a statistical test, performs better for diagnostic purposes because it is less sensitive and applies more broadly to different types of departures. However, the lineup protocol relies on human judgment which limits its scalability. This work presents a solution by providing a computer vision model to automate the assessment of residual plots. It is trained to predict a distance measure that quantifies the disparity between the residual distribution of a fitted classical normal linear regression model and the reference distribution, based on Kullback-Leibler divergence. From extensive simulation studies, the computer vision model exhibits lower sensitivity than conventional tests but higher sensitivity than human visual tests. It is slightly less effective on non-linearity patterns. Several examples from classical papers and contemporary data illustrate the new procedures, highlighting its usefulness in automating the diagnostic process and supplementing existing methods. △ Less

Submitted 1 November, 2024; originally announced November 2024.

arXiv:2301.07465 [pdf, other]

doi 10.1145/3544548.3580780

QButterfly: Lightweight Survey Extension for Online User Interaction Studies for Non-Tech-Savvy Researchers

Authors: Nico Ebert, Björn Scheppler, Kurt Ackermann, Tim Geppert

Abstract: We provide a user-friendly, flexible, and lightweight open-source HCI toolkit (github.com/QButterfly) that allows non-tech-savvy researchers to conduct online user interaction studies using the widespread Qualtrics and LimeSurvey platforms. These platforms already provide rich functionality (e.g., for experiments or usability tests) and therefore lend themselves to an extension to display stimulus… ▽ More We provide a user-friendly, flexible, and lightweight open-source HCI toolkit (github.com/QButterfly) that allows non-tech-savvy researchers to conduct online user interaction studies using the widespread Qualtrics and LimeSurvey platforms. These platforms already provide rich functionality (e.g., for experiments or usability tests) and therefore lend themselves to an extension to display stimulus web pages and record clickstreams. The toolkit consists of a survey template with embedded JavaScript, a JavaScript library embedded in the HTML web pages, and scripts to analyze the collected data. No special programming skills are required to set up a study or match survey data and user interaction data after data collection. We empirically validated the software in a laboratory and a field study. We conclude that this extension, even in its preliminary version, has the potential to make online user interaction studies (e.g., with crowdsourced participants) accessible to a broader range of researchers. △ Less

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2209.14594 [pdf, other]

Bayesian Neural Network Versus Ex-Post Calibration For Prediction Uncertainty

Authors: Satya Borgohain, Klaus Ackermann, Ruben Loaiza-Maya

Abstract: Probabilistic predictions from neural networks which account for predictive uncertainty during classification is crucial in many real-world and high-impact decision making settings. However, in practice most datasets are trained on non-probabilistic neural networks which by default do not capture this inherent uncertainty. This well-known problem has led to the development of post-hoc calibration… ▽ More Probabilistic predictions from neural networks which account for predictive uncertainty during classification is crucial in many real-world and high-impact decision making settings. However, in practice most datasets are trained on non-probabilistic neural networks which by default do not capture this inherent uncertainty. This well-known problem has led to the development of post-hoc calibration procedures, such as Platt scaling (logistic), isotonic and beta calibration, which transforms the scores into well calibrated empirical probabilities. A plausible alternative to the calibration approach is to use Bayesian neural networks, which directly models a predictive distribution. Although they have been applied to images and text datasets, they have seen limited adoption in the tabular and small data regime. In this paper, we demonstrate that Bayesian neural networks yields competitive performance when compared to calibrated neural networks and conduct experiments across a wide array of datasets. △ Less

Submitted 29 September, 2022; originally announced September 2022.

ACM Class: I.2

arXiv:2203.10716 [pdf, other]

Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices

Authors: Hansika Hewamalage, Klaus Ackermann, Christoph Bergmeir

Abstract: Machine Learning (ML) and Deep Learning (DL) methods are increasingly replacing traditional methods in many domains involved with important decision making activities. DL techniques tailor-made for specific tasks such as image recognition, signal processing, or speech analysis are being introduced at a fast pace with many improvements. However, for the domain of forecasting, the current state in t… ▽ More Machine Learning (ML) and Deep Learning (DL) methods are increasingly replacing traditional methods in many domains involved with important decision making activities. DL techniques tailor-made for specific tasks such as image recognition, signal processing, or speech analysis are being introduced at a fast pace with many improvements. However, for the domain of forecasting, the current state in the ML community is perhaps where other domains such as Natural Language Processing and Computer Vision were at several years ago. The field of forecasting has mainly been fostered by statisticians/econometricians; consequently the related concepts are not the mainstream knowledge among general ML practitioners. The different non-stationarities associated with time series challenge the data-driven ML models. Nevertheless, recent trends in the domain have shown that with the availability of massive amounts of time series, ML techniques are quite competent in forecasting, when related pitfalls are properly handled. Therefore, in this work we provide a tutorial-like compilation of the details of one of the most important steps in the overall forecasting process, namely the evaluation. This way, we intend to impart the information of forecast evaluation to fit the context of ML, as means of bridging the knowledge gap between traditional methods of forecasting and state-of-the-art ML techniques. We elaborate on the different problematic characteristics of time series such as non-normalities and non-stationarities and how they are associated with common pitfalls in forecast evaluation. Best practices in forecast evaluation are outlined with respect to the different steps such as data partitioning, error calculation, statistical testing, and others. Further guidelines are also provided along selecting valid and suitable error measures depending on the specific characteristics of the dataset at hand. △ Less

Submitted 4 April, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

arXiv:2101.08021 [pdf]

doi 10.1145/3411764.3445516

Bolder is Better: Raising User Awareness through Salient and Concise Privacy Notices

Authors: Nico Ebert, Kurt Alexander Ackermann, Björn Scheppler

Abstract: This paper addresses the question whether the recently proposed approach of concise privacy notices in apps and on websites is effective in raising user awareness. To assess the effectiveness in a realistic setting, we included concise notices in a fictitious but realistic fitness tracking app and asked participants recruited from an online panel to provide their feedback on the usability of the a… ▽ More This paper addresses the question whether the recently proposed approach of concise privacy notices in apps and on websites is effective in raising user awareness. To assess the effectiveness in a realistic setting, we included concise notices in a fictitious but realistic fitness tracking app and asked participants recruited from an online panel to provide their feedback on the usability of the app as a cover story. Importantly, after giving feedback, users were also asked to recall the data practices described in the notices. The experimental setup included the variation of different levels of saliency and riskiness of the privacy notices. Based on a total sample of 2,274 participants, our findings indicate that concise privacy notices are indeed a promising approach to raise user awareness for privacy information when displayed in a salient way, especially in case the notices describe risky data practices. Our results may be helpful for regulators, user advocates and transparency-oriented companies in creating or enforcing better privacy transparency towards average users that do not read traditional privacy policies. △ Less

Submitted 20 January, 2021; originally announced January 2021.

arXiv:2010.08102 [pdf, other]

Estimating Sleep & Work Hours from Alternative Data by Segmented Functional Classification Analysis (SFCA)

Authors: Klaus Ackermann, Simon D. Angus, Paul A. Raschky

Abstract: Alternative data is increasingly adapted to predict human and economic behaviour. This paper introduces a new type of alternative data by re-conceptualising the internet as a data-driven insights platform at global scale. Using data from a unique internet activity and location dataset drawn from over 1.5 trillion observations of end-user internet connections, we construct a functional dataset cove… ▽ More Alternative data is increasingly adapted to predict human and economic behaviour. This paper introduces a new type of alternative data by re-conceptualising the internet as a data-driven insights platform at global scale. Using data from a unique internet activity and location dataset drawn from over 1.5 trillion observations of end-user internet connections, we construct a functional dataset covering over 1,600 cities during a 7 year period with temporal resolution of just 15min. To predict accurate temporal patterns of sleep and work activity from this data-set, we develop a new technique, Segmented Functional Classification Analysis (SFCA), and compare its performance to a wide array of linear, functional, and classification methods. To confirm the wider applicability of SFCA, in a second application we predict sleep and work activity using SFCA from US city-wide electricity demand functional data. Across both problems, SFCA is shown to out-perform current methods. △ Less

Submitted 15 October, 2020; originally announced October 2020.

arXiv:2009.05455 [pdf, other]

Object Recognition for Economic Development from Daytime Satellite Imagery

Authors: Klaus Ackermann, Alexey Chernikov, Nandini Anantharama, Miethy Zaman, Paul A Raschky

Abstract: Reliable data about the stock of physical capital and infrastructure in developing countries is typically very scarce. This is particular a problem for data at the subnational level where existing data is often outdated, not consistently measured or coverage is incomplete. Traditional data collection methods are time and labor-intensive costly, which often prohibits developing countries from colle… ▽ More Reliable data about the stock of physical capital and infrastructure in developing countries is typically very scarce. This is particular a problem for data at the subnational level where existing data is often outdated, not consistently measured or coverage is incomplete. Traditional data collection methods are time and labor-intensive costly, which often prohibits developing countries from collecting this type of data. This paper proposes a novel method to extract infrastructure features from high-resolution satellite images. We collected high-resolution satellite images for 5 million 1km $\times$ 1km grid cells covering 21 African countries. We contribute to the growing body of literature in this area by training our machine learning algorithm on ground-truth data. We show that our approach strongly improves the predictive accuracy. Our methodology can build the foundation to then predict subnational indicators of economic development for areas where this data is either missing or unreliable. △ Less

Submitted 11 September, 2020; originally announced September 2020.

Comments: 8 pages, 7 figures

arXiv:1701.05632 [pdf, other]

The Internet as Quantitative Social Science Platform: Insights from a Trillion Observations

Authors: Klaus Ackermann, Simon D Angus, Paul A Raschky

Abstract: With the large-scale penetration of the internet, for the first time, humanity has become linked by a single, open, communications platform. Harnessing this fact, we report insights arising from a unified internet activity and location dataset of an unparalleled scope and accuracy drawn from over a trillion (1.5$\times 10^{12}$) observations of end-user internet connections, with temporal resoluti… ▽ More With the large-scale penetration of the internet, for the first time, humanity has become linked by a single, open, communications platform. Harnessing this fact, we report insights arising from a unified internet activity and location dataset of an unparalleled scope and accuracy drawn from over a trillion (1.5$\times 10^{12}$) observations of end-user internet connections, with temporal resolution of just 15min over 2006-2012. We first apply this dataset to the expansion of the internet itself over 1,647 urban agglomerations globally. We find that unique IP per capita counts reach saturation at approximately one IP per three people, and take, on average, 16.1 years to achieve; eclipsing the estimated 100- and 60- year saturation times for steam-power and electrification respectively. Next, we use intra-diurnal internet activity features to up-scale traditional over-night sleep observations, producing the first global estimate of over-night sleep duration in 645 cities over 7 years. We find statistically significant variation between continental, national and regional sleep durations including some evidence of global sleep duration convergence. Finally, we estimate the relationship between internet concentration and economic outcomes in 411 OECD regions and find that the internet's expansion is associated with negative or positive productivity gains, depending strongly on sectoral considerations. To our knowledge, our study is the first of its kind to use online/offline activity of the entire internet to infer social science insights, demonstrating the unparalleled potential of the internet as a social data-science platform. △ Less

Submitted 19 January, 2017; originally announced January 2017.

Comments: 40 pages, including 4 main figures, and appendix

Showing 1–8 of 8 results for author: Ackermann, K