-
The mosaic permutation test: an exact and nonparametric goodness-of-fit test for factor models
Authors:
Asher Spector,
Rina Foygel Barber,
Trevor Hastie,
Ronald N. Kahn,
Emmanuel Candès
Abstract:
Financial firms often rely on fundamental factor models to explain correlations among asset returns and manage risk. Yet after major events, e.g., COVID-19, analysts may reassess whether existing risk models continue to fit well: specifically, after accounting for a set of known factor exposures, are the residuals of the asset returns independent? With this motivation, we introduce the mosaic permutation test, a nonparametric goodness-of-fit test for preexisting factor models. Our method can leverage modern machine learning techniques to detect model violations while provably controlling the false positive rate, i.e., the probability of rejecting a well-fitting model, without making asymptotic approximations or parametric assumptions. This property helps prevent analysts from unnecessarily rebuilding accurate models, which can waste resources and increase risk. To illustrate our methodology, we apply the mosaic permutation test to the BlackRock Fundamental Equity Risk (BFRE) model. Although the BFRE model generally explains the most significant correlations among assets, we find evidence of unexplained correlations among certain real estate stocks, and we show that adding new factors improves model fit. We implement our methods in the Python package mosaicperm.
Submitted 26 September, 2024; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Evaluating the Performance of Low-Cost PM2.5 Sensors in Mobile Settings
Authors:
Priyanka deSouza,
An Wang,
Yuki Machida,
Tiffany Duhl,
Simone Mora,
Prashant Kumar,
Ralph Kahn,
Carlo Ratti,
John L. Durant,
Neelakshi Hudda
Abstract:
Low-cost sensors (LCS) for measuring air pollution are increasingly being deployed in mobile applications, but questions concerning the quality of the measurements remain unanswered. For example, what is the best way to correct LCS data in a mobile setting? Which factors most significantly contribute to differences between mobile LCS data and higher-quality instruments? Can data from LCS be used to identify hotspots and generate generalizable pollutant concentration maps? To help address these questions, we deployed low-cost PM2.5 sensors (Alphasense OPC-N3) and a research-grade instrument (TSI DustTrak) in a mobile laboratory in Boston, MA, USA. We first collocated these instruments with stationary PM2.5 reference monitors at nearby regulatory sites. Next, using the reference measurements, we developed different models to correct the OPC-N3 and DustTrak measurements, and then transferred the corrections to the mobile setting. We observed that more complex correction models appeared to perform better than simpler models in the stationary setting; however, when transferred to the mobile setting, corrected OPC-N3 measurements agreed less well with corrected DustTrak data. In general, corrections developed using minute-level collocation measurements transferred better to the mobile setting than corrections developed using hourly-averaged data. Mobile laboratory speed, OPC-N3 orientation relative to the direction of travel, date, hour of the day, and road class together explain a small but significant amount of variation between corrected OPC-N3 and DustTrak measurements during the mobile deployment. Persistent hotspots identified by the OPC-N3s agreed with those identified by the DustTrak. Similarly, maps of PM2.5 distribution produced from the corrected mobile OPC-N3 and DustTrak measurements agreed well.
Submitted 10 January, 2023;
originally announced January 2023.
-
An analysis of degradation in low-cost particulate matter sensors
Authors:
Priyanka deSouza,
Karoline Barkjohn,
Andrea Clements,
Jenny Lee,
Ralph Kahn,
Ben Crawford,
Patrick Kinney
Abstract:
Low-cost sensors (LCS) are increasingly being used to measure fine particulate matter (PM2.5) concentrations in cities around the world. One of the most commonly deployed LCS is the PurpleAir, with about 15,000 sensors deployed in the United States. However, the change in sensor performance over time has not been well studied. It is important to understand the lifespan of these sensors to determine when they should be replaced, and when measurements from these devices should or should not be used for various applications. This paper fills this gap by leveraging two facts: 1) each PurpleAir sensor comprises two identical sensors, so the divergence between their measurements can be observed, and 2) numerous PurpleAir sensors are located within 50 meters of regulatory monitors, allowing measurements from the two instruments to be compared. We propose empirically derived degradation outcomes for the PurpleAir sensors and evaluate how these outcomes change over time. On average, we find that the number of 'flagged' measurements, where the two sensors within each PurpleAir disagree, increases over time to 4 percent after 4 years of operation. Approximately 2 percent of all PurpleAir sensors were permanently degraded. The largest fraction of permanently degraded PurpleAir sensors appeared to be in the hot and humid climate zone, suggesting that the sensors in this zone may need to be replaced sooner. We also find that the bias of PurpleAir sensors, or the difference between corrected PM2.5 levels and the corresponding reference measurements, changed over time by -0.12 ug/m3 (95% CI: -0.13 ug/m3, -0.11 ug/m3) per year. The average bias increases dramatically after 3.5 years. Climate zone is a significant modifier of the association between degradation outcomes and time.
Submitted 26 October, 2022; v1 submitted 26 October, 2022;
originally announced October 2022.