-
Economic growth of cities: Does resource allocation matter?
Authors:
Sheng Dai,
Timo Kuosmanen,
Zhiqiang Liao
Abstract:
We study how efficient resource reallocation across cities affects potential aggregate growth. Using optimal resource allocation models and data on 284 prefecture-level cities in China over the years 2003--2019, we quantitatively measure the cost of misallocation of resources. We show that average aggregate output gains from reallocating resources across cities nationwide to their efficient use are 1.349- and 1.287-fold in the perfect and imperfect allocation scenarios, respectively. We further provide evidence on the effects of administrative division adjustments and local allocation. This evidence suggests that city-level adjustments can yield more aggregate gain and that the output gain from nationwide allocation is likely to be more substantial than that from local allocation. Policy implications are proposed to improve the resource allocation efficiency in China.
Submitted 7 October, 2024;
originally announced October 2024.
-
Variable selection in convex nonparametric least squares via structured Lasso: An application to the Swedish electricity distribution networks
Authors:
Zhiqiang Liao
Abstract:
We study the problem of variable selection in convex nonparametric least squares (CNLS). Whereas the least absolute shrinkage and selection operator (Lasso) is a popular technique for least squares, its variable selection performance is unknown in CNLS problems. In this work, we investigate the performance of the Lasso estimator and find that it is usually unable to select variables efficiently. Exploiting the unique structure of the subgradients in CNLS, we develop a structured Lasso method by combining the $\ell_1$-norm and the $\ell_{\infty}$-norm. A relaxed version of the structured Lasso is proposed for achieving model sparsity and predictive performance simultaneously, where we can control the two effects (variable selection and model shrinkage) using separate tuning parameters. A Monte Carlo study is implemented to verify the finite sample performance of the proposed approaches. We also use real data from Swedish electricity distribution networks to illustrate the effects of the proposed variable selection techniques. The results from the simulation and application confirm that the proposed structured Lasso performs favorably, generally leading to sparser and more accurate predictive models, relative to the conventional Lasso methods in the literature.
Submitted 13 November, 2024; v1 submitted 3 September, 2024;
originally announced September 2024.
-
Overfitting Reduction in Convex Regression
Authors:
Zhiqiang Liao,
Sheng Dai,
Eunji Lim,
Timo Kuosmanen
Abstract:
Convex regression is a method for estimating a convex function from a data set. This method has played an important role in operations research, economics, machine learning, and many other areas. However, it has been empirically observed that convex regression produces inconsistent estimates of convex functions and extremely large subgradients near the boundary as the sample size increases. In this paper, we provide theoretical evidence of this overfitting behavior. To eliminate this behavior, we propose two new estimators by placing a bound on the subgradients of the convex function. We further show that our proposed estimators can reduce overfitting by proving that they converge to the underlying true convex function and that their subgradients converge to the gradient of the underlying function, both uniformly over the domain with probability one as the sample size increases to infinity. An application to Finnish electricity distribution firms confirms the superior performance of the proposed methods in predictive power over the existing methods.
Submitted 16 October, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Some Finite-Sample Results on the Hausman Test
Authors:
Jinyong Hahn,
Zhipeng Liao,
Nan Liu,
Shuyang Sheng
Abstract:
This paper shows that the endogeneity test using the control function approach in linear instrumental variable models is a variant of the Hausman test. Moreover, we find that the test statistics used in these tests can be numerically ordered, indicating their relative power properties in finite samples.
Submitted 16 December, 2023;
originally announced December 2023.
-
Logit-based alternatives to two-stage least squares
Authors:
Denis Chetverikov,
Jinyong Hahn,
Zhipeng Liao,
Shuyang Sheng
Abstract:
We propose logit-based IV and augmented logit-based IV estimators that serve as alternatives to the traditionally used 2SLS estimator in the model where both the endogenous treatment variable and the corresponding instrument are binary. Our novel estimators are as easy to compute as the 2SLS estimator but have an advantage over the 2SLS estimator in terms of causal interpretability. In particular, in certain cases where the probability limits of both our estimators and the 2SLS estimator take the form of weighted-average treatment effects, our estimators are guaranteed to yield non-negative weights whereas the 2SLS estimator is not.
Submitted 16 December, 2023;
originally announced December 2023.
-
Standard errors when a regressor is randomly assigned
Authors:
Denis Chetverikov,
Jinyong Hahn,
Zhipeng Liao,
Andres Santos
Abstract:
We examine asymptotic properties of the OLS estimator when the values of the regressor of interest are assigned randomly and independently of other regressors. We find that the OLS variance formula in this case is often simplified, sometimes substantially. In particular, when the regressor of interest is independent not only of other regressors but also of the error term, the textbook homoscedastic variance formula is valid even if the error term and auxiliary regressors exhibit a general dependence structure. In the context of randomized controlled trials, this conclusion holds in completely randomized experiments with constant treatment effects. When the error term is heteroscedastic with respect to the regressor of interest, the variance formula has to be adjusted not only for heteroscedasticity but also for the correlation structure of the error term. However, even in the latter case, some simplifications are possible, as only a part of the correlation structure of the error term should be taken into account. In the context of randomized controlled trials, this implies that the textbook homoscedastic variance formula is typically not valid if treatment effects are heterogeneous, but heteroscedasticity-robust variance formulas are valid if treatment effects are independent across units, even if the error term exhibits a general dependence structure. In addition, we extend the results to the case when the regressor of interest is assigned randomly at a group level, such as in randomized controlled trials with treatment assignment determined at a group (e.g., school/village) level.
Submitted 17 March, 2023;
originally announced March 2023.