Model-Assisted Survey Regression Estimation With The Lasso
Abstract
In the U.S. Forest Service’s Forest Inventory and Analysis (FIA) program, as in other natural resource surveys, many auxiliary variables are available for use in model-assisted inference about finite population parameters. Some of this auxiliary information may be extraneous, and therefore model selection is appropriate to improve the efficiency of the survey regression estimators of finite population totals. A model-assisted survey regression estimator using the lasso is presented and extended to the adaptive lasso. For a sequence of finite populations and probability sampling designs, asymptotic properties of the lasso survey regression estimator are derived, including design consistency and central limit theory for the estimator and design consistency of a variance estimator. To estimate multiple finite population quantities with the method, lasso survey regression weights are developed, using both a model calibration approach and a ridge regression approximation. The gains in efficiency of the lasso estimator over the full regression estimator are demonstrated through a simulation study estimating tree canopy cover for a region in Utah.