Showing 1–2 of 2 results for author: Sande, L S

Search v0.5.6 released 2020-02-24

arXiv:2111.05269 [pdf]

cs.MS cs.SI stat.AP

A set of R packages to estimate population counts from mobile phone data

Authors: Bogdan Oancea, David Salgado, Luis Sanguiao Sande, Sandra Barragan

Abstract: In this paper, we describe the software implementation of the methodological framework designed to incorporate mobile phone data into the current production chain of official statistics during the ESSnet Big Data II project. We present an overview of the architecture of the software stack, its components, and the interfaces between them, and show how they can be used. Our software implementation consists of four R packages: destim, for estimating the spatial distribution of the mobile devices; deduplication, for classifying devices as being in 1:1 or 2:1 correspondence with their owners; aggregation, for estimating the number of individuals detected by the network from the geolocation probabilities and the duplicity probabilities; and inference, which combines the number of individuals provided by the previous package with other information, such as population counts from an official register and mobile operator penetration rates, to estimate the target population counts.

Submitted 4 November, 2021; originally announced November 2021.

Comments: 16 pages, 4 figures

Journal ref: ROMANIAN STATISTICAL REVIEW, Issue 1, Page 17-38, 2021

arXiv:2003.11423 [pdf, ps, other]

stat.ML cs.LG math.ST

Design-unbiased statistical learning in survey sampling

Authors: Luis Sanguiao Sande, Li-Chun Zhang

Abstract: Design-consistent model-assisted estimation has become the standard practice in survey sampling. However, a general theory that allows one to incorporate modern machine-learning techniques, which can lead to potentially much more powerful assisting models, is so far lacking. We propose a subsampling Rao-Blackwell method, and develop a statistical learning theory for exactly design-unbiased estimation with the help of linear or non-linear prediction models. Our approach makes use of classic ideas from Statistical Science as well as the rapidly growing field of Machine Learning. Given rich auxiliary information, it can yield considerable efficiency gains over standard linear model-assisted methods, while ensuring valid estimation for the given target population, which is robust against potential mis-specifications of the assisting model at the individual level.

Submitted 25 March, 2020; originally announced March 2020.
