One-step Estimation of Networked Population Size: Respondent-Driven Capture-Recapture with Anonymity

Khan, Bilal; Lee, Hsuan-Wei; Fellows, Ian; Dombrowski, Kirk

doi:10.1371/journal.pone.0195959

Abstract: Population size estimates for hidden and hard-to-reach populations are particularly important when members are known to suffer from disproportionate health issues or to pose health risks to the larger ambient population in which they are embedded. Efforts to derive size estimates are often frustrated by a range of factors that preclude conventional survey strategies, including social stigma associated with group membership or members' involvement in illegal activities.
This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, to be used in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. Provably sufficient conditions for the consistency of these estimators (in large configuration networks) are given. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which are seen to perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable tradeoffs between anonymity guarantees and estimator performance.
Taken together, the results demonstrate that reasonable population estimates can be derived from anonymous respondent-driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. Limitations and future work are discussed in the concluding section.

Subjects:	Social and Information Networks (cs.SI); Combinatorics (math.CO); Methodology (stat.ME)
MSC classes:	05C80, 91D30
ACM classes:	G.2.2; G.3
Cite as:	arXiv:1710.03953 [cs.SI]
	(or arXiv:1710.03953v1 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.1710.03953
Related DOI:	https://doi.org/10.1371/journal.pone.0195959
