-
Beyond the Coverage of Information Spreading: Analytical and Empirical Evidence of Re-exposure in Large-scale Online Social Networks
Authors:
Xin Lu,
Shuo Qin,
Petter Holme,
Fanhui Meng,
Yanqing Hu,
Fredrik Liljeros,
Gad Allon
Abstract:
Peer influence and social contagion are key denominators in the adoption and participation of information spreading, such as news propagation, word-of-mouth or viral marketing. In this study, we argue that it is biased to focus only on the scale and coverage of information spreading, and propose that the level of influence reinforcement, quantified by the re-exposure rate, i.e., the rate of individuals who are repeatedly exposed to the same information, should be considered together with them to measure the effectiveness of spreading. We show that local network structural characteristics significantly affect the probability of being exposed or re-exposed to the same information. After analyzing trending news on the super large-scale online network of Sina Weibo (China's Twitter) with 430 million connected users, we find a class of users with an extremely low exposure rate, even though they follow tens of thousands of others; and the re-exposure rate is substantially higher for news with more transmission waves and stronger secondary forwarding. While exposure and re-exposure rates typically grow together with the scale of spreading, we find exceptional cases where it is possible to achieve a high exposure rate while maintaining a low re-exposure rate, or vice versa.
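To make the two quantities in this abstract concrete, here is a minimal sketch of how an exposure rate and a re-exposure rate could be computed from a follower graph. It is not the authors' code. The function name `exposure_rates`, the input layout, and the choice of all users as the denominator are assumptions made for illustration; the abstract only defines re-exposure as being repeatedly exposed to the same information.

```python
def exposure_rates(followees, forwarders):
    """Return (exposure_rate, re_exposure_rate) for one news item.

    followees:  dict mapping each user to the set of accounts they follow
                (hypothetical input layout).
    forwarders: set of accounts that posted or forwarded the item.

    Assumption: a user is "exposed" if at least one followed account
    forwarded the item, and "re-exposed" if at least two did. Both rates
    are taken over all users in `followees`.
    """
    n_users = len(followees)
    if n_users == 0:
        raise ValueError("followees must contain at least one user")

    exposed = 0
    re_exposed = 0
    for following in followees.values():
        # Number of distinct followed accounts that carried the item.
        hits = len(following & forwarders)
        if hits >= 1:
            exposed += 1
        if hits >= 2:
            re_exposed += 1
    return exposed / n_users, re_exposed / n_users


# Toy example: four users, two forwarding accounts.
toy_followees = {
    "a": {"x", "y"},  # sees the item twice -> exposed and re-exposed
    "b": {"x"},       # sees it once -> exposed only
    "c": {"z"},       # follows nobody who forwarded it
    "d": set(),       # follows nobody
}
rates = exposure_rates(toy_followees, {"x", "y"})
```

In this toy graph a user who follows many accounts can still have zero exposure if none of them forwards the item, which is the kind of case the abstract highlights.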
Submitted 25 July, 2019;
originally announced July 2019.
-
Respondent-driven sampling bias induced by clustering and community structure in social networks
Authors:
Luis Enrique Correa Rocha,
Anna Ekeus Thorson,
Renaud Lambiotte,
Fredrik Liljeros
Abstract:
Sampling hidden populations is particularly challenging using standard sampling methods mainly because of the lack of a sampling frame. Respondent-driven sampling (RDS) is an alternative methodology that exploits the social contacts between peers to reach and weight individuals in these hard-to-reach populations. It is a snowball sampling procedure where the weight of the respondents is adjusted for the likelihood of being sampled due to differences in the number of contacts. In RDS, the structure of the social contacts thus defines the sampling process and affects its coverage, for instance by constraining the sampling within a sub-region of the network. In this paper we study the bias induced by network structures such as social triangles, community structure, and heterogeneities in the number of contacts, in the recruitment trees and in the RDS estimator. We simulate different scenarios of network structures and response-rates to study the potential biases one may expect in real settings. We find that the prevalence of the estimated variable is associated with the size of the network community to which the individual belongs. Furthermore, we observe that low-degree nodes may be under-sampled in certain situations if the sample and the network are of similar size. Finally, we also show that low response-rates lead to reasonably accurate average estimates of the prevalence but generate relatively large biases.
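The degree adjustment mentioned in this abstract can be sketched with one widely used estimator, the inverse-degree weighted estimator often called RDS-II (Volz-Heckathorn). The abstract does not say which estimator the paper evaluates, so treating it as RDS-II is an assumption, and the function name `rds_ii_estimate` is hypothetical.

```python
def rds_ii_estimate(degrees, has_trait):
    """Inverse-degree weighted prevalence estimate (RDS-II style).

    degrees:   reported number of contacts for each respondent (all > 0).
    has_trait: booleans, True if the respondent has the studied trait.

    Each respondent is weighted by 1/degree, because a person with many
    contacts is more likely to be reached by a recruitment chain.
    """
    if len(degrees) != len(has_trait):
        raise ValueError("degrees and has_trait must have the same length")
    if not degrees or any(d <= 0 for d in degrees):
        raise ValueError("every respondent needs a positive degree")

    weights = [1.0 / d for d in degrees]
    weighted_with_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_with_trait / sum(weights)


# Toy sample: the naive sample proportion is 2/4 = 0.5, but the
# low-degree respondent with the trait is weighted up.
estimate = rds_ii_estimate([1, 2, 4, 4], [True, False, True, False])
```

The weights only correct for unequal degree. They do not account for clustering or community structure, which is the source of bias this paper studies.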
Submitted 19 March, 2015;
originally announced March 2015.
-
Respondent-driven sampling and an unusual epidemic
Authors:
Jens Malmros,
Fredrik Liljeros,
Tom Britton
Abstract:
Respondent-driven sampling (RDS) is frequently used when sampling hard-to-reach and/or stigmatized communities. RDS utilizes a peer-driven recruitment mechanism where sampled individuals pass on participation coupons to at most $c$ of their acquaintances in the community ($c=3$ being a common choice), who then in turn pass on to their acquaintances if they choose to participate, and so on. This process of distributing coupons is shown to behave like a new Reed-Frost type network epidemic model, in which becoming infected corresponds to receiving a coupon. The difference from existing network epidemic models is that an infected individual cannot infect (i.e. sample) all of its contacts, but only at most $c$ of them. We calculate $R_0$, the probability of a major "outbreak", and the relative size of a major outbreak in the limit of infinite population size and evaluate their adequacy in finite populations. We study the effect of varying $c$ and compare RDS to the corresponding usual epidemic models, i.e. the case of $c=\infty$. Our results suggest that the number of coupons has a large effect on RDS recruitment. Additionally, we use our findings to explain previous empirical observations.
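The coupon-limited recruitment described above can be illustrated with a small generation-based simulation. This is a rough sketch under stated assumptions, not the model analysed in the paper: the function name `simulate_coupon_spread`, the adjacency-dict input, the uniform random choice of recipients, and the single participation probability `p_participate` are all hypothetical.

```python
import random


def simulate_coupon_spread(adj, seeds, c, p_participate=1.0, rng=None):
    """Simulate coupon passing on a contact network, generation by generation.

    adj:           dict mapping each node to an iterable of its contacts.
    seeds:         initial participants (assumed to take part).
    c:             maximum number of coupons each participant hands out
                   (an int; pass len(adj) to mimic the unrestricted case).
    p_participate: probability that a coupon recipient chooses to take
                   part and hand out coupons in turn (assumption).

    Returns the set of nodes that received a coupon, seeds included.
    """
    rng = rng or random.Random()
    received = set(seeds)      # "infected" means holding a coupon
    current = list(seeds)      # participants recruiting in this generation
    while current:
        next_generation = []
        for node in current:
            # Only contacts without a coupon can be given one.
            candidates = [v for v in adj[node] if v not in received]
            rng.shuffle(candidates)
            for contact in candidates[:c]:
                received.add(contact)
                if rng.random() < p_participate:
                    next_generation.append(contact)
        current = next_generation
    return received


# Star network: node 0 is connected to five leaves.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
limited = simulate_coupon_spread(star, [0], c=3, rng=random.Random(1))
unlimited = simulate_coupon_spread(star, [0], c=len(star), rng=random.Random(1))
```

On the star network the coupon cap is what stops recruitment: with `c=3` the hub reaches only three of its five contacts, while the unrestricted run reaches all of them.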
Submitted 11 November, 2014;
originally announced November 2014.
-
Implementation of Web-Based Respondent-Driven Sampling among Men who Have Sex with Men in Vietnam
Authors:
Linus Bengtsson,
Xin Lu,
Quoc Cuong Nguyen,
Martin Camitz,
Nguyen Le Hoang,
Fredrik Liljeros,
Anna Thorson
Abstract:
Objective: Lack of representative data about hidden groups, like men who have sex with men (MSM), hinders an evidence-based response to the HIV epidemics. Respondent-driven sampling (RDS) was developed to overcome sampling challenges in studies of populations like MSM for which sampling frames are absent. Internet-based RDS (webRDS) can potentially circumvent limitations of the original RDS method. We aimed to implement and evaluate webRDS among a hidden population.
Methods and Design: This cross-sectional study took place 18 February to 12 April, 2011 among MSM in Vietnam. Inclusion criteria were men, aged 18 and above, who had ever had sex with another man and were living in Vietnam. Participants were invited by an MSM friend, logged in, and answered a survey. Participants could recruit up to four MSM friends. We evaluated the system by its success in generating sustained recruitment and the degree to which the sample compositions stabilized with increasing sample size.
Results: Twenty starting participants generated 676 participants over 24 recruitment waves. Analyses did not show evidence of bias due to ineligible participation. Estimated mean age was 22 years and 82% came from the two large metropolitan areas. 32 out of 63 provinces were represented. The median number of sexual partners during the last six months was two. The sample composition stabilized well for 16 out of 17 variables.
Conclusion: Results indicate that webRDS could be implemented at a low cost among Internet-using MSM in Vietnam. WebRDS may be a promising method for sampling of Internet-using MSM and other hidden groups.
Key words: Respondent-driven sampling, Online sampling, Men who have sex with men, Vietnam, Sexual risk behavior
Submitted 8 June, 2012;
originally announced June 2012.
-
Respondent-driven Sampling on Directed Networks
Authors:
Xin Lu,
Jens Malmros,
Fredrik Liljeros,
Tom Britton
Abstract:
Respondent-driven sampling (RDS) is a commonly used substitute for random sampling when studying hidden populations, such as injecting drug users or men who have sex with men, for which no sampling frame is known. The method is an extension of the snowball sample method and can, given that some assumptions are met, generate unbiased population estimates. One key assumption, not likely to be met, is that the acquaintance network in which the recruitment process takes place is undirected, meaning that all recruiters should have the potential to be recruited by the person they recruit. Here we investigate the potential bias of directedness by simulating RDS on real and artificial network structures. We show that directedness is likely to generate bias that cannot be compensated for unless the sampled individuals know how many individuals could potentially have recruited them (i.e. their indegree), which is unlikely in most situations. Based on one known parameter, we propose an estimator for RDS on directed networks when only outdegrees are observed.
By comparing the performance of current RDS estimators on networks with varying structures, we find that our new estimator, together with a recent estimator that requires the population size as a known quantity, has relatively low levels of estimation error and bias. Based on our new estimator, sensitivity analysis can be carried out by varying the value of the known parameter to take uncertainty about network directedness and error in reported degrees into account. Finally, we have developed a bootstrap procedure for the new estimator to construct confidence intervals.
Submitted 29 April, 2012; v1 submitted 9 January, 2012;
originally announced January 2012.
-
The Sensitivity of Respondent-driven Sampling Method
Authors:
Xin Lu,
Linus Bengtsson,
Tom Britton,
Martin Camitz,
Beom Jun Kim,
Anna Thorson,
Fredrik Liljeros
Abstract:
Researchers in many scientific fields make inferences from individuals to larger groups. For many groups, however, there is no list of members from which to take a random sample. Respondent-driven sampling (RDS) is a relatively new sampling methodology that circumvents this difficulty by using the social networks of the groups under study. The RDS method has been shown to provide unbiased estimates of population proportions given certain conditions. The method is now widely used in the study of HIV-related high-risk populations globally. In this paper, we test the RDS methodology by simulating RDS studies on the social networks of a large LGBT web community. The robustness of the RDS method is tested by violating, one by one, the conditions under which the method provides unbiased estimates. Results reveal that the risk of bias is large if networks are directed, or if respondents choose to invite persons based on characteristics that are correlated with the study outcomes. If these two problems are absent, the RDS method shows strong resistance to low response rates and certain errors in the participants' reporting of their network sizes. Other issues that might affect the RDS estimates, such as the method for choosing initial participants, the maximum number of recruitments per participant, sampling with or without replacement and variations in network structures, are also simulated and discussed.
Submitted 16 February, 2010; v1 submitted 11 February, 2010;
originally announced February 2010.