-
Can machine learning predict citizen-reported angler behavior?
Authors:
Julia S. Schmid,
Sean Simmons,
Mark A. Lewis,
Mark S. Poesch,
Pouria Ramazi
Abstract:
Prediction of angler behaviors, such as catch rates and angler pressure, is essential to maintaining fish populations and ensuring angler satisfaction. Angler behavior can partly be tracked by online platforms and mobile phone applications that provide fishing activities reported by recreational anglers. Moreover, angler behavior is known to be driven by local site attributes. Here, the prediction of citizen-reported angler behavior was investigated by machine-learning methods using auxiliary data on the environment, socioeconomics, fisheries management objectives, and events at a freshwater body. The goal was to determine whether auxiliary data alone could predict the reported behavior. Different spatial and temporal extents and temporal resolutions were considered. Accuracy scores averaged 88% for monthly predictions at single water bodies and 86% for spatial predictions on a day in a specific region across Canada. At other resolutions and scales, the models only achieved low prediction accuracy of around 60%. The study represents a first attempt at predicting angler behavior in time and space at a large scale and establishes a foundation for potential future expansions in various directions.
Submitted 7 February, 2024;
originally announced February 2024.
-
Boosting propagule transport models with individual-specific data from mobile apps
Authors:
Samuel M. Fischer,
Pouria Ramazi,
Sean Simmons,
Mark S. Poesch,
Mark A. Lewis
Abstract:
Management of invasive species and pathogens requires information about the traffic of potential vectors. Such information is often taken from vector traffic models fitted to survey data. Here, user-specific data collected via mobile apps offer new opportunities to obtain more accurate estimates and to analyze how vectors' individual preferences affect propagule flows. However, data voluntarily reported via apps may lack some trip records, adding a significant layer of uncertainty. We show how the benefits of app-based data can be exploited despite this drawback.
Based on data collected via an angler app, we built a stochastic model for angler traffic in the Canadian province of Alberta. There, anglers facilitate the spread of whirling disease, a parasite-induced fish disease. The model is temporally and spatially explicit and accounts for individual preferences and repeating behaviour of anglers, helping to address the problem of missing trip records.
We obtained estimates of angler traffic between all subbasins in Alberta. The model's accuracy exceeds that of direct empirical estimates even when fewer data were used to fit the model. The results indicate that anglers' local preferences and their tendency to revisit previous destinations reduce the number of long inter-waterbody trips potentially dispersing whirling disease. According to our model, anglers revisit their previous destination in 64% of their trips, making these trips irrelevant for the spread of whirling disease. Furthermore, 54% of fishing trips end in individual-specific spatially contained areas with a mean radius of 54.7 km. Finally, although the fraction of trips that anglers report was unknown, we were able to estimate the total yearly number of fishing trips in Alberta, matching an independent empirical estimate.
Submitted 13 December, 2022; v1 submitted 29 May, 2021;
originally announced May 2021.
-
Enabling Privacy-Preserving GWAS in Heterogeneous Human Populations
Authors:
Sean Simmons,
Cenk Sahinalp,
Bonnie Berger
Abstract:
The projected increase of genotyping in the clinic and the rise of large genomic databases have led to the possibility of using patient medical data to perform genome-wide association studies (GWAS) on a larger scale and at a lower cost than ever before. Due to privacy concerns, however, access to this data is limited to a few trusted individuals, greatly reducing its impact on biomedical research. Privacy-preserving methods have been suggested as a way of allowing more people access to this precious data while protecting patients. In particular, there has been growing interest in applying the concept of differential privacy to GWAS results. Unfortunately, previous approaches for performing differentially private GWAS are based on rather simple statistics that have some major limitations. In particular, they do not correct for population stratification, a major issue when dealing with the genetically diverse populations present in modern GWAS. To address this concern, we introduce a novel computational framework for performing GWAS that tailors ideas from differential privacy to protect private phenotype information, while at the same time correcting for population stratification. This framework allows us to produce privacy-preserving GWAS results based on two of the most commonly used GWAS statistics: EIGENSTRAT and linear mixed model (LMM) based statistics. We test our differentially private statistics, PrivSTRAT and PrivLMM, on both simulated and real GWAS datasets and find that they are able to protect privacy while returning meaningful GWAS results.
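As a rough illustration of the differential-privacy idea the abstract refers to, the sketch below applies the generic Laplace mechanism to a vector of association statistics. It is not PrivSTRAT or PrivLMM; the function names, the example statistics, and the sensitivity value are hypothetical, and a real analysis would have to derive the sensitivity bound for the statistic actually used.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_release(stats, sensitivity, epsilon, rng=None):
    """Release `stats` with epsilon-differential privacy (Laplace mechanism).

    `sensitivity` is an assumed bound on the L1 change of the whole
    vector `stats` when one individual's phenotype changes. It is a
    hypothetical input here, not a value taken from the paper.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # smaller epsilon means more noise
    return [s + laplace_noise(scale, rng) for s in stats]


if __name__ == "__main__":
    # Hypothetical association statistics for three SNPs.
    snp_stats = [12.4, 0.8, 3.1]
    noisy = private_release(snp_stats, sensitivity=1.0, epsilon=0.5,
                            rng=random.Random(0))
    print(noisy)
```

The noise scale is `sensitivity / epsilon`, so a stricter privacy budget (smaller `epsilon`) yields noisier released statistics; this trade-off is what the abstract means by protecting privacy while still returning meaningful results.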
Submitted 15 April, 2016;
originally announced April 2016.