-
Large scale study of primary school student performance relative to their LMS activity and socioeconomic demographics using a Bayesian Additive Regression Trees containing random effects
Authors:
Natalia da Silva,
Bruno Tancredi,
Ignacio Alvarez-Castro
Abstract:
Using data collected on almost every 9-12 year old student in Uruguay, we show how to apply Bayesian Additive Regression Trees (BART) with random effects to study the association of performance with Learning Management System (LMS) activity and socioeconomic status. Performance data are joined with LMS activity pattern data. BART is chosen because it allows school-level random effects to be included. The model can be used for early identification of at-risk students, and it highlights schools that are successful or need intervention. An interesting finding is that high levels of LMS usage show larger positive effects on performance in low socioeconomic status groups.
Submitted 20 June, 2025;
originally announced July 2025.
-
SpICE: An interpretable method for spatial data
Authors:
Natalia da Silva,
Ignacio Alvarez-Castro,
Leonardo Moreno,
Andrés Sosa
Abstract:
Statistical learning methods are widely used to tackle complex problems because of their flexibility, good predictive performance, and ability to capture complex relationships among variables. Additionally, recently developed automatic workflows have provided a standardized approach to implementing statistical learning methods across various applications. However, these tools highlight one of the main drawbacks of statistical learning: the lack of interpretability of its results. In the past few years, a considerable amount of research has focused on methods for interpreting black box models. Interpretable statistical learning methods are important for gaining a deeper understanding of the model. In problems where spatial information is relevant, combining interpretable methods with spatial data can help to better understand the problem and interpret the results.
This paper focuses on the individual conditional expectation plot (ICE-plot), a model-agnostic method for interpreting statistical learning models, and combines it with spatial information. An extension of the ICE-plot is proposed in which spatial information is used as a restriction to define Spatial ICE curves (SpICE). Spatial ICE curves are estimated using real data in the context of an economic problem concerning property valuation in Montevideo, Uruguay. Understanding the key factors that influence property valuation is essential for decision-making, and spatial data plays a relevant role in this regard.
Submitted 11 November, 2023;
originally announced November 2023.
-
Approximate Bayesian inference for a "steps and turns" continuous-time random walk observed at regular time intervals
Authors:
Sofia Ruiz-Suarez,
Vianey Leos-Barajas,
Ignacio Alvarez-Castro,
Juan M. Morales
Abstract:
The study of animal movement is challenging because it is a process modulated by many factors acting at different spatial and temporal scales. Several models have been proposed, which differ primarily in their temporal conceptualization, namely continuous-time and discrete-time formulations. Naturally, animal movement occurs in continuous time, but we tend to observe it at fixed time intervals. To account for the temporal mismatch between observations and movement decisions, we use a state-space model in which movement decisions (steps and turns) are made in continuous time. The movement process is then observed at regular time intervals. Because the likelihood function of this state-space model is complex to calculate, yet simulating data from it is straightforward, we conduct inference using a few variations of Approximate Bayesian Computation (ABC). In a simulation study, we explore the applicability of these methods as a function of the discrepancy between the temporal scale of the observations and that of the movement process. We demonstrate the application of this model to a real trajectory of a sheep that was reconstructed at high resolution using information from magnetometer and GPS devices. Our results suggest that accurate estimates can be obtained when the interval between observations is less than 5 times the average time between changes in movement direction. The state-space model used here allowed us to connect the scales of the observations and movement decisions in an intuitive and easy-to-interpret way. Our findings underscore the idea that the time scale at which animal movement decisions are made needs to be considered when designing data collection protocols, and that high-frequency data may sometimes not be necessary to obtain good estimates of certain movement processes.
Submitted 23 July, 2019;
originally announced July 2019.
-
Clicks and Cliques. Exploring the Soul of the Community
Authors:
Natalia da Silva,
Ignacio Alvarez-Castro
Abstract:
In this paper we analyze 26 communities across the United States with the aim of understanding what attaches people to their community and how this attachment differs among communities. How different are attached people from unattached people? What attaches people to their community? How different are the communities? What are the key drivers behind emotional attachment? To address these questions, graphical, supervised, and unsupervised learning tools were used, and information from the Census Bureau and the Knight Foundation was combined. Using the same pre-processed variables as Knight (2010) would most likely drive the results towards the same conclusions as the Knight Foundation, so this paper does not use those variables.
Submitted 9 October, 2017;
originally announced October 2017.