-
Sustainable Greenhouse Microclimate Modeling: A Comparative Analysis of Recurrent and Graph Neural Networks
Authors:
Emiliano Seri,
Marcello Petitta,
Cristina Cornaro
Abstract:
The integration of photovoltaic (PV) systems into greenhouses not only optimizes land use but also enhances sustainable agricultural practices by enabling dual benefits of food production and renewable energy generation. However, accurate prediction of internal environmental conditions is crucial to ensure optimal crop growth while maximizing energy production. This study introduces a novel application of Spatio-Temporal Graph Neural Networks (STGNNs) to greenhouse microclimate modeling, comparing their performance with traditional Recurrent Neural Networks (RNNs). While RNNs excel at temporal pattern recognition, they cannot explicitly model the directional relationships between environmental variables. Our STGNN approach addresses this limitation by representing these relationships as directed graphs, enabling the model to capture both environmental dependencies and their directionality. Using high-frequency data collected at 15-minute intervals from a greenhouse in Volos, Greece, we demonstrate that RNNs achieve exceptional accuracy in winter conditions ($R^2 = 0.985$) but show limitations during summer cooling system operation. Though STGNNs currently show lower performance (winter $R^2 = 0.947$), their architecture offers greater potential for integrating additional variables such as PV generation and crop growth indicators.
Submitted 18 March, 2025; v1 submitted 24 February, 2025;
originally announced February 2025.
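The idea of encoding directional relationships between environmental variables as a directed graph can be illustrated with a toy example. The sketch below is not the STGNN from the paper: the variable names, the edge list, and the mean-aggregation step are all assumptions chosen for illustration. It only shows how a directed adjacency matrix lets information flow from a driver variable to a response variable and not the other way round.

```python
# Hypothetical greenhouse variables and directed edges (sender -> receiver).
# These are illustrative assumptions, not the graph used in the paper.
nodes = ["solar_radiation", "outside_temp", "inside_temp", "inside_humidity"]
edges = [
    ("solar_radiation", "inside_temp"),
    ("outside_temp", "inside_temp"),
    ("inside_temp", "inside_humidity"),
]


def build_adjacency(nodes, edges):
    """Return a matrix where adj[receiver][sender] = 1.0 for each directed edge."""
    index = {name: i for i, name in enumerate(nodes)}
    adj = [[0.0] * len(nodes) for _ in nodes]
    for sender, receiver in edges:
        adj[index[receiver]][index[sender]] = 1.0
    return adj


def aggregate(adj, features):
    """One aggregation step: each node takes the mean of its incoming neighbours.

    Nodes with no incoming edges keep their own value.
    """
    out = []
    for i, row in enumerate(adj):
        incoming = [features[j] for j, weight in enumerate(row) if weight > 0]
        out.append(sum(incoming) / len(incoming) if incoming else features[i])
    return out


adj = build_adjacency(nodes, edges)
# Standardized (unitless) readings at one time step, made up for the example.
features = [1.0, -1.0, 0.5, 0.2]
updated = aggregate(adj, features)
```

Because the matrix is not symmetric, `inside_temp` receives information from `solar_radiation` while `solar_radiation` is unaffected by `inside_temp`, which is the kind of directionality an undirected or purely sequential model does not express explicitly.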
-
Text Data Analysis of Maternal Narratives: Albanian Women in Italy
Authors:
Eleonora Miaci,
Emiliano Seri
Abstract:
Despite growing interest in migration studies, research on motherhood among migrant women in Italy remains limited. This study contributes to the literature by examining the family trajectories of Albanian women in Italy, exploring how their migration patterns and experiences have shaped these life aspects. We conducted a comprehensive textual analysis to find the main topics of 30 semi-structured interviews with Albanian mothers living in Milan, Rome, and Bari. After pre-processing the text, we performed an exploratory analysis to identify key features and explore word relationships. The predominant dimensions that emerged relate to family management, work paths and schedules, and strategies and concerns arising from the trade-off between work and childcare. Subsequently, we stratified the sample by entry channel into Italy (study and work, reunification, and irregular channel) and applied Latent Dirichlet Allocation to model each sub-corpus as a mixture of topics. Our results resonate with existing literature [1] on the key role of female migratory patterns in shaping post-migration fertility. Interviewees who entered Italy through various migratory channels not only differ in their characteristics and migration experiences but also exhibit dissimilar fertility desires and behaviors, motherhood trajectories, and conceptions of their role as mothers and family ideals. These differences influence their priorities and level of commitment to family and work obligations.
Submitted 9 January, 2025;
originally announced January 2025.
-
Spherical Double K-Means: a co-clustering approach for text data analysis
Authors:
Ilaria Bombelli,
Domenica Fioredistella Iezzi,
Emiliano Seri,
Maurizio Vichi
Abstract:
In text analysis, Spherical K-means (SKM) is a specialized k-means clustering algorithm widely utilized for grouping documents represented in high-dimensional, sparse term-document matrices, often normalized using techniques like TF-IDF. Researchers frequently seek to cluster not only documents but also the terms associated with them into coherent groups. To address this dual clustering requirement, we introduce Spherical Double K-Means (SDKM), a novel methodology that simultaneously clusters documents and terms. This approach offers several advantages: first, by integrating the clustering of documents and terms, SDKM provides deeper insights into the relationships between content and vocabulary, enabling more effective topic identification and keyword extraction. Additionally, the two-level clustering assists in understanding both overarching themes and specific terminologies within document clusters, enhancing interpretability. SDKM effectively handles the high dimensionality and sparsity inherent in text data by utilizing cosine similarity, leading to improved computational efficiency. Moreover, the method captures dynamic changes in thematic content over time, making it well-suited for applications in rapidly evolving fields. Ultimately, SDKM presents a comprehensive framework for advancing text mining efforts, facilitating the uncovering of nuanced patterns and structures that are critical for robust data analysis. We apply SDKM to the corpus of US presidential inaugural addresses, spanning from George Washington in 1789 to Joe Biden in 2021. Our analysis reveals distinct clusters of words and documents that correspond to significant historical themes and periods, showcasing the method's ability to facilitate a deeper understanding of the data. Our findings demonstrate the efficacy of SDKM in uncovering underlying patterns in textual data.
Submitted 22 February, 2025; v1 submitted 8 January, 2025;
originally announced January 2025.
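For readers unfamiliar with the baseline, the following is a minimal sketch of standard Spherical K-means, the one-sided method that SDKM extends. It is not the SDKM algorithm itself, which also clusters terms. The toy document-term weights and the choice of initial centroids are assumptions made for the example.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(rows, init_centroids, n_iter=20):
    """Cluster rows by cosine similarity to unit-length centroids."""
    docs = [normalize(r) for r in rows]
    centroids = [normalize(c) for c in init_centroids]
    labels = None
    for _ in range(n_iter):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(len(centroids)), key=lambda k: dot(d, centroids[k]))
            for d in docs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not change
        labels = new_labels
        # Update step: sum the members of each cluster and project back to the sphere.
        for k in range(len(centroids)):
            members = [d for d, label in zip(docs, labels) if label == k]
            if members:
                centroids[k] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids


# Toy document-term weights (rows = documents, columns = terms), made up for the example.
doc_term = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 2.0, 1.0],
]
labels, centroids = spherical_kmeans(doc_term, [doc_term[0], doc_term[2]])
```

Running the same routine on the transposed matrix would cluster terms instead of documents, but that would be two separate passes; the simultaneous clustering described in the abstract is a different optimization problem.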
-
Partial membership models for soft clustering of multivariate football player performance data
Authors:
Emiliano Seri,
Roberto Rocci,
Thomas Brendan Murphy
Abstract:
The standard mixture modeling framework has been widely used to study heterogeneous populations by modeling them as being composed of a finite number of homogeneous sub-populations. However, the standard mixture model assumes that each data point belongs to one and only one mixture component, or cluster; when data points have fractional membership in multiple clusters, this assumption is unrealistic. It is in fact conceptually very different to represent an observation as partly belonging to multiple groups instead of belonging to one group with uncertainty. For this purpose, various soft clustering approaches, or individual-level mixture models, have been developed. In this context, Heller et al. (2008) formulated the Bayesian partial membership model (PM) as an alternative structure for individual-level mixtures, which also captures partial membership in the form of attribute-specific mixtures. Our work proposes using the PM for soft clustering of count data arising in football performance analysis and compares the results with those achieved with the mixed membership model and finite mixture model. Learning and inference are carried out using Markov chain Monte Carlo methods. The method is applied to Serie A football player data from the 2022/2023 football season, to estimate the positions on the field where the players tend to play, in addition to their primary position, based on their playing style. The application of the partial membership model to football data could have practical implications for coaches, talent scouts, team managers and analysts. These stakeholders can utilize the findings to make informed decisions related to team strategy, talent acquisition, and statistical research, ultimately enhancing performance and understanding in the field of football.
Submitted 14 February, 2025; v1 submitted 3 September, 2024;
originally announced September 2024.
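The difference between partial membership and uncertain membership can be made concrete with a small numeric sketch. In the partial membership construction of Heller et al. (2008), an observation's density is proportional to the product of the component densities raised to its membership weights, so for exponential-family components the natural parameters are combined convexly. The example below assumes Poisson components, which is a guess based on the phrase "count data" and is not stated in the abstract; the rates and weights are hypothetical.

```python
import math


def mixture_mean(weights, rates):
    """Expected count under a finite mixture: the observation comes from exactly
    one Poisson component, chosen with the given probabilities."""
    return sum(w * r for w, r in zip(weights, rates))


def partial_membership_rate(memberships, rates):
    """Poisson rate under partial membership: the natural parameters (log rates)
    are averaged with the membership weights, giving a weighted geometric mean."""
    return math.exp(sum(m * math.log(r) for m, r in zip(memberships, rates)))


# Hypothetical per-match rates of some count statistic for two playing roles.
rates = [1.0, 4.0]
# A player who is "half" each role.
weights = [0.5, 0.5]

mix = mixture_mean(weights, rates)
pm = partial_membership_rate(weights, rates)
```

Under the mixture reading, the player is either a rate-1 or a rate-4 player and we are unsure which, so counts are drawn from a two-component mixture with mean 2.5. Under the partial membership reading, the player is a single Poisson source whose rate lies between the two roles (here the geometric mean, approximately 2.0).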