-
Digital Pathway Curation (DPC): a pipeline able to assess the reproducibility, consensus and accuracy in biomedical search retrieval by comparing Gemini, PubMed, and Scientific Reviewers
Authors:
Flavio Lichtenstein,
Daniel Alexandre de Souza,
Carlos Eduardo Madureira Trufen,
Victor Wendel da Silva Gonçalves,
Juliana de Paula Bernardes,
Vinicius Miranda Baroni,
Carlos DeOcesano-Pereira,
Leonardo Fontoura Ormundo,
Fabio Augusto Labre de Souza,
Olga Celia Martinez Ibañez,
Nancy Starobinas,
Luciano Rodrigo Lopes,
Aparecida Maria Fontes,
Sonia Aparecida de Andrade,
Ana Marisa Chudzinski-Tavassi
Abstract:
A scientific study begins with a central question, and search engines like PubMed are the first tools for retrieving knowledge and understanding the current state of the art. Large Language Models (LLMs) have been used in research, promising acceleration and deeper results. However, they call for caution and demand rigorous validation. Assessing complex biological relationships remains challenging for SQL-based tools and LLMs. Here, we introduce the Digital Pathway Curation (DPC) pipeline to evaluate the reproducibility and accuracy of the Gemini models against PubMed search and human expert curation. Using two omics experiments, we created a large dataset (Ensemble) of pathway-disease associations. With the Ensemble dataset, we demonstrate that Gemini achieves high run-to-run reproducibility of approximately 99% and inter-model reproducibility of around 75%. Next, we calculate the crowdsourced consensus (CSC) using a smaller dataset. The CSC allows us to calculate accuracies, and the Gemini multi-model consensus reached a significant accuracy of about 87%. Our findings demonstrate that LLMs are reproducible, reliable, and valuable tools for navigating complex biomedical knowledge.
Submitted 7 May, 2025; v1 submitted 2 May, 2025;
originally announced May 2025.
-
One predator and two prey: Coexistence of pumas, guanacos and sheep in Patagonia
Authors:
Jhordan Silveira de Borba,
Sebastian Gonçalves
Abstract:
The ecosystem considered in this study is the outcome of a lengthy sequence of historical and ecological events. Patagonia's indigenous fauna comprises survivors of five significant extinction events, with the notable presence of the puma and the guanaco, two of the largest native mammals. In addition to these, European immigrants introduced sheep into the ecosystem. Together, these three species form a straightforward trophic network, featuring one predator and two prey species, all competing within the Patagonian steppe. For ranchers, guanacos and pumas are frequently perceived as threats to their economic interests. In recent decades, the field of biology, particularly ecology, has witnessed a substantial increase in the development of equation-based models. Scientists are interested in the ability to systematize hypotheses and gain insights into the behavior of complex biological systems, such as the one presented in this study. However, the nonlinear nature of the models and their large number of parameters represent a challenge when one wants to explore the parameter space. To overcome this and, at the same time, improve the understanding of the Patagonia ecosystem, we start by building an equation-based model based on previous contributions, and we reduce it to the essential minimum set of parameters. Then, we introduce two tools, a generalization of ternary graphs and a perceptron-based machine learning tool, to help understand the response of the system equation to the key parameters. The perceptron tool allows us to visualize and interpret the influence of each parameter on the survival or extinction of each species. Through the generalization of the ternary graph, it was possible to conveniently visualize, in a single graphical representation, how the system responds to different combinations and variations of the five parameters of the reduced system equation.
Submitted 3 December, 2024;
originally announced December 2024.
-
Modelling control strategies against Classical Swine Fever: influence of traders and markets using static and temporal networks in Ecuador
Authors:
Alfredo Acosta,
Nicolas Cespedes Cardenas,
Cristian Imbacuan,
Hartmut H. K. Lentz,
Klaas Dietze,
Marcos Amaku,
Alexandra Burbano,
Vitor S. P. Gonçalves,
Fernando Ferreira
Abstract:
Classical swine fever (CSF) has been prevalent in Ecuador since 1940, and pig farming represents an important economic and cultural sector there. Recently, the National Veterinary Service (NVS) has implemented individual identification of pigs, movement control and mandatory vaccination against CSF, aiming at future eradication. Our aim was to characterise the pig premises according to risk criteria, analyse the effect of random and targeted strategies to control CSF and consider the temporal development of the network. We used social network analysis (SNA), SIRS (susceptible, infected, recovered, susceptible) network modelling and temporal network analysis. The data set contained 751,003 shipments and 6 million pigs from 2017 to 2019. 165,593 premises were involved: 144,118 farms, 138 industrials, 21,337 traders, and 51 markets. On annual average, 124,976 premises (75%) received or sent one movement with 1.5 pigs; in contrast, 166 (0.01%) accounted for 1,372 movements and 11,607 pigs. Simulations resulted in a mean CSF prevalence of 29.93%; a targeted selection strategy reduced the prevalence to 3.3%, compared with 24% under random selection. Selection of high-risk premises in every province was the best strategy using available surveillance infrastructure. Notably, selecting 10 traders/markets reduced the CSF prevalence to 4%, evidencing their prime influence over the network. Temporal analysis showed an overestimation of 38% (causal fidelity) in the number of transmission paths; crossing the network took 4.3 steps (average path length) but approximately 233 days. In conclusion, surveillance strategies applied by the NVS could be more efficient in finding cases, reducing the spread of diseases and enabling the implementation of risk-based surveillance. To focus efforts on the targeted selection of high-risk premises, special attention should be given to markets and traders, which showed similar disease spread potential.
Submitted 17 September, 2021;
originally announced September 2021.
-
Urban Scaling of COVID-19 epidemics
Authors:
Ben-Hur Francisco Cardoso,
Sebastián Gonçalves
Abstract:
Susceptible-Infective-Recovered (SIR) mathematical models are in high demand due to the COVID-19 pandemic. They are used in their standard formulation, or through their many variants, to fit and hopefully predict the number of new cases for the next days or weeks, in any place, city, or country. This is key knowledge for the authorities to prepare for the demand on health systems or to apply restrictions to slow down the infectives curve. Even when the model can be easily solved (by the use of specialized software or by programming the numerical solution of the differential equations that represent the model), prediction is not an easy task, because the behavioral change of people is reflected in a continuous change of the parameters. A relevant question is what we can transfer from one city to another: whether what happened in Madrid could have been applied to New York and then whether what we have learned from that city would be of use for São Paulo. With this idea in mind, we present an analysis of a spreading-rate related measure of COVID-19 as a function of population density and population size for all US counties, as well as for Brazilian cities and German cities. Contrary to the common hypothesis in epidemic modeling, we observe a higher per-capita contact rate for cities with higher population density and population size. We also find that population size has more explanatory power than population density. A contact rate scaling theory is proposed to explain the results.
Submitted 15 May, 2020;
originally announced May 2020.
-
Trend analysis of the COVID-19 pandemic in China and the rest of the world
Authors:
Albertine Weber,
Flavio Ianelli,
Sebastian Goncalves
Abstract:
The recent epidemic of Coronavirus (COVID-19) that started in China has already been "exported" to more than 140 countries on all continents, evolving in most of them by local spreading. In this contribution we analyze the trends of the cases reported in all the Chinese provinces, as well as in some countries that, as of March 15th, 2020, had more than 500 cases reported. Notably, and unlike other epidemics, the provinces did not show an exponential phase. The data available at the Johns Hopkins University site seem to fit well an algebraic, sub-exponential growth behavior, as was pointed out recently. All the provinces show a clear and consistent pattern of slowing down, with the growth exponent going to nearly zero, so it can be said that the epidemic was contained in China. On the other hand, the more recent spread in countries like Italy, Iran, and Spain, as well as in other European countries, shows clear exponential growth. Even more recently, the US, which was one of the first countries to have an infected individual outside China (Jan 21st, 2020), seems to follow the same path. We calculate the exponential growth of the most affected countries, showing the evolution over time after the first local case. We identify clearly different patterns in the analyzed data and we give interpretations and possible explanations for them. The analysis and conclusions of our study can help countries that, after importing some cases, are not yet in the local spreading phase, or have just started.
Submitted 19 March, 2020;
originally announced March 2020.
-
An Algebraic Solution for the Kermack-McKendrick Model
Authors:
Alexsandro M. Carvalho,
Sebastian Gonçalves
Abstract:
We present an algebraic solution for the Susceptible-Infective-Removed (SIR) model originally presented by Kermack and McKendrick in 1927. Starting from the differential equation for the removed subjects presented in their original paper, we rewrite it in a slightly different form in order to formally derive the solution, up to one integration. Then, using algebraic techniques and some well-justified numerical assumptions, we obtain an analytic solution for the integral. Finally, we compare the numerical solution of the differential equations of the SIR model with the analytical solution proposed here, showing excellent agreement.
Submitted 16 December, 2016; v1 submitted 29 September, 2016;
originally announced September 2016.
-
Epidemic oscillations: Interaction between delays and seasonality
Authors:
Guillermo Abramson,
Sebastian Gonçalves,
Marcelo F. C. Gomes
Abstract:
Traditional epidemic models consider that individual processes occur at constant rates. That is, an infected individual has a constant probability per unit time of recovering from infection after contagion. This assumption certainly fails for almost all infectious diseases, in which the infection time usually follows a probability distribution more or less spread around a mean value. We show a general treatment for an SIRS model in which both the infected and the immune phases admit such a description. The general behavior of the system shows transitions between endemic and oscillating situations that could be relevant in many real scenarios. The interaction with the other main source of oscillations, seasonality, is also discussed.
Submitted 15 March, 2013;
originally announced March 2013.
-
Oscillations in SIRS model with distributed delays
Authors:
S. Goncalves,
G. Abramson,
M. F. C. Gomes
Abstract:
The ubiquity of oscillations in epidemics presents a long-standing challenge for the formulation of epidemic models. Whether they are external and seasonally driven, or arise from the intrinsic dynamics, is an open problem. It is known that fixed time delays destabilize the steady state solution of the standard SIRS model, giving rise to stable oscillations for certain parameter values. In this contribution, starting from the classical SIRS model, we make a general treatment of the recovery and loss of immunity terms. We present oscillation diagrams (amplitude and period) in terms of the parameters of the model, showing how oscillations can be destabilized by the shape of the distributions of the two characteristic (infectious and immune) times. The formulation is made in terms of delay equations, which are both numerically integrated and linearized. Simulation results are included, showing where they support the linear analysis and explaining the discrepancies where they do not. Considerations on, and comparisons with, real diseases are also presented.
Submitted 30 September, 2010; v1 submitted 7 December, 2009;
originally announced December 2009.
-
Promiscuity and the Evolution of Sexual Transmitted Diseases
Authors:
Sebastian Goncalves,
Marcelo Kuperman,
Marcelo Ferreira da Costa Gomes
Abstract:
We study the relation between different social behaviors and the onset of epidemics in a model for the dynamics of sexually transmitted diseases. The model considers the society as a system of individual sexuated agents that can be organized in couples and interact with each other. The different social behaviors are incorporated by assigning what we call a promiscuity value to each individual agent. The individual promiscuity is drawn from a distribution and represents the daily probability of going out to look for a sexual partner, abandoning its eventual mate. In terms of this parameter we find a threshold for the epidemic which is much lower than the classical fully mixed model prediction, i.e. $R_0 = 1$, where $R_0$ is the basic reproductive number. Different forms for the distribution of the population promiscuity are considered, showing that the threshold is weakly sensitive to them. We study the homosexual and the heterosexual case as well.
Submitted 27 February, 2003;
originally announced February 2003.
-
The Social Behavior and the Evolution of Sexually Transmitted Diseases
Authors:
Sebastian Goncalves,
Marcelo Kuperman
Abstract:
We introduce a model for the evolution of sexually transmitted diseases, in which the social behavior is incorporated as a determinant factor for the further propagation of the infection. The system may be regarded as a society of agents where in principle anyone can sexually interact with any other one in the population. Different social behaviors are reflected in a distribution of sexual attitudes ranging from the more conservative to the more promiscuous. This is measured by what we call the promiscuity parameter. In terms of this parameter, we find a critical behavior for the evolution of the disease. There is a threshold below which the epidemic does not occur. We relate this critical value of the promiscuity to what epidemiologists call the basic reproductive number, connecting it in a quantitative way with the other parameters of the model, namely the infectivity and the infective period. We consider the possibility of subjects being grouped in couples. In this contribution only the homosexual case is analyzed.
Submitted 3 December, 2002;
originally announced December 2002.