-
Learning-based estimation of cattle weight gain and its influencing factors
Authors:
Muhammad Riaz Hasib Hossain,
Rafiqul Islam,
Shawn R. McGrath,
Md Zahidul Islam,
David Lamb
Abstract:
Many cattle farmers still depend on manual methods to measure the live weight gain of cattle at set intervals, which is time consuming, labour intensive, and stressful for both the animals and handlers. A remote and autonomous monitoring system using machine learning (ML) or deep learning (DL) can provide a more efficient and less invasive method and also predictive capabilities for future cattle weight gain (CWG). This system allows continuous monitoring and estimation of individual cattle live weight gain, growth rates and weight fluctuations considering various factors like environmental conditions, genetic predispositions, feed availability, movement patterns and behaviour. Several researchers have explored the efficiency of estimating CWG using ML and DL algorithms. However, estimating CWG suffers from a lack of consistency in its application. Moreover, ML or DL can provide weight gain estimations based on several features that vary in existing research. Additionally, previous studies have encountered various data related challenges when estimating CWG. This paper presents a comprehensive investigation in estimating CWG using advanced ML techniques based on research articles (between 2004 and 2024). This study investigates the current tools, methods, and features used in CWG estimation, as well as their strengths and weaknesses. The findings highlight the significance of using advanced ML approaches in CWG estimation and its critical influence on factors. Furthermore, this study identifies potential research gaps and provides research direction on CWG prediction, which serves as a reference for future research in this area.
Submitted 9 February, 2025;
originally announced February 2025.
-
Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds
Authors:
Justus C. Will,
Andrea M. Jenney,
Kara D. Lamb,
Michael S. Pritchard,
Colleen Kaul,
Po-Lun Ma,
Kyle Pressel,
Jacob Shpund,
Marcus van Lier-Walqui,
Stephan Mandt
Abstract:
Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times.
Submitted 31 October, 2023;
originally announced October 2023.
-
Efficient labeling of solar flux evolution videos by a deep learning model
Authors:
Subhamoy Chatterjee,
Andrés Muñoz-Jaramillo,
Derek A. Lamb
Abstract:
Machine learning (ML) is becoming a critical tool for interrogation of large complex data. Labeling, defined as the process of adding meaningful annotations, is a crucial step of supervised ML. However, labeling datasets is time consuming. Here we show that convolutional neural networks (CNNs), trained on crudely labeled astronomical videos, can be leveraged to improve the quality of data labeling and reduce the need for human intervention. We use videos of the solar magnetic field, crudely labeled into two classes: emergence or non-emergence of bipolar magnetic regions (BMRs), based on their first detection on the solar disk. We train CNNs using crude labels, manually verify and correct the cases where the labels and the CNN disagree, and repeat this process until convergence. Traditionally, flux emergence labeling is done manually. We find that a high-quality labeled dataset, derived through this iterative process, reduces the necessary manual verification by 50%. Furthermore, by gradually masking the videos and looking for maximum change in CNN inference, we locate BMR emergence time without retraining the CNN. This demonstrates the versatility of CNNs for simplifying the challenging task of labeling complex dynamic events.
Submitted 28 August, 2023;
originally announced August 2023.
-
Ontology for Healthcare Artificial Intelligence Privacy in Brazil
Authors:
Tiago Andres Vaz,
José Miguel Silva Dora,
Luís da Cunha Lamb,
Suzi Alves Camey
Abstract:
This article details the creation of a novel domain ontology at the intersection of epidemiology, medicine, statistics, and computer science. Using the terminology defined by current legislation, the article outlines a systematic approach to handling hospital data anonymously in preparation for its use in Artificial Intelligence (AI) applications in healthcare. The development process consisted of 7 pragmatic steps, including defining scope, selecting knowledge, reviewing important terms, constructing classes that describe designs used in epidemiological studies, machine learning paradigms, types of data and attributes, risks that anonymized data may be exposed to, privacy attacks, techniques to mitigate re-identification, privacy models, and metrics for measuring the effects of anonymization. The article concludes by demonstrating the practical implementation of this ontology in hospital settings for the development and validation of AI.
Submitted 6 June, 2024; v1 submitted 16 April, 2023;
originally announced April 2023.
-
Pyrocast: a Machine Learning Pipeline to Forecast Pyrocumulonimbus (PyroCb) Clouds
Authors:
Kenza Tazi,
Emiliano Díaz Salas-Porras,
Ashwin Braude,
Daniel Okoh,
Kara D. Lamb,
Duncan Watson-Parris,
Paula Harder,
Nis Meinert
Abstract:
Pyrocumulonimbus (pyroCb) clouds are storm clouds generated by extreme wildfires. PyroCbs are associated with unpredictable, and therefore dangerous, wildfire spread. They can also inject smoke particles and trace gases into the upper troposphere and lower stratosphere, affecting the Earth's climate. As global temperatures increase, these previously rare events are becoming more common. Being able to predict which fires are likely to generate pyroCb is therefore key to climate adaptation in wildfire-prone areas. This paper introduces Pyrocast, a pipeline for pyroCb analysis and forecasting. The pipeline's first two components, a pyroCb database and a pyroCb forecast model, are presented. The database brings together geostationary imagery and environmental data for over 148 pyroCb events across North America, Australia, and Russia between 2018 and 2022. Random Forests, Convolutional Neural Networks (CNNs), and CNNs pretrained with Auto-Encoders were tested to predict the generation of pyroCb for a given fire six hours in advance. The best model predicted pyroCb with an AUC of $0.90 \pm 0.04$.
Submitted 22 November, 2022;
originally announced November 2022.
-
Enhancing Accuracy and Robustness of Steering Angle Prediction with Attention Mechanism
Authors:
Swetha Nadella,
Pramiti Barua,
Jeremy C. Hagler,
David J. Lamb,
Qing Tian
Abstract:
In this paper, our focus is on enhancing steering angle prediction for autonomous driving tasks. We initiate our exploration by investigating two veins of widely adopted deep neural architectures, namely ResNets and InceptionNets. Within both families, we systematically evaluate various model sizes to understand their impact on performance. Notably, our key contribution lies in the incorporation of an attention mechanism to augment steering angle prediction accuracy and robustness. By introducing attention, our models gain the ability to selectively focus on crucial regions within the input data, leading to improved predictive outcomes. Our findings showcase that our attention-enhanced models not only achieve state-of-the-art results in terms of steering angle Mean Squared Error (MSE) but also exhibit enhanced adversarial robustness, addressing critical concerns in real-world deployment. For example, in our experiments on the Kaggle SAP and our created publicly available datasets, attention can lead to over 6% error reduction in steering angle prediction and boost model robustness by up to 56.09%.
Submitted 1 February, 2024; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Identifying the Causes of Pyrocumulonimbus (PyroCb)
Authors:
Emiliano Díaz Salas-Porras,
Kenza Tazi,
Ashwin Braude,
Daniel Okoh,
Kara D. Lamb,
Duncan Watson-Parris,
Paula Harder,
Nis Meinert
Abstract:
A first causal discovery analysis from observational data of pyroCb (storm clouds generated from extreme wildfires) is presented. Invariant Causal Prediction was used to develop tools to understand the causal drivers of pyroCb formation. This includes a conditional independence test for testing $Y$ conditionally independent of $E$ given $X$ for binary variable $Y$ and multivariate, continuous variables $X$ and $E$, and a greedy-ICP search algorithm that relies on fewer conditional independence tests to obtain a smaller more manageable set of causal predictors. With these tools, we identified a subset of seven causal predictors which are plausible when contrasted with domain knowledge: surface sensible heat flux, relative humidity at $850$ hPa, a component of wind at $250$ hPa, $13.3$ micro-meters, thermal emissions, convective available potential energy, and altitude.
Submitted 18 November, 2022; v1 submitted 16 November, 2022;
originally announced November 2022.
-
Software Abstractions and Methodologies for HPC Simulation Codes on Future Architectures
Authors:
A. Dubey,
S. Brandt,
R. Brower,
M. Giles,
P. Hovland,
D. Q. Lamb,
F. Loffler,
B. Norris,
B. OShea,
C. Rebbi,
M. Snir,
R. Thakur
Abstract:
Large, complex, multi-scale, multi-physics simulation codes, running on high performance computing (HPC) platforms, have become essential to advancing science and engineering. These codes simulate multi-scale, multi-physics phenomena with unprecedented fidelity on petascale platforms, and are used by large communities. Continued ability of these codes to run on future platforms is as crucial to their communities as continued improvements in instruments and facilities are to experimental scientists. However, the ability of code developers to do these things faces a serious challenge with the paradigm shift underway in platform architecture. The complexity and uncertainty of the future platforms make it essential to approach this challenge cooperatively as a community. We need to develop common abstractions, frameworks, programming models and software development methodologies that can be applied across a broad range of complex simulation codes, and common software infrastructure to support them. In this position paper we express and discuss our belief that such an infrastructure is critical to the deployment of existing and new large, multi-scale, multi-physics codes on future HPC platforms.
Submitted 6 September, 2013;
originally announced September 2013.