-
Earth System Data Cubes: Avenues for advancing Earth system research
Authors:
David Montero,
Guido Kraemer,
Anca Anghelea,
César Aybar,
Gunnar Brandt,
Gustau Camps-Valls,
Felix Cremer,
Ida Flik,
Fabian Gans,
Sarah Habershon,
Chaonan Ji,
Teja Kattenborn,
Laura Martínez-Ferrer,
Francesco Martinuzzi,
Martin Reinhardt,
Maximilian Söchting,
Khalil Teber,
Miguel D. Mahecha
Abstract:
Recent advancements in Earth system science have been marked by the exponential increase in the availability of diverse, multivariate datasets characterised by moderate to high spatio-temporal resolutions. Earth System Data Cubes (ESDCs) have emerged as one suitable solution for transforming this flood of data into a simple yet robust data structure. ESDCs achieve this by organising data into an analysis-ready format aligned with a spatio-temporal grid, facilitating user-friendly analysis and diminishing the need for extensive technical data processing knowledge. Despite these significant benefits, the completion of the entire ESDC life cycle remains a challenging task. Obstacles are not only of a technical nature but also relate to domain-specific problems in Earth system research. There exist barriers to realising the full potential of data collections in light of novel cloud-based technologies, particularly in curating data tailored for specific application domains. These include transforming data to conform to a spatio-temporal grid with minimum distortions and managing complexities such as spatio-temporal autocorrelation issues. Addressing these challenges is pivotal for the effective application of Artificial Intelligence (AI) approaches. Furthermore, adhering to open science principles for data dissemination, reproducibility, visualisation, and reuse is crucial for fostering sustainable research. Overcoming these challenges offers a substantial opportunity to advance data-driven Earth system research, unlocking the full potential of an integrated, multidimensional view of Earth system processes. This is particularly true when such research is coupled with innovative research paradigms and technological progress.
Submitted 5 August, 2024;
originally announced August 2024.
-
PapagAI: Automated Feedback for Reflective Essays
Authors:
Veronika Solopova,
Adrian Gruszczynski,
Eiad Rostom,
Fritz Cremer,
Sascha Witte,
Chengming Zhang,
Fernando Ramos López,
Lea Plößl,
Florian Hofmann,
Ralf Romeike,
Michaela Gläser-Zikuda,
Christoph Benzmüller,
Tim Landgraf
Abstract:
Written reflective practice is a regular exercise pre-service teachers perform during their higher education. Usually, their lecturers are expected to provide individual feedback, which can be a challenging task to perform on a regular basis. In this paper, we present the first open-source automated feedback tool based on didactic theory and implemented as a hybrid AI system. We describe the components and discuss the advantages and disadvantages of our system compared to the state-of-the-art generative large language models. The main objective of our work is to enable better learning outcomes for students and to complement the teaching activities of lecturers.
Submitted 10 July, 2023;
originally announced July 2023.
-
Rewarding Chatbots for Real-World Engagement with Millions of Users
Authors:
Robert Irvine,
Douglas Boubert,
Vyas Raina,
Adian Liusie,
Ziyi Zhu,
Vineet Mudupalli,
Aliaksei Korshuk,
Zongyi Liu,
Fritz Cremer,
Valentin Assassi,
Christie-Carol Beauchamp,
Xiaoding Lu,
Thomas Rialan,
William Beauchamp
Abstract:
The emergence of pretrained large language models has led to the deployment of a range of social chatbots for chitchat. Although these chatbots demonstrate language ability and fluency, they are not guaranteed to be engaging and can struggle to retain users. This work investigates the development of social chatbots that prioritize user engagement to enhance retention, specifically examining the use of human feedback to efficiently develop highly engaging chatbots. The proposed approach uses automatic pseudo-labels collected from user interactions to train a reward model that can be used to reject low-scoring sample responses generated by the chatbot model at inference time. Intuitive evaluation metrics, such as mean conversation length (MCL), are introduced as proxies to measure the level of engagement of deployed chatbots. A/B testing on groups of 10,000 new daily chatbot users on the Chai Research platform shows that this approach increases the MCL by up to 70%, which translates to a more than 30% increase in user retention for a GPT-J 6B model. Future work aims to use the reward model to realise a data fly-wheel, where the latest user conversations can be used to alternately fine-tune the language model and the reward model.
Submitted 30 March, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
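The reward-model filtering and the MCL metric described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: `best_of_n`, `mean_conversation_length`, and `toy_reward` are hypothetical names, the reward function is a hand-written stand-in for a trained reward model, and MCL is assumed here to be the average number of messages per conversation, which may differ from the paper's exact definition.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], reward: Callable[[str], float]) -> str:
    """Keep the highest-scoring candidate response and reject the rest."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    return max(candidates, key=reward)


def mean_conversation_length(conversations: Sequence[Sequence[str]]) -> float:
    """Average number of messages per conversation (assumed MCL definition)."""
    if not conversations:
        return 0.0
    return sum(len(c) for c in conversations) / len(conversations)


def toy_reward(response: str) -> float:
    # Hypothetical stand-in for a trained reward model: it favours
    # responses that ask a question and, mildly, longer responses.
    return (1.0 if "?" in response else 0.0) + 0.01 * len(response)


if __name__ == "__main__":
    samples = ["Ok.", "Nice! What did you do today?"]
    print(best_of_n(samples, toy_reward))
    print(mean_conversation_length([["hi", "hello"], ["a", "b", "c", "d"]]))
```

In a deployed system the candidates would be several responses sampled from the chatbot model for the same context, and the reward function would be the model trained on pseudo-labels from user interactions.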