-
From an attention economy to an ecology of attending. A manifesto
Authors:
Gunter Bombaerts,
Tom Hannes,
Martin Adam,
Alessandra Aloisi,
Joel Anderson,
Lawrence Berger,
Stefano Davide Bettera,
Enrico Campo,
Laura Candiotto,
Silvia Caprioglio Panizza,
Yves Citton,
Diego DâAngelo,
Matthew Dennis,
Nathalie Depraz,
Peter Doran,
Wolfgang Drechsler,
Bill Duane,
William Edelglass,
Iris Eisenberger,
Beverley Foulks McGuire,
Antony Fredriksson,
Karamjit S. Gill,
Peter D. Hershock,
Soraj Hongladarom,
Beth Jacobs
, et al. (30 additional authors not shown)
Abstract:
As the signatories of this manifesto, we denounce the attention economy as inhumane and a threat to our sociopolitical and ecological well-being. We endorse policymakers' efforts to address the negative consequences of the attention economy's technology, but add that these approaches are often limited in their criticism of the systemic context of human attention. Starting from Buddhist philosophy,…
▽ More
As the signatories of this manifesto, we denounce the attention economy as inhumane and a threat to our sociopolitical and ecological well-being. We endorse policymakers' efforts to address the negative consequences of the attention economy's technology, but add that these approaches are often limited in their criticism of the systemic context of human attention. Starting from Buddhist philosophy, we advocate a broader approach: an ecology of attending, that centers on conceptualizing, designing, and using attention (1) in an embedded way and (2) focused on the alleviating of suffering. With 'embedded' we mean that attention is not a neutral, isolated mechanism but a meaning-engendering part of an 'ecology' of bodily, sociotechnical and moral frameworks. With 'focused on the alleviation of suffering' we explicitly move away from the (often implicit) conception of attention as a tool for gratifying desires.
△ Less
Submitted 22 October, 2024;
originally announced October 2024.
-
Evaluation of medium-large Language Models at zero-shot closed book generative question answering
Authors:
René Peinl,
Johannes Wirth
Abstract:
Large language models (LLMs) have garnered significant attention, but the definition of "large" lacks clarity. This paper focuses on medium-sized language models (MLMs), defined as having at least six billion parameters but less than 100 billion. The study evaluates MLMs regarding zero-shot generative question answering, which requires models to provide elaborate answers without external document…
▽ More
Large language models (LLMs) have garnered significant attention, but the definition of "large" lacks clarity. This paper focuses on medium-sized language models (MLMs), defined as having at least six billion parameters but less than 100 billion. The study evaluates MLMs regarding zero-shot generative question answering, which requires models to provide elaborate answers without external document retrieval. The paper introduces an own test dataset and presents results from human evaluation. Results show that combining the best answers from different MLMs yielded an overall correct answer rate of 82.7% which is better than the 60.9% of ChatGPT. The best MLM achieved 71.8% and has 33B parameters, which highlights the importance of using appropriate training data for fine-tuning rather than solely relying on the number of parameters. More fine-grained feedback should be used to further improve the quality of answers. The open source community is quickly closing the gap to the best commercial models.
△ Less
Submitted 3 July, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
ASR Bundestag: A Large-Scale political debate dataset in German
Authors:
Johannes Wirth,
René Peinl
Abstract:
We present ASR Bundestag, a dataset for automatic speech recognition in German, consisting of 610 hours of aligned audio-transcript pairs for supervised training as well as 1,038 hours of unlabeled audio snippets for self-supervised learning, based on raw audio data and transcriptions from plenary sessions and committee meetings of the German parliament. In addition, we discuss utilized approaches…
▽ More
We present ASR Bundestag, a dataset for automatic speech recognition in German, consisting of 610 hours of aligned audio-transcript pairs for supervised training as well as 1,038 hours of unlabeled audio snippets for self-supervised learning, based on raw audio data and transcriptions from plenary sessions and committee meetings of the German parliament. In addition, we discuss utilized approaches for the automated creation of speech datasets and assess the quality of the resulting dataset based on evaluations and finetuning of a pre-trained state of the art model. We make the dataset publicly available, including all subsets.
△ Less
Submitted 12 February, 2023;
originally announced February 2023.
-
ASR in German: A Detailed Error Analysis
Authors:
Johannes Wirth,
Rene Peinl
Abstract:
The amount of freely available systems for automatic speech recognition (ASR) based on neural networks is growing steadily, with equally increasingly reliable predictions. However, the evaluation of trained models is typically exclusively based on statistical metrics such as WER or CER, which do not provide any insight into the nature or impact of the errors produced when predicting transcripts fr…
▽ More
The amount of freely available systems for automatic speech recognition (ASR) based on neural networks is growing steadily, with equally increasingly reliable predictions. However, the evaluation of trained models is typically exclusively based on statistical metrics such as WER or CER, which do not provide any insight into the nature or impact of the errors produced when predicting transcripts from speech input. This work presents a selection of ASR model architectures that are pretrained on the German language and evaluates them on a benchmark of diverse test datasets. It identifies cross-architectural prediction errors, classifies those into categories and traces the sources of errors per category back into training data as well as other sources. Finally, it discusses solutions in order to create qualitatively better training datasets and more robust ASR systems.
△ Less
Submitted 12 April, 2022;
originally announced April 2022.
-
HUI-Audio-Corpus-German: A high quality TTS dataset
Authors:
Pascal Puchtler,
Johannes Wirth,
René Peinl
Abstract:
The increasing availability of audio data on the internet lead to a multitude of datasets for development and training of text to speech applications, based on neural networks. Highly differing quality of voice, low sampling rates, lack of text normalization and disadvantageous alignment of audio samples to corresponding transcript sentences still limit the performance of deep neural networks trai…
▽ More
The increasing availability of audio data on the internet lead to a multitude of datasets for development and training of text to speech applications, based on neural networks. Highly differing quality of voice, low sampling rates, lack of text normalization and disadvantageous alignment of audio samples to corresponding transcript sentences still limit the performance of deep neural networks trained on this task. Additionally, data resources in languages like German are still very limited. We introduce the "HUI-Audio-Corpus-German", a large, open-source dataset for TTS engines, created with a processing pipeline, which produces high quality audio to transcription alignments and decreases manual effort needed for creation.
△ Less
Submitted 11 June, 2021;
originally announced June 2021.