-
PictSure: Pretraining Embeddings Matters for In-Context Learning Image Classifiers
Authors:
Lukas Schiesser,
Cornelius Wolff,
Sophie Haas,
Simon Pukrop
Abstract:
Building image classification models remains cumbersome in data-scarce domains, where collecting large labeled datasets is impractical. In-context learning (ICL) has emerged as a promising paradigm for few-shot image classification (FSIC), enabling models to generalize across domains without gradient-based adaptation. However, prior work has largely overlooked a critical component of ICL-based FSIC pipelines: the role of image embeddings. In this work, we present PictSure, an ICL framework that places the embedding model -- its architecture, pretraining, and training dynamics -- at the center of analysis. We systematically examine the effects of different visual encoder types, pretraining objectives, and fine-tuning strategies on downstream FSIC performance. Our experiments show that the training success and the out-of-domain performance are highly dependent on how the embedding models are pretrained. Consequently, PictSure manages to outperform existing ICL-based FSIC models on out-of-domain benchmarks that differ significantly from the training distribution, while maintaining comparable results on in-domain tasks. Code can be found at https://github.com/PictSure/pictsure-library.
Submitted 16 June, 2025;
originally announced June 2025.
-
How well do LLMs reason over tabular data, really?
Authors:
Cornelius Wolff,
Madelon Hulsebos
Abstract:
Large Language Models (LLMs) excel in natural language tasks, but less is known about their reasoning capabilities over tabular data. Prior analyses devise evaluation strategies that poorly reflect an LLM's realistic performance on tabular queries. Moreover, we have a limited understanding of the robustness of LLMs towards realistic variations in tabular inputs. Therefore, we ask "Can general-purpose LLMs reason over tabular data, really?" and focus on two questions: 1) are tabular reasoning capabilities of general-purpose LLMs robust to real-world characteristics of tabular inputs, and 2) how can we realistically evaluate an LLM's performance on analytical tabular queries? Building on a recent tabular reasoning benchmark, we first surface shortcomings of its multiple-choice prompt evaluation strategy, as well as commonly used free-form text metrics such as SacreBleu and BERT-score. We show that an LLM-as-a-judge procedure yields more reliable performance insights and unveil a significant deficit in tabular reasoning performance of LLMs. We then extend the tabular inputs reflecting three common characteristics in practice: 1) missing values, 2) duplicate entities, and 3) structural variations. Experiments show that the tabular reasoning capabilities of general-purpose LLMs suffer from these variations, stressing the importance of improving their robustness for realistic tabular inputs.
Submitted 2 June, 2025; v1 submitted 12 May, 2025;
originally announced May 2025.
-
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction
Authors:
Nils Constantin Hellwig,
Jakob Fehle,
Udo Kruschwitz,
Christian Wolff
Abstract:
Aspect sentiment quad prediction (ASQP) facilitates a detailed understanding of opinions expressed in a text by identifying the opinion term, aspect term, aspect category and sentiment polarity for each opinion. However, annotating a full set of training examples to fine-tune models for ASQP is a resource-intensive process. In this study, we explore the capabilities of large language models (LLMs) for zero- and few-shot learning on the ASQP task across five diverse datasets. We report F1 scores almost up to par with those obtained with state-of-the-art fine-tuned models and exceeding previously reported zero- and few-shot performance. In the 20-shot setting on the Rest16 restaurant domain dataset, LLMs achieved an F1 score of 51.54, compared to 60.39 by the best-performing fine-tuned method MVP. Additionally, we report the performance of LLMs in target aspect sentiment detection (TASD), where the F1 scores were close to fine-tuned models, achieving 68.93 on Rest16 in the 30-shot setting, compared to 72.76 with MVP. While human annotators remain essential for achieving optimal performance, LLMs can reduce the need for extensive manual annotation in ASQP tasks.
Submitted 28 May, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
Detecting Calls to Action in Multimodal Content: Analysis of the 2021 German Federal Election Campaign on Instagram
Authors:
Michael Achmann-Denkler,
Jakob Fehle,
Mario Haim,
Christian Wolff
Abstract:
This study investigates the automated classification of Calls to Action (CTAs) within the 2021 German Instagram election campaign to advance the understanding of mobilization in social media contexts. We analyzed over 2,208 Instagram stories and 712 posts using fine-tuned BERT models and OpenAI's GPT-4 models. The fine-tuned BERT model incorporating synthetic training data achieved a macro F1 score of 0.93, demonstrating a robust classification performance. Our analysis revealed that 49.58% of Instagram posts and 10.64% of stories contained CTAs, highlighting significant differences in mobilization strategies between these content types. Additionally, we found that FDP and the Greens had the highest prevalence of CTAs in posts, whereas CDU and CSU led in story CTAs.
Submitted 4 September, 2024;
originally announced September 2024.
-
Preserving the Ephemeral: Instagram Story Archiving with the Tidal Tales Plugin
Authors:
Michael Achmann-Denkler,
Christian Wolff
Abstract:
We introduce the Tidal Tales Plugin, a Firefox extension for efficiently collecting and archiving Instagram stories, addressing the challenges of ephemeral data in social media research. It enables the automated collection of story metadata and media files without risking account bans. It contributes to Web Science by facilitating expansive, long-term studies with enhanced data access and integrity.
Submitted 3 September, 2024;
originally announced September 2024.
-
Bidirectional Emergent Language in Situated Environments
Authors:
Cornelius Wolff,
Julius Mayer,
Elia Bruni,
Xenia Ohmer
Abstract:
Emergent language research has made significant progress in recent years, but still largely fails to explore how communication emerges in more complex and situated multi-agent systems. Existing setups often employ a reference game, which limits the range of language emergence phenomena that can be studied, as the game consists of a single, purely language-based interaction between the agents. In this paper, we address these limitations and explore the emergence and utility of token-based communication in open-ended multi-agent environments, where situated agents interact with the environment through movement and communication over multiple time-steps. Specifically, we introduce two novel cooperative environments: Multi-Agent Pong and Collectors. These environments are interesting because optimal performance requires the emergence of a communication protocol, but moderate success can be achieved without one. By employing various methods from explainable AI research, such as saliency maps, perturbation, and diagnostic classifiers, we are able to track and interpret the agents' language channel use over time. We find that the emerging communication is sparse, with the agents only generating meaningful messages and acting upon incoming messages in states where they cannot succeed without coordination.
Submitted 17 October, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
GERestaurant: A German Dataset of Annotated Restaurant Reviews for Aspect-Based Sentiment Analysis
Authors:
Nils Constantin Hellwig,
Jakob Fehle,
Markus Bink,
Christian Wolff
Abstract:
We present GERestaurant, a novel dataset consisting of 3,078 German language restaurant reviews manually annotated for Aspect-Based Sentiment Analysis (ABSA). All reviews were collected from Tripadvisor, covering a diverse selection of restaurants, including regional and international cuisine with various culinary styles. The annotations encompass both implicit and explicit aspects, including all aspect terms, their corresponding aspect categories, and the sentiments expressed towards them. Furthermore, we provide baseline scores for the four ABSA tasks Aspect Category Detection, Aspect Category Sentiment Analysis, End-to-End ABSA and Target Aspect Sentiment Detection as a reference point for future advances. The dataset fills a gap in German language resources and facilitates exploration of ABSA in the restaurant domain.
Submitted 15 August, 2024;
originally announced August 2024.
-
Imagen 3
Authors:
Imagen-Team-Google,
Jason Baldridge,
Jakob Bauer,
Mukul Bhutani,
Nicole Brichtova,
Andrew Bunner,
Lluis Castrejon,
Kelvin Chan,
Yichang Chen,
Sander Dieleman,
Yuqing Du,
Zach Eaton-Rosen,
Hongliang Fei,
Nando de Freitas,
Yilin Gao,
Evgeny Gladchenko,
Sergio Gómez Colmenarejo,
Mandy Guo,
Alex Haig,
Will Hawkins,
Hexiang Hu,
Huilian Huang,
Tobenna Peter Igwe,
Christos Kaplanis
, et al. (237 additional authors not shown)
Abstract:
We introduce Imagen 3, a latent diffusion model that generates high quality images from text prompts. We describe our quality and responsibility evaluations. Imagen 3 is preferred over other state-of-the-art (SOTA) models at the time of evaluation. In addition, we discuss issues around safety and representation, as well as methods we used to minimize the potential harm of our models.
Submitted 21 December, 2024; v1 submitted 13 August, 2024;
originally announced August 2024.
-
A Review of Nine Physics Engines for Reinforcement Learning Research
Authors:
Michael Kaup,
Cornelius Wolff,
Hyerim Hwang,
Julius Mayer,
Elia Bruni
Abstract:
We present a review of popular simulation engines and frameworks used in reinforcement learning (RL) research, aiming to guide researchers in selecting tools for creating simulated physical environments for RL and training setups. It evaluates nine frameworks (Brax, Chrono, Gazebo, MuJoCo, ODE, PhysX, PyBullet, Webots, and Unity) based on their popularity, feature range, quality, usability, and RL capabilities. We highlight the challenges in selecting and utilizing physics engines for RL research, including the need for detailed comparisons and an understanding of each framework's capabilities. Key findings indicate MuJoCo as the leading framework due to its performance and flexibility, despite usability challenges. Unity is noted for its ease of use but lacks scalability and simulation fidelity. The study calls for further development to improve simulation engines' usability and performance and stresses the importance of transparency and reproducibility in RL research. This review contributes to the RL community by offering insights into the selection process for simulation engines, facilitating informed decision-making.
Submitted 23 August, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Innovations in Cover Song Detection: A Lyrics-Based Approach
Authors:
Maximilian Balluff,
Peter Mandl,
Christian Wolff
Abstract:
Cover songs are alternate versions of a song by a different artist. Long a vital part of the music industry, cover songs significantly influence music culture and are commonly heard in public venues. The rise of online music platforms has further increased their prevalence, often as background music or video soundtracks. While current automatic identification methods serve adequately for original songs, they are less effective with cover songs, primarily because cover versions often significantly deviate from the original compositions. In this paper, we propose a novel method for cover song detection that utilizes the lyrics of a song. We introduce a new dataset for cover songs and their corresponding originals. The dataset contains 5078 cover songs and 2828 original songs. In contrast to other cover song datasets, it contains the annotated lyrics for the original song and the cover song. We evaluate our method on this dataset and compare it with multiple baseline approaches. Our results show that our method outperforms the baseline approaches.
Submitted 6 June, 2024;
originally announced June 2024.
-
GRASP: A novel benchmark for evaluating language GRounding And Situated Physics understanding in multimodal language models
Authors:
Serwan Jassim,
Mario Holubar,
Annika Richter,
Cornelius Wolff,
Xenia Ohmer,
Elia Bruni
Abstract:
This paper presents GRASP, a novel benchmark to evaluate the language grounding and physical understanding capabilities of video-based multimodal large language models (LLMs). This evaluation is accomplished via a two-tier approach leveraging Unity simulations. The first level tests for language grounding by assessing a model's ability to relate simple textual descriptions with visual information. The second level evaluates the model's understanding of "Intuitive Physics" principles, such as object permanence and continuity. In addition to releasing the benchmark, we use it to evaluate several state-of-the-art multimodal LLMs. Our evaluation reveals significant shortcomings in the language grounding and intuitive physics capabilities of these models. Although they exhibit at least some grounding capabilities, particularly for colors and shapes, these capabilities depend heavily on the prompting strategy. At the same time, all models perform below or at the chance level of 50% in the Intuitive Physics tests, while human subjects are on average 80% correct. These identified limitations underline the importance of using benchmarks like GRASP to monitor the progress of future models in developing these competencies.
Submitted 6 June, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Model-based Analysis and Specification of Functional Requirements and Tests for Complex Automotive Systems
Authors:
Carsten Wiecher,
Constantin Mandel,
Matthias Günther,
Jannik Fischbach,
Joel Greenyer,
Matthias Greinert,
Carsten Wolff,
Roman Dumitrescu,
Daniel Mendez,
Albert Albers
Abstract:
The specification of requirements and tests are crucial activities in automotive development projects. However, due to the increasing complexity of automotive systems, practitioners fail to specify requirements and tests for distributed and evolving systems with complex interactions when following traditional development processes. To address this research gap, we propose a technique that starts with the early identification of validation concerns from a stakeholder perspective, which we use to systematically design tests that drive a scenario-based modeling and analysis of system requirements. To ensure complete and consistent requirements and test specifications in a form that is required in automotive development projects, we develop a Model-Based Systems Engineering (MBSE) methodology. This methodology supports system architects and test designers in the collaborative application of our technique and in maintaining a central system model, in order to automatically derive the required specifications. We evaluate our methodology by applying it at KOSTAL (Tier1 supplier) and within student projects as part of the master's program Embedded Systems Engineering. Our study corroborates that our methodology is applicable and improves existing requirements and test specification processes by supporting the integrated and stakeholder-focused modeling of product and validation systems, where the early definition of stakeholder and validation concerns fosters a problem-oriented, iterative and test-driven requirements modeling.
Submitted 15 November, 2023; v1 submitted 3 September, 2022;
originally announced September 2022.
-
Scenario-based Requirements Engineering for Complex Smart City Projects
Authors:
Carsten Wiecher,
Philipp Tendyra,
Carsten Wolff
Abstract:
Various stakeholders with different backgrounds are involved in Smart City projects. These stakeholders define the project goals, e.g., based on participative approaches, market research or innovation management processes. To realize these goals, complex technical solutions must often be designed and implemented. In practice, however, it is difficult to synchronize the technical design and implementation phase with the definition of moving Smart City goals. We hypothesize that this is due to a lack of a common language for the different stakeholder groups and the technical disciplines. We address this problem with scenario-based requirements engineering techniques. In particular, we use scenarios at different levels of abstraction and formalization that are connected end-to-end by appropriate methods and tools. This enables fast feedback loops to iteratively align technical requirements, stakeholder expectations, and Smart City goals. We demonstrate the applicability of our approach in a case study with different industry partners.
Submitted 18 January, 2022;
originally announced January 2022.
-
Integrated and Iterative Requirements Analysis and Test Specification: A Case Study at Kostal
Authors:
Carsten Wiecher,
Jannik Fischbach,
Joel Greenyer,
Andreas Vogelsang,
Carsten Wolff,
Roman Dumitrescu
Abstract:
Currently, practitioners follow a top-down approach in automotive development projects. However, recent studies have shown that this top-down approach is not suitable for the implementation and testing of modern automotive systems. Specifically, practitioners increasingly fail to specify requirements and tests for systems with complex component interactions (e.g., e-mobility systems). In this paper, we address this research gap and propose an integrated and iterative scenario-based technique for the specification of requirements and test scenarios. Our idea is to combine both a top-down and a bottom-up integration strategy. For the top-down approach, we use a behavior-driven development (BDD) technique to drive the modeling of high-level system interactions from the user's perspective. For the bottom-up approach, we discovered that natural language processing (NLP) techniques are suited to make textual specifications of existing components accessible to our technique. To integrate both directions, we support the joint execution and automated analysis of system-level interactions and component-level behavior. We demonstrate the feasibility of our approach by conducting a case study at Kostal (Tier1 supplier). The case study corroborates, among other things, that our approach supports practitioners in improving requirements and test specifications for integrated system behavior.
Submitted 12 July, 2021;
originally announced July 2021.
-
Selecting Features for the Next Release in a System of Systems Context
Authors:
Carsten Wiecher,
Carsten Wolff,
Harald Anacker,
Roman Dumitrescu
Abstract:
Smart Cities are developing in parallel with the global trend towards urbanization. The ultimate goal of Smart City projects is to deliver a positive impact for the citizens and the socio-economic and ecological environment. This involves the challenge to derive concrete requirements for (technical) projects from overarching concepts like Quality of Life (QoL) and Subjective Well-Being (SWB). Linking long-term, impact oriented goals with project outputs and outcomes is a complex problem. Decision making on requirements and resulting features of single Smart City projects (or systems) is even more complex since cities are not like monolithic, hierarchical and well structured systems. Nevertheless, systems engineering provides concepts which support decision making in such situations. Complex socio-technical systems such as smart cities can be characterized as systems of systems (SoS). A SoS is composed of independently developed systems that nevertheless provide a higher-level integrated functionality. To add new functionality to a SoS, either existing systems must be extended or new systems must be developed and integrated. In both cases, the extension of functionality is usually done in small increments and structured via software releases. However, the decision which features to include in the next release is complex and difficult to manage when done manually. To address this, we make use of the multi-objective next release problem (MONRP) to search for an optimal set of features for a software release in a SoS context. In order to refine the search in an early planning phase, we propose a technique to model and validate the features using the scenario modeling language for Kotlin (SMLK). This is demonstrated with a proof-of-concept implementation.
Submitted 3 March, 2021; v1 submitted 17 February, 2021;
originally announced February 2021.
-
Iterative and Scenario-based Requirements Specification in a System of Systems Context
Authors:
Carsten Wiecher,
Joel Greenyer,
Carsten Wolff,
Harald Anacker,
Roman Dumitrescu
Abstract:
[Context & Motivation] Due to the managerial, operational, and evolutionary independence of constituent systems (CSs) in a System of Systems (SoS) context, top-down and linear requirements engineering (RE) approaches are insufficient. RE techniques for SoS must support iterating, changing, synchronizing, and communicating requirements across different abstraction and hierarchy levels as well as scopes of responsibility. [Question/Problem] We address the challenge of SoS requirements specification, where requirements can describe the SoS behavior, but also the behavior of CSs that are developed independently. [Principal Ideas] To support the requirements specification in an SoS environment, we propose a scenario-based and iterative specification technique. This allows requirements engineers to continuously model and jointly execute and test the system behavior for the SoS and the CS in order to detect contradictions in the requirement specifications at an early stage. [Contribution] In this paper, we describe an extension for the scenario-modeling language for Kotlin (SMLK) to continuously and formally model requirements on SoS and CS level. To support the iterative requirements specification and modeling, we combine SMLK with agile development techniques. We demonstrate the applicability of our approach with the help of an example from the field of e-mobility.
Submitted 10 February, 2021;
originally announced February 2021.
-
Human vs. supervised machine learning: Who learns patterns faster?
Authors:
Niklas Kühl,
Marc Goutier,
Lucas Baier,
Clemens Wolff,
Dominik Martin
Abstract:
The capabilities of supervised machine learning (SML), especially compared to human abilities, are being discussed in scientific research and in the usage of SML. This study provides an answer to how learning performance differs between humans and machines when there is limited training data. We have designed an experiment in which 44 humans and three different machine learning algorithms identify patterns in labeled training data and have to label instances according to the patterns they find. The results show a high dependency between performance and the underlying patterns of the task. Whereas humans perform relatively similarly across all patterns, machines show large performance differences for the various patterns in our experiment. After seeing 20 instances in the experiment, human performance does not improve anymore, which we relate to theories of cognitive overload. Machines learn more slowly but can reach the same level as, or may even outperform, humans in 2 of the 4 patterns used. However, machines need more instances compared to humans for the same results. The performance of machines is comparably lower for the other 2 patterns due to the difficulty of combining input features.
Submitted 30 November, 2020;
originally announced December 2020.
-
"Healthy surveillance": Designing a concept for privacy-preserving mask recognition AI in the age of pandemics
Authors:
Niklas Kühl,
Dominik Martin,
Clemens Wolff,
Melanie Volkamer
Abstract:
The obligation to wear masks in times of pandemics reduces the risk of spreading viruses. During the COVID-19 pandemic in 2020, many governments recommended or even obligated their citizens to wear masks as an effective countermeasure. For public authorities to continuously monitor compliance with this policy measure in public spaces like restaurants or tram stations, one scalable and automatable option is the application of surveillance systems, i.e., CCTV. However, large-scale monitoring via mask recognition not only requires a well-performing Artificial Intelligence, but must also ensure that no privacy issues are introduced, as surveillance is a deterrent for citizens and regulations like the General Data Protection Regulation (GDPR) demand strict regulation of such personal data. In this work, we show what a privacy-preserving mask recognition artifact could look like, demonstrate different options for implementation, and evaluate performance. Our conceptual deep-learning-based Artificial Intelligence is able to achieve detection performances between 95% and 99% in a privacy-friendly setting. On that basis, we elaborate on the trade-off between the level of privacy preservation and Artificial Intelligence performance, i.e., the "price of privacy".
Submitted 20 October, 2020;
originally announced October 2020.
-
Advanced machine learning informatics modeling using clinical and radiological imaging metrics for characterizing breast tumor characteristics with the OncotypeDX gene array
Authors:
Michael A. Jacobs,
Christopher Umbricht,
Vishwa Parekh,
Riham El Khouli,
Leslie Cope,
Katarzyna J. Macura,
Susan Harvey,
Antonio C. Wolff
Abstract:
Purpose: Optimal use of established and imaging methods, such as multiparametric magnetic resonance imaging (mpMRI), can simultaneously identify key functional parameters and provide unique imaging phenotypes of breast cancer. Therefore, we have developed and implemented a new machine-learning informatic system that integrates clinical variables, derived from imaging and clinical health records, to compare with the 21-gene array assay, OncotypeDX. Materials and methods: We tested our informatics modeling in a subset of patients (n=81) who had ER+ disease and underwent OncotypeDX gene expression and breast mpMRI testing. The machine-learning informatic method, termed Integrated Radiomic Informatic System (IRIS), was applied to the mpMRI, clinical and pathologic descriptors, as well as a gene array analysis. The IRIS method uses an advanced graph-theoretic model and quantitative metrics. Summary statistics (mean and standard deviations) for the quantitative imaging parameters were obtained. Sensitivity, specificity and Area Under the Curve (AUC) were calculated for the classification of the patients. Results: The OncotypeDX classification by the IRIS model had a sensitivity of 95% and a specificity of 89% with an AUC of 0.92. The breast lesion size was larger for the high-risk groups and smaller for both low-risk and intermediate-risk groups. There were significant differences in PK-DCE and ADC map values in each group. The ADC map values for high- and intermediate-risk groups were significantly lower than for the low-risk group. Conclusion: These initial studies provide a deeper understanding of imaging features and the molecular gene array OncotypeDX score. This insight provides the foundation to relate these imaging features to the assessment of treatment response for improved personalized medicine.
Submitted 7 November, 2018;
originally announced November 2018.
-
RootJS: Node.js Bindings for ROOT 6
Authors:
Theo Beffart,
Maximilian Früh,
Christoph Haas,
Sachin Rajgopal,
Jonas Schwabe,
Christoph Wolff,
Marek Szuba
Abstract:
We present rootJS, an interface making it possible to seamlessly integrate ROOT 6 into applications written for Node.js, the JavaScript runtime platform increasingly commonly used to create high-performance Web applications. ROOT features can be called both directly from Node.js code and by JIT-compiling C++ macros. All rootJS methods are invoked asynchronously and support callback functions, allowing non-blocking operation of Node.js applications using them. Last but not least, our bindings have been designed to be platform-independent and should therefore work on all systems supporting both ROOT 6 and Node.js.
Thanks to rootJS it is now possible to create ROOT-aware Web applications taking full advantage of the high performance and extensive capabilities of Node.js. Examples include platforms for the quality assurance of acquired, reconstructed or simulated data, book-keeping and e-log systems, and even Web browser-based data visualisation and analysis.
Submitted 28 March, 2017;
originally announced April 2017.
-
Impact of video quality and wireless network interface on power consumption of mobile devices
Authors:
Norbert Zsak,
Christian Wolff
Abstract:
In recent years, many improvements have been made to the hardware capability of mobile devices. As mobile software also became more interactive and data-processing intensive, the increased power demand could not be compensated by the improvements in battery technology. Adaptive systems can help to balance the demand of applications with the limitations of battery resources. For effective systems, the influence of multimedia quality on the power consumption of the components of mobile devices needs to be better understood. In this paper, we analyze the impact of video quality and wireless network type on the energy consumption of a mobile device. We have found that the additional power consumption is up to 38% higher when a movie is played over a WiFi network instead of from internal memory, and 64% higher in the case of a mobile network (3G). We have also discovered that a higher movie quality not only affects the power consumption of the CPU but also the power consumption of the WiFi unit by up to 58% and up to 72% respectively on mobile networks.
Submitted 1 August, 2014; v1 submitted 29 July, 2014;
originally announced July 2014.
-
Event based classification of Web 2.0 text streams
Authors:
Andreas Bauer,
Christian Wolff
Abstract:
Web 2.0 applications like Twitter or Facebook create a continuous stream of information. This demands new ways of analysis in order to offer insight into this stream right at the moment of the creation of the information, because much of this data is only relevant within a short period of time. To address this problem, real-time search engines have recently received increased attention. They take into account the continuous flow of information differently than traditional web search by incorporating temporal and social features that describe the context of the information during its creation. Standard approaches, where data is first stored and then processed from persistent storage, suffer from latency. We want to address the fluent and rapid nature of text streams by providing an event-based approach that directly analyses the stream of information. In a first step, we want to define the difference between real-time search and traditional search to clarify the demands in modern text filtering. In a second step, we want to show how event-based features can be used to support the tasks of real-time search engines. Using the example of Twitter, we present in this paper a way to combine an event-based approach with text mining and information filtering concepts in order to classify incoming information based on stream features. We calculate stream-dependent features and feed them into a neural network in order to classify the text streams. We show the separative capabilities of event-based features as the foundation for a real-time search engine.
Submitted 28 June, 2013; v1 submitted 16 April, 2012;
originally announced April 2012.