-
Global Context Is All You Need for Parallel Efficient Tractography Parcellation
Authors:
Valentin von Bornhaupt,
Johannes Grün,
Justus Bisten,
Tobias Bauer,
Theodor Rüber,
Thomas Schultz
Abstract:
Whole-brain tractography in diffusion MRI is often followed by a parcellation in which each streamline is classified as belonging to a specific white matter bundle, or discarded as a false positive. Efficient parcellation is important both in large-scale studies, which have to process huge amounts of data, and in the clinic, where computational resources are often limited. TractCloud is a state-of-the-art approach that aims to maximize accuracy with a local-global representation. We demonstrate that the local context does not contribute to the accuracy of that approach, and is even detrimental when dealing with pathological cases. Based on this observation, we propose PETParc, a new method for Parallel Efficient Tractography Parcellation. PETParc is a transformer-based architecture in which the whole-brain tractogram is randomly partitioned into sub-tractograms whose streamlines are classified in parallel, while serving as global context for each other. This leads to a speedup of up to two orders of magnitude relative to TractCloud, and permits inference even on clinical workstations without a GPU. PETParc accounts for the lack of streamline orientation either via a novel flip-invariant embedding, or by simply using flips as part of data augmentation. Despite the speedup, results are often even better than those of prior methods. The code and pretrained model will be made public upon acceptance.
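The two data-handling ideas named in this abstract, random partitioning of a tractogram into sub-tractograms and flips as data augmentation, can be illustrated with a minimal sketch. This is not the PETParc implementation: the function names, the representation of a streamline as a list of 3D points, and the flip probability `p` are assumptions made for illustration only.

```python
import random

def flip_streamline(points):
    """Reverse the point order; the geometric curve is unchanged."""
    return points[::-1]

def augment_with_flips(streamlines, rng, p=0.5):
    """Flip each streamline with probability p (hypothetical parameter)."""
    return [flip_streamline(s) if rng.random() < p else s for s in streamlines]

def random_partition(streamlines, num_parts, rng):
    """Shuffle streamline indices and deal them into num_parts groups."""
    indices = list(range(len(streamlines)))
    rng.shuffle(indices)
    return [
        [streamlines[i] for i in indices[k::num_parts]]
        for k in range(num_parts)
    ]

# Toy data: ten straight "streamlines" of three points each.
rng = random.Random(0)
tractogram = [[(float(i), 0.0, 0.0), (float(i), 1.0, 0.0), (float(i), 2.0, 0.0)]
              for i in range(10)]
sub_tractograms = random_partition(tractogram, 3, rng)
augmented = augment_with_flips(tractogram, rng)
```

Every streamline lands in exactly one sub-tractogram, so the groups can be processed independently; how they serve as context for each other in the actual model is not covered by this sketch.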
Submitted 10 March, 2025;
originally announced March 2025.
-
Securing Confidential Data For Distributed Software Development Teams: Encrypted Container File
Authors:
Tobias J. Bauer,
Andreas Aßmuth
Abstract:
In the context of modern software engineering, there is a trend towards Cloud-native software development involving international teams with members from all over the world. Cloud-based version management services like GitHub are commonly used for source code and other files. However, a challenge arises when developers from different companies or organizations share the platform, as sensitive data should be encrypted to restrict access to certain developers only. This paper discusses existing tools addressing this issue, highlighting their shortcomings. The authors propose their own solution, Encrypted Container Files, designed to overcome the deficiencies observed in other tools.
Submitted 12 July, 2024;
originally announced July 2024.
-
Distinguishing Tor From Other Encrypted Network Traffic Through Character Analysis
Authors:
Pitpimon Choorod,
Tobias J. Bauer,
Andreas Aßmuth
Abstract:
For journalists reporting from a totalitarian regime, whistleblowers and resistance fighters, the anonymous use of cloud services on the Internet can be vital for survival. The Tor network provides a free and widely used anonymization service for everyone. However, there are different approaches to distinguishing Tor from non-Tor encrypted network traffic, most recently based solely on the (relative) frequencies of hex digits in a single encrypted payload packet. Conventional data traffic is usually encrypted once, whereas Tor traffic is encrypted at least three times due to the structure and principle of the Tor network. We have therefore examined to what extent the number of encryptions contributes to being able to distinguish Tor from non-Tor encrypted data traffic.
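The feature mentioned in this abstract, the relative frequency of each hex digit in a single payload, can be computed as in the following minimal sketch. The function name and the sample payload are hypothetical, and the sketch covers only the feature computation, not the classification step used in the paper.

```python
from collections import Counter

HEX_DIGITS = "0123456789abcdef"

def hex_digit_frequencies(payload: bytes) -> dict[str, float]:
    """Return the relative frequency of each hex digit in the payload."""
    digits = payload.hex()  # two lowercase hex digits per byte
    total = len(digits)
    if total == 0:
        # An empty payload has no digits; report zero for every digit.
        return {d: 0.0 for d in HEX_DIGITS}
    counts = Counter(digits)
    return {d: counts[d] / total for d in HEX_DIGITS}

# Hypothetical two-byte payload: hex representation is "00ff".
freqs = hex_digit_frequencies(bytes([0x00, 0xFF]))
```

For well-encrypted data, each of the 16 frequencies would be expected to lie close to 1/16, so a classifier built on this feature would have to work with small deviations.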
Submitted 15 May, 2024;
originally announced May 2024.
-
Encrypted Container File: Design and Implementation of a Hybrid-Encrypted Multi-Recipient File Structure
Authors:
Tobias J. Bauer,
Andreas Aßmuth
Abstract:
Modern software engineering trends towards Cloud-native software development by international teams of developers. Cloud-based version management services, such as GitHub, are used for the source code and other artifacts created during the development process. However, using such a service usually means that every developer has access to all data stored on the platform. Particularly, if the developers belong to different companies or organizations, it would be desirable for sensitive files to be encrypted in such a way that these can only be decrypted again by a group of previously defined people. In this paper, we examine currently available tools that address this problem, but which have certain shortcomings. We then present our own solution, Encrypted Container Files (ECF), for this problem, eliminating the deficiencies found in the other tools.
Submitted 18 May, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Reddiment: Eine SvelteKit- und ElasticSearch-basierte Reddit Sentiment-Analyse [Reddiment: A SvelteKit- and ElasticSearch-Based Reddit Sentiment Analysis]
Authors:
Tobias Bauer,
Fabian Beer,
Daniel Holl,
Ardian Imeraj,
Konrad Schweiger,
Philipp Stangl,
Wolfgang Weigl,
Christoph P. Neumann
Abstract:
Reddiment is a web-based dashboard that links sentiment analysis of subreddit texts with share prices. The system consists of a backend, frontend and various services. The backend, in Node.js, manages the data and communicates with crawlers that collect Reddit comments and stock market data. Sentiment is analyzed with the help of Vader and TextBlob. The frontend, based on SvelteKit, provides users with a dashboard for visualization. The distribution is carried out via Docker containers and Docker Compose. The project offers expansion options, e.g. the integration of cryptocurrency rates. Reddiment enables the analysis of sentiment and share prices from subreddit data.
Submitted 8 December, 2023;
originally announced December 2023.
-
A Robust Framework for Deep Learning Approaches to Facial Emotion Recognition and Evaluation
Authors:
Nyle Siddiqui,
Rushit Dave,
Tyler Bauer,
Thomas Reither,
Dylan Black,
Mitchell Hanson
Abstract:
Facial emotion recognition (FER) is a vast and complex problem space within the domain of computer vision and thus requires a universally accepted baseline method with which to evaluate proposed models. While test datasets have served this purpose in the academic sphere, real-world application and testing of such models lacks any real comparison. Therefore, we propose a framework in which models developed for FER can be compared and contrasted against one another in a constant, standardized fashion. A lightweight convolutional neural network (CNN) is trained on the AffectNet dataset, a large, variable dataset for facial emotion recognition, and a web application is developed and deployed with our proposed framework as a proof of concept. The CNN is embedded into our application and is capable of instant, real-time facial emotion recognition. When tested on the AffectNet test set, this model achieves high accuracy for emotion classification of eight different emotions. Using our framework, the validity of this model and others can be properly tested by evaluating a model's efficacy not only based on its accuracy on a sample test dataset, but also on in-the-wild experiments. Additionally, our application is built with the ability to save and store any image captured or uploaded to it for emotion recognition, allowing for the curation of higher-quality and more diverse facial emotion recognition datasets.
Submitted 29 January, 2022;
originally announced January 2022.
-
Towards a Common Testing Terminology for Software Engineering and Data Science Experts
Authors:
Lisa Jöckel,
Thomas Bauer,
Michael Kläs,
Marc P. Hauer,
Janek Groß
Abstract:
Analytical quality assurance, especially testing, is an integral part of software-intensive system development. With the increased usage of Artificial Intelligence (AI) and Machine Learning (ML) as part of such systems, this becomes more difficult, as well-understood software testing approaches cannot be applied directly to the AI-enabled parts of the system. The required adaptation of classical testing approaches and the development of new concepts for AI would benefit from a deeper understanding and exchange between AI and software engineering experts. We see the different terminologies used in the two communities as a major obstacle to this. As we consider a mutual understanding of the testing terminology to be key, this paper contributes a mapping between the most important concepts from classical software testing and AI testing. In the mapping, we highlight differences in the relevance and naming of the mapped concepts.
Submitted 6 October, 2021; v1 submitted 31 August, 2021;
originally announced August 2021.
-
Online and Real-Time Tracking in a Surveillance Scenario
Authors:
Oliver Urbann,
Oliver Bredtmann,
Maximilian Otten,
Jan-Philip Richter,
Thilo Bauer,
David Zibriczky
Abstract:
This paper presents an approach for tracking in a surveillance scenario. Typical of this scenario are 24/7 operation and a static camera, mounted above human height, that observes many objects or people. The Multiple Object Tracking Benchmark 20 (MOT20) reflects this scenario best. We show that our approach is real-time capable on this benchmark and outperforms all other real-time capable approaches in HOTA, MOTA, and IDF1. We achieve this by contributing a fast Siamese network reformulated for linear runtime (instead of quadratic) to generate fingerprints from detections. Thus, it is possible to associate the detections to Kalman filters based on multiple tracking-specific ratings: cosine similarity of fingerprints, Intersection over Union, and pixel distance ratio in the image.
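Two of the ratings named in this abstract, cosine similarity and Intersection over Union, have standard definitions that the following minimal sketch illustrates. The function names, the plain-list fingerprint vectors, and the `(x1, y1, x2, y2)` box format are assumptions; how the paper combines the ratings for association is not shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def intersection_over_union(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap_score = intersection_over_union((0, 0, 2, 2), (1, 1, 3, 3))
```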
Submitted 2 June, 2021;
originally announced June 2021.
-
Predictable Accelerator Design with Time-Sensitive Affine Types
Authors:
Rachit Nigam,
Sachille Atapattu,
Samuel Thomas,
Zhijing Li,
Theodore Bauer,
Yuwei Ye,
Apurva Koti,
Adrian Sampson,
Zhiru Zhang
Abstract:
Field-programmable gate arrays (FPGAs) provide an opportunity to co-design applications with hardware accelerators, yet they remain difficult to program. High-level synthesis (HLS) tools promise to raise the level of abstraction by compiling C or C++ to accelerator designs. Repurposing legacy software languages, however, requires complex heuristics to map imperative code onto hardware structures. We find that the black-box heuristics in HLS can be unpredictable: changing parameters in the program that should improve performance can counterintuitively yield slower and larger designs. This paper proposes a type system that restricts HLS to programs that can predictably compile to hardware accelerators. The key idea is to model consumable hardware resources with a time-sensitive affine type system that prevents simultaneous uses of the same hardware structure. We implement the type system in Dahlia, a language that compiles to HLS C++, and show that it can reduce the size of HLS parameter spaces while accepting Pareto-optimal designs.
Submitted 30 April, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
#MeTooMaastricht: Building a chatbot to assist survivors of sexual harassment
Authors:
Tobias Bauer,
Emre Devrim,
Misha Glazunov,
William Lopez Jaramillo,
Balaganesh Mohan,
Gerasimos Spanakis
Abstract:
Inspired by the recent social movement of #MeToo, we are building a chatbot to assist survivors of sexual harassment cases (designed for the city of Maastricht, but it can easily be extended). The motivation behind this work is twofold: to properly assist survivors of such events by directing them to appropriate institutions that can offer them help, and to increase incident documentation so as to gather more data about harassment cases, which are currently underreported. We break down the problem into three data science/machine learning components: harassment type identification (treated as a classification problem), spatio-temporal information extraction (treated as a Named Entity Recognition problem) and dialogue with the users (treated as a slot-filling based chatbot). We achieve a success rate of more than 98% for identifying whether a case is harassment or not, and around 80% for identifying the specific type of harassment. Locations and dates are identified with more than 90% accuracy, while time occurrences prove more challenging, at almost 80%. Finally, initial validation of the chatbot shows great potential for the further development and deployment of such a tool, which would benefit society as a whole.
Submitted 6 September, 2019;
originally announced September 2019.
-
Leveraging Sociological Models for Predictive Analytics
Authors:
Richard Colbaugh,
Kristin Glass,
Travis Bauer
Abstract:
There is considerable interest in developing techniques for predicting human behavior, for instance to enable emerging contentious situations to be forecast or the nature of ongoing but hidden activities to be inferred. A promising approach to this problem is to identify and collect appropriate empirical data and then apply machine learning methods to these data to generate the predictions. This paper shows that the performance of such learning algorithms can often be improved substantially by leveraging sociological models in their development and implementation. In particular, we demonstrate that sociologically-grounded learning algorithms outperform gold-standard methods in three important and challenging tasks: 1.) inferring the (unobserved) nature of relationships in adversarial social networks, 2.) predicting whether nascent social diffusion events will go viral, and 3.) anticipating and defending against future actions of opponents in adversarial settings. Significantly, the new algorithms perform well even when there is limited data available for their training and execution.
Submitted 30 December, 2012;
originally announced December 2012.
-
Adaptation-Based Programming in Haskell
Authors:
Tim Bauer,
Martin Erwig,
Alan Fern,
Jervis Pinto
Abstract:
We present an embedded DSL to support adaptation-based programming (ABP) in Haskell. ABP is an abstract model for defining adaptive values, called adaptives, which adapt in response to some associated feedback. We show how our design choices in Haskell motivate higher-level combinators and constructs and help us derive more complicated compositional adaptives.
We also show that an important specialization of ABP supports reinforcement learning constructs, which optimize adaptive values based on a programmer-specified objective function. This permits ABP users to easily define adaptive values that express uncertainty anywhere in their programs. Over repeated executions, these adaptive values adjust to more efficient ones and enable the user's programs to self-optimize.
The design of our DSL depends significantly on the use of type classes. We will illustrate, along with presenting our DSL, how the use of type classes can support the gradual evolution of DSLs.
Submitted 4 September, 2011;
originally announced September 2011.
-
The Visualization of the Road Coloring Algorithm in the package TESTAS
Authors:
A. N. Trahtman,
T. Bauer,
N. Cohen
Abstract:
A synchronizing word of a deterministic automaton is a word in the alphabet of colors of its edges that maps the automaton to a single state. A coloring of edges of a directed graph is synchronizing if the coloring turns the graph into a deterministic finite automaton possessing a synchronizing word.
The road coloring problem is the problem of finding a synchronizing coloring of a directed finite strongly connected graph in which all vertices have the same outdegree, provided that the greatest common divisor of the lengths of all its cycles is one. A polynomial-time algorithm for road coloring is based on the recent positive solution of this famous old problem.
One can use our new visualization program to demonstrate the algorithm as well as to visualize the transition graph of any finite automaton. The visual image presents some structural properties of the transition graph. This help tool is linear in the size of the automaton.
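The definition of a synchronizing word given in this abstract can be checked directly, as in the following minimal sketch. It only tests whether a given word synchronizes a given automaton; it is not the road coloring algorithm and has no relation to the TESTAS package code. The function names and the dictionary encoding of the transition function are assumptions, and the example automaton is the three-state Černý automaton.

```python
def apply_word(delta, state, word):
    """Follow the edges labelled by the letters of word, starting at state."""
    for color in word:
        state = delta[state][color]
    return state

def is_synchronizing(delta, word):
    """True if word maps every state of the automaton to one single state."""
    end_states = {apply_word(delta, state, word) for state in delta}
    return len(end_states) == 1

# Three-state Cerny automaton: 'a' cycles the states,
# 'b' sends state 2 to state 0 and fixes the other states.
delta = {
    0: {"a": 1, "b": 0},
    1: {"a": 2, "b": 1},
    2: {"a": 0, "b": 0},
}
result = is_synchronizing(delta, "baab")
```

Here the word `baab` sends all three states to state 0, whereas `a` alone is a permutation of the states and therefore cannot synchronize the automaton.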
Submitted 23 November, 2010; v1 submitted 16 July, 2009;
originally announced July 2009.
-
High Density Through Silicon Via (TSV)
Authors:
Magnus Rimskog,
Tomas Bauer
Abstract:
The Through Silicon Via (TSV) process developed by Silex provides a pitch down to 30 micrometers for through-wafer connections in substrates up to 600 micrometers thick. Integrated with MEMS designs, it enables significantly reduced die size and true "Wafer Level Packaging" - features that are particularly important in consumer market applications. The TSV technology also enables integration of advanced interconnect functions in optical MEMS, sensors and microfluidic devices. In addition, the via technology opens up very interesting possibilities for integration with CMOS processing. With several companies already using the process today, qualified volume manufacturing in place and a line-up of potential users, the process is becoming a standard in the MEMS industry. We provide an introduction to the via formation process and also present some of the novel solutions made available by the technology.
Submitted 7 May, 2008;
originally announced May 2008.