-
Comparative Analysis of Document-Level Embedding Methods for Similarity Scoring on Shakespeare Sonnets and Taylor Swift Lyrics
Authors:
Klara Kramer
Abstract:
This study evaluates the performance of TF-IDF weighting, averaged Word2Vec embeddings, and BERT embeddings for document similarity scoring across two contrasting textual domains. By analysing cosine similarity scores, the methods' strengths and limitations are highlighted. The findings underscore TF-IDF's reliance on lexical overlap and Word2Vec's superior semantic generalisation, particularly in cross-domain comparisons. BERT demonstrates lower performance in challenging domains, likely due to insufficient domain-specific fine-tuning.
Submitted 23 December, 2024;
originally announced December 2024.
-
How to select slices for annotation to train best-performing deep learning segmentation models for cross-sectional medical images?
Authors:
Yixin Zhang,
Kevin Kramer,
Maciej A. Mazurowski
Abstract:
Automated segmentation of medical images heavily relies on the availability of precise manual annotations. However, generating these annotations is often time-consuming, expensive, and sometimes requires specialized expertise (especially for cross-sectional medical images). Therefore, it is essential to optimize the use of annotation resources to ensure efficiency and effectiveness. In this paper, we systematically address the question: "in a non-interactive annotation pipeline, how should slices from cross-sectional medical images be selected for annotation to maximize the performance of the resulting deep learning segmentation models?" We conducted experiments on 4 medical imaging segmentation tasks with varying annotation budgets, numbers of annotated cases, numbers of annotated slices per volume, slice selection techniques, and mask interpolations. We found that:
1) It is almost always preferable to annotate fewer slices per volume and more volumes given an annotation budget.
2) Selecting slices for annotation by unsupervised active learning (UAL) is not superior to selecting slices randomly or at fixed intervals, provided that each volume is allocated the same number of annotated slices.
3) Interpolating masks between annotated slices rarely enhances model performance, with the exception of some specific configurations for 3D models.
Submitted 5 April, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
Quantifying the Limits of Segmentation Foundation Models: Modeling Challenges in Segmenting Tree-Like and Low-Contrast Objects
Authors:
Yixin Zhang,
Nicholas Konz,
Kevin Kramer,
Maciej A. Mazurowski
Abstract:
Image segmentation foundation models (SFMs) like Segment Anything Model (SAM) have achieved impressive zero-shot and interactive segmentation across diverse domains. However, they struggle to segment objects with certain structures, particularly those with dense, tree-like morphology and low textural contrast from their surroundings. These failure modes are crucial for understanding the limitations of SFMs in real-world applications. To systematically study this issue, we introduce interpretable metrics quantifying object tree-likeness and textural separability. On carefully controlled synthetic experiments and real-world datasets, we show that SFM performance (e.g., SAM, SAM 2, HQ-SAM) noticeably correlates with these factors. We link these failures to "textural confusion", where models misinterpret local structure as global texture, causing over-segmentation or difficulty distinguishing objects from similar backgrounds. Notably, targeted fine-tuning fails to resolve this issue, indicating a fundamental limitation. Our study provides the first quantitative framework for modeling the behavior of SFMs on challenging structures, offering interpretable insights into their segmentation capabilities.
Submitted 10 March, 2025; v1 submitted 5 December, 2024;
originally announced December 2024.
-
An Efficient Multi-Robot Arm Coordination Strategy for Pick-and-Place Tasks using Reinforcement Learning
Authors:
Tizian Jermann,
Hendrik Kolvenbach,
Fidel Esquivel Estay,
Koen Kramer,
Marco Hutter
Abstract:
We introduce a novel strategy for multi-robot sorting of waste objects using Reinforcement Learning. Our focus lies on finding optimal picking strategies that facilitate an effective coordination of a multi-robot system, subject to maximizing the waste removal potential. We realize this by formulating the sorting problem as an OpenAI gym environment and training a neural network with a deep reinforcement learning algorithm. The objective function is set up to optimize the picking rate of the robotic system. In simulation, we draw a performance comparison to an intuitive combinatorial game theory-based approach. We show that the trained policies outperform the latter and achieve up to 16% higher picking rates. Finally, the respective algorithms are validated on a hardware setup consisting of a two-robot sorting station able to process incoming waste objects through pick-and-place operations.
Submitted 20 September, 2024;
originally announced September 2024.
-
Towards Evolution Capabilities in Data Pipelines
Authors:
Kevin Kramer
Abstract:
Evolutionary change over time in the context of data pipelines is certain, especially with regard to the structure and semantics of data as well as to the pipeline operators. Dealing with these changes, i.e. providing long-term maintenance, is costly. The present work explores the need for evolution capabilities within pipeline frameworks. In this context dealing with evolution is defined as a two-step process consisting of self-awareness and self-adaption. Furthermore, a conceptual requirements model is provided, which encompasses criteria for self-awareness and self-adaption as well as covering the dimensions data, operator, pipeline and environment. A lack of said capabilities in existing frameworks exposes a major gap. Filling this gap will be a significant contribution for practitioners and scientists alike. The present work envisions and lays the foundation for a framework which can handle evolutionary change.
Submitted 28 August, 2023;
originally announced August 2023.
-
Deep neuroevolution for limited, heterogeneous data: proof-of-concept application to Neuroblastoma brain metastasis using a small virtual pooled image collection
Authors:
Subhanik Purkayastha,
Hrithwik Shalu,
David Gutman,
Shakeel Modak,
Ellen Basu,
Brian Kushner,
Kim Kramer,
Sofia Haque,
Joseph Stember
Abstract:
Artificial intelligence (AI) in radiology has made great strides in recent years, but many hurdles remain. Overfitting and lack of generalizability represent important ongoing challenges hindering accurate and dependable clinical deployment. If AI algorithms can avoid overfitting and achieve true generalizability, they can go from the research realm to the forefront of clinical work. Recently, small data AI approaches such as deep neuroevolution (DNE) have avoided overfitting small training sets. We seek to address both overfitting and generalizability by applying DNE to a virtually pooled data set consisting of images from various institutions. Our use case is classifying neuroblastoma brain metastases on MRI. Neuroblastoma is well-suited for our goals because it is a rare cancer. Hence, studying this pediatric disease requires a small data approach. Because our institution is a tertiary care center, the neuroblastoma images in our local Picture Archiving and Communication System (PACS) are largely from outside institutions. These multi-institutional images provide a heterogeneous data set that can simulate real world clinical deployment. As in prior DNE work, we used a small training set, consisting of 30 normal and 30 metastasis-containing post-contrast MRI brain scans, with 37% outside images. The testing set was enriched with 83% outside images. DNE converged to a testing set accuracy of 97%. Hence, the algorithm was able to predict image class with near-perfect accuracy on a testing set that simulates real-world data. The work described here thus represents a considerable contribution toward clinically feasible AI.
Submitted 26 November, 2022;
originally announced November 2022.
-
Automatically identifying a mobile phone user's position within a vehicle
Authors:
Matt Knutson,
Kevin Kramer,
Sara Seifert,
Ryan Chamberlain
Abstract:
Traffic-related injuries and fatalities are major health risks in the United States. Mobile phone use while driving quadruples the risk for a motor vehicle crash. This work demonstrates the feasibility of using the mobile phone camera to passively detect the location of the phone's user within a vehicle. In a large, varied dataset we were able to correctly identify if the user was in the driver's seat or one of the passenger seats with 94.9% accuracy. This model could be used by application developers to selectively change or lock functionality while a user is driving, but not if the user is a passenger in a moving vehicle.
Submitted 11 November, 2021;
originally announced November 2021.
-
Machine Learning-based Estimation of Forest Carbon Stocks to increase Transparency of Forest Preservation Efforts
Authors:
Björn Lütjens,
Lucas Liebenwein,
Katharina Kramer
Abstract:
An increasing number of companies and cities plan to become CO2-neutral, which requires them to invest in renewable energies and carbon emission offsetting solutions. One of the cheapest carbon offsetting solutions is preventing deforestation in developing nations, a major contributor to global greenhouse gas emissions. However, forest preservation projects historically display an issue of trust and transparency, which drives companies to invest in transparent, but expensive air carbon capture facilities. Preservation projects could conduct accurate forest inventories (tree diameter, species, height, etc.) to transparently estimate the biomass and amount of stored carbon. However, current rainforest inventories are too inaccurate, because they are often based on a few expensive ground-based samples and/or low-resolution satellite imagery. LiDAR-based solutions, used in US forests, are accurate, but cost-prohibitive and hardly accessible in the Amazon rainforest. We propose accurate and cheap forest inventory analyses through Deep Learning-based processing of drone imagery. The more transparent estimation of stored carbon will create higher transparency towards clients and thereby increase trust and investment into forest preservation projects.
Submitted 17 December, 2019;
originally announced December 2019.
-
A Fully-Integrated Sensing and Control System for High-Accuracy Mobile Robotic Building Construction
Authors:
Abel Gawel,
Hermann Blum,
Johannes Pankert,
Koen Krämer,
Luca Bartolomei,
Selen Ercan,
Farbod Farshidian,
Margarita Chli,
Fabio Gramazio,
Roland Siegwart,
Marco Hutter,
Timothy Sandy
Abstract:
We present a fully-integrated sensing and control system which enables mobile manipulator robots to execute building tasks with millimeter-scale accuracy on building construction sites. The approach leverages multi-modal sensing capabilities for state estimation, tight integration with digital building models, and integrated trajectory planning and whole-body motion control. A novel method for high-accuracy localization updates relative to the known building structure is proposed. The approach is implemented on a real platform and tested under realistic construction conditions. We show that the system can achieve sub-cm end-effector positioning accuracy during fully autonomous operation using solely on-board sensing.
Submitted 4 December, 2019;
originally announced December 2019.
-
Stepping Forward with Exoskeletons: Team IHMC's Design and Approach in the 2016 Cybathlon
Authors:
Robert Griffin,
Tyson Cobb,
Travis Craig,
Mark Daniel,
Nick van Dijk,
Jeremy Gines,
Koen Kramer,
Shriya Shah,
Olger Siebinga,
Jesper Smith,
Peter Neuhaus
Abstract:
Exoskeletons are a promising technology that enables individuals with mobility limitations to walk again. As the 2016 Cybathlon illustrated, however, the community has a considerable way to go before exoskeletons have the necessary capabilities to be incorporated into daily life. While most exoskeletons power only hip and knee flexion, Team Institute for Human and Machine Cognition (IHMC) presents a new exoskeleton, Mina v2, which includes a powered ankle dorsi/plantar flexion. As our entry to the 2016 Cybathlon Powered Exoskeleton Competition, Mina v2's performance allowed us to explore the effectiveness of its powered ankle compared to other powered exoskeletons for pilots with paraplegia. We designed our gaits to incorporate powered ankle plantar flexion to help improve mobility, which allowed our pilot to navigate the given Cybathlon tasks quickly, including those that required ascending movements, and reliably achieve average, conservative walking speeds of 1.04 km/h (0.29 m/s). This enabled our team to place second overall in the Powered Exoskeleton Competition in the 2016 Cybathlon.
Submitted 24 December, 2017; v1 submitted 28 February, 2017;
originally announced February 2017.
-
Noncomputable functions in the Blum-Shub-Smale model
Authors:
Wesley Calvert,
Ken Kramer,
Russell Miller
Abstract:
Working in the Blum-Shub-Smale model of computation on the real numbers, we answer several questions of Meer and Ziegler. First, we show that, for each natural number d, an oracle for the set of algebraic real numbers of degree at most d is insufficient to allow an oracle BSS-machine to decide membership in the set of algebraic numbers of degree d + 1. We add a number of further results on relative computability of these sets and their unions. Then we show that the halting problem for BSS-computation is not decidable below any countable oracle set, and give a more specific condition, related to the cardinalities of the sets, necessary for relative BSS-computability. Most of our results involve the technique of using as input a tuple of real numbers which is algebraically independent over both the parameters and the oracle of the machine.
Submitted 21 May, 2011; v1 submitted 6 May, 2011;
originally announced May 2011.
-
The Cardinality of an Oracle in Blum-Shub-Smale Computation
Authors:
Wesley Calvert,
Ken Kramer,
Russell Miller
Abstract:
We examine the relation of BSS-reducibility on subsets of the real numbers. The question was asked recently (and anonymously) whether it is possible for the halting problem H in BSS-computation to be BSS-reducible to a countable set. Intuitively, it seems that a countable set ought not to contain enough information to decide membership in a reasonably complex (uncountable) set such as H. We confirm this intuition, and prove a more general theorem linking the cardinality of the oracle set to the cardinality, in a local sense, of the set which it computes. We also mention other recent results on BSS-computation and algebraic real numbers.
Submitted 2 June, 2010;
originally announced June 2010.