-
6D Pose Estimation on Point Cloud Data through Prior Knowledge Integration: A Case Study in Autonomous Disassembly
Authors:
Chengzhi Wu,
Hao Fu,
Jan-Philipp Kaiser,
Erik Tabuchi Barczak,
Julius Pfrommer,
Gisela Lanza,
Michael Heizmann,
Jürgen Beyerer
Abstract:
The accurate estimation of 6D pose remains a challenging task within the computer vision domain, even when utilizing 3D point cloud data. Conversely, in the manufacturing domain, instances arise where leveraging prior knowledge can yield advancements in this endeavor. This study focuses on the disassembly of starter motors to augment the engineering of product life cycles. A pivotal objective in this context involves the identification and 6D pose estimation of bolts affixed to the motors, facilitating automated disassembly within the manufacturing workflow. Complicating matters, the presence of occlusions and the limitations of single-view data acquisition, notably when motors are placed in a clamping system, obscure certain portions and render some bolts imperceptible. Consequently, the development of a comprehensive pipeline capable of acquiring complete bolt information is imperative to avoid oversight in bolt detection. In this paper, employing the task of bolt detection within the scope of our project as a pertinent use case, we introduce a meticulously devised pipeline. This multi-stage pipeline effectively captures the 6D information with regard to all bolts on the motor, thereby showcasing the effective utilization of prior knowledge in handling this challenging task. The proposed methodology not only contributes to the field of 6D pose estimation but also underscores the viability of integrating domain-specific insights to tackle complex problems in manufacturing and automation.
Submitted 30 May, 2025;
originally announced May 2025.
-
Laplace Sample Information: Data Informativeness Through a Bayesian Lens
Authors:
Johannes Kaiser,
Kristian Schwethelm,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information (LSI), a measure of sample informativeness grounded in information theory and widely applicable across model architectures and learning settings. LSI leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that LSI is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of LSI on image and text data in supervised and unsupervised settings. Moreover, we show that LSI can be computed efficiently through probes and transfers well to the training of large models.
Submitted 21 May, 2025;
originally announced May 2025.
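The LSI abstract above describes measuring, via the KL divergence, how much a single sample shifts an approximate weight posterior. As a minimal sketch of that one ingredient, and assuming both posteriors are diagonal Gaussians (as a Laplace approximation would plausibly yield; the diagonal covariance, the direction of the KL divergence, and the function name `kl_diag_gaussians` are illustrative assumptions, not details taken from the paper), the closed-form KL divergence can be computed as follows:

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Return KL(p || q) for two diagonal Gaussians.

    Each argument is a sequence of per-parameter means or variances.
    Hypothetical helper for illustration only; variances must be positive.
    """
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension closed form:
        # 0.5 * (log(vq / vp) + (vp + (mp - mq)^2) / vq - 1)
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


# Example: posteriors that differ only by a unit shift in one mean.
shift_cost = kl_diag_gaussians([0.0], [1.0], [1.0], [1.0])
```

In an LSI-style setup, `p` and `q` might be the posteriors obtained with and without the sample of interest; how those posteriors are fitted is outside the scope of this sketch.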
-
Highly Accurate and Diverse Traffic Data: The DeepScenario Open 3D Dataset
Authors:
Oussema Dhaouadi,
Johannes Meier,
Luca Wahl,
Jacques Kaiser,
Luca Scalerandi,
Nick Wandelburg,
Zhuolun Zhou,
Nijanthan Berinpanathan,
Holger Banzhaf,
Daniel Cremers
Abstract:
Accurate 3D trajectory data is crucial for advancing autonomous driving. Yet, traditional datasets are usually captured by fixed sensors mounted on a car and are susceptible to occlusion. Additionally, such an approach can precisely reconstruct the dynamic environment in the close vicinity of the measurement vehicle only, while neglecting objects that are further away. In this paper, we introduce the DeepScenario Open 3D Dataset (DSC3D), a high-quality, occlusion-free dataset of 6 degrees of freedom bounding box trajectories acquired through a novel monocular camera drone tracking pipeline. Our dataset includes more than 175,000 trajectories of 14 types of traffic participants and significantly exceeds existing datasets in terms of diversity and scale, containing many unprecedented scenarios such as complex vehicle-pedestrian interaction on highly populated urban streets and comprehensive parking maneuvers from entry to exit. The DSC3D dataset was captured at five locations in Europe and the United States: a parking lot, a crowded inner city, a steep urban intersection, a federal highway, and a suburban intersection. Our 3D trajectory dataset aims to enhance autonomous driving systems by providing detailed environmental 3D representations, which could lead to improved obstacle interactions and safety. We demonstrate its utility across multiple applications including motion prediction, motion planning, scenario mining, and generative reactive traffic agents. Our interactive online visualization platform and the complete dataset are publicly available at https://app.deepscenario.com, facilitating research in motion prediction, behavior modeling, and safety validation.
Submitted 25 April, 2025; v1 submitted 24 April, 2025;
originally announced April 2025.
-
Shape Your Ground: Refining Road Surfaces Beyond Planar Representations
Authors:
Oussema Dhaouadi,
Johannes Meier,
Jacques Kaiser,
Daniel Cremers
Abstract:
Road surface reconstruction from aerial images is fundamental for autonomous driving, urban planning, and virtual simulation, where smoothness, compactness, and accuracy are critical quality factors. Existing reconstruction methods often produce artifacts and inconsistencies that limit usability, while downstream tasks have a tendency to represent roads as planes for simplicity but at the cost of accuracy. We introduce FlexRoad, the first framework to directly address road surface smoothing by fitting Non-Uniform Rational B-Splines (NURBS) surfaces to 3D road points obtained from photogrammetric reconstructions or geodata providers. Our method at its core utilizes the Elevation-Constrained Spatial Road Clustering (ECSRC) algorithm for robust anomaly correction, significantly reducing surface roughness and fitting errors. To facilitate quantitative comparison between road surface reconstruction methods, we present GeoRoad Dataset (GeRoD), a diverse collection of road surface and terrain profiles derived from openly accessible geodata. Experiments on GeRoD and the photogrammetry-based DeepScenario Open 3D Dataset (DSC3D) demonstrate that FlexRoad considerably surpasses commonly used road surface representations across various metrics while being insensitive to various input sources, terrains, and noise types. By performing ablation studies, we identify the key role of each component towards high-quality reconstruction performance, making FlexRoad a generic method for realistic road surface modeling.
Submitted 15 April, 2025;
originally announced April 2025.
-
MonoCT: Overcoming Monocular 3D Detection Domain Shift with Consistent Teacher Models
Authors:
Johannes Meier,
Louis Inchingolo,
Oussema Dhaouadi,
Yan Xia,
Jacques Kaiser,
Daniel Cremers
Abstract:
We tackle the problem of monocular 3D object detection across different sensors, environments, and camera setups. In this paper, we introduce a novel unsupervised domain adaptation approach, MonoCT, that generates highly accurate pseudo labels for self-supervision. Inspired by our observation that accurate depth estimation is critical to mitigating domain shifts, MonoCT introduces a novel Generalized Depth Enhancement (GDE) module with an ensemble concept to improve depth estimation accuracy. Moreover, we introduce a novel Pseudo Label Scoring (PLS) module by exploring inner-model consistency measurement and a Diversity Maximization (DM) strategy to further generate high-quality pseudo labels for self-training. Extensive experiments on six benchmarks show that MonoCT outperforms existing SOTA domain adaptation methods by large margins (~21% minimum for AP Mod.) and generalizes well to car, traffic camera and drone views.
Submitted 17 March, 2025;
originally announced March 2025.
-
Multi-timescale synaptic plasticity on analog neuromorphic hardware
Authors:
Amani Atoui,
Jakob Kaiser,
Sebastian Billaudelle,
Philipp Spilger,
Eric Müller,
Jannik Luboeinski,
Christian Tetzlaff,
Johannes Schemmel
Abstract:
As numerical simulations grow in complexity, their demands on computing time and energy increase. Hardware accelerators offer significant efficiency gains in many computationally intensive scientific fields, but their use in computational neuroscience remains limited. Neuromorphic substrates, such as the BrainScaleS architectures, offer significant advantages, especially for studying complex plasticity rules that require extended simulation runtimes. This work presents the implementation of a calcium-based plasticity rule that integrates calcium dynamics based on the synaptic tagging-and-capture hypothesis on the BrainScaleS-2 system. The implementation of the plasticity rule for a single synapse involves incorporating the calcium dynamics and the plasticity rule equations. The calcium dynamics are mapped to the analog circuits of BrainScaleS-2, while the plasticity rule equations are numerically solved on its embedded digital processors. The main hardware constraints include the speed of the processors and the use of integer arithmetic. By adjusting the timestep of the numerical solver and introducing stochastic rounding, we demonstrate that BrainScaleS-2 accurately emulates a single synapse following a calcium-based plasticity rule across four stimulation protocols and validate our implementation against a software reference model.
Submitted 3 December, 2024;
originally announced December 2024.
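The abstract above mentions stochastic rounding as a way to cope with integer arithmetic on the embedded processors. A minimal sketch of the general technique follows; it is not the BrainScaleS-2 implementation, and the function name `stochastic_round` and the use of Python's `random` module are illustrative assumptions. A real value is rounded up with probability equal to its fractional part, so the rounded result is unbiased in expectation and small updates are not systematically lost.

```python
import math
import random


def stochastic_round(x, rng=random):
    """Round x to a neighbouring integer, unbiased in expectation.

    Rounds up with probability equal to the fractional part of x and
    down otherwise. `rng` is any object with a random() method that
    returns a float in [0, 1). Hypothetical helper for illustration.
    """
    lower = math.floor(x)
    frac = x - lower  # fractional part, in [0, 1)
    return lower + (1 if rng.random() < frac else 0)


# Accumulating many small increments: plain truncation would always add 0,
# whereas stochastic rounding preserves the average increment.
rng = random.Random(0)
accumulated = sum(stochastic_round(0.25, rng) for _ in range(20000))
```

With a fixed-point solver, this kind of rounding would typically be applied to each state update before it is stored as an integer; the timestep and scaling used on the hardware are not specified here.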
-
Reproduction of AdEx dynamics on neuromorphic hardware through data embedding and simulation-based inference
Authors:
Jakob Huhle,
Jakob Kaiser,
Eric Müller,
Johannes Schemmel
Abstract:
The development of mechanistic models of physical systems is essential for understanding their behavior and formulating predictions that can be validated experimentally. Calibration of these models, especially for complex systems, requires automated optimization methods due to the impracticality of manual parameter tuning. In this study, we use an autoencoder to automatically extract relevant features from the membrane trace of a complex neuron model emulated on the BrainScaleS-2 neuromorphic system, and subsequently leverage sequential neural posterior estimation (SNPE), a simulation-based inference algorithm, to approximate the posterior distribution of neuron parameters. Our results demonstrate that the autoencoder is able to extract essential features from the observed membrane traces, with which the SNPE algorithm is able to find an approximation of the posterior distribution. This suggests that the combination of an autoencoder with the SNPE algorithm is a promising optimization method for complex systems.
Submitted 3 December, 2024;
originally announced December 2024.
-
Differentially Private Active Learning: Balancing Effective Data Selection and Privacy
Authors:
Kristian Schwethelm,
Johannes Kaiser,
Jonas Kuntzer,
Mehmet Yigitsoy,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Active learning (AL) is a widely used technique for optimizing data labeling in machine learning by iteratively selecting, labeling, and training on the most informative data. However, its integration with formal privacy-preserving methods, particularly differential privacy (DP), remains largely underexplored. While some works have explored differentially private AL for specialized scenarios like online learning, the fundamental challenge of combining AL with DP in standard learning settings has remained unaddressed, severely limiting AL's applicability in privacy-sensitive domains. This work addresses this gap by introducing differentially private active learning (DP-AL) for standard learning settings. We demonstrate that naively integrating DP-SGD training into AL presents substantial challenges in privacy budget allocation and data utilization. To overcome these challenges, we propose step amplification, which leverages individual sampling probabilities in batch creation to maximize data point participation in training steps, thus optimizing data utilization. Additionally, we investigate the effectiveness of various acquisition functions for data selection under privacy constraints, revealing that many commonly used functions become impractical. Our experiments on vision and natural language processing tasks show that DP-AL can improve performance for specific datasets and model architectures. However, our findings also highlight the limitations of AL in privacy-constrained environments, emphasizing the trade-offs between privacy, model accuracy, and data selection accuracy.
Submitted 31 January, 2025; v1 submitted 1 October, 2024;
originally announced October 2024.
-
CARLA Drone: Monocular 3D Object Detection from a Different Perspective
Authors:
Johannes Meier,
Luca Scalerandi,
Oussema Dhaouadi,
Jacques Kaiser,
Nikita Araslanov,
Daniel Cremers
Abstract:
Existing techniques for monocular 3D detection have a serious restriction. They tend to perform well only on a limited set of benchmarks, faring well either on ego-centric car views or on traffic camera views, but rarely on both. To encourage progress, this work advocates for an extended evaluation of 3D detection frameworks across different camera perspectives. We make two key contributions. First, we introduce the CARLA Drone dataset, CDrone. Simulating drone views, it substantially expands the diversity of camera perspectives in existing benchmarks. Despite its synthetic nature, CDrone represents a real-world challenge. To show this, we confirm that previous techniques struggle to perform well both on CDrone and a real-world 3D drone dataset. Second, we develop an effective data augmentation pipeline called GroundMix. Its distinguishing element is the use of the ground for creating 3D-consistent augmentation of a training image. GroundMix significantly boosts the detection accuracy of a lightweight one-stage detector. In our expanded evaluation, we achieve the average precision on par with or substantially higher than the previous state of the art across all tested datasets.
Submitted 21 October, 2024; v1 submitted 21 August, 2024;
originally announced August 2024.
-
Towards Unlocking Insights from Logbooks Using AI
Authors:
Antonin Sulc,
Alex Bien,
Annika Eichler,
Daniel Ratner,
Florian Rehm,
Frank Mayet,
Gregor Hartmann,
Hayden Hoschouer,
Henrik Tuennermann,
Jan Kaiser,
Jason St. John,
Jennefer Maldonado,
Kyle Hazelwood,
Raimund Kammering,
Thorsten Hellert,
Tim Wilksen,
Verena Kain,
Wan-Lin Hu
Abstract:
Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly testing a tailored Retrieval Augmented Generation (RAG) model for enhancing the usability of particle accelerator logbooks at institutes like DESY, BESSY, Fermilab, BNL, SLAC, LBNL, and CERN. The RAG model uses a corpus built on logbook contributions and aims to unlock insights from these logbooks by leveraging retrieval over facility datasets, including discussion about potential multimodal sources. Our goals are to increase the FAIR-ness (findability, accessibility, interoperability, and reusability) of logbooks by exploiting their information content to streamline everyday use, enable macro-analysis for root cause analysis, and facilitate problem-solving automation.
Submitted 25 May, 2024;
originally announced June 2024.
-
Large Language Models for Human-Machine Collaborative Particle Accelerator Tuning through Natural Language
Authors:
Jan Kaiser,
Annika Eichler,
Anne Lauscher
Abstract:
Autonomous tuning of particle accelerators is an active and challenging field of research with the goal of enabling novel accelerator technologies and cutting-edge high-impact applications, such as physics discovery, cancer research and material sciences. A key challenge with autonomous accelerator tuning remains that the most capable algorithms require an expert in optimisation, machine learning or a similar field to implement the algorithm for every new tuning task. In this work, we propose the use of large language models (LLMs) to tune particle accelerators. We demonstrate on a proof-of-principle example the ability of LLMs to successfully and autonomously tune a particle accelerator subsystem based on nothing more than a natural language prompt from the operator, and compare the performance of our LLM-based solution to state-of-the-art optimisation algorithms, such as Bayesian optimisation (BO) and reinforcement learning-trained optimisation (RLO). In doing so, we also show how LLMs can perform numerical optimisation of a highly non-linear real-world objective function. Ultimately, this work represents yet another complex task that LLMs are capable of solving and promises to help accelerate the deployment of autonomous tuning algorithms to the day-to-day operations of particle accelerators.
Submitted 14 May, 2024;
originally announced May 2024.
-
Visual Privacy Auditing with Diffusion Models
Authors:
Kristian Schwethelm,
Johannes Kaiser,
Moritz Knolle,
Sarah Lockfisch,
Daniel Rueckert,
Alexander Ziller
Abstract:
Data reconstruction attacks on machine learning models pose a substantial threat to privacy, potentially leaking sensitive information. Although defending against such attacks using differential privacy (DP) provides theoretical guarantees, determining appropriate DP parameters remains challenging. Current formal guarantees on the success of data reconstruction suffer from overly stringent assumptions regarding adversary knowledge about the target data, particularly in the image domain, raising questions about their real-world applicability. In this work, we empirically investigate this discrepancy by introducing a reconstruction attack based on diffusion models (DMs) that only assumes adversary access to real-world image priors and specifically targets the DP defense. We find that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as heuristic auditing tools for visualizing privacy leakage.
Submitted 9 March, 2025; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Intermediate field-induced phase of the honeycomb magnet BaCo$_2$(AsO$_4$)$_2$
Authors:
Prashanta K. Mukharjee,
Bin Shen,
Sebastian Erdmann,
Anton Jesche,
Julian Kaiser,
Priya R. Baral,
Oksana Zaharko,
Philipp Gegenwart,
Alexander A. Tsirlin
Abstract:
We use magnetometry, calorimetry, and high-resolution capacitive dilatometry, as well as single-crystal neutron diffraction, to explore the temperature-field phase diagram of the anisotropic honeycomb magnet BaCo$_2$(AsO$_4$)$_2$. Our data reveal four distinct ordered states observed for in-plane magnetic fields. Of particular interest is the narrow region between 0.51 and 0.55 T that separates the up-up-down order from the fully polarized state and coincides with the field range where signatures of the spin-liquid behavior have been reported. We show that magnetic Bragg peaks persist in this intermediate phase, thus ruling out its spin-liquid nature. However, the simultaneous nonmonotonic evolution of nuclear Bragg peaks suggests the involvement of the lattice, witnessed also in other regions of the phase diagram where large changes in the sample length are observed upon entering the magnetically ordered states. Our data highlight the importance of lattice effects in BaCo$_2$(AsO$_4$)$_2$.
Submitted 21 October, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
Cheetah: Bridging the Gap Between Machine Learning and Particle Accelerator Physics with High-Speed, Differentiable Simulations
Authors:
Jan Kaiser,
Chenran Xu,
Annika Eichler,
Andrea Santamaria Garcia
Abstract:
Machine learning has emerged as a powerful solution to the modern challenges in accelerator physics. However, the limited availability of beam time, the computational cost of simulations, and the high-dimensionality of optimisation problems pose significant challenges in generating the required data for training state-of-the-art machine learning models. In this work, we introduce Cheetah, a PyTorch-based high-speed differentiable linear-beam dynamics code. Cheetah enables the fast collection of large data sets by reducing computation times by multiple orders of magnitude and facilitates efficient gradient-based optimisation for accelerator tuning and system identification. This positions Cheetah as a user-friendly, readily extensible tool that integrates seamlessly with widely adopted machine learning tools. We showcase the utility of Cheetah through five examples, including reinforcement learning training, gradient-based beamline tuning, gradient-based system identification, physics-informed Bayesian optimisation priors, and modular neural network surrogate modelling of space charge effects. The use of such a high-speed differentiable simulation code will simplify the development of machine learning-based methods for particle accelerators and fast-track their integration into everyday operations of accelerator facilities.
Submitted 11 January, 2024;
originally announced January 2024.
-
Bayesian Optimization Algorithms for Accelerator Physics
Authors:
Ryan Roussel,
Auralee L. Edelen,
Tobias Boltz,
Dylan Kennedy,
Zhe Zhang,
Fuhao Ji,
Xiaobiao Huang,
Daniel Ratner,
Andrea Santamaria Garcia,
Chenran Xu,
Jan Kaiser,
Angel Ferran Pousa,
Annika Eichler,
Jannis O. Lubsen,
Natalie M. Isenberg,
Yuan Gao,
Nikita Kuklev,
Jose Martinez,
Brahim Mustapha,
Verena Kain,
Weijian Lin,
Simone Maria Liuzzo,
Jason St. John,
Matthew J. V. Streeter,
Remi Lehe
, et al. (1 additional author not shown)
Abstract:
Accelerator physics relies on numerical algorithms to solve optimization problems in online accelerator control and tasks such as experimental design and model calibration in simulations. The effectiveness of optimization algorithms in discovering ideal solutions for complex challenges with limited resources often determines the problem complexity these methods can address. The accelerator physics community has recognized the advantages of Bayesian optimization algorithms, which leverage statistical surrogate models of objective functions to effectively address complex optimization challenges, especially in the presence of noise during accelerator operation and in resource-intensive physics simulations. In this review article, we offer a conceptual overview of applying Bayesian optimization techniques towards solving optimization problems in accelerator physics. We begin by providing a straightforward explanation of the essential components that make up Bayesian optimization techniques. We then give an overview of current and previous work applying and modifying these techniques to solve accelerator physics challenges. Finally, we explore practical implementation strategies for Bayesian optimization algorithms to maximize their performance, enabling users to effectively address complex optimization challenges in real-time beam control and accelerator design.
Submitted 5 April, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Metallic conductivity on Na-deficient structural domain walls in the spin-orbit Mott insulator Na$_2$IrO$_3$
Authors:
Franziska A. Breitner,
Julian Kaiser,
Anton Jesche,
Philipp Gegenwart
Abstract:
Honeycomb Na$_2$IrO$_3$ is a prototype spin-orbit Mott insulator and Kitaev magnet. We report a combined structural and electrical resistivity study of Na$_2$IrO$_3$ single crystals. Laue back-scattering diffraction indicates twinning with $\pm 120^\circ$ rotation around the $c^*$-axis while scanning electron microscopy displays nanothin lines parallel to all three b-axis orientations of twin domains. Energy dispersive x-ray analysis line-scans across such domain walls indicate no change of the Ir signal intensity, i.e. intact honeycomb layers, while the Na intensity is reduced down to $\sim 2/3$ of its original value at the domain walls, implying significant hole doping. Utilizing focused-ion-beam micro-sectioning, the temperature dependence of the electrical resistance of individual domain walls is studied. It demonstrates the tuning through the metal-insulator transition into a correlated-metal ground state by increasing hole doping.
Submitted 13 November, 2023;
originally announced November 2023.
-
Learning to Do or Learning While Doing: Reinforcement Learning and Bayesian Optimisation for Online Continuous Tuning
Authors:
Jan Kaiser,
Chenran Xu,
Annika Eichler,
Andrea Santamaria Garcia,
Oliver Stein,
Erik Bründermann,
Willi Kuropka,
Hannes Dinter,
Frank Mayet,
Thomas Vinatier,
Florian Burkart,
Holger Schlarb
Abstract:
Online tuning of real-world plants is a complex optimisation problem that continues to require manual intervention by experienced human operators. Autonomous tuning is a rapidly expanding field of research, where learning-based methods, such as Reinforcement Learning-trained Optimisation (RLO) and Bayesian optimisation (BO), hold great promise for achieving outstanding plant performance and reducing tuning times. Which algorithm to choose in different scenarios, however, remains an open question. Here we present a comparative study using a routine task in a real particle accelerator as an example, showing that RLO generally outperforms BO, but is not always the best choice. Based on the study's results, we provide a clear set of criteria to guide the choice of algorithm for a given tuning task. These can ease the adoption of learning-based autonomous tuning solutions to the operation of complex real-world plants, ultimately improving the availability and pushing the limits of operability of these facilities, thereby enabling scientific and engineering advancements.
Submitted 6 June, 2023;
originally announced June 2023.
-
Simulation-based Inference for Model Parameterization on Analog Neuromorphic Hardware
Authors:
Jakob Kaiser,
Raphael Stock,
Eric Müller,
Johannes Schemmel,
Sebastian Schmitt
Abstract:
The BrainScaleS-2 (BSS-2) system implements physical models of neurons as well as synapses and aims for an energy-efficient and fast emulation of biological neurons. When replicating neuroscientific experiments on BSS-2, a major challenge is finding suitable model parameters. This study investigates the suitability of the sequential neural posterior estimation (SNPE) algorithm for parameterizing a multi-compartmental neuron model emulated on the BSS-2 analog neuromorphic system. The SNPE algorithm belongs to the class of simulation-based inference methods and estimates the posterior distribution of the model parameters; access to the posterior allows quantifying the confidence in parameter estimations and unveiling correlations between model parameters. For our multi-compartmental model, we show that the approximated posterior agrees with experimental observations and that the identified correlation between parameters fits theoretical expectations. Furthermore, as already shown for software simulations, the algorithm can deal with high-dimensional observations and parameter spaces when the data is generated by emulations on BSS-2. These results suggest that the SNPE algorithm is a promising approach for automating the parameterization and the analysis of complex models, especially when dealing with characteristic properties of analog neuromorphic substrates, such as trial-to-trial variations or limited parameter ranges.
Submitted 20 November, 2023; v1 submitted 28 March, 2023;
originally announced March 2023.
-
From Clean Room to Machine Room: Commissioning of the First-Generation BrainScaleS Wafer-Scale Neuromorphic System
Authors:
Hartmut Schmidt,
José Montes,
Andreas Grübl,
Maurice Güttler,
Dan Husmann,
Joscha Ilmberger,
Jakob Kaiser,
Christian Mauch,
Eric Müller,
Lars Sterzenbach,
Johannes Schemmel,
Sebastian Schmitt
Abstract:
The first generation of BrainScaleS, also referred to as BrainScaleS-1, is a neuromorphic system for emulating large-scale networks of spiking neurons. Following a "physical modeling" principle, its VLSI circuits are designed to emulate the dynamics of biological examples: analog circuits implement neurons and synapses with time constants that arise from their electronic components' intrinsic properties. It operates in continuous time, with dynamics typically matching an acceleration factor of 10000 compared to the biological regime. A fault-tolerant design allows it to achieve wafer-scale integration despite unavoidable analog variability and component failures. In this paper, we present the commissioning process of a BrainScaleS-1 wafer module, providing a short description of the system's physical components, illustrating the steps taken during its assembly and the measures taken to operate it. Furthermore, we reflect on the system's development process and the lessons learned to conclude with a demonstration of its functionality by emulating a wafer-scale synchronous firing chain, the largest spiking network emulation run with analog components and individual synapses to date.
Submitted 22 March, 2023;
originally announced March 2023.
-
MotorFactory: A Blender Add-on for Large Dataset Generation of Small Electric Motors
Authors:
Chengzhi Wu,
Kanran Zhou,
Jan-Philipp Kaiser,
Norbert Mitschke,
Jan-Felix Klein,
Julius Pfrommer,
Jürgen Beyerer,
Gisela Lanza,
Michael Heizmann,
Kai Furmans
Abstract:
To enable automatic disassembly of different product types with uncertain conditions and degrees of wear in remanufacturing, agile production systems that can adapt dynamically to changing requirements are needed. Machine learning algorithms can be employed due to their generalization capabilities of learning from various types and variants of products. However, in reality, datasets with a diversity of samples that can be used to train models are difficult to obtain in the initial period. This may cause poor performance when the system tries to adapt to new unseen input data in the future. In order to generate large datasets for different learning purposes, in our project, we present a Blender add-on named MotorFactory to generate customized mesh models of various motor instances. MotorFactory allows the creation of mesh models which, complemented with additional add-ons, can be further used to create synthetic RGB images, depth images, normal images, segmentation ground truth masks, and 3D point cloud datasets with point-wise semantic labels. The created synthetic datasets may be used for various tasks including motor type classification, object detection for decentralized material transfer tasks, part segmentation for disassembly and handling tasks, or even reinforcement learning-based robotics control or view-planning.
Submitted 11 January, 2023;
originally announced January 2023.
-
Library transfer between distinct Laser-Induced Breakdown Spectroscopy systems with shared standards
Authors:
J. Vrábel,
E. Képeš,
P. Nedělník,
J. Buday,
J. Cempírek,
P. Pořízka,
J. Kaiser
Abstract:
The mutual incompatibility of distinct spectroscopic systems is among the most limiting factors in Laser-Induced Breakdown Spectroscopy (LIBS). The cost related to setting up a new LIBS system is increased, as extensive calibration is required. Solving the problem would enable inter-laboratory reference measurements and shared spectral libraries, which are fundamental for other spectroscopic techniques. In this work, we study a simplified version of this challenge where LIBS systems differ only in the spectrometers and collection optics used but share all other parts of the apparatus, and collect spectra simultaneously from the same plasma plume. Extensive datasets measured as hyperspectral images of heterogeneous specimens are used to train machine learning models that can transfer spectra between systems. The transfer is realized by a pipeline that consists of a variational autoencoder (VAE) and a fully-connected artificial neural network (ANN). In the first step, we obtain a latent representation of the spectra which were measured on the Primary system (by using the VAE). In the second step, we map spectra from the Secondary system to corresponding locations in the latent space (by the ANN). Finally, Secondary system spectra are reconstructed from the latent space to the space of the Primary system. The transfer is evaluated by several figures of merit (Euclidean and cosine distances, both spatially resolved; k-means clustering of transferred spectra). The methodology is compared to several baseline approaches.
Submitted 23 September, 2022; v1 submitted 31 August, 2022;
originally announced September 2022.
-
A Scalable Approach to Modeling on Accelerated Neuromorphic Hardware
Authors:
Eric Müller,
Elias Arnold,
Oliver Breitwieser,
Milena Czierlinski,
Arne Emmel,
Jakob Kaiser,
Christian Mauch,
Sebastian Schmitt,
Philipp Spilger,
Raphael Stock,
Yannik Stradmann,
Johannes Weis,
Andreas Baumbach,
Sebastian Billaudelle,
Benjamin Cramer,
Falk Ebert,
Julian Göltz,
Joscha Ilmberger,
Vitali Karasenko,
Mitja Kleider,
Aron Leibfried,
Christian Pehle,
Johannes Schemmel
Abstract:
Neuromorphic systems open up opportunities to enlarge the explorative space for computational research. However, it is often challenging to unite efficiency and usability. This work presents the software aspects of this endeavor for the BrainScaleS-2 system, a hybrid accelerated neuromorphic hardware architecture based on physical modeling. We introduce key aspects of the BrainScaleS-2 Operating System: experiment workflow, API layering, software design, and platform operation. We present use cases to discuss and derive requirements for the software and showcase the implementation. The focus lies on novel system and software features such as multi-compartmental neurons, fast re-configuration for hardware-in-the-loop training, applications for the embedded processors, the non-spiking operation mode, interactive platform access, and sustainable hardware/software co-development. Finally, we discuss further developments in terms of hardware scale-up, system usability and efficiency.
Submitted 21 March, 2022;
originally announced March 2022.
-
The BrainScaleS-2 accelerated neuromorphic system with hybrid plasticity
Authors:
Christian Pehle,
Sebastian Billaudelle,
Benjamin Cramer,
Jakob Kaiser,
Korbinian Schreiber,
Yannik Stradmann,
Johannes Weis,
Aron Leibfried,
Eric Müller,
Johannes Schemmel
Abstract:
Since the beginning of information processing by electronic components, the nervous system has served as a metaphor for the organization of computational primitives. Brain-inspired computing today encompasses a class of approaches ranging from using novel nano-devices for computation to research into large-scale neuromorphic architectures, such as TrueNorth, SpiNNaker, BrainScaleS, Tianjic, and Loihi. While implementation details differ, spiking neural networks - sometimes referred to as the third generation of neural networks - are the common abstraction used to model computation with such systems. Here we describe the second generation of the BrainScaleS neuromorphic architecture, emphasizing applications enabled by this architecture. It combines a custom analog accelerator core supporting the accelerated physical emulation of bio-inspired spiking neural network primitives with a tightly coupled digital processor and a digital event-routing network.
Submitted 3 February, 2022; v1 submitted 26 January, 2022;
originally announced January 2022.
-
Benchmarking a Probabilistic Coprocessor
Authors:
Jan Kaiser,
Risi Jaiswal,
Behtash Behin-Aein,
Supriyo Datta
Abstract:
Computation in the past decades has been driven by deterministic computers based on classical deterministic bits. Recently, alternative computing paradigms and domain-based computing like quantum computing and probabilistic computing have gained traction. While quantum computers based on q-bits utilize quantum effects to advance computation, probabilistic computers based on probabilistic (p-)bits are naturally suited to solve problems that require large amounts of random numbers utilized in Monte Carlo and Markov Chain Monte Carlo algorithms. These Monte Carlo techniques are used to solve important problems in the fields of optimization, numerical integration or sampling from probability distributions. However, to efficiently implement Monte Carlo algorithms the generation of random numbers is crucial. In this paper, we present and benchmark a probabilistic coprocessor based on p-bits that are naturally suited to solve these problems. We present multiple examples and project that a nanomagnetic implementation of our probabilistic coprocessor can outperform classical CPU and GPU implementations by multiple orders of magnitude.
Submitted 29 September, 2021;
originally announced September 2021.
-
Probabilistic computing with p-bits
Authors:
Jan Kaiser,
Supriyo Datta
Abstract:
Digital computers store information in the form of bits that can take on one of two values 0 and 1, while quantum computers are based on qubits that are described by a complex wavefunction, whose squared magnitude gives the probability of measuring either 0 or 1. Here, we make the case for a probabilistic computer based on p-bits, which take on values 0 and 1 with controlled probabilities and can be implemented with specialized compact energy-efficient hardware. We propose a generic architecture for such p-computers and emulate systems with thousands of p-bits to show that they can significantly accelerate randomized algorithms used in a wide variety of applications including but not limited to Bayesian networks, optimization, Ising models, and quantum Monte Carlo.
Submitted 12 October, 2021; v1 submitted 22 August, 2021;
originally announced August 2021.
-
Magnetic phase diagram, magnetoelastic coupling and Grüneisen scaling in CoTiO$_3$
Authors:
M. Hoffmann,
K. Dey,
J. Werner,
R. Bag,
J. Kaiser,
H. Wadepohl,
Y. Skourski,
M. Abdel-Hafiez,
S. Singh,
R. Klingeler
Abstract:
High-quality single crystals of CoTiO$_3$ are grown and used to elucidate in detail structural and magnetostructural effects by means of high-resolution capacitance dilatometry studies in fields up to 15 T which are complemented by specific heat and magnetization measurements. In addition, we refine the single-crystal structure of the ilmenite ($R\bar{3}$) phase. At the antiferromagnetic ordering temperature $T_\mathrm{N}$, a pronounced $λ$-shaped anomaly in the thermal expansion coefficients signals shrinking of both the $c$ and $b$ axes, indicating strong magnetoelastic coupling with uniaxial pressure along $c$ yielding a six times larger effect on $T_\mathrm{N}$ than the pressure applied in-plane. The hydrostatic pressure dependency derived by means of Grüneisen analysis amounts to $\partial T_\mathrm{N}/ \partial p\approx 2.7(4)$~K/GPa. The high-field magnetization studies in static and pulsed magnetic fields up to 60~T along with high-field thermal expansion measurements facilitate constructing the complete anisotropic magnetic phase diagram of CoTiO$_3$. While the results confirm the presence of significant magnetodielectric coupling, our data show that magnetism drives the observed structural, dielectric, and magnetic changes both in the short-range ordered regime well above $T_\mathrm{N}$ and in the long-range magnetically ordered phase.
Submitted 16 July, 2021;
originally announced July 2021.
-
Hardware-aware $in \ situ$ Boltzmann machine learning using stochastic magnetic tunnel junctions
Authors:
Jan Kaiser,
William A. Borders,
Kerem Y. Camsari,
Shunsuke Fukami,
Hideo Ohno,
Supriyo Datta
Abstract:
One of the big challenges of current electronics is the design and implementation of hardware neural networks that perform fast and energy-efficient machine learning. Spintronics is a promising catalyst for this field with the capabilities of nanosecond operation and compatibility with existing microelectronics. Considering large-scale, viable neuromorphic systems however, variability of device properties is a serious concern. In this paper, we show an autonomously operating circuit that performs hardware-aware machine learning utilizing probabilistic neurons built with stochastic magnetic tunnel junctions. We show that $in \ situ$ learning of weights and biases in a Boltzmann machine can counter device-to-device variations and learn the probability distribution of meaningful operations such as a full adder. This scalable autonomously operating learning circuit using spintronics-based neurons could be especially of interest for standalone artificial-intelligence devices capable of fast and efficient learning at the edge.
Submitted 13 January, 2022; v1 submitted 9 February, 2021;
originally announced February 2021.
-
Effects of Pre- and Post-Processing on type-based Embeddings in Lexical Semantic Change Detection
Authors:
Jens Kaiser,
Sinan Kurtyigit,
Serge Kotchourko,
Dominik Schlechtweg
Abstract:
Lexical semantic change detection is a new and innovative research field. The optimal fine-tuning of models including pre- and post-processing is largely unclear. We optimize existing models by (i) pre-training on large corpora and refining on diachronic target corpora tackling the notorious small data problem, and (ii) applying post-processing transformations that have been shown to improve performance on synchronic tasks. Our results provide a guide for the application and optimization of lexical semantic change detection models across various learning scenarios.
Submitted 26 January, 2021; v1 submitted 22 January, 2021;
originally announced January 2021.
-
OP-IMS @ DIACR-Ita: Back to the Roots: SGNS+OP+CD still rocks Semantic Change Detection
Authors:
Jens Kaiser,
Dominik Schlechtweg,
Sabine Schulte im Walde
Abstract:
We present the results of our participation in the DIACR-Ita shared task on lexical semantic change detection for Italian. We exploit one of the earliest and most influential semantic change detection models based on Skip-Gram with Negative Sampling, Orthogonal Procrustes alignment and Cosine Distance and obtain the winning submission of the shared task with near-perfect accuracy of .94. Our results once more indicate that, within the present task setup in lexical semantic change detection, the traditional type-based approaches yield excellent performance.
Submitted 6 November, 2020;
originally announced November 2020.
-
Demonstration of nanosecond operation in stochastic magnetic tunnel junctions
Authors:
Christopher Safranski,
Jan Kaiser,
Philip Trouilloud,
Pouya Hashemi,
Guohan Hu,
Jonathan Z Sun
Abstract:
Magnetic tunnel junctions operating in the superparamagnetic regime are promising devices in the field of probabilistic computing, which is suitable for applications like high-dimensional optimization or sampling problems. Further, random number generation is of interest in the field of cryptography. For such applications, a device's uncorrelated fluctuation time-scale can determine the effective system speed. It has been theoretically proposed that a magnetic tunnel junction designed to have only easy-plane anisotropy provides fluctuation rates determined by its easy-plane anisotropy field, and can perform on nanosecond or faster time-scales as measured by its magnetoresistance's autocorrelation in time. Here we provide experimental evidence of nanosecond-scale fluctuations in a circular-shaped easy-plane magnetic tunnel junction, consistent with finite-temperature coupled macrospin simulation results and prior theoretical expectations. We further assess the degree of stochasticity of such a signal.
Submitted 27 October, 2020;
originally announced October 2020.
-
Riesz bases of port-Hamiltonian systems
Authors:
Birgit Jacob,
Julia T. Kaiser,
Hans Zwart
Abstract:
The location of the spectrum and the Riesz basis property of well-posed homogeneous infinite-dimensional linear port-Hamiltonian systems on a 1D spatial domain are studied. It is shown that the Riesz basis property is equivalent to the fact that the system operator generates a strongly continuous group. Moreover, in this situation the spectrum consists of eigenvalues only, located in a strip parallel to the imaginary axis, and they can be decomposed into finitely many sets each having a uniform gap.
Submitted 17 September, 2020;
originally announced September 2020.
-
IMS at SemEval-2020 Task 1: How low can you go? Dimensionality in Lexical Semantic Change Detection
Authors:
Jens Kaiser,
Dominik Schlechtweg,
Sean Papay,
Sabine Schulte im Walde
Abstract:
We present the results of our system for SemEval-2020 Task 1 that exploits a commonly used lexical semantic change detection model based on Skip-Gram with Negative Sampling. Our system focuses on Vector Initialization (VI) alignment, compares VI to the currently top-ranking models for Subtask 2 and demonstrates that these can be outperformed if we optimize VI dimensionality. We demonstrate that differences in performance can largely be attributed to model-specific sources of noise, and we reveal a strong relationship between dimensionality and frequency-induced noise in VI alignment. Our results suggest that lexical semantic change models integrating vector space alignment should pay more attention to the role of the dimensionality parameter.
Submitted 7 August, 2020;
originally announced August 2020.
-
The Operating System of the Neuromorphic BrainScaleS-1 System
Authors:
Eric Müller,
Sebastian Schmitt,
Christian Mauch,
Sebastian Billaudelle,
Andreas Grübl,
Maurice Güttler,
Dan Husmann,
Joscha Ilmberger,
Sebastian Jeltsch,
Jakob Kaiser,
Johann Klähn,
Mitja Kleider,
Christoph Koke,
José Montes,
Paul Müller,
Johannes Partzsch,
Felix Passenberg,
Hartmut Schmidt,
Bernhard Vogginger,
Jonas Weidner,
Christian Mayr,
Johannes Schemmel
Abstract:
BrainScaleS-1 is a wafer-scale mixed-signal accelerated neuromorphic system targeted for research in the fields of computational neuroscience and beyond-von-Neumann computing. The BrainScaleS Operating System (BrainScaleS OS) is a software stack giving users the possibility to emulate networks described in the high-level network description language PyNN with minimal knowledge of the system. At the same time, expert usage is facilitated by allowing users to hook into the system at any depth of the stack. We present operation and development methodologies implemented for the BrainScaleS-1 neuromorphic architecture and walk through the individual components of BrainScaleS OS constituting the software stack for BrainScaleS-1 platform operation.
Submitted 2 February, 2022; v1 submitted 30 March, 2020;
originally announced March 2020.
-
Hardware Design for Autonomous Bayesian Networks
Authors:
Rafatul Faria,
Jan Kaiser,
Kerem Y. Camsari,
Supriyo Datta
Abstract:
Directed acyclic graphs or Bayesian networks that are popular in many AI-related sectors for probabilistic inference and causal reasoning can be mapped to probabilistic circuits built out of probabilistic bits (p-bits), analogous to binary stochastic neurons of stochastic artificial neural networks. In order to satisfy standard statistical results, individual p-bits not only need to be updated sequentially, but also in order from the parent to the child nodes, necessitating the use of sequencers in software implementations. In this article, we first use SPICE simulations to show that an autonomous hardware Bayesian network can operate correctly without any clocks or sequencers, but only if the individual p-bits are appropriately designed. We then present a simple behavioral model of the autonomous hardware illustrating the essential characteristics needed for correct sequencer-free operation. This model is also benchmarked against SPICE simulations and can be used to simulate large-scale networks. Our results could be useful in the design of hardware accelerators that use energy-efficient building blocks suited for low-level implementations of Bayesian networks. The autonomous massively parallel operation of our proposed stochastic hardware has biological relevance since neural dynamics in the brain are also stochastic and autonomous by nature.
Submitted 3 July, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Embodied Synaptic Plasticity with Online Reinforcement learning
Authors:
Jacques Kaiser,
Michael Hoff,
Andreas Konle,
J. Camilo Vasquez Tieck,
David Kappel,
Daniel Reichard,
Anand Subramoney,
Robert Legenstein,
Arne Roennau,
Wolfgang Maass,
Rudiger Dillmann
Abstract:
The endeavor to understand the brain involves multiple collaborating research fields. Classically, synaptic plasticity rules derived by theoretical neuroscientists are evaluated in isolation on pattern classification tasks. This contrasts with the biological brain, whose purpose is to control a body in closed-loop. This paper contributes to bringing the fields of computational neuroscience and robotics closer together by integrating open-source software components from these two fields. The resulting framework allows evaluating the validity of biologically plausible plasticity models in closed-loop robotics environments. We demonstrate this framework to evaluate Synaptic Plasticity with Online REinforcement learning (SPORE), a reward-learning rule based on synaptic sampling, on two visuomotor tasks: reaching and lane following. We show that SPORE is capable of learning to perform policies within the course of simulated hours for both tasks. Provisional parameter explorations indicate that the learning rate and the temperature driving the stochastic processes that govern synaptic learning dynamics need to be regulated for performance improvements to be retained. We conclude by discussing the recent deep reinforcement learning techniques which would be beneficial to increase the functionality of SPORE on visuomotor tasks.
Submitted 3 March, 2020;
originally announced March 2020.
-
Probabilistic Circuits for Autonomous Learning: A simulation study
Authors:
Jan Kaiser,
Rafatul Faria,
Kerem Y. Camsari,
Supriyo Datta
Abstract:
Modern machine learning is based on powerful algorithms running on digital computing platforms and there is great interest in accelerating the learning process and making it more energy efficient. In this paper we present a fully autonomous probabilistic circuit for fast and efficient learning that makes no use of digital computing. Specifically we use SPICE simulations to demonstrate a clockless autonomous circuit where the required synaptic weights are read out in the form of analog voltages. Such autonomous circuits could be particularly of interest as standalone learning devices in the context of mobile and edge computing.
Submitted 25 February, 2020; v1 submitted 14 October, 2019;
originally announced October 2019.
-
Embodied Neuromorphic Vision with Event-Driven Random Backpropagation
Authors:
Jacques Kaiser,
Alexander Friedrich,
J. Camilo Vasquez Tieck,
Daniel Reichard,
Arne Roennau,
Emre Neftci,
Rüdiger Dillmann
Abstract:
Spike-based communication between biological neurons is sparse and unreliable. This enables the brain to process visual information from the eyes efficiently. Taking inspiration from biology, artificial spiking neural networks coupled with silicon retinas attempt to model these computations. Recent findings in machine learning allowed the derivation of a family of powerful synaptic plasticity rules approximating backpropagation for spiking networks. Are these rules capable of processing real-world visual sensory data? In this paper, we evaluate the performance of Event-Driven Random Back-Propagation (eRBP) at learning representations from event streams provided by a Dynamic Vision Sensor (DVS). First, we show that eRBP matches state-of-the-art performance on the DvsGesture dataset with the addition of a simple covert attention mechanism. By remapping visual receptive fields relative to the center of the motion, this attention mechanism provides translation invariance at low computational cost compared to convolutions. Second, we successfully integrate eRBP in a real robotic setup, where a robotic arm grasps objects according to detected visual affordances. In this setup, visual information is actively sensed by a DVS mounted on a robotic head performing microsaccadic eye movements. We show that our method classifies affordances within 100 ms after microsaccade onset, which is comparable to human performance reported in behavioral studies. Our results suggest that advances in neuromorphic technology and plasticity rules enable the development of autonomous robots operating at high speed and low energy consumption.
Submitted 6 May, 2019; v1 submitted 9 April, 2019;
originally announced April 2019.
-
On exact controllability of infinite-dimensional linear port-Hamiltonian systems
Authors:
Birgit Jacob,
Julia T. Kaiser
Abstract:
Infinite-dimensional linear port-Hamiltonian systems on a one-dimensional spatial domain with full boundary control and without internal damping are studied. This class of systems includes models of beams and waves as well as the transport equation and networks of nonhomogeneous transmission lines. The main result shows that well-posed port-Hamiltonian systems, with state space $L^2((0,1);\mathbb C^n)$ and input space $\mathbb C^n$, are exactly controllable.
Submitted 15 May, 2019; v1 submitted 9 March, 2019;
originally announced March 2019.
-
Subnanosecond Fluctuations in Low-Barrier Nanomagnets
Authors:
Jan Kaiser,
Avinash Rustagi,
Kerem Y. Camsari,
Jonathan Z. Sun,
Supriyo Datta,
Pramey Upadhyaya
Abstract:
Fast magnetic fluctuations due to thermal torques have useful technological functionality ranging from cryptography to probabilistic computing. The characteristic time of fluctuations in typical uniaxial anisotropy magnets studied so far is bounded from below by the well-known energy relaxation mechanism. This time scales as $α^{-1}$, where $α$ parameterizes the strength of dissipative processes. Here, we theoretically analyze the fluctuating dynamics in easy-plane and antiferromagnetically coupled nanomagnets. We find that in such magnets, the dynamics are strongly influenced by fluctuating intrinsic fields, which give rise to an additional dephasing-type mechanism for washing out correlations. In particular, we establish two time scales for characterizing fluctuations: (i) the average time for a nanomagnet to reverse, which for the experimentally relevant regime of low damping is governed primarily by dephasing and becomes independent of $α$, and (ii) the time scale for memory loss of a single nanomagnet, which scales as $α^{-1/3}$ and is governed by a combination of energy dissipation and dephasing mechanisms. For typical experimentally accessible values of intrinsic fields, the resultant thermal-fluctuation rate is increased by multiple orders of magnitude when compared with the bound set solely by the energy relaxation mechanism in uniaxial magnets. This could lead to higher operating speeds of emerging devices exploiting magnetic fluctuations.
Submitted 25 November, 2019; v1 submitted 8 February, 2019;
originally announced February 2019.
-
Synaptic Plasticity Dynamics for Deep Continuous Local Learning (DECOLLE)
Authors:
Jacques Kaiser,
Hesham Mostafa,
Emre Neftci
Abstract:
A growing body of work underlines striking similarities between biological neural networks and recurrent, binary neural networks. A relatively smaller body of work, however, discusses similarities between learning dynamics employed in deep artificial neural networks and synaptic plasticity in spiking neural networks. The challenge preventing this is largely caused by the discrepancy between the dynamical properties of synaptic plasticity and the requirements for gradient backpropagation. Learning algorithms that approximate gradient backpropagation using locally synthesized gradients can overcome this challenge. Here, we show that synthetic gradients enable the derivation of Deep Continuous Local Learning (DECOLLE) in spiking neural networks. DECOLLE is capable of learning deep spatio-temporal representations from spikes relying solely on local information. Synaptic plasticity rules are derived systematically from user-defined cost functions and neural dynamics by leveraging existing autodifferentiation methods of machine learning frameworks. We benchmark our approach on the MNIST and the event-based neuromorphic DvsGesture dataset, on which DECOLLE performs comparably to the state-of-the-art. DECOLLE networks provide continuously learning machines that are relevant to biology and supportive of event-based, low-power computer vision architectures matching the accuracies of conventional computers on tasks where temporal precision and speed are essential.
Submitted 20 May, 2020; v1 submitted 26 November, 2018;
originally announced November 2018.
-
Well-posedness of networks for 1-D hyperbolic partial differential equations
Authors:
Birgit Jacob,
Julia T. Kaiser
Abstract:
We consider the well-posedness of a class of hyperbolic partial differential equations on a one dimensional spatial domain. This class includes in particular infinite-dimensional networks of transport, wave and beam equations, or even combinations of these. Equivalent conditions for contraction semigroup generation are derived. In the first part we assume a finite interval and in the second part, we consider partial differential equations on the semi-axis.
Submitted 14 November, 2018; v1 submitted 1 September, 2017;
originally announced September 2017.
-
Random Spatial Networks: Small Worlds without Clustering, Traveling Waves, and Hop-and-Spread Disease Dynamics
Authors:
John Lang,
Hans De Sterck,
Jamieson L. Kaiser,
Joel C. Miller
Abstract:
Random network models play a prominent role in modeling, analyzing and understanding complex phenomena on real-life networks. However, a key property of networks is often neglected: many real-world networks exhibit spatial structure, the tendency of a node to select neighbors with a probability depending on physical distance. Here, we introduce a class of random spatial networks (RSNs) which generalizes many existing random network models but adds spatial structure. In these networks, nodes are placed randomly in space and joined in edges with a probability depending on their distance and their individual expected degrees, in a manner that crucially remains analytically tractable. We use this network class to propose a new generalization of small-world networks, where the average shortest path lengths in the graph are small, as in classical Watts-Strogatz small-world networks, but with close spatial proximity of nodes that are neighbors in the network playing the role of large clustering. Small-world effects are demonstrated on these spatial small-world networks without clustering. We are able to derive partial integro-differential equations governing susceptible-infectious-recovered disease spreading through an RSN, and we demonstrate the existence of traveling wave solutions. If the distance kernel governing edge placement decays slower than exponential, the population-scale dynamics are dominated by long-range hops followed by local spread of traveling waves. This provides a theoretical modeling framework for recent observations of how epidemics like Ebola evolve in modern connected societies, with long-range connections seeding new focal points from which the epidemic locally spreads in a wavelike manner.
Submitted 4 February, 2017;
originally announced February 2017.
-
Thermal Transport at the Nanoscale - A Fourier's Law vs. Phonon Boltzmann Equation Study
Authors:
Jan Kaiser,
Tianli Feng,
Jesse Maassen,
Xufeng Wang,
Xiulin Ruan,
Mark Lundstrom
Abstract:
Steady-state thermal transport in nanostructures with dimensions comparable to the phonon mean-free-path is examined. Both the case of contacts at different temperatures with no internal heat generation and contacts at the same temperature with internal heat generation are considered. Fourier's Law results are compared to finite volume method solutions of the phonon Boltzmann equation in the gray approximation. When the boundary conditions are properly specified, results obtained using Fourier's Law without modifying the bulk thermal conductivity are in essentially exact quantitative agreement with the phonon Boltzmann equation in the ballistic and diffusive limits. The errors between these two limits are examined in this paper. For the four cases examined, the error in the apparent thermal conductivity as deduced from a correct application of Fourier's Law is less than 6%. We also find that the Fourier's Law results presented here are nearly identical to those obtained from a widely-used ballistic-diffusive approach, but analytically much simpler. Although limited to steady-state conditions with spatial variations in one dimension and to a gray model of phonon transport, the results show that Fourier's Law can be used for linear transport from the diffusive to the ballistic limit. The results also contribute to an understanding of how heat transport at the nanoscale can be understood in terms of the conceptual framework that has been established for electron transport at the nanoscale.
Submitted 9 December, 2016; v1 submitted 3 August, 2016;
originally announced August 2016.
-
On the relationship between the Collatz conjecture and Mersenne prime numbers
Authors:
Jonas Kaiser
Abstract:
The purpose of this study is to show how to get a necessary criterion for prime numbers with the help of special matrices. My special interest lies in the empirical research of these matrices and their patterns, structures and symmetries. The matrices in turn depend on an expansion of the Collatz algorithm 3n+1.
Submitted 7 August, 2016; v1 submitted 1 August, 2016;
originally announced August 2016.
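For orientation, the standard Collatz map that the abstract refers to as "3n+1" can be sketched as follows. This is only the classical map; the paper's expansion of it and the matrices it builds are not described in the abstract and are not reproduced here. The function names `collatz_step` and `collatz_orbit` are illustrative choices, not taken from the paper.

```python
def collatz_step(n: int) -> int:
    """One step of the classical Collatz map: n/2 for even n, 3n+1 for odd n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_orbit(n: int) -> list[int]:
    """Return every value visited when iterating from n until 1 is reached.

    Termination for all n is exactly what the Collatz conjecture asserts;
    it is unproven in general, although it holds for every n tested so far.
    """
    orbit = [n]
    while n != 1:
        n = collatz_step(n)
        orbit.append(n)
    return orbit
```

For example, `collatz_orbit(6)` visits 6, 3, 10, 5, 16, 8, 4, 2, 1.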
-
Reasoning in complex environments with the SelectScript declarative language
Authors:
André Dietrich,
Sebastian Zug,
Luigi Nardi,
Jörg Kaiser
Abstract:
SelectScript is an extendable, adaptable, and declarative domain-specific language aimed at information retrieval from simulation environments and robotic world models in an SQL-like manner. In this work we have extended the language in two directions. First, we have implemented hierarchical queries; second, we improve efficiency enabling manual design space exploration on different "search" strategies. We demonstrate the applicability of such extensions in two application problems; the basic language concepts are explained by solving the classical problem of the Towers of Hanoi and then a common path planning problem in a complex 3D environment is implemented.
Submitted 4 October, 2015; v1 submitted 17 August, 2015;
originally announced August 2015.
-
Efficient Informative Sensing using Multiple Robots
Authors:
Amarjeet Singh,
Andreas Krause,
Carlos Guestrin,
William J. Kaiser
Abstract:
The need for efficient monitoring of spatio-temporal dynamics in large environmental applications, such as the water quality monitoring in rivers and lakes, motivates the use of robotic sensors in order to achieve sufficient spatial coverage. Typically, these robots have bounded resources, such as limited battery or limited amounts of time to obtain measurements. Thus, careful coordination of their paths is required in order to maximize the amount of information collected, while respecting the resource constraints. In this paper, we present an efficient approach for near-optimally solving the NP-hard optimization problem of planning such informative paths. In particular, we first develop eSIP (efficient Single-robot Informative Path planning), an approximation algorithm for optimizing the path of a single robot. Hereby, we use a Gaussian Process to model the underlying phenomenon, and use the mutual information between the visited locations and remainder of the space to quantify the amount of information collected. We prove that the mutual information collected using paths obtained by using eSIP is close to the information obtained by an optimal solution. We then provide a general technique, sequential allocation, which can be used to extend any single robot planning algorithm, such as eSIP, for the multi-robot problem. This procedure approximately generalizes any guarantees for the single-robot problem to the multi-robot case. We extensively evaluate the effectiveness of our approach on several experiments performed in-field for two important environmental sensing applications, lake and river monitoring, and simulation experiments performed using several real world sensor network data sets.
Submitted 15 January, 2014;
originally announced January 2014.
-
Algorithm for Missing Values Imputation in Categorical Data with Use of Association Rules
Authors:
Jiří Kaiser
Abstract:
This paper presents an algorithm for missing-value imputation in categorical data. The algorithm is based on association rules and is presented in three variants. Experiments show better accuracy of missing-value imputation using the algorithm than using the most common attribute value.
Submitted 8 November, 2012;
originally announced November 2012.
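The baseline named in the abstract, filling a missing value with the most common value of its attribute, can be sketched as below. This is a minimal illustration of that baseline only, not the paper's association-rule algorithm, whose details the abstract does not give. The `MISSING` marker, the row-of-lists data layout, and the function name `impute_most_common` are assumptions made for the example.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing categorical value


def impute_most_common(rows):
    """Fill each missing cell with the most frequent observed value of its column."""
    n_cols = len(rows[0])
    modes = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not MISSING]
        # A column with no observed values cannot be imputed and stays missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    # Build new rows so the input data is left unmodified.
    return [
        [modes[col] if value is MISSING else value for col, value in enumerate(row)]
        for row in rows
    ]


data = [
    ["red", "small"],
    ["red", MISSING],
    [MISSING, "large"],
    ["blue", "large"],
]
filled = impute_most_common(data)
```

Here the missing colour is filled with "red" and the missing size with "large", the most frequent observed values in their columns.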
-
Volumes of chain links
Authors:
James Kaiser,
Jessica S. Purcell,
Clint Rollins
Abstract:
Agol has conjectured that minimally twisted n-chain links are the smallest volume hyperbolic manifolds with n cusps, for n at most 10. In his thesis, Venzke mentions that these cannot be smallest volume for n at least 11, but does not provide a proof. In this paper, we give a proof of Venzke's statement for a number of cases. For n at least 60 we use a formula from work of Futer, Kalfagianni, and Purcell to obtain a lower bound for volume. The proof for n between 12 and 25 inclusive uses a rigorous computer computation that follows methods of Moser and Milley. Finally, we prove that the n-chain link with 2m or 2m+1 half-twists cannot be the minimal volume hyperbolic manifold with n cusps, provided n is at least 60 or |m| is at least 8, and we give computational data indicating this remains true for smaller n and |m|.
Submitted 7 June, 2012; v1 submitted 14 July, 2011;
originally announced July 2011.
-
Phase readout of a charge qubit capacitively coupled to an open double quantum dot
Authors:
C. Kreisbeck,
F. J. Kaiser,
S. Kohler
Abstract:
We study the dynamics of a charge qubit that is capacitively coupled to an open double quantum dot. Depending on the qubit state, the transport through the open quantum dot may be resonant or off-resonant, such that the qubit affects the current through the open double dot. We relate the initial qubit state to the magnitude of an emerging transient current peak. The relation between these quantities enables the readout of not only the charge but also the phase of the qubit.
Submitted 8 March, 2010; v1 submitted 17 April, 2009;
originally announced April 2009.
-
Molecular electronics in junctions with energy disorder
Authors:
Franz J. Kaiser,
Peter Hänggi,
Sigmund Kohler
Abstract:
We investigate transport through molecular wires whose energy levels are affected by environmental fluctuations. We assume that the relevant fluctuations are so slow that they, within a tight-binding description, can be described by disordered, Gaussian distributed onsite energies. For long wires, we find that the corresponding current distribution can be rather broad even for a small energy variance. Moreover, we analyse with a Floquet master equation the interplay of laser excitations and static disorder. Then the disorder leads to spatial asymmetries such that the laser driving can induce a ratchet current.
Submitted 1 April, 2008; v1 submitted 31 March, 2008;
originally announced March 2008.