-
BIMgent: Towards Autonomous Building Modeling via Computer-use Agents
Authors:
Zihan Deng,
Changyu Du,
Stavros Nousias,
André Borrmann
Abstract:
Existing computer-use agents primarily focus on general-purpose desktop automation tasks, with limited exploration of their application in highly specialized domains. In particular, the 3D building modeling process in the Architecture, Engineering, and Construction (AEC) sector involves open-ended design tasks and complex interaction patterns within Building Information Modeling (BIM) authoring software, which has yet to be thoroughly addressed by current studies. In this paper, we propose BIMgent, an agentic framework powered by multimodal large language models (LLMs), designed to enable autonomous building model authoring via graphical user interface (GUI) operations. BIMgent automates the architectural building modeling process, including multimodal input for conceptual design, planning of software-specific workflows, and efficient execution of the authoring GUI actions. We evaluate BIMgent on real-world building modeling tasks, including both text-based conceptual design generation and reconstruction from existing building design. The design quality achieved by BIMgent was found to be reasonable. Its operations achieved a 32% success rate, whereas all baseline models failed to complete the tasks (0% success rate). Results demonstrate that BIMgent effectively reduces manual workload while preserving design intent, highlighting its potential for practical deployment in real-world architectural modeling scenarios. Project page: https://tumcms.github.io/BIMgent.github.io/
Submitted 30 June, 2025; v1 submitted 8 June, 2025;
originally announced June 2025.
-
Predictive Modeling: BIM Command Recommendation Based on Large-scale Usage Logs
Authors:
Changyu Du,
Zihan Deng,
Stavros Nousias,
André Borrmann
Abstract:
The adoption of Building Information Modeling (BIM) and model-based design within the Architecture, Engineering, and Construction (AEC) industry has been hindered by the perception that using BIM authoring tools demands more effort than conventional 2D drafting. To enhance design efficiency, this paper proposes a BIM command recommendation framework that predicts the optimal next actions in real-time based on users' historical interactions. We propose a comprehensive filtering and enhancement method for large-scale raw BIM log data and introduce a novel command recommendation model. Our model builds upon the state-of-the-art Transformer backbones originally developed for large language models (LLMs), incorporating a custom feature fusion module, dedicated loss function, and targeted learning strategy. In a case study, the proposed method is applied to over 32 billion rows of real-world log data collected globally from the BIM authoring software Vectorworks. Experimental results demonstrate that our method can learn universal and generalizable modeling patterns from anonymous user interaction sequences across different countries, disciplines, and projects. When generating recommendations for the next command, our approach achieves a Recall@10 of approximately 84%.
Submitted 23 February, 2025;
originally announced April 2025.
-
VectorGraphNET: Graph Attention Networks for Accurate Segmentation of Complex Technical Drawings
Authors:
Andrea Carrara,
Stavros Nousias,
André Borrmann
Abstract:
This paper introduces a new approach to extract and analyze vector data from technical drawings in PDF format. Our method involves converting PDF files into SVG format and creating a feature-rich graph representation, which captures the relationships between vector entities using geometrical information. We then apply a graph attention transformer with hierarchical label definition to achieve accurate line-level segmentation. Our approach is evaluated on two datasets, including the public FloorplanCAD dataset, which achieves state-of-the-art results on weighted F1 score, surpassing existing methods. The proposed vector-based method offers a more scalable solution for large-scale technical drawing analysis compared to vision-based approaches, while also requiring significantly less GPU power than current state-of-the-art vector-based techniques. Moreover, it demonstrates improved performance in terms of the weighted F1 (wF1) score on the semantic segmentation task. Our results demonstrate the effectiveness of our approach in extracting meaningful information from technical drawings, enabling new applications, and improving existing workflows in the AEC industry. Potential applications of our approach include automated building information modeling (BIM) and construction planning, which could significantly impact the efficiency and productivity of the industry.
Submitted 2 October, 2024;
originally announced October 2024.
-
Text2BIM: Generating Building Models Using a Large Language Model-based Multi-Agent Framework
Authors:
Changyu Du,
Sebastian Esser,
Stavros Nousias,
André Borrmann
Abstract:
The conventional BIM authoring process typically requires designers to master complex and tedious modeling commands in order to materialize their design intentions within BIM authoring tools. This additional cognitive burden complicates the design process and hinders the adoption of BIM and model-based design in the AEC (Architecture, Engineering, and Construction) industry. To facilitate the expression of design intentions more intuitively, we propose Text2BIM, an LLM-based multi-agent framework that can generate 3D building models from natural language instructions. This framework orchestrates multiple LLM agents to collaborate and reason, transforming textual user input into imperative code that invokes the BIM authoring tool's APIs, thereby generating editable BIM models with internal layouts, external envelopes, and semantic information directly in the software. Furthermore, a rule-based model checker is introduced into the agentic workflow, utilizing predefined domain knowledge to guide the LLM agents in resolving issues within the generated models and iteratively improving model quality. Extensive experiments were conducted to compare and analyze the performance of three different LLMs under the proposed framework. The evaluation results demonstrate that our approach can effectively generate high-quality, structurally rational building models that are aligned with the abstract concepts specified by user input. Finally, an interactive software prototype was developed to integrate the framework into the BIM authoring software Vectorworks, showcasing the potential of modeling by chatting.
Submitted 15 August, 2024;
originally announced August 2024.
-
Towards a copilot in BIM authoring tool using a large language model-based agent for intelligent human-machine interaction
Authors:
Changyu Du,
Stavros Nousias,
André Borrmann
Abstract:
Facing increasingly complex BIM authoring software and the accompanying expensive learning costs, designers often seek to interact with the software in a more intelligent and lightweight manner. They aim to automate modeling workflows, avoiding obstacles and difficulties caused by software usage, thereby focusing on the design process itself. To address this issue, we proposed an LLM-based autonomous agent framework that can function as a copilot in the BIM authoring tool, answering software usage questions, understanding the user's design intentions from natural language, and autonomously executing modeling tasks by invoking the appropriate tools. In a case study based on the BIM authoring software Vectorworks, we implemented a software prototype to integrate the proposed framework seamlessly into the BIM authoring scenario. We evaluated the planning and reasoning capabilities of different LLMs within this framework when faced with complex instructions. Our work demonstrates the significant potential of LLM-based agents in design automation and intelligent interaction.
Submitted 2 June, 2024;
originally announced June 2024.
-
Towards commands recommender system in BIM authoring tool using transformers
Authors:
Changyu Du,
Zihan Deng,
Stavros Nousias,
André Borrmann
Abstract:
The complexity of BIM software presents significant barriers to the widespread adoption of BIM and model-based design within the Architecture, Engineering, and Construction (AEC) sector. End-users frequently express concerns regarding the additional effort required to create a sufficiently detailed BIM model when compared with conventional 2D drafting. This study explores the potential of sequential recommendation systems to accelerate the BIM modeling process. By treating BIM software commands as recommendable items, we introduce a novel end-to-end approach that predicts the next-best command based on user historical interactions. Our framework extensively preprocesses real-world, large-scale BIM log data, utilizes the transformer architectures from the latest large language models as the backbone network, and ultimately results in a prototype that provides real-time command suggestions within the BIM authoring tool Vectorworks. Subsequent experiments validated that our proposed model outperforms the previous study, demonstrating the immense potential of the recommendation system in enhancing design efficiency.
Submitted 2 June, 2024;
originally announced June 2024.
-
Coordinating robotized construction using advanced robotic simulation: The case of collaborative brick wall assembly
Authors:
Mohammad Reza Kolani,
Stavros Nousias,
André Borrmann
Abstract:
Robotic systems are gaining popularity in the construction industry due to their build time, precision, and efficiency. In this paper, we introduce a system that allows the coordination of multiple manipulator robots for construction activities. As a case study, we chose robotic brick wall assembly. By utilizing a multi-robot system where arm manipulators collaborate with each other, the entirety of a potentially long wall can be assembled simultaneously. However, the reduction of overall bricklaying time depends on minimizing the time required for each individual manipulator. In this paper, we execute the simulation with various placements of the material and the robot's base, as well as different robot configurations, to determine the optimal position of the robot and material and the best configuration for the robot. The simulation results provide users with insights into how to find the best placement of robots and raw materials for brick wall assembly.
Submitted 27 May, 2024;
originally announced May 2024.
-
Transient Neural Radiance Fields for Lidar View Synthesis and 3D Reconstruction
Authors:
Anagh Malik,
Parsa Mirdehghan,
Sotiris Nousias,
Kiriakos N. Kutulakos,
David B. Lindell
Abstract:
Neural radiance fields (NeRFs) have become a ubiquitous tool for modeling scene appearance and geometry from multiview imagery. Recent work has also begun to explore how to use additional supervision from lidar or depth sensor measurements in the NeRF framework. However, previous lidar-supervised NeRFs focus on rendering conventional camera imagery and use lidar-derived point cloud data as auxiliary supervision; thus, they fail to incorporate the underlying image formation model of the lidar. Here, we propose a novel method for rendering transient NeRFs that take as input the raw, time-resolved photon count histograms measured by a single-photon lidar system, and we seek to render such histograms from novel views. Different from conventional NeRFs, the approach relies on a time-resolved version of the volume rendering equation to render the lidar measurements and capture transient light transport phenomena at picosecond timescales. We evaluate our method on a first-of-its-kind dataset of simulated and captured transient multiview scans from a prototype single-photon lidar. Overall, our work brings NeRFs to a new dimension of imaging at transient timescales, newly enabling rendering of transient imagery from novel views. Additionally, we show that our approach recovers improved geometry and conventional appearance compared to point cloud-based supervision when training on few input viewpoints. Transient NeRFs may be especially useful for applications which seek to simulate raw lidar measurements for downstream tasks in autonomous driving, robotics, and remote sensing.
Submitted 5 April, 2024; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Towards predicting Pedestrian Evacuation Time and Density from Floorplans using a Vision Transformer
Authors:
Patrick Berggold,
Stavros Nousias,
Rohit K. Dubey,
André Borrmann
Abstract:
Conventional pedestrian simulators are inevitable tools in the design process of a building, as they enable project engineers to prevent overcrowding situations and plan escape routes for evacuation. However, simulation runtime and the multiple cumbersome steps in generating simulation results are potential bottlenecks during the building design process. Data-driven approaches have demonstrated their capability to outperform conventional methods in speed while delivering similar or even better results across many disciplines. In this work, we present a deep learning-based approach based on a Vision Transformer to predict density heatmaps over time and total evacuation time from a given floorplan. Specifically, due to limited availability of public datasets, we implement a parametric data generation pipeline including a conventional simulator. This enables us to build a large synthetic dataset that we use to train our architecture. Furthermore, we seamlessly integrate our model into a BIM-authoring tool to generate simulation results instantly and automatically.
Submitted 27 June, 2023;
originally announced June 2023.
-
Patient-specific modelling, simulation and real-time processing for respiratory diseases
Authors:
Stavros Nousias
Abstract:
Asthma is a common chronic disease of the respiratory system causing significant disability and societal burden. It affects more than 300 million people worldwide, while more than 100 million people will likely have asthma by 2025. The cost of asthma varies greatly from nation to nation. The mean yearly cost can be estimated at 1900 EUR in Europe and $3100 in the United States. Managing asthma involves controlling symptoms, preventing exacerbations, and maintaining lung function. Improved asthma control reduces the risk of exacerbations and lung function impairment while reducing the direct costs of asthma care and the indirect costs associated with reduced productivity. Understanding the complex dynamics of the pulmonary system and the lung's response to disease is fundamental to the advancement of asthma treatment. Computational models of the respiratory system seek to provide a theoretical framework to understand the interaction between structure and function. Their application can improve pulmonary medicine through a patient-specific approach to medicinal methodologies, optimizing delivery given the personalized geometry and personalized ventilation patterns. A three-fold objective is addressed within this dissertation. The first part refers to the comprehension of pulmonary pathophysiology and the mechanics of asthma, and subsequently of constrictive pulmonary conditions in general. The second part refers to the design and implementation of tools that facilitate personalized medicine to improve delivery and effectiveness. Finally, the third part refers to the self-management of the condition, meaning that medical personnel and patients have access to tools and methods that allow the first party to easily track the course of the condition and the second party, i.e., the patient, to easily self-manage it, alleviating the significant burden on the health system.
Submitted 8 September, 2022; v1 submitted 3 July, 2022;
originally announced July 2022.
-
AI-enabled Sound Pattern Recognition on Asthma Medication Adherence: Evaluation with the RDA Benchmark Suite
Authors:
Nikos D. Fakotakis,
Stavros Nousias,
Gerasimos Arvanitis,
Evangelia I. Zacharaki,
Konstantinos Moustakas
Abstract:
Asthma is a common, usually long-term respiratory disease with a negative impact on global society and economy. Treatment involves using medical devices (inhalers) that distribute medication to the airways, and its efficiency depends on the precision of the inhalation technique. There is a clinical need for objective methods to assess the inhalation technique during clinical consultation. Integrated health monitoring systems, equipped with sensors, enable the recognition of drug actuation, embedded with sound signal detection, analysis and identification, from intelligent structures, that could provide powerful tools for reliable content management. Health monitoring systems equipped with sensors, embedded with sound signal detection, enable the recognition of drug actuation and could be used for effective audio content analysis. This paper revisits sound pattern recognition with machine learning techniques for asthma medication adherence assessment and presents the Respiratory and Drug Actuation (RDA) Suite (https://gitlab.com/vvr/monitoring-medication-adherence/rda-benchmark) for benchmarking and further research. The RDA Suite includes a set of tools for audio processing, feature extraction and classification procedures and is provided along with a dataset consisting of respiratory and drug actuation sounds. The classification models in RDA are implemented based on conventional and advanced machine learning and deep network architectures. This study provides a comparative evaluation of the implemented approaches, examines potential improvements, and discusses challenges and future trends.
Submitted 16 April, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Fast mesh denoising with data driven normal filtering using deep variational autoencoders
Authors:
Stavros Nousias,
Gerasimos Arvanitis,
Aris S. Lalos,
Konstantinos Moustakas
Abstract:
Recent advances in 3D scanning technology have enabled the deployment of 3D models in various industrial applications like digital twins, remote inspection and reverse engineering. Despite their evolving performance, 3D scanners still introduce noise and artifacts in the acquired dense models. In this work, we propose a fast and robust denoising method for dense 3D scanned industrial models. The proposed approach employs conditional variational autoencoders to effectively filter face normals. Training and inference are performed in a sliding patch setup, reducing the size of the required training data and execution times. We conducted extensive evaluation studies using 3D scanned and CAD models. The results verify plausible denoising outcomes, demonstrating similar or higher reconstruction accuracy compared to other state-of-the-art approaches. Specifically, for 3D models with more than 1e4 faces, the presented pipeline is twice as fast as methods with equivalent reconstruction error.
Submitted 24 November, 2021;
originally announced November 2021.
-
Accelerating deep neural networks for efficient scene understanding in automotive cyber-physical systems
Authors:
Stavros Nousias,
Erion-Vasilis Pikoulis,
Christos Mavrokefalidis,
Aris S. Lalos
Abstract:
Automotive Cyber-Physical Systems (ACPS) have attracted a significant amount of interest in the past few decades, while one of the most critical operations in these systems is the perception of the environment. Deep learning and, especially, the use of Deep Neural Networks (DNNs) provides impressive results in analyzing and understanding complex and dynamic scenes from visual data. The prediction horizons for those perception systems are very short and inference must often be performed in real time, stressing the need of transforming the original large pre-trained networks into new smaller models, by utilizing Model Compression and Acceleration (MCA) techniques. Our goal in this work is to investigate best practices for appropriately applying novel weight sharing techniques, optimizing the available variables and the training procedures towards the significant acceleration of widely adopted DNNs. Extensive evaluation studies carried out using various state-of-the-art DNN models in object detection and tracking experiments, provide details about the type of errors that manifest after the application of weight sharing techniques, resulting in significant acceleration gains with negligible accuracy losses.
Submitted 19 July, 2021;
originally announced July 2021.
-
Efficient automated U-Net based tree crown delineation using UAV multi-spectral imagery on embedded devices
Authors:
Kostas Blekos,
Stavros Nousias,
Aris S Lalos
Abstract:
Delineation approaches provide significant benefits to various domains, including agriculture and environmental and natural disaster monitoring. Most of the work in the literature utilizes traditional segmentation methods that require a large amount of computational and storage resources. Deep learning has transformed computer vision and dramatically improved machine translation, though it requires massive datasets for training and significant resources for inference. More importantly, energy-efficient embedded vision hardware delivering real-time and robust performance is crucial in the aforementioned applications. In this work, we propose a U-Net based tree delineation method, which is effectively trained using multi-spectral imagery but can then delineate single-spectrum images. The deep architecture, which also performs localization, i.e., a class label corresponds to each pixel, has been successfully used to allow training with a small set of segmented images. The ground truth data were generated using traditional image denoising and segmentation approaches. To be able to execute the proposed DNN efficiently on embedded platforms designed for deep learning approaches, we employ traditional model compression and acceleration methods. Extensive evaluation studies using data collected from UAVs equipped with multi-spectral cameras demonstrate the effectiveness of the proposed methods in terms of delineation accuracy and execution efficiency.
Submitted 16 July, 2021;
originally announced July 2021.
-
Empowering cyberphysical systems of systems with intelligence
Authors:
Stavros Nousias,
Nikos Piperigkos,
Gerasimos Arvanitis,
Apostolos Fournaris,
Aris S. Lalos,
Konstantinos Moustakas
Abstract:
Cyber-physical systems have been in a transition phase from individual systems to collectives of systems that collaborate in order to achieve a highly complex cause, realizing a system-of-systems approach. The automotive domain has been making a transition to the system-of-systems approach, aiming to provide a series of emergent functionalities such as traffic management, collaborative car fleet management, or large-scale automotive adaptation to the physical environment, thus providing significant environmental benefits (e.g., air pollution reduction) and achieving significant societal impact. Similarly, large infrastructure domains are evolving into global, highly integrated cyber-physical systems of systems (CPSoS) covering all parts of the value chain. In practice, there are significant challenges in CPSoS applicability and usability to be addressed; even a small CPSoS such as a car consists of several subsystems. Decentralization of CPSoS assigns tasks to individual CPSs within the system of systems. CPSoSs are heterogeneous systems. They comprise various autonomous CPSs, each having unique performance capabilities, criticality levels, priorities, and pursued goals. All CPSs must also harmoniously pursue system-based achievements and collaborate in order to make system-of-systems-based decisions and implement the CPSoS functionality. This survey provides a comprehensive review of current best practices in connected cyber-physical systems. The basis of our investigation is a dual-layer architecture encompassing a perception layer and a behavioral layer. Perception algorithms with respect to scene understanding (object detection and tracking, pose estimation), localization, mapping, and path planning are thoroughly investigated. The behavioral part focuses on decision making and human-in-the-loop control.
Submitted 5 July, 2021;
originally announced July 2021.
-
Enhancing an eco-driving gamification platform through wearable and vehicle sensor data integration
Authors:
Christos Tselios,
Stavros Nousias,
Dimitris Bitzas,
Dimitrios Amaxilatis,
Orestis Akrivopoulos,
Aris S. Lalos,
Konstantinos Moustakas,
Ioannis Chatzigiannakis
Abstract:
As road transportation has been identified as a major contributor to environmental pollution, motivating individuals to adopt a more eco-friendly driving style could have a substantial ecological as well as financial benefit. With gamification being an effective tool for guiding targeted behavioural changes, the development of realistic frameworks delivering a high-end user experience becomes a topic of active research. This paper presents a series of enhancements introduced to an eco-driving gamification platform through the integration of additional wearable and vehicle-oriented sensing data sources, leading to a much more realistic evaluation of the context of a driving session.
Submitted 19 October, 2020;
originally announced October 2020.
-
Part-to-whole Registration of Histology and MRI using Shape Elements
Authors:
Jonas Pichat,
Juan Eugenio Iglesias,
Sotiris Nousias,
Tarek Yousry,
Sebastien Ourselin,
Marc Modat
Abstract:
Image registration between histology and magnetic resonance imaging (MRI) is a challenging task due to differences in structural content and contrast. Too thick and wide specimens cannot be processed all at once and must be cut into smaller pieces. This dramatically increases the complexity of the problem, since each piece should be individually and manually pre-aligned. To the best of our knowledge, no automatic method can reliably locate such piece of tissue within its respective whole in the MRI slice, and align it without any prior information. We propose here a novel automatic approach to the joint problem of multimodal registration between histology and MRI, when only a fraction of tissue is available from histology. The approach relies on the representation of images using their level lines so as to reach contrast invariance. Shape elements obtained via the extraction of bitangents are encoded in a projective-invariant manner, which permits the identification of common pieces of curves between two images. We evaluated the approach on human brain histology and compared resulting alignments against manually annotated ground truths. Considering the complexity of the brain folding patterns, preliminary results are promising and suggest the use of characteristic and meaningful shape elements for improved robustness and efficiency.
Submitted 27 August, 2017;
originally announced August 2017.