Search | arXiv e-print repository

Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography

Authors: Ibrahim Ethem Hamamci, Sezgin Er, Chenyu Wang, Furkan Almas, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Irem Dogan, Omer Faruk Durugol, Weicheng Dai, Murong Xu, Muhammed Furkan Dasdelen, Bastian Wittmann, Tamaz Amiranashvili, Enis Simsar, Mehmet Simsar, Emine Bensu Erdemir, Abdullah Alanbay, Anjany Sekuboyina, Berkan Lafci, Christian Bluethgen, Kayhan Batmanghelich, Mehmet Kemal Ozdemir, Bjoern Menze

Abstract: While computer vision has achieved tremendous success with multimodal encoding and direct textual interaction with images via chat-based large language models, similar advancements in medical imaging AI, particularly in 3D imaging, have been limited due to the scarcity of comprehensive datasets. To address this critical gap, we introduce CT-RATE, the first dataset that pairs 3D medical images with corresponding textual reports. CT-RATE comprises 25,692 non-contrast 3D chest CT scans from 21,304 unique patients. Through various reconstructions, these scans are expanded to 50,188 volumes, totaling over 14.3 million 2D slices. Each scan is accompanied by its corresponding radiology report. Leveraging CT-RATE, we develop CT-CLIP, a CT-focused contrastive language-image pretraining framework designed for broad applications without the need for task-specific training. We demonstrate how CT-CLIP can be used in two tasks: multi-abnormality detection and case retrieval. Remarkably, in multi-abnormality detection, CT-CLIP outperforms state-of-the-art fully supervised models across all key metrics, effectively eliminating the need for manual annotation. In case retrieval, it efficiently retrieves relevant cases using either image or textual queries, thereby enhancing knowledge dissemination. By combining CT-CLIP's vision encoder with a pretrained large language model, we create CT-CHAT, a vision-language foundational chat model for 3D chest CT volumes. Finetuned on over 2.7 million question-answer pairs derived from the CT-RATE dataset, CT-CHAT surpasses other multimodal AI assistants, underscoring the necessity for specialized methods in 3D medical imaging. Collectively, the open-source release of CT-RATE, CT-CLIP, and CT-CHAT not only addresses critical challenges in 3D medical imaging, but also lays the groundwork for future innovations in medical AI and improved patient care.

Submitted 4 April, 2025; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2305.16037 [pdf, other]

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Authors: Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina, Enis Simsar, Alperen Tezcan, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Furkan Almas, Irem Dogan, Muhammed Furkan Dasdelen, Chinmay Prabhakar, Hadrien Reynaud, Sarthak Pati, Christian Bluethgen, Mehmet Kemal Ozdemir, Bjoern Menze

Abstract: GenerateCT, the first approach to generating 3D medical imaging conditioned on free-form medical text prompts, incorporates a text encoder and three key components: a novel causal vision transformer for encoding 3D CT volumes, a text-image transformer for aligning CT and text tokens, and a text-conditional super-resolution diffusion model. Without directly comparable methods in 3D medical imaging, we benchmarked GenerateCT against cutting-edge methods, demonstrating its superiority across all key metrics. Importantly, we evaluated GenerateCT's clinical applications in a multi-abnormality classification task. First, we established a baseline by training a multi-abnormality classifier on our real dataset. To further assess the model's generalization to external data and performance with unseen prompts in a zero-shot scenario, we employed an external set to train the classifier, setting an additional benchmark. We conducted two experiments in which we doubled the training datasets by synthesizing an equal number of volumes for each set using GenerateCT. The first experiment demonstrated an 11% improvement in the AP score when training the classifier jointly on real and generated volumes. The second experiment showed a 7% improvement when training on both real and generated volumes based on unseen prompts. Moreover, GenerateCT enables the scaling of synthetic training datasets to arbitrary sizes. As an example, we generated 100,000 3D CTs, fivefold the number in our real set, and trained the classifier exclusively on these synthetic CTs. Impressively, this classifier surpassed the performance of the one trained on all available real data by a margin of 8%. Last, domain experts evaluated the generated volumes, confirming a high degree of alignment with the text prompt. Access our code, model weights, training data, and generated data at https://github.com/ibrahimethemhamamci/GenerateCT

Submitted 12 July, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2204.03120 [pdf]

AutoCOR: Autonomous Condylar Offset Ratio Calculator on TKA-Postoperative Lateral Knee X-ray

Authors: Gulsade Rabia Cakmak, Ibrahim Ethem Hamamci, Mehmet Kursat Yilmaz, Reda Alhajj, Ibrahim Azboy, Mehmet Kemal Ozdemir

Abstract: The postoperative range of motion is one of the crucial factors indicating the outcome of Total Knee Arthroplasty (TKA). Although the correlation between range of knee flexion and posterior condylar offset (PCO) is controversial in the literature, PCO maintains its importance on evaluation of TKA. Due to limitations on PCO measurement, two novel parameters, posterior condylar offset ratio (PCOR) and anterior condylar offset ratio (ACOR), were introduced. Nowadays, the calculation of PCOR and ACOR on plain lateral radiographs is done manually by orthopedic surgeons. In this regard, we developed a software, AutoCOR, to calculate PCOR and ACOR autonomously, utilizing unsupervised machine learning algorithm (k-means clustering) and digital image processing techniques. The software AutoCOR is capable of detecting the anterior/posterior edge points and anterior/posterior cortex of the femoral shaft on true postoperative lateral conventional radiographs. To test the algorithm, 50 postoperative true lateral radiographs from Istanbul Kosuyolu Medipol Hospital Database were used (32 patients). The mean PCOR was 0.984 (SD 0.235) in software results and 0.972 (SD 0.164) in ground truth values. It shows strong and significant correlation between software and ground truth values (Pearson r=0.845 p<0.0001). The mean ACOR was 0.107 (SD 0.092) in software results and 0.107 (SD 0.070) in ground truth values. It shows moderate and significant correlation between software and ground truth values (Spearman's rs=0.519 p=0.0001412). We suggest that AutoCOR is a useful tool that can be used in clinical practice.

Submitted 6 April, 2022; originally announced April 2022.

Comments: 9 pages

MSC Class: 92C55 (Primary)

arXiv:2202.08448 [pdf]

FM Band Channel Measurements and Modeling

Authors: Omar Ahmadien, Nann Win Moe Thet, Mehmet Kemal Ozdemir

Abstract: As FM coverage is so ubiquitous around the world, several applications can be considered to better exploit this useful band. Thus, it is of significant interest to investigate and characterize channel properties of the FM Band for the potential two-way digital wireless systems. In this paper, we present the results of field measurements at 86 MHz conducted at Gebze, Kocaeli, Turkey. Through the measurements, some of the FM channel characteristics are identified. Measurements are performed for urban, hilly terrain, and rural areas. Our results show that the FM channel expectedly has a large coverage area but at the same time, it possesses large channel excess delays. While most of COST-207 channel power delay profile models are also applicable for the FM Band, for urban environments and hilly terrain environments, channel clusters and excess delays are higher than those of COST-207 models. As 5G systems aim to utilize lower frequency bands for supplementary links, FM Band can be considered as one of the potential bands and the channel models proposed in this study can then be exploited for performance analysis.

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2010.10193 [pdf, other]

Identification of The Number of Wireless Channel Taps Using Deep Neural Networks

Authors: Ahmad M. Jaradat, Khaled Walid Elgammal, Mehmet Kemal Ozdemir, Huseyin Arslan

Abstract: In wireless communication systems, identifying the number of channel taps offers an enhanced estimation of the channel impulse response (CIR). In this work, efficient identification of the number of wireless channel taps has been achieved via deep neural networks (DNNs), where we modified an existing DNN and analyzed its convergence performance using only the transmitted and received signals of a wireless system. The displayed results demonstrate that the adopted DNN accomplishes superior performance in identifying the number of channel taps, as compared to an existing algorithm called Spectrum Weighted Identification of Signal Sources (SWISS).

Submitted 20 October, 2020; originally announced October 2020.

arXiv:1808.06025 [pdf, other]

doi 10.1109/SARNOF.2016.7846771

Optimization of LTE Radio Resource Block Allocation for Maritime Channels

Authors: Amit Kachroo, Mehmet Kemal Ozdemir, Hatice Tekiner-Mogulkoc

Abstract: In this study, we describe the behavior of LTE over the sea and investigate the problem of radio resource block allocation in such SINR limited maritime channels. For simulations of such sea environment, we considered a network scenario of Bosphorus Strait in Istanbul, Turkey with different number of ships ferrying between two ports at a given time. After exploiting the network characteristics, we formulated and solved the radio resource allocation problem by max-min integer linear programming method. The radio resource allocation fairness in terms of Jain's fairness index was computed and it was compared with round robin and opportunistic methods. Results show that the max-min optimization method performs better than the opportunistic and round robin methods. This result in turn reflects that the max-min optimization method gives us the high minimum best throughput as compared to other two methods considering different ship density scenarios in the sea. Also, it was observed that as the number of ships begin to increase in the sea, the max-min method performs significantly better with good fairness as compared to the other two methods.

Submitted 17 August, 2018; originally announced August 2018.

Comments: 6 pages, 10 figures. Published in 2016 IEEE 37th Sarnoff Symposium at Newark, NJ, USA

Showing 1–6 of 6 results for author: Ozdemir, M K