
EndoDINO: A Foundation Model for GI Endoscopy

Authors: Patrick Dermyer, Angad Kalra, Matt Schwartz

Abstract: In this work, we present EndoDINO, a foundation model for GI endoscopy tasks that achieves strong generalizability by pre-training on a well-curated image dataset sampled from the largest known GI endoscopy video dataset in the literature. Specifically, we pre-trained ViT models with 1B, 307M, and 86M parameters using datasets ranging from 100K to 10M curated images. Using EndoDINO as a frozen feature encoder, we achieved state-of-the-art performance in anatomical landmark classification, polyp segmentation, and Mayo endoscopic scoring (MES) for ulcerative colitis with only simple decoder heads.

Submitted 8 January, 2025; originally announced January 2025.

arXiv:2407.09392

Open-Canopy: Towards Very High Resolution Forest Monitoring

Authors: Fajwel Fogel, Yohann Perron, Nikola Besic, Laurent Saint-André, Agnès Pellissier-Tanon, Martin Schwartz, Thomas Boudras, Ibrahim Fayad, Alexandre d'Aspremont, Loic Landrieu, Philippe Ciais

Abstract: Estimating canopy height and its changes at meter resolution from satellite imagery is a significant challenge in computer vision with critical environmental applications. However, the lack of open-access datasets at this resolution hinders the reproducibility and evaluation of models. We introduce Open-Canopy, the first open-access, country-scale benchmark for very high-resolution (1.5 m) canopy height estimation, covering over 87,000 km$^2$ across France with 1.5 m resolution satellite imagery and aerial LiDAR data. Additionally, we present Open-Canopy-$Δ$, a benchmark for canopy height change detection between images from different years at tree level, a challenging task for current computer vision models. We evaluate state-of-the-art architectures on these benchmarks, highlighting significant challenges and opportunities for improvement. Our datasets and code are publicly available at https://github.com/fajwel/Open-Canopy.

Submitted 11 December, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

Comments: 25 pages, 6+6 figures, Submitted to CVPR25

arXiv:2304.11487

doi 10.1016/j.rse.2023.113945

Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heights

Authors: Ibrahim Fayad, Philippe Ciais, Martin Schwartz, Jean-Pierre Wigneron, Nicolas Baghdadi, Aurélien de Truchis, Alexandre d'Aspremont, Frederic Frappart, Sassan Saatchi, Agnes Pellissier-Tanon, Hassan Bazzi

Abstract: Accurate and timely monitoring of forest canopy heights is critical for assessing forest dynamics, biodiversity, carbon sequestration as well as forest degradation and deforestation. Recent advances in deep learning techniques, coupled with the vast amount of spaceborne remote sensing data, offer an unprecedented opportunity to map canopy height at high spatial and temporal resolutions. Current techniques for wall-to-wall canopy height mapping correlate remotely sensed 2D information from optical and radar sensors to the vertical structure of trees using LiDAR measurements. While studies using deep learning algorithms have shown promising performances for the accurate mapping of canopy heights, they have limitations due to the type of architectures and loss functions employed. Moreover, mapping canopy heights over tropical forests remains poorly studied, and the accurate height estimation of tall canopies is a challenge due to signal saturation from optical and radar sensors, persistent cloud cover, and sometimes the limited penetration capabilities of LiDARs. Here, we map heights at 10 m resolution across the diverse landscape of Ghana with a new vision transformer (ViT) model optimized concurrently with a classification (discrete) and a regression (continuous) loss function. This model achieves better accuracy than previously used convolution-based approaches (ConvNets) optimized with only a continuous loss function. The ViT model results show that our proposed discrete/continuous loss significantly increases the sensitivity for very tall trees (i.e., > 35 m), for which other approaches show saturation effects. The height maps generated by the ViT also have better ground sampling distance and better sensitivity to sparse vegetation in comparison to a convolutional model. Our ViT model has an RMSE of 3.12 m in comparison to a reference dataset while the ConvNet model has an RMSE of 4.3 m.

Submitted 22 April, 2023; originally announced April 2023.
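
The abstract above describes a model optimized concurrently with a classification (discrete) and a regression (continuous) loss. The listing does not give the actual formulation, so the following is only a minimal sketch of one way such a combined loss could look, assuming cross-entropy over height bins plus an absolute-error term. The function names, the bin edges, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import bisect
import math


def height_bin(height, bin_edges):
    """Return the index of the bin containing `height`, clamped to the valid range."""
    n_bins = len(bin_edges) - 1
    idx = bisect.bisect_right(bin_edges, height) - 1
    return min(max(idx, 0), n_bins - 1)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_height_loss(logits, predicted_height, true_height, bin_edges, weight=1.0):
    """Cross-entropy over height bins plus a weighted absolute error on the height.

    This is an assumed formulation for illustration, not the authors' loss.
    """
    probs = softmax(logits)
    classification = -math.log(probs[height_bin(true_height, bin_edges)])
    regression = abs(predicted_height - true_height)
    return classification + weight * regression


# Hypothetical 10 m wide bins from 0 m to 50 m (five classes).
edges = [0, 10, 20, 30, 40, 50]
loss = combined_height_loss([0.0, 0.0, 0.0, 5.0, 0.0], 36.0, 37.0, edges, weight=0.5)
print(round(loss, 4))
```

In this sketch the discrete term rewards putting probability mass on the correct height bin, while the continuous term penalizes the distance between the predicted and reference height in meters.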

arXiv:2212.10265

doi 10.1016/j.jag.2024.103711

High-resolution canopy height map in the Landes forest (France) based on GEDI, Sentinel-1, and Sentinel-2 data with a deep learning approach

Authors: Martin Schwartz, Philippe Ciais, Catherine Ottlé, Aurelien De Truchis, Cedric Vega, Ibrahim Fayad, Martin Brandt, Rasmus Fensholt, Nicolas Baghdadi, François Morneau, David Morin, Dominique Guyon, Sylvia Dayau, Jean-Pierre Wigneron

Abstract: In intensively managed forests in Europe, where forests are divided into stands of small size and may show heterogeneity within stands, a high spatial resolution (10-20 meters) is arguably needed to capture the differences in canopy height. In this work, we developed a deep learning model based on multi-stream remote sensing measurements to create a high-resolution canopy height map over the "Landes de Gascogne" forest in France, a large maritime pine plantation of 13,000 km$^2$ with flat terrain and intensive management. This area is characterized by even-aged and mono-specific stands, of a typical length of a few hundred meters, harvested every 35 to 50 years. Our deep learning U-Net model uses multi-band images from Sentinel-1 and Sentinel-2 with composite time averages as input to predict tree height derived from GEDI waveforms. The evaluation is performed with external validation data from forest inventory plots and a stereo 3D reconstruction model based on Skysat imagery available at specific locations. We trained seven different U-Net models based on a combination of Sentinel-1 and Sentinel-2 bands to evaluate the importance of each instrument in the dominant height retrieval. The model outputs allow us to generate a 10 m resolution canopy height map of the whole "Landes de Gascogne" forest area for 2020 with a mean absolute error of 2.02 m on the test dataset. The best predictions were obtained using all available satellite layers from Sentinel-1 and Sentinel-2, but using only one satellite source also provided good predictions. For all validation datasets in coniferous forests, our model showed better metrics than previous canopy height models available in the same region.

Submitted 20 December, 2022; originally announced December 2022.

Comments: 39 pages, 16 figures + supplementary contents

arXiv:2105.05800

doi 10.1088/2399-7532/ac0060

Linking Physical Objects to Their Digital Twins via Fiducial Markers Designed for Invisibility to Humans

Authors: Mathew Schwartz, Yong Geng, Hakam Agha, Rijeesh Kizhakidathazhath, Danqing Liu, Gabriele Lenzini, Jan PF Lagerwall

Abstract: The ability to label and track physical objects that are assets in digital representations of the world is foundational to many complex systems. Simple, yet powerful methods such as bar- and QR-codes have been highly successful, e.g. in the retail space, but the lack of security, limited information content and impossibility of seamless integration with the environment have prevented a large-scale linking of physical objects to their digital twins. This paper proposes to link digital assets created through BIM with their physical counterparts using fiducial markers with patterns defined by Cholesteric Spherical Reflectors (CSRs), selective retroreflectors produced using liquid crystal self-assembly. The markers leverage the ability of CSRs to encode information that is easily detected and read with computer vision while remaining practically invisible to the human eye. We analyze the potential of a CSR-based infrastructure from the perspective of BIM, critically reviewing the outstanding challenges in applying this new class of functional materials, and we discuss extended opportunities arising in assisting autonomous mobile robots to reliably navigate human-populated environments, as well as in augmented reality.

Submitted 12 May, 2021; originally announced May 2021.

Comments: 30 pages, 8 figures. This paper is a very interdisciplinary topical review on the use of Cholesteric Spherical Reflectors to make fiducial markers (visible to robots but not humans) to link Digital and Physical twins. The authors are from fields including Design, Materials Science, Security and Computer Science, and Physics

arXiv:2008.02251

Fully Automated and Standardized Segmentation of Adipose Tissue Compartments by Deep Learning in Three-dimensional Whole-body MRI of Epidemiological Cohort Studies

Authors: Thomas Küstner, Tobias Hepp, Marc Fischer, Martin Schwartz, Andreas Fritsche, Hans-Ulrich Häring, Konstantin Nikolaou, Fabian Bamberg, Bin Yang, Fritz Schick, Sergios Gatidis, Jürgen Machann

Abstract: Purpose: To enable fast and reliable assessment of subcutaneous and visceral adipose tissue compartments derived from whole-body MRI. Methods: Quantification and localization of different adipose tissue compartments from whole-body MR images is of high interest to examine metabolic conditions. For correct identification and phenotyping of individuals at increased risk for metabolic diseases, a reliable automatic segmentation of adipose tissue into subcutaneous and visceral adipose tissue is required. In this work we propose a 3D convolutional neural network (DCNet) to provide a robust and objective segmentation. In this retrospective study, we collected 1000 cases (66 $\pm$ 13 years; 523 women) from the Tuebingen Family Study and from the German Center for Diabetes research (TUEF/DZD), as well as 300 cases (53 $\pm$ 11 years; 152 women) from the German National Cohort (NAKO) database for model training, validation, and testing with a transfer learning between the cohorts. These datasets had variable imaging sequences, imaging contrasts, receiver coil arrangements, scanners and imaging field strengths. The proposed DCNet was compared against a comparable 3D UNet segmentation in terms of sensitivity, specificity, precision, accuracy, and Dice overlap. Results: Fast (5-7 seconds) and reliable adipose tissue segmentation can be obtained with high Dice overlap (0.94), sensitivity (96.6%), specificity (95.1%), precision (92.1%) and accuracy (98.4%) from 3D whole-body MR datasets (field of view coverage 450x450x2000 mm$^3$). Segmentation masks and adipose tissue profiles are automatically reported back to the referring physician. Conclusion: Automatic adipose tissue segmentation is feasible in 3D whole-body MR data sets and is generalizable to different epidemiological cohort studies with the proposed DCNet.

Submitted 5 August, 2020; originally announced August 2020.

Comments: This manuscript has been accepted for publication in Radiology: Artificial Intelligence (https://pubs.rsna.org/journal/ai), which is published by the Radiological Society of North America (RSNA)
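
The abstract above reports sensitivity, specificity, precision, accuracy, and Dice overlap for a segmentation. As a reading aid, here is a minimal sketch of how these five metrics are conventionally computed from a predicted and a reference binary mask. The function name and the toy masks are hypothetical; this is not the authors' evaluation code, and real 3D volumes would be flattened into voxel lists first.

```python
def segmentation_metrics(predicted, reference):
    """Compute standard overlap metrics for two equal-length binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must contain the same number of voxels")

    # Confusion-matrix counts over all voxels.
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Toy example with eight voxels (1 = adipose tissue, 0 = background).
predicted_mask = [1, 1, 0, 0, 1, 0, 0, 0]
reference_mask = [1, 0, 0, 0, 1, 1, 0, 0]
for name, value in segmentation_metrics(predicted_mask, reference_mask).items():
    print(f"{name}: {value:.3f}")
```

Accuracy can sit well above Dice when most voxels are background, because true negatives count toward accuracy but not toward Dice.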

arXiv:1710.10511

doi 10.1109/TRO.2018.2791600

Online Approximate Optimal Station Keeping of a Marine Craft in the Presence of a Current

Authors: Patrick Walters, Rushikesh Kamalapurkar, Forrest Voight, Eric M. Schwartz, Warren E. Dixon

Abstract: Online approximation of the optimal station keeping strategy for a fully actuated six degrees-of-freedom marine craft subject to an irrotational ocean current is considered. An approximate solution to the optimal control problem is obtained using an adaptive dynamic programming technique. The hydrodynamic drift dynamics of the dynamic model are assumed to be unknown; therefore, a concurrent learning-based system identifier is developed to identify the unknown model parameters. The identified model is used to implement an adaptive model-based reinforcement learning technique to estimate the unknown value function. The developed policy guarantees uniformly ultimately bounded convergence of the vehicle to the desired station and uniformly ultimately bounded convergence of the approximated policies to the optimal policies without the requirement of persistence of excitation. The developed strategy is validated using an autonomous underwater vehicle, where the three degrees-of-freedom in the horizontal plane are regulated. The experiments are conducted in a second-magnitude spring located in central Florida.

Submitted 28 October, 2017; originally announced October 2017.

Showing 1–7 of 7 results for author: Schwartz, M