-
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Authors:
Gheorghe Comanici,
Eric Bieber,
Mike Schaekermann,
Ice Pasupat,
Noveen Sachdeva,
Inderjit Dhillon,
Marcel Blistein,
Ori Ram,
Dan Zhang,
Evan Rosen,
Luke Marris,
Sam Petulla,
Colin Gaffney,
Asaf Aharoni,
Nathan Lintz,
Tiago Cardal Pais,
Henrik Jacobsson,
Idan Szpektor,
Nan-Jiang Jiang,
Krishna Haridasan,
Ahmed Omran,
Nikunj Saunshi,
Dara Bahri,
Gaurav Mishra,
Eric Chu, et al. (3278 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 2.X model family: Gemini 2.5 Pro and Gemini 2.5 Flash, as well as our earlier Gemini 2.0 Flash and Flash-Lite models. Gemini 2.5 Pro is our most capable model yet, achieving SoTA performance on frontier coding and reasoning benchmarks. In addition to its incredible coding and reasoning skills, Gemini 2.5 Pro is a thinking model that excels at multimodal understanding and it is now able to process up to 3 hours of video content. Its unique combination of long context, multimodal and reasoning capabilities can be combined to unlock new agentic workflows. Gemini 2.5 Flash provides excellent reasoning abilities at a fraction of the compute and latency requirements and Gemini 2.0 Flash and Flash-Lite provide high performance at low latency and cost. Taken together, the Gemini 2.X model generation spans the full Pareto frontier of model capability vs cost, allowing users to explore the boundaries of what is possible with complex agentic problem solving.
Submitted 7 July, 2025;
originally announced July 2025.
-
Sliced Online Model Checking for Optimizing the Beam Scheduling Problem in Robotic Radiation Therapy
Authors:
Lars Beckers,
Stefan Gerlach,
Ole Lübke,
Alexander Schlaefer,
Sibylle Schupp
Abstract:
In robotic radiation therapy, high-energy photon beams from different directions are directed at a target within the patient. Target motion can be tracked by robotic ultrasound and then compensated by synchronous beam motion. However, moving the beams may result in beams passing through the ultrasound transducer or the robot carrying it. While this can be avoided by pausing the beam delivery, the treatment time would increase. Typically, the beams are delivered in an order that minimizes the robot motion and thereby the overall treatment time. However, this order can be changed, i.e., instead of pausing beams, other feasible beams could be delivered.
We address this problem of dynamically ordering the beams by applying a model checking paradigm to select feasible beams. Since breathing patterns are complex and change rapidly, any offline model would be too imprecise. Thus, model checking must be conducted online, predicting the patient's current breathing pattern for a short amount of time and checking which beams can be delivered safely. Monitoring the treatment delivery online provides the option to reschedule beams dynamically in order to avoid pausing and hence to reduce treatment time.
Because human breathing patterns are complex and may change rapidly, we need a model that can be verified quickly, and we therefore approximate the breathing motion by a superposition of sine curves. Further, we simplify the 3D breathing motion into separate 1D models. We compensate for the simplification by adding noise inside the model itself. In turn, we synchronize between the multiple models representing the different spatial directions, the treatment simulation, and the corresponding verification queries.
Our preliminary results show a 16.02% to 37.21% mean improvement in idle time compared to a static beam schedule, depending on an additional safety margin. Note that an additional safety margin around the ultrasound robot can decrease idle times but also compromises plan quality by limiting the range of available beam directions. In contrast, the approach using online model checking maintains the plan quality. Further, we compare to a naive machine learning approach that does not achieve its goals while being harder to reason about.
Submitted 27 March, 2024;
originally announced March 2024.
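The breathing model described in this abstract (a superposition of sine curves per spatial direction, with noise added inside the model) can be illustrated with a minimal sketch. This is not the authors' model checker: it simply samples a predicted 1D displacement over a short horizon and tests it against a blocked interval. All names (`breathing_displacement`, `beam_is_safe`, `blocked_interval`) and all numeric values are hypothetical.

```python
import math
import random


def breathing_displacement(t, components, noise_std=0.0, rng=None):
    """1D displacement at time t: sum of sine curves plus optional Gaussian noise.

    components is a list of (amplitude, frequency_hz, phase_rad) tuples.
    These parameters are illustrative assumptions, not values from the paper.
    """
    value = sum(a * math.sin(2.0 * math.pi * f * t + p) for a, f, p in components)
    if noise_std > 0.0:
        rng = rng or random
        value += rng.gauss(0.0, noise_std)
    return value


def beam_is_safe(components, t_start, horizon, blocked_interval, step=0.05):
    """Return True if the noise-free prediction never enters blocked_interval.

    This is plain sampling over [t_start, t_start + horizon], used here as a
    stand-in for a real verification query.
    """
    lo, hi = blocked_interval
    n_steps = int(round(horizon / step))
    for i in range(n_steps + 1):
        x = breathing_displacement(t_start + i * step, components)
        if lo <= x <= hi:
            return False
    return True


# Hypothetical single-component breathing cycle: 10 mm amplitude, 4 s period.
components = [(10.0, 0.25, 0.0)]
print(beam_is_safe(components, 0.0, 1.0, (12.0, 15.0)))
print(beam_is_safe(components, 0.0, 1.0, (8.0, 12.0)))
```

In a setup like the one the abstract describes, one such 1D model per spatial direction would be evaluated and the results combined; the sketch covers only a single direction.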
-
Collaborative Robotic Biopsy with Trajectory Guidance and Needle Tip Force Feedback
Authors:
Robin Mieling,
Maximilian Neidhardt,
Sarah Latus,
Carolin Stapper,
Stefan Gerlach,
Inga Kniep,
Axel Heinemann,
Benjamin Ondruschka,
Alexander Schlaefer
Abstract:
The diagnostic value of biopsies is highly dependent on the placement of needles. Robotic trajectory guidance has been shown to improve needle positioning, but feedback for real-time navigation is limited. Haptic display of needle tip forces can provide rich feedback for needle navigation by enabling localization of tissue structures along the insertion path. We present a collaborative robotic biopsy system that combines trajectory guidance with kinesthetic feedback to assist the physician in needle placement. The robot aligns the needle while the insertion is performed in collaboration with a medical expert who controls the needle position on site. We present a needle design that senses forces at the needle tip based on optical coherence tomography and machine learning for real-time data processing. Our robotic setup allows operators to sense deep tissue interfaces independent of frictional forces to improve needle placement relative to a desired target structure. We first evaluate needle tip force sensing in ex-vivo tissue in a phantom study. We characterize the tip forces during insertions with constant velocity and demonstrate the ability to detect tissue interfaces in a collaborative user study. Participants are able to detect 91% of ex-vivo tissue interfaces based on needle tip force feedback alone. Finally, we demonstrate that even smaller, deep target structures can be accurately sampled by performing post-mortem in situ biopsies of the pancreas.
Submitted 12 July, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Ultrasound Shear Wave Elasticity Imaging with Spatio-Temporal Deep Learning
Authors:
Maximilian Neidhardt,
Marcel Bengs,
Sarah Latus,
Stefan Gerlach,
Christian J. Cyron,
Johanna Sprenger,
Alexander Schlaefer
Abstract:
Ultrasound shear wave elasticity imaging is a valuable tool for quantifying the elastic properties of tissue. Typically, the shear wave velocity is derived and mapped to an elasticity value, which neglects information such as the shape of the propagating shear wave or push sequence characteristics. We present 3D spatio-temporal CNNs for fast local elasticity estimation from ultrasound data. This approach is based on retrieving elastic properties from shear wave propagation within small local regions. A large training data set is acquired with a robot from homogeneous gelatin phantoms ranging from 17.42 kPa to 126.05 kPa with various push locations. The results show that our approach can estimate elastic properties on a pixelwise basis with a mean absolute error (MAE) of 5.01 ± 4.37 kPa. Furthermore, we estimate local elasticity independent of the push location and can even perform accurate estimates inside the push region. For phantoms with embedded inclusions, we report a 53.93% lower MAE (7.50 kPa), and an 85.24% lower MAE (1.64 kPa) on the background, compared to a conventional shear wave method. Overall, our method offers fast local estimations of elastic properties with small spatio-temporal window sizes.
Submitted 28 April, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
-
Robotic Tissue Sampling for Safe Post-mortem Biopsy in Infectious Corpses
Authors:
Maximilian Neidhardt,
Stefan Gerlach,
Robin Mieling,
Max-Heinrich Laves,
Thorben Weiß,
Martin Gromniak,
Antonia Fitzek,
Dustin Möbius,
Inga Kniep,
Alexandra Ron,
Julia Schädler,
Axel Heinemann,
Klaus Püschel,
Benjamin Ondruschka,
Alexander Schlaefer
Abstract:
In pathology and legal medicine, the histopathological and microbiological analysis of tissue samples from the infected deceased provides valuable information for developing treatment strategies during a pandemic such as COVID-19. However, a conventional autopsy carries the risk of disease transmission and may be rejected by relatives. We propose minimally invasive biopsy with robot assistance under CT guidance to minimize the risk of disease transmission during tissue sampling and to improve accuracy. A flexible robotic system for biopsy sampling is presented, which is applied to human corpses placed inside protective body bags. An automatic planning and decision system estimates the optimal insertion point. Heat maps projected onto the segmented skin visualize the distance and angle of insertions and estimate the minimum cost of a puncture while avoiding bone collisions. Further, we test multiple insertion paths concerning feasibility and collisions. A custom end effector is designed for inserting needles and extracting tissue samples under robotic guidance. Our robotic post-mortem biopsy (RPMB) system is evaluated in a study during the COVID-19 pandemic on 20 corpses and 10 tissue targets, 5 of them being infected with SARS-CoV-2. The mean planning time including robot path planning is (5.72 ± 1.67) s. Mean needle placement accuracy is (7.19 ± 4.22) mm.
Submitted 28 January, 2022;
originally announced January 2022.
-
Fooling the Crowd with Deep Learning-based Methods
Authors:
Christian Marzahl,
Marc Aubreville,
Christof A. Bertram,
Stefan Gerlach,
Jennifer Maier,
Jörn Voigt,
Jenny Hill,
Robert Klopfleisch,
Andreas Maier
Abstract:
Modern, state-of-the-art deep learning approaches yield human-like performance in numerous object detection and classification tasks. The foundation for their success is the availability of training datasets of substantial size, which are expensive to create, especially in the field of medical imaging. Recently, crowdsourcing has been applied to create large datasets for a broad range of disciplines. This study aims to explore the challenges and opportunities of crowd-algorithm collaboration for the object detection task of grading cytology whole slide images. We compared the classical crowdsourcing performance of twenty participants with their results from crowd-algorithm collaboration. All participants performed both modes in random order on the same twenty images. Additionally, we introduced artificial systematic flaws into the precomputed annotations to estimate a bias towards accepting precomputed annotations. We gathered 9524 annotations on 800 images from twenty participants organised into four groups according to their level of expertise with cytology. On average, the crowd-algorithm mode improved the participants' classification accuracy by 7%, the mean average precision by 8% and the inter-observer Fleiss' kappa score by 20%, and reduced the time spent by 31%. However, two thirds of the artificially modified false labels were not recognised as such by the contributors. This study shows that crowd-algorithm collaboration is a promising new approach to generate large datasets, provided that a carefully designed setup eliminates potential biases.
Submitted 30 November, 2019;
originally announced December 2019.
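For readers unfamiliar with the inter-observer Fleiss' kappa score mentioned in this abstract, the following is a minimal sketch of the standard definition, assuming every item is rated by the same number of raters. The function name `fleiss_kappa` and the toy rating table are illustrative and are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Assumes the same number of raters
    (at least two) for every item.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # Proportion of all assignments that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement per item, averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy table: 2 items, 2 categories, 3 raters per item (hypothetical data).
print(fleiss_kappa([[3, 0], [0, 3]]))
print(fleiss_kappa([[2, 1], [1, 2]]))
```

Values near 1 indicate agreement well above chance, and values at or below 0 indicate agreement no better than chance.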