-
HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation
Authors:
Xinzhuo Li,
Adheesh Juvekar,
Xingyou Liu,
Muntasir Wahed,
Kiet A. Nguyen,
Ismini Lourentzou
Abstract:
Recent progress in vision-language segmentation has significantly advanced grounded visual understanding. However, these models often exhibit hallucinations by producing segmentation masks for objects not grounded in the image content or by incorrectly labeling irrelevant regions. Existing evaluation protocols for segmentation hallucination primarily focus on label or textual hallucinations without manipulating the visual context, limiting their capacity to diagnose critical failures. In response, we introduce HalluSegBench, the first benchmark specifically designed to evaluate hallucinations in visual grounding through the lens of counterfactual visual reasoning. Our benchmark consists of a novel dataset of 1340 counterfactual instance pairs spanning 281 unique object classes, and a set of newly introduced metrics that quantify hallucination sensitivity under visually coherent scene edits. Experiments on HalluSegBench with state-of-the-art vision-language segmentation models reveal that vision-driven hallucinations are significantly more prevalent than label-driven ones, with models often persisting in false segmentation, highlighting the need for counterfactual reasoning to diagnose grounding fidelity.
Submitted 28 June, 2025; v1 submitted 26 June, 2025;
originally announced June 2025.
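The abstract above does not define the benchmark's metrics, so the following is only an illustrative sketch of how hallucination under a counterfactual edit could be quantified. It assumes binary masks stored as nested lists of 0/1; the function names `iou` and `persistence_ratio` are hypothetical and are not taken from HalluSegBench.

```python
def iou(pred, target):
    """Intersection-over-union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


def persistence_ratio(pred_factual, pred_counterfactual):
    """Hypothetical score: fraction of the factual predicted area that the
    model still segments after the queried object was edited out."""
    area_factual = sum(sum(row) for row in pred_factual)
    area_counterfactual = sum(sum(row) for row in pred_counterfactual)
    return area_counterfactual / area_factual if area_factual else 0.0


# Toy 2x2 example: the object fills the top row of the factual image.
gt_factual = [[1, 1], [0, 0]]
pred_factual = [[1, 1], [0, 0]]
# After the edit the object is gone, so the correct mask is empty,
# but the model still marks one pixel.
gt_counterfactual = [[0, 0], [0, 0]]
pred_counterfactual = [[1, 0], [0, 0]]

factual_iou = iou(pred_factual, gt_factual)
counterfactual_iou = iou(pred_counterfactual, gt_counterfactual)
persistence = persistence_ratio(pred_factual, pred_counterfactual)
```

In this toy case a model that grounds correctly on the factual image but keeps predicting a mask on the edited image scores well on the first IoU and poorly on the second, which is the kind of vision-driven failure the abstract describes.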
-
Uncertainty in Action: Confidence Elicitation in Embodied Agents
Authors:
Tianjiao Yu,
Vedant Shah,
Muntasir Wahed,
Kiet A. Nguyen,
Adheesh Juvekar,
Tal August,
Ismini Lourentzou
Abstract:
Expressing confidence is challenging for embodied agents navigating dynamic multimodal environments, where uncertainty arises from both perception and decision-making processes. We present the first work investigating embodied confidence elicitation in open-ended multimodal environments. We introduce Elicitation Policies, which structure confidence assessment across inductive, deductive, and abductive reasoning, along with Execution Policies, which enhance confidence calibration through scenario reinterpretation, action sampling, and hypothetical reasoning. Evaluating agents in calibration and failure prediction tasks within the Minecraft environment, we show that structured reasoning approaches, such as Chain-of-Thoughts, improve confidence calibration. However, our findings also reveal persistent challenges in distinguishing uncertainty, particularly under abductive settings, underscoring the need for more sophisticated embodied confidence elicitation methods.
Submitted 13 March, 2025;
originally announced March 2025.
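The abstract does not say which calibration metric is used. As a minimal sketch, assuming the commonly used Expected Calibration Error (ECE) with equal-width confidence bins, calibration of an agent's stated confidence against its task outcomes could be measured as below. The function name and the sample data are hypothetical.

```python
def expected_calibration_error(confidences, outcomes, n_bins=10):
    """Weighted average gap between mean stated confidence and empirical
    success rate, computed over equal-width confidence bins on [0, 1]."""
    if len(confidences) != len(outcomes):
        raise ValueError("confidences and outcomes must have the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, outcome in zip(confidences, outcomes):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, outcome))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(o for _, o in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Hypothetical agent: always reports 0.9 confidence, succeeds 3 times out of 4.
stated = [0.9, 0.9, 0.9, 0.9]
succeeded = [1, 1, 1, 0]
ece_value = expected_calibration_error(stated, succeeded)
```

A lower value means the stated confidences track the observed success rate more closely; here the agent is overconfident by 0.15.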
-
CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models
Authors:
Kiet A. Nguyen,
Adheesh Juvekar,
Tianjiao Yu,
Muntasir Wahed,
Ismini Lourentzou
Abstract:
Recent advances in Large Vision-Language Models (LVLMs) have enabled general-purpose vision tasks through visual instruction tuning. While existing LVLMs can generate segmentation masks from text prompts for single images, they struggle with segmentation-grounded reasoning across images, especially at finer granularities such as object parts. In this paper, we introduce the new task of part-focused semantic co-segmentation, which involves identifying and segmenting common objects, as well as common and unique object parts across images. To address this task, we present CALICO, the first LVLM designed for multi-image part-level reasoning segmentation. CALICO features two key components, a novel Correspondence Extraction Module that identifies semantic part-level correspondences, and Correspondence Adaptation Modules that embed this information into the LVLM to facilitate multi-image understanding in a parameter-efficient manner. To support training and evaluation, we curate MixedParts, a large-scale multi-image segmentation dataset containing $\sim$2.4M samples across $\sim$44K images spanning diverse object and part categories. Experimental results demonstrate that CALICO, with just 0.3% of its parameters finetuned, achieves strong performance on this challenging task.
Submitted 3 April, 2025; v1 submitted 26 December, 2024;
originally announced December 2024.
-
PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation
Authors:
Muntasir Wahed,
Kiet A. Nguyen,
Adheesh Sunil Juvekar,
Xinzhuo Li,
Xiaona Zhou,
Vedant Shah,
Tianjiao Yu,
Pinar Yanardag,
Ismini Lourentzou
Abstract:
Despite significant advancements in Large Vision-Language Models (LVLMs), existing pixel-grounding models operate on single-image settings, limiting their ability to perform detailed, fine-grained comparisons across multiple images. Conversely, current multi-image understanding models lack pixel-level grounding. Our work addresses this gap by introducing the task of multi-image pixel-grounded reasoning segmentation, and PRIMA, a novel LVLM that integrates pixel-level grounding with robust multi-image reasoning capabilities to produce contextually rich, pixel-grounded explanations. Central to PRIMA is an efficient vision module that queries fine-grained visual representations across multiple images, reducing TFLOPs by $25.3\%$. To support training and evaluation, we curate $M^4Seg$, a new reasoning segmentation benchmark consisting of $\sim$224K question-answer pairs that require fine-grained visual understanding across multiple images. Experimental results demonstrate PRIMA outperforms state-of-the-art baselines.
Submitted 19 December, 2024;
originally announced December 2024.
-
Analysis of Stick-Slip Motion as a Jump Phenomenon
Authors:
Vinay A. Juvekar,
Arun K. Singh
Abstract:
In this work, we analyse the stick-slip motion of a soft elastomeric block on a smooth, hard surface under the application of shear, which is induced by a puller moving at a steady velocity. The frictional stress is generated by make-break of bonds between the pendent chains of the elastomeric block and bonding sites on the hard surface. The relation between velocity and frictional stress has been estimated using the bond-population balance model. Stick-slip motion occurs when the pulling velocity is lower than a critical value. Unlike the rate-and-state friction model, which views the stick-slip motion as a limit cycle, we show that during the stick phase, the sliding surface actually sticks to the hard surface and remains stationary until the shear exerted by the puller causes rupture of all bonds between the contacting surfaces. The major fraction of the bonds undergoes catastrophic rupture, causing the sliding surface to slip and attain a significantly higher velocity than the pulling velocity. During the slip phase, the sliding friction is balanced by rapid make-break of weak bonds. As the sliding velocity decreases, the bonds undergo aging and the adhesion stress increases. When the bond adhesion stress exceeds the pulling stress, the contacting surfaces stick together. We have mathematically modeled both the stick and the slip regimes using the bond-population balance model. We have validated the model using the experimental data from the work of Baumberger et al. (2002) on sliding of an elastomeric gelatine-gel block on a glass surface.
Submitted 21 May, 2024;
originally announced May 2024.
-
A Modified Model for Static Friction of a Soft and Hard Solid Interface
Authors:
Arun K. Singh,
Vinay A. Juvekar
Abstract:
We present a modified model of shear-rate- and aging-time-dependent static friction between a soft solid, such as gelatin hydrogel, and a hard surface, for instance glass. The earlier model for static friction (Juvekar and Singh, 2016) considered only the bond rupture process; as a result, it overpredicts the static friction observed in experiments. The friction model now takes into account both formation and rupture of molecular chains at the sliding interface. It is also assumed that the age of the bonds newly formed during the rupture process is the same as the aging time. As a result, the model predicts the experimental data quite well, highlighting the significance of bond formation in static friction. Moreover, it is also observed that residual stress has no effect on static strength.
Submitted 24 January, 2021;
originally announced January 2021.
-
Underpotential electroless deposition of metals on polyaniline
Authors:
Amrita Singh,
Asfiya Contractor,
Ravindra D. Kale,
Vinay A. Juvekar
Abstract:
A novel technique to deposit metals on highly conjugated polyaniline films has been developed. In general, electrodeposition of metals with low reduction potential from aqueous solution is difficult due to the disruptive effect of hydrogen, which evolves during the process. This difficulty is avoided by using conducting polymer films with high surface mass density. The polymer chains of these films possess a high degree of conjugation. Such a polymer produces highly stable polarons and therefore has the ability to perform underpotential deposition. Our method involves reduction of the polyaniline film with formic acid, followed by dipping the coated electrode in the metal salt solution. Deposition of the metal is monitored by the rise in the open circuit potential of the electrode. Deposition of metals with high surface mass density has been achieved. The metal is most likely present in the polymer as a coordination complex with amine nitrogen. Such a form of metal is expected to have higher catalytic activity than the zero-valent metal. We have been able to deposit metals such as Mn and Cu. Among these, Mn cannot be deposited on polymer by any other method.
Submitted 5 June, 2020;
originally announced June 2020.
-
Face Verification and Forgery Detection for Ophthalmic Surgery Images
Authors:
Kaushal Bhogale,
Nishant Shankar,
Adheesh Juvekar,
Asutosh Padhi
Abstract:
Although modern face verification systems are accessible and accurate, they are not always robust to pose variance and occlusions. Moreover, accurate models require a large amount of data to train. We structure our experiments to operate on small amounts of data obtained from an NGO that funds ophthalmic surgeries. We set up our face verification task as that of verifying pre-operation and post-operation images of a patient that undergoes ophthalmic surgery, and as such the post-operation images have occlusions like an eye patch. In this paper, we present a system that performs the face verification task using one-shot learning. To this end, our paper uses deep convolutional networks and compares different model architectures and loss functions. Our best model achieves 85% test accuracy. During inference time, we also attempt to detect image forgeries in addition to performing face verification. To achieve this, we use Error Level Analysis. Finally, we propose an inference pipeline that demonstrates how these techniques can be used to implement an automated face verification and forgery detection system.
Submitted 15 November, 2018;
originally announced November 2018.
-
An Electrochemical Technique for Measurements of Electrical Conductivity of Aqueous Electrolytes
Authors:
Rajkumar S. Patil,
Vinay A. Juvekar,
Umesh Nalage
Abstract:
The technique presented here for the measurement of electrical conductivity is based on the principle that the current converges on a small disk electrode. Most of the ohmic resistance therefore lies within a narrow region surrounding the disk. If the reference electrode is kept outside this zone, the potential difference between the working and the reference electrode includes practically all ohmic potential drops occurring in the solution. Moreover, this ohmic drop can be related to the conductivity of the solution by an analytical expression derived by Newman. At sufficiently high overpotentials, the rate of charge transfer is limited by the conduction of current from the bulk solution to the electrode. In this regime, the current varies linearly with the electrode potential and the conductivity of the solution can be estimated from the slope of the voltammogram using Newman's expression. The electrochemical reaction used for measuring conductivity of solutions of salts is the cathodic reduction of water and that used for aqueous acids is the cathodic reduction of hydrogen ions. The technique has been used to measure conductivity of several common aqueous electrolytes. A good agreement is found between the present technique and the conventional technique based on AC impedance analysis.
Submitted 20 December, 2017;
originally announced December 2017.
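As a rough illustration of the slope-based estimate described in this abstract, the sketch below assumes that "Newman's expression" is the primary-current resistance of a disk electrode of radius $a$ embedded in an insulating plane, $R = 1/(4 \kappa a)$, so that in the ohmically limited regime the conductivity follows from the voltammogram slope as $\kappa = |dI/dV| / (4a)$. The function names, the disk radius, and the synthetic data are all hypothetical; the paper's actual procedure may differ.

```python
def fit_slope(potentials_v, currents_a):
    """Least-squares slope dI/dV (in siemens) of the linear voltammogram region."""
    n = len(potentials_v)
    if n < 2 or n != len(currents_a):
        raise ValueError("need at least two matching (potential, current) points")
    mean_v = sum(potentials_v) / n
    mean_i = sum(currents_a) / n
    sxx = sum((v - mean_v) ** 2 for v in potentials_v)
    sxy = sum((v - mean_v) * (i - mean_i)
              for v, i in zip(potentials_v, currents_a))
    return sxy / sxx


def conductivity_from_slope(slope_s, disk_radius_m):
    """Conductivity in S/m, assuming R = 1 / (4 * kappa * a) for a disk electrode.
    The absolute value makes the result independent of the sign convention
    used for cathodic currents."""
    return abs(slope_s) / (4.0 * disk_radius_m)


# Synthetic check: a 1 mm radius disk in a 1 S/m solution has R = 250 ohm.
radius_m = 1.0e-3
potentials = [1.0, 1.1, 1.2, 1.3]
currents = [(v - 0.5) / 250.0 for v in potentials]

slope = fit_slope(potentials, currents)
kappa = conductivity_from_slope(slope, radius_m)
```

The fit should only use points from the linear, conduction-limited part of the voltammogram; including the kinetically controlled region at low overpotential would bias the slope.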
-
Rate and Aging Time Dependent Static Friction of a Soft and Hard Solid Interface
Authors:
Vinay A. Juvekar,
Arun K. Singh
Abstract:
In this article, we present a mathematical model that answers a classical question: how much force, generally called the static friction force, is required to initiate the motion of a soft solid block, such as a gel, rubber or elastomer, on a hard surface, for instance glass? The model uses a population balance of the bonds between the polymer chains of the soft solid and the hard surface to estimate rate- and aging-time-dependent static friction. The model predicts that over a certain range of the pulling velocity, the friction stress at the onset of sliding (static friction stress) varies as the logarithm of the pulling velocity, as well as the logarithm of the aging time. These predictions are consistent with the experimental observations.
Submitted 2 February, 2016;
originally announced February 2016.
-
A Strong Bond Model for Stress Relaxation of Soft Solid Interfaces
Authors:
Arun K. Singh,
Vinay A. Juvekar
Abstract:
In this article, we propose a mathematical model which explains the formation of strong bonds during the relaxation process of a soft solid on a hard surface. As a result, the soft solid relaxes to a nonzero residual stress level. The model assumes that strong bonds form through a transition from weak to strong bonds at a critical time. Parametric studies are carried out to understand the effect of the model's friction parameters on the relaxation process. The relaxation model is, in turn, validated with experiments and the corresponding numerical values are justified.
Submitted 4 April, 2017; v1 submitted 29 November, 2014;
originally announced December 2014.
-
Elucidation of charge storage characteristics of conducting polymer film using redox reaction
Authors:
Asfiya Q. Contractor,
Vinay A. Juvekar
Abstract:
A general technique to investigate charge storage characteristics of conducting polymer films has been developed. A redox reaction is conducted on a polymer film on a rotating disk electrode under potentiostatic conditions so that the rate of charging of the film equals the rate of removal of the charge by the reaction. In an experiment on a polyaniline film deposited on a platinum substrate, using Fe$^{2+}$/Fe$^{3+}$ in HCl as the redox system, the voltammogram shows five distinct linear segments (bands) with a discontinuity in the slope at specific transition potentials. These bands are the same as those indicated by ESR/Raman spectroscopy, with comparable transition potentials. From the dependence of the slopes of the bands on the concentration of ferrous and ferric ions, it was possible to estimate the energies of the charge carriers in the different bands. It is shown that the charge storage in the film is capacitive.
Submitted 18 November, 2013;
originally announced November 2013.