-
phylo2vec: a library for vector-based phylogenetic tree manipulation
Authors:
Neil Scheidwasser,
Ayush Nag,
Matthew J Penn,
Anthony MV Jakob,
Frederik Mølkjær Andersen,
Mark P Khurana,
Landung Setiawan,
Madeline Gordon,
David A Duchêne,
Samir Bhatt
Abstract:
Phylogenetics is a fundamental component of many analysis frameworks in biology as well as linguistics to study the evolutionary relationships of different entities. Recently, the advent of large-scale genomics and the SARS-CoV-2 pandemic have underscored the necessity for phylogenetic software to handle large datasets of genomes or phylogenetic trees. While significant efforts have focused on scaling optimisation algorithms, visualization, and lineage identification, an emerging body of research has been dedicated to efficient representations of data for genomes and phylogenetic trees such as phylo2vec. Compared to traditional tree representations such as the Newick format, which represents trees using strings of nested parentheses, modern representations of phylogenetic trees utilize integer vectors to define the tree topology traversal. This approach offers several advantages, including easier manipulability, increased memory efficiency, and applicability to downstream tasks such as machine learning. Here, we present the latest release of phylo2vec (or Phylo2Vec), a high-performance software package for encoding, manipulating, and analysing binary phylogenetic trees. At its core, the package is based on the phylo2vec representation of binary trees, which defines a bijection from any tree topology with $n$ leaves into an integer vector of size $n-1$. Compared to the traditional Newick format, phylo2vec is designed to enable fast sampling and comparison of binary trees. This release features a core implementation in Rust, providing significant performance improvements and memory efficiency, while remaining available in Python (superseding the release described in the original paper) and R via dedicated wrappers, making it accessible to a broad audience in the bioinformatics community.
Submitted 24 June, 2025;
originally announced June 2025.
-
Open World Scene Graph Generation using Vision Language Models
Authors:
Amartya Dutta,
Kazi Sajeed Mehrab,
Medha Sawhney,
Abhilash Neog,
Mridul Khurana,
Sepideh Fatemi,
Aanish Pradhan,
M. Maruf,
Ismini Lourentzou,
Arka Daw,
Anuj Karpatne
Abstract:
Scene-Graph Generation (SGG) seeks to recognize objects in an image and distill their salient pairwise relationships. Most methods depend on dataset-specific supervision to learn the variety of interactions, restricting their usefulness in open-world settings involving novel objects and/or relations. Even methods that leverage large Vision Language Models (VLMs) typically require benchmark-specific fine-tuning. We introduce Open-World SGG, a training-free, efficient, model-agnostic framework that taps directly into the pretrained knowledge of VLMs to produce scene graphs with zero additional learning. Casting SGG as a zero-shot structured-reasoning problem, our method combines multimodal prompting, embedding alignment, and a lightweight pair-refinement strategy, enabling inference over unseen object vocabularies and relation sets. To assess this setting, we formalize an Open-World evaluation protocol that measures performance when no SGG-specific data have been observed in terms of either objects or relations. Experiments on Visual Genome, Open Images V6, and the Panoptic Scene Graph (PSG) dataset demonstrate the capacity of pretrained VLMs to perform relational understanding without task-level training.
Submitted 9 June, 2025;
originally announced June 2025.
-
TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Species Generation
Authors:
Amin Karimi Monsefi,
Mridul Khurana,
Rajiv Ramnath,
Anuj Karpatne,
Wei-Lun Chao,
Cheng Zhang
Abstract:
We propose TaxaDiffusion, a taxonomy-informed training framework for diffusion models to generate fine-grained animal images with high morphological and identity accuracy. Unlike standard approaches that treat each species as an independent category, TaxaDiffusion incorporates domain knowledge that many species exhibit strong visual similarities, with distinctions often residing in subtle variations of shape, pattern, and color. To exploit these relationships, TaxaDiffusion progressively trains conditioned diffusion models across different taxonomic levels -- starting from broad classifications such as Class and Order, refining through Family and Genus, and ultimately distinguishing at the Species level. This hierarchical learning strategy first captures coarse-grained morphological traits shared by species with common ancestors, facilitating knowledge transfer before refining fine-grained differences for species-level distinction. As a result, TaxaDiffusion enables accurate generation even with limited training samples per species. Extensive experiments on three fine-grained animal datasets demonstrate that TaxaDiffusion outperforms existing approaches, achieving superior fidelity in fine-grained animal image generation. Project page: https://amink8.github.io/TaxaDiffusion/
Submitted 25 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
-
VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images
Authors:
M. Maruf,
Arka Daw,
Kazi Sajeed Mehrab,
Harish Babu Manogaran,
Abhilash Neog,
Medha Sawhney,
Mridul Khurana,
James P. Balhoff,
Yasin Bakis,
Bahadir Altintas,
Matthew J. Thompson,
Elizabeth G. Campolongo,
Josef C. Uyeda,
Hilmar Lapp,
Henry L. Bart,
Paula M. Mabee,
Yu Su,
Wei-Lun Chao,
Charles Stewart,
Tanya Berger-Wolf,
Wasila Dahdul,
Anuj Karpatne
Abstract:
Images are increasingly becoming the currency for documenting biodiversity on the planet, providing novel opportunities for accelerating scientific discoveries in the field of organismal biology, especially with the advent of large vision-language models (VLMs). We ask if pre-trained VLMs can aid scientists in answering a range of biologically relevant questions without any additional fine-tuning. In this paper, we evaluate the effectiveness of 12 state-of-the-art (SOTA) VLMs in the field of organismal biology using a novel dataset, VLM4Bio, consisting of 469K question-answer pairs involving 30K images from three groups of organisms: fishes, birds, and butterflies, covering five biologically relevant tasks. We also explore the effects of applying prompting techniques and tests for reasoning hallucination on the performance of VLMs, shedding new light on the capabilities of current SOTA VLMs in answering biologically relevant questions using images. The code and datasets for running all the analyses reported in this paper can be found at https://github.com/sammarfy/VLM4Bio.
Submitted 28 August, 2024;
originally announced August 2024.
-
Hierarchical Conditioning of Diffusion Models Using Tree-of-Life for Studying Species Evolution
Authors:
Mridul Khurana,
Arka Daw,
M. Maruf,
Josef C. Uyeda,
Wasila Dahdul,
Caleb Charpentier,
Yasin Bakış,
Henry L. Bart Jr.,
Paula M. Mabee,
Hilmar Lapp,
James P. Balhoff,
Wei-Lun Chao,
Charles Stewart,
Tanya Berger-Wolf,
Anuj Karpatne
Abstract:
A central problem in biology is to understand how organisms evolve and adapt to their environment by acquiring variations in the observable characteristics or traits of species across the tree of life. With the growing availability of large-scale image repositories in biology and recent advances in generative modeling, there is an opportunity to accelerate the discovery of evolutionary traits automatically from images. Toward this goal, we introduce Phylo-Diffusion, a novel framework for conditioning diffusion models with phylogenetic knowledge represented in the form of HIERarchical Embeddings (HIER-Embeds). We also propose two new experiments for perturbing the embedding space of Phylo-Diffusion: trait masking and trait swapping, inspired by counterpart experiments of gene knockout and gene editing/swapping. Our work represents a novel methodological advance in generative modeling to structure the embedding space of diffusion models using tree-based knowledge. Our work also opens a new chapter of research in evolutionary biology by using generative models to visualize evolutionary changes directly from images. We empirically demonstrate the usefulness of Phylo-Diffusion in capturing meaningful trait variations for fishes and birds, revealing novel insights about the biological mechanisms of their evolution.
Submitted 31 July, 2024;
originally announced August 2024.
-
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Authors:
Kazi Sajeed Mehrab,
M. Maruf,
Arka Daw,
Abhilash Neog,
Harish Babu Manogaran,
Mridul Khurana,
Zhenyang Feng,
Bahadir Altintas,
Yasin Bakis,
Elizabeth G Campolongo,
Matthew J Thompson,
Xiaojun Wang,
Hilmar Lapp,
Tanya Berger-Wolf,
Paula Mabee,
Henry Bart,
Wei-Lun Chao,
Wasila M Dahdul,
Anuj Karpatne
Abstract:
We introduce Fish-Visual Trait Analysis (Fish-Vista), the first organismal image dataset designed for the analysis of visual traits of aquatic species directly from images using problem formulations in computer vision. Fish-Vista contains 69,126 annotated images spanning 4,154 fish species, curated and organized to serve three downstream tasks of species classification, trait identification, and trait segmentation. Our work makes two key contributions. First, we perform a fully reproducible data processing pipeline to process images sourced from various museum collections. We annotate these images with carefully curated labels from biological databases and manual annotations to create an AI-ready dataset of visual traits, contributing to the advancement of AI in biodiversity science. Second, our proposed downstream tasks offer fertile grounds for novel computer vision research in addressing a variety of challenges such as long-tailed distributions, out-of-distribution generalization, learning with weak labels, explainable AI, and segmenting small objects. We benchmark the performance of several existing methods for our proposed tasks to expose future research opportunities in AI for biodiversity science problems involving visual traits.
Submitted 27 February, 2025; v1 submitted 10 July, 2024;
originally announced July 2024.
-
Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection
Authors:
Mehar Khurana,
Neehar Peri,
James Hays,
Deva Ramanan
Abstract:
State-of-the-art 3D object detectors are often trained on massive labeled datasets. However, annotating 3D bounding boxes remains prohibitively expensive and time-consuming, particularly for LiDAR. Instead, recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels. Contemporary methods adapt best practices for self-supervised learning from the image domain to point clouds (such as contrastive learning). However, publicly available 3D datasets are considerably smaller and less diverse than those used for image-based self-supervised learning, limiting their effectiveness. We do note, however, that such 3D data is naturally collected in a multimodal fashion, often paired with images. Rather than pre-training with only self-supervised objectives, we argue that it is better to bootstrap point cloud representations using image-based foundation models trained on internet-scale data. Specifically, we propose a shelf-supervised approach (i.e., supervised with off-the-shelf image foundation models) for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data. Pre-training 3D detectors with such pseudo-labels yields significantly better semi-supervised detection accuracy than prior self-supervised pretext tasks. Importantly, we show that image-based shelf-supervision is helpful for training LiDAR-only, RGB-only and multi-modal (RGB + LiDAR) detectors. We demonstrate the effectiveness of our approach on nuScenes and WOD, significantly improving over prior work in limited data settings. Our code is available at https://github.com/meharkhurana03/cm3d
Submitted 15 October, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
DeepSee: Multidimensional Visualizations of Seabed Ecosystems
Authors:
Adam Coscia,
Haley M. Sapers,
Noah Deutsch,
Malika Khurana,
John S. Magyar,
Sergio A. Parra,
Daniel R. Utter,
Rebecca L. Wipfler,
David W. Caress,
Eric J. Martin,
Jennifer B. Paduan,
Maggie Hendrie,
Santiago Lombeyda,
Hillary Mushkin,
Alex Endert,
Scott Davidoff,
Victoria J. Orphan
Abstract:
Scientists studying deep ocean microbial ecosystems use limited numbers of sediment samples collected from the seafloor to characterize important life-sustaining biogeochemical cycles in the environment. Yet conducting fieldwork to sample these extreme remote environments is both expensive and time consuming, requiring tools that enable scientists to explore the sampling history of field sites and predict where taking new samples is likely to maximize scientific return. We conducted a collaborative, user-centered design study with a team of scientific researchers to develop DeepSee, an interactive data workspace that visualizes 2D and 3D interpolations of biogeochemical and microbial processes in context together with sediment sampling history overlaid on 2D seafloor maps. Based on a field deployment and qualitative interviews, we found that DeepSee increased the scientific return from limited sample sizes, catalyzed new research workflows, reduced long-term costs of sharing data, and supported teamwork and communication between team members with diverse research goals.
Submitted 7 March, 2024;
originally announced March 2024.
-
Discovering Novel Biological Traits From Images Using Phylogeny-Guided Neural Networks
Authors:
Mohannad Elhamod,
Mridul Khurana,
Harish Babu Manogaran,
Josef C. Uyeda,
Meghan A. Balk,
Wasila Dahdul,
Yasin Bakış,
Henry L. Bart Jr.,
Paula M. Mabee,
Hilmar Lapp,
James P. Balhoff,
Caleb Charpentier,
David Carlyn,
Wei-Lun Chao,
Charles V. Stewart,
Daniel I. Rubenstein,
Tanya Berger-Wolf,
Anuj Karpatne
Abstract:
Discovering evolutionary traits that are heritable across species on the tree of life (also referred to as a phylogenetic tree) is of great interest to biologists to understand how organisms diversify and evolve. However, the measurement of traits is often a subjective and labor-intensive process, making trait discovery a highly label-scarce problem. We present a novel approach for discovering evolutionary traits directly from images without relying on trait labels. Our proposed approach, Phylo-NN, encodes the image of an organism into a sequence of quantized feature vectors -- or codes -- where different segments of the sequence capture evolutionary signals at varying ancestry levels in the phylogeny. We demonstrate the effectiveness of our approach in producing biologically meaningful results in a number of downstream tasks including species image generation and species-to-species image translation, using fish species as a target example.
Submitted 5 June, 2023;
originally announced June 2023.
-
Phylo2Vec: a vector representation for binary trees
Authors:
Matthew J Penn,
Neil Scheidwasser,
Mark P Khurana,
David A Duchêne,
Christl A Donnelly,
Samir Bhatt
Abstract:
Binary phylogenetic trees inferred from biological data are central to understanding the shared history among evolutionary units. However, inferring the placement of latent nodes in a tree is computationally expensive. State-of-the-art methods rely on carefully designed heuristics for tree search, using different data structures for easy manipulation (e.g., classes in object-oriented programming languages) and readable representation of trees (e.g., Newick-format strings). Here, we present Phylo2Vec, a parsimonious encoding for phylogenetic trees that serves as a unified approach for both manipulating and representing phylogenetic trees. Phylo2Vec maps any binary tree with $n$ leaves to a unique integer vector of length $n-1$. The advantages of Phylo2Vec are fourfold: (i) fast tree sampling, (ii) compressed tree representation compared to a Newick string, (iii) quick and unambiguous verification of whether two binary trees are topologically identical, and (iv) systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill-climbing-based optimisation scheme can efficiently traverse the vastness of tree space from a random to an optimal tree.
Submitted 25 March, 2025; v1 submitted 25 April, 2023;
originally announced April 2023.
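The sampling and comparison advantages listed in the Phylo2Vec abstract can be illustrated with a minimal Python sketch. It rests on an assumption the abstract does not state: that entry `v[i]` (0-indexed) of a valid vector lies in the range `0..2*i`. That range yields 1 x 3 x ... x (2n-3) vectors, which matches the number of rooted binary tree topologies on `n` labelled leaves and is therefore consistent with a bijection. The function names are hypothetical and are not the phylo2vec package API.

```python
import random


def sample_vector(n_leaves, rng=None):
    """Draw a random vector of length n_leaves - 1.

    Assumption (not stated in the abstract): entry i may take any
    integer value in 0..2*i, each entry independently.
    """
    rng = rng or random.Random()
    return [rng.randint(0, 2 * i) for i in range(n_leaves - 1)]


def is_valid(v):
    """Check the assumed range constraint 0 <= v[i] <= 2*i."""
    return all(isinstance(x, int) and 0 <= x <= 2 * i for i, x in enumerate(v))


def count_topologies(n_leaves):
    """Number of vectors satisfying the assumed constraint: 1 * 3 * ... * (2n - 3)."""
    total = 1
    for i in range(n_leaves - 1):
        total *= 2 * i + 1
    return total


def same_topology(v1, v2):
    """Under a bijective encoding with fixed leaf labels, equal vectors
    correspond to identical topologies, so comparison is element-wise."""
    return list(v1) == list(v2)


if __name__ == "__main__":
    v = sample_vector(6, random.Random(0))
    print(v, is_valid(v), count_topologies(6))
```

Sampling costs one random integer per entry and comparison is a linear scan, which is the kind of operation the abstract contrasts with parsing Newick strings.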
-
Soft Computing Techniques for Change Detection in remotely sensed images : A Review
Authors:
Madhu Khurana,
Vikas Saxena
Abstract:
With the advent of remote sensing satellites, a huge repository of remotely sensed images is available. Change detection in remotely sensed images has been an active research area as it helps us understand the transitions that are taking place on the Earth's surface. This paper discusses the methods and their classifications proposed by various researchers for change detection. Since the use of soft-computing-based techniques is now very popular among the research community, this paper also presents a classification based on learning techniques used in soft-computing methods for change detection.
Submitted 25 September, 2018; v1 submitted 2 June, 2015;
originally announced June 2015.