Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Authors:
Kazi Sajeed Mehrab,
M. Maruf,
Arka Daw,
Abhilash Neog,
Harish Babu Manogaran,
Mridul Khurana,
Zhenyang Feng,
Bahadir Altintas,
Yasin Bakis,
Elizabeth G Campolongo,
Matthew J Thompson,
Xiaojun Wang,
Hilmar Lapp,
Tanya Berger-Wolf,
Paula Mabee,
Henry Bart,
Wei-Lun Chao,
Wasila M Dahdul,
Anuj Karpatne
Abstract:
We introduce Fish-Visual Trait Analysis (Fish-Vista), the first organismal image dataset designed for the analysis of visual traits of aquatic species directly from images using problem formulations in computer vision. Fish-Vista contains 69,126 annotated images spanning 4,154 fish species, curated and organized to serve three downstream tasks: species classification, trait identification, and trait segmentation. Our work makes two key contributions. First, we develop a fully reproducible data processing pipeline for images sourced from various museum collections. We annotate these images with carefully curated labels from biological databases and manual annotations to create an AI-ready dataset of visual traits, contributing to the advancement of AI in biodiversity science. Second, our proposed downstream tasks offer fertile ground for novel computer vision research addressing a variety of challenges such as long-tailed distributions, out-of-distribution generalization, learning with weak labels, explainable AI, and segmenting small objects. We benchmark the performance of several existing methods on our proposed tasks to expose future research opportunities in AI for biodiversity science problems involving visual traits.
Submitted 27 February, 2025; v1 submitted 10 July, 2024; originally announced July 2024.
BioCLIP: A Vision Foundation Model for the Tree of Life
Authors:
Samuel Stevens,
Jiaman Wu,
Matthew J Thompson,
Elizabeth G Campolongo,
Chan Hee Song,
David Edward Carlyn,
Li Dong,
Wasila M Dahdul,
Charles Stewart,
Tanya Berger-Wolf,
Wei-Lun Chao,
Yu Su
Abstract:
Images of the natural world, collected by a variety of cameras, from drones to individual phones, are increasingly abundant sources of biological information. There is an explosion of computational methods and tools, particularly computer vision, for extracting biologically relevant information from images for science and conservation. Yet most of these are bespoke approaches designed for a specific task and are not easily adaptable or extendable to new questions, contexts, and datasets. A vision model for general organismal biology questions on images is a timely need. To approach this, we curate and release TreeOfLife-10M, the largest and most diverse ML-ready dataset of biology images. We then develop BioCLIP, a foundation model for the tree of life, leveraging the unique properties of biology captured by TreeOfLife-10M, namely the abundance and variety of images of plants, animals, and fungi, together with the availability of rich structured biological knowledge. We rigorously benchmark our approach on diverse fine-grained biology classification tasks and find that BioCLIP consistently and substantially outperforms existing baselines (by 16% to 17% absolute). Intrinsic evaluation reveals that BioCLIP has learned a hierarchical representation conforming to the tree of life, shedding light on its strong generalizability. Models, data, and code are available at https://imageomics.github.io/bioclip.
Submitted 14 May, 2024; v1 submitted 30 November, 2023; originally announced November 2023.