-
Flaws of ImageNet, Computer Vision's Favourite Dataset
Authors:
Nikita Kisel,
Illia Volkov,
Katerina Hanzelkova,
Klara Janouskova,
Jiri Matas
Abstract:
Since its release, the ImageNet-1k dataset has become a gold standard for evaluating model performance. It has served as the foundation for numerous other datasets and training tasks in computer vision. As models have improved in accuracy, issues related to label correctness have become increasingly apparent. In this blog post, we analyze the issues in the ImageNet-1k dataset, including incorrect labels, overlapping or ambiguous class definitions, training-evaluation domain shifts, and image duplicates. The solutions for some problems are straightforward. For others, we hope to start a broader conversation about refining this influential dataset to better serve future research.
Submitted 26 November, 2024;
originally announced December 2024.
-
Homology-constrained vector quantization entropy regularizer
Authors:
Ivan Volkov
Abstract:
This paper describes an entropy regularization term for vector quantization (VQ) based on the analysis of the persistent homology of the VQ embeddings. Higher embedding entropy correlates positively with higher codebook utilization, mitigating overfitting towards the identity and codebook collapse in VQ-based autoencoders [1]. We show that homology-constrained regularization is an effective way to increase the entropy of the VQ process (approximated to input entropy) while preserving the approximated topology in the quantized latent space, averaged over mini-batches. This work further explores some patterns in the persistent homology diagrams of latents formed by vector quantization. We implement and test the proposed algorithm as a module integrated into a sample VQ-VAE. The linked code repository provides a functioning implementation of the proposed architecture, referred to as homology-constrained vector quantization (HC-VQ) in the rest of this work.
Submitted 25 November, 2022;
originally announced November 2022.
-
Digital Twins, Internet of Things and Mobile Medicine: a Review of Current Platforms to Support Smart Healthcare
Authors:
Ivan Volkov,
Gleb Radchenko,
Andrey Tchernykh
Abstract:
As the population grows, the need for quality medical services grows correspondingly, and so does the demand for information technology in medicine. The concept of "Smart Healthcare" offers many approaches aimed at solving the acute problems faced by modern healthcare. In this paper, we review the main problems of modern healthcare; analyze existing approaches and technologies in the areas of digital twins, the Internet of Things, and mobile medicine; determine their effectiveness in solving these problems; consider the technologies used to monitor and treat patients; and propose the concept of the Smart Healthcare platform.
Submitted 22 June, 2021;
originally announced June 2021.