Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics

Authors: Burooj Ghani, Vincent J. Kalkman, Bob Planqué, Willem-Pier Vellinga, Lisa Gill, Dan Stowell

Abstract: Animal sounds can be recognised automatically by machine learning, and this has an important role to play in biodiversity monitoring. Yet despite increasingly impressive capabilities, bioacoustic species classifiers still exhibit imbalanced performance across species and habitats, especially in complex soundscapes. In this study, we explore the effectiveness of transfer learning in large-scale bird sound classification across various conditions, including single- and multi-label scenarios, and across different model architectures such as CNNs and Transformers. Our experiments demonstrate that both fine-tuning and knowledge distillation yield strong performance, with cross-distillation proving particularly effective in improving in-domain performance on Xeno-canto data. However, when generalizing to soundscapes, shallow fine-tuning exhibits superior performance compared to knowledge distillation, highlighting its robustness and constrained nature. Our study further investigates how to use multi-species labels, in cases where these are present but incomplete. We advocate for more comprehensive labeling practices within the animal sound community, including annotating background species and providing temporal details, to enhance the training of robust bird sound classifiers. These findings provide insights into the optimal reuse of pretrained models for advancing automatic bioacoustic recognition.

Submitted 21 September, 2024; originally announced September 2024.

Comments: 25 pages

arXiv:2404.03474 [pdf, other]

Performance of computer vision algorithms for fine-grained classification using crowdsourced insect images

Authors: Rita Pucci, Vincent J. Kalkman, Dan Stowell

Abstract: With fine-grained classification, we identify unique characteristics to distinguish among classes of the same super-class. We are focusing on species recognition in Insecta, as they are critical for biodiversity monitoring and at the base of many ecosystems. With citizen science campaigns, billions of images are collected in the wild. Once these are labelled, experts can use them to create distribution maps. However, the labelling process is time-consuming, which is where computer vision comes in. The field of computer vision offers a wide range of algorithms, each with its strengths and weaknesses; how do we identify the algorithm that is in line with our application? To answer this question, we provide a full and detailed evaluation of nine algorithms among deep convolutional networks (CNN), vision transformers (ViT), and locality-based vision transformers (LBVT) on 4 different aspects: classification performance, embedding quality, computational cost, and gradient activity. We offer insights that we haven't yet had in this domain proving to which extent these algorithms solve the fine-grained tasks in Insecta. We found that the ViT performs the best on inference speed and computational cost while the LBVT outperforms the others on performance and embedding quality; the CNN provide a trade-off among the metrics.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2403.06874 [pdf, other]

COOD: Combined out-of-distribution detection using multiple measures for anomaly & novel class detection in large-scale hierarchical classification

Authors: L. E. Hogeweg, R. Gangireddy, D. Brunink, V. J. Kalkman, L. Cornelissen, J. W. Kamminga

Abstract: High-performing out-of-distribution (OOD) detection, both anomaly and novel class, is an important prerequisite for the practical use of classification models. In this paper, we focus on the species recognition task in images concerned with large databases, a large number of fine-grained hierarchical classes, severe class imbalance, and varying image quality. We propose a framework for combining individual OOD measures into one combined OOD (COOD) measure using a supervised model. The individual measures are several existing state-of-the-art measures and several novel OOD measures developed with novel class detection and hierarchical class structure in mind. COOD was extensively evaluated on three large-scale (500k+ images) biodiversity datasets in the context of anomaly and novel class detection. We show that COOD outperforms individual, including state-of-the-art, OOD measures by a large margin in terms of TPR@1% FPR in the majority of experiments, e.g., improving detecting ImageNet images (OOD) from 54.3% to 85.4% for the iNaturalist 2018 dataset. SHAP (feature contribution) analysis shows that different individual OOD measures are essential for various tasks, indicating that multiple OOD measures and combinations are needed to generalize. Additionally, we show that explicitly considering ID images that are incorrectly classified for the original (species) recognition task is important for constructing high-performing OOD detection methods and for practical applicability. The framework can easily be extended or adapted to other tasks and media modalities.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2307.11112 [pdf, other]

Comparison between transformers and convolutional models for fine-grained classification of insects

Authors: Rita Pucci, Vincent J. Kalkman, Dan Stowell

Abstract: Fine-grained classification is challenging due to the difficulty of finding discriminatory features. This problem is exacerbated when applied to identifying species within the same taxonomical class. This is because species are often sharing morphological characteristics that make them difficult to differentiate. We consider the taxonomical class of Insecta. The identification of insects is essential in biodiversity monitoring as they are one of the inhabitants at the base of many ecosystems. Citizen science is doing brilliant work of collecting images of insects in the wild giving the possibility to experts to create improved distribution maps in all countries. We have billions of images that need to be automatically classified and deep neural network algorithms are one of the main techniques explored for fine-grained tasks. At the SOTA, the field of deep learning algorithms is extremely fruitful, so how to identify the algorithm to use? We focus on Odonata and Coleoptera orders, and we propose an initial comparative study to analyse the two best-known layer structures for computer vision: transformer and convolutional layers. We compare the performance of T2TViT, a fully transformer-base, EfficientNet, a fully convolutional-base, and ViTAE, a hybrid. We analyse the performance of the three models in identical conditions evaluating the performance per species, per morph together with sex, the inference time, and the overall performance with unbalanced datasets of images from smartphones. Although we observe high performances with all three families of models, our analysis shows that the hybrid model outperforms the fully convolutional-base and fully transformer-base models on accuracy performance and the fully transformer-base model outperforms the others on inference speed and, these prove the transformer to be robust to the shortage of samples and to be faster at inference time.

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2003.00921 [pdf]

Decision Support in the Context of a Complex Decision Situation

Authors: Teus H. Kappen, Mirko Noordegraaf, Wilton A. van Klei, Karel G. M. Moons, Cor J. Kalkman

Abstract: The aim of a clinical decision support tool is to reduce the complexity of clinical decisions. However, when decision support tools are poorly implemented they may actually confuse physicians and complicate clinical care. This paper argues that information from decision support tools is often removed from the clinical context of the targeted decisions. Physicians largely depend on clinical context to handle the complexity of their day-to-day decisions. Clinical context enables them to take into account all ambiguous information and patient preferences. Decision support tools that provide analytic information to physicians, without its context, may then complicate the decision process of physicians. It is likely that the joint forces of physicians and technology will produce better decisions than either of them exclusively: after all, they do have different ways of dealing with the complexity of a decision and are thus complementary. Therefore, the future challenges of decision support do not only reside in the optimization of the predictive value of the underlying models and algorithms, but equally in the effective communication of information and its context to doctors.

Submitted 2 March, 2020; originally announced March 2020.

arXiv:1808.00840 [pdf, other]

Shepherd: Enabling Automatic and Large-Scale Login Security Studies

Authors: Hugo Jonker, Jelmer Kalkman, Benjamin Krumnow, Marc Sleegers, Alan Verresen

Abstract: More and more parts of the internet are hidden behind a login field. This poses a barrier to any study predicated on scanning the internet. Moreover, the authentication process itself may be a weak point. To study authentication weaknesses at scale, automated login capabilities are needed. In this work we introduce Shepherd, a scanning framework to automatically log in on websites. The Shepherd framework enables us to perform large-scale scans of post-login aspects of websites. Shepherd scans a website for login fields, attempts to submit credentials and evaluates whether login was successful. We illustrate Shepherd's capabilities by means of a scan for session hijacking susceptibility. In this study, we use a set of unverified website credentials, some of which will be invalid. Using this set, Shepherd is able to fully automatically log in and verify that it is indeed logged in on 6,273 unknown sites, or 12.4% of the test set. We found that from our (biased) test set, 2,579 sites, i.e., 41.4%, are vulnerable to simple session hijacking attacks.

Submitted 2 August, 2018; originally announced August 2018.

arXiv:1411.4252 [pdf, ps, other]

doi 10.1364/OE.23.003448

Simultaneous measurement of localized diffusion and flow using optical coherence tomography

Authors: Nicolás Weiss, Ton G. van Leeuwen, Jeroen Kalkman

Abstract: We report on the simultaneous and localized measurements of the diffusion coefficient and flow velocity based on the normalized autocorrelation function using optical coherence tomography (OCT). Our results on a flowing solution of polystyrene spheres show that the flow velocity and the diffusion coefficient can be reliably estimated in a regime determined by the sample diffusivity, the local flow velocity, and the Gaussian beam waist. We experimentally show that a smaller beam waist results in an improvement of the velocity sensitivity at cost of the precision and accuracy of the estimation of the diffusion coefficient. Further, we show that the decay of the OCT autocorrelation due to flow depends only on the Gaussian beam waist irrespective of the sample position with respect to the focus position.

Submitted 16 November, 2014; originally announced November 2014.

Journal ref: Optics Express 23(3), 3448-3459 (2015)

arXiv:0705.4250 [pdf, ps, other]

doi 10.1063/1.2777134

Ultrafast optical switching of three-dimensional Si inverse opal photonic band gap crystals

Authors: Tijmen G. Euser, Hong Wei, Jeroen Kalkman, Yoonho Jun, Albert Polman, David J. Norris, Willem L. Vos

Abstract: We present ultrafast optical switching experiments on 3D photonic band gap crystals. Switching the Si inverse opal is achieved by optically exciting free carriers by a two-photon process. We probe reflectivity in the frequency range of second order Bragg diffraction where the photonic band gap is predicted. We find good experimental switching conditions for free-carrier plasma frequencies between 0.3 and 0.7 times the optical frequency: we thus observe a large frequency shift of up to $\Delta\omega/\omega = 1.5\%$ of all spectral features including the peak that corresponds to the photonic band gap. We deduce a corresponding large refractive index change of $\Delta n'_{\mathrm{Si}}/n'_{\mathrm{Si}} = 2.0\%$ and an induced absorption length that is longer than the sample thickness. We observe a fast decay time of 21 ps, which implies that switching could potentially be repeated at GHz rates. Such a high switching rate is relevant to future switching and modulation applications.

Submitted 19 September, 2007; v1 submitted 29 May, 2007; originally announced May 2007.

Journal ref: J. Appl. Phys. 102, 053111 (2007) (6 pages)

arXiv:physics/0608270 [pdf, ps, other]

doi 10.1103/PhysRevA.74.051802

Demonstration of an erbium doped microdisk laser on a silicon chip

Authors: T. J. Kippenberg, J. Kalkman, A. Polman, K. J. Vahala

Abstract: An erbium doped micro-laser is demonstrated utilizing $\mathrm{SiO_{2}}$ microdisk resonators on a silicon chip. Passive microdisk resonators exhibit whispering gallery type (WGM) modes with intrinsic optical quality factors of up to $6\times10^{7}$ and were doped with trivalent erbium ions (peak concentration $\sim3.8\times10^{20}\,\mathrm{cm^{-3}}$) using MeV ion implantation. Coupling to the fundamental WGM of the microdisk resonator was achieved by using a tapered optical fiber. Upon pumping of the $^{4}I_{15/2}\longrightarrow{}^{4}I_{13/2}$ erbium transition at 1450 nm, a gradual transition from spontaneous to stimulated emission was observed in the 1550 nm band. Analysis of the pump-output power relation yielded a pump threshold of 43 $\mu$W and allowed measuring the spontaneous emission coupling factor: $\beta\approx1\times10^{-3}$.

Submitted 27 August, 2006; originally announced August 2006.

arXiv:physics/0505106 [pdf, ps, other]

Purcell factor enhanced scattering efficiency in optical microcavities

Authors: T. J. Kippenberg, A. L. Tchebotareva, J. Kalkman, A. Polman, K. J. Vahala

Abstract: Scattering processes in an optical microcavity are investigated for the case of silicon nanocrystals embedded in an ultra-high Q toroid microcavity. Using a novel measurement technique based on the observable mode-splitting, we demonstrate that light scattering is highly preferential: more than 99.8% of the scattered photon flux is scattered into the original doubly-degenerate cavity modes. The large capture efficiency is attributed to an increased scattering rate into the cavity mode, due to the enhancement of the optical density of states over the free space value and has the same origin as the Purcell effect in spontaneous emission. The experimentally determined Purcell factor amounts to 883.

Submitted 18 May, 2005; v1 submitted 15 May, 2005; originally announced May 2005.

arXiv:hep-th/9407115 [pdf, ps, other]

Residues in Non-Abelian Localization

Authors: Jaap Kalkman

Abstract: In this paper we obtain an expression for the residue as it occurs in the non-Abelian localization formula due to Jeffrey and Kirwan. This expression can be used in cohomology computations for symplectic quotients.

Submitted 19 July, 1994; originally announced July 1994.

Comments: 9 pages

arXiv:hep-th/9308132 [pdf, ps, other]

BRST model applied to symplectic geometry

Authors: Jaap Kalkman

Abstract: Two local macros are included (gothic.sty and fleqn.sty)

Submitted 30 August, 1993; v1 submitted 27 August, 1993; originally announced August 1993.

Comments: 55 pages

Showing 1–12 of 12 results for author: Kalkman, J