Utilizing synthetic training data for the supervised classification of rat ultrasonic vocalizations

Scott, K. Jack; Speers, Lucinda J.; Bilkey, David K.

doi:10.1121/10.0024340

arXiv:2303.03183 (cs)

[Submitted on 3 Mar 2023 (v1), last revised 19 Jan 2024 (this version, v2)]

Title: Utilizing synthetic training data for the supervised classification of rat ultrasonic vocalizations

Authors: K. Jack Scott, Lucinda J. Speers, David K. Bilkey

Abstract: Murine rodents generate ultrasonic vocalizations (USVs) with frequencies that extend to around 120 kHz. These calls are important in social behaviour, so their analysis can provide insights into the function of vocal communication and its dysfunction. The manual identification of USVs, and their subsequent classification into different subcategories, is time-consuming. Although machine learning approaches for identification and classification can lead to enormous efficiency gains, the time and effort required to generate training data can be high, and the accuracy of current approaches can be problematic. Here we compare the detection and classification performance of a trained human against two convolutional neural networks (CNNs), DeepSqueak and VocalMat, on audio containing rat USVs. Furthermore, we test the effect of inserting synthetic USVs into the training data of the VocalMat CNN as a means of reducing the workload associated with generating a training set. Our results indicate that VocalMat outperformed the DeepSqueak CNN on measures of call identification and classification. Additionally, we found that augmenting the training data with synthetic images resulted in a further improvement in accuracy, such that it was sufficiently close to human performance to allow for the use of this software in laboratory conditions.

Comments:	25 pages, 5 main figures, 2 tables
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2303.03183 [cs.SD]
	(or arXiv:2303.03183v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2303.03183
Journal reference:	J Acoust Soc Am 1 January 2024 155 (1)
Related DOI:	https://doi.org/10.1121/10.0024340

Submission history

From: K. Jack Scott
[v1] Fri, 3 Mar 2023 03:17:45 UTC (912 KB)
[v2] Fri, 19 Jan 2024 02:31:58 UTC (1,754 KB)
