Topic Classification on Spoken Documents Using Deep Acoustic and Linguistic Features

Liu, Tan; Guo, Wu; Gu, Bin

arXiv:2106.08637 (cs)

[Submitted on 16 Jun 2021]

Abstract: Topic classification systems for spoken documents usually consist of two modules: an automatic speech recognition (ASR) module that converts speech into text, and a text topic classification (TTC) module that predicts the topic class from the decoded text. In this paper, instead of using ASR transcripts, we fuse deep acoustic and linguistic features for topic classification on spoken documents. Specifically, a conventional CTC-based acoustic model (AM) with phonemes as output units is trained first, and the outputs of the layer before the linear phoneme classifier in the trained AM are used as the deep acoustic features of spoken documents. These deep acoustic features are then fed to a phoneme-to-word (P2W) module to obtain deep linguistic features. Finally, a local multi-head attention module is proposed to fuse the two types of deep features for topic classification. Experiments on a subset of the Switchboard corpus show that the proposed framework outperforms conventional ASR+TTC systems, achieving a 3.13% improvement in accuracy (ACC).
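
The fusion step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify how the local multi-head attention is built, so everything below is an assumption made for illustration. In particular, "local" is read here as restricting each frame to a window of neighbouring frames, the acoustic features are assumed to supply the queries while the linguistic features supply keys and values, both sequences are assumed to be time-aligned with the same dimension, and the names `local_mask`, `local_multihead_fusion`, `classify_topic`, `Wq`, `Wk`, `Wv`, and `Wo` are hypothetical. Random arrays stand in for the AM and P2W outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_mask(n_frames, window):
    """Boolean (T, T) mask: frame i may attend to frame j only if |i - j| <= window."""
    idx = np.arange(n_frames)
    return np.abs(idx[:, None] - idx[None, :]) <= window


def local_multihead_fusion(acoustic, linguistic, params, n_heads, window):
    """Fuse two time-aligned (T, d) sequences with windowed multi-head attention.

    Assumption: queries come from the acoustic features, keys and values
    from the linguistic features. The abstract does not state this.
    """
    n_frames, dim = acoustic.shape
    d_head = dim // n_heads

    def split_heads(x):
        # (T, d) -> (n_heads, T, d_head)
        return x.reshape(n_frames, n_heads, d_head).transpose(1, 0, 2)

    q = split_heads(acoustic @ params["Wq"])
    k = split_heads(linguistic @ params["Wk"])
    v = split_heads(linguistic @ params["Wv"])

    # Scaled dot-product scores, with positions outside the window masked out.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)      # (n_heads, T, T)
    scores = np.where(local_mask(n_frames, window)[None], scores, -np.inf)
    weights = softmax(scores, axis=-1)

    # Merge heads back to (T, d), then combine with the acoustic stream.
    context = (weights @ v).transpose(1, 0, 2).reshape(n_frames, dim)
    fused = np.concatenate([acoustic, context], axis=-1) @ params["Wo"]
    return fused, weights


def classify_topic(fused, w_cls, b_cls):
    """Mean-pool over time and apply a linear softmax classifier (assumed head)."""
    return softmax(fused.mean(axis=0) @ w_cls + b_cls)


# Toy stand-ins for the deep acoustic (AM) and deep linguistic (P2W) features.
rng = np.random.default_rng(0)
T, d, n_heads, window, n_topics = 12, 16, 4, 2, 6
acoustic = rng.standard_normal((T, d))
linguistic = rng.standard_normal((T, d))
params = {
    "Wq": rng.standard_normal((d, d)) / np.sqrt(d),
    "Wk": rng.standard_normal((d, d)) / np.sqrt(d),
    "Wv": rng.standard_normal((d, d)) / np.sqrt(d),
    "Wo": rng.standard_normal((2 * d, d)) / np.sqrt(2 * d),
}

fused, weights = local_multihead_fusion(acoustic, linguistic, params, n_heads, window)
probs = classify_topic(fused, rng.standard_normal((d, n_topics)), np.zeros(n_topics))
print(fused.shape, weights.shape, probs.shape)
```

Restricting attention to a window keeps each fused frame tied to nearby linguistic context, which is one plausible reading of "local"; a chunk-based or per-segment variant would be an equally reasonable interpretation of the abstract.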

Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2106.08637 [cs.CL]
	(or arXiv:2106.08637v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2106.08637

Submission history

From: Tan Liu
[v1] Wed, 16 Jun 2021 08:54:31 UTC (702 KB)
