DNN-based ensemble singing voice synthesis with interactions between singers
Authors:
Hiroaki Hyodo,
Shinnosuke Takamichi,
Tomohiko Nakamura,
Junya Koguchi,
Hiroshi Saruwatari
Abstract:
We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since the production of ensemble voices from solo singing voices ignores the interactions, it can degra…
▽ More
We propose a singing voice synthesis (SVS) method for a more unified ensemble singing voice by modeling interactions between singers. Most existing SVS methods aim to synthesize a solo voice, and do not consider interactions between singers, i.e., adjusting one's own voice to the others' voices. Since the production of ensemble voices from solo singing voices ignores the interactions, it can degrade the unity of the vocal ensemble. Therefore, we propose a SVS that reproduces the interactions. It is based on an architecture that uses musical scores of multiple voice parts, and loss functions that simulate the interactions' effect to acoustic features. Experimental results show that our methods improve the unity of the vocal ensemble.
△ Less
Submitted 16 September, 2024;
originally announced September 2024.
PJS: phoneme-balanced Japanese singing voice corpus
Authors:
Junya Koguchi,
Shinnosuke Takamichi
Abstract:
This paper presents a free Japanese singing voice corpus that can be used for highly applicable and reproducible singing voice synthesis research. A singing voice corpus helps develop singing voice synthesis, but existing corpora have two critical problems: data imbalance (singing voice corpora do not guarantee phoneme balance, unlike speaking-voice corpora) and copyright issues (cannot legally sh…
▽ More
This paper presents a free Japanese singing voice corpus that can be used for highly applicable and reproducible singing voice synthesis research. A singing voice corpus helps develop singing voice synthesis, but existing corpora have two critical problems: data imbalance (singing voice corpora do not guarantee phoneme balance, unlike speaking-voice corpora) and copyright issues (cannot legally share data). As a way to avoid these problems, we constructed a PJS (phoneme-balanced Japanese singing voice) corpus that guarantees phoneme balance and is licensed with CC BY-SA 4.0, and we composed melodies using a phoneme-balanced speaking-voice corpus. This paper describes how we built the corpus.
△ Less
Submitted 4 June, 2020;
originally announced June 2020.