Fig. 5 | EURASIP Journal on Audio, Speech, and Music Processing

Fig. 5

From: Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation

MUSHRA naturalness scores for single-speaker and multi-speaker models trained using cross-lingual transfer learning: (a) single-speaker models trained with one or both high-resource languages; (b) multi-speaker models trained with one or both high-resource languages. M_SJ (Japanese), M_SE10 (English, 10 hours), M_SE24 (English, 24 hours), M_SEJ (English and Japanese): sequentially trained single-speaker models. M_MJ (Japanese), M_ME10 (English, 10 hours), M_ME24 (English, 24 hours), M_MEJ (English and Japanese): simultaneously trained multi-speaker models.
