Fig. 5

MUSHRA naturalness scores for single-speaker and multi-speaker models trained using cross-lingual transfer learning: (a) single-speaker models trained with one or both high-resource languages; (b) multi-speaker models trained with one or both high-resource languages. M_SJ (Japanese), M_SE10 (English, 10 hours), M_SE24 (English, 24 hours), M_SEJ (English and Japanese): sequentially trained single-speaker models. M_MJ (Japanese), M_ME10 (English, 10 hours), M_ME24 (English, 24 hours), M_MEJ (English and Japanese): simultaneously trained multi-speaker models.