Fig. 10
From: Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation

MUSHRA naturalness scores for all single-speaker and multi-speaker models. M-MN: TTS model trained with 12 h of target language data; M-MN 30: TTS model trained from scratch with only 30 min of target language data; M SEJ: sequentially trained single-speaker model; M MEJ: simultaneously trained multi-speaker model; TL: cross-lingual transfer learning; DA: data augmentation; DA D: data augmentation method with additional fine-tuning