Fig. 4
From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

Example of sentence, phrase, word, syllable, and phone level scales when the number of scales in each level (λ) is set to 3. The red, blue, and yellow curves represent the scales in each level when the temporal duration (D_i) is calculated with i=1, i=2, and i=3, respectively.