Table 2 MCD and F0-RMSE results for different emotions

From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

Method	MCD (N2A)	MCD (N2S)	MCD (N2H)	F0-RMSE (N2A)	F0-RMSE (N2S)	F0-RMSE (N2H)
Source	6.03	5.18	6.30	76.8	73.7	100.4
DBNs+LG	5.67	4.88	5.55	76.3	72.0	99.3
DBNs+NMF	5.67	4.88	5.54	70.4	62.3	75.2
DBNs+CWT(30)	5.68	4.88	5.55	39.5	40.1	64.5
DBNs+CWT(40)	5.68	4.88	5.55	41.6	40.5	67.5
DBNs+AS-CWT	5.68	4.88	5.55	41.5	39.4	63.2

N2A, N2S, and N2H denote the neutral-to-angry, neutral-to-sad, and neutral-to-happy voice conversion datasets, respectively. MCD is the mel-cepstral distortion and F0-RMSE is the root-mean-square error of the F0 contour.
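
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how MCD and F0-RMSE are commonly computed. It is not taken from the article: the function names, the assumption that frames are already time-aligned, the exclusion of the energy coefficient by the caller, and the restriction of F0-RMSE to frames voiced in both contours are all illustrative assumptions, and the authors' exact definitions may differ.

```python
import math


def mel_cepstral_distortion(target, converted):
    """Mean MCD over time-aligned frames, using the common dB definition.

    Assumption: each frame is a list of mel-cepstral coefficients with the
    energy (0th) coefficient already removed by the caller.
    """
    if not target or len(target) != len(converted):
        raise ValueError("need the same non-zero number of frames")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for t_frame, c_frame in zip(target, converted):
        squared = sum((t - c) ** 2 for t, c in zip(t_frame, c_frame))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(target)


def f0_rmse(target_f0, converted_f0):
    """RMSE between two F0 contours.

    Assumption: unvoiced frames are marked with F0 <= 0, and only frames
    voiced in both contours are compared.
    """
    pairs = [(t, c) for t, c in zip(target_f0, converted_f0) if t > 0 and c > 0]
    if not pairs:
        raise ValueError("no frames are voiced in both contours")
    return math.sqrt(sum((t - c) ** 2 for t, c in pairs) / len(pairs))
```

Under these definitions, lower values in either column mean the converted speech is closer to the target emotional speech.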
