EURASIP Journal on Audio, Speech, and Music Processing

Table 1 MCD and F0-RMSE results for different emotions

From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

Method	MCD (A2N)	MCD (S2N)	MCD (H2N)	F0-RMSE (A2N)	F0-RMSE (S2N)	F0-RMSE (H2N)
Source	6.03	5.18	6.30	76.8	73.7	100.4
DBNs+LG	5.47	4.77	5.92	76.1	73.5	85.2
DBN+NMF	5.46	4.78	5.93	69.4	66.9	74.3
DBN+CWT(30)	5.47	4.77	5.93	61.6	64.2	75.9
DBN+CWT(40)	5.47	4.77	5.93	62.3	67.2	76.1
DBN+AS-CWT	5.47	4.77	5.93	51.1	52.1	64.4

A2N, S2N, and H2N denote the angry-to-neutral, sad-to-neutral, and happy-to-neutral voice conversion datasets, respectively.
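
To compare the F0-RMSE figures across methods, one option is to express each as a relative reduction against the unconverted Source row. The following is a minimal sketch of that arithmetic, using only the values in Table 1. The names `F0_RMSE`, `relative_reduction`, and `reductions_vs_source` are illustrative and do not come from the article, and the relative-reduction measure itself is an assumption made here for illustration, not a metric the article reports.

```python
# F0-RMSE values copied from Table 1, keyed by method and dataset.
F0_RMSE = {
    "Source":      {"A2N": 76.8, "S2N": 73.7, "H2N": 100.4},
    "DBNs+LG":     {"A2N": 76.1, "S2N": 73.5, "H2N": 85.2},
    "DBN+NMF":     {"A2N": 69.4, "S2N": 66.9, "H2N": 74.3},
    "DBN+CWT(30)": {"A2N": 61.6, "S2N": 64.2, "H2N": 75.9},
    "DBN+CWT(40)": {"A2N": 62.3, "S2N": 67.2, "H2N": 76.1},
    "DBN+AS-CWT":  {"A2N": 51.1, "S2N": 52.1, "H2N": 64.4},
}


def relative_reduction(baseline, value):
    """Fractional reduction of `value` relative to `baseline` (hypothetical helper)."""
    return (baseline - value) / baseline


def reductions_vs_source(table, method):
    """Relative F0-RMSE reduction of `method` against the Source row, per dataset."""
    source = table["Source"]
    return {
        dataset: relative_reduction(source[dataset], value)
        for dataset, value in table[method].items()
    }


if __name__ == "__main__":
    for method in F0_RMSE:
        reductions = reductions_vs_source(F0_RMSE, method)
        row = "  ".join(f"{d}: {100 * r:5.1f}%" for d, r in reductions.items())
        print(f"{method:<12} {row}")
```

Under this measure, DBN+AS-CWT lowers F0-RMSE by roughly a third relative to Source on A2N and H2N, and by slightly less on S2N. The MCD columns differ by at most 0.01 across the converted methods.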
