Table 1 MCD and F0-RMSE results for different emotions

From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

 

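| Method | MCD (A2N) | MCD (S2N) | MCD (H2N) | F0-RMSE (A2N) | F0-RMSE (S2N) | F0-RMSE (H2N) |
|---|---|---|---|---|---|---|
| Source | 6.03 | 5.18 | 6.30 | 76.8 | 73.7 | 100.4 |
| DBNs+LG | 5.47 | 4.77 | 5.92 | 76.1 | 73.5 | 85.2 |
| DBN+NMF | 5.46 | 4.78 | 5.93 | 69.4 | 66.9 | 74.3 |
| DBN+CWT(30) | 5.47 | 4.77 | 5.93 | 61.6 | 64.2 | 75.9 |
| DBN+CWT(40) | 5.47 | 4.77 | 5.93 | 62.3 | 67.2 | 76.1 |
| DBN+AS-CWT | 5.47 | 4.77 | 5.93 | 51.1 | 52.1 | 64.4 |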

  1. A2N, S2N, and H2N denote the angry-to-neutral, sad-to-neutral, and happy-to-neutral voice conversion datasets, respectively
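
For readers unfamiliar with the two metrics in Table 1, the following is a minimal sketch of how MCD and F0-RMSE are commonly computed. It assumes the widely used definitions (MCD in dB over time-aligned mel-cepstral frames, and F0-RMSE over frames voiced in both signals); the table does not state the exact formulation or units used by the authors, so the function names, the voicing rule, and the input layout here are illustrative assumptions, not the paper's implementation.

```python
import math


def mel_cepstral_distortion(target, converted):
    """Mean mel-cepstral distortion in dB over time-aligned frames.

    Assumption: each frame is a sequence of mel-cepstral coefficients
    with the 0th (energy) coefficient already removed, and the two
    sequences have been aligned frame by frame (e.g. by DTW).
    """
    if not target or len(target) != len(converted):
        raise ValueError("sequences must be non-empty and equally long")
    const = 10.0 / math.log(10.0)
    total = 0.0
    for t_frame, c_frame in zip(target, converted):
        sq_diff = sum((a - b) ** 2 for a, b in zip(t_frame, c_frame))
        total += const * math.sqrt(2.0 * sq_diff)
    return total / len(target)


def f0_rmse(target_f0, converted_f0):
    """Root-mean-square error between two aligned F0 contours.

    Assumption: unvoiced frames are marked with F0 <= 0 and are skipped,
    so only frames voiced in both contours contribute.
    """
    pairs = [(a, b) for a, b in zip(target_f0, converted_f0) if a > 0 and b > 0]
    if not pairs:
        raise ValueError("no frames are voiced in both contours")
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))
```

Under these definitions, lower values mean the converted speech is closer to the target in both metrics, which is how the rows of Table 1 are compared.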