Table 1 MCD and F0-RMSE results for different emotions

From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform

              MCD                   F0-RMSE
              A2N    S2N    H2N     A2N    S2N    H2N
Source        6.03   5.18   6.30    76.8   73.7   100.4
DBNs+LG       5.47   4.77   5.92    76.1   73.5   85.2
DBN+NMF       5.46   4.78   5.93    69.4   66.9   74.3
DBN+CWT(30)   5.47   4.77   5.93    61.6   64.2   75.9
DBN+CWT(40)   5.47   4.77   5.93    62.3   67.2   76.1
DBN+AS-CWT    5.47   4.77   5.93    51.1   52.1   64.4
  1. A2N, S2N, and H2N denote the angry-to-neutral, sad-to-neutral, and happy-to-neutral voice conversion datasets, respectively
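
For readers unfamiliar with the two metrics in Table 1, the following is a minimal sketch of how mel-cepstral distortion (MCD) and F0 root-mean-square error (F0-RMSE) are commonly computed. It is not taken from the article. The function names, the assumption that frames are already time-aligned, the exclusion of the 0th cepstral coefficient, and the restriction of F0-RMSE to frames voiced in both signals are all illustrative assumptions, so the exact definitions used for Table 1 may differ.

```python
import math


def mel_cepstral_distortion(target_frames, converted_frames):
    """Mean MCD over time-aligned frames, using the common dB-scaled form.

    Assumption: each frame is a sequence of mel-cepstral coefficients with
    the 0th (energy) coefficient already removed, and both sequences have
    been time-aligned (for example by dynamic time warping) beforehand.
    """
    if len(target_frames) != len(converted_frames) or not target_frames:
        raise ValueError("frame sequences must be non-empty and equal length")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for t_frame, c_frame in zip(target_frames, converted_frames):
        squared_diff = sum((t - c) ** 2 for t, c in zip(t_frame, c_frame))
        total += scale * math.sqrt(2.0 * squared_diff)
    return total / len(target_frames)


def f0_rmse(target_f0, converted_f0):
    """RMSE between two F0 contours.

    Assumption: unvoiced frames are marked with F0 = 0 and only frames
    voiced in both contours contribute to the error.
    """
    voiced = [(t, c) for t, c in zip(target_f0, converted_f0) if t > 0 and c > 0]
    if not voiced:
        raise ValueError("no frames are voiced in both contours")
    mean_squared = sum((t - c) ** 2 for t, c in voiced) / len(voiced)
    return math.sqrt(mean_squared)


if __name__ == "__main__":
    # Toy inputs, not data from the article.
    target_mcep = [[1.0, 0.5], [0.8, 0.4]]
    converted_mcep = [[0.9, 0.5], [0.8, 0.3]]
    print(mel_cepstral_distortion(target_mcep, converted_mcep))
    print(f0_rmse([100.0, 0.0, 200.0], [103.0, 150.0, 196.0]))
```

Under these definitions, lower values mean the converted speech is closer to the target for both metrics.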