From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform
Method | MCD (A2N) | MCD (S2N) | MCD (H2N) | F0-RMSE (A2N) | F0-RMSE (S2N) | F0-RMSE (H2N) |
---|---|---|---|---|---|---|
Source | 6.03 | 5.18 | 6.30 | 76.8 | 73.7 | 100.4 |
DBNs+LG | 5.47 | 4.77 | 5.92 | 76.1 | 73.5 | 85.2 |
DBN+NMF | 5.46 | 4.78 | 5.93 | 69.4 | 66.9 | 74.3 |
DBN+CWT(30) | 5.47 | 4.77 | 5.93 | 61.6 | 64.2 | 75.9 |
DBN+CWT(40) | 5.47 | 4.77 | 5.93 | 62.3 | 67.2 | 76.1 |
DBN+AS-CWT | 5.47 | 4.77 | 5.93 | 51.1 | 52.1 | 64.4 |