From: Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform
Method | MCD (N2A) | MCD (N2S) | MCD (N2H) | F0-RMSE (N2A) | F0-RMSE (N2S) | F0-RMSE (N2H) |
---|---|---|---|---|---|---|
Source | 6.03 | 5.18 | 6.30 | 76.8 | 73.7 | 100.4 |
DBNs+LG | 5.67 | 4.88 | 5.55 | 76.3 | 72.0 | 99.3 |
DBNs+NMF | 5.67 | 4.88 | 5.54 | 70.4 | 62.3 | 75.2 |
DBNs+CWT(30) | 5.68 | 4.88 | 5.55 | 39.5 | 40.1 | 64.5 |
DBNs+CWT(40) | 5.68 | 4.88 | 5.55 | 41.6 | 40.5 | 67.5 |
DBNs+AS-CWT | 5.68 | 4.88 | 5.55 | 41.5 | 39.4 | 63.2 |
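
The two metrics in the table can be sketched as follows. This is a minimal illustration under the commonly used definitions, not necessarily the exact evaluation code behind these numbers: the function names, the exclusion of unvoiced frames, and the units (dB for MCD, Hz for F0) are assumptions, and the inputs are assumed to be already time-aligned frame by frame. The column labels N2A, N2S, and N2H presumably denote neutral-to-angry, neutral-to-sad, and neutral-to-happy conversion.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean mel-cepstral distortion over time-aligned frames.

    Each frame is a sequence of mel-cepstral coefficients. The caller is
    assumed to have dropped the 0th (energy) coefficient already.
    Uses the common definition (10 / ln 10) * sqrt(2 * sum(diff^2)).
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("need two non-empty, equally long frame sequences")
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        squared_diff = sum((r - c) ** 2 for r, c in zip(ref, conv))
        total += scale * math.sqrt(squared_diff)
    return total / len(ref_frames)


def f0_rmse(ref_f0, conv_f0):
    """Root-mean-square F0 error over frames voiced in both contours.

    Unvoiced frames are assumed to be marked with F0 <= 0 and are skipped.
    """
    voiced = [(r, c) for r, c in zip(ref_f0, conv_f0) if r > 0 and c > 0]
    if not voiced:
        raise ValueError("no frames are voiced in both contours")
    mean_squared = sum((r - c) ** 2 for r, c in voiced) / len(voiced)
    return math.sqrt(mean_squared)
```

Lower values are better for both metrics. Read this way, the table shows the MCD columns staying nearly identical across the DBN-based methods, while the F0-RMSE columns are where the CWT-based variants differ from DBNs+LG and DBNs+NMF.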