Table 13 Subjective evaluation results, with standard deviation (std dev), of voice conversion in the cross-lingual scenario. The results are listed as MOS/std dev

From: U2-VC: one-shot voice conversion using two-level nested U-structure

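| Method | Similarity MOS/std dev: VCTK2VCC | Similarity MOS/std dev: VCC2VCTK | Similarity MOS/std dev: VCC2VCC | Similarity MOS: Average | Naturalness MOS/std dev: VCTK2VCC | Naturalness MOS/std dev: VCC2VCTK | Naturalness MOS/std dev: VCC2VCC | Naturalness MOS: Average |
|---|---|---|---|---|---|---|---|---|
| AdaIN-VC | 2.04/0.38 | 1.81/0.60 | 1.90/0.38 | 1.92 | 2.24/0.25 | 1.91/0.33 | 2.00/0.34 | 2.05 |
| AGAIN-VC | 2.94/0.45 | 2.47/0.58 | 2.68/0.44 | 2.70 | 3.19/0.50 | 2.74/0.48 | 2.58/0.60 | 2.84 |
| U2-VC | 3.33/0.37 | 3.10/0.48 | 3.14/0.49 | 3.19 | 3.78/0.48 | 3.44/0.46 | 3.26/0.40 | 3.49 |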