Table 9 Subjective comparison results with standard deviation (std dev) of mono-lingual conversion in the unseen-to-unseen scenario. The results are listed as MOS/std dev.

From: U2-VC: one-shot voice conversion using two-level nested U-structure

          MOS (similarity)/std dev                             MOS (naturalness)/std dev
Method    SF2TF      SF2TM      SM2TF      SM2TM      Average  SF2TF      SF2TM      SM2TF      SM2TM      Average
AdaIN-VC  2.01/0.31  2.02/0.48  2.04/0.29  2.10/0.29  2.04     2.07/0.61  2.08/0.67  2.05/0.58  2.06/0.68  2.07
AGAIN-VC  2.68/0.49  2.88/0.57  2.94/0.47  3.31/0.35  2.95     3.14/0.82  3.24/0.65  3.02/0.82  3.50/0.51  3.23
U2-VC     3.14/0.43  3.40/0.49  3.34/0.43  3.62/0.37  3.37     3.94/0.77  4.04/0.47  3.74/0.78  4.05/0.56  3.94
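
The Average columns can be sanity-checked against the per-scenario scores. The following is a minimal sketch, assuming each Average is the unweighted mean of the four scenario MOS values (the table does not state this explicitly). The names `TABLE_9` and `average_gaps` are illustrative and do not come from the paper.

```python
from statistics import mean

# Per-scenario MOS values copied from Table 9 (std dev omitted).
# Scenario order: SF2TF, SF2TM, SM2TF, SM2TM.
# Each entry is (four scenario scores, Average as reported in the table).
TABLE_9 = {
    "AdaIN-VC": {
        "similarity": ([2.01, 2.02, 2.04, 2.10], 2.04),
        "naturalness": ([2.07, 2.08, 2.05, 2.06], 2.07),
    },
    "AGAIN-VC": {
        "similarity": ([2.68, 2.88, 2.94, 3.31], 2.95),
        "naturalness": ([3.14, 3.24, 3.02, 3.50], 3.23),
    },
    "U2-VC": {
        "similarity": ([3.14, 3.40, 3.34, 3.62], 3.37),
        "naturalness": ([3.94, 4.04, 3.74, 4.05], 3.94),
    },
}


def average_gaps(table):
    """Return {(method, metric): |mean of the four scores - reported Average|}.

    Assumption: the Average column is an unweighted mean over scenarios.
    """
    gaps = {}
    for method, metrics in table.items():
        for metric, (scores, reported) in metrics.items():
            gaps[(method, metric)] = abs(mean(scores) - reported)
    return gaps


if __name__ == "__main__":
    for (method, metric), gap in average_gaps(TABLE_9).items():
        print(f"{method:9s} {metric:12s} gap = {gap:.4f}")
```

Under this assumption every reported Average lies within 0.01 of the recomputed mean. The match is only to rounding precision: for U2-VC similarity the mean of the listed scores is 3.375 against a reported 3.37, which would be consistent with the published averages having been computed from unrounded scores, though the table does not confirm this.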