
Table 13 Subjective evaluation results, with standard deviation (std dev), of voice conversion in the cross-lingual scenario. The results are listed as MOS/std dev.

From: U2-VC: one-shot voice conversion using two-level nested U-structure

| Method | Similarity: VCTK2VCC | Similarity: VCC2VCTK | Similarity: VCC2VCC | Similarity: Average | Naturalness: VCTK2VCC | Naturalness: VCC2VCTK | Naturalness: VCC2VCC | Naturalness: Average |
|---|---|---|---|---|---|---|---|---|
| AdaIN-VC | 2.04/0.38 | 1.81/0.60 | 1.90/0.38 | 1.92 | 2.24/0.25 | 1.91/0.33 | 2.00/0.34 | 2.05 |
| AGAIN-VC | 2.94/0.45 | 2.47/0.58 | 2.68/0.44 | 2.70 | 3.19/0.50 | 2.74/0.48 | 2.58/0.60 | 2.84 |
| U2-VC | 3.33/0.37 | 3.10/0.48 | 3.14/0.49 | 3.19 | 3.78/0.48 | 3.44/0.46 | 3.26/0.40 | 3.49 |
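
The Average entries in Table 13 are consistent with an unweighted mean of the three scenario MOS values (VCTK2VCC, VCC2VCTK, VCC2VCC), rounded to two decimals. The source does not state how the averages were computed, so the sketch below is only a check under that assumption. The names `TABLE_13`, `unweighted_mean`, and `check_averages` are hypothetical helpers introduced for illustration; the numbers are copied from the table, with the std dev values omitted.

```python
# Per-scenario MOS values (order: VCTK2VCC, VCC2VCTK, VCC2VCC) and the
# reported Average, copied from Table 13. Std dev values are omitted.
TABLE_13 = {
    ("AdaIN-VC", "similarity"): ([2.04, 1.81, 1.90], 1.92),
    ("AdaIN-VC", "naturalness"): ([2.24, 1.91, 2.00], 2.05),
    ("AGAIN-VC", "similarity"): ([2.94, 2.47, 2.68], 2.70),
    ("AGAIN-VC", "naturalness"): ([3.19, 2.74, 2.58], 2.84),
    ("U2-VC", "similarity"): ([3.33, 3.10, 3.14], 3.19),
    ("U2-VC", "naturalness"): ([3.78, 3.44, 3.26], 3.49),
}


def unweighted_mean(scores):
    """Assumed averaging rule: plain mean, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


def check_averages(table, tol=0.005):
    """Map each (method, metric) to whether the recomputed mean matches."""
    return {
        key: abs(unweighted_mean(scores) - reported) < tol
        for key, (scores, reported) in table.items()
    }


if __name__ == "__main__":
    for (method, metric), ok in check_averages(TABLE_13).items():
        status = "matches" if ok else "differs"
        print(f"{method} {metric}: recomputed mean {status} reported Average")
```

Under this assumption, all six reported averages are reproduced, which suggests the Average column weights the three scenarios equally.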