
Table 7 Subjective comparison results, with standard deviation (std dev), for mono-lingual conversion in the seen-to-seen scenario. The results are listed as MOS/std dev

From: U2-VC: one-shot voice conversion using two-level nested U-structure

 

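| Method | SF2TF (similarity) | SF2TM (similarity) | SM2TF (similarity) | SM2TM (similarity) | Average (similarity) | SF2TF (naturalness) | SF2TM (naturalness) | SM2TF (naturalness) | SM2TM (naturalness) | Average (naturalness) |
|---|---|---|---|---|---|---|---|---|---|---|
| AdaIN-VC | 2.00/0.30 | 2.04/0.37 | 2.04/0.30 | 2.19/0.40 | 2.07 | 2.07/0.77 | 2.01/0.35 | 2.10/0.32 | 2.21/0.46 | 2.10 |
| AGAIN-VC | 2.92/0.46 | 2.76/0.40 | 2.87/0.45 | 3.40/0.59 | 2.99 | 3.38/0.65 | 3.18/0.49 | 3.12/0.41 | 3.62/0.43 | 3.33 |
| U2-VC | 3.30/0.41 | 3.24/0.40 | 3.28/0.38 | 4.02/0.42 | 3.46 | 3.91/0.56 | 3.92/0.47 | 3.78/0.31 | 4.16/0.32 | 3.94 |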