Table 9 Subjective comparison results with standard deviation (std dev) of mono-lingual conversion in the unseen-to-unseen scenario. The results are listed as MOS/std dev.

From: U2-VC: one-shot voice conversion using two-level nested U-structure

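MOS (similarity)/std dev

| Method   | SF2TF     | SF2TM     | SM2TF     | SM2TM     | Average |
|----------|-----------|-----------|-----------|-----------|---------|
| AdaIN-VC | 2.01/0.31 | 2.02/0.48 | 2.04/0.29 | 2.10/0.29 | 2.04    |
| AGAIN-VC | 2.68/0.49 | 2.88/0.57 | 2.94/0.47 | 3.31/0.35 | 2.95    |
| U2-VC    | 3.14/0.43 | 3.40/0.49 | 3.34/0.43 | 3.62/0.37 | 3.37    |

MOS (naturalness)/std dev

| Method   | SF2TF     | SF2TM     | SM2TF     | SM2TM     | Average |
|----------|-----------|-----------|-----------|-----------|---------|
| AdaIN-VC | 2.07/0.61 | 2.08/0.67 | 2.05/0.58 | 2.06/0.68 | 2.07    |
| AGAIN-VC | 3.14/0.82 | 3.24/0.65 | 3.02/0.82 | 3.50/0.51 | 3.23    |
| U2-VC    | 3.94/0.77 | 4.04/0.47 | 3.74/0.78 | 4.05/0.56 | 3.94    |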
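
The Average entries in Table 9 can be sanity-checked against the four per-pair scores. The sketch below is only an illustration: it assumes that each Average is the unweighted mean of the SF2TF, SF2TM, SM2TF and SM2TM MOS values (the table itself does not say how the averages were computed), and the names `similarity`, `naturalness` and `check_averages` are hypothetical helpers, not part of the original work. Because the published scores are already rounded to two decimals, the check uses a tolerance of 0.01 rather than exact equality.

```python
# MOS values transcribed from Table 9 (std dev omitted).
# Per-pair order: SF2TF, SF2TM, SM2TF, SM2TM; second item is the reported Average.
similarity = {
    "AdaIN-VC": ([2.01, 2.02, 2.04, 2.10], 2.04),
    "AGAIN-VC": ([2.68, 2.88, 2.94, 3.31], 2.95),
    "U2-VC": ([3.14, 3.40, 3.34, 3.62], 3.37),
}
naturalness = {
    "AdaIN-VC": ([2.07, 2.08, 2.05, 2.06], 2.07),
    "AGAIN-VC": ([3.14, 3.24, 3.02, 3.50], 3.23),
    "U2-VC": ([3.94, 4.04, 3.74, 4.05], 3.94),
}


def check_averages(table, tolerance=0.01):
    """Return {method: (recomputed mean, reported average, agrees)}.

    Assumption: the reported average is the unweighted mean of the
    four conversion pairs, up to rounding of the published values.
    """
    result = {}
    for method, (scores, reported) in table.items():
        mean = sum(scores) / len(scores)
        result[method] = (mean, reported, abs(mean - reported) <= tolerance)
    return result


if __name__ == "__main__":
    for label, table in (("similarity", similarity), ("naturalness", naturalness)):
        for method, (mean, reported, agrees) in check_averages(table).items():
            print(f"{label:12s} {method:9s} mean={mean:.4f} reported={reported:.2f} agrees={agrees}")
```

Under this assumption, every reported Average lies within 0.01 of the recomputed mean, which is consistent with the averages having been taken over unrounded per-pair scores.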