Table 1 MOS results with 95% confidence intervals for the speech naturalness of VQ-VAE, CTC-VQ-VAE, FragmentVC, and the proposed method (W2VC)

From: W2VC: WavLM representation based one-shot voice conversion with gradient reversal distillation and CTC supervision

| Method | Intra-gender | Inter-gender | Average |
|---|---|---|---|
| VQ-VAE | 2.08 ± 0.1912 | 2.18 ± 0.2297 | 2.13 ± 0.2817 |
| CTC-VQ-VAE | 3.36 ± 0.1546 | 3.22 ± 0.1754 | 3.29 ± 0.1465 |
| FragmentVC | 1.54 ± 0.1497 | 1.54 ± 0.2216 | 1.54 ± 0.2437 |
| W2VC | 4.42 ± 0.1737 | 4.48 ± 0.1632 | 4.45 ± 0.1970 |
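
For readers unfamiliar with the "mean ± interval" notation used in the table, the following is a minimal sketch of one common way to obtain such a figure from listener ratings. It assumes a normal approximation (z = 1.96) and the sample standard deviation; the table does not state which procedure the authors used, so this may differ from theirs. The function name `mos_with_ci` and the ratings are hypothetical and are not data from the paper.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean, half_width) for a list of opinion scores.

    Assumption: a normal-approximation 95% interval, i.e.
    half_width = z * s / sqrt(n), with s the sample standard deviation.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate spread")
    mean = statistics.fmean(ratings)
    sem = statistics.stdev(ratings) / math.sqrt(n)  # standard error of the mean
    return mean, z * sem


# Hypothetical 1-5 ratings for a single system (illustration only).
example_ratings = [4, 5, 4, 5, 4, 4, 5, 5, 4, 4]
mean, half_width = mos_with_ci(example_ratings)
print(f"{mean:.2f} ± {half_width:.4f}")
```

With few listeners, a Student's t quantile in place of 1.96 would give a slightly wider interval.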