
Table 3 Objective evaluation results of the ablation study on architecture in the unseen-to-unseen conversion scenario. “AGAIN-VC” denotes the network with neither the U2-Net structure nor SaAdaIN; “U2-VC” denotes the network with both the U2-Net structure and SaAdaIN.

From: U2-VC: one-shot voice conversion using two-level nested U-structure

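| Method | MCD (dB), SF2TF | MCD (dB), SF2TM | MCD (dB), SM2TF | MCD (dB), SM2TM | MCD (dB), Average | Predicted MOS by NISQA, SF2TF | Predicted MOS by NISQA, SF2TM | Predicted MOS by NISQA, SM2TF | Predicted MOS by NISQA, SM2TM | Predicted MOS by NISQA, Average |
|---|---|---|---|---|---|---|---|---|---|---|
| AGAIN-VC | 5.95 | 6.03 | 5.96 | 6.02 | 5.99 | 3.71 | 3.75 | 3.82 | 3.93 | 3.80 |
| w/o SaAdaIN, with U2-Net | 6.11 | 6.16 | 6.19 | 6.20 | 6.17 | 3.85 | 3.91 | 3.81 | 3.97 | 3.89 |
| w/o U2-Net, with SaAdaIN | 6.01 | 6.02 | 5.96 | 6.01 | 6.05 | 3.88 | 3.74 | 3.83 | 3.89 | 3.84 |
| U2-VC | 6.01 | 6.09 | 6.02 | 6.03 | 6.04 | 4.00 | 3.95 | 3.85 | 3.97 | 3.94 |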