
Table 7 Overall SDRi/SI-SNRi (dB) performance with different configurations

From: Heterogeneous separation consistency training for adaptation of unsupervised speech separation

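| Dataset | System | Baseline | SCT | Supervised |
| --- | --- | --- | --- | --- |
| Aishell2Mix | Conv-TasNet | 2.57/2.08 | 6.15/5.52 | 9.00/8.32 |
| Aishell2Mix | DPCCN | 5.78/5.09 | 6.48/5.82 | 8.86/8.14 |
| WHAMR! | Conv-TasNet | 6.83/6.45 | 8.48/8.06 | 11.03/10.59 |
| WHAMR! | DPCCN | 8.99/8.50 | 9.26/8.81 | 11.01/10.56 |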

  1. “Baseline” denotes a model trained on the source domain Libri2Mix and evaluated on the target domains Aishell2Mix and WHAMR!. “SCT” is the best adaptation configuration, i.e., SCT-2 with CPS-2. “Supervised” denotes a model trained with ground-truth labels.
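As a reading aid, the gain of SCT over the baseline can be worked out directly from the table entries. The following is a minimal sketch in Python. The values are copied from Table 7, while the names `table7` and `sct_gain` are hypothetical and introduced only for this illustration.

```python
# Values copied from Table 7 as (SDRi, SI-SNRi) pairs in dB.
# The dictionary layout and all names here are illustrative assumptions.
table7 = {
    ("Aishell2Mix", "Conv-TasNet"): {
        "Baseline": (2.57, 2.08), "SCT": (6.15, 5.52), "Supervised": (9.00, 8.32)},
    ("Aishell2Mix", "DPCCN"): {
        "Baseline": (5.78, 5.09), "SCT": (6.48, 5.82), "Supervised": (8.86, 8.14)},
    ("WHAMR!", "Conv-TasNet"): {
        "Baseline": (6.83, 6.45), "SCT": (8.48, 8.06), "Supervised": (11.03, 10.59)},
    ("WHAMR!", "DPCCN"): {
        "Baseline": (8.99, 8.50), "SCT": (9.26, 8.81), "Supervised": (11.01, 10.56)},
}


def sct_gain(row):
    """Return (SDRi gain, SI-SNRi gain) of SCT over Baseline, in dB."""
    return tuple(round(s - b, 2) for s, b in zip(row["SCT"], row["Baseline"]))


# Gain per (dataset, system) pair.
gains = {key: sct_gain(row) for key, row in table7.items()}

for (dataset, system), (sdri, sisnri) in gains.items():
    print(f"{dataset:12s} {system:12s} +{sdri:.2f} dB SDRi, +{sisnri:.2f} dB SI-SNRi")
```

With these numbers, the gain is largest for Conv-TasNet on Aishell2Mix (3.58 dB SDRi) and smallest for DPCCN on WHAMR! (0.27 dB SDRi).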