From: Multi-task deep cross-attention networks for far-field speaker verification and keyword spotting
| | SNR | Seen EER (%) | Seen minDCF | Seen Acc (%) | Unseen EER (%) | Unseen minDCF | Unseen Acc (%) |
|---|---|---|---|---|---|---|---|
| Noise | −5 dB | 4.11 | 0.193 | 90.72 | 8.07 | 0.406 | 87.85 |
| Noise | 0 dB | 1.34 | 0.947 | 94.70 | 4.92 | 0.266 | 91.95 |
| Noise | 5 dB | 0.63 | 0.037 | 96.49 | 3.68 | 0.206 | 93.75 |
| Noise | 10 dB | 0.27 | 0.019 | 97.45 | 2.81 | 0.175 | 94.65 |
| Noise | 15 dB | 0.20 | 0.014 | 97.85 | 2.44 | 0.158 | 95.17 |
| Music | −5 dB | 5.13 | 0.251 | 89.83 | 9.54 | 0.453 | 86.97 |
| Music | 0 dB | 1.49 | 0.082 | 94.68 | 5.14 | 0.266 | 91.75 |
| Music | 5 dB | 0.53 | 0.032 | 96.68 | 3.55 | 0.201 | 94.03 |
| Music | 10 dB | 0.26 | 0.014 | 97.76 | 2.53 | 0.162 | 94.87 |
| Music | 15 dB | 0.17 | 0.007 | 98.04 | 2.24 | 0.143 | 95.35 |