From: A robust polynomial regression-based voice activity detector for speaker verification
Noise type | SNR level (dB) | Proposed algorithm | Drugman’s VAD | Rangachari’s noise tracking | EER reduction compared to Drugman’s (%) | EER reduction compared to Rangachari’s (%) |
---|---|---|---|---|---|---|
Lynx | −10 | 34.25 (0.64) | 46.10 (0.85) | 47.4 (0.87) | 25.70 | 27.74 |
Lynx | −5 | 25.30 (0.47) | 32.18 (0.60) | 39.22 (0.72) | 21.38 | 35.49 |
Lynx | 0 | 15.29 (0.28) | 14.60 (0.27) | 22.47 (0.42) | −4.72 | 31.95 |
Lynx | 5 | 8.41 (0.15) | 8.41 (0.15) | 13.45 (0.24) | 0 | 37.47 |
Lynx | 10 | 5.42 (0.10) | 6.50 (0.12) | 9.93 (0.18) | 16.61 | 45.41 |
F16 | −10 | 41.28 (0.78) | 48.31 (0.88) | 48.16 (0.89) | 14.55 | 14.28 |
F16 | −5 | 31.88 (0.60) | 41.82 (0.80) | 45.18 (0.84) | 23.77 | 29.43 |
F16 | 0 | 20.87 (0.39) | 24.38 (0.46) | 33.4 (0.60) | 14.40 | 37.51 |
F16 | 5 | 11.85 (0.22) | 11.54 (0.21) | 18.19 (0.34) | −2.68 | 34.85 |
F16 | 10 | 6.95 (0.13) | 7.8 (0.14) | 12.46 (0.23) | 10.89 | 44.22 |
Car | −10 | 5.96 (0.10) | 6.27 (0.11) | 8.94 (0.16) | 4.94 | 33.33 |
Car | −5 | 4.74 (0.08) | 5.88 (0.10) | 8.35 (0.15) | 19.38 | 43.23 |
Car | 0 | 4.35 (0.08) | 5.50 (0.10) | 8.18 (0.15) | 20.91 | 46.82 |
Car | 5 | 4.05 (0.07) | 5.27 (0.09) | 7.95 (0.14) | 23.15 | 49.05 |
Car | 10 | 4.05 (0.07) | 5.12 (0.09) | 7.95 (0.14) | 20.90 | 49.05 |
Babble | −10 | 36.85 (0.69) | 48.08 (0.87) | 47.85 (0.88) | 23.35 | 22.98 |
Babble | −5 | 26.83 (0.50) | 38.45 (0.72) | 43.94 (0.87) | 30.22 | 42.84 |
Babble | 0 | 17.50 (0.33) | 19.49 (0.36) | 28.28 (0.51) | 10.21 | 38.11 |
Babble | 5 | 10.01 (0.18) | 10.16 (0.18) | 14.52 (0.27) | 1.47 | 31.06 |
Babble | 10 | 6.72 (0.12) | 7.26 (0.13) | 10.93 (0.20) | 7.44 | 38.51 |
Stitel | −10 | 42.66 (0.79) | 47.17 (0.86) | 45.18 (0.84) | 9.56 | 5.57 |
Stitel | −5 | 33.71 (0.62) | 37.23 (0.69) | 37.15 (0.69) | 9.45 | 9.26 |
Stitel | 0 | 19.95 (0.37) | 19.26 (0.36) | 20.41 (0.38) | −3.58 | 2.25 |
Stitel | 5 | 9.40 (0.17) | 9.71 (0.18) | 11.62 (0.21) | 3.19 | 19.10 |
Stitel | 10 | 5.96 (0.11) | 6.95 (0.12) | 9.32 (0.17) | 14.24 | 36.05 |
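
The two EER reduction columns appear to be consistent with a relative reduction with respect to each baseline, that is, 100 × (baseline EER − proposed EER) / baseline EER. The table itself does not state this formula, so it is an assumption inferred from the tabulated values. The function name `relative_eer_reduction` is hypothetical and used only for illustration. A minimal sketch that checks the Lynx, −10 dB row:

```python
def relative_eer_reduction(baseline_eer: float, proposed_eer: float) -> float:
    """Relative EER reduction in percent, measured against the baseline.

    Assumed formula (not stated in the table):
        100 * (baseline - proposed) / baseline
    A negative result means the baseline had the lower EER.
    """
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


# EER values copied from the Lynx, -10 dB row of the table.
proposed, drugman, rangachari = 34.25, 46.10, 47.4

print(f"{relative_eer_reduction(drugman, proposed):.2f}")     # 25.70
print(f"{relative_eer_reduction(rangachari, proposed):.2f}")  # 27.74
```

Under this reading, the negative entries in the table (for example Lynx at 0 dB against Drugman’s VAD) mark conditions where the baseline reached a lower EER than the proposed algorithm.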