From: A robust polynomial regression-based voice activity detector for speaker verification
| Noise type | SNR level (dB) | Proposed algorithm | Drugman’s VAD | Rangachari’s noise tracking | EER reduction compared to Drugman’s (%) | EER reduction compared to Rangachari’s (%) |
|---|---|---|---|---|---|---|
| Lynx | −10 | 30.04 (0.55) | 43.57 (0.81) | 43.57 (0.81) | 31.05 | 31.05 |
| | −5 | 17.58 (0.33) | 28.36 (0.53) | 29.51 (0.55) | 38.01 | 40.42 |
| | 0 | 9.48 (0.17) | 14.37 (0.27) | 17.43 (0.32) | 34.03 | 45.61 |
| | 5 | 5.58 (0.09) | 8.25 (0.14) | 11.39 (0.21) | 32.36 | 51 |
| | 10 | 3.90 (0.06) | 5.35 (0.09) | 7.72 (0.14) | 27.10 | 49.48 |
| F16 | −10 | 38.60 (0.73) | 48.16 (0.89) | 48.93 (0.89) | 19.85 | 21.11 |
| | −5 | 27.44 (0.52) | 38.07 (0.71) | 37.08 (0.70) | 27.92 | 26 |
| | 0 | 15.82 (0.30) | 21.71 (0.40) | 23.39 (0.43) | 27.13 | 32.36 |
| | 5 | 8.48 (0.15) | 11.31 (0.21) | 14.98 (0.27) | 25.02 | 43.39 |
| | 10 | 5.35 (0.09) | 7.41 (0.13) | 10.16 (0.18) | 27.8 | 47.34 |
| Car | −10 | 3.74 (0.06) | 4.66 (0.08) | 6.04 (0.11) | 19.74 | 38.08 |
| | −5 | 3.28 (0.05) | 4.43 (0.07) | 3.66 (0.06) | 25.95 | 10.38 |
| | 0 | 2.98 (0.05) | 4.20 (0.06) | 5.58 (0.09) | 29.04 | 46.59 |
| | 5 | 3.13 (0.05) | 4.05 (0.06) | 5.65 (0.09) | 22.71 | 44.60 |
| | 10 | 3.13 (0.04) | 3.97 (0.06) | 5.58 (0.09) | 21.15 | 43.90 |
| Babble | −10 | 31.88 (0.60) | 47.24 (0.87) | 47.09 (0.88) | 32.51 | 32.3 |
| | −5 | 19.95 (0.37) | 32.95 (0.61) | 42.66 (0.80) | 39.45 | 53.23 |
| | 0 | 10.85 (0.19) | 18.19 (0.34) | 20.18 (0.38) | 40.35 | 46.23 |
| | 5 | 5.65 (0.10) | 9.25 (0.17) | 12.00 (0.22) | 38.92 | 52.91 |
| | 10 | 4.35 (0.07) | 6.11 (0.11) | 8.56 (0.15) | 28.80 | 49.18 |
| Stitel | −10 | 37.53 (0.71) | 46.56 (0.87) | 45.87 (0.86) | 19.39 | 18.18 |
| | −5 | 22.24 (0.42) | 32.11 (0.60) | 31.72 (0.60) | 30.73 | 29.88 |
| | 0 | 11.23 (0.20) | 17.35 (0.32) | 15.75 (0.29) | 35.27 | 28.7 |
| | 5 | 5.81 (0.11) | 9.17 (0.17) | 10.24 (0.19) | 36.64 | 43.26 |
| | 10 | 4.05 (0.07) | 6.34 (0.11) | 7.41 (0.13) | 36.12 | 45.34 |
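
The two EER reduction columns appear to be relative reductions in percent, that is, (baseline EER − proposed EER) / baseline EER × 100. This formula is an assumption inferred from the tabulated values, not one stated in the table itself; the minimal sketch below uses a hypothetical helper, `relative_eer_reduction`, to check it against two rows.

```python
def relative_eer_reduction(baseline_eer: float, proposed_eer: float) -> float:
    """Relative EER reduction in percent.

    Assumed formula (inferred from the table, not stated in it):
    (baseline - proposed) / baseline * 100.
    """
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return (baseline_eer - proposed_eer) / baseline_eer * 100.0


# Lynx noise, -10 dB SNR: proposed 30.04 vs. Drugman's VAD 43.57
print(f"{relative_eer_reduction(43.57, 30.04):.2f}")  # prints 31.05

# Car noise, -5 dB SNR: proposed 3.28 vs. Rangachari's noise tracking 3.66
print(f"{relative_eer_reduction(3.66, 3.28):.2f}")  # prints 10.38
```

Both results match the corresponding table entries, which supports reading the last two columns as relative rather than absolute reductions.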