From: A robust polynomial regression-based voice activity detector for speaker verification
Noise type | SNR (dB) | Proposed algorithm EER (%) | Drugman’s VAD EER (%) | Rangachari’s noise tracking EER (%) | EER reduction vs. Drugman’s (%) | EER reduction vs. Rangachari’s (%) |
---|---|---|---|---|---|---|
Lynx | −10 | 31.32 (0.58) | 41.84 (0.78) | 44.52 (0.84) | 25.14 | 29.65 |
Lynx | −5 | 20.16 (0.38) | 29.44 (0.55) | 36.76 (0.86) | 31.52 | 45.15 |
Lynx | 0 | 11.82 (0.22) | 15.80 (0.30) | 23.93 (0.44) | 25.19 | 50.60 |
Lynx | 5 | 6.81 (0.12) | 8.34 (0.15) | 14.35 (0.26) | 18.34 | 52.54 |
Lynx | 10 | 4.13 (0.07) | 4.85 (0.08) | 9.64 (0.18) | 14.84 | 57.15 |
F16 | −10 | 38.79 (0.71) | 46.26 (0.85) | 47.71 (0.88) | 16.14 | 18.69 |
F16 | −5 | 27.99 (0.52) | 37.63 (0.70) | 42.20 (0.78) | 25.61 | 33.67 |
F16 | 0 | 17.40 (0.33) | 24.43 (0.46) | 31.54 (0.59) | 28.77 | 44.83 |
F16 | 5 | 9.93 (0.18) | 11.89 (0.22) | 19.29 (0.36) | 16.48 | 48.52 |
F16 | 10 | 5.87 (0.10) | 6.16 (0.11) | 12.54 (0.23) | 4.70 | 53.19 |
Car | −10 | 3.62 (0.06) | 3.77 (0.06) | 6.74 (0.12) | 3.97 | 47.29 |
Car | −5 | 2.82 (0.05) | 3.19 (0.05) | 6.23 (0.11) | 11.59 | 54.73 |
Car | 0 | 2.75 (0.04) | 3.12 (0.05) | 6.09 (0.11) | 11.86 | 54.84 |
Car | 5 | 2.75 (0.04) | 3.04 (0.05) | 6.02 (0.11) | 9.54 | 54.32 |
Car | 10 | 2.75 (0.04) | 3.04 (0.05) | 6.09 (0.11) | 9.54 | 54.82 |
Babble | −10 | 33.21 (0.63) | 44.81 (0.84) | 46.12 (0.84) | 25.88 | 27.99 |
Babble | −5 | 21.68 (0.40) | 34.15 (0.63) | 40.32 (0.75) | 36.51 | 46.23 |
Babble | 0 | 12.98 (0.24) | 20.08 (0.37) | 27.19 (0.51) | 35.35 | 52.26 |
Babble | 5 | 6.89 (0.12) | 9.42 (0.17) | 16.75 (0.31) | 26.85 | 58.86 |
Babble | 10 | 4.06 (0.07) | 5.07 (0.09) | 10.73 (0.19) | 19.92 | 62.16 |
Stitel | −10 | 33.21 (0.63) | 46.04 (0.85) | 45.17 (0.83) | 27.86 | 26.47 |
Stitel | −5 | 26.83 (0.50) | 34.37 (0.64) | 35.53 (0.65) | 21.93 | 24.48 |
Stitel | 0 | 15.08 (0.28) | 18.92 (0.35) | 22.33 (0.42) | 20.29 | 32.46 |
Stitel | 5 | 8.12 (0.14) | 10.37 (0.19) | 13.77 (0.26) | 21.69 | 41.03 |
Stitel | 10 | 4.20 (0.07) | 6.23 (0.11) | 8.99 (0.17) | 32.58 | 53.28 |
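
The two EER reduction columns appear to be relative reductions of the proposed algorithm's EER with respect to each baseline, i.e. 100 × (baseline − proposed) / baseline. The table does not state this formula; it is an assumption inferred from the numbers. The sketch below checks it for one row. The function name `relative_eer_reduction` is hypothetical and is not taken from the paper.

```python
def relative_eer_reduction(baseline_eer: float, proposed_eer: float) -> float:
    """Relative EER reduction in percent (assumed formula, inferred from the table)."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


# Lynx noise at -10 dB SNR; the EER values are copied from the table.
proposed, drugman, rangachari = 31.32, 41.84, 44.52

vs_drugman = relative_eer_reduction(drugman, proposed)
vs_rangachari = relative_eer_reduction(rangachari, proposed)

# The table lists 25.14 and 29.65 for this row.
print(f"vs. Drugman: {vs_drugman:.2f}  vs. Rangachari: {vs_rangachari:.2f}")
```

Under this assumption, most rows agree with the listed reductions to within about 0.01, which suggests the published figures were truncated rather than rounded. The Car, −10 dB entry against Rangachari's method is an exception: the formula gives roughly 46.29 where the table lists 47.29.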