From: Enhancement of speech dynamics for voice activity detection using DNN
AUC (%), mean ± standard deviation

Noise | SNR (dB) | Log power spectra | MFCCs | MFCCs + Δ + ΔΔ |
---|---|---|---|---|
Clean | | 98.72 ± 0.20 | 98.18 ± 0.08 | 97.79 ± 0.41 |
White | 10 | 97.51 ± 0.49 | 96.10 ± 0.59 | 96.91 ± 0.57 |
White | 5 | 97.27 ± 0.48 | 93.99 ± 0.85 | 95.11 ± 0.97 |
White | 0 | 96.14 ± 0.76 | 89.58 ± 1.46 | 90.85 ± 1.73 |
White | −5 | 93.88 ± 1.10 | 81.43 ± 1.11 | 82.42 ± 1.53 |
Babble | 10 | 96.50 ± 0.55 | 92.71 ± 0.92 | 93.51 ± 0.77 |
Babble | 5 | 94.26 ± 0.66 | 87.24 ± 1.03 | 87.73 ± 0.89 |
Babble | 0 | 88.88 ± 0.74 | 77.78 ± 0.99 | 77.86 ± 0.82 |
Babble | −5 | 78.10 ± 1.10 | 65.72 ± 1.40 | 65.59 ± 1.40 |
Factory | 10 | 96.80 ± 0.60 | 95.16 ± 0.79 | 96.04 ± 0.74 |
Factory | 5 | 95.14 ± 0.72 | 91.60 ± 1.23 | 92.55 ± 1.06 |
Factory | 0 | 91.17 ± 0.45 | 84.19 ± 1.23 | 84.81 ± 1.13 |
Factory | −5 | 80.49 ± 1.54 | 72.40 ± 1.14 | 72.70 ± 0.72 |
Car | 10 | 98.83 ± 0.15 | 98.34 ± 0.23 | 98.26 ± 0.32 |
Car | 5 | 98.75 ± 0.16 | 98.22 ± 0.34 | 98.23 ± 0.35 |
Car | 0 | 98.56 ± 0.16 | 97.91 ± 0.44 | 98.08 ± 0.40 |
Car | −5 | 98.06 ± 0.02 | 97.27 ± 0.54 | 97.70 ± 0.46 |
Pink | 10 | 97.20 ± 0.66 | 95.91 ± 0.81 | 96.64 ± 0.62 |
Pink | 5 | 96.28 ± 0.79 | 93.31 ± 1.00 | 94.28 ± 0.99 |
Pink | 0 | 94.06 ± 0.95 | 87.96 ± 1.54 | 88.91 ± 1.40 |
Pink | −5 | 88.01 ± 1.54 | 78.30 ± 1.81 | 79.02 ± 1.22 |