From: Enhancement of speech dynamics for voice activity detection using DNN
AUC (%), mean ± standard deviation

Noise | SNR (dB) | Proposed | Log power spectra | Speech period candidates |
---|---|---|---|---|
Clean | | 99.06 ± 0.13 | 98.72 ± 0.20 | 98.10 ± 0.39 |
White | 10 | 97.91 ± 0.28 | 97.51 ± 0.49 | 97.06 ± 0.54 |
White | 5 | 97.44 ± 0.43 | 97.27 ± 0.48 | 96.64 ± 0.46 |
White | 0 | 96.59 ± 0.50 | 96.14 ± 0.76 | 95.44 ± 0.57 |
White | −5 | 94.69 ± 0.66 | 93.88 ± 1.10 | 93.40 ± 0.60 |
Babble | 10 | 96.84 ± 0.60 | 96.50 ± 0.55 | 96.19 ± 0.68 |
Babble | 5 | 95.19 ± 0.71 | 94.26 ± 0.66 | 94.59 ± 0.92 |
Babble | 0 | 91.30 ± 0.74 | 88.88 ± 0.74 | 90.42 ± 0.53 |
Babble | −5 | 83.20 ± 0.87 | 78.10 ± 1.10 | 81.85 ± 0.85 |
Factory | 10 | 97.25 ± 0.39 | 96.80 ± 0.60 | 96.60 ± 0.56 |
Factory | 5 | 95.96 ± 0.43 | 95.14 ± 0.72 | 95.48 ± 0.77 |
Factory | 0 | 93.18 ± 0.46 | 91.17 ± 0.45 | 92.53 ± 0.67 |
Factory | −5 | 85.91 ± 0.29 | 80.49 ± 1.54 | 84.57 ± 0.83 |
Car | 10 | 99.02 ± 0.11 | 98.83 ± 0.15 | 97.60 ± 0.45 |
Car | 5 | 98.94 ± 0.11 | 98.75 ± 0.16 | 97.37 ± 0.45 |
Car | 0 | 98.79 ± 0.09 | 98.56 ± 0.16 | 97.02 ± 0.41 |
Car | −5 | 98.40 ± 0.05 | 98.06 ± 0.02 | 96.36 ± 0.32 |
Pink | 10 | 97.79 ± 0.39 | 97.20 ± 0.66 | 96.86 ± 0.73 |
Pink | 5 | 96.82 ± 0.59 | 96.28 ± 0.79 | 95.98 ± 0.74 |
Pink | 0 | 95.26 ± 0.70 | 94.06 ± 0.95 | 94.26 ± 0.89 |
Pink | −5 | 91.56 ± 1.03 | 88.01 ± 1.54 | 89.91 ± 1.20 |