Table 1 AUC (%) comparison between the proposed method and DNN-based VAD methods using speech period candidates and log power spectra as the baseline

From: Enhancement of speech dynamics for voice activity detection using DNN

AUC (%), mean ± standard deviation

| Noise | SNR (dB) | Proposed | Log power spectra | Speech period candidates |
|---|---|---|---|---|
| Clean | | 99.06 ± 0.13 | 98.72 ± 0.20 | 98.10 ± 0.39 |
| White | 10 | 97.91 ± 0.28 | 97.51 ± 0.49 | 97.06 ± 0.54 |
| White | 5 | 97.44 ± 0.43 | 97.27 ± 0.48 | 96.64 ± 0.46 |
| White | 0 | 96.59 ± 0.50 | 96.14 ± 0.76 | 95.44 ± 0.57 |
| White | −5 | 94.69 ± 0.66 | 93.88 ± 1.10 | 93.40 ± 0.60 |
| Babble | 10 | 96.84 ± 0.60 | 96.50 ± 0.55 | 96.19 ± 0.68 |
| Babble | 5 | 95.19 ± 0.71 | 94.26 ± 0.66 | 94.59 ± 0.92 |
| Babble | 0 | 91.30 ± 0.74 | 88.88 ± 0.74 | 90.42 ± 0.53 |
| Babble | −5 | 83.20 ± 0.87 | 78.10 ± 1.10 | 81.85 ± 0.85 |
| Factory | 10 | 97.25 ± 0.39 | 96.80 ± 0.60 | 96.60 ± 0.56 |
| Factory | 5 | 95.96 ± 0.43 | 95.14 ± 0.72 | 95.48 ± 0.77 |
| Factory | 0 | 93.18 ± 0.46 | 91.17 ± 0.45 | 92.53 ± 0.67 |
| Factory | −5 | 85.91 ± 0.29 | 80.49 ± 1.54 | 84.57 ± 0.83 |
| Car | 10 | 99.02 ± 0.11 | 98.83 ± 0.15 | 97.60 ± 0.45 |
| Car | 5 | 98.94 ± 0.11 | 98.75 ± 0.16 | 97.37 ± 0.45 |
| Car | 0 | 98.79 ± 0.09 | 98.56 ± 0.16 | 97.02 ± 0.41 |
| Car | −5 | 98.40 ± 0.05 | 98.06 ± 0.02 | 96.36 ± 0.32 |
| Pink | 10 | 97.79 ± 0.39 | 97.20 ± 0.66 | 96.86 ± 0.73 |
| Pink | 5 | 96.82 ± 0.59 | 96.28 ± 0.79 | 95.98 ± 0.74 |
| Pink | 0 | 95.26 ± 0.70 | 94.06 ± 0.95 | 94.26 ± 0.89 |
| Pink | −5 | 91.56 ± 1.03 | 88.01 ± 1.54 | 89.91 ± 1.20 |
Note: In the original table, italics mark the best result in each row. The proposed method has the highest mean AUC in every condition listed.
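
For readers unfamiliar with the metric, the following is a minimal sketch of how an "AUC (%), mean ± standard deviation" entry of this kind could be computed from frame-level VAD scores. It is not the authors' evaluation code. The function names `auc_percent` and `summarize`, the toy labels and scores, the idea that the spread is taken over repeated runs, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def auc_percent(labels, scores):
    """Area under the ROC curve in percent, via the rank (Mann-Whitney) identity.

    labels: 1 for speech frames, 0 for non-speech frames.
    scores: per-frame speech scores (higher means more speech-like).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one speech and one non-speech frame")
    # Count (speech, non-speech) pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return 100.0 * wins / (len(pos) * len(neg))


def summarize(aucs):
    """Return (mean, sample standard deviation) over several runs (assumed)."""
    return mean(aucs), stdev(aucs)


# Toy data, invented for illustration only (not taken from the paper).
labels = [0, 0, 1, 1, 0, 1]
runs = [
    [0.1, 0.4, 0.35, 0.8, 0.2, 0.9],
    [0.2, 0.3, 0.6, 0.7, 0.1, 0.9],
]
aucs = [auc_percent(labels, s) for s in runs]
m, sd = summarize(aucs)
print(f"AUC (%): {m:.2f} ± {sd:.2f}")
```

The pairwise count is quadratic in the number of frames, which is fine for a sketch; for real utterances a sort-based or library implementation would be the more practical choice.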