Table 7 Room-independent SAD results on the DIRHA-sim test set, employing all available microphones (\(|\mathcal{M}_{\text{all}}| = 40\)) or the reduced setups of Fig. 12

From: Room-localized speech activity detection in multi-microphone smart homes

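| \(|\mathcal{M}_{\text{all}}|\) | Microphone-specific GMMs: Recall | Microphone-specific GMMs: Precision | Microphone-specific GMMs: F-score | Single-microphone GMM: Recall | Single-microphone GMM: Precision | Single-microphone GMM: F-score |
|---|---|---|---|---|---|---|
| 40 | 91.78 | 91.82 | 91.80 | 91.21 | 87.25 | 89.19 |
| 25 | 91.45 | 91.41 | 91.43 | 90.66 | 87.58 | 89.09 |
| 16 | 91.50 | 90.89 | 91.19 | 87.51 | 90.69 | 89.07 |
| 10 | 90.84 | 89.39 | 90.11 | 90.33 | 86.42 | 88.33 |
| 5 | 88.22 | 91.02 | 89.60 | 89.21 | 86.61 | 87.89 |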
Note: In all cases, HMM-based Viterbi decoding and “w-sum” decision fusion are used, where the combined log-likelihoods result from microphone-specific GMMs (left) or a GMM trained on a single microphone (right).
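
The F-score columns of Table 7 appear to be the harmonic mean of the recall and precision in the same row, \(F = 2PR/(P+R)\). The page does not state this definition, so the balanced F-score is an assumption here. A minimal Python sketch that recomputes the F-scores from the tabulated recall and precision and reports how far each is from the published value (the names `f_score` and `max_deviation` are illustrative, not from the article):

```python
# Rows of Table 7: (number of microphones, recall, precision, reported F-score).
MIC_SPECIFIC = [
    (40, 91.78, 91.82, 91.80),
    (25, 91.45, 91.41, 91.43),
    (16, 91.50, 90.89, 91.19),
    (10, 90.84, 89.39, 90.11),
    (5, 88.22, 91.02, 89.60),
]
SINGLE_MIC = [
    (40, 91.21, 87.25, 89.19),
    (25, 90.66, 87.58, 89.09),
    (16, 87.51, 90.69, 89.07),
    (10, 90.33, 86.42, 88.33),
    (5, 89.21, 86.61, 87.89),
]


def f_score(recall, precision):
    """Balanced F-score: harmonic mean of recall and precision (assumed definition)."""
    return 2.0 * recall * precision / (recall + precision)


def max_deviation(rows):
    """Largest absolute gap between the recomputed and the reported F-score."""
    return max(abs(f_score(r, p) - f) for _, r, p, f in rows)


if __name__ == "__main__":
    for name, rows in (("microphone-specific GMMs", MIC_SPECIFIC),
                       ("single-microphone GMM", SINGLE_MIC)):
        print(name)
        for n_mics, r, p, f in rows:
            print(f"  {n_mics:>2} mics: recomputed {f_score(r, p):.2f}, reported {f:.2f}")
        print(f"  max deviation: {max_deviation(rows):.4f}")
```

Under this assumption, every recomputed value should match the published F-score up to rounding of the two-decimal inputs, which is a quick way to catch transcription errors when copying the table.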