Table 3 Effect of the design choices in the system’s first stage (discussed in Section 4.4) on room-localized SAD performance on the DIRHA-sim test set

From: Room-localized speech activity detection in multi-microphone smart homes

| Oper | \(\mathcal{M}\) | Classes \(\mathcal{J}\) | Recall | Precision | F-score |
|---|---|---|---|---|---|
| RI | \(\mathcal{M}_{\mathrm{all}}\) | \(\{\mathrm{sp}_{\mathrm{all}}, \mathrm{sil}_{\mathrm{all}}\}\) | 72.30 | 56.63 | 63.51 |
| RL | \(\mathcal{M}_{r}\) | \(\{\mathrm{sp}_{r}, \mathrm{sil}_{\mathrm{all}}\}\) | 72.07 | 61.08 | 66.12 |
|  |  | \(\{\mathrm{sp}_{r}, \mathrm{sil}_{r}\}\) | 71.20 | 60.39 | 65.35 |
|  |  | \(\{\mathrm{sp}_{r}, \mathrm{sp}_{\bar{r}}, \mathrm{sil}_{\mathrm{all}}\}\) | 71.00 | 62.40 | 66.43 |
  1. For consistency, the first stage is always followed by the second stage of the MFCC/GMM baseline of Section 6.1. RI denotes room-independent operation (“oper”) of the first stage, and RL denotes room-localized operation.
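
As a quick consistency check on the table, the sketch below recomputes each F-score from the listed recall and precision. It assumes the F-score is the balanced harmonic mean of the two, which the table itself does not state. The function name `f_score` and the row labels are illustrative only.

```python
def f_score(recall, precision):
    """Balanced F-score: harmonic mean of recall and precision (assumed definition)."""
    return 2.0 * recall * precision / (recall + precision)


# (recall, precision, reported F-score), copied from the table rows above.
# The dictionary keys are informal labels, not notation from the paper.
rows = {
    "RI {sp_all, sil_all}": (72.30, 56.63, 63.51),
    "RL {sp_r, sil_all}": (72.07, 61.08, 66.12),
    "RL {sp_r, sil_r}": (71.20, 60.39, 65.35),
    "RL {sp_r, sp_rbar, sil_all}": (71.00, 62.40, 66.43),
}

for label, (recall, precision, reported) in rows.items():
    computed = f_score(recall, precision)
    print(f"{label}: computed {computed:.2f}, reported {reported:.2f}")
```

Under that assumption, the recomputed values agree with the reported ones to within about 0.01, which is consistent with the recall and precision entries having been rounded to two decimals.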