
Table 7. Room-independent SAD results on the DIRHA-sim test set, employing all available microphones (\(|\mathcal{M}_{\text{all}}| = 40\)) or the reduced setups of Fig. 12

From: Room-localized speech activity detection in multi-microphone smart homes

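| \(\lvert\mathcal{M}_{\text{all}}\rvert\) | Microphone-specific GMMs: Recall | Microphone-specific GMMs: Precision | Microphone-specific GMMs: F-score | Single-microphone GMM: Recall | Single-microphone GMM: Precision | Single-microphone GMM: F-score |
|---|---|---|---|---|---|---|
| 40 | 91.78 | 91.82 | 91.80 | 91.21 | 87.25 | 89.19 |
| 25 | 91.45 | 91.41 | 91.43 | 90.66 | 87.58 | 89.09 |
| 16 | 91.50 | 90.89 | 91.19 | 87.51 | 90.69 | 89.07 |
| 10 | 90.84 | 89.39 | 90.11 | 90.33 | 86.42 | 88.33 |
| 5 | 88.22 | 91.02 | 89.60 | 89.21 | 86.61 | 87.89 |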

Note: In all cases, HMM-based Viterbi decoding and “w-sum” decision fusion are used, where the combined log-likelihoods result from microphone-specific GMMs (left) or a GMM trained on a single microphone (right).
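
As a consistency check on the table, the F-score columns can be recomputed from the recall and precision columns. The sketch below assumes that the F-score is the balanced F1 measure (the harmonic mean of precision and recall), which this page does not state explicitly. The names `f_score`, `ROWS`, and `max_deviation` are illustrative and not taken from the article.

```python
# Assumption: "F-score" is the balanced F1 measure, i.e. the harmonic mean
# of precision and recall. The table itself does not spell this out.


def f_score(recall: float, precision: float) -> float:
    """Return the harmonic mean of recall and precision (same scale as inputs)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# (number of microphones, model, recall, precision, reported F-score),
# copied from Table 7.
ROWS = [
    (40, "microphone-specific", 91.78, 91.82, 91.80),
    (40, "single-microphone", 91.21, 87.25, 89.19),
    (25, "microphone-specific", 91.45, 91.41, 91.43),
    (25, "single-microphone", 90.66, 87.58, 89.09),
    (16, "microphone-specific", 91.50, 90.89, 91.19),
    (16, "single-microphone", 87.51, 90.69, 89.07),
    (10, "microphone-specific", 90.84, 89.39, 90.11),
    (10, "single-microphone", 90.33, 86.42, 88.33),
    (5, "microphone-specific", 88.22, 91.02, 89.60),
    (5, "single-microphone", 89.21, 86.61, 87.89),
]


def max_deviation(rows=ROWS) -> float:
    """Largest absolute gap between recomputed and reported F-scores."""
    return max(abs(f_score(r, p) - f) for _, _, r, p, f in rows)


if __name__ == "__main__":
    for n_mics, model, r, p, f in ROWS:
        print(f"{n_mics:>2} {model:<20} reported={f:.2f} recomputed={f_score(r, p):.2f}")
```

Under this assumption, every recomputed value agrees with the reported F-score to within rounding of the second decimal place.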