Table 5 Comparison of the two baselines of Section 6 (upper part) and the room discriminant feature-based approach (lower part) for the room-inside vs. room-outside speech classification task

From: Room-localized speech activity detection in multi-microphone smart homes

| Features | DIRHA-sim | | | | | | | DIRHA-real | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Single room | | | | | Multi-room | | Single room | | | | Multi-room | |
| | Liv. | Kitch. | Bath. | Bed. | Corr. | R=4 | R=5 | Liv. | Kitch. | Bath. | Bed. | R=4 | R=5 |
| MFCCs | 70.96 | 72.52 | 61.25 | 76.94 | 39.03 | 72.32 | 70.49 | 46.93 | 71.50 | 80.98 | 58.11 | 68.91 | 68.91 |
| SNR | 55.59 | 57.57 | 17.38 | 50.75 | 8.80 | 48.59 | 41.22 | 41.47 | 72.99 | 53.42 | 31.63 | 53.74 | 45.54 |
| \(f_{r,\mathcal{T}}^{\mathrm{(all)}}\) | 83.74 | 92.83 | 83.33 | 86.76 | 34.28 | 87.39 | 80.63 | 97.17 | 100.00 | 99.89 | 100.00 | 99.42 | 93.34 |
| \(f_{r,\text{avg},\mathcal{T}}^{\mathrm{(all)}}\) | 84.80 | 92.83 | 84.48 | 85.05 | 38.11 | 87.42 | 81.46 | 99.50 | 97.51 | 99.89 | 99.16 | 98.66 | 89.94 |
| \(f_{\text{home},\mathcal{T}}^{\mathrm{(all)}}\) | 86.29 | 89.96 | 91.67 | 88.25 | 39.66 | 88.30 | 84.26 | 97.88 | 79.23 | 95.00 | 93.63 | 87.26 | 79.19 |

  1. F-scores are reported for each room, as well as over R=4 rooms (excluding the corridor) and all R=5 rooms of the DIRHA smart home, on both the DIRHA-sim (left) and DIRHA-real (right) test sets, using ground-truth speech segment boundaries. Room-specific SVMs are employed, operating over entire segments.