From: Room-localized speech activity detection in multi-microphone smart homes
Method | DIRHA-sim Recall (GMM) | DIRHA-sim Recall (HMM) | DIRHA-sim Precision (GMM) | DIRHA-sim Precision (HMM) | DIRHA-sim F-score (GMM) | DIRHA-sim F-score (HMM) | DIRHA-real Recall (GMM) | DIRHA-real Recall (HMM) | DIRHA-real Precision (GMM) | DIRHA-real Precision (HMM) | DIRHA-real F-score (GMM) | DIRHA-real F-score (HMM)
---|---|---|---|---|---|---|---|---|---|---|---|---
Oracle-best | 96.94 | 94.67 | 94.01 | 96.82 | 95.45 | 95.73 | 93.01 | 95.49 | 95.91 | 96.46 | 94.44 | 95.97
Channel avg. | 87.86 | 82.26 | 76.64 | 83.13 | 81.82 | 82.69 | 65.56 | 71.57 | 89.47 | 87.42 | 75.37 | 78.34
Best act.-SNR | 94.56 | 92.36 | 83.85 | 87.95 | 88.88 | 90.10 | 88.77 | 90.33 | 88.95 | 86.87 | 88.86 | 88.57
Best est.-SNR | 96.60 | 93.63 | 66.56 | 73.54 | 78.81 | 82.38 | 92.43 | 93.41 | 74.38 | 74.02 | 82.43 | 82.59
Sohn’s | 81.22 | | 58.91 | | 68.29 | | 78.05 | | 61.51 | | 68.80 |
Decision fusion “u-sum” | 94.39 | 91.08 | 83.60 | 90.97 | 88.67 | 91.01 | 74.76 | 89.11 | 96.54 | 91.70 | 84.26 | 90.39
Decision fusion “w-sum” | 95.00 | 91.78 | 83.57 | 91.82 | 88.92 | 91.80 | 76.87 | 87.37 | 96.58 | 93.37 | 85.67 | 90.27
Decision fusion “u-max” | 74.17 | 82.51 | 75.28 | 73.69 | 74.72 | 77.85 | 45.66 | 68.40 | 97.21 | 95.01 | 62.14 | 79.54
Decision fusion “w-max” | 95.44 | 95.53 | 82.34 | 87.16 | 88.41 | 91.15 | 79.76 | 89.66 | 95.77 | 88.70 | 87.03 | 89.18
Decision fusion “u-vote” | 92.55 | 88.92 | 84.18 | 92.24 | 88.16 | 90.55 | 69.12 | 83.39 | 96.61 | 95.02 | 80.58 | 88.82
Decision fusion “w-vote” | 91.37 | 91.83 | 87.39 | 90.40 | 89.34 | 91.11 | 74.76 | 85.03 | 96.54 | 94.82 | 84.26 | 89.66

Sohn’s baseline has a single value per metric rather than separate GMM and HMM results; each value is listed in the first column of the corresponding pair.
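The F-score columns in the table above can be cross-checked against the recall and precision columns. The sketch below assumes the reported F-score is the balanced F-measure, i.e. the harmonic mean of recall and precision; the source only labels the column "F-score", so this is an assumption, and the helper name `f_score` is illustrative. The three triples are copied from the table.

```python
def f_score(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision.

    Assumption: the table's F-score is the balanced (F1) variant.
    Inputs and output share the same scale as the table values.
    """
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# (recall, precision, reported F-score) triples copied from the table.
rows = {
    "Oracle-best, DIRHA-sim, GMM": (96.94, 94.01, 95.45),
    "Sohn's, DIRHA-sim": (81.22, 58.91, 68.29),
    "Sohn's, DIRHA-real": (78.05, 61.51, 68.80),
}

for name, (recall, precision, reported) in rows.items():
    computed = f_score(recall, precision)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

For these rows the computed values agree with the reported ones to within rounding, which also supports reading the six Sohn’s values as recall, precision, and F-score for each of the two datasets.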