From: Exploiting spectro-temporal locality in deep learning based acoustic event detection
Columns give the frame length (resolution); rows give the AE.

AE | 10 ms | 20 ms | 30 ms | 40 ms | 50 ms | 60 ms |
---|---|---|---|---|---|---|
ap | 76.39 % | 65.39 % | 66.32 % | 82.65 % | 69.90 % | 72.85 % |
cl | 71.84 % | 84.04 % | 68.17 % | 62.26 % | 62.40 % | 70.64 % |
cm | 31.59 % | 35.71 % | 33.73 % | 30.98 % | 44.00 % | 23.62 % |
co | 36.97 % | 27.82 % | 27.49 % | 17.09 % | 21.58 % | 29.97 % |
ds | 29.70 % | 16.92 % | 17.76 % | 11.62 % | 38.74 % | 21.66 % |
kj | 12.90 % | 11.46 % | 14.64 % | 12.70 % | 17.11 % | 13.66 % |
kn | 49.66 % | 27.08 % | 37.03 % | 66.89 % | 44.57 % | 23.55 % |
kt | 38.37 % | 27.97 % | 26.98 % | 27.29 % | 32.61 % | 28.59 % |
la | 13.67 % | 12.14 % | 12.48 % | 10.78 % | 10.90 % | 11.48 % |
pr | 53.58 % | 55.98 % | 51.35 % | 60.25 % | 55.43 % | 52.82 % |
pw | 83.34 % | 82.28 % | 87.28 % | 92.02 % | 92.69 % | 88.15 % |
st | 54.85 % | 47.15 % | 51.83 % | 46.43 % | 63.27 % | 48.38 % |
all | 69.20 % | 69.80 % | 67.34 % | 68.09 % | 68.33 % | 67.34 % |