From: Deep semantic learning for acoustic scene classification
| Audio-SegNet | SegNet-L | SegNet-M | SegNet-S | Mini-SegNet |
|---|---|---|---|---|
| Encoder | 64 × 2 | 64 × 2 | 64 × 2 | 64 × 1 |
| | 128 × 2 | 128 × 2 | 128 × 2 | 128 × 2 |
| | 256 × 3 | 196 × 2 | | |
| | 512 × 3 | | | |
| | 512 × 3 | | | |
| Decoder | 512 × 3 | 196 × 2 | 128 × 2 | 128 × 2 |
| | 512 × 3 | 128 × 2 | 64 × 2 | 64 × 1 |
| | 256 × 3 | 64 × 2 | | |
| | 128 × 2 | | | |
| | 64 × 2 | | | |
| Train params | 31,880,650 | 2,051,050 | 707,338 | 670,282 |
| Time (s)/epoch | 328 | 215 | 206 | 195 |
| All-accuracy | 93.86/59.06 | 90.84/63.44 | 85.32/65.35 | 83.45/66.46 |
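
The four configurations in the table can be written down as plain data, which makes the symmetry between encoder and decoder easy to check. The sketch below is illustrative only: it assumes that an entry "C × n" denotes a stage of n convolutional layers with C output channels, which the table itself does not state, and the names `ENCODERS`, `mirrored_decoder`, and `conv_layer_count` are hypothetical helpers, not part of the original work.

```python
# Encoder stages per variant, transcribed from the table.
# Assumption: "C × n" means n convolutional layers with C output channels.
ENCODERS = {
    "SegNet-L": [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)],
    "SegNet-M": [(64, 2), (128, 2), (196, 2)],
    "SegNet-S": [(64, 2), (128, 2)],
    "Mini-SegNet": [(64, 1), (128, 2)],
}

# Decoder stages per variant, transcribed from the table.
DECODERS = {
    "SegNet-L": [(512, 3), (512, 3), (256, 3), (128, 2), (64, 2)],
    "SegNet-M": [(196, 2), (128, 2), (64, 2)],
    "SegNet-S": [(128, 2), (64, 2)],
    "Mini-SegNet": [(128, 2), (64, 1)],
}


def mirrored_decoder(encoder):
    """Return the encoder stages in reverse order."""
    return list(reversed(encoder))


def conv_layer_count(stages):
    """Sum the per-stage layer counts (the n in "C × n")."""
    return sum(n for _, n in stages)


if __name__ == "__main__":
    for name, encoder in ENCODERS.items():
        decoder = DECODERS[name]
        is_mirror = decoder == mirrored_decoder(encoder)
        total = conv_layer_count(encoder) + conv_layer_count(decoder)
        print(f"{name}: decoder mirrors encoder = {is_mirror}, "
              f"conv layers (encoder + decoder) = {total}")
```

Under that reading, every decoder listed in the table is the exact reverse of its encoder, and the variants differ mainly in depth and in the width of the deepest stage.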