From: Multi-encoder attention-based architectures for sound recognition with partial visual assistance
Data partition | Number of recordings |
---|---|
Unsupervised training | 14412 |
Weakly supervised training | 1578 |
Strongly supervised training | 2584 |
Validation | 1168 |
Evaluation (public) | 692 |
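
To put the partition sizes in proportion, the counts can be totalled and expressed as shares of the whole. The following is a minimal sketch, not part of the original work: the names `partitions`, `total`, and `training_total` are illustrative, and it assumes the partitions are disjoint, which the table itself does not state.

```python
# Recording counts copied from the table above.
partitions = {
    "Unsupervised training": 14412,
    "Weakly supervised training": 1578,
    "Strongly supervised training": 2584,
    "Validation": 1168,
    "Evaluation (public)": 692,
}

# Assumption: partitions do not overlap, so a plain sum gives the total.
total = sum(partitions.values())

# Sum of the three training partitions only.
training_total = sum(
    count for name, count in partitions.items() if name.endswith("training")
)

# Print each partition with its share of all recordings.
for name, count in partitions.items():
    print(f"{name}: {count} ({count / total:.1%})")

print(f"All training partitions: {training_total}")
print(f"All partitions: {total}")
```

Under that assumption, the unlabeled recordings make up the bulk of the training material, with the weakly and strongly labeled partitions together forming a much smaller share.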