From: Automated audio captioning: an overview of recent progress and new challenges
Dataset | Method | Year | BLEU\(_{1}\) | BLEU\(_{2}\) | METEOR | CIDEr | SPICE | SPIDEr |
---|---|---|---|---|---|---|---|---|
AudioCaps | Kim et al. [20] | 2019 | 0.614 | 0.446 | 0.203 | 0.593 | 0.144 | 0.369 |
 | Koizumi et al. [68] | 2020 | 0.638 | 0.458 | 0.199 | 0.603 | 0.139 | 0.371 |
 | Eren et al. [39] | 2020 | 0.710 | 0.490 | 0.290 | 0.750 | - | - |
 | Xu et al. [44] | 2021 | 0.655 | 0.476 | 0.229 | 0.660 | 0.168 | 0.414 |
 | Mei et al. [47] | 2021 | 0.647 | 0.488 | 0.222 | 0.679 | 0.160 | 0.420 |
 | Gontier et al. [69] | 2021 | 0.699 | 0.523 | 0.241 | 0.753 | 0.176 | 0.465 |
 | Liu et al. [70] | 2022 | 0.671 | 0.498 | 0.232 | 0.667 | 0.172 | 0.420 |
Clotho v1 | Drossos et al. [64] | 2019 | 0.420 | 0.140 | 0.090 | 0.100 | - | - |
 | Cakir et al. [57] | 2020 | 0.409 | 0.156 | 0.088 | 0.107 | 0.040 | 0.074 |
 | Nguyen et al. [33] | 2020 | 0.417 | 0.154 | 0.089 | 0.093 | 0.040 | 0.067 |
 | Perez-Castanos et al. [38] | 2020 | 0.469 | 0.265 | 0.136 | 0.214 | 0.086 | 0.150 |
 | Tran et al. [40] | 2020 | 0.489 | 0.303 | 0.143 | 0.268 | 0.095 | 0.182 |
 | Takeuchi et al. [42] | 2020 | 0.512 | 0.325 | 0.145 | 0.290 | 0.089 | 0.190 |
 | Koizumi et al. [18] | 2020 | 0.521 | 0.309 | 0.149 | 0.258 | 0.097 | 0.178 |
 | Chen et al. [34] | 2020 | 0.534 | 0.343 | 0.160 | 0.346 | 0.108 | 0.227 |
 | Xu et al. [43] | 2020 | 0.561 | 0.341 | 0.162 | 0.338 | 0.108 | 0.223 |
 | Eren et al. [39] | 2020 | 0.590 | 0.350 | 0.220 | 0.280 | - | - |
 | Xu et al. [44] | 2021 | 0.556 | 0.363 | 0.169 | 0.377 | 0.115 | 0.246 |
 | Koh et al. [66] | 2022 | 0.551 | 0.369 | 0.165 | 0.380 | 0.111 | 0.246 |
Clotho v2 | Narisetty et al. [48] | 2021 | 0.536 | 0.341 | 0.160 | 0.346 | 0.108 | 0.227 |
 | Won et al. [77] | 2021 | 0.564 | 0.376 | 0.177 | 0.441 | 0.128 | 0.285 |
 | Ye et al. [36] | 2021 | 0.577 | - | 0.174 | 0.419 | 0.119 | 0.269 |
 | Han et al. [37] | 2021 | 0.585 | 0.392 | 0.177 | 0.474 | 0.130 | 0.302 |
Clotho v2 + val set | Narisetty et al. [48] | 2021 | 0.541 | 0.346 | 0.161 | 0.362 | 0.110 | 0.236 |
 | Liu et al. [23] | 2021 | 0.553 | 0.349 | 0.168 | 0.368 | 0.115 | 0.242 |
 | Mei et al. [35] | 2021 | 0.561 | 0.374 | 0.171 | 0.426 | 0.124 | 0.275 |
 | Chen et al. [73] | 2022 | 0.572 | 0.379 | 0.171 | 0.407 | 0.119 | 0.263 |
 | Xiao et al. [59] | 2022 | 0.578 | 0.387 | 0.177 | 0.434 | 0.122 | 0.278 |
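
The table does not spell out how its last column relates to the two before it. Under the standard definition, SPIDEr is the arithmetic mean of CIDEr and SPICE, which is consistent with the reported values up to rounding. The sketch below checks a few rows copied from the table; the function and variable names are illustrative assumptions, not part of any evaluation toolkit.

```python
def spider(cider: float, spice: float) -> float:
    """SPIDEr taken as the arithmetic mean of CIDEr and SPICE (standard definition)."""
    return (cider + spice) / 2


# (CIDEr, SPICE, reported SPIDEr) copied from four rows of the table above.
rows = {
    "Kim et al. [20], AudioCaps": (0.593, 0.144, 0.369),
    "Gontier et al. [69], AudioCaps": (0.753, 0.176, 0.465),
    "Han et al. [37], Clotho v2": (0.474, 0.130, 0.302),
    "Xiao et al. [59], Clotho v2 + val set": (0.434, 0.122, 0.278),
}


def max_abs_gap(table: dict) -> float:
    """Largest absolute difference between recomputed and reported SPIDEr."""
    return max(abs(spider(c, s) - reported) for c, s, reported in table.values())


if __name__ == "__main__":
    for name, (cider, spice, reported) in rows.items():
        print(f"{name}: recomputed {spider(cider, spice):.4f}, reported {reported:.3f}")
```

Because the published scores are rounded to three decimals, a recomputed value can differ from the reported one by up to about 0.001. Rows that report "-" for SPICE cannot be checked this way.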