From: YuYin: a multi-task learning model of multi-modal e-commerce background music recommendation
Dataset | Scale | Modal: V | Modal: M | Modal: T | Content |
---|---|---|---|---|---|
CFM400 [29] | 401 | \(\checkmark\) | \(\checkmark\) | | Game videos (CrossFire) |
HoK400 [29] | 427 | \(\checkmark\) | \(\checkmark\) | | Game videos (Honor of Kings) |
UGV [30] | 1,265 | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | User-generated videos |
YouCook2 [31] | 2,000 | \(\checkmark\) | | \(\checkmark\) | Cooking videos on YouTube |
EmoMV [32] | 5,986 | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | Music videos with emotion labels |
MSR-VTT [33] | 10,000 | \(\checkmark\) | | \(\checkmark\) | Online videos with captions |
TT-150K [34] | 150,000 | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | Microvideos on TikTok |
HIMV-200K [21] | 205,000 | \(\checkmark\) | \(\checkmark\) | | Music videos on YouTube |
YouTube-8M [35] | 8,000,000 | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | Videos on YouTube |
Commercial-98K | 98,071 | \(\checkmark\) | \(\checkmark\) | \(\checkmark\) | E-commerce ads |