From: Dynamically localizing multiple speakers based on the time-frequency domain
| Method | Parameters (millions) | Average inference time for a 1-s signal (seconds) |
|---|---|---|
| CMS-DOA | 8.7 | 0.09 |
| TF-DOAnet | 2.1 | 0.07 |