Dynamically localizing multiple speakers based on the time-frequency domain

EURASIP Journal on Audio, Speech, and Music Processing

Table 6 Computational cost comparison

Method	Number of parameters [millions]	Average inference time for a 1-s signal [s]
CMS-DOA	8.7	0.09
TF-DOAnet	2.1	0.07
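
The figures in Table 6 can be turned into relative quantities with a few lines of arithmetic. The following is a minimal sketch, not part of the original evaluation: the dictionary layout and the names `table6`, `real_time_factor`, `param_ratio`, and `time_ratio` are illustrative assumptions, and only the four numeric values are taken from the table. The real-time factor is defined here as inference time divided by signal duration, so values below 1 indicate faster-than-real-time processing.

```python
# Values copied from Table 6; all names and the data layout are illustrative.
SIGNAL_DURATION_S = 1.0  # the table reports inference time for a 1-s signal

table6 = {
    "CMS-DOA": {"params_millions": 8.7, "inference_s": 0.09},
    "TF-DOAnet": {"params_millions": 2.1, "inference_s": 0.07},
}


def real_time_factor(inference_s, signal_s=SIGNAL_DURATION_S):
    """Inference time divided by signal duration (below 1 = faster than real time)."""
    return inference_s / signal_s


# Real-time factor for each method listed in the table.
rtf = {name: real_time_factor(row["inference_s"]) for name, row in table6.items()}

# Ratios of CMS-DOA to TF-DOAnet for model size and inference time.
param_ratio = (
    table6["CMS-DOA"]["params_millions"] / table6["TF-DOAnet"]["params_millions"]
)
time_ratio = table6["CMS-DOA"]["inference_s"] / table6["TF-DOAnet"]["inference_s"]

for name, value in rtf.items():
    print(f"{name}: real-time factor {value:.2f}")
print(f"Parameter ratio (CMS-DOA / TF-DOAnet): {param_ratio:.2f}")
print(f"Inference-time ratio (CMS-DOA / TF-DOAnet): {time_ratio:.2f}")
```

Under these assumptions, both methods process a 1-s signal in well under one second, and the gap in parameter count is considerably larger than the gap in inference time.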