From: Dynamically localizing multiple speakers based on the time-frequency domain
∙ Compute the iRTF features from the multi-microphone recordings.
∙ Apply the U-net to classify each TF bin into one of the possible DOAs.
∙ Based on the U-net results, decide the locations of the active speakers at each time frame.
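
The first and last steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the iRTF is taken here to be the per-bin ratio between each microphone's STFT and that of a reference microphone, with real and imaginary parts used as features, and the frame-level decision is a simple fraction-of-bins threshold. The function names `irtf_features` and `active_doas_per_frame` and the parameters `ref`, `eps`, `min_fraction`, and `silence_label` are hypothetical. The U-net itself is not shown; its per-bin DOA labels are represented by a hand-written `doa_map`.

```python
from collections import Counter


def irtf_features(stft, ref=0, eps=1e-12):
    """Per-bin ratio of every non-reference microphone to the reference one.

    stft[m][t][k] is the complex STFT coefficient of microphone m at frame t
    and frequency bin k. Returns feats[t][k], a flat list
    [Re, Im, Re, Im, ...] with one (Re, Im) pair per non-reference microphone.
    Assumption: this ratio is what "iRTF" denotes in the summary above.
    """
    n_mics = len(stft)
    n_frames = len(stft[ref])
    n_bins = len(stft[ref][0])
    feats = []
    for t in range(n_frames):
        frame = []
        for k in range(n_bins):
            denom = stft[ref][t][k]
            if abs(denom) < eps:  # guard against division by (near) zero
                denom = eps
            vec = []
            for m in range(n_mics):
                if m == ref:
                    continue
                ratio = stft[m][t][k] / denom
                vec.extend((ratio.real, ratio.imag))
            frame.append(vec)
        feats.append(frame)
    return feats


def active_doas_per_frame(doa_map, min_fraction=0.2, silence_label=None):
    """Decide which DOAs are active in each time frame.

    doa_map[t][k] is the DOA class assigned to TF bin (t, k) by the
    classifier, or silence_label if the bin holds no speech.
    Assumption (hypothetical rule): a DOA is declared active in frame t when
    at least min_fraction of that frame's bins are assigned to it.
    """
    decisions = []
    for frame in doa_map:
        counts = Counter(label for label in frame if label != silence_label)
        threshold = min_fraction * len(frame)
        decisions.append(sorted(d for d, c in counts.items() if c >= threshold))
    return decisions


# Toy input: 2 microphones, 1 frame, 2 frequency bins.
stft = [
    [[1 + 0j, 2 + 0j]],  # reference microphone
    [[0 + 1j, 2 + 2j]],  # second microphone
]
features = irtf_features(stft)

# Stand-in for the U-net output: DOA labels in degrees, None for non-speech.
doa_map = [
    [30, 30, 30, 90, 90, 150, None, None, 30, 90],
    [None] * 10,
]
active = active_doas_per_frame(doa_map)
```

In this toy example the first frame has enough bins assigned to two of the three directions to pass the threshold, while the second frame contains no speech bins and yields no active speaker. The threshold value is illustrative and would need tuning on real data.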