Table 2. Configuration of training data generation. All rooms are 2.7 m in height.

From: Dynamically localizing multiple speakers based on the time-frequency domain

Simulated training data   Room 1     Room 2     Room 3      Room 4     Room 5
Room size                 (6×6) m    (5×4) m    (10×6) m    (8×3) m    (8×5) m
RT60                      0.3 s      0.2 s      0.8 s       0.4 s      0.6 s
Signal                    Noiseless signals from WSJ1 training database (all rooms)
Array position in room    6 arbitrary positions in each room (all rooms)
Source-array distance     1.5 m with added noise with 0.1 variance (all rooms)
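
For readers who want to reproduce a setup like this, the configuration above could be written down roughly as follows. This is only a sketch, not the authors' code: the names `RoomConfig`, `ROOMS`, and `sample_source_distance` are hypothetical, the assignment of the two floor dimensions to "length" and "width" is an assumption, and the table does not state the noise distribution, so zero-mean Gaussian noise with variance 0.1 is assumed here.

```python
import math
import random
from dataclasses import dataclass

ROOM_HEIGHT_M = 2.7  # all rooms share this height (from the table caption)


@dataclass(frozen=True)
class RoomConfig:
    """One simulated room (hypothetical container, not from the paper)."""
    name: str
    length_m: float  # assumption: first floor dimension listed in the table
    width_m: float   # assumption: second floor dimension listed in the table
    rt60_s: float    # reverberation time RT60 in seconds


# Values copied from the table.
ROOMS = [
    RoomConfig("Room 1", 6.0, 6.0, 0.3),
    RoomConfig("Room 2", 5.0, 4.0, 0.2),
    RoomConfig("Room 3", 10.0, 6.0, 0.8),
    RoomConfig("Room 4", 8.0, 3.0, 0.4),
    RoomConfig("Room 5", 8.0, 5.0, 0.6),
]

ARRAY_POSITIONS_PER_ROOM = 6   # "6 arbitrary positions in each room"
NOMINAL_DISTANCE_M = 1.5       # nominal source-array distance
DISTANCE_NOISE_VARIANCE = 0.1  # variance of the added noise


def sample_source_distance(rng: random.Random) -> float:
    """Draw one source-array distance in metres.

    Assumption: the added noise is zero-mean Gaussian; the table only
    gives its variance, so the standard deviation is sqrt(0.1).
    """
    sigma = math.sqrt(DISTANCE_NOISE_VARIANCE)
    return NOMINAL_DISTANCE_M + rng.gauss(0.0, sigma)
```

Passing a seeded `random.Random` instance, for example `sample_source_distance(random.Random(0))`, keeps the drawn distances reproducible across runs.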