
Table 2. Configuration of training data generation. All rooms are 2.7 m in height.

From: Dynamically localizing multiple speakers based on the time-frequency domain

Simulated training data
                         Room 1     Room 2     Room 3      Room 4     Room 5
Room size                (6×6) m    (5×4) m    (10×6) m    (8×3) m    (8×5) m
RT60                     0.3 s      0.2 s      0.8 s       0.4 s      0.6 s
Signal                   Noiseless signals from the WSJ1 training database
Array position in room   6 arbitrary positions in each room
Source-array distance    1.5 m, with additive noise of variance 0.1
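
The configuration in the table above can be written down as a small data structure. The following is only a minimal Python sketch, not the authors' data generation code. The room dimensions, RT60 values, room height, number of array positions, and the 1.5 m distance with variance 0.1 come from the table. Everything else is an assumption made for illustration: the names `Room`, `sample_array_positions`, and `sample_source_distance` are hypothetical, the array positions are drawn uniformly over the floor plan with a hypothetical 0.5 m wall margin, and the distance noise is taken to be zero-mean Gaussian. The table specifies none of these choices.

```python
import random
from dataclasses import dataclass

ROOM_HEIGHT_M = 2.7            # from the table caption
ARRAY_POSITIONS_PER_ROOM = 6   # from the table
MEAN_DISTANCE_M = 1.5          # from the table
DISTANCE_VARIANCE = 0.1        # from the table (units not stated there)


@dataclass(frozen=True)
class Room:
    name: str
    length_m: float
    width_m: float
    rt60_s: float


# Room sizes and reverberation times as listed in the table.
ROOMS = [
    Room("Room 1", 6.0, 6.0, 0.3),
    Room("Room 2", 5.0, 4.0, 0.2),
    Room("Room 3", 10.0, 6.0, 0.8),
    Room("Room 4", 8.0, 3.0, 0.4),
    Room("Room 5", 8.0, 5.0, 0.6),
]


def sample_array_positions(room, rng, margin_m=0.5):
    """Draw the array positions for one room.

    Assumption: positions are uniform over the floor plan and kept
    margin_m away from the walls; the table only says "arbitrary".
    """
    return [
        (
            rng.uniform(margin_m, room.length_m - margin_m),
            rng.uniform(margin_m, room.width_m - margin_m),
        )
        for _ in range(ARRAY_POSITIONS_PER_ROOM)
    ]


def sample_source_distance(rng):
    """Draw one source-array distance in metres.

    Assumption: the added noise is zero-mean Gaussian, so the standard
    deviation is the square root of the stated variance.
    """
    return MEAN_DISTANCE_M + rng.gauss(0.0, DISTANCE_VARIANCE ** 0.5)
```

With a seeded generator such as `rng = random.Random(0)`, calling `sample_array_positions(room, rng)` for each entry of `ROOMS` gives 5 × 6 = 30 array placements, and `sample_source_distance(rng)` can be called once per simulated source.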