Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition

EURASIP Journal on Audio, Speech, and Music Processing

Table 14 Multi-condition (combined) NN training datasets for the experiments using real reverberant data

Dataset name	Number of speakers	Utterances per speaker per environment (pairs of utterances)	Total duration of utterances (s)
5s.20u	5	1	54
5s.40u	5	2	108
5s.60u	5	3	162
5s.80u	5	4	213
1s.20u	1	5	70
1s.40u	1	10	138

The total duration of utterances is measured after removing the silent portions at the beginning and end of each recording. For the one-speaker datasets (‘1s’), the total duration is the average over the five speakers’ individual datasets.
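As a quick consistency check on the table, the sketch below assumes that the ‘u’ suffix in each dataset name counts utterance pairs summed over all speakers and environments. This reading is an assumption inferred from the names, not something stated here. Under it, the script derives the number of environments implied by each row and the mean duration per utterance pair. The helper names (`parse_name`, `implied_environments`, `mean_pair_duration`) are hypothetical and used only for illustration.

```python
# Rows copied from Table 14:
# (dataset name, speakers, utterance pairs per speaker per environment, total seconds)
ROWS = [
    ("5s.20u", 5, 1, 54),
    ("5s.40u", 5, 2, 108),
    ("5s.60u", 5, 3, 162),
    ("5s.80u", 5, 4, 213),
    ("1s.20u", 1, 5, 70),
    ("1s.40u", 1, 10, 138),
]


def parse_name(name):
    """Split a name such as '5s.20u' into (speakers, utterance_pairs).

    Assumption: the 's' field is the speaker count and the 'u' field is the
    total number of utterance pairs in the dataset.
    """
    spk_field, utt_field = name.split(".")
    return int(spk_field.rstrip("s")), int(utt_field.rstrip("u"))


def implied_environments(name, speakers, pairs_per_spk_per_env):
    """Number of environments implied by the name and the per-speaker count."""
    _, total_pairs = parse_name(name)
    return total_pairs / (speakers * pairs_per_spk_per_env)


def mean_pair_duration(name, total_seconds):
    """Average duration in seconds of one utterance pair."""
    _, total_pairs = parse_name(name)
    return total_seconds / total_pairs


# Map each dataset name to (implied environments, mean seconds per pair).
summary = {
    name: (implied_environments(name, spk, per_env), mean_pair_duration(name, secs))
    for name, spk, per_env, secs in ROWS
}

for name, (envs, mean_secs) in summary.items():
    print(f"{name}: implied environments = {envs:g}, mean pair duration = {mean_secs:.2f} s")
```

If the naming assumption holds, every row should imply the same number of environments, which is a simple way to spot a transcription error in the table.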