Table 2 Experimental results by using simulated reverberant data

From: Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition

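Speaker identification rate (%) by context type: left context only (L), left+right context (L+R), and left+short right context (L+sR). For each context type, the table gives the frame selection (Frame sel.) and the rate for training data 1u, 5u, 10u, and 15u. Empty cells have no entry in the source.

| NN conf. | RIR | Frame sel. type | L: Frame sel. | L: 1u | L: 5u | L: 10u | L: 15u | L+R: Frame sel. | L+R: 1u | L+R: 5u | L+R: 10u | L+R: 15u | L+sR: Frame sel. | L+sR: 1u | L+sR: 5u | L+sR: 10u | L+sR: 15u |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Multiple NNs | Office | Linear | 3-1-0 | 71.6 | 76.7 | 76.8 | 77.0 | | | | | | | | | | |
| Multiple NNs | Office | Linear | 7-1-0 | 59.9 | 79.6 | 81.7 | 81.9 | 3-1-3 | 70.0 | 82.3 | 82.7 | 83.1 | | | | | |
| Multiple NNs | Office | Linear | 15-1-0 | 33.4 | 65.3 | 78.8 | 80.9 | 7-1-7 | 55.5 | 76.9 | 83.3 | 85.1 | 7-1-3 | 56.2 | 81.3 | 85.4 | 85.8 |
| Multiple NNs | Office | Skip1 | 3-1-0 | 74.4 | 79.5 | 79.4 | 80.1 | | | | | | | | | | |
| Multiple NNs | Office | Skip1 | 7-1-0 | 57.1 | 81.3 | 82.4 | 84.0 | 3-1-3 | 69.2 | 83.8 | 85.8 | 86.1 | | | | | |
| Multiple NNs | Office | Skip1 | | | | | | 7-1-7 | 52.7 | 72.0 | 82.2 | 85.0 | 7-1-3 | 59.1 | 83.1 | 85.7 | 87.1 |
| Multiple NNs | Livingroom | Linear | 3-1-0 | 60.1 | 69.6 | 70.6 | 70.8 | | | | | | | | | | |
| Multiple NNs | Livingroom | Linear | 7-1-0 | 52.3 | 76.0 | 78.1 | 78.9 | 3-1-3 | 52.3 | 75.4 | 75.8 | 75.7 | | | | | |
| Multiple NNs | Livingroom | Linear | 15-1-0 | 23.6 | 58.4 | 72.2 | 76.5 | 7-1-7 | 35.5 | 62.4 | 74.4 | 78.4 | 7-1-3 | 32.5 | 74.1 | 79.4 | 81.1 |
| Multiple NNs | Livingroom | Skip1 | 3-1-0 | 63.2 | 74.0 | 74.1 | 74.5 | | | | | | | | | | |
| Multiple NNs | Livingroom | Skip1 | 7-1-0 | 39.8 | 75.5 | 78.8 | 79.7 | 3-1-3 | 52.4 | 77.8 | 79.5 | 79.1 | | | | | |
| Multiple NNs | Livingroom | Skip1 | | | | | | 7-1-7 | 25.6 | 61.3 | 74.6 | 79.2 | 7-1-3 | 32.5 | 72.2 | 79.6 | 82.1 |
| Single NN | Livingroom | Linear | 3-1-0 | 64.9 | 71.6 | 70.6 | 71.3 | | | | | | | | | | |
| Single NN | Livingroom | Linear | 7-1-0 | 71.8 | 75.4 | 75.4 | 75.4 | 3-1-3 | 72.0 | 75.4 | 74.9 | 74.0 | | | | | |
| Single NN | Livingroom | Linear | 15-1-0 | 70.5 | 77.0 | 77.2 | 77.7 | 7-1-7 | 73.6 | 77.5 | 79.2 | 78.4 | 7-1-3 | 76.9 | 77.8 | 78.8 | 78.6 |
| Single NN | Livingroom | Skip1 | 3-1-0 | 71.1 | 73.4 | 74.3 | 74.4 | | | | | | | | | | |
| Single NN | Livingroom | Skip1 | 7-1-0 | 72.6 | 75.9 | 76.1 | 76.2 | 3-1-3 | 73.7 | 76.2 | 76.2 | 77.0 | | | | | |
| Single NN | Livingroom | Skip1 | | | | | | 7-1-7 | 71.4 | 79.3 | 79.5 | 79.9 | 7-1-3 | 74.6 | 79.6 | 79.7 | 79.7 |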