Table 2 Test PER results of different models for online recognition on the TIMIT dataset

From: Segment boundary detection directed attention for online end-to-end speech recognition

| Model | #Param (M) | PER (%) |
| --- | --- | --- |
| Partial condition [12] | 3.1 | 20.8 |
| Hard alignment with RL [14] | 6.8 | 20.5 |
| Gaussian prediction attention [40] | 5.8 | 20.4 |
| Hard monotonic attention [16] | 6.4 | 20.4 |
| Stacked LSTM [15] | 1.0 | 20.0 |
| CTC [41] | 3.8 | 19.6 |
| Proposed method | 6.9 | 20.2 |
| Soft attention* | 5.9 | 21.0 |
| Soft attention bigger-E* | 7.5 | 20.8 |
| Soft attention bigger-D* | 7.0 | 20.6 |

  1. All models use a unidirectional encoder; * indicates an offline attention model.