From: A new joint CTC-attention-based speech recognition model with multi-level multi-head attention
Dataset      2 heads   3 heads   4 heads   5 heads
TIMIT        16.73     16.52     16.34     16.60
WSJ          4.2       4.0       3.8
LibriSpeech  3.9       3.7       3.6