From: Advanced recurrent network-based hybrid acoustic models for low resource speech recognition
Model | Training time per epoch (h) | | | | | | |
---|---|---|---|---|---|---|---|
 | 101 Cantonese | 104 Pashto | 107 Vietnamese | 202 Swahili | 204 Tamil | 302 Kazakh | 404 Georgian |
BLSTM-fbank | 12.79 | 8.40 | 7.69 | 4.92 | 7.00 | 4.85 | 4.88 |
LW-BLSTM-fbank | 8.01 | 4.71 | 4.83 | 3.06 | 4.37 | 3.03 | 3.04 |
BLSTM-MBN | 12.78 | 8.40 | 7.67 | 4.93 | 7.01 | 4.85 | 4.87 |
LW-BLSTM-MBN | 8.01 | 4.71 | 4.83 | 3.06 | 4.37 | 3.03 | 3.03 |