From: Dual supervised learning for non-native speech recognition
Setup | M_L | M_S | M_STT | M_TTS
1: RNN 3 × 512, RNN 2 × 1024, Wavenet
2: LSTM 3 × 512, LSTM 2 × 1024
3: 3-gram