Table 2. WER (%) for a 6 × 2048 network with soft targets from a sequence-trained teacher with fMLLR inputs

From: Wise teachers train better DNN acoustic models

Input features | Targets            | Data                | Hub5’00-SWB | RT03S-FSH
FMLLR          | Hard alignment     | 110 h transcribed   | 15.1 %      | 18.2 %
FBANK          | Hard alignment     | 110 h transcribed   | 18.3 %      | 22.6 %
FBANK          | FMLLR-sMBR outputs | 110 h untranscribed | 17.9 %      | 22.6 %
FBANK          | FMLLR-sMBR outputs | 300 h untranscribed | 16.8 %      | 21.2 %
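
The rows of Table 2 may be easier to compare as relative WER reductions over the FBANK hard-alignment baseline. The sketch below is only an illustration of that arithmetic. The WER values are copied from the table, while the dictionary keys and the helper relative_reduction are names introduced here and do not come from the paper.

```python
# WER values (%) copied from Table 2. The keys are shorthand labels
# invented for this sketch, not identifiers used by the authors.
wer = {
    "fmllr_hard_110h": {"swb": 15.1, "fsh": 18.2},
    "fbank_hard_110h": {"swb": 18.3, "fsh": 22.6},
    "fbank_soft_110h": {"swb": 17.9, "fsh": 22.6},
    "fbank_soft_300h": {"swb": 16.8, "fsh": 21.2},
}


def relative_reduction(baseline, system):
    """Relative WER reduction of `system` over `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


# Compare each soft-target FBANK student against the FBANK hard-alignment row.
baseline = wer["fbank_hard_110h"]
reductions = {
    name: {
        test_set: round(relative_reduction(baseline[test_set], scores[test_set]), 1)
        for test_set in ("swb", "fsh")
    }
    for name, scores in wer.items()
    if name.startswith("fbank_soft")
}

for name, per_set in reductions.items():
    print(name, per_set)
```

On these numbers, the 300 h untranscribed student recovers part of the gap to the FMLLR system on both test sets, while the 110 h student improves only on Hub5’00-SWB.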