Table 1 WER (%) for 6 × 2048 network with soft targets from cross-entropy-trained teacher with fMLLR inputs

From: Wise teachers train better DNN acoustic models

| Input features | Targets | Data | Hub5’00-SWB | RT03S-FSH |
|---|---|---|---|---|
| FMLLR | Hard alignment | 110 h transcribed | 16.9 % | 20.1 % |
| FBANK | Hard alignment | 110 h transcribed | 19.9 % | 25.1 % |
| FBANK | FMLLR-XEnt outputs | 110 h untranscribed | 19.5 % | 24.2 % |
| FBANK | FMLLR-XEnt outputs | 300 h untranscribed | 18.4 % | 22.7 % |