Table 3 WER (%) for a 6 × 2048 network trained with soft targets and additional untranscribed data from Fisher

From: Wise teachers train better DNN acoustic models

| Input features | Targets | Data | Hub5’00-SWB | RT03S-FSH |
|---|---|---|---|---|
| FBANK | Hard alignment | 110 h transcribed (Xent) | 19.9 % | 25.1 % |
| FBANK | FMLLR-Xent outputs | 1100 h untranscribed SWBD + FSH | 17.7 % | 20.9 % |
| FBANK | Hard alignment | 110 h transcribed (sMBR) | 18.3 % | 22.6 % |
| FBANK | FMLLR-sMBR outputs | 1100 h untranscribed SWBD + FSH | 16.4 % | 19.5 % |