
Table 2 WER (%) for 6 × 2048 network with soft targets from sequence-trained teacher with fMLLR inputs

From: Wise teachers train better DNN acoustic models

| Input features | Targets | Data | Hub5’00-SWB | RT03S-FSH |
|---|---|---|---|---|
| FMLLR | Hard alignment | 110 h transcribed | 15.1 % | 18.2 % |
| FBANK | Hard alignment | 110 h transcribed | 18.3 % | 22.6 % |
| FBANK | FMLLR-sMBR outputs | 110 h untranscribed | 17.9 % | 22.6 % |
| FBANK | FMLLR-sMBR outputs | 300 h untranscribed | 16.8 % | 21.2 % |