Table 1 WER (%) for 6 × 2048 network with soft targets from cross-entropy-trained teacher with fMLLR inputs

From: Wise teachers train better DNN acoustic models

Input features   Targets              Data                  Hub5’00-SWB   RT03S-FSH
--------------   ------------------   -------------------   -----------   ---------
FMLLR            Hard alignment       110 h transcribed     16.9 %        20.1 %
FBANK            Hard alignment       110 h transcribed     19.9 %        25.1 %
FBANK            FMLLR-XEnt outputs   110 h untranscribed   19.5 %        24.2 %
FBANK            FMLLR-XEnt outputs   300 h untranscribed   18.4 %        22.7 %