Table 3 WER (%) for the 6 × 2048 network trained with soft targets and additional untranscribed data from Fisher

From: Wise teachers train better DNN acoustic models

| Input features | Targets | Data | Hub5’00-SWB | RT03S-FSH |
|---|---|---|---|---|
| FBANK | Hard alignment | 110 h transcribed (Xent) | 19.9 % | 25.1 % |
| FBANK | FMLLR-Xent outputs | 1100 h untranscribed SWBD + FSH | 17.7 % | 20.9 % |
| FBANK | Hard alignment | 110 h transcribed (sMBR) | 18.3 % | 22.6 % |
| FBANK | FMLLR-sMBR outputs | 1100 h untranscribed SWBD + FSH | 16.4 % | 19.5 % |
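
One way to read the table is as the relative WER reduction from the hard-alignment rows to the soft-target rows. The following is a minimal sketch of that arithmetic only. The dictionary keys and the function name `relative_reduction` are hypothetical labels chosen for illustration and do not come from the article; the numbers are copied from the table above.

```python
# WER (%) values copied from the table above, as (Hub5'00-SWB, RT03S-FSH).
# The keys are hypothetical labels, not names used in the article.
wer = {
    "xent_hard": (19.9, 25.1),  # hard alignment, 110 h transcribed (Xent)
    "xent_soft": (17.7, 20.9),  # FMLLR-Xent outputs, 1100 h untranscribed
    "smbr_hard": (18.3, 22.6),  # hard alignment, 110 h transcribed (sMBR)
    "smbr_soft": (16.4, 19.5),  # FMLLR-sMBR outputs, 1100 h untranscribed
}

TEST_SETS = ("Hub5'00-SWB", "RT03S-FSH")


def relative_reduction(baseline, improved):
    """Return the relative WER reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


for criterion in ("xent", "smbr"):
    hard = wer[f"{criterion}_hard"]
    soft = wer[f"{criterion}_soft"]
    for test_set, base, new in zip(TEST_SETS, hard, soft):
        rel = relative_reduction(base, new)
        print(f"{criterion} {test_set}: {rel:.1f} % relative reduction")
```

Under these figures, the soft-target rows come out roughly 10 to 17 % lower in relative terms than the corresponding hard-alignment rows, with the larger gains on RT03S-FSH.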