Table 1 The corpus size and number of senones of the Babel languages

From: Advanced recurrent network-based hybrid acoustic models for low resource speech recognition

| Language | Training hours | # senones | Language | Training hours | # senones | Language | Training hours | # senones |
|---|---|---|---|---|---|---|---|---|
| Cantonese (101) | 140.7 | 4687 | Assamese (102) | 60.3 | 4707 | Bengali (103) | 61.1 | 4929 |
| Pashto (104) | 77.3 | 4823 | Turkish (105) | 76.4 | 4791 | Tagalog (106) | 83.8 | 4814 |
| Vietnamese (107) | 87.1 | 4692 | Haitian (201) | 66.5 | 4875 | Swahili (202) | 44.0 | 4638 |
| Lao (203) | 65.2 | 4667 | Tamil (204) | 64.1 | 4560 | Kurdish (205) | 41.7 | 4412 |
| Zulu (206) | 61.3 | 4464 | Tok Pisin (207) | 39.0 | 4565 | Cebuano (301) | 41.0 | 4603 |
| Kazakh (302) | 39.6 | 4714 | Telugu (303) | 41.8 | 4515 | Lithuanian (304) | 42.1 | 4755 |
| Guarani (305) | 42.6 | 4504 | Igbo (306) | 43.7 | 4659 | Amharic (307) | 43.2 | 4685 |
| Mongolian (401) | 45.7 | 4537 | Javanese (402) | 45.1 | 4763 | Dholuo (403) | 41.3 | 4571 |

  1. The number in parentheses after each language name is the language ID assigned by the Babel program.