Table 1 Development and test term list characteristics for MAVIR database

From: Whisper-based spoken term detection systems for search on speech ALBAYZIN evaluation challenge

Term list         Development    Test
#INL (occ.)       354 (959)      208 (2071)
#OOL (occ.)       20 (55)        15 (50)
#SINGLE (occ.)    340 (984)      198 (2093)
#MULTI (occ.)     34 (30)        25 (28)

  1. occ.: number of occurrences (in brackets); INL: in-language terms; OOL: out-of-language terms; SINGLE: single-word terms; MULTI: multi-word terms. Term length in the development term list ranges from 5 to 27 graphemes (5 to 16 graphemes for single-word terms and 7 to 27 graphemes for multi-word terms). Term length in the test term list ranges from 4 to 28 graphemes (4 to 16 graphemes for single-word terms and 7 to 28 graphemes for multi-word terms).
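
The table does not state the overall size of each term list, but it can be derived from the rows. The sketch below is a sanity check, not part of the original evaluation. It assumes that every term is either INL or OOL and either SINGLE or MULTI, so both splits should add up to the same totals. The names `TABLE`, `totals` and `check_partitions` are illustrative only.

```python
# Counts copied from Table 1 as (number of terms, number of occurrences).
TABLE = {
    "development": {
        "INL": (354, 959),
        "OOL": (20, 55),
        "SINGLE": (340, 984),
        "MULTI": (34, 30),
    },
    "test": {
        "INL": (208, 2071),
        "OOL": (15, 50),
        "SINGLE": (198, 2093),
        "MULTI": (25, 28),
    },
}


def totals(rows, first, second):
    """Add the term counts and occurrence counts of two table rows."""
    terms = rows[first][0] + rows[second][0]
    occurrences = rows[first][1] + rows[second][1]
    return terms, occurrences


def check_partitions(table):
    """Return (terms, occurrences) per term list, checking both splits agree.

    Assumption: INL/OOL and SINGLE/MULTI each partition the same term list.
    """
    summary = {}
    for name, rows in table.items():
        by_language = totals(rows, "INL", "OOL")
        by_length = totals(rows, "SINGLE", "MULTI")
        if by_language != by_length:
            raise ValueError(f"{name}: {by_language} != {by_length}")
        summary[name] = by_language
    return summary


SUMMARY = check_partitions(TABLE)
print(SUMMARY)
# {'development': (374, 1014), 'test': (223, 2121)}
```

Under that assumption, the development term list contains 374 terms with 1014 occurrences and the test term list contains 223 terms with 2121 occurrences.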