Table 4 Development and test term list characteristics for MAVIR and EPIC databases

From: ALBAYZIN 2016 spoken term detection evaluation: an international open competitive evaluation in Spanish

Term list                 dev          test-MAVIR    test-EPIC
#IN-LANG terms (occ.)     354 (959)    208 (2071)    183 (1912)
#OUT-LANG terms (occ.)    20 (55)      15 (50)       0 (0)
#SINGLE terms (occ.)      340 (984)    198 (2093)    183 (1912)
#MULTI terms (occ.)       34 (30)      25 (28)       0 (0)
#INV terms (occ.)         292 (668)    192 (1749)    150 (1562)
#OOV terms (occ.)         82 (346)     31 (372)      33 (350)
‘dev’ stands for development, ‘IN-LANG’ refers to in-language terms, ‘OUT-LANG’ to foreign terms, ‘SINGLE’ to single-word terms, ‘MULTI’ to multi-word terms, ‘INV’ to in-vocabulary terms, ‘OOV’ to out-of-vocabulary terms, and ‘occ.’ stands for occurrences. Term length ranges from 5 to 27 graphemes in the development term list, from 4 to 28 graphemes in the MAVIR test term list, and from 6 to 16 graphemes in the EPIC test term list.
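
The three category pairs in Table 4 (IN-LANG/OUT-LANG, SINGLE/MULTI, INV/OOV) look like complementary splits of the same term list, so each pair should sum to the same number of terms and occurrences. The short Python sketch below checks that arithmetic on the counts copied from the table. The names `TABLE4`, `PAIRS`, and `totals` are illustrative only, and treating the pairs as complementary is an assumption inferred from the category definitions, not something the table states.

```python
# Counts copied from Table 4 as (terms, occurrences) per term list.
TABLE4 = {
    "dev": {
        "IN-LANG": (354, 959), "OUT-LANG": (20, 55),
        "SINGLE": (340, 984), "MULTI": (34, 30),
        "INV": (292, 668), "OOV": (82, 346),
    },
    "test-MAVIR": {
        "IN-LANG": (208, 2071), "OUT-LANG": (15, 50),
        "SINGLE": (198, 2093), "MULTI": (25, 28),
        "INV": (192, 1749), "OOV": (31, 372),
    },
    "test-EPIC": {
        "IN-LANG": (183, 1912), "OUT-LANG": (0, 0),
        "SINGLE": (183, 1912), "MULTI": (0, 0),
        "INV": (150, 1562), "OOV": (33, 350),
    },
}

# Assumption: each pair splits the full term list into two disjoint groups.
PAIRS = [("IN-LANG", "OUT-LANG"), ("SINGLE", "MULTI"), ("INV", "OOV")]


def totals(counts):
    """Return the distinct (terms, occurrences) totals over all pairs."""
    return {
        (counts[a][0] + counts[b][0], counts[a][1] + counts[b][1])
        for a, b in PAIRS
    }


for name, counts in TABLE4.items():
    result = totals(counts)
    # A single distinct total means the three splits agree with each other.
    assert len(result) == 1, f"inconsistent totals for {name}: {result}"
    terms, occurrences = next(iter(result))
    print(f"{name}: {terms} terms, {occurrences} occurrences")

# Prints:
# dev: 374 terms, 1014 occurrences
# test-MAVIR: 223 terms, 2121 occurrences
# test-EPIC: 183 terms, 1912 occurrences
```

Under that assumption, the three splits agree for every term list, which gives 374 development terms, 223 MAVIR test terms, and 183 EPIC test terms.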