Table 4. MAVIR training/development and test data file characteristics. “train/dev” stands for training/development, “occ.” stands for occurrences, and “min” stands for minutes.

From: Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

File ID   Dataset    # occ.  # occ./min  # different terms  # different terms/min
MAVIR-02  train/dev  1016    13.6        203                2.7
MAVIR-03  train/dev  653     17.1        153                4.0
MAVIR-06  train/dev  446     15.3        126                4.3
MAVIR-07  train/dev  296     13.6        104                4.8
MAVIR-08  train/dev  200     10.6        76                 4.0
MAVIR-09  train/dev  910     13.0        199                2.8
MAVIR-12  train/dev  671     9.9         129                1.9
MAVIR-04  test       1026    17.9        167                2.9
MAVIR-11  test       414     20.4        70                 3.4
MAVIR-13  test       614     14.1        98                 2.2
ALL       train/dev  4192    13.1        346                1.1
ALL       test       2054    16.9        202                1.7
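
The ALL rows can be cross-checked against the per-file rows. The following is a minimal Python sketch, not part of the original article: it sums the per-file occurrence counts copied from the table and compares them with the reported totals. The helper `approx_minutes` is a hypothetical name introduced here, and it only gives a rough duration, since it assumes that "# occ./min" is occurrences divided by audio length in minutes and the published rates are rounded to one decimal.

```python
# Per-file occurrence counts copied from Table 4.
train_dev_occ = {
    "MAVIR-02": 1016,
    "MAVIR-03": 653,
    "MAVIR-06": 446,
    "MAVIR-07": 296,
    "MAVIR-08": 200,
    "MAVIR-09": 910,
    "MAVIR-12": 671,
}
test_occ = {
    "MAVIR-04": 1026,
    "MAVIR-11": 414,
    "MAVIR-13": 614,
}

# Occurrence counts are additive across files, so these sums
# should match the "# occ." values in the ALL rows.
total_train_dev = sum(train_dev_occ.values())
total_test = sum(test_occ.values())


def approx_minutes(occurrences, occ_per_min):
    """Rough duration in minutes implied by a count and its rounded rate.

    Assumption (not stated in the table): rate = occurrences / minutes.
    """
    return occurrences / occ_per_min


print(total_train_dev)  # 4192
print(total_test)       # 2054
print(round(approx_minutes(total_train_dev, 13.1)))  # implied train/dev minutes
print(round(approx_minutes(total_test, 16.9)))       # implied test minutes
```

Unlike the occurrence counts, the "# different terms" column is not additive: the per-file values sum to more than the ALL values (990 versus 346 for train/dev, 335 versus 202 for test), because the same term can occur in several files.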