Table 1 Characteristics of the MAVIR database: number of word occurrences (#word occ.), duration (dur.) in minutes (min), number of speakers (#spk.), and average mean opinion score (Ave. MOS)

From: Search on speech from spoken queries: the Multi-domain International ALBAYZIN 2018 Query-by-Example Spoken Term Detection Evaluation

| File ID  | Data  | #word occ. | dur. (min) | #spk.              | Ave. MOS |
|----------|-------|------------|------------|--------------------|----------|
| Mavir-02 | train | 13432      | 74.51      | 7 (7 ma.)          | 2.69     |
| Mavir-03 | dev   | 6681       | 38.18      | 2 (1 ma. 1 fe.)    | 2.83     |
| Mavir-06 | train | 4332       | 29.15      | 3 (2 ma. 1 fe.)    | 2.89     |
| Mavir-07 | dev   | 3831       | 21.78      | 2 (2 ma.)          | 3.26     |
| Mavir-08 | train | 3356       | 18.90      | 1 (1 ma.)          | 3.13     |
| Mavir-09 | train | 11179      | 70.05      | 1 (1 ma.)          | 2.39     |
| Mavir-12 | train | 11168      | 67.66      | 1 (1 ma.)          | 2.32     |
| Mavir-04 | test  | 9310       | 57.36      | 4 (3 ma. 1 fe.)    | 2.85     |
| Mavir-11 | test  | 3130       | 20.33      | 1 (1 ma.)          | 2.46     |
| Mavir-13 | test  | 7837       | 43.61      | 1 (1 ma.)          | 2.48     |
| ALL      | train | 43467      | 260.27     | 13 (12 ma. 1 fe.)  | 2.56     |
| ALL      | dev   | 10512      | 59.96      | 4 (3 ma. 1 fe.)    | 2.64     |
| ALL      | test  | 20277      | 121.3      | 6 (5 ma. 1 fe.)    | 2.65     |
ma.: male; fe.: female. These characteristics are displayed for the training (train), development (dev), and testing (test) datasets.
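
The ALL rows for word occurrences and duration can be reproduced by summing the per-file rows of each partition. The following is a minimal sketch of that arithmetic in Python, using only values copied from Table 1. The names `ROWS` and `partition_totals` are hypothetical helpers introduced for illustration. The Ave. MOS column is left out on purpose, because the table does not state how the partition-level averages are computed.

```python
from collections import defaultdict

# (file ID, partition, word occurrences, duration in minutes), copied from Table 1.
ROWS = [
    ("Mavir-02", "train", 13432, 74.51),
    ("Mavir-03", "dev", 6681, 38.18),
    ("Mavir-06", "train", 4332, 29.15),
    ("Mavir-07", "dev", 3831, 21.78),
    ("Mavir-08", "train", 3356, 18.90),
    ("Mavir-09", "train", 11179, 70.05),
    ("Mavir-12", "train", 11168, 67.66),
    ("Mavir-04", "test", 9310, 57.36),
    ("Mavir-11", "test", 3130, 20.33),
    ("Mavir-13", "test", 7837, 43.61),
]


def partition_totals(rows):
    """Sum word occurrences and durations per partition (hypothetical helper)."""
    totals = defaultdict(lambda: [0, 0.0])
    for _file_id, partition, words, minutes in rows:
        totals[partition][0] += words
        totals[partition][1] += minutes
    # Round durations to two decimals, the precision used in the table.
    return {p: (words, round(minutes, 2)) for p, (words, minutes) in totals.items()}


totals = partition_totals(ROWS)
for partition in ("train", "dev", "test"):
    words, minutes = totals[partition]
    print(f"{partition}: {words} word occurrences, {minutes:.2f} min")
```

With these rows, the computed sums agree with the ALL rows of the table for all three partitions.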