From: Deep neural networks for automatic speech processing: a survey from large corpora to limited data
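| Model type | Quantity of data used (pre-training) | Quantity of data used (training) | Accuracy (%) |
|---|---|---|---|
| Pase+ [8] | 50h | 350h | 37.99 |
| Wav2Vec2.0 [9] | 960h | 350h | 75.18 |
| Wav2Vec2.0 [9] | 60k h | 350h | 86.14 |
| HuBERT [10] | 960h | 350h | 81.42 |
| HuBERT [10] | 60k h | 350h | 90.33 |
| AutoSpeech [17] | - | 350h | 87.66 |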