Table 1 Paper summary

From: Performance vs. hardware requirements in state-of-the-art automatic speech recognition

| Section | Content |
| --- | --- |
| 1. Introduction | Paper context; paper goals; paper structure |
| 2. Introduction to ASR systems | Main concepts about the automatic speech recognition field |
| 2.1 The road from pipeline ASR to end-to-end ASR | Differences between those two categories of systems |
| 2.2 Feature extraction | Most popular speech features |
| 2.3 Traditional, HMM-based acoustic modeling | Acoustic modeling using Hidden Markov Models: concepts and systems |
| 2.4 End-to-end ASR systems | Most common end-to-end approaches |
| 2.5 Language Modeling | Most common language modeling approaches |
| 3. State-of-the-art ASR implementations | Detailed description of 8 speech recognition systems |
| 3.1 Kaldi chain model TDNN | Simple time-delay neural network system |
| 3.2 Kaldi chain model CNN-TDNN | Convolutional + time-delay neural network system |
| 3.3 Paddle Paddle implementation of DeepSpeech2 | Simple recurrent neural network system |
| 3.4 RWTH RETURNN | Attention-based encoder-decoder neural network system |
| 3.5 Facebook CNN-ASG | Fully convolutional neural network system with gated linear units |
| 3.6 Facebook TDS-S2S | Convolutional neural network system with time-depth separable blocks |
| 3.7 Nvidia Jasper | Convolutional neural network system with residual connections |
| 3.8 Nvidia QuartzNet | Lightweight convolutional neural network system with time-channel separable residual blocks |
| 4. ASR comparison and evaluation. Case study on LibriSpeech | Accuracy and hardware requirements of those 8 implementations, evaluated on the LibriSpeech task |
| 4.1 Evaluation of model complexity | Definition of the metrics used for model complexity |
| 4.2 Comparison of ASR systems in terms of model complexity | Complexity of the models, computed as the number of parameters, operations and activations |
| 4.3 Comparison of ASR systems in terms of performance | Transcription accuracy on the LibriSpeech dataset |
| 4.4 Trade-offs between ASR performance and hardware | Accuracy vs. hardware requirements: trade-off analysis |
| 5. Conclusion | Paper summary; achieved goals; main conclusions that emerged from the analysis of those 8 systems |