From: Performance vs. hardware requirements in state-of-the-art automatic speech recognition
Network complexity | Formula |
---|---|
Parameters (model size) in fully connected layers | input size × output size + bias |
Parameters (model size) in time-delay layers | (input size × output size + bias) × context size |
Parameters (model size) in convolutional layers | (filter size × number of input filters + bias) × number of output filters |
Parameters (model size) in recurrent layers | (input size + recurrent layer size) × output size × number of gates × directionality factor |
Multiply-accumulate operations (MACs) | parameters × feature vector length × sequence length in time |
Operations (Ops) | MACs × 2 |
Activations in fully connected layers and time-delay layers | output size × output sequence length |
Activations in convolutional layers | output size × number of output filters × output sequence length |
Activations in recurrent layers | recurrent layer size × output sequence length × directionality factor |
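
The formulas in the table can be written as small Python helpers. This is a minimal sketch, not the authors' code. All function and argument names are hypothetical. The table does not define the bias term, so the sketch assumes one bias value per output unit in fully connected and time-delay layers and one per output filter in convolutional layers. It also assumes a directionality factor of 1 for unidirectional and 2 for bidirectional layers.

```python
# Sketch of the complexity formulas from the table.
# Assumptions (not stated in the source): the size of the bias term and
# the values of the directionality factor (1 or 2).


def fc_params(input_size, output_size, bias=True):
    """Fully connected: input size * output size + bias."""
    return input_size * output_size + (output_size if bias else 0)


def tdnn_params(input_size, output_size, context_size, bias=True):
    """Time-delay: (input size * output size + bias) * context size."""
    return (input_size * output_size + (output_size if bias else 0)) * context_size


def conv_params(filter_size, in_filters, out_filters, bias=True):
    """Convolutional: (filter size * input filters + bias) * output filters."""
    return (filter_size * in_filters + (1 if bias else 0)) * out_filters


def recurrent_params(input_size, recurrent_size, output_size, num_gates,
                     bidirectional=False):
    """Recurrent: (input + recurrent size) * output size * gates * directionality."""
    directionality = 2 if bidirectional else 1
    return (input_size + recurrent_size) * output_size * num_gates * directionality


def macs(params, feature_vector_length, sequence_length):
    """MACs: parameters * feature vector length * sequence length in time."""
    return params * feature_vector_length * sequence_length


def ops(mac_count):
    """Ops: each multiply-accumulate counts as two operations."""
    return mac_count * 2


def fc_or_tdnn_activations(output_size, output_sequence_length):
    """Activations in fully connected and time-delay layers."""
    return output_size * output_sequence_length


def conv_activations(output_size, out_filters, output_sequence_length):
    """Activations in convolutional layers."""
    return output_size * out_filters * output_sequence_length


def recurrent_activations(recurrent_size, output_sequence_length,
                          bidirectional=False):
    """Activations in recurrent layers."""
    directionality = 2 if bidirectional else 1
    return recurrent_size * output_sequence_length * directionality


if __name__ == "__main__":
    # Illustrative layer sizes only; they are not taken from the paper.
    p = fc_params(440, 1024)
    print(p)                                  # 451584
    print(ops(macs(p, 1, 100)))               # 90316800
    print(conv_params(9, 1, 64))              # 640
    print(recurrent_params(40, 512, 512, 4, bidirectional=True))  # 2260992
```

The layer sizes in the usage example are invented for illustration. The call with `num_gates=4` corresponds to an LSTM-style layer and `num_gates=3` would correspond to a GRU-style layer.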