Fig. 6
From: End-to-end speech emotion recognition using a novel context-stacking dilated convolution neural network
Performance comparisons to the baseline CNN-LSTM model. “Aug.” indicates using the sliding window augmentation in (a) or the random flipping augmentation in (b).