Fig. 1
From: Exploring convolutional, recurrent, and hybrid deep neural networks for speech and music detection in a large audio dataset

Waveform, spectrogram, and mel-spectrogram of a 10-s speech segment obtained from Google AudioSet. The mel-spectrogram, built on the auditory-based mel-frequency scale, provides better resolution for lower frequencies than the spectrogram.
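
The caption's point about low-frequency resolution can be illustrated with a minimal sketch. The caption does not say which mel formula or filterbank settings were used, so the HTK-style conversion, the 0 to 8000 Hz range, and the 8 bands below are assumptions chosen only for illustration, and the helper names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    # HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    # Band edges in Hz that are equally spaced on the mel scale.
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / n_bands
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 1)]


# Illustrative settings only: 8 bands covering 0 to 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, 8)
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

for lo, hi in zip(edges, edges[1:]):
    print(f"{lo:8.1f} Hz to {hi:8.1f} Hz  (width {hi - lo:7.1f} Hz)")
```

Bands that are equally wide in mel are narrow in Hz at the low end and widen toward high frequencies, which is why a mel-spectrogram devotes more of its vertical axis to low frequencies than a linear-frequency spectrogram does.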