Fig. 5
From: Enhancement of speech dynamics for voice activity detection using DNN

Representation of (a) a speech signal, (b) its starting and ending point candidates obtained using rules (i) and (ii), (c) the masks, and (d) the speech period candidates obtained by multiplying the masks by the spectra expressed in decimal form