Fig. 4
From: A depthwise separable convolutional neural network for keyword spotting on an embedded system

Overview of a single depthwise separable convolutional layer consisting of a depthwise convolution followed by a pointwise convolution. (1) The depthwise convolution separately applies a 2-dimensional filter to each of the channels in the input, extracting time-frequency patterns. (2) The pointwise convolution then applies a number of 1-dimensional filters to the output of the depthwise convolution across all channels.
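
The two steps in the caption can be illustrated with a minimal pure-Python sketch. It is not the article's implementation: the function names, the `[channel][time][frequency]` nested-list layout, stride 1, no padding, and the absence of bias, normalization, and activation are all assumptions made for illustration.

```python
def depthwise_conv2d(x, kernels):
    """Step (1): apply one 2-D filter to each input channel separately.

    x:       nested list shaped [C][T][F] (channel, time, frequency).
    kernels: nested list shaped [C][kt][kf], one filter per channel.
    Assumes stride 1 and no padding, so each map shrinks to
    (T - kt + 1) x (F - kf + 1).
    """
    out = []
    for channel, kernel in zip(x, kernels):
        kt, kf = len(kernel), len(kernel[0])
        t_out = len(channel) - kt + 1
        f_out = len(channel[0]) - kf + 1
        out.append([
            [
                sum(channel[t + i][f + j] * kernel[i][j]
                    for i in range(kt) for j in range(kf))
                for f in range(f_out)
            ]
            for t in range(t_out)
        ])
    return out


def pointwise_conv(x, weights):
    """Step (2): mix channels at every time-frequency position.

    x:       nested list shaped [C][T][F].
    weights: nested list shaped [C_out][C]; each row is one filter that
             spans all input channels at a single position.
    """
    n_t, n_f = len(x[0]), len(x[0][0])
    return [
        [
            [sum(w[c] * x[c][t][f] for c in range(len(x)))
             for f in range(n_f)]
            for t in range(n_t)
        ]
        for w in weights
    ]


def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise convolution followed by pointwise convolution."""
    return pointwise_conv(depthwise_conv2d(x, dw_kernels), pw_weights)


# Toy input: 2 channels, each a 3 x 3 time-frequency patch (made-up values).
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_kernels = [
    [[1, 0], [0, 1]],   # filter for channel 0
    [[1, 1], [1, 1]],   # filter for channel 1
]
pw_weights = [[1, 0], [0, 1], [1, -1]]  # 3 output channels from 2 inputs

y = depthwise_separable_conv(x, dw_kernels, pw_weights)
print(y)  # [[[6, 8], [12, 14]], [[2, 2], [2, 2]], [[4, 6], [10, 12]]]
```

With these toy shapes the layer uses 2·(2·2) + 3·2 = 14 weights, whereas a standard convolution mapping 2 channels to 3 with 2 x 2 filters would use 3·2·(2·2) = 24. That reduction is the usual motivation for splitting the operation into the two steps shown in the figure.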