Fig. 2
From: Whisper-based spoken term detection systems for search on speech ALBAYZIN evaluation challenge
Whisper encoder-decoder transformer structure. Top: Overall structure. Bottom: Detail of the encoder and decoder blocks. MLP, multi-layer perceptron; K, keys; V, values; Q, queries [103]