Figure 3 | EURASIP Journal on Audio, Speech, and Music Processing

Figure 3

From: Improved monaural speech segregation based on computational auditory scene analysis

The energy-labeled mask for a mixture of speech and crowd noise with music. (a) Cochleagram of a female utterance, showing the energy of each time-frequency (T-F) unit; brighter pixels indicate stronger energy. (b) Ideal binary mask, computed from the target and the intrusion before mixing. (c) Cochleagram of the mixture. (d) The mask labeled using the conventional threshold. (e) The mask labeled using the proposed threshold selection method.
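As a rough illustration of panel (b), an ideal binary mask is commonly obtained by comparing the premixed target and intrusion energies in each T-F unit and labeling the unit 1 when the local SNR exceeds a criterion. The sketch below assumes the energies are already available as two equally sized grids; the function name `ideal_binary_mask`, the parameter `lc_db`, and the 0 dB default are illustrative assumptions and are not taken from the article.

```python
import math


def ideal_binary_mask(target_energy, intrusion_energy, lc_db=0.0):
    """Label each T-F unit 1 if its local SNR (in dB) exceeds lc_db, else 0.

    target_energy and intrusion_energy are equally sized 2-D lists of
    non-negative energies (channels x frames), computed before mixing.
    lc_db is a hypothetical local criterion; 0 dB is an assumed default.
    """
    eps = 1e-12  # avoids log of zero and division by zero in silent units
    mask = []
    for target_row, intrusion_row in zip(target_energy, intrusion_energy):
        row = []
        for t, n in zip(target_row, intrusion_row):
            local_snr_db = 10.0 * math.log10((t + eps) / (n + eps))
            row.append(1 if local_snr_db > lc_db else 0)
        mask.append(row)
    return mask


# Toy 2x2 example: the target dominates the top-left and bottom-right units.
target = [[4.0, 1.0], [0.5, 9.0]]
intrusion = [[1.0, 4.0], [2.0, 1.0]]
mask = ideal_binary_mask(target, intrusion)
```

Because the mask requires the premixed signals, it serves as a reference against which estimated masks such as those in panels (d) and (e) can be compared, not as something computable from the mixture alone.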
