Perceptual preprocessing steps for different emotional modes. Prior to feature extraction, we preprocess the utterances to emphasize the perceptual content of the audio. To illustrate the procedure, the q4 (high arousal, first column) and q2 (low arousal, second column) modes from VAM are considered. The first two rows show the time-domain signal and the power spectrum of the files taken from the q4 and q2 categories, respectively. The spectral properties are then shaped by an auditory filter bank that resembles the hearing threshold of the outer ear (third row). The disparity stemming from the emotional characteristics becomes more pronounced after perceptual masking is performed in the Bark domain (fourth row).