EURASIP Journal on Audio, Speech, and Music Processing

Table 4 Performance of the three models under the Modulation Spectrum Smoothing attack (UA/WA/ACC)

From: Black-box adversarial attacks through speech distortion for speech emotion recognition

α_ms	CNN-LSTM (%)	GCN (%)	CNN-MAA (%)
0.30	13.74/13.49/13.62	11.53/11.95/11.74	11.14/11.59/11.37
0.25	14.94/14.99/14.97	12.46/13.55/13.01	12.22/13.32/12.77
0.20	18.44/19.29/18.87	14.93/16.32/15.63	15.74/17.22/16.48
0.15	20.34/20.77/20.56	17.44/18.30/17.87	19.32/20.43/19.88
0.10	23.21/23.89/23.55	19.92/21.49/20.71	20.11/21.34/20.73
0.05	26.56/28.44/27.50	23.93/24.60/24.27	22.94/24.05/23.50

Bold fonts indicate the best attack performance under the current modes
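
The figures in Table 4 can be sanity-checked with a short script. The sketch below is not part of the article; it copies the values into a dictionary and tests two things that appear to hold for these numbers: that ACC matches the arithmetic mean of UA and WA up to rounding (an inference from the data, not a definition given in the table), and that each model's ACC is lowest at the largest α_ms. The names `TABLE4`, `max_mean_gap`, and `alpha_with_lowest_acc` are hypothetical.

```python
# Values copied from Table 4 as (UA, WA, ACC) in percent, keyed by alpha_ms.
TABLE4 = {
    "CNN-LSTM": {
        0.30: (13.74, 13.49, 13.62), 0.25: (14.94, 14.99, 14.97),
        0.20: (18.44, 19.29, 18.87), 0.15: (20.34, 20.77, 20.56),
        0.10: (23.21, 23.89, 23.55), 0.05: (26.56, 28.44, 27.50),
    },
    "GCN": {
        0.30: (11.53, 11.95, 11.74), 0.25: (12.46, 13.55, 13.01),
        0.20: (14.93, 16.32, 15.63), 0.15: (17.44, 18.30, 17.87),
        0.10: (19.92, 21.49, 20.71), 0.05: (23.93, 24.60, 24.27),
    },
    "CNN-MAA": {
        0.30: (11.14, 11.59, 11.37), 0.25: (12.22, 13.32, 12.77),
        0.20: (15.74, 17.22, 16.48), 0.15: (19.32, 20.43, 19.88),
        0.10: (20.11, 21.34, 20.73), 0.05: (22.94, 24.05, 23.50),
    },
}


def max_mean_gap(table):
    """Largest |ACC - (UA + WA) / 2| over all cells.

    Assumption: ACC is the mean of UA and WA; the table does not say so.
    """
    return max(
        abs(acc - (ua + wa) / 2)
        for rows in table.values()
        for ua, wa, acc in rows.values()
    )


def alpha_with_lowest_acc(table, model):
    """Return the alpha_ms at which the given model's ACC is lowest."""
    return min(table[model], key=lambda alpha: table[model][alpha][2])


if __name__ == "__main__":
    print("max gap between ACC and mean(UA, WA):", round(max_mean_gap(TABLE4), 4))
    for model in TABLE4:
        print(model, "lowest ACC at alpha_ms =", alpha_with_lowest_acc(TABLE4, model))
```

If lower accuracy is read as a stronger attack, the second check identifies the α_ms setting at which the attack is most effective for each model.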
