Black-box adversarial attacks through speech distortion for speech emotion recognition

EURASIP Journal on Audio, Speech, and Music Processing

Table 3 Performance of the three models under the McAdams transform attack. Each cell lists UA/WA/ACC in percent.

α_mas	CNN-LSTM (%)	GCN (%)	CNN-MAA (%)
1.25	9.54/9.95/9.75	8.66/8.72/8.69	8.02/8.94/8.48
1.20	10.54/11.32/10.93	9.54/10.55/10.05	9.33/10.06/9.70
1.15	12.75/13.44/13.10	11.83/12.31/12.07	11.45/12.02/11.74
1.10	15.56/17.75/16.66	15.66/17.37/16.52	14.03/14.69/14.36
1.05	18.64/20.68/20.68	19.55/19.73/19.64	18.40/19.83/19.12
1.00	61.77/63.64/62.71	77.44/76.27/76.86	75.34/76.73/76.04
0.95	19.94/21.55/20.75	20.33/20.97/20.65	19.21/19.63/19.42
0.90	16.03/16.94/16.49	18.42/18.93/18.68	15.42/15.88/15.65
0.85	14.88/15.32/15.10	13.03/15.64/14.34	14.03/14.86/14.45
0.80	9.57/10.04/9.81	9.93/11.01/10.47	8.32/9.07/8.70
0.75	8.57/9.04/8.81	8.93/9.77/9.35	7.32/7.88/7.60
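
For orientation, the role of α_mas in Table 3 can be illustrated with a minimal sketch of the pole-warping step at the core of the McAdams transform. It assumes the commonly used formulation, in which the angle φ of each complex pole of a frame's linear-prediction filter is replaced by φ^α while the pole radius is kept; whether the paper uses exactly this variant is an assumption. The function name `mcadams_warp_poles` is hypothetical, and the LPC analysis and resynthesis steps around it are omitted.

```python
import cmath
import math


def mcadams_warp_poles(poles, alpha):
    """Warp LPC pole angles as phi -> phi**alpha (hypothetical helper).

    Assumes the common McAdams-coefficient formulation; the radius of
    every pole is preserved, so filter stability is unchanged.
    """
    warped = []
    for p in poles:
        if abs(p.imag) < 1e-12:
            # Real poles have no formant angle to shift; keep them as-is.
            warped.append(complex(p.real, 0.0))
            continue
        radius, phi = cmath.polar(p)  # phi lies in (-pi, pi]
        # Raise the angle magnitude to the power alpha, clip to pi,
        # and restore the sign so conjugate pairs stay conjugate.
        new_mag = min(abs(phi) ** alpha, math.pi)
        new_phi = math.copysign(new_mag, phi)
        warped.append(cmath.rect(radius, new_phi))
    return warped


# A conjugate pole pair at +/-0.5 rad plus one real pole.
example_poles = [cmath.rect(0.9, 0.5), cmath.rect(0.9, -0.5), complex(0.3, 0.0)]
unchanged = mcadams_warp_poles(example_poles, 1.00)  # alpha = 1 leaves poles as-is
shifted = mcadams_warp_poles(example_poles, 1.25)    # angles below 1 rad shrink
```

Under this formulation α = 1.00 is the identity mapping, which is consistent with the 1.00 row of Table 3 showing far higher scores than any other row.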