Integration of evolutionary computation algorithms and new AUTO-TLBO technique in the speaker clustering stage for speaker diarization of broadcast news

EURASIP Journal on Audio, Speech, and Music Processing

Table 9 Performance results of the TLBO algorithm and the c-bic, c-sid, and p-asr systems on the RT-04F and ESTER datasets. Scores are given for missed speech (MS), false alarms (FA), speaker errors (SPK), and overall diarization error rate (DER). #Ref and #Sys are the reference and system speaker counts, respectively

RT-04F dev1 dataset
System	Method	#Ref	#Sys	MS	FA	SPK	Overall DER
Dev1	c-sid	121	161	0.4	1.3	5.4	7.1
Dev1	TLBO algorithm	121	161	0.383	1.116	5.75	7.249
Show	ABC	27	35	1.4	1.1	12.2	14.7
	VOA	20	22	0.2	1.1	2.1	3.4
	PRI	27	29	0.1	0.8	2.7	3.6
	NBC	21	30	0.1	0.9	11.5	12.5
	CNN	16	19	0.4	1.2	5.4	7.0
	MNB	10	13	0.1	1.6	0.6	2.3
Dev2	c-sid	90	130	0.5	3.1	4.1	7.6
Dev2	TLBO algorithm	90	130	0.516	3.083	4.216	7.725
Show	CSPN	3	4	0.2	2.8	0.1	3.1
	CNN	17	20	0.6	4.1	4.9	9.6
	PBS	27	28	0.1	2.6	7.2	10.0
	ABC	23	26	2.1	6.7	12.1	20.9
	CNNHL	9	15	0.0	1.4	0.3	1.7
	CNBC	11	16	0.1	0.9	0.7	1.7
RT-04F dev2 dataset
c-bic	–	–		0.4	1.8	14.8	17.0
c-sid (δ = 0.1)	–	–		0.4	1.8	6.9	9.1
p-asr	–	–		0.6	1.1	5.2	7.6
TLBO algorithm	–	–		0.6	1.8	7.8	10.2
ESTER development dataset
c-bic	–	–		0.7	1.0	12.1	13.8
c-sid (δ = 1.5)	–	–		0.7	1.0	9.8	11.5
TLBO algorithm	–	–		0.6	1.0	9.7	12.3
Post-evaluation result on ESTER dataset
c-sid (δ = 2.0)	–	–		0.7	1.0	7.4	9.1
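
The caption lists MS, FA, and SPK as the components of the overall DER. As a minimal sketch, assuming the overall DER is the plain sum of those three columns (the usual definition of the metric), the arithmetic can be checked against a few rows of Table 9. The function name `overall_der` and the `rows` dictionary are hypothetical names chosen for illustration; the numbers are copied from the table.

```python
def overall_der(ms: float, fa: float, spk: float) -> float:
    """Sum the three error components; the result is in the same units as the inputs."""
    # Rounding only hides floating-point noise in the sum.
    return round(ms + fa + spk, 3)


# (MS, FA, SPK, reported overall DER), values copied from Table 9.
rows = {
    "c-sid, RT-04F Dev1": (0.4, 1.3, 5.4, 7.1),
    "c-bic, RT-04F dev2": (0.4, 1.8, 14.8, 17.0),
    "c-bic, ESTER development": (0.7, 1.0, 12.1, 13.8),
}

for name, (ms, fa, spk, reported) in rows.items():
    computed = overall_der(ms, fa, spk)
    print(f"{name}: computed {computed}, reported {reported}")
```

Because the table entries are rounded to one decimal place, a recomputed total can differ from the reported one in the last digit.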