Singer identification model using data augmentation and enhanced feature conversion with hybrid feature vector and machine learning

Hizlisoy, Serhat; Arslan, Recep Sinan; Çolakoğlu, Emel

doi:10.1186/s13636-024-00336-8

EURASIP Journal on Audio, Speech, and Music Processing

Table 3 MFCC + spectral contrast classification results

Article	Dataset	Features	Classifiers	Rate (%)
Liu and Huang (2002) [46]	Ten different singers, 30 different songs for each singer (Chinese)	FMCV, PMCV	KNN	80.0
Tsai and Wang (2006) [47]	Twenty-three different singers, 10 different songs for each singer	MFCC	GMM	87.8
Dharini and Revathy (2014) [26]	Ten different singers, 20 different soundtracks for each singer (Indian, Bengali)	PLP	K-means	55.56
Eghbal-Zadeh et al. (2015) [18]	Artist20 (20 singers, 1413 songs)	MFCCs	KNN	84.31
Xing (2017) [48]	Ten different singers, 10 different songs for each singer	LPC	GMM	81.8
Shen et al. (2019) [1]	MIR-1K dataset	MFCCs	LSTM	88.4
Loni and Subbaraman (2019) [17]	Twenty-six different singers, 550 different songs (Indian)	Formants, vibrato, timbre, and harmonic spectral envelope	SVM	86
Murthy et al. (2021) [8]	Indian popular singers’ database (IPSD), Artist20	MFCCs, LPCCs, SDCs, chroma, spectrogram	YSA-RF-CNN	61.69–75.50
Noyum et al. (2021) [49]	Four different singers, 50 different songs for each singer	DWT	Linear SVM	83.96
Costa et al. (2017) [29]	Latin Music Database, ISMIR 2004, and African music collection dataset	Spectrogram, RLBP, rhythm patterns (RP), statistical spectrum descriptors (SSD), and rhythm histograms (RH)	CNN, SVM	92
Li et al. (2021) [30]	Artist20, singer32 vs singer60	Spectrogram	CRNN	99.0–85.0
Nasrullah et al. (2019) [12]	Artist20	Spectrogram	CRNN	93.7 (F1)
Sharma et al. (2019) [31]	Artist20	MFCC	UBM T-matrix	89.97
Zhang et al. (2022) [32]	Artist20	Mel-spectrogram, articulation, rhythmic complexity, rhythmic stability, dissonance, tonal stability, modality, x-vector	CRNN	81 (F1)
Proposed model	Nine different singers, 20 different songs for each singer	MFCC, octave-based spectral contrast	Extra Tree	89.4
Proposed model	Artist20	MFCC, octave-based spectral contrast	KNN	85.4
