A review of infant cry analysis and classification

EURASIP Journal on Audio, Speech, and Music Processing

Table 4 Significant works on infant cry detection

Literature	Dataset	Features	Classifiers	Performance
Chang [48] (2019)	Self-recorded (crying with TV, speech, etc.)	Spectrogram	CNN	99.83%
Manikanta [25] (2019)	Recorded in homes (crying with AC, fan, etc.)	MFCC	1D-CNN, FFNN, SVM	86%
Dewi [64] (2019)	Self-recorded (cry and not cry)	LFCC	KNN	90%
Gu [16] (2018)	Self-recorded (crying with laughter, barking, etc.)	LPC	Dynamic time warping algorithm	97.1%
Ferretti [18] (2018)	Real dataset (recorded in the NICU of a hospital); synthetic dataset (crying with speech, “beep” sounds, etc.)	Log-mel coefficients	CNN	86.58% on real dataset, 92.92% on synthetic dataset
Feier [87] (2017)	TUT Rare Sound Events 2017 (crying with “glass breaking”, “gunshot”, etc.)	Log-amplitude mel-spectrogram	CRNN	85% for baby crying detection, 87% for all three targets
Torres [27] (2017)	Online resources (crying with adult cry, vacuum cleaning, etc.)	Voiced unvoiced counter, Consecutive F0 and harmonic ratio accumulation, MFCC	Support Vector Data Description (SVDD), CNN	AUC 92%
Lavner [17] (2016)	Recorded in domestic environment (crying with speech, door opening, etc.)	MFCC, Pitch, Formants, etc.	CNN	95%