AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks

EURASIP Journal on Audio, Speech, and Music Processing

Table 5 Evaluation measures for music segmentation with respect to different boundary types. F-measure (F), precision (P), and recall (R) are reported for all boundaries (A) and for boundaries that coincide with an instrument (I), key (K), or tempo (T) change. Abbreviations for the feature groups: MelS: Mel spectrum; SemS: semitone spectrum; Fluct: fluctuation patterns. Best values are marked in bold

Measure	MFCCs	MelS	SemS	Fluct	CNN
\(F_{\textrm{A}}\)	0.646	0.577	0.615	0.470	0.914
\(P_{\textrm{A}}\)	0.568	0.473	0.506	0.397	0.950
\(R_{\textrm{A}}\)	0.810	0.803	0.850	0.615	0.899
\(F_{\textrm{I}}\)	0.600	0.531	0.571	0.418	0.925
\(P_{\textrm{I}}\)	0.484	0.401	0.432	0.322	0.947
\(R_{\textrm{I}}\)	0.871	0.870	0.926	0.644	0.921
\(F_{\textrm{K}}\)	0.520	0.469	0.510	0.374	0.822
\(P_{\textrm{K}}\)	0.399	0.338	0.370	0.274	0.798
\(R_{\textrm{K}}\)	0.863	0.887	0.951	0.666	0.901
\(F_{\textrm{T}}\)	0.387	0.357	0.377	0.299	0.846
\(P_{\textrm{T}}\)	0.272	0.237	0.251	0.201	0.895
\(R_{\textrm{T}}\)	0.840	0.912	0.947	0.718	0.857
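The boundary-based precision, recall, and F-measure in Table 5 can be illustrated with a minimal sketch for a single track. It is not the authors' evaluation code: the function name `boundary_prf`, the 0.5 s tolerance window, and the example boundary times are assumptions made for illustration, since the caption does not state how estimated and reference boundaries are matched. The sketch covers only the all-boundaries case (A); restricting the evaluation to instrument, key, or tempo changes is not shown.

```python
def boundary_prf(reference, estimated, tolerance=0.5):
    """Precision, recall, and F-measure for boundary detection on one track.

    An estimated boundary counts as a hit if it lies within `tolerance`
    seconds of a reference boundary; each boundary is matched at most once.
    The tolerance value is an assumption, not taken from the paper.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = hits = 0
    # Two-pointer sweep over both sorted lists: match the earliest
    # unmatched pair whenever it falls inside the tolerance window.
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1  # estimated boundary too early to match anything later
        else:
            i += 1  # reference boundary was missed
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical boundary times in seconds (not from the AAM dataset).
reference = [10.0, 20.0, 30.0, 40.0]
estimated = [10.2, 19.0, 25.0, 30.4, 41.0, 50.0]

p, r, f = boundary_prf(reference, estimated)
print(f"{p:.3f} {r:.3f} {f:.3f}")  # prints "0.333 0.500 0.400"
```

Note that the F values in Table 5 are not the harmonic mean of the tabulated P and R: for MFCCs on all boundaries, 2 · 0.568 · 0.810 / (0.568 + 0.810) ≈ 0.668, whereas the table lists 0.646. This would be consistent with computing the three measures per track and then averaging them over tracks, although the caption does not say so.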