 Research
 Open Access
 Published:
Clustering algorithm for audio signals based on the sequential Psim matrix and Tabu Search
EURASIP Journal on Audio, Speech, and Music Processing volume 2017, Article number: 26 (2017)
Abstract
Audio signals are a type of highdimensional data, and their clustering is critical. However, distance calculation failures, inefficient index trees, and cluster overlaps, derived from the equidistance, redundant attribute, and sparsity, respectively, seriously affect the clustering performance. To solve these problems, an audiosignal clustering algorithm based on the sequential Psim matrix and Tabu Search is proposed. First, the audio signal similarity is calculated with the Psim function, which avoids the equidistance. The data is then organized using a sequential Psim matrix, which improves the indexing performance. The initial clusters are then generated with differential truncation and refined using the Tabu Search, which eliminates cluster overlap. Finally, the KMedoids algorithm is used to refine the cluster. This algorithm is compared to the KMedoids and spectral clustering algorithms using UCI waveform datasets. The experimental results indicate that the proposed algorithm can obtain better MacroF1 and MicroF1 values with fewer iterations.
Introduction
Audio signal clustering forms the basis for speech recognition, audio synthesis, audio retrieval, etc. Audio signals are considered as highdimensional data, with dimensionalities of more than 20 [1]. Their clustering is undertaken based on this consideration and solving the problems in highdimensional data clustering, in this regard, is highly beneficial.
There are three types of clustering algorithms for highdimensional data: attribute reduction [2], subspace clustering [3], and coclustering [4]. The first method reduces the data dimensionality with attribute conversion or reduction and then, performs clustering. The effect of this method is heavily dependent on the degree of dimension reduction; if it is considerable, useful information may be lost, and if it is less, clustering cannot be done effectively. The second method divides the original space into several different subspaces and searches for cluster in the subspace. When the dimensionality is high and the accuracy requirement rigorous, the number of subspaces rapidly increases. Thus, searching for a cluster in a subspace becomes a bottleneck and may lead to failure [5]. The third method implements clustering iteratively with respect to the content and feature alternately. The clustering result is adjusted as per the semantic relationship between the theme and characteristic, realizing a balance between data and attribute clustering. This method has two stages, resulting in a high time complexity. In addition to the above three methods, clustering algorithms for highdimensional data includes hierarchical clustering [6], parallel clustering [7], knowledgedriven clustering [8], etc. However, these methods also have similar problems.
Equidistance, the redundant attribute, and sparsity are the fundamental factors affecting the clustering performance of highdimensional data [9]. Equidistance renders the distance between any two points in a highdimensional space approximately equal, leading to a failure in the clustering algorithm, based on the distance. The redundant attribute increases the dimensionality of the highdimensional data and the complexity of the index structure, decreasing the efficiency of building and retrieving the index structure. Sparsity enables uniform data distribution, and some clusters may overlap with each other, affecting the clustering precision.
It is reported that some dimensional components of the highdimensional data are nonrelated noise that hide the real distance, resulting in equidistance. The Psim function can find and eliminate noise in all the dimensions [10]. The Tabu Search [11] is a heuristic global search algorithm. All the possible existent clusters are combinatorically optimized using the Tabu Search such that a nonoverlap cluster is selected.
To solve the clustering problems owing to equidistance, the redundant attribute, and sparsity, an efficient audio signal clustering algorithm is proposed, by integration with the Psim matrix and Tabu Search. First, for all the points in the highdimensional space, the Psim values between them and the corresponding location numbers are stored in a Psim matrix. Next, a sequential Psim matrix is generated by sorting the elements in each row of the Psim matrix. Further, the initial clusters are generated with differential truncation and refined with the Tabu Search. Finally, the initial clusters are iteratively refined with the KMedoids algorithm, until all the cluster medoids are stable.
Related works
Psim matrix
Traditional similarity measurement methods (e.g., the Euclidean distance, Jaccard coefficient [12], and Pearson coefficient [12]) fail in highdimensional space because in these methods, equidistance is a common phenomenon in highdimensional space; hence, the calculated distance is not the real distance. To solve this problem, the Hsim function [13] was proposed; however, the relative difference and noise distribution were not considered. The Gsim function [14] was also proposed and the relative differences of the properties in different dimensions were analyzed, but the weight discrepancy was ignored. The proposed Close function [15] can reduce the influence of components in certain dimensions, whose variances are larger; however, the relative difference was not considered and it would be affected by noise. The Esim [16] function was proposed by improving the Hsim and Close functions and considering the influence of the property on the similarity. In every dimension, the Esim component has a positive correlation. All the dimensions are divided into normal and noisy. In a noisy dimension, noise is the main ingredient. When it is similar and larger than the signal, in a normal dimension, Esim is invalid. The secondary measurement method [17] is used to calculate the similarity by considering the property distribution, space distance, etc. However, the noise distribution and weight are not taken into account. In addition, its formula is complicated and the calculation is slow. In highdimensional space, a large difference exists in certain dimensionalities [10], even though the data is similar. This difference occupies a large portion of the similarity calculation; hence, all the calculation results are similar. Therefore, the Psim function [10] was proposed to diminish the influence of noise on the similarity data; experimental results indicate that this method is suitable.
When using the Psim function to measure the similarity, the data component in every dimension must be sorted and the value range divided into several intervals. The similarity between X and Y in the jth dimension is added to the Psim function, if and only if, their data components are in the same interval.
In an ndimensional space, the Psim value between X and Y is as follow:
where X _{ j } and Y _{ j } are the data components of X and Y in the jth dimension. Ds(X, Y) is a subscript set of X _{ j } and Y _{ j }, which are in the same interval [u _{ j }, l _{ j }], and Ds(X, Y) is the number of elements in Ds(X, Y). The above is the outline of the Psim function; a detailed introduction can be found in [10].
Data organization is critical in a clustering algorithm. In the traditional method, the data space is separated using an index tree and mapped onto the indextree nodes. The commonly used index trees are the R tree [18], cR tree [19], VP tree [20], M tree [21], SA tree [22], etc. The partitioning of the data space is the foundation for building an index tree, but its complexity increases with the increase in dimensionality. Thus, it is difficult to build index trees for highdimensional data. In addition, the retrieval efficiency of the index tree falls sharply with the increase in dimensionality. The retrieval function works effectively, when the dimensionality is less than 16; however, it weakens rapidly, for dimensionalities greater than 16, even down to the level of a linear search [23]. A sequential Psim matrix is used to solve this problem. First, all the Psim values between points, S _{ 1 }, S _{ 2 }, ⋯, S _{ M }, are calculated to build a Psim matrix, PsimMat, with a size, M × M. PsimMat(i, t) is composed of three properties: i, t, and Psim(S _{ i }, S _{ t }). Next, the sequential Psim matrix, SortPsimMat, is generated by sorting the elements in every row of PsimMat in the descending order of the Psim value. The elements in the ith row represent the similarities between S _{ i } and the other points. From left to right, the Psim values gradually diminish, indicating a decrease in the similarity. It can be seen that the sequential Psim matrix is not affected by the dimensionality and can represent the similarity distribution of all the points. Therefore, it is suitable for highdimensional data clustering.
Differential truncation
The elements in every row of SortPsimMat are regarded as a sequence, A, whose length is M. The sequential Psim differential matrix, DeltaPsimMat, is generated with a differential operation on the sequence, A. The size of DeltaPsimMat is M × (M − 1). The elements in the ith row of SortPsimMat represent the similarities between S _{ i } and the other points. Several points corresponding to the frontier elements in this row, from the left, would form a cluster centered at S _{ i } because the similarity between the elements inside the cluster is higher than that of those outside. Thus, the similarity differences between the elements inside the cluster are lesser than that of the others. Assuming that the cluster centered at S _{ i } has p _{ i } elements, the left p _{ i } − 1 elements in the ith row of DeltaPsimMat are lesser than the differential threshold, ΔA _{max}. Thus, a reasonable ΔA _{max} is set up and all the elements that are less than ΔA _{max} in the ith row of DeltaPsimMat are found to form a cluster centered at S _{ i }.
Tabu Search
After differential truncation, the intersection of some of the clusters may not be null. Thus, the overlapping elements should be eliminated by refinement. The clusters that are to be refined are called the imminentrefining cluster sets, and the initial values are the clusters after differential truncation. The clusters that have been refined are called the refined cluster sets and their initial values are null. The refinement of the cluster is an iterative process. Considering the average Psim of the remainder elements in the ith row of SortPsimMat, after differential truncation, as the similarity of a cluster centered at S _{ i }, the operation in every iteration is as follows: First, the similarity of every cluster is calculated. Next, the cluster with the highest similarity is added into the refined cluster set and the element in the other cluster is deleted, if it is in the selected cluster. However, there is a problem. After deleting the overlapping elements, the similarity of the cluster in the imminentrefining cluster set may be greater than that of the selected cluster. To solve this problem, Tabu Search is used for refinement.
Tabu Search is an expansion of the neighborhood search, a global optimum algorithm [11], and is mainly used for combinatorial optimization. A roundabout search can be avoided using the Tabu rule and aspiration criterion, for improving the global search efficiency. This method can accept an inferior solution and has a strong “climbing” ability; it has a higher probability of obtaining a global optimal solution.
The main process of Tabu Search is as follows: Initially, a random initial solution is regarded as the current solution, and several neighboring solutions are considered as the candidate solutions. Further, if the objective function value of a certain candidate solution meets the aspiration criterion, the current solution is replaced by this candidate solution and added to the Tabu list. Else, the best choice of a nonTabu object is considered as the new current solution. In addition, the corresponding solution must be added into the Tabu list [24]. The above steps are repeated, until the terminate criterion is satisfied.
In order to use Tabu Search for refining the cluster, an appropriate Tabu object, Tabu list, aspiration criterion, and terminate criterion are required. The Tabu objects are the elements in the refined cluster set and are saved into the Tabu list to prevent the Tabu Search from falling into the local optimum. The Tabu length is set as the number of clusters after differential truncation. In every iteration process, the selected cluster is considered as the Tabu object. After eliminating the overlapping elements, the cluster, whose similarity is higher than that of the previously selected cluster, is considered as the better cluster and it replaces the previously selected cluster. The previously selected cluster is removed from the Tabu list and added into the imminentrefining cluster set. The above “eliminating the overlapping elements—searching for a better cluster” process is repeated, until a better cluster can no longer be found. Then, the previously selected cluster is considered as the optimal cluster of this iteration. The search for the better cluster of the next iteration then commences, until the imminentrefining cluster set is null.
Clustering algorithm
Problem description
The dataset of M audio signals with a length, n, is considered as the point set, S = {S _{1}, S _{2}, ⋯, S _{ M }}, of ndimensional space, where S _{ i } = {S _{ i1}, ⋯, S _{ ij }, ⋯, S _{ in }}, i = 1, 2, ⋯, M, j = 1, 2, ⋯, n, and S _{ ij } are the jth property of S _{ i }.
The goal is to search for sets, C _{1}, C _{2}, ⋯, C _{ K }, that meet the following two requirements:

1.
C _{1} ∪ C _{2} ∪ ⋯ ∪ C _{ K } = S

2.
C _{ v } ∩ C _{ w } = φ, for any 1 ≤ v ≠ w ≤ K.
Framework of the clustering algorithm
The proposed clustering algorithm has four steps, as shown in Fig. 1. First, the sequential Psim matrix is built to represent the similarity between the points in the set, S. Next, the initial cluster is generated by integration with the differential truncation and heuristic search. Further, the initial cluster is refined with Tabu Search. Finally, the expected cluster is generated by clustering, based on the KMedoids.
Clustering algorithm procedure
Construction of a sequential Psim matrix
The Psim values between all the points in the set, S, are calculated using Eq. 1 and saved into the Psim matrix, PsimMat. Then, the elements in every row of PsimMat are sorted with quicksort to obtain the sequential Psim matrix, SortPsimMat. The above is a brief introduction; the detailed procedure can be found in [25].
Initial cluster generation
First, the Laplacian matrix, L, is generated by PsimMat, and its eigenvalue distribution is used to determine the number of expected clusters [26]. Next, the differential threshold, ΔA _{max}, is initialized. Let C _{max} be the maximal time for searching the cluster set. The upper bound of C _{max} is the combinatorial number, \( {C}_M^K \), where K is the number of clusters. Searching \( {C}_M^K \) times is timeconsuming because the magnitude of \( {C}_M^K \) is M!. In addition, the K expected clusters maybe not found by searching \( {C}_M^K \) times. Thus, C _{max} is set to C _{max} = M and a heuristic search is implemented. Finally, the collision list of the clusters, TBL_{ C }, is set to null and i = 1. Then, the initial cluster can be generated with the following steps.

Step 1: The elements in the ith row of DeltaPsimMat are visited from left to right, until the p _{ i }th element is greater than the differential threshold, ΔA _{max}, for the first time.

Step 2: The points corresponding to the left p _{ i } ‐ 1 elements in the ith row of SortPsimMat are used to construct a cluster, \( {C}_T^i \), centered at S _{ i }.

Step 3: If i < M, then i = i + 1; go to Step 1; else, c = 1.

Step 4: K clusters, \( {C}_T^{i1},{C}_T^{i1},\cdots, {C}_T^{iK} \), are selected from M clusters, \( {C}_T^1,{C}_T^2,\cdots, {C}_T^M \), to ensure that the set composed of K centers of the selected cluster are not in the Tabu list,TBL_{ C }.

Step 5: If the union of K clusters is equal to the set, S, then the set, \( {C}_i=\left\{{C}_i^0,{C}_i^1,\cdots, {C}_i^K\right\} \), is considered as the initial cluster set, where \( {C}_i^v={C}_T^{iv} \). Otherwise, the set C _{ I } is added into the TBL_{ C }; go to Step 6.

Step 6: If c ≥ C _{max}, then i = 1, increase ΔA _{max}, clear TBL_{ C }, and go to Step 1. Otherwise, c = c + 1; go to Step 4.
Refinement of the initial cluster
The initial cluster set, C _{ I }, is considered as the imminentrefining cluster set, C _{Refining}. The refined cluster set, C _{Refined} and the Tabu list are both null. The maximal search time is F _{max} and a heuristic algorithm is used for the search. F _{max} is set to F _{max} = K because F _{max} is proportional to the size of C _{ I }. The refinement procedure for C _{ I } is as follows.

Step 1: The number of iterations, r = 0, and the number of searches in the current iteration, f = 0.

Step 2: The similarity of every cluster in C _{Refining} is calculated, and the cluster with the highest similarity is considered as the better cluster, C _{Optimal}, and moved into the refined cluster set, C _{Refined}. In addition, the selected cluster and similarity are added toTBL_{ F }.

Step 3: The element in the reminder cluster, C _{Refining}, is deleted, if it is in the cluster, C _{Optimal}. Then, the similarity of every cluster in C _{Refining} is calculated, and the cluster with the highest similarity is expressed as C _{MaxPsim}.

Step 4: If the average similarity of every cluster in C _{MaxPsim} is not more than those in C _{Optimal} or f ≥ F _{max}, then go to Step 5. Otherwise, f = f + 1; go to Step 6.

Step 5: If r ≥ K, then the refinement of the initial cluster is terminated. Otherwise, r = r + 1, f = 0; go to Step 2.

Step 6: Cluster C _{Optimal} is moved back to C _{Refining} from C _{Refined} and the corresponding information in the TBL_{ F } is deleted.

Step 7: Cluster C _{MaxPsim} is considered as the better cluster, C _{Optimal}, and moved into C _{Refined}. In addition, the corresponding information is added into theTBL_{ F }.

Step 8: Go to Step 2.
Clustering based on iterative partitioning
The cluster, after Tabu Search, has the basic cluster characteristics. For further improvement, clustering based on KMedoids [27] is implemented.
Convergence analysis
The proposed clustering algorithm has four steps, and the corresponding convergence analysis is as follows:

1.
Construction of a sequential Psim matrix
The Psim matrix, PsimMat, is generated by running Eq. 1 M × M times; the sequential Psim matrix, SortPsimMat, is generated by sorting the elements in every row of PsimMat. The above operation can be completed within a limited time; hence, this step converges.

2.
Generating the initial cluster
First, the number of expected clusters can be determined by spectral clustering in a limited time. Next, with the increase in the differential threshold, ΔA _{max}, the number of elements in every cluster increases. Thus, the union, \( {C}_T^{i 1}\cup {C}_T^{i 1}\cup L\cup {C}_T^{iK} \), gets closer to the set, S, gradually. Thus, this step converges.

3.
Refinement of the initial cluster
This step iterates K times. In every iteration procedure, the calculation of the K average similarities of the cluster is carried out K times, at most. The above operation can be completed in limited time; thus, this step converges.

4.
Clustering based on iterative partitioning
This step is based on the KMedoids clustering algorithm, which converges naturally. Thus, this step also converges. The above statements show that the four steps can be completed in limited time. Thus, the proposed clustering algorithm converges.
Time complexity analysis
Considering multiplication and comparison as the two basic operations, the corresponding complexity analysis is as follows:

1.
Construction of a sequential Psim matrix
The complexity of this step is reported to be O(M ^{2} · n) [25].

2.
Generating the initial cluster
The size of the Laplacian matrix, L, in Section 3.3.2 is M × M. Its top K eigenvalues are calculated with the power iteration method [28], and the time complexity is O(K · M ^{2}). The optimal number range of the clusters, K _{opt}, is \( 1\le {K}_{\mathrm{opt}}\le \sqrt{M} \). Hence, the time complexity for the calculation of the eigenvalues in the Laplacian matrix, L, is O(M ^{2.5}). Further, the initial cluster is generated by iterating several times. The analysis in every iteration process is as follows: First, the differential threshold, ΔA _{max}, is increased and the elements in every row of DeltaPsimMat are visited. The time complexity is O(M ^{2}), accordingly. Then, K clusters are selected and tested whether their union is equal to the set, S. The maximal comparison time for calculating the union of two clusters is M ^{2} n because the maximal number of elements in a cluster is M and the dimension of an element is n. Thus, the maximal comparison time for the union of K clusters is (K ‐ 1)M ^{2} n. In addition, the selection operation of the K clusters are performed C _{max} = M times, at most. Therefore, the time complexity of the selection process is O((K − 1)M ^{2} n · C _{max}) = O((K − 1)M ^{2} n · M) = O(KM ^{3} n). The optimal number of clusters, K _{opt}, is less than \( \sqrt{M} \) [29]; i.e., O(KM ^{3} n) = O(M ^{3.5} n). Let the maximal iterations be H. Hence, the time complexity for generating the initial cluster is O(H · M ^{3.5} n).

3.
Refinement of the initial cluster
In this step, the basic operation is the calculation of the K similarities of the cluster. The maximal number of elements in every cluster is M. Thus, the number of addition operations is K · M. This basic operation is carried out K ^{2} times, at most. Therefore, the total number of addition operations is K ^{3} · M; i.e., the time complexity in this step is O(K ^{3} M). The upper bound of the optimal number of clusters is \( {K}_{\mathrm{opt}}=\sqrt{M} \) [29]. Thus, the time complexity can be expressed as O(M ^{2.5}).

4.
Clustering based on iterative partitioning
This step should be iteratively carried out Q times. In every iteration process, there are three basic operations: the construction of K clusters, the computation of the K medoids of the clusters, and the calculation of the objective functions E _{ q } and \( {E}_q^{\ast } \). During these three basic operations, the Psim value is calculated as K · M, M and M times, respectively. Thus, the total number of Psim calculations in this step is Q · (KM + M + M) = Q(K + 2)M. From Eq. 1, it can be seen that the time complexity of the Psim calculation is O(n). Therefore, the time complexity of this step is O(Q(K + 2)M · n) = O(QKMn), which is briefly expressed as O(QM^{1.5} n) by virtue of the property [29] of the optimal number of clusters, \( 1\le {K}_{\mathrm{opt}}\le \sqrt{M} \).
To sum up the above statements, the time complexity of the proposed clustering algorithm is O(M ^{2} ⋅ n) + O(M ^{2} ⋅ n) + O(M ⋅ n log n) + O(M ^{2.5}) + O(H ⋅ M ^{3.5} n) + O(M ^{2.5}) + O(QM^{1.5} n) = O(M ^{2} ⋅ n) + O(M ⋅ n log n) + O(M ^{2.5}) + O(H ⋅ M ^{3.5} n) + O(QM^{1.5} n). Generally, the difference in the magnitudes of M and n is negligible, i.e., M > > log n. Thus, O(M ⋅ n log n) and O(M ^{2.5}) can be ignored, relative to O(M ^{2} ⋅ n). The magnitudes of H and Q are the same because they are both iterations. Thus, O(QM^{1.5} n) can be ignored relative to O(H ⋅ M ^{3.5} n). Therefore, the time complexity can be briefly expressed as O(M ^{2} ⋅ n) + O(H ⋅ M ^{3.5} n) = O(H ⋅ M ^{3.5} n).
As can be seen from the above analysis, this algorithm is a polynomial time algorithm, which can be carried out in a normal machine and condition.
Experiment
Overview
In the following experiment, the hardware includes an AMD Athlon(tm) II X2250 processor and a Kingston 4G memory; the software used includes the Win7 operating system and Microsoft Visual Studio 2012. The audio signal data [30] is downloaded from the UCI database. This dataset is composed of 5000 audio vectors with lengths of 41, and each one is produced by mixing a normal waveform with noise; there are three categories.
The number of clusters is determined using the spectral clustering algorithm. Then, the test data is clustered ten times with the proposed clustering algorithm, based on the Psim matrix and Tabu Search (PMTS clustering algorithm), the KMedoids clustering algorithm [29], and the spectral clustering algorithm [26]. In the process of each clustering, the iterations, MacroF1 and MicroF1 [31], are calculated. In addition, their average in ten clustering processes is required. Finally, these algorithms are compared based on the above results.
Selection criteria for the compared algorithm
In our experiment, there are three criteria for selecting the compared clustering algorithm: the selected algorithm must be widely recognized by academia and industry, it should be suitable for highdimensional data clustering and converge stably, and must be relevant to our algorithm.
Based on the above criteria, the KMedoids and the spectral clustering algorithms were selected. Both are widely used by academia and industry, can converge stably, and are strongly related to our algorithm. In addition, they are also related to each other. The detailed analysis is as follows:

1.
Correlation analysis of the KMedoids clustering algorithm
The KMedoids clustering algorithm is one of the few clustering algorithms that is theoretically convergent. At the beginning of this algorithm, the initial cluster is randomly selected, and subsequently, iterative clustering is done by the center of the nearest point close to the center of the cluster. Theoretically, iterative clustering is the key to convergence and not the initial cluster. The focus of our study is to propose a clustering algorithm suitable for highdimensional data; convergence is a prerequisite to be satisfied. Therefore, the iterative clustering strategy of the KMedoids clustering algorithm is adopted for convergence. In addition, the randomly selected initial cluster of the KMedoids clustering is replaced by a refined nonoverlapping cluster. Thus, our algorithm is derived from the KMedoids clustering algorithm.

2.
Correlation analysis of the spectral clustering algorithm
First, the spectral clustering algorithm procedure is similar to that of the proposed and KMedoids clustering algorithms. The clustering function can be completed only by the adjacency matrix that stores the similarity of the points and not by the vector that records the point coordinates, as in the KMeans clustering algorithm. Next, the spectral clustering algorithm is based on graph theory. The data points and their similarities are represented as the vertex and weight of the edge, respectively. The eigenvector of the adjacency matrix is extracted from its Laplace matrix and is subsequently used for clustering. Because the number of eigenvectors is considerably lesser than the dimension of the points, it can be regarded as a dimensionality reduction clustering algorithm. Our algorithm is also of the same type because the sparse and noisy dimension components do not participate in the computation. Hence, both the algorithms are similar with respect to the reduction in dimensionality. Finally, the number of clusters used in the proposed and KMedoids clustering algorithms is calculated with the eigenvalue decomposition of the Laplace matrix in the spectral clustering algorithm, i.e., some of the results of the spectral clustering algorithm are useful for the proposed and KMedoids clustering algorithms; thus, these algorithms are strongly related to the spectral clustering algorithm.
Tabu Search analysis
Tabu Search, the core of our algorithm, can solve the overlap problem of the initial cluster to improve its quality. Ten Sequential Psim Matrix are generated, and the running time is shown in Fig. 2. After that, the corresponding initial clusters are generated in accordance with the method described in Section 3.3.2 and refined with the method in Section 3.3.3 to produce ten sets of refined clusters. The searching times for producing the initial clusters are shown in Fig. 3. The overlap, recall, precision, and running time of the initial and refined clusters are calculated, respectively, as shown in Figs. 4, 5, 6 and 7.
It can be seen that 35–45% of the elements, which were overlapping in the initial cluster, are eliminated by Tabu Search. The recall and precision of the initial cluster were approximately 39 and 34%, respectively. After Tabu Search, the recall reduced to approximately 33%, but the precision increased to approximately 39%. This is owing to the elimination of certain correct classified elements, while deleting the overlapping elements in the cluster, leading to a reduction in the recall. However, the number of error classified elements deleted by Tabu Search is more. Therefore, the proportion of correct classified elements in the cluster increases. The searching times for producing the initial clusters is the random number from 1 to 5000, because the maximal searching times C _{max} is M = 5000. But in most cases, the expected initial clusters can be found within 1500 times, and the corresponding running time is less than 9 s. The upper bound of running time for refining cluster is total time to construct all the permutations of cluster. In our experiment, the operation time is less than 0.5 s due to the limited number of clusters (only three clusters).
The experimental results are averaged and presented in Table 1; they illustrate the role of Tabu Search in the refinement of the initial cluster. After Tabu search, the precision increased from 33.9 to 38.4%, although the recall reduced to 33.4%, satisfying the equilibrium distribution of the precision and recall. In addition, several overlapping elements (approximately 40%) in the initial cluster are completely deleted with Tabu Search. The average running time for producing the initial clusters is 7.36 s, and the one in refinement is 0.49 s. It can be seen that the time load from cluster optimization is acceptable. Therefore, it can improve the quality of the cluster to a certain extent.
Stability analysis
First, the method for determining the number of clusters, in Section 3.3.2, is applied to the test data. The result is three, which corresponds to the existing data. Then, this dataset is clustered ten times with the PMTS clustering algorithm, KMedoids clustering algorithm [29], and the spectral clustering algorithm [26]. The corresponding results are depicted in Figs. 8, 9, 10 and 11.
It can be seen that the iterations for the PMTS clustering algorithm are lesser than those of the KMedoids and spectral clustering algorithms, indicating that our proposed method can obtain a more precise initial cluster and converges faster. In addition, the whole running speed of PMTS clustering algorithm is faster than the one of KMedoids and spectral clustering algorithms. In most cases, the clustering accuracy (MacroF1 and MicroF1) and stability (variations in MacroF1 and MicroF1) are both of the order, PMTS clustering algorithm > spectral clustering algorithm > KMedoids clustering algorithm. The above results demonstrate the advantages of the PMTS clustering algorithm in terms of the speed, accuracy, and stability. In some cases, the clustering accuracies of the KMedoids and spectral clustering algorithms are less than 50%, indicating a clustering failure. However, the PMTS clustering algorithm did not have similar issues, exhibiting its validity.
Wholeperformance analysis
The experimental results are averaged and presented in Table 2; they illustrate the better performance of the PMTS clustering algorithm compared to the KMedoids and spectral clustering algorithms. On the one hand, its iterations are lesser and the convergence is fast as well as the whole running speed. On the other hand, it has no failure cases and MacroF1/MicroF1 increase more than 13 and 10%, respectively. To sum up the above analysis, the failure in the distance calculation, the inefficient index tree, and the cluster overlap, derived from the characteristics of the highdimensional data, can be corrected using the PMTS clustering algorithm.
Conclusions
Audio signal clustering is critical in media computing. The key to improving its performance is the solving of the problems that exist in highdimensional data clustering, such as failures in the distance calculation, inefficient index trees, cluster overlaps, etc. To address these problems, a clustering algorithm integrated with the sequential Psim matrix, differential truncation, and Tabu Search is proposed. Compared to the other clustering algorithms, its characteristics are as follows: In highdimensional space, the sequential Psim matrix is used to calculate the distance and organize data. Differential truncation and Tabu Search are used to obtain the initial cluster with a high accuracy. Experimental results indicate that the performance of this algorithm is better than that of the KMedoids and spectral clustering algorithms. Several heuristic methods used in this algorithm have a potential for improvement. Thus, our future work includes the determination of more effective initial parameters, evaluation functions, and convergence criteria, for improving the accuracy of the results.
References
 1.
Ericson, K, & Pallickara, S. (2013). On the performance of high dimensional data clustering and classification algorithms. Futur. Gener. Comput. Syst., 29(4), 1024–1034.
 2.
Ravale, U, Marathe, N, Padiya, P. (2013). Attribute reduction based hybrid anomaly intrusion detection using Kmeans and SVM classifier. Int. J. Comput. Appl., 82(15), 32–35.
 3.
Gan, GJ, & Ng, MK. (2015). Subspace clustering using affinity propagation. Pattern Recogn., 48(4), 1455–1464.
 4.
Govaert, G, & Nadif, M (2014). Coclustering: models, algorithms and applications. London: WileyISTE.
 5.
Kriegel, HP, Kröger, P, Zimek, A. (2009). Clustering highdimensional data: a survey on subspace clustering, patternbased clustering, and correlation clustering. ACM Trans. Knowl. Discov. Data, 3(1), 1–58.
 6.
Rashedi, E, & Mirzaei, A. (2013). A hierarchical clusterer ensemble method based on boosting theory. Knowl.Based Syst., 45, 83–93.
 7.
Luo, R, & Yi, Q (2011). A novel parallel clustering algorithm based on artificial immune network using nVidia CUDA framework. In Proceedings of the 14 ^{th} international conference on humancomputer interaction, (pp. 598–607). Berlin: SpringerVerlag Press.
 8.
Sun, ZY, Mak, LO, Mao, KZ, Tang, WY, Liu, Y, Xian, KT, Wang, ZM, Sui, Y (2014). A knowledgedriven ART clustering algorithm. In Proceedings of the IEEE International Conference on Software Engineering and Service Science 2014, (pp. 645–648). Birmingham: IEEE Comput Soc.
 9.
Keogh, E, & Mueen, A (2010). Curse of dimensionality. In Encyclopedia of machine learning, (pp. 257–258). Berlin: SpringerVerlag.
 10.
Yi, LH (2011). Research on clustering algorithm for high dimensional data. Qinhuangdao: Yanshan University.
 11.
Shahvari, O, & Logendran, R. (2017). An enhanced Tabu search algorithm to minimize a bicriteria objective in batching and scheduling problems on unrelatedparallel machines with desired lower bounds on batch sizes. Comput. Oper. Res., 77, 154–176.
 12.
Tan, PN, Steinbach, M, Kumar, V (2005). Introduction to data mining. Boston: AddisonWesley Publishing Company.
 13.
Yang, FZ, & Zhu, YY. (2004). An efficient method for similarity search on quantitative transaction data. J. Comput. Res. Dev., 41(2), 361–368.
 14.
Huang, SD, & Chen, QM. (2009). On clustering algorithm of high dimensional data based on similarity measurement. Comput. Appl. Softw., 26(9), 102–105.
 15.
Shao, CS, Lou, W, Yan, LM. (2011). Optimization of algorithm of similarity measurement in highdimensional data. Comput. Technol. Dev., 21(2), 1–4.
 16.
Wang, XY, Zhang, HY, Shen, LZ, Chi, WL. (2013). Research on high dimensional clustering algorithm based on similarity measurement. Comput. Technol. Dev., 23(5), 30–33.
 17.
Jia, XY. (2005). A high dimensional data clustering algorithm based on twice similarity. J. Comput. Appl., 25(B12), 176–177.
 18.
Tan, N, & Shi, YX (2009). Optimization research of multidimensional indexing structure of R*tree. In Proceedings of the international forum on information technology and applications, (pp. 612–615). Berlin: SpringerVerlag Press.
 19.
Chen, HB, & Wang, ZQ. (2005). CR*tree: an improved Rtree using cost model. Lect. Notes Comput. Sc., 3801, 758–764.
 20.
Nielsen, F, Piro, P, Barlaud, M (2009). Bregman vantage point trees for efficient nearest neighbor queries. In Proceedings of the IEEE International Conference on Multimedia and Expo 2009, (pp. 878–881). Birmingham: IEEE Comput Soc.
 21.
Kunze, M, & Weske, M. (2011). Metric trees for efficient similarity search in large process model repositories. Lect. Notes Bus. Info. Proc., 66, 535–546.
 22.
Navarro, G. (2002). Searching in metric spaces by spatial approximation. VLDB J., 11(1), 28–46.
 23.
Chen, JB (2011). The research and application of key technologies in knowledge discovery of highdimensional clustering. Beijing: Publishing House of Electronics Industry.
 24.
Glover, F, & Laguna, M (2013). Tabu search*. In Handbook of combinatorial optimization, (pp. 3261–3362). Berlin: SpringerVerlag.
 25.
Li, WF, Wang, GM, Ma, N, Liu, HZ. (2016). A nearest neighbor search algorithm of highdimensional data based on sequential Npsim matrix. High Technol. Lett., 22(3), 241–247.
 26.
Tremblay, N, Puy, G, Gribonval, R, Vandergheynst, P (2016). Compressive spectral clustering. In Proceedings of the 33 ^{rd} international conference on machine learning, (pp. 1002–1011). Birmingham: IEEE Comput Soc.
 27.
Jin, X, & Han, JW (2010). KMedoids clustering, Encyclopedia of machine learning (pp. 564–565). New York: Springer Publishing.
 28.
Booth, TE. (2006). Power iteration method for the several largest eigenvalues and eigenfunctions. Nucl. Sci. Eng., 154(1), 48–62.
 29.
Ng, RT, & Han, JW (1994). Efficient and effective clustering methods for spatial data mining. In Proceedings of the VLDB 1994, (pp. 144–155). Birmingham: IEEE Comput Soc.
 30.
Breiman, L, Friedman, J, Stone, CJ, Olshen, RA (1984). Waveform recognition problem, Classification and regression trees (pp. 64–66). Belmont: Wadsworth International Group.
 31.
Chen, LF, Ye, YF, Jiang, QS (2008). A new centroidbased classifier for text categorization. In Proceeding of the IEEE 22 ^{nd} International Conference on Advanced Information Networking and Applications, (pp. 1217–1222). Birmingham: IEEE Comput Soc.
Acknowledgements
We would like to thank Editage (www.editage.com) for English language editing and publication support.
Funding
This work is partly supported by the National Nature Science Foundation of China (No. 61502475) and the Importation and Development of HighCaliber Talents Project of the Beijing Municipal Institutions (No. CIT & TCD201504039).
Funding
WL has conducted the research, analyzed the data, and authored the paper. GW has performed the overall design, providing new methods or models, and has written/revised the paper. All authors read and approved the final manuscript.
Author information
Affiliations
Contributions
The dataset supporting the conclusions of this article is available in the UCI database, http://archive.ics.uci.edu/ml/datasets/Waveform+Database+Generator+%28Version+2%29.
Corresponding author
Correspondence to Gongming Wang.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Li, W., Wang, G. & Li, K. Clustering algorithm for audio signals based on the sequential Psim matrix and Tabu Search. J AUDIO SPEECH MUSIC PROC. 2017, 26 (2017). https://doi.org/10.1186/s1363601701233
Received:
Accepted:
Published:
Keywords
 Audio signal clustering
 Sequential Psim matrix
 Tabu Search
 Heuristic search
 KMedoids
 Spectral clustering