
Clustering algorithm for audio signals based on the sequential Psim matrix and Tabu Search

Abstract

Audio signals are a type of high-dimensional data, and their clustering is critical. However, distance-calculation failures, inefficient index trees, and cluster overlaps, derived from equidistance, redundant attributes, and sparsity, respectively, seriously degrade clustering performance. To solve these problems, an audio-signal clustering algorithm based on the sequential Psim matrix and Tabu Search is proposed. First, the similarity between audio signals is calculated with the Psim function, which avoids the equidistance problem. The data is then organized into a sequential Psim matrix, which improves indexing performance. Next, the initial clusters are generated with differential truncation and refined using Tabu Search, which eliminates cluster overlap. Finally, the clusters are further refined with the K-Medoids algorithm. This algorithm is compared to the K-Medoids and spectral clustering algorithms on the UCI waveform dataset. The experimental results indicate that the proposed algorithm obtains better Macro-F1 and Micro-F1 values with fewer iterations.

1 Introduction

Audio signal clustering forms the basis of speech recognition, audio synthesis, audio retrieval, etc. Audio signals are high-dimensional data, with dimensionalities typically exceeding 20 [1]. Their clustering is therefore treated as a high-dimensional clustering problem, and solving the problems inherent in high-dimensional data clustering is highly beneficial.

There are three types of clustering algorithms for high-dimensional data: attribute reduction [2], subspace clustering [3], and co-clustering [4]. The first method reduces the data dimensionality with attribute conversion or reduction and then performs clustering. Its effect depends heavily on the degree of dimension reduction: if the reduction is too aggressive, useful information may be lost; if it is too mild, clustering cannot be done effectively. The second method divides the original space into several subspaces and searches for clusters within them. When the dimensionality is high and the accuracy requirement rigorous, the number of subspaces grows rapidly; searching for a cluster in a subspace then becomes a bottleneck and may fail [5]. The third method clusters the content and the features alternately and iteratively. The clustering result is adjusted according to the semantic relationship between theme and characteristic, balancing data clustering and attribute clustering. Because this method has two stages, its time complexity is high. Beyond these three, clustering algorithms for high-dimensional data include hierarchical clustering [6], parallel clustering [7], knowledge-driven clustering [8], etc. However, these methods suffer from similar problems.

Equidistance, redundant attributes, and sparsity are the fundamental factors degrading the clustering performance of high-dimensional data [9]. Equidistance renders the distance between any two points in a high-dimensional space approximately equal, causing distance-based clustering algorithms to fail. Redundant attributes increase the dimensionality of the data and the complexity of the index structure, decreasing the efficiency of building and retrieving the index. Sparsity makes the data distribution nearly uniform, so some clusters may overlap with each other, reducing the clustering precision.

It has been reported that some dimensional components of high-dimensional data are unrelated noise that hides the real distance, resulting in equidistance. The Psim function can find and eliminate noise in all the dimensions [10]. Tabu Search [11] is a heuristic global search algorithm. All the possible candidate clusters are combinatorially optimized using Tabu Search such that a set of non-overlapping clusters is selected.

To solve the clustering problems caused by equidistance, redundant attributes, and sparsity, an efficient audio-signal clustering algorithm is proposed that integrates the Psim matrix and Tabu Search. First, for all the points in the high-dimensional space, the Psim values between them and the corresponding location numbers are stored in a Psim matrix. Next, a sequential Psim matrix is generated by sorting the elements in each row of the Psim matrix. Further, the initial clusters are generated with differential truncation and refined with Tabu Search. Finally, the initial clusters are iteratively refined with the K-Medoids algorithm until all the cluster medoids are stable.

2 Related works

2.1 Psim matrix

Traditional similarity measures (e.g., the Euclidean distance, Jaccard coefficient [12], and Pearson coefficient [12]) fail in high-dimensional space because equidistance is a common phenomenon there; the calculated distance is not the real distance. To solve this problem, the Hsim function [13] was proposed; however, it considers neither relative differences nor the noise distribution. The Gsim function [14] analyzes the relative differences of the properties in different dimensions but ignores the weight discrepancy. The Close function [15] can reduce the influence of components in dimensions with larger variances; however, it does not consider relative differences and is affected by noise. The Esim function [16] improves on the Hsim and Close functions by considering the influence of each property on the similarity; in every dimension, the Esim component is positively correlated with the similarity. Its dimensions are divided into normal and noisy ones, noise being the main ingredient of a noisy dimension. When the noise in a normal dimension is comparable to or larger than the signal, Esim becomes invalid. The secondary measurement method [17] calculates the similarity by considering the property distribution, space distance, etc.; however, the noise distribution and weights are not taken into account, its formula is complicated, and the calculation is slow. In high-dimensional space, a large difference can exist in certain dimensions even when the data is similar [10]. This difference occupies a large portion of the similarity calculation; hence, all the calculated similarities become alike. The Psim function [10] was therefore proposed to diminish the influence of noise on the similarity; experimental results indicate that this method is suitable.

When using the Psim function to measure similarity, the data components in every dimension must be sorted and the value range divided into several intervals. The similarity between X and Y in the j-th dimension contributes to the Psim function if and only if their data components lie in the same interval.

In an n-dimensional space, the Psim value between X and Y is as follows:

$$ \mathrm{Psim}\left(X,Y\right)=\sum \limits_{j\in {D}_s\left(X,Y\right)}\left(1-\frac{\left|{X}_j-{Y}_j\right|}{l_j-{u}_j}\right)\cdot \frac{\left|{D}_s\left(X,Y\right)\right|}{n} \qquad (1) $$

where X_j and Y_j are the data components of X and Y in the j-th dimension, D_s(X, Y) is the set of subscripts j for which X_j and Y_j lie in the same interval [u_j, l_j], and |D_s(X, Y)| is the number of elements in D_s(X, Y). The above is an outline of the Psim function; a detailed introduction can be found in [10].
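The following sketch illustrates Eq. (1) in Python. It is a minimal, hedged reading of the formula: the per-dimension interval boundaries (`edges`), the function name, and the single-pass loop are our own illustrative choices, not code from [10].

```python
import numpy as np

def psim(x, y, edges):
    """Illustrative sketch of the Psim similarity of Eq. (1).

    x, y  : 1-D arrays of length n (two points).
    edges : edges[j] is a sorted array of interval boundaries for
            dimension j (assumed precomputed by sorting the data
            components of that dimension and splitting the value range).
    """
    n = len(x)
    total, shared = 0.0, 0
    for j in range(n):
        bx = np.searchsorted(edges[j], x[j])  # interval index of X_j
        by = np.searchsorted(edges[j], y[j])  # interval index of Y_j
        # dimension j contributes only if both components share an interval
        if bx == by and 0 < bx < len(edges[j]):
            u, l = edges[j][bx - 1], edges[j][bx]  # the interval [u_j, l_j]
            shared += 1
            total += 1.0 - abs(x[j] - y[j]) / (l - u)
    return total * shared / n  # the |D_s(X,Y)|/n factor of Eq. (1)
```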

Data organization is critical in a clustering algorithm. In the traditional method, the data space is partitioned using an index tree and mapped onto the index-tree nodes. The commonly used index trees are the R tree [18], cR tree [19], VP tree [20], M tree [21], SA tree [22], etc. Partitioning the data space is the foundation for building an index tree, but its complexity increases with the dimensionality; thus, it is difficult to build index trees for high-dimensional data. In addition, the retrieval efficiency of an index tree falls sharply as the dimensionality increases: retrieval works effectively when the dimensionality is less than 16, but degrades rapidly for higher dimensionalities, even down to the level of a linear search [23]. A sequential Psim matrix is used to solve this problem. First, all the Psim values between the points S_1, S_2, …, S_M are calculated to build a Psim matrix, PsimMat, of size M × M. PsimMat(i, t) is composed of three properties: i, t, and Psim(S_i, S_t). Next, the sequential Psim matrix, SortPsimMat, is generated by sorting the elements in every row of PsimMat in descending order of the Psim value. The elements in the i-th row represent the similarities between S_i and the other points; from left to right, the Psim values gradually diminish, indicating decreasing similarity. The sequential Psim matrix is not affected by the dimensionality and can represent the similarity distribution of all the points. Therefore, it is suitable for high-dimensional data clustering.
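A compact sketch of this construction, reusing the `psim` helper above (Python's built-in sort stands in for the quicksort mentioned in Section 3.3.1):

```python
def build_sorted_psim(points, edges):
    """Sketch: build PsimMat and the sequential Psim matrix SortPsimMat.

    Each entry keeps the triple (i, t, Psim(S_i, S_t)); every row is then
    sorted by descending Psim value, so column position encodes similarity.
    """
    M = len(points)
    psim_mat = [[(i, t, psim(points[i], points[t], edges)) for t in range(M)]
                for i in range(M)]
    sort_psim_mat = [sorted(row, key=lambda e: e[2], reverse=True)
                     for row in psim_mat]
    return psim_mat, sort_psim_mat
```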

2.2 Differential truncation

The elements in every row of SortPsimMat are regarded as a sequence, A, of length M. The sequential Psim differential matrix, DeltaPsimMat, of size M × (M − 1), is generated by applying a differential operation to each such sequence. The elements in the i-th row of SortPsimMat represent the similarities between S_i and the other points. The points corresponding to the leftmost elements of this row form a cluster centered at S_i, because the similarity between elements inside the cluster is higher than that of elements outside it; consequently, the similarity differences between elements inside the cluster are smaller than the others. Assuming that the cluster centered at S_i has p_i elements, the left p_i − 1 elements in the i-th row of DeltaPsimMat are smaller than the differential threshold, ΔA_max. Thus, a reasonable ΔA_max is set, and all the elements less than ΔA_max at the start of the i-th row of DeltaPsimMat are found to form a cluster centered at S_i.
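The per-row truncation can be sketched as follows; this is an illustrative reading of the description above, and the incremental growth of the cluster is an assumption of ours:

```python
def truncate_row(sort_row, delta_max):
    """Sketch of differential truncation for one row of SortPsimMat.

    sort_row  : the i-th row, triples (i, t, psim) in descending psim order.
    delta_max : the differential threshold ΔA_max.
    Returns the cluster centered at S_i: the points whose successive Psim
    differences (the i-th row of DeltaPsimMat) stay below delta_max.
    """
    cluster = [sort_row[0][1]]  # S_i itself, the most similar entry
    for k in range(1, len(sort_row)):
        delta = sort_row[k - 1][2] - sort_row[k][2]  # DeltaPsimMat element
        if delta >= delta_max:  # first element reaching the threshold: stop
            break
        cluster.append(sort_row[k][1])
    return cluster
```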

2.3 Tabu Search

After differential truncation, the intersection of some of the clusters may not be null; the overlapping elements should therefore be eliminated by refinement. The clusters to be refined are called the imminent-refining cluster set, whose initial value is the set of clusters after differential truncation. The clusters that have been refined are called the refined cluster set, whose initial value is null. The refinement is an iterative process. Taking the average Psim of the remaining elements in the i-th row of SortPsimMat after differential truncation as the similarity of the cluster centered at S_i, each iteration proceeds as follows: first, the similarity of every cluster is calculated; next, the cluster with the highest similarity is added to the refined cluster set, and any element of another cluster that also lies in the selected cluster is deleted. However, a problem arises: after deleting the overlapping elements, the similarity of some cluster in the imminent-refining cluster set may become greater than that of the selected cluster. To solve this problem, Tabu Search is used for the refinement.

Tabu Search is an expansion of neighborhood search and a global optimization algorithm [11], mainly used for combinatorial optimization. Roundabout searches are avoided using the Tabu rule and the aspiration criterion, improving the global search efficiency. The method can accept an inferior solution and has a strong “climbing” ability; it therefore has a higher probability of obtaining a globally optimal solution.

The main process of Tabu Search is as follows: initially, a random solution is taken as the current solution, and several neighboring solutions are considered as candidate solutions. If the objective value of a certain candidate meets the aspiration criterion, the current solution is replaced by this candidate, which is then added to the Tabu list. Otherwise, the best non-Tabu candidate is taken as the new current solution, and the corresponding solution is likewise added to the Tabu list [24]. These steps are repeated until the termination criterion is satisfied.
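A minimal generic skeleton of this loop is sketched below. The function names, the maximization convention, and the fixed-length Tabu list are illustrative assumptions (solutions are assumed hashable so that membership in the Tabu list can be tested):

```python
from collections import deque

def tabu_search(initial, neighbors, score, max_iter=100, tabu_len=10):
    """Sketch of the generic Tabu Search loop described above.

    neighbors(s) yields candidate solutions near s; score(s) is maximized.
    A Tabu candidate is admissible only if it beats the best solution
    found so far (aspiration criterion).
    """
    current = best = initial
    tabu = deque(maxlen=tabu_len)  # fixed-length Tabu list
    for _ in range(max_iter):      # termination criterion
        candidates = list(neighbors(current))
        admissible = [c for c in candidates
                      if c not in tabu or score(c) > score(best)]
        if not admissible:
            break
        current = max(admissible, key=score)  # best admissible move
        tabu.append(current)                  # record it as Tabu
        if score(current) > score(best):
            best = current
    return best
```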

To use Tabu Search for refining the clusters, an appropriate Tabu object, Tabu list, aspiration criterion, and termination criterion are required. The Tabu objects are the elements of the refined cluster set; they are saved in the Tabu list to prevent the Tabu Search from falling into a local optimum. The Tabu length is set to the number of clusters after differential truncation. In every iteration, the selected cluster is taken as the Tabu object. If, after eliminating the overlapping elements, some cluster has a similarity higher than that of the previously selected cluster, it is considered the better cluster and replaces the previously selected one; the previously selected cluster is removed from the Tabu list and returned to the imminent-refining cluster set. This “eliminate the overlapping elements, search for a better cluster” process is repeated until a better cluster can no longer be found; the previously selected cluster is then taken as the optimal cluster of this iteration. The search for the better cluster of the next iteration then commences, until the imminent-refining cluster set is null.

3 Clustering algorithm

3.1 Problem description

The dataset of M audio signals of length n is considered as the point set S = {S_1, S_2, …, S_M} of an n-dimensional space, where S_i = {S_i1, …, S_ij, …, S_in}, i = 1, 2, …, M, j = 1, 2, …, n, and S_ij is the j-th property of S_i.

The goal is to search for sets C_1, C_2, …, C_K that meet the following two requirements (a sketch of this check in code follows the list):

  1. C_1 ∪ C_2 ∪ ⋯ ∪ C_K = S

  2. C_v ∩ C_w = ∅, for any 1 ≤ v ≠ w ≤ K.
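A minimal sketch of this partition check, under the assumption that clusters are represented as sets of point indices (the helper name is ours):

```python
def is_valid_partition(clusters, S):
    """Sketch: verify the two requirements above for C_1, ..., C_K.

    clusters : list of sets of point indices; S : set of all indices.
    """
    covers_S = set().union(*clusters) == S
    # given full coverage, pairwise disjointness is equivalent to the
    # cluster sizes summing exactly to |S|
    disjoint = sum(len(c) for c in clusters) == len(S)
    return covers_S and disjoint
```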

3.2 Framework of the clustering algorithm

The proposed clustering algorithm has four steps, as shown in Fig. 1. First, the sequential Psim matrix is built to represent the similarity between the points in the set S. Next, the initial clusters are generated by integrating differential truncation with a heuristic search. Further, the initial clusters are refined with Tabu Search. Finally, the expected clusters are generated by clustering based on K-Medoids.

Fig. 1. Framework of the proposed clustering algorithm

3.3 Clustering algorithm procedure

3.3.1 Construction of a sequential Psim matrix

The Psim values between all the points in the set, S, are calculated using Eq. 1 and saved into the Psim matrix, PsimMat. Then, the elements in every row of PsimMat are sorted with quicksort to obtain the sequential Psim matrix, SortPsimMat. The above is a brief introduction; the detailed procedure can be found in [25].

3.3.2 Initial cluster generation

First, the Laplacian matrix, L, is generated from PsimMat, and its eigenvalue distribution is used to determine the number of expected clusters [26]. Next, the differential threshold, ΔA_max, is initialized. Let C_max be the maximal number of searches for the cluster set. The upper bound of C_max is the combinatorial number \( {C}_M^K \), where K is the number of clusters; searching \( {C}_M^K \) times is prohibitively time-consuming because \( {C}_M^K = M!/\left(K!\left(M-K\right)!\right) \) grows factorially with M. Moreover, the K expected clusters may not be found even after \( {C}_M^K \) searches. Thus, C_max is set to C_max = M and a heuristic search is implemented. Finally, the Tabu list of cluster sets, TBL_C, is set to null and i = 1. The initial clusters can then be generated with the following steps.

  • Step 1: The elements in the i-th row of DeltaPsimMat are visited from left to right, until the p_i-th element is greater than the differential threshold, ΔA_max, for the first time.

  • Step 2: The points corresponding to the left p_i − 1 elements in the i-th row of SortPsimMat are used to construct a cluster, \( {C}_T^i \), centered at S_i.

  • Step 3: If i < M, then i = i + 1; go to Step 1; else, c = 1.

  • Step 4: K clusters, \( {C}_T^{i1},{C}_T^{i2},\cdots, {C}_T^{iK} \), are selected from the M clusters, \( {C}_T^1,{C}_T^2,\cdots, {C}_T^M \), such that the set composed of the K centers of the selected clusters is not in the Tabu list, TBL_C.

  • Step 5: If the union of the K clusters equals the set S, then the set \( {C}_I=\left\{{C}_I^1,{C}_I^2,\cdots, {C}_I^K\right\} \), where \( {C}_I^v={C}_T^{iv} \), is taken as the initial cluster set. Otherwise, the set is added to TBL_C; go to Step 6.

  • Step 6: If c ≥ C_max, then i = 1, increase ΔA_max, clear TBL_C, and go to Step 1. Otherwise, c = c + 1; go to Step 4.
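These steps can be sketched as the following loop, reusing `truncate_row` from above. The random sampling of K candidate centers and the factor used to enlarge ΔA_max are illustrative assumptions:

```python
import random

def generate_initial_clusters(sort_psim_mat, K, delta_max, S):
    """Hedged sketch of Section 3.3.2, with C_max = M heuristic searches."""
    M = len(sort_psim_mat)
    tbl_c = set()  # Tabu list of already-tried center sets (TBL_C)
    while True:
        # Steps 1-3: one truncated cluster per row/center
        rows = [truncate_row(sort_psim_mat[i], delta_max) for i in range(M)]
        for _ in range(M):  # Step 4: at most C_max = M selections
            centers = tuple(sorted(random.sample(range(M), K)))
            if centers in tbl_c:
                continue
            chosen = [set(rows[i]) for i in centers]
            if set().union(*chosen) == S:  # Step 5: union covers S
                return chosen
            tbl_c.add(centers)
        # Step 6: enlarge ΔA_max and retry (growth factor is an assumption;
        # coverage is assumed to be reached eventually as clusters grow)
        delta_max *= 1.5
        tbl_c.clear()
```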

3.3.3 Refinement of the initial cluster

The initial cluster set, C_I, is taken as the imminent-refining cluster set, C_Refining. The refined cluster set, C_Refined, and the Tabu list, TBL_F, are both null. The maximal number of searches is F_max, and a heuristic algorithm is used for the search; F_max is set to F_max = K because F_max is proportional to the size of C_I. The refinement procedure for C_I is as follows.

  • Step 1: The number of iterations, r = 0, and the number of searches in the current iteration, f = 0.

  • Step 2: The similarity of every cluster in C_Refining is calculated, and the cluster with the highest similarity is taken as the better cluster, C_Optimal, and moved into the refined cluster set, C_Refined. In addition, the selected cluster and its similarity are added to TBL_F.

  • Step 3: Any element of the remaining clusters in C_Refining that also lies in the cluster C_Optimal is deleted. Then, the similarity of every cluster in C_Refining is calculated, and the cluster with the highest similarity is denoted C_MaxPsim.

  • Step 4: If the similarity of C_MaxPsim is not more than that of C_Optimal, or f ≥ F_max, then go to Step 5. Otherwise, f = f + 1; go to Step 6.

  • Step 5: If r ≥ K, then the refinement of the initial clusters is terminated. Otherwise, r = r + 1, f = 0; go to Step 2.

  • Step 6: Cluster C_Optimal is moved back to C_Refining from C_Refined, and the corresponding information in TBL_F is deleted.

  • Step 7: Cluster C_MaxPsim is taken as the better cluster, C_Optimal, and moved into C_Refined. In addition, the corresponding information is added to TBL_F.

  • Step 8: Go to Step 2.
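A condensed sketch of this refinement loop is given below. The `similarity` callback (the average Psim of a cluster's members, as defined in Section 2.3) is passed in rather than recomputed, and the control flow is a simplified reading of Steps 1–8, not a literal transcription:

```python
def refine_clusters(clusters, similarity, K):
    """Hedged sketch of Section 3.3.3; F_max = K as in the text.

    clusters   : initial cluster set C_I (list of sets of indices).
    similarity : callable giving the average Psim of a cluster.
    """
    refining = [set(c) for c in clusters]  # C_Refining
    refined = []                           # C_Refined
    for _ in range(K):                     # Step 5: at most K iterations
        if not refining:
            break
        optimal = max(refining, key=similarity)  # Step 2
        refining.remove(optimal)
        for _ in range(K):                 # Step 4: at most F_max searches
            for c in refining:             # Step 3: delete overlapping elements
                c -= optimal
            refining = [c for c in refining if c]
            if not refining:
                break
            best = max(refining, key=similarity)
            if similarity(best) <= similarity(optimal):
                break                      # Step 4: keep the current optimum
            refining.remove(best)          # Steps 6-7: swap in the better cluster
            refining.append(optimal)
            optimal = best
        refined.append(optimal)
    return refined
```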

3.3.4 Clustering based on iterative partitioning

The cluster, after Tabu Search, has the basic cluster characteristics. For further improvement, clustering based on K-Medoids [27] is implemented.
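For concreteness, a minimal K-Medoids refinement under the Psim similarity might look as follows (an illustrative sketch, assuming clusters stay non-empty; `psim_fn` is any pairwise similarity, e.g., the `psim` sketch above):

```python
def k_medoids_refine(points, clusters, psim_fn, max_iter=50):
    """Sketch: alternate medoid update and reassignment until stable."""
    def medoid(c):
        # member with the highest total similarity to the rest of the cluster
        return max(c, key=lambda i: sum(psim_fn(points[i], points[t]) for t in c))

    medoids = [medoid(c) for c in clusters]
    for _ in range(max_iter):
        assign = [[] for _ in medoids]
        for i in range(len(points)):  # assign each point to its closest medoid
            k = max(range(len(medoids)),
                    key=lambda m: psim_fn(points[i], points[medoids[m]]))
            assign[k].append(i)
        # keep the old medoid if a cluster happens to empty out
        new_medoids = [medoid(c) if c else medoids[k]
                       for k, c in enumerate(assign)]
        if new_medoids == medoids:    # all medoids stable: converged
            break
        medoids = new_medoids
    return assign, medoids
```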

3.4 Convergence analysis

The proposed clustering algorithm has four steps, and the corresponding convergence analysis is as follows:

  1. Construction of a sequential Psim matrix

    The Psim matrix, PsimMat, is generated by running Eq. 1 M × M times; the sequential Psim matrix, SortPsimMat, is generated by sorting the elements in every row of PsimMat. The above operation can be completed within a limited time; hence, this step converges.

  2. Generating the initial cluster

    First, the number of expected clusters can be determined by spectral clustering in a limited time. Next, as the differential threshold, ΔA_max, increases, the number of elements in every cluster increases; the union, \( {C}_T^{i1}\cup {C}_T^{i2}\cup \cdots \cup {C}_T^{iK} \), thus gradually approaches the set S. Therefore, this step converges.

  3. Refinement of the initial cluster

    This step iterates K times. In every iteration procedure, the calculation of the K average similarities of the cluster is carried out K times, at most. The above operation can be completed in limited time; thus, this step converges.

  4. Clustering based on iterative partitioning

    This step is based on the K-Medoids clustering algorithm, which converges naturally. Thus, this step also converges. The above statements show that the four steps can be completed in limited time. Thus, the proposed clustering algorithm converges.

3.5 Time complexity analysis

Considering multiplication and comparison as the two basic operations, the corresponding complexity analysis is as follows:

  1. Construction of a sequential Psim matrix

    The complexity of this step is reported to be O(M^2·n) [25].

  2. Generating the initial cluster

    The size of the Laplacian matrix, L, in Section 3.3.2 is M × M. Its top K eigenvalues are calculated with the power iteration method [28], with a time complexity of O(K·M^2). The optimal number of clusters, K_opt, lies in the range \( 1\le {K}_{\mathrm{opt}}\le \sqrt{M} \); hence, the time complexity of calculating the eigenvalues of L is O(M^2.5). Further, the initial clusters are generated over several iterations, each analyzed as follows. First, the differential threshold, ΔA_max, is increased and the elements in every row of DeltaPsimMat are visited; the time complexity is accordingly O(M^2). Then, K clusters are selected and their union is tested for equality with the set S. The maximal number of comparisons for computing the union of two clusters is M^2·n, because a cluster holds at most M elements and each element has dimension n; thus, the maximal number of comparisons for the union of K clusters is (K − 1)·M^2·n. In addition, the selection of the K clusters is performed at most C_max = M times. Therefore, the time complexity of the selection process is O((K − 1)·M^2·n·C_max) = O((K − 1)·M^2·n·M) = O(K·M^3·n). Since the optimal number of clusters, K_opt, is at most \( \sqrt{M} \) [29], O(K·M^3·n) = O(M^3.5·n). Let the maximal number of iterations be H. Hence, the time complexity for generating the initial clusters is O(H·M^3.5·n).

  3. Refinement of the initial cluster

    In this step, the basic operation is the calculation of the K cluster similarities. Each cluster has at most M elements, so one such calculation needs K·M additions. This basic operation is carried out at most K^2 times; the total number of additions is therefore K^3·M, i.e., the time complexity of this step is O(K^3·M). Since the upper bound of the optimal number of clusters is \( {K}_{\mathrm{opt}}=\sqrt{M} \) [29], the time complexity can be expressed as O(M^2.5).

  4. Clustering based on iterative partitioning

    This step is iterated Q times. Each iteration involves three basic operations: the construction of the K clusters, the computation of the K cluster medoids, and the calculation of the objective functions E_q and \( {E}_q^{\ast } \). These three operations require K·M, M, and M Psim calculations, respectively; the total number of Psim calculations in this step is thus Q·(K·M + M + M) = Q·(K + 2)·M. From Eq. 1, the time complexity of one Psim calculation is O(n). Therefore, the time complexity of this step is O(Q·(K + 2)·M·n) = O(Q·K·M·n), which reduces to O(Q·M^1.5·n) by the property [29] of the optimal number of clusters, \( 1\le {K}_{\mathrm{opt}}\le \sqrt{M} \).

Summing the above, the time complexity of the proposed clustering algorithm is O(M^2·n) + O(M^2·n) + O(M·n·log n) + O(M^2.5) + O(H·M^3.5·n) + O(M^2.5) + O(Q·M^1.5·n) = O(M^2·n) + O(M·n·log n) + O(M^2.5) + O(H·M^3.5·n) + O(Q·M^1.5·n). Generally, M and n are of comparable magnitude, i.e., M ≫ log n; thus, O(M·n·log n) and O(M^2.5) can be ignored relative to O(M^2·n). The magnitudes of H and Q are the same because both are iteration counts; thus, O(Q·M^1.5·n) can be ignored relative to O(H·M^3.5·n). Therefore, the time complexity simplifies to O(M^2·n) + O(H·M^3.5·n) = O(H·M^3.5·n).

As can be seen from the above analysis, this is a polynomial-time algorithm and can be run on an ordinary machine under normal conditions.

4 Experiment

4.1 Overview

In the following experiment, the hardware includes an AMD Athlon(tm) II X2-250 processor and 4 GB of Kingston memory; the software includes the Windows 7 operating system and Microsoft Visual Studio 2012. The audio signal data [30] is downloaded from the UCI database. This dataset is composed of 5000 audio vectors of length 41, each produced by mixing a normal waveform with noise; there are three categories.

The number of clusters is determined using the spectral clustering algorithm. Then, the test data is clustered ten times with the proposed clustering algorithm based on the Psim matrix and Tabu Search (the PM-TS clustering algorithm), the K-Medoids clustering algorithm [29], and the spectral clustering algorithm [26]. For each clustering run, the number of iterations, Macro-F1, and Micro-F1 [31] are calculated, and these values are averaged over the ten runs. Finally, the algorithms are compared based on these results.
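For reference, the two quality measures can be computed from a confusion matrix as sketched below (after matching each cluster to its majority class; with single-label data, Micro-F1 coincides with accuracy). The helper name and the NumPy representation are our own:

```python
import numpy as np

def macro_micro_f1(conf):
    """Sketch: Macro-F1 and Micro-F1 from a K x K confusion matrix
    (rows: true class, columns: assigned cluster after label matching)."""
    tp = np.diag(conf).astype(float)
    precision = tp / np.maximum(conf.sum(axis=0), 1)  # per-class precision
    recall = tp / np.maximum(conf.sum(axis=1), 1)     # per-class recall
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    macro_f1 = f1.mean()
    micro_f1 = tp.sum() / conf.sum()  # pooled over all decisions
    return macro_f1, micro_f1
```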

4.2 Selection criteria for the compared algorithm

In our experiment, there are three criteria for selecting the compared clustering algorithms: the selected algorithm must be widely recognized by academia and industry, it must be suitable for high-dimensional data clustering and converge stably, and it must be relevant to our algorithm.

Based on the above criteria, the K-Medoids and the spectral clustering algorithms were selected. Both are widely used by academia and industry, can converge stably, and are strongly related to our algorithm. In addition, they are also related to each other. The detailed analysis is as follows:

  1. Correlation analysis of the K-Medoids clustering algorithm

    The K-Medoids clustering algorithm is one of the few clustering algorithms that is theoretically convergent. At the beginning of the algorithm, the initial clusters are randomly selected; subsequently, clustering proceeds iteratively by reassigning each point to the nearest cluster medoid and recomputing the medoids. Theoretically, the iterative clustering, not the initial clusters, is the key to convergence. The focus of our study is a clustering algorithm suitable for high-dimensional data; convergence is a prerequisite. Therefore, the iterative clustering strategy of the K-Medoids algorithm is adopted for convergence, and the randomly selected initial clusters of K-Medoids are replaced by refined, non-overlapping clusters. Thus, our algorithm is derived from the K-Medoids clustering algorithm.

  2. Correlation analysis of the spectral clustering algorithm

    First, the spectral clustering procedure is similar to that of the proposed and K-Medoids clustering algorithms: clustering is completed using only the adjacency matrix that stores the similarities of the points, not the coordinate vectors used by, e.g., the K-Means clustering algorithm. Next, the spectral clustering algorithm is based on graph theory: the data points and their similarities are represented as vertices and edge weights, respectively, and the eigenvectors extracted from the Laplacian matrix of the adjacency matrix are used for clustering. Because the number of eigenvectors is considerably smaller than the dimension of the points, it can be regarded as a dimensionality-reduction clustering algorithm. Our algorithm is of the same type, because the sparse and noisy dimension components do not participate in the computation; hence, both algorithms are similar with respect to dimensionality reduction. Finally, the number of clusters used in the proposed and K-Medoids clustering algorithms is calculated via the eigenvalue decomposition of the Laplacian matrix from the spectral clustering algorithm; i.e., some results of the spectral clustering algorithm feed the proposed and K-Medoids clustering algorithms. Thus, these algorithms are strongly related to the spectral clustering algorithm.

4.3 Tabu Search analysis

Tabu Search, the core of our algorithm, solves the overlap problem of the initial clusters and thereby improves their quality. Ten sequential Psim matrices are generated; the running time is shown in Fig. 2. The corresponding initial clusters are then generated with the method of Section 3.3.2 and refined with the method of Section 3.3.3, producing ten sets of refined clusters. The numbers of searches for producing the initial clusters are shown in Fig. 3. The overlap, recall, precision, and running time of the initial and refined clusters are shown in Figs. 4, 5, 6 and 7, respectively.

Fig. 2. Running time for producing the sequential Psim matrix ten times

Fig. 3. Numbers of searches for producing the initial clusters ten times

Fig. 4. Overlap of the initial cluster and that of the refined cluster after running Tabu Search ten times

Fig. 5. Recall of the initial cluster and that of the refined cluster after running Tabu Search ten times

Fig. 6. Precision of the initial cluster and that of the refined cluster after running Tabu Search ten times

Fig. 7. Running time of the initial cluster and that of the refined cluster after running Tabu Search ten times

It can be seen that 35–45% of the elements, which were overlapping in the initial clusters, are eliminated by Tabu Search. The recall and precision of the initial clusters were approximately 39% and 34%, respectively. After Tabu Search, the recall decreased to approximately 33%, but the precision increased to approximately 39%. This is because some correctly classified elements are eliminated while deleting the overlapping elements in a cluster, reducing the recall; however, Tabu Search deletes more misclassified elements than correctly classified ones, so the proportion of correctly classified elements in a cluster increases. The number of searches for producing the initial clusters is a random value between 1 and 5000, because the maximal number of searches, C_max, is M = 5000. In most cases, however, the expected initial clusters are found within 1500 searches, and the corresponding running time is less than 9 s. The upper bound of the running time for refining the clusters is the total time to examine all permutations of the clusters; in our experiment, the refinement takes less than 0.5 s owing to the limited number of clusters (only three).

The experimental results are averaged and presented in Table 1; they illustrate the role of Tabu Search in the refinement of the initial clusters. After Tabu Search, the precision increased from 33.9 to 38.4%, although the recall decreased to 33.4%, satisfying the equilibrium distribution of precision and recall. In addition, the overlapping elements in the initial clusters (approximately 40%) are completely deleted by Tabu Search. The average running time for producing the initial clusters is 7.36 s, and that for the refinement is 0.49 s; the time cost of the cluster optimization is therefore acceptable. Thus, Tabu Search can improve the quality of the clusters to a certain extent.

Table 1 Average performances of the initial and refined clusters with Tabu Search

4.4 Stability analysis

First, the method for determining the number of clusters, described in Section 3.3.2, is applied to the test data. The result is three, which matches the known number of categories. Then, the dataset is clustered ten times with the PM-TS clustering algorithm, the K-Medoids clustering algorithm [29], and the spectral clustering algorithm [26]. The corresponding results are depicted in Figs. 8, 9, 10 and 11.

Fig. 8. Iterations of the three algorithms, run ten times

Fig. 9. Whole running time of the three algorithms, run ten times

Fig. 10. Macro-F1 of the three algorithms, run ten times

Fig. 11. Micro-F1 of the three algorithms, run ten times

It can be seen that the PM-TS clustering algorithm needs fewer iterations than the K-Medoids and spectral clustering algorithms, indicating that the proposed method obtains a more precise initial cluster and converges faster. In addition, the overall running speed of the PM-TS clustering algorithm is higher than that of the K-Medoids and spectral clustering algorithms. In most cases, the clustering accuracy (Macro-F1 and Micro-F1) and stability (variations in Macro-F1 and Micro-F1) both follow the order PM-TS clustering algorithm > spectral clustering algorithm > K-Medoids clustering algorithm. These results demonstrate the advantages of the PM-TS clustering algorithm in terms of speed, accuracy, and stability. In some cases, the clustering accuracies of the K-Medoids and spectral clustering algorithms fall below 50%, indicating a clustering failure; the PM-TS clustering algorithm has no such cases, confirming its validity.

4.5 Whole-performance analysis

The experimental results are averaged and presented in Table 2; they illustrate the better performance of the PM-TS clustering algorithm compared to the K-Medoids and spectral clustering algorithms. On the one hand, it requires fewer iterations and converges faster, with a higher overall running speed. On the other hand, it has no failure cases, and its Macro-F1 and Micro-F1 values increase by more than 13% and 10%, respectively. In summary, the failures in the distance calculation, the inefficient index trees, and the cluster overlaps derived from the characteristics of high-dimensional data can all be corrected using the PM-TS clustering algorithm.

Table 2 Average performances of the three algorithms

5 Conclusions

Audio signal clustering is critical in media computing. The key to improving its performance is solving the problems that exist in high-dimensional data clustering, such as failures in the distance calculation, inefficient index trees, and cluster overlaps. To address these problems, a clustering algorithm integrating the sequential Psim matrix, differential truncation, and Tabu Search is proposed. Compared to other clustering algorithms, its characteristics are as follows: in high-dimensional space, the sequential Psim matrix is used to calculate the distances and organize the data, and differential truncation and Tabu Search are used to obtain initial clusters with a high accuracy. Experimental results indicate that this algorithm performs better than the K-Medoids and spectral clustering algorithms. Several heuristic methods used in this algorithm have potential for improvement; our future work therefore includes determining more effective initial parameters, evaluation functions, and convergence criteria, to improve the accuracy of the results.

References

  1. Ericson, K., & Pallickara, S. (2013). On the performance of high dimensional data clustering and classification algorithms. Futur. Gener. Comput. Syst., 29(4), 1024–1034.

  2. Ravale, U., Marathe, N., & Padiya, P. (2013). Attribute reduction based hybrid anomaly intrusion detection using K-means and SVM classifier. Int. J. Comput. Appl., 82(15), 32–35.

  3. Gan, G. J., & Ng, M. K. (2015). Subspace clustering using affinity propagation. Pattern Recogn., 48(4), 1455–1464.

  4. Govaert, G., & Nadif, M. (2014). Co-clustering: models, algorithms and applications. London: Wiley-ISTE.

  5. Kriegel, H. P., Kröger, P., & Zimek, A. (2009). Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering, and correlation clustering. ACM Trans. Knowl. Discov. Data, 3(1), 1–58.

  6. Rashedi, E., & Mirzaei, A. (2013). A hierarchical clusterer ensemble method based on boosting theory. Knowl.-Based Syst., 45, 83–93.

  7. Luo, R., & Yi, Q. (2011). A novel parallel clustering algorithm based on artificial immune network using nVidia CUDA framework. In Proceedings of the 14th international conference on human-computer interaction (pp. 598–607). Berlin: Springer-Verlag Press.

  8. Sun, Z. Y., Mak, L. O., Mao, K. Z., Tang, W. Y., Liu, Y., Xian, K. T., Wang, Z. M., & Sui, Y. (2014). A knowledge-driven ART clustering algorithm. In Proceedings of the IEEE International Conference on Software Engineering and Service Science 2014 (pp. 645–648). Birmingham: IEEE Comput Soc.

  9. Keogh, E., & Mueen, A. (2010). Curse of dimensionality. In Encyclopedia of machine learning (pp. 257–258). Berlin: Springer-Verlag.

  10. Yi, L. H. (2011). Research on clustering algorithm for high dimensional data. Qinhuangdao: Yanshan University.

  11. Shahvari, O., & Logendran, R. (2017). An enhanced Tabu search algorithm to minimize a bi-criteria objective in batching and scheduling problems on unrelated-parallel machines with desired lower bounds on batch sizes. Comput. Oper. Res., 77, 154–176.

  12. Tan, P. N., Steinbach, M., & Kumar, V. (2005). Introduction to data mining. Boston: Addison-Wesley Publishing Company.

  13. Yang, F. Z., & Zhu, Y. Y. (2004). An efficient method for similarity search on quantitative transaction data. J. Comput. Res. Dev., 41(2), 361–368.

  14. Huang, S. D., & Chen, Q. M. (2009). On clustering algorithm of high dimensional data based on similarity measurement. Comput. Appl. Softw., 26(9), 102–105.

  15. Shao, C. S., Lou, W., & Yan, L. M. (2011). Optimization of algorithm of similarity measurement in high-dimensional data. Comput. Technol. Dev., 21(2), 1–4.

  16. Wang, X. Y., Zhang, H. Y., Shen, L. Z., & Chi, W. L. (2013). Research on high dimensional clustering algorithm based on similarity measurement. Comput. Technol. Dev., 23(5), 30–33.

  17. Jia, X. Y. (2005). A high dimensional data clustering algorithm based on twice similarity. J. Comput. Appl., 25(B12), 176–177.

  18. Tan, N., & Shi, Y. X. (2009). Optimization research of multi-dimensional indexing structure of R*-tree. In Proceedings of the international forum on information technology and applications (pp. 612–615). Berlin: Springer-Verlag Press.

  19. Chen, H. B., & Wang, Z. Q. (2005). CR*-tree: an improved R-tree using cost model. Lect. Notes Comput. Sc., 3801, 758–764.

  20. Nielsen, F., Piro, P., & Barlaud, M. (2009). Bregman vantage point trees for efficient nearest neighbor queries. In Proceedings of the IEEE International Conference on Multimedia and Expo 2009 (pp. 878–881). Birmingham: IEEE Comput Soc.

  21. Kunze, M., & Weske, M. (2011). Metric trees for efficient similarity search in large process model repositories. Lect. Notes Bus. Info. Proc., 66, 535–546.

  22. Navarro, G. (2002). Searching in metric spaces by spatial approximation. VLDB J., 11(1), 28–46.

  23. Chen, J. B. (2011). The research and application of key technologies in knowledge discovery of high-dimensional clustering. Beijing: Publishing House of Electronics Industry.

  24. Glover, F., & Laguna, M. (2013). Tabu search. In Handbook of combinatorial optimization (pp. 3261–3362). Berlin: Springer-Verlag.

  25. Li, W. F., Wang, G. M., Ma, N., & Liu, H. Z. (2016). A nearest neighbor search algorithm of high-dimensional data based on sequential Npsim matrix. High Technol. Lett., 22(3), 241–247.

  26. Tremblay, N., Puy, G., Gribonval, R., & Vandergheynst, P. (2016). Compressive spectral clustering. In Proceedings of the 33rd international conference on machine learning (pp. 1002–1011). Birmingham: IEEE Comput Soc.

  27. Jin, X., & Han, J. W. (2010). K-Medoids clustering. In Encyclopedia of machine learning (pp. 564–565). New York: Springer Publishing.

  28. Booth, T. E. (2006). Power iteration method for the several largest eigenvalues and eigenfunctions. Nucl. Sci. Eng., 154(1), 48–62.

  29. Ng, R. T., & Han, J. W. (1994). Efficient and effective clustering methods for spatial data mining. In Proceedings of the VLDB 1994 (pp. 144–155). Birmingham: IEEE Comput Soc.

  30. Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Waveform recognition problem. In Classification and regression trees (pp. 64–66). Belmont: Wadsworth International Group.

  31. Chen, L. F., Ye, Y. F., & Jiang, Q. S. (2008). A new centroid-based classifier for text categorization. In Proceedings of the IEEE 22nd International Conference on Advanced Information Networking and Applications (pp. 1217–1222). Birmingham: IEEE Comput Soc.


Acknowledgements

We would like to thank Editage (www.editage.com) for English language editing and publication support.

Funding

This work is partly supported by the National Nature Science Foundation of China (No. 61502475) and the Importation and Development of High-Caliber Talents Project of the Beijing Municipal Institutions (No. CIT & TCD201504039).

Authors' contributions

WL has conducted the research, analyzed the data, and authored the paper. GW has performed the overall design, providing new methods or models, and has written/revised the paper. All authors read and approved the final manuscript.


Availability of data and materials

The dataset supporting the conclusions of this article is available in the UCI database, http://archive.ics.uci.edu/ml/datasets/Waveform+Database+Generator+%28Version+2%29.

Corresponding author

Correspondence to Gongming Wang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.



Cite this article

Li, W., Wang, G. & Li, K. Clustering algorithm for audio signals based on the sequential Psim matrix and Tabu Search. J AUDIO SPEECH MUSIC PROC. 2017, 26 (2017). https://doi.org/10.1186/s13636-017-0123-3
