An imperceptible and robust audio watermarking algorithm
EURASIP Journal on Audio, Speech, and Music Processing volume 2014, Article number: 37 (2014)
Abstract
In this paper, we propose a semi-blind, imperceptible, and robust digital audio watermarking algorithm. The proposed algorithm is based on cascading two well-known transforms: the discrete wavelet transform and the singular value decomposition. The two transforms provide different, but complementary, levels of robustness against watermarking attacks. The uniqueness of the proposed algorithm is twofold: the distributed formation of the wavelet coefficient matrix and the selection of the off-diagonal positions of the singular value matrix for embedding watermark bits. Imperceptibility, robustness, and high data payload of the proposed algorithm are demonstrated using different musical clips.
1 Introduction
The recent advancements of digital audio technology have increased the ease with which audio files are stored, transmitted, and reproduced. However, along with such conveniences come new risks such as copyright violation. Conventional encryption algorithms permit only authorized users to access encrypted digital data; however, once decrypted, there is no way to prohibit illegal copying and distribution of the data [1]. A promising solution to the copyright violation problem is to apply audio watermarking in which audio files are marked with secret, robust, and imperceptible watermarks to achieve copyright protection [2]–[5]. Indeed, a digital watermark is a good deterrent to illicit copying and dissemination of copyrighted audio since it can provide evidence of copyright infringements after the copyright violation has occurred.
Audio watermarking techniques which are used for copyright protection of digital audio signals must satisfy two main requirements: imperceptibility and robustness [6]. Imperceptibility refers to the condition that the embedded watermark should not produce audible distortion to the sound quality of the original audio. That is, the watermarked version of the audio signal must be indistinguishable from the original audio signal. On the other hand, robustness ensures the resistance of the watermark against removal or degradation. The watermark should survive malicious attacks such as random cropping and noise adding. Some watermarking applications may demand additional requirements such as high data payload and low computational time of the watermarking algorithm [3]. In practice, there exists a fundamental tradeoff between the different watermarking requirements.
Audio watermarking can be carried out in the time domain or the transform domain of the audio signal. Time-domain techniques based on least significant bit substitution and echo hiding are found extensively in the literature [7]–[12]. In general, time-domain audio watermarking techniques are relatively easy to implement and require few computing resources. However, they are less robust than transform-domain techniques which employ the human perceptual properties and frequency masking characteristics of the human auditory system [13]. Popular transforms that have been widely used in digital watermarking include the discrete Fourier transform (DFT), the discrete cosine transform (DCT), the discrete wavelet transform (DWT), and the singular value decomposition (SVD) [14]–[20].
It has been reported recently that imperceptible and robust audio watermarking can be achieved by applying a cascade of two different transforms on the original audio signal. Being different, the cascaded transforms may provide different, but complementary, levels of robustness against the same attack. Many audio watermarking techniques based on hybrid transforms have been proposed in the literature. These techniques include but are not limited to DWT-DCT [21], DWT-SVD [22], and SVD-STFT [23].
Several hybrid algorithms based on the SVD transform have been recently proposed in the literature. In the algorithm proposed by [23], the audio signal is first converted into a matrix form using the short-time Fourier transform (STFT), the SVD transform is then applied on the matrix, and finally embedding is carried out by adaptively modifying the SVD coefficients with watermark bits. In the hybrid algorithm proposed by [24], the audio signal is partitioned into blocks, and the watermark bits are embedded using dither modulation quantization of the singular values of the blocks. In [22], an audio watermarking algorithm is proposed in which watermark embedding and extraction procedures are based on the quantization of the norms of the singular values of audio blocks. The same authors proposed in [25] a hybrid algorithm in which watermark bits are embedded by applying quantization index modulation (QIM) on the singular values of wavelet-domain blocks. All of the above-mentioned SVD-based hybrid algorithms employ some sort of quantization to embed watermark bits. Although quantization is simple, an acceptable level of robustness against noise and filtering attacks may not always be achieved.
In this paper, we propose a semi-blind hybrid audio watermarking algorithm based on the DWT and SVD transforms. In the proposed algorithm, the audio signal is sampled, partitioned into short audio segments called frames, and a four-level DWT decomposition is applied on each frame. A matrix is then formed by arranging the wavelet coefficients of all detail subbands in a unique distributed pattern which scatters the watermark bits throughout the transformed frame to provide a high degree of robustness. The SVD operator is then applied on the matrix, and the watermark bits are embedded onto the off-diagonal zero elements of the S matrix produced by the SVD transform. Unlike the other SVD-based algorithms, the proposed algorithm leaves the nonzero singular values of the S matrix unchanged to ensure high watermarking imperceptibility.
The rest of the paper is organized as follows. In the next section, the DWT and SVD transforms are described, and their unique utilization in the proposed algorithm is outlined. The proposed audio DWT-SVD watermarking algorithm is described in detail in Section 3 and evaluated with respect to imperceptibility, robustness, and data payload in Section 4. Concluding remarks are given in Section 5.
2 Related work and contribution
The proposed algorithm is based on cascading the two transforms: DWT and SVD. The uniqueness of the proposed algorithm is twofold: the distributed formation of the DWT coefficient matrix and the selection of the off-diagonal positions of SVD's singular value matrix for embedding watermark bits. Description of the two transforms and their exact utilization in the proposed algorithm is given in this section.
2.1 DWT-based audio watermarking
DWT is a frequency transform capable of giving a time-frequency representation of any given signal [26]. Starting from an audio signal S, DWT produces two sets of coefficients: the approximation coefficients A_{1}, produced by passing S through a low-pass filter, and the detail coefficients D_{1}, produced by passing S through a high-pass filter. Depending on the application and the length of S, A_{1} can be further decomposed into more levels. Figure 1 illustrates a three-level DWT decomposition of the audio signal S.
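The decomposition described above can be sketched in a few lines of code. This is a minimal illustration only, assuming the Haar (db1) filter pair, which is the wavelet the experiments in Section 4 report using; the function names `haar_step` and `haar_dwt` are hypothetical, and the sign convention of the detail coefficients is an assumption.

```python
import math

def haar_step(x):
    """One DWT level with Haar filters: return (approximation, detail)."""
    r = math.sqrt(2.0)
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    approx = [(a + b) / r for a, b in pairs]  # low-pass branch, downsampled by 2
    detail = [(a - b) / r for a, b in pairs]  # high-pass branch, downsampled by 2
    return approx, detail

def haar_dwt(signal, levels):
    """Return (A_levels, [D_1, ..., D_levels]) for a multi-level decomposition."""
    approx, details = list(signal), []
    for _ in range(levels):
        # Only the approximation band is decomposed further at each level.
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

# Three-level decomposition of a toy eight-sample signal.
a3, (d1, d2, d3) = haar_dwt([1, 2, 3, 4, 5, 6, 7, 8], levels=3)
```

Each level halves the number of coefficients, so an eight-sample signal yields detail subbands of lengths 4, 2, and 1 plus a single approximation coefficient. Because the Haar filters are orthonormal, the total energy of the coefficients equals the energy of the input.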
Many DWT-based audio watermarking algorithms can be found in the literature. Many variations among the different algorithms exist; however, the main variation is in the subband chosen for embedding the watermark bits. In [27]–[29], the approximation subband is used for embedding the watermark bits, while in most algorithms, only one detail subband is used to embed the watermark bits [30]–[36]. Claims of good imperceptibility and robustness have been reported using the two embedding approaches.
In this paper, watermark bits are not embedded in one subband only; rather, the bits are distributed among all multiresolution detail subbands. For a three-level DWT decomposition, this is done by forming a matrix of the detail subbands (D_{1}, D_{2}, and D_{3}) as shown in Figure 2. This matrix formation allows for better scattering of the watermark bits throughout the subbands, leading to a higher degree of robustness. The resultant DWT matrix is processed by the SVD transform to embed the watermark bits, as will be explained in the next subsection.
2.2 SVD-based audio watermarking
The SVD of matrix A is defined by the operation A = U Σ V^{T}, as shown in Figure 3. The nonzero diagonal entries of Σ are called the singular values of A and are assumed to be arranged in decreasing order σ_{i} > σ_{i+1}. The columns of the U matrix are called the left singular vectors, while the columns of the V matrix are called the right singular vectors of A.
The SVD transform has been used in several audio watermarking algorithms [22]–[25],[37]–[39]. The algorithms varied in the way the singular values were used in the watermarking process. For example, in [37], the single largest singular value, σ_{11}, was quantized and used to embed the watermark, whereas in [38], the encrypted watermark signal was added to all singular values of matrix Σ. In [22],[24],[25], the norms of all singular values were quantized and used in the watermark embedding process.
In our proposed algorithm, matrix A represents the detail subbands matrix shown in Figure 2, which is produced after applying DWT on the original audio signal. After applying the SVD operator on the DWT matrix, watermark bits are embedded onto the off-diagonal zero elements of the S matrix, while the diagonal singular values of the matrix remain unchanged. This embedding procedure eliminates the possibility of any distortion of the singular values, which may affect imperceptibility and watermarking quality. Related preliminary works have been published by the author and others in [40],[41]. The algorithms reported in those papers have low capacity as they embed the watermark bits in the single largest singular value, σ_{11}, and not in the off-diagonal zero elements of the Σ matrix, as is the case in the proposed algorithm.
3 Proposed DWT-SVD audio watermarking algorithm
In this section, we describe the proposed DWT-SVD algorithm. The algorithm consists of two procedures: a watermark embedding procedure and a watermark extraction procedure.
3.1 Watermark embedding procedure
The watermark embedding procedure transforms the audio signal using DWT and SVD, embeds the bits of a binary image watermark in appropriate locations in the transformed signal, and finally produces a watermarked audio signal by performing inverse SVD and DWT operations. The procedure is illustrated in the block diagram shown in Figure 4 and described thereafter.
Step 1: Convert the binary image watermark into a one-dimensional vector b of length M × N. A watermark bit b_{i} may take one of two values: 0 or 1.
Step 2: Sample the original audio signal at a sampling rate of 44,100 samples per second and partition the sampled file into N frames. The optimal frame length will be determined experimentally so as to increase data payload.
Step 3: Perform a four-level DWT transformation on each frame. This operation produces five multiresolution subbands: D_{1}, D_{2}, D_{3}, D_{4}, and A_{4}. The D subbands are called ‘detail subbands’ and the A_{4} subband is called the ‘approximation subband’. The five subbands are arranged in the vector shown in Figure 5.
Step 4: Arrange the four detail subbands D_{1}, D_{2}, D_{3}, and D_{4} in a matrix D as shown in Figure 6. The matrix formation is done this way to distribute the watermark bits throughout the multiresolution subbands D_{1}, D_{2}, D_{3}, and D_{4}. Forming the matrix with the Ds, rather than using A alone, is done to allow for matrix formation and subsequent application of the matrix-based SVD operator. The size of matrix D is 4 × (L/2), where L refers to the length of the frame.
Step 5: Decompose matrix D using the SVD operator. This operation produces the orthonormal matrices U and V^{T} and the diagonal matrix Σ as follows:
D = U Σ V^{T}    (2)
where the diagonal matrix Σ has the same size as the D matrix. The diagonal entries σ_{ii} correspond to the singular values of the D matrix. However, for embedding purposes, only the leading 4 × 4 subset of matrix Σ, assigned the name S hereafter, is used: S = diag(σ_{11}, σ_{22}, σ_{33}, σ_{44}). This is a tradeoff between imperceptibility (inaudibility) and payload (embedding capacity). That is, using the whole Σ matrix for embedding would increase embedding capacity but would severely degrade the imperceptibility (inaudibility) of the watermarked audio signal.
Step 6: Arrange 12 bits of the original watermark bit vector b into a scaled 4 × 4 watermark matrix W. The watermark bits must be located in the 12 nondiagonal positions of the matrix; the four diagonal entries of W are zero.
As an example, the 12-bit watermark pattern 1010 0011 0101 must be converted to this matrix form before the actual embedding is carried out.
Step 7: Embed the bits of watermark matrix W into matrix S according to the following ‘additive-embedding’ formula:
S_{w} = S + α W    (6)
where S_{w} is the watermarked S matrix, and α is the watermark intensity which should be chosen to tune the tradeoff between robustness and imperceptibility. With this type of embedding, the singular values of D remain unchanged, and thus, audible distortion caused by modifying the singular values is avoided.
Step 8: Decompose the new watermarked matrix S_{w} using the SVD operator. This operation produces two new orthonormal matrices, U_{1} and V_{1}^{T}, and a new diagonal matrix S_{1} as follows:
S_{w} = U_{1} S_{1} V_{1}^{T}    (7)
The matrices U_{1} and V_{1}^{T} are stored for later use in the extraction process. This makes the proposed watermarking algorithm semi-blind, as the whole original audio frame is not required in the extraction process.
Step 9: Apply the inverse SVD operation using the U and V^{T} matrices, which were unchanged, and the S_{1} matrix, which has been modified according to Equation (6). The D_{w} matrix given below is the watermarked version of the D matrix given in Equation (2).
D_{w} = U Σ′ V^{T}    (8)
where matrix Σ′ is the original Σ matrix with the S submatrix replaced by the S_{1} submatrix.
Step 10: Apply the inverse DWT operation on the D_{ w } matrix to obtain the watermarked audio frame.
Step 11: Repeat all previous steps on each frame. The overall watermarked audio signal is obtained by concatenating the watermarked frames obtained in the previous steps.
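Steps 5 to 9 of the embedding procedure can be sketched for a single frame as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses NumPy's `numpy.linalg.svd`, a random matrix stands in for a real DWT detail matrix D, the 12 bits are placed in the nondiagonal positions in row-major order (the source does not state the ordering here), and the helper names `watermark_matrix` and `embed_frame` are hypothetical.

```python
import numpy as np

OFF_DIAGONAL = ~np.eye(4, dtype=bool)  # mask of the 12 nondiagonal positions

def watermark_matrix(bits):
    """Place 12 bits in the nondiagonal positions (row-major order is an assumption)."""
    W = np.zeros((4, 4))
    W[OFF_DIAGONAL] = bits
    return W

def embed_frame(D, W, alpha):
    """Steps 5 to 9 for one 4 x K detail matrix D (K >= 4)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)  # D = U diag(s) Vt
    S = np.diag(s)                    # 4 x 4 block holding the singular values
    Sw = S + alpha * W                # additive embedding, Equation (6)
    U1, s1, V1t = np.linalg.svd(Sw)   # Sw = U1 diag(s1) V1t, Equation (7)
    Dw = U @ np.diag(s1) @ Vt         # S replaced by S1, then inverse SVD
    return Dw, Sw, U1, s1, V1t

rng = np.random.default_rng(0)
D = rng.standard_normal((4, 1024))    # stand-in for a real DWT detail matrix
W = watermark_matrix([1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1])
Dw, Sw, U1, s1, V1t = embed_frame(D, W, alpha=3.0)
```

The reduced SVD (`full_matrices=False`) drops the all-zero columns of Σ, so multiplying by the 4 × 4 diagonal block is equivalent to using the full Σ′. In this sketch, `U1` and `V1t` are the matrices that would be stored per frame for the extraction procedure.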
3.2 Watermark extraction procedure
Given the watermarked audio signal and the corresponding U_{1} and V_{1} matrices that were computed in Equation (7) and stored for each frame, the embedded watermark can be extracted according to the procedure outlined in Figure 7 and described in detail in the following steps:
Step 1: Obtain the matrix S_{1}′ from each frame of the watermarked audio signal following the general steps presented in Figure 7.
Step 2: Multiply matrix S_{1}′ by U_{1} and V_{1}^{T}, which were computed in the watermark embedding procedure and stored for use in the extraction process. This results in the following matrix:
S_{w}′ = U_{1} S_{1}′ V_{1}^{T}    (9)
Step 3: Extract the 12 watermark bits from each frame by examining the nondiagonal values of matrix S_{w}′. It has been experimentally noticed that there are two groups of nondiagonal values that are extremely distinct. The values at the positions where a 0 bit has been embedded tend to be much smaller than the values at the positions where a 1 bit has been embedded. Thus, to determine the watermark bit W(n), the average of the nondiagonal values, denoted avg, is first computed; then, for each nondiagonal value S_{w}′_{ij}, W(n) is extracted according to the following formula:
W(n) = 1 if S_{w}′_{ij} > avg, and W(n) = 0 otherwise    (10)
Step 4: Construct the original watermark image by assembling the bits extracted from all frames.
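The thresholding rule of Step 3 can be sketched as below. This is a minimal illustration: the function name `extract_bits` is hypothetical, the nondiagonal values are read in row-major order (an assumption), and the matrix `sw_prime` is made-up data chosen to resemble a recovered S_{w}′ with watermark intensity near 3, not a measured result.

```python
def extract_bits(sw_prime):
    """Threshold the 12 nondiagonal values of a 4 x 4 matrix against their average."""
    values = [sw_prime[i][j] for i in range(4) for j in range(4) if i != j]
    avg = sum(values) / len(values)
    return [1 if v > avg else 0 for v in values]

# Hypothetical recovered matrix: large diagonal, nondiagonal values near 0 or near 3.
sw_prime = [
    [31.0, 2.9, 0.1, 3.1],
    [0.2, 24.0, -0.1, 0.0],
    [3.0, 2.8, 17.0, 0.1],
    [3.2, 0.0, 2.9, 9.0],
]
bits = extract_bits(sw_prime)
```

Because the threshold is the average of the frame's own nondiagonal values, this sketch relies on each frame carrying a mix of 0 and 1 bits; a frame whose 12 bits are all equal would not be separated reliably by the average alone.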
4 Experimental results
Different types of audio signals have different perceptual properties, and therefore, watermarking performance may vary from one type to another. Accordingly, we evaluated the performance of the proposed algorithm using three mono audio signals representing pop music, instrumental music, and speech. Each signal has a duration of 11 s and was sampled at 44.1 kHz and quantized to 16 bits per sample. The watermark used for experimentation is the 12 × 10 binary image shown in Figure 8. The watermark is embedded repeatedly throughout the sampled signal, such that one single watermark image is embedded in a sequence of ten frames.
Four-level DWT decomposition is applied on each frame using the Daubechies wavelet (db1). As observed experimentally, using other wavelet types has little effect on the performance. Values ranging from 1 to 5 were used for the watermark intensity α. However, the results reported in this paper were obtained when the intensity value was set to 3. In what follows, we present performance results of the proposed algorithm with respect to three metrics: imperceptibility, robustness, and data payload [42],[43].
4.1 Imperceptibility results
Imperceptibility ensures that the quality of the signal is not perceivably distorted and the watermark is imperceptible to listeners. To measure imperceptibility, different authors use different metrics; however, the most commonly used metrics are signal-to-noise ratio (SNR) and listening tests.
4.1.1 Signal-to-noise ratio
SNR is a statistical difference metric which is used to measure the similarity between the undistorted original audio signal and the distorted watermarked audio signal. The SNR computation is done according to Equation (11), where A corresponds to the original signal, A′ corresponds to the watermarked signal, and n indexes the samples.
SNR_{dB} = 10 log_{10} [ ∑_{n} A_{n}^{2} / ∑_{n} (A_{n} − A′_{n})^{2} ]    (11)
We obtained the SNR_{dB} values given in Table 1. As shown in the table, the values are much higher than the 20 dB minimum requirement set by the International Federation of the Phonographic Industry (IFPI) [13]. Although SNR is a simple metric to measure the noise introduced by the embedded watermark and can give a general idea of imperceptibility, it does not take into account the specific characteristics of the human auditory system.
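A short sketch of the SNR computation follows, assuming the usual definition: the energy of the original signal divided by the energy of the difference between the original and watermarked signals, expressed in decibels. The function name `snr_db` and the four-sample signals are illustrative only.

```python
import math

def snr_db(original, watermarked):
    """SNR in dB between an original signal and its watermarked version."""
    signal_energy = sum(a * a for a in original)
    noise_energy = sum((a - b) ** 2 for a, b in zip(original, watermarked))
    if noise_energy == 0:
        return math.inf  # identical signals: no watermark noise at all
    return 10.0 * math.log10(signal_energy / noise_energy)

# Toy example: one sample out of four is perturbed by 0.1.
value = snr_db([1.0, -1.0, 1.0, -1.0], [1.1, -1.0, 1.0, -1.0])
```

In this toy case the energy ratio is 400, which corresponds to roughly 26 dB, above the 20 dB threshold mentioned above.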
4.1.2 Listening tests
For better evaluation of imperceptibility, subjective and objective listening tests are used. Subjective difference grade (SDG) listening tests are implemented by human listeners, and objective difference grade (ODG) listening tests are implemented by software packages incorporating the human auditory system. The two listening tests use the 5-grade scale shown in Table 2.
We employed a blind subjective listening test to estimate the audio quality of the watermarked signals. The listening test was performed repeatedly with five adults in a listening room equipped with audio testing and recording devices. A computer system running special software was also used for computer-controlled presentation of the watermarked signals to the listeners and for recording their responses. Each person was presented with ten pairs of signals (original and watermarked) and then asked to give performance scores using the 5-grade impairment scale given in Table 2. The five persons listened to each pair of signals ten times and gave an average SDG value for each pair. The average grade for each pair submitted by all persons is considered the final grade for that particular pair of signals. The SDG averages obtained for the subjective listening tests are 4.67, 4.72, and 4.81 for the pop, instrumental, and speech signals, respectively. These values clearly indicate that imperceptibility has been achieved by the proposed audio watermarking algorithm.
The ODG scores were also computed using the Perceptual Evaluation of Audio Quality (PEAQ) standard. The standard is specified in ITU-R BS.1387 [44] and implemented by the software tool EAQUAL [45]. The ODG values we obtained are −0.67, −0.71, and −0.91 for the pop, instrumental, and speech signals, respectively. These results agree with those obtained by the subjective listening tests. The measured SDG and ODG values are given in Table 3.
Comparing imperceptibility results with results achieved by other algorithms is not straightforward, since different authors use different evaluation metrics. Moreover, subjective evaluation is relative and may differ from one listener to another. This may explain why imperceptibility results are hardly compared in the literature. Nonetheless, and for the sake of completeness, we present in Table 4 some imperceptibility results achieved by recently proposed algorithms. It is important to note that the values in the table are average values taken over different audio types.
4.2 Robustness results
Watermarked audio signals may undergo signal processing operations such as linear filtering and lossy compression, among many others [46],[47]. Although these operations may not affect the perceived quality of the host signal, they may corrupt the watermark embedded within the signal. Two sets of attacks were performed to test the robustness of our proposed algorithm. The first set includes the following common signal processing operations: Gaussian noise addition, requantization, resampling, MP3 compression, low-pass filtering, and echo addition. The other set is the Stirmark® audio watermarking benchmark which includes a whole set of add, modify, and filter attacks [48],[49].
Robustness is measured using the bit error rate (BER) metric since the watermark used in the simulation is a binary image. BER is defined as the ratio of incorrectly extracted bits to the total number of embedded bits, as expressed in Equation (12).
BER = (1 / l) ∑_{n} (W_{n} ⊕ W′_{n})    (12)
where l is the watermark length, W_{n} is the nth bit of the embedded watermark, W′_{n} is the nth bit of the extracted watermark, and ⊕ denotes the exclusive-OR operation.
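The BER definition above translates directly into code. The sketch below returns the ratio as a fraction between 0 and 1 (whether the tables report it as a fraction or a percentage is not assumed here); the function name `bit_error_rate` is hypothetical.

```python
def bit_error_rate(embedded, extracted):
    """Fraction of extracted bits that differ from the embedded bits."""
    if len(embedded) != len(extracted):
        raise ValueError("watermarks must have the same length")
    errors = sum(w != w_prime for w, w_prime in zip(embedded, extracted))
    return errors / len(embedded)

# One wrong bit out of four gives a BER of 0.25.
ber = bit_error_rate([1, 0, 1, 0], [1, 1, 1, 0])
```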
4.2.1 Common signal processing operations
The following common signal processing attacks were applied to test the robustness of the proposed algorithm:

1. Additive white Gaussian noise: White Gaussian noise is added to corrupt the watermarked signal to SNR levels of 15 dB and 20 dB.
2. Requantization: The 16-bit watermarked audio signal is requantized to 8 bits per sample and 24 bits per sample.
3. Resampling: The watermarked signal, originally sampled at 44.1 kHz, is downsampled to 22.05, 11.025, and 6 kHz.
4. MP3 compression: The watermarked audio signal is compressed at different bit rates: 128, 96, 64, and 32 kbps.
5. Low-pass, high-pass, and band-pass filtering: Filtering at different cutoff frequencies is applied to the watermarked signal.
6. Echo addition: An echo signal with a delay of 100 ms and different decay rates is added to the watermarked signal.
The BER values we obtained after applying the common signal processing operations are listed in Table 5. As shown in the table, the BER values, which have been computed over the whole period of the test signals, are very small in magnitude and thus reflect the robustness of the proposed algorithm against common signal operations. Maximum robustness has been achieved against the Gaussian noise attacks, requantization, and MP3 compression at 128 kbps. BER values due to resampling increased as the watermarked signal was downsampled to lower frequencies. The same observation is also seen for the MP3 compression attack, where higher BER values were obtained as the compression rate of the watermarked signal was increased. The watermarked signal is also robust against filtering operations as shown in the corresponding small BER values. The least robustness is seen against the echo addition operation as indicated by the relatively higher BER values.
Finally, we compared the robustness of the proposed algorithm with the robustness of recently published transform-based algorithms. It is clear from Table 6 that the proposed algorithm performs better when compared with the other algorithms. It is important to note that the values in Table 6 represent average values taken over different audio types.
4.2.2 Stirmark® attacks
To further evaluate the robustness of the proposed algorithm, we implemented a set of attacks defined by the Stirmark® benchmark for audio [48],[49]. The attacks are comprehensive as they include add, filter, and modification attacks. The results are recorded in Table 7 alongside snapshots of the watermarks extracted from the watermarked signals. It is noted in Table 7 that the BER values due to most of the attacks are zero. It is also noted that the proposed algorithm performs comparably well with regard to the three audio signal types.
The Stirmark® attacks have been used by several transform-based algorithms. Table 8 compares the BER results we obtained and the BER results reported in four relevant references. As shown in the table, the results are comparable among the different transform-based references with regard to most of the Stirmark® attacks. It is instructive to note here that the Stirmark® package can be used to simulate composite attacks, where two or more attacks are tested in one run. Such composite attacks may give a better comparison between the different algorithms; however, they are rarely reported in the literature.
4.3 Data payload results
Data payload is defined as the data embedding capacity of the algorithm and is measured as the number of bits embedded within one second of the audio signal (bps). In the proposed algorithm, the audio signal is segmented into frames, with each frame having a fixed embedding capacity of 12 watermark bits, as shown in matrix W given in (5). Therefore, the payload is computed by multiplying the number of frames per second by the bit capacity of the frame. The number of frames per second depends on the frame length and is computed by dividing the 44.1 kHz sampling rate by the frame length. Table 9 shows the data payload as a function of the frame length.
As shown in the table, the payload increases as the frame length decreases. However, short-length frames degrade performance and result in unacceptable imperceptibility and robustness results. A frame length of 2,048 samples has been fixed and used to evaluate the imperceptibility and robustness of the proposed algorithm.
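The payload arithmetic described above can be written out as a small sketch. The function name `payload_bps` is hypothetical, and the frame rate is kept fractional rather than rounded, which is an assumption about how the figures are computed.

```python
def payload_bps(frame_len, sample_rate=44100, bits_per_frame=12):
    """Payload in bits per second: frames per second times bits per frame."""
    frames_per_second = sample_rate / frame_len
    return frames_per_second * bits_per_frame

# Frame length of 2,048 samples: 44100 / 2048 frames per second, 12 bits each.
payload_2048 = payload_bps(2048)
```

For a frame length of 2,048 samples this gives about 21.5 frames per second and therefore roughly 258 bps; halving the frame length doubles the payload, which matches the trend described in the text.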
The data payload we obtained is higher than the payload rates obtained by other recently proposed algorithms. Table 10 lists the payload of different transform-based audio watermarking algorithms.
5 Conclusions
In this paper, we proposed an imperceptible and robust audio watermarking technique based on cascading two well-known transforms: the discrete wavelet transform and the singular value decomposition. The two transforms were used in a unique way that scatters the watermark bits throughout the transformed frame in order to achieve high degrees of imperceptibility and robustness. High data payloads were also achieved. The simulation results obtained were in total agreement with the requirements set by IFPI for audio watermarking, thus demonstrating the effectiveness of the proposed algorithm.
Future research will focus on enhancing the proposed algorithm to resist desynchronization attacks such as random cropping, pitch shifting, amplitude variation, time-scale modification, and jittering. Methods proposed in the literature that counter desynchronization attacks include the all-list-search method, the combination of spread spectrum and spread spectrum code method, the self-synchronization strategy method, and the synchronization code method. Our approach will be based on embedding synchronization codes with the watermark bits so that the hidden data have the self-synchronization capability.
References
 1.
Furht B, Kirovski D: Encryption and Authentications: Techniques and Applications. Auerbach, USA; 2006.
 2.
Arnold M, Wolthusen S, Schmucker M: Techniques and applications of digital watermarking and content protection. Artech House. In Psychoacoustics: Facts and Models. Edited by: Zwicker E, Fastl H. SpringerVerlag, Massachusetts, USA; 2003.
 3.
Acevedo A: Digital Watermarking for Audio Data in Techniques and Applications of Digital Watermarking and Content Protection. Artech House, USA; 2003.
 4.
Xu C, Wu J, Sun Q, Xin K: Applications of watermarking technology in audio signals. J. Audio Eng. Soc. 1999, 47(10):805812.
 5.
M Swanson, B Zhu, A Tewfic, L Boney, Current state of the art, challenges and future direction for audio watermarking, in Proceeding of the IEEE International Conference on Multimedia Computing and Systems (1999), pp. 19–24
 6.
M Arnold, Audio watermarking: Features, applications and algorithms, in Proceeding of the IEEE International Conference on Multimedia and Expo (2000), pp. 1013–1016
 7.
Bassia P, Pitas I: Robust audio watermarking in the time domain. IEEE Trans. Multimed. 2001, 3(2):232241. 10.1109/6046.923822
 8.
Lie WN, Chang LC: Robust and highquality timedomain audio watermarking based on lowfrequency amplitude modification. IEEE Trans. Multimed. 2006, 8(1):4659. 10.1109/TMM.2005.861292
 9.
Dumitrescu S, Wu W, Wang Z: Detection of LSB steganography via sample pair analysis. IEEE Trans. Signal Process. 2003, 51(7):19952007. 10.1109/TSP.2003.812753
 10.
Chen O, Wu W: Highly robust, secure, and perceptualquality echo hiding scheme. IEEE Trans. Speech Audio Process. 2008, 16(3):629638. 10.1109/TASL.2007.913022
 11.
Ko BS, Nishimura R, Suzuki Y: Timespread echo method for digital audio watermarking. IEEE Trans. Multimed. 2005, 7(2):212221. 10.1109/TMM.2005.843366
 12.
Kim H, Choi Y: A novel echohiding scheme with backward and forward kernels. IEEE Trans. Circ. Syst. Video Tech. 2003, 13(8):885889. 10.1109/TCSVT.2003.815950
 13.
Katzenbeisser S, Petitcloas F: Information Hiding Techniques for Steganography and Digital Watermarking. Artech House, USA; 2000.
 14.
Fallahpour M, PerezMegias D: High capacity audio watermarking using FFT amplitude interpolation. IEICE Electron. Express 2009, 6(14):10571063. 10.1587/elex.6.1057
 15.
Fan M, Wang H: Chaosbased discrete fractional sine transform domain audio watermarking scheme. Comput. Electr. Eng. 2009, 35(3):506516. 10.1016/j.compeleceng.2008.12.004
 16.
Yeo I, Kim H: Modified patchwork algorithm: a novel audio watermarking scheme. IEEE Trans. Speech Audio Process. 2003, 11(4):381386. 10.1109/TSA.2003.812145
 17.
Hsieh M, Tseng D, Huang Y: Hiding digital watermarks using multiresolution wavelet transform. IEEE Trans. Ind. Electron. 2001, 48(5):875882. 10.1109/41.954550
 18.
Chang C, Shen W, Wang H: Using counterpropagation neural network for robust digital audio watermarking in DWT domain. Proc. IEEE Int. Conf. Syst. Man. Cybern. 2006, 2: 12141219.
 19.
Liu R, Tan T: An SVDbased watermarking scheme for protecting rightful ownership. IEEE Trans. Multimed. 2002, 4(1):121128. 10.1109/6046.985560
 20.
Mohammad A, AlHaj A, Shaltaf S: An improved SVDbased watermarking scheme for protecting rightful ownership. Signal Process. J. 2008, 88(9):21582180. 10.1016/j.sigpro.2008.02.015
 21.
Wang X, Zhao H: A novel synchronization invariant audio watermarking scheme based on DWT and DCT. IEEE Trans. Signal Process. 2006, 54(12):48354840. 10.1109/TSP.2006.881258
 22.
Bhat KV, Sengupta I, Das A: A new audio watermarking scheme based on singular value decomposition and quantization. Circ. Syst. Signal Process. 2011, 30: 915927. 10.1007/s0003401092558
 23.
H Ozer, B Sankur, N Memon, An SVDbased audio watermarking technique, in ACM Workshop on Multimedia and Security (2005), pp. 51–56
 24.
Bhat KV, Sengupta I, Das A: An audio watermarking scheme using singular value decomposition and dithermodulation quantization. Multimed. Tool. Appl. 2011, 52: 369383. 10.1007/s1104201005151
 25.
Bhat K, Sengupta I, Das A: An adaptive audio watermarking based on the singular value decomposition in the wavelet domain. Digit. Signal Process. 2010, 20: 15471558. 10.1016/j.dsp.2010.02.006
 26.
Strang G, Nguyen T: Wavelets and Filter Banks. WellesleyCambridge Press, Wellesley, MA; 1996.
 27.
Xiang S: Audio watermarking robust against D/A and A/D conversions. EURASIP J Adv. Signal Process. 2011, 3: 114.
 28.
Peng H, Wang J, Zhang Z: Audio watermarking scheme robust against desynchronization attacks based on kernel clustering. Multimed. Tool. Appl. 2011, 3: 114.
 29.
Wu S, Huang J, Huang D, Shi Y: Efficiently selfsynchronized audio watermarking for assured audio data transmission. IEEE Trans. Broadcast. 2005, 51(1):6976. 10.1109/TBC.2004.838265
 30.
Fallahpour M, Megias D: High capacity audio watermarking using the high frequency band of the wavelet domain. Multimed. Tool. Appl. 2011, 52: 485–498. 10.1007/s11042-010-0495-1
 31.
Swanson M, Zhu B, Tewfik A, Boney L: Robust audio watermarking using perceptual masking. Signal Process. 1998, 66(3):337–355. 10.1016/S0165-1684(98)00014-0
 32.
X Li, M Zhang, L Sun, Adaptive audio watermarking algorithm based on SNR in wavelet domain, in International Conference on Natural Language Processing and Knowledge Engineering (2003), pp. 287–292
 33.
Wu Y, Shimamoto S: A study on DWT-based digital audio watermarking for mobile ad hoc networks. International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing 2006. 5–7 June
 34.
Erçelebi E, Batakçı L: Audio watermarking scheme based on embedding strategy in low frequency components with a binary image. Digit. Signal Process. 2009, 19(2):265–277. 10.1016/j.dsp.2008.11.007
 35.
Wei L, Xue X: An audio watermarking technique that is robust against random cropping. Comput. Music. J. 2003, 27(4):58–68. 10.1162/014892603322730505
 36.
X Li, H Yu, Transparent and robust audio data hiding in subband domain, in Proceedings of the International Conference on Information Technology: Coding and Computing (2000), pp. 74–79
 37.
Chang C, Tsai P, Lin C: SVD-based digital image watermarking scheme. Pattern Recogn. Lett. 2005, 26(10):1577–1586. 10.1016/j.patrec.2005.01.004
 38.
Abd El-Samie F: An efficient singular value decomposition algorithm for digital audio watermarking. Int. J. Speech Tech. 2009, 21: 27–45. 10.1007/s10772-009-9056-2
 39.
Basso A, Bergadano F, Cavagnino D, Pomponiu V, Vernone A: A novel block-based watermarking scheme using the SVD transform. Algorithms 2009, 2(1):46–75. 10.3390/a2010046
 40.
Al-Haj A, Mohammad A: Digital audio watermarking based on the discrete wavelets transform and singular value decomposition. Eur. J. Sci. Res. 2010, 39(1):6–21.
 41.
A Al-Haj, C Twal, A Mohammad, Hybrid DWT-SVD audio watermarking, in Proceedings of the International Conference on Digital Information Management (2010), pp. 525–529
 42.
M Sehirli, F Gurgen, S Ikizoglu, Performance evaluation of digital audio watermarking techniques designed in time, frequency and cepstrum domains, in International Conference on Advances in Information Systems (2004), pp. 430–440
 43.
J Gordy, L Bruton, Performance evaluation of digital audio watermarking algorithms, in The 43rd IEEE Midwest Symposium on Circuits and Systems (2000), pp. 456–459
 44.
Thiede T, Treurniet WC, Bitto R, Schmidmer C, Sporer T, Beerends JG, Colomes C, Keyhl M, Stoll G, Brandenburg K, Feiten B: PEAQ – the ITU standard for objective measurement of perceived audio quality. J. Audio Eng. Soc. 2000, 48(1/2/3):3–29.
 45.
Lerch A: Zplane development, EAQUAL Evaluate Audio QUALity, version:0.1.3alpha. 2002.
 46.
Voloshynovskiy S, Pereira S, Pun T: Attacks on digital watermarks: classification, estimation-based attacks, and benchmarks. Comm. Mag. 2001, 39(8):118–126. 10.1109/35.940053
 47.
M Arnold, Attacks on digital audio watermarks and countermeasures, in Proceedings of the IEEE International Conference on WEB Delivering of Music (2003), pp. 1–8
 48.
M Steinebach, F Petitcolas, F Raynal, J Dittmann, C Fontaine, S Seibel, N Fates, LC Ferri, Stirmark benchmark: audio watermarking attacks, in Proceedings of the International Conference on Information Technology: Coding and Computing (2001), pp. 49–54
 49.
Lang A: Stirmark benchmark for audio (SMBA): evaluation of watermarking schemes for audio [Online]. 2006.
 50.
Kalantari N, Akhaee M, Ahadi S, Feizi S, Amindavar H: Robust multiplicative patchwork method for audio watermarking. IEEE Trans. Audio Speech Lang. Process. 2009, 17(6):1133–1141. 10.1109/TASL.2009.2019259
 51.
Cox I, Kilian J, Leighton T, Shamoon T: Secure spread spectrum watermarking for multimedia. IEEE Trans. Image Process. 1997, 6(12):1673–1687. 10.1109/83.650120
Additional information
Competing interests
The author has no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Al-Haj, A. An imperceptible and robust audio watermarking algorithm. J AUDIO SPEECH MUSIC PROC. 2014, 37 (2014). https://doi.org/10.1186/s13636-014-0037-2
Keywords
 Audio watermarking
 Copyright protection
 Discrete wavelet transform
 Singular value decomposition
 Imperceptibility
 Robustness
 Data payload