Open Access
Comparative study of digital audio steganography techniques
© Djebbar et al.; licensee Springer. 2012
Received: 21 December 2011
Accepted: 20 June 2012
Published: 9 October 2012
The rapid spread of digital data usage in many real-life applications has created an urgent need for new and effective ways to ensure data security. Efficient secrecy can be achieved, at least in part, by implementing steganography techniques. Novel and versatile audio steganographic methods have been proposed. The goal of steganographic systems is to provide a secure and robust way to conceal a high rate of secret data. We focus in this paper on digital audio steganography, which has emerged as a prominent source of data hiding across novel telecommunication technologies such as covered voice-over-IP, audio conferencing, etc. The multitude of steganographic criteria has led to a great diversity in the design techniques of these systems. In this paper, we review current digital audio steganographic techniques and we evaluate their performance based on robustness, security and hiding capacity indicators. Another contribution of this paper is a robustness-based classification of steganographic models depending on their occurrence in the embedding process. A survey of major trends in audio steganography applications is also presented.
To minimize the difference between the cover- and the stego-medium, recent steganography techniques utilize natural limitations in human auditory and visual perception. Image- and video-based steganography relies on the limited ability of the human visual system to notice luminance variations, which are perceived only at levels greater than 1 in 240 across uniform grey levels, or 1 in 30 across random patterns. Audio-based steganography, in turn, exploits the masking effect property of the Human Auditory System (HAS), as explained later in this paper.
Various features influence the quality of audio steganographic methods. The importance and the impact of each feature depend on the application and the transmission environment. The most important properties include robustness to noise, to compression and to signal manipulation, as well as the security and the hiding capacity of hidden data. The robustness requirement is tightly coupled with the application, and is also the most challenging requirement to fulfill in a steganographic system when traded against data hiding capacity. Generally, robustness and capacity hardly coexist in the same steganographic system because of the tradeoff between these two criteria: increasing the robustness level decreases the data hiding capacity.
In this work, several audio steganography techniques are discussed, together with a thorough investigation of the use of audio files as a cover medium for secret communications. The present review paper builds on our previous work; our contributions are as follows:
We survey latest audio steganographic methods and reveal their strengths and weaknesses.
We propose a classification of the reviewed audio steganographic techniques relative to their occurrence in voice encoders.
We compare steganographic methods based on selected robustness criteria.
We evaluate the performance of the reviewed steganographic techniques.
The remainder of this paper is organized as follows: Section Motivation and background presents the motivations related to the use of audio signals as carriers, as well as some selected performance criteria used to assess hidden data tolerance to common signal manipulations. Section Audio Steganography Methods presents the reviewed steganography methods. Section Classification of audio steganography methods proposes a classification of existing audio steganographic techniques based on their occurrence instances in voice encoders. Evaluation and possible applications are presented in Sections Audio steganography evaluation and Applications and trends. Finally, conclusions and future work are presented in Section Conclusion.
Motivation and background
Audio file as a cover
The particular importance of hiding data in audio files results from the prevailing presence of audio signals as information carriers in human society. Prudent steganography practice assumes that the cover used to hide messages should not raise any suspicion among opponents. The availability and popularity of audio files make them good candidates to carry hidden information. In addition, most steganalysis efforts are directed towards digital images, leaving audio steganalysis relatively unexplored. Data hiding in audio files is especially challenging because of the sensitivity of the HAS. However, the HAS still tolerates common alterations in small differential ranges. For example, loud sounds tend to mask out quiet sounds. Additionally, some environmental distortions are so common that listeners ignore them in most cases. These properties have led researchers to explore the utilization of audio signals as carriers to hide data [4–9]. The alterations of audio signals for data embedding purposes may affect the quality of these signals. Assessing the tradeoffs between these alterations and the induced quality is discussed next.
Various parameters influence the quality of audio steganographic systems. Besides the amount of hidden data and its imperceptibility level, robustness against removal or destruction of embedded data remains the most critical property of a steganographic system. The robustness criteria are assessed through the survival of concealed data to noise, compression and manipulations of the audio signal (e.g., filtering, re-sampling, re-quantization). In this section, we discuss some selected comparison criteria between the cover- and the stego-signals. We focus only on those properties that have been evaluated and verified in the reviewed techniques. These properties are listed as follows:
Hiding rate: Measured in bps and refers to the amount of concealed data (in bits) within a cover audio signal, and correctly extracted.
Imperceptibility: This concept is based on the properties of the HAS and is measured through the perceptual evaluation of speech quality (PESQ, see note a). The hidden information is imperceptible if a listener is unable to distinguish between the cover- and the stego-audio signal. The PESQ test produces a value ranging from 4.5 down to 1. A PESQ value of 4.5 means that the measured speech has no distortion: it is exactly the same as the original. A value of 1 indicates the severest degradation. Another widely used measure is the level of distortion in the audio signal, captured through the segmental signal-to-noise ratio (SegSNR, see note b). It is important that the embedding process occurs without a significant degradation or loss of perceptual quality of the cover signal.
Amplification: This criterion results in increasing the magnitude of the audio signal which could alter the hidden data if a malicious attack is intended.
Filtering: Maliciously removes the hidden data by cutting off a selected part of the spectrum.
Re-quantization: This operation modifies the original quantization of the audio signal. For example, a 16-bit audio signal is quantized to 8 bits and back to 16 bits in an attempt to destroy the hidden data.
Re-sampling: Similarly to the above operation, this operation changes the sampling frequency of the audio signal to another one, e.g., a wideband audio signal sampled at 16 kHz is converted to 8 kHz and back to 16 kHz.
Noise addition: Adding noise, e.g., WGN (White Gaussian Noise), to the audio signal in an attempt to destroy the hidden data.
Encoding/Decoding: This operation reduces the amount of data by removing redundant or unnecessary information. Thus, a hidden message can be completely destroyed. This is also true if the audio file is converted into another format. MP3 compression, for example, changes a wave file to an MP3 file before it reaches the receiver.
Transcoding: The process of decoding the audio signal with a decoder that is different from the one used in the encoding operation.
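Two of the criteria listed above can be illustrated with a minimal sketch. It assumes the usual definition of segmental SNR (the mean of per-frame SNR values in dB) and models the 16-to-8-to-16 bit re-quantization as dropping the eight low-order bits; the function names and the frame length of 256 samples are hypothetical choices, not taken from the reviewed works.

```python
import math

def segmental_snr(cover, stego, frame_len=256):
    """Mean of per-frame SNR values (in dB) between cover and stego samples."""
    snrs = []
    for start in range(0, len(cover) - frame_len + 1, frame_len):
        c = cover[start:start + frame_len]
        s = stego[start:start + frame_len]
        signal = sum(x * x for x in c)
        noise = sum((x - y) ** 2 for x, y in zip(c, s))
        if signal == 0:
            continue  # silent frame: SNR is undefined, skip it
        # An undistorted frame has infinite SNR.
        snrs.append(float("inf") if noise == 0 else 10 * math.log10(signal / noise))
    return sum(snrs) / len(snrs) if snrs else float("inf")

def requantize_16_8_16(samples):
    """Model 16 -> 8 -> 16 bit re-quantization: the 8 low-order bits are lost."""
    return [(s >> 8) << 8 for s in samples]
```

Flipping only the least significant bit of every sample leaves the segmental SNR high, yet the re-quantized cover and stego signals become identical, which is why such a manipulation destroys data hidden in the low-order bits.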
Review of Audio Steganography Methods
[Table captions: Temporal domain: methods comparison. Transform domain: criteria comparison.]
Hiding in temporal domain
The majority of temporal domain methods employ low-bit encoding techniques, which we describe next. Other candidate techniques that fall under temporal domain category are also presented in the subsequent sections.
To improve the robustness of the LSB method against distortion and noise addition, the authors of [13–15] have increased the depth of the embedding layer from the 4th to the 6th and to the 8th LSB layer without affecting the perceptual transparency of the stego audio signal. In [13, 14], only bits at the sixth position of each 16-bit sample of the original host signal are replaced with bits from the message. To minimize the embedding error, the other bits can be flipped in order to obtain a new sample that is closer to the original one. For example, if the original sample value is 8, represented in binary by '1000', and the bit to be hidden in the 4th LSB layer is 0, the conventional LSB algorithm produces the value 0 = '0000', whereas the proposed algorithm produces the value 7 = '0111', which is much closer to the original sample value (i.e., 8). On the other hand, [15] has shifted the LSB embedding to the eighth layer and has avoided hiding in silent periods or near-silent points of the host signal. Embedding in the eighth bit slightly increases the robustness of this method compared to the conventional LSB methods. However, the hiding capacity decreases since some of the samples have to be left unaltered to preserve the perceptual quality of the audio signal. In addition, the ease of hidden message retrieval is still one of the major drawbacks of LSB and its variants: the message is exposed as soon as the hidden bits at the sixth or the eighth position are maliciously read out of the stego audio signal.
Due to its low embedding rate and security, and to the best of our knowledge, no audio steganography system based on echo hiding has been presented in recent research works. Moreover, only a few techniques have been proposed, even for audio watermarking. To improve the robustness of the watermark system against common linear signal processing, an echo-hiding time-spread technique has been proposed. Compared to the conventional echo-hiding system, this method spreads the watermark bits throughout the whole signal and recovers them based on the correlation amount at the receiver. The presented system is based on cepstral content: the cepstral portion of error of the original signal is removed at the decoder, which leads to a better detection rate.
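The error-minimizing idea can be sketched as follows. This is an illustrative nearest-value search, assuming unsigned samples, and not necessarily the exact bit-flipping procedure of [13, 14]; the function names are hypothetical. For the 4-bit sample 8 ('1000') and a message bit 0 at the 4th LSB layer, plain replacement yields 0 while the nearest sample carrying that bit is 7.

```python
def embed_bit_plain(sample, bit, layer):
    """Conventional embedding: overwrite the bit at the given LSB layer (1 = LSB)."""
    pos = layer - 1
    return (sample & ~(1 << pos)) | (bit << pos)

def embed_bit_min_error(sample, bit, layer, nbits=16):
    """Return the value closest to `sample` whose bit at `layer` equals `bit`.

    Assumes unsigned samples in the range [0, 2**nbits).
    """
    pos = layer - 1
    if (sample >> pos) & 1 == bit:
        return sample  # the sample already carries the message bit
    block = 1 << pos
    high = (sample >> (pos + 1)) << (pos + 1)  # sample with bits 0..pos cleared
    if bit == 1:
        candidates = [high + block, high - 1]          # nearest above / below
    else:
        candidates = [high + block - 1, high + 2 * block]  # nearest below / above
    candidates = [c for c in candidates if 0 <= c < (1 << nbits)]
    return min(candidates, key=lambda c: abs(c - sample))

def extract_bit(sample, layer):
    """Read back the bit stored at the given LSB layer."""
    return (sample >> (layer - 1)) & 1
```

In this sketch the embedding error never exceeds that of plain replacement, and extraction is unchanged: the receiver simply reads the bit at the agreed layer.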
Hiding in silence intervals
A simple and effective embedding method has been used to exploit silence intervals in speech signals. Initially, the silence intervals of the speech and their respective lengths (the number of samples in a silence interval) are determined. These values are decreased by a value x where 0 < x < 2^nbits, and nbits is the number of bits needed to represent a value from the message to hide. For the extraction process, the hidden value is evaluated as mod(NewIntervalLength, 2^nbits). For example, if we want to hide the value 6 (with nbits = 3) in a silence interval of length 109, we remove 7 samples from this interval, which makes the new interval length 102 samples. To extract the hidden data from this silence interval in the stego-signal, we compute mod(102, 8) = 6. Small silence intervals are left unchanged since they usually occur in continuous sentences and changing them might affect the quality of the speech. This method has good perceptual transparency but it is obviously sensitive to compression: changes in the length of silence intervals lead to false data extraction. To overcome this shortcoming, it has been suggested to slightly amplify the speech interval samples and reduce the silence interval samples. Thus, silence sample intervals will not be interpreted as speech samples and vice versa. The first and last intervals added to the speech during MP3 coding are simply ignored in data hiding and retrieval.
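The interval-length arithmetic can be sketched as below, working on interval lengths only. The function names are hypothetical, and the sketch assumes that no sample is removed when the interval length already encodes the value (a case the description above does not cover).

```python
def samples_to_remove(interval_len, value, nbits):
    """Number of samples to drop so that the new length encodes `value`."""
    modulus = 1 << nbits
    if not 0 <= value < modulus:
        raise ValueError("value does not fit in nbits")
    return (interval_len - value) % modulus

def embed_in_silence(interval_len, value, nbits):
    """Return the shortened silence-interval length that carries `value`."""
    new_len = interval_len - samples_to_remove(interval_len, value, nbits)
    if new_len <= 0:
        raise ValueError("silence interval too short to carry the value")
    return new_len

def extract_from_silence(new_interval_len, nbits):
    """Recover the hidden value as mod(NewIntervalLength, 2^nbits)."""
    return new_interval_len % (1 << nbits)

# Example from the text: hide 6 in an interval of 109 samples, nbits = 3.
print(samples_to_remove(109, 6, 3))   # 7
print(embed_in_silence(109, 6, 3))    # 102
print(extract_from_silence(102, 3))   # 6
```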
Strengths and weaknesses of temporal domain methods
Although robustness and security are not the main characteristics of temporal domain steganographic methods, the conventional LSB technique and its variants provide an easy and simple way to hide data. Tolerance to noise addition at low levels and some robustness criteria have been achieved with LSB variants [13–15], but at a very low hiding capacity. At present, only a few time domain hiding techniques have been developed. An evaluation of steganographic systems based on these techniques is shown in Table 1. The presence of the (✓) sign denotes that the property is validated, while (-) indicates the inverse or that the information is unavailable.
Hiding in transform domain
The human auditory system has certain peculiarities that must be exploited for hiding data effectively. The "masking effect" phenomenon masks weaker frequencies near stronger resonant ones [20, 21]. Several methods in the transform domain have been proposed in the literature, as described next. To achieve inaudibility, these methods exploit the frequency masking effect of the HAS either directly, by explicitly modifying only masked regions [7, 22–24], or indirectly [25, 26], by slightly altering the audio signal samples.
The spread spectrum technique spreads hidden data through the frequency spectrum. Spread spectrum (SS) is a concept developed in data communications to ensure proper recovery of a signal sent over a noisy channel by producing redundant copies of the data signal. Basically, data are multiplied by an M-sequence code known to both sender and receiver, then hidden in the cover audio. Thus, if noise corrupts some values, there will still be copies of each value left to recover the hidden message. A conventional direct sequence spread spectrum (DSSS) technique has been applied to hide confidential information in MP3 and WAV signals. However, to control stego-audio distortion, [22, 23] have proposed an embedding method where data are hidden under a frequency mask. Spread spectrum has also been combined with phase shifting in order to increase the robustness of transmitted data against additive noise and to allow easy detection of the hidden data. For a better hiding rate, the SS technique has also been used in the sub-band domain. Appropriately chosen sub-band coefficients were selected to address robustness and resolve synchronization uncertainty at the decoder.
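A minimal time-domain sketch of direct sequence spreading follows. It assumes a keyed pseudo-random ±1 chip sequence as a stand-in for the M-sequence and a fixed embedding gain with no frequency mask; the function names and parameter values are hypothetical.

```python
import random

def pn_sequence(key, length):
    """Keyed pseudo-random +/-1 chips shared by sender and receiver.

    Stand-in for the M-sequence mentioned in the text (assumption).
    """
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]

def ss_embed(cover, bits, key, chips_per_bit, gain):
    """Spread each bit over `chips_per_bit` samples and add it to the cover."""
    if len(cover) < len(bits) * chips_per_bit:
        raise ValueError("cover too short for the message")
    pn = pn_sequence(key, chips_per_bit)
    stego = list(cover)
    for i, bit in enumerate(bits):
        symbol = 1 if bit else -1
        for j, chip in enumerate(pn):
            stego[i * chips_per_bit + j] += gain * symbol * chip
    return stego

def ss_extract(stego, n_bits, key, chips_per_bit):
    """Correlate each segment with the chip sequence; the sign gives the bit."""
    pn = pn_sequence(key, chips_per_bit)
    bits = []
    for i in range(n_bits):
        segment = stego[i * chips_per_bit:(i + 1) * chips_per_bit]
        corr = sum(s * c for s, c in zip(segment, pn))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Because every bit is repeated over many chips, the correlation at the receiver averages out both the cover signal and moderate additive noise. The price is a low hiding rate: one bit per `chips_per_bit` samples.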
Discrete wavelet transform
Audio steganography based on the Discrete Wavelet Transform (DWT) has also been described. Data are hidden in the LSBs of the wavelet coefficients of the audio signals. To improve the imperceptibility of embedded data, one approach employs a hearing threshold when embedding data in the integer wavelet coefficients, while another avoids data hiding in silent parts of the audio signal. Even though data hiding in the wavelet domain provides a high embedding rate, data extraction at the receiver side might not be accurate.
The tone insertion method can resist attacks such as low-pass filtering and bit truncation. In addition to the low embedding capacity, embedded data could be maliciously extracted since inserted tones are easy to detect. The authors suggest overcoming these drawbacks by varying four or more pairs of frequencies in a keyed order.
The characteristics of the HAS depend more on the frequency values, as it is more sensitive to amplitude components. Following this principle, a steganographic algorithm has been proposed that embeds high-capacity data in the magnitude speech spectrum while ensuring the security of the hidden data and controlling the distortion of the cover-medium. The hidden data (payload) could be of any type, such as encrypted data, compressed data or groups of data (LPC, MP3, AMR, CELP, parameters of speech recognition, etc.). The proposed algorithm is based on finding secure spectral embedding areas in a wideband magnitude speech spectrum using a frequency mask defined at 13 dB below the original signal spectrum. The embedding locations and hiding capacity in magnitude components are defined according to a tolerated distortion level defined in the magnitude spectrum. Since the frequency components within the range of 7 kHz to 8 kHz contribute minimally to wideband speech intelligibility, a method has been proposed to hide data in this range by completely replacing the 7-8 kHz frequencies with the message to be hidden. The method realizes a high hiding capacity without degrading the speech quality.
In the cepstral domain, also known as the log spectral domain, data are embedded in the cepstrum coefficients, which tolerate most common signal processing attacks. In addition, altering the cepstrum at frequencies that lie in the perceptually masked regions of the majority of cover audio frames ensures inaudibility of the resulting stego audio frames. In one proposed cepstral domain modification, the cover signal is first transformed into the cepstral domain, then data are embedded in selected cepstrum coefficients by applying statistical mean manipulations. In this method, an embedding rate of 20 to 40 bps is achieved while guaranteeing robustness to common signal attacks. In another method, the cepstra of two selected frequencies f1 and f2 in each energetic frame are modified slightly to embed bit '1' or '0'. For more security of the embedded data, the author of that research later suggested using the latter algorithm and embedding data with different arbitrary frequency components at each frame.
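As an illustration of LSB hiding in integer wavelet coefficients, the sketch below uses a one-level integer Haar transform (the S-transform) and writes message bits into the LSBs of the detail coefficients. The choice of wavelet, the use of detail coefficients and the function names are assumptions made for this example; the reviewed methods may use other wavelets and a hearing threshold. Clipping to the valid sample range is ignored.

```python
def haar_forward(samples):
    """One-level integer Haar transform on consecutive sample pairs."""
    approx, detail = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        approx.append((a + b) >> 1)  # floor of the pair average
        detail.append(a - b)         # pair difference
    return approx, detail

def haar_inverse(approx, detail):
    """Exact integer inverse of haar_forward."""
    samples = []
    for s, d in zip(approx, detail):
        a = s + ((d + 1) >> 1)
        samples.extend((a, a - d))
    return samples

def embed_bits(samples, bits):
    """Write message bits into the LSBs of the detail coefficients."""
    if len(samples) % 2:
        raise ValueError("this sketch expects an even number of samples")
    approx, detail = haar_forward(samples)
    if len(bits) > len(detail):
        raise ValueError("message longer than the number of coefficients")
    for i, bit in enumerate(bits):
        detail[i] = (detail[i] & ~1) | bit
    return haar_inverse(approx, detail)

def extract_bits(stego, n_bits):
    """Read the LSBs of the first n_bits detail coefficients."""
    _, detail = haar_forward(stego)
    return [d & 1 for d in detail[:n_bits]]
```

With this particular integer transform, each embedded bit changes at most one sample of a pair by one quantization step.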
Allpass digital filters
Using allpass digital filters (APFs), data have been embedded in selected subbands using distinct patterns of APFs. The proposed scheme is robust against noise addition, random chopping, re-quantization and re-sampling. To further increase the robustness of this hiding scheme, a set of nth-order APFs has been used. The value of n is an even positive integer and pole locations may be chosen in a variety of ways. Data are embedded in selected APF parameters and retrieved using the power spectrum to estimate APF pole locations.
Strengths and weaknesses of transform domain methods
It has been proven that hiding in the frequency domain rather than the time domain gives better results in terms of signal-to-noise ratio. Indeed, audio steganography techniques in the transform domain benefit from the frequency masking effect. Most data hiding algorithms based on the transform domain use a perceptual model to determine the permissible amount of embedded data to avoid stego signal distortion. A great number of transform domain techniques have been presented in the last decade and, to a certain extent, these techniques have succeeded in realizing the security and the robustness of hidden data against simple audio signal manipulations such as amplification, filtering or re-sampling, as shown in Table 2.
Although robustness of hidden data against simple audio signal manipulation is the main characteristic of transform domain techniques, embedded data are unlikely to survive a noisy transmission environment or the data compression induced by one of the encoding processes such as ACELP, G.729, etc.
When considering data hiding for real time communications, voice encoders such as: AMR, ACELP and SILK at their respective encoding rate are employed. When passing through one of the encoders, the transmitted audio signal is coded according to the encoder rate then decompressed at the decoder end. Thus, the data signal at the receiver side is not exactly the same as it was at the sender side, which affects the hidden data-retrieval correctness and therefore makes these techniques very challenging. We distinguish two such techniques, namely in-encoder and post-encoder techniques, which we discuss thoroughly next.
A technique in which embedded data survive audio codecs, compression, reverberation and background noise has been presented. It hides data in speech and music signals of various types using subband amplitude modulation. Embedding data in the LPC vocoder has further been proposed. The authors used an auto-correlation based pitch tracking algorithm to perform a voiced/unvoiced segmentation. They replaced the linear prediction residual in the unvoiced segments by a data sequence. Once the residual's power is matched, this substitution does not lead to perceptual degradation. The signal is synthesized using the unmodified LPC filter coefficients. Linear prediction analysis of the received signal is used to decode hidden data. The technique offers a reliable hiding rate of 2 kbps.
An alternative to in-encoder techniques is the post-encoder (or in-stream) techniques. To survive audio encoders, data have been embedded in the bitstream of an ACELP codec. This technique hides data jointly with the analysis-by-synthesis codebook search. The authors applied the concept to the AMR encoder at a rate of 12.2 kbit/s and were able to hide 2 kbit/s of data in the bitstream. The quality of the stego speech, evaluated in terms of signal-to-noise ratio, is 20.3 dB. A lossless steganography technique for the G.711-PCMU telephony encoder has also been proposed. Data in this case are represented by a folded binary code, which codes each sample with a value between -127 and 127, including -0 and +0. One bit is embedded in each 8-bit sample whose absolute amplitude is zero. Depending on the number of samples with absolute amplitudes of 0, a potential hiding rate ranging from 24 to 400 bps is obtained. To increase the hiding capacity, the same authors have introduced a semi-lossless technique for G.711-PCMU, where audio sample amplitudes are amplified with a pre-defined level 'i'. The audio signal samples with absolute amplitudes varying from 0 to i are utilized in the hiding process. For a greater hiding capacity, it has been suggested to embed data in the inactive frames of low bit-rate audio streams (i.e., 6.3 kbps) encoded by the G.723.1 source codec.
Strengths and weaknesses of coded domain methods
Robustness and security of embedded data are the main advantages of in-encoder approaches. Hidden data survive noise addition and audio codecs such as ACELP, AMR or LPC. Some of the coded domain methods have achieved a considerably high hiding capacity compared to the bit rate of the codecs used. Since hidden data are not affected by the encoding process, data-extraction correctness is fulfilled in tandem-free operation.
Despite their robustness, hidden data integrity in in-encoder audio steganography techniques could be compromised if a voice encoder/decoder (transcoding) exists in the network. Furthermore, hidden data could also be subject to transformation if a voice enhancement algorithm such as echo or noise reduction is deployed in the network. Since the bitstream is more sensitive to modifications than the original audio signal, the hiding capacity should be kept small to avoid embedded data perceptibility. Coded domain techniques are well suited for real-time applications. Table 3 summarizes coded domain techniques based on selected robustness criteria.
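The lossless idea of hiding one bit in each zero-amplitude sample can be sketched on an abstract sign-magnitude representation. The `(sign, magnitude)` pairs below are an assumption made for illustration and are not the actual G.711 byte layout; the function names are hypothetical.

```python
def capacity(samples):
    """Number of zero-amplitude samples, i.e. bits that can be hidden."""
    return sum(1 for _, magnitude in samples if magnitude == 0)

def embed_in_zero_samples(samples, bits):
    """Store one bit in the sign of each zero-amplitude sample (+0 or -0)."""
    if len(bits) > capacity(samples):
        raise ValueError("not enough zero-amplitude samples")
    remaining = iter(bits)
    stego = []
    for sign, magnitude in samples:
        if magnitude == 0:
            sign = next(remaining, sign)  # keep the sign once bits run out
        stego.append((sign, magnitude))
    return stego

def extract_from_zero_samples(samples, n_bits):
    """Read the signs of the first n_bits zero-amplitude samples."""
    return [sign for sign, magnitude in samples if magnitude == 0][:n_bits]

def decode_amplitude(sample):
    """Decoded amplitude: +0 and -0 both decode to 0."""
    sign, magnitude = sample
    return -magnitude if sign else magnitude
```

Since +0 and -0 decode to the same amplitude, the decoded audio is unchanged, which is what makes the scheme lossless. The capacity depends entirely on how many zero-amplitude samples the stream contains.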
Classification of audio steganography methods
The pre-encoder methods apply to the time and frequency domains, where data embedding occurs before the encoding process. Most of the methods belonging to the pre-encoder embedding class do not guarantee the integrity of the hidden data over the network. Noise addition in its different forms (e.g., WGN) and high-data-rate compression induced by one of the encoding processes such as ACELP or G.729 will likely affect the integrity of embedded data. In other methods, embedded data resist only a few audio manipulations such as resizing, re-sampling and filtering, and they tolerate noise addition or data compression only at a very low rate. A high embedding data rate can be achieved with methods designed for noise-free environments.
The robustness of embedded data is the main advantage of the in-encoder approach. This approach is based on a data-embedding operation within the codebook of the codecs. The transmitted information is hidden in the codebook parameter after a re-quantization operation. Thus, each audio signal parameter has a double significance: embedded-data value and audio codebook parameter. One of the drawbacks of this method arises when the encoded parameters traverse a network such as GSM that has, for example, a voice decoder/encoder in the Radio Access Network (BST, BSC, TRAU) and/or in the Core Network (MSC). In this configuration, hidden data values will be modified. These modifications might also happen when a voice enhancement algorithm is enabled in the Radio Access Network and/or in the Core Network.
In the post-encoder approach, data are embedded in the bitstream resulting from the encoding process and extracted before traversing the decoder side. Since the bitstream is more sensitive to modifications than the original audio signal, the hiding capacity should be kept small to avoid embedded data perceptibility. Furthermore, transcoding can modify embedded data values and therefore could alter the integrity of the steganographic system. However, one of the positive sides of these methods is the correctness of data retrieval. Hidden message extraction is done with no loss in tandem-free operations since it is not affected by the encoding process. A general scheme of the three steganography approaches is illustrated in Figure 9.
Low bit encoding. Technique: the LSB of each sample in the audio is replaced by one bit of hidden information. Strengths: simple and easy way of hiding information with a high bit rate. Weaknesses: easy to extract and to destroy.
Technique: embeds data by introducing echo in the cover signal. Strengths: resilient to lossy data compression algorithms. Weaknesses: low security and capacity.
Technique: uses the number of samples in a silence interval to represent hidden data. Strengths: resilient to lossy data compression algorithms.
Technique: uses frequency bands to hide data. Strengths: longer message to hide and less likely to be affected by errors during transmission. Weaknesses: low robustness to simple audio manipulations.
Technique: insertion of inaudible tones at selected frequencies. Strengths: imperceptibility and concealment of embedded data. Weaknesses: lack of transparency and security.
Technique: modulates the phase of the cover signal. Strengths and weaknesses: robust against signal processing manipulation; data retrieval needs the original signal.
Technique: spreads the data over all signal frequencies. Strengths: provides better robustness. Weaknesses: vulnerable to time scale modification.
Technique: altering the cepstral coefficients for embedding data. Strengths: robust against signal processing operations. Weaknesses: perceptible signal distortions and low robustness.
Technique: altering wavelet coefficients for embedding data. Strengths: provides high embedding capacity. Weaknesses: lossy data retrieval.
Technique: altering codebook parameters. Weaknesses: low embedding rate.
Technique: LSB is applied on the bitstream resulting from the encoder process. Weaknesses: low embedding rate.
Audio steganography evaluation
To evaluate the performance of the reviewed techniques, the imperceptibility and the detectability rate of hidden data are assessed. Next, imperceptibility evaluation of selected temporal, transform and coded domain steganography tools and methods is discussed.
Hiding in speech, speech pauses or music audio signals, as shown in Figures 10a, 10b and 10c and in Additional file 1: Table S1, indicates that the Steganos software induces more noise, whereas H4PGP shows better performance in terms of SNR and hiding capacity. The other software tools behave almost alike. In addition, our results show that music signals are better hosts for hidden data in terms of imperceptibility and capacity.
To control the distortion induced by the embedding process, most audio steganography methods based on the transform domain use a perceptual model to determine the permissible amount of data embedding without distorting the audio signal. Results of previous investigations evaluating frequency domain methods are reported in Figure 10. Related results are reported in Additional file 1: Table S1. In a more challenging environment, such as real-time applications, encoded domain methods ensure robustness against compression. A similar performance investigation reports the results shown in Additional file 1: Table S1 and in Figures 10g, 10h and 10i. Our results show that, for the same embedding capacity in the temporal and frequency domains, stego signals generated in the frequency domain are less distinguishable than the ones produced by hiding data in the temporal domain.
Evaluation by steganalysis
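The contingency table:
Stego-signal: True positives (tp) when classified as stego-signal, False negatives (fn) when classified as cover-signal
Cover-signal: False positives (fp) when classified as stego-signal, True negatives (tn) when classified as cover-signal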
The entries of the contingency table are described as follows:
tp: stego-audio classified as stego-audio signal
tn: cover-audio classified as cover-audio signal
fn: stego-audio classified as cover-audio signal
fp: cover-audio classified as stego-audio signal
Following the preparation of the training and testing datasets, we used the SVM library tool available at http://www.csie.ntu.edu.tw/~cjlin/libsvm to discriminate between the cover- and the stego-audio signals. The results of the comparative study are reported in Additional file 2: Table S2. The performance of each studied tool is measured by the accuracy (AC). The values presented in Additional file 2: Table S2 are the percentages of the stego-audio signals correctly classified. Higher score values are interpreted as high detection rates. Consequently, the frequency-domain steganography technique described in the Steghide tool shows a performance improvement over time domain techniques (Stools and Hide4PGP). These results are consistent with our findings in the imperceptibility evaluation presented in the previous section.
In Additional file 2: Table S2, further investigation is done to put more emphasis on the behavior of the tested algorithms when music and speech audio signals are used separately to convey hidden data. The results show that hiding in music signals is less detectable than hiding in speech signals. In fact, the reference steganalysis method uses features extracted from high frequencies (lower in energy) to discriminate between cover- and stego-signals. It therefore intensifies the signal discontinuities caused by the noise that data embedding generates. As the number of low-energy frequency components in music audio signals is smaller than that in speech audio signals, the detection rate is expected to be lower.
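The measures derived from the contingency table can be computed as in the sketch below. It assumes the standard definition of accuracy, AC = (tp + tn) / (tp + tn + fp + fn), and takes the percentage of stego-audio signals correctly classified as tp / (tp + fn); the counts used in the example are hypothetical and are not results from this study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all signals (cover and stego) classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("empty contingency table")
    return (tp + tn) / total

def stego_detection_rate(tp, fn):
    """Fraction of stego-audio signals classified as stego-audio."""
    if tp + fn == 0:
        raise ValueError("no stego-audio signals in the test set")
    return tp / (tp + fn)

# Hypothetical counts for 50 stego and 50 cover signals.
tp, fn, fp, tn = 40, 10, 5, 45
print(f"AC = {100 * accuracy(tp, tn, fp, fn):.1f}%")
print(f"stego detected = {100 * stego_detection_rate(tp, fn):.1f}%")
```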
Applications and trends
A wide range of audio steganographic applications have been successfully developed. Audio steganography techniques can be applied for covert communications using unclassified channels without additional demand for bandwidth, or simply for storing data. In general, three application types for audio steganography techniques are distinguished and can be categorized as discussed next.
Given the possibility of hiding more than 16 kbps in a wideband audio file with a conventional LSB encoding method, digital information can be reliably stored in audio steganographic systems. Another data storage application can be found in subtitled movies: actors' speech, film music and background sounds could be used to embed the text needed for translation. In this case, bandwidth is substantially reduced.
In order to provide better protection for digital data content, new steganography techniques have been investigated in recent research works. The availability and popularity of digital audio signals have made them an appealing choice to convey secret information. Audio steganography techniques address issues related to the need to secure and preserve the integrity of data hidden in voice communications in particular. In this work, a comparative study of the current state-of-the-art literature in digital audio steganography techniques and approaches is presented. In an attempt to reveal their capabilities in ensuring secure communications, we discussed their strengths and weaknesses. Also, a differentiation between the reviewed techniques based on the intended applications has been highlighted. Thus, while temporal domain techniques, in general, aim to maximize the hiding capacity, transform domain methods exploit the masking properties in order to make the noise generated by embedded data imperceptible. Encoded domain methods, on the other hand, strive to ensure the integrity of hidden data in challenging environments such as real-time applications. To better estimate the robustness of the presented techniques, a classification based on their occurrence in the voice encoder is given. A comparison as well as a performance evaluation (i.e., imperceptibility and steganalysis) of the reviewed techniques have also been presented. This study showed that the frequency domain is preferred over the temporal domain and that music signals are better covers for data hiding in terms of capacity, imperceptibility and undetectability. From our point of view, the diversity and large number of existing audio steganography techniques expand application possibilities. The advantage of using one technique over another depends on the constraints of the application in use and its requirements for hiding capacity, security level of embedded data and resistance to the attacks encountered.
a Standard ITU-T P.862.2
b Segmental SNR
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.