Context-based adaptive arithmetic coding in time and frequency domain for the lossless compression of audio coding parameters at variable rate
 Jing Wang^{1},
 Xuan Ji^{1},
 Shenghui Zhao^{1},
 Xiang Xie^{1} and
 Jingming Kuang^{1}
https://doi.org/10.1186/1687472220139
© Wang et al.; licensee Springer. 2013
 Received: 24 September 2012
 Accepted: 2 May 2013
 Published: 21 May 2013
Abstract
This paper presents a novel lossless compression technique based on context-based adaptive arithmetic coding, which can be used to further compress the quantized parameters in an audio codec. The key feature of the new technique is the combination of context models in the time domain and the frequency domain, called the time-frequency context model. It is used for the lossless compression of audio coding parameters such as the quantized modified discrete cosine transform (MDCT) coefficients and the frequency band gains in the ITU-T G.719 audio codec. With the proposed adaptive arithmetic coding, a high degree of adaptation and redundancy reduction can be achieved. In addition, an efficient variable rate algorithm is employed, which is designed based on both the baseline entropy coding method of G.719 and the proposed adaptive arithmetic coding technique. Experiments show that the proposed technique is more efficient than conventional Huffman coding and common adaptive arithmetic coding when used for the lossless compression of audio coding parameters. For a set of audio samples used in the G.719 application, the proposed technique achieves an average bit rate saving of 7.2% at the low bit rate coding mode while producing audio quality equal to that of the original G.719.
Keywords
 Adaptive arithmetic coding
 Time-frequency context
 Lossless compression
 Variable rate
 MDCT
1. Introduction
Natural digital audio signals require large bandwidth for transmission and enormous amounts of storage space. Developments in entropy coding, i.e., Huffman coding [1, 2] and arithmetic coding [3, 4], have made it practical to reduce these requirements without information loss. These methods exploit the nonstationary statistical behavior of the source signal to remove redundant information. In contrast to lossless compression methods, lossy compression methods such as vector quantization are adopted in audio coding systems to remove irrelevancy inaudible to humans and to improve the coding efficiency. Many audio codecs use only lossy compression methods to quantize and encode the audio parameters. In fact, when lossy compression is further combined with lossless entropy coding in the quantization and encoding procedure, an audio codec can achieve better coding efficiency than with lossy compression alone.
With the development of modern multimedia communication, high-quality full-band speech and audio coding has become significant and is increasingly needed at low bit rates. Besides the lossy compression through parametric and transform coding, many audio codecs introduce a lossless coding algorithm to further compress the coding bits, such as Moving Picture Experts Group-4 advanced audio coding (MPEG-4 AAC) [5], MPEG unified speech and audio coding (USAC) [6], and ITU-T G.719 [7]. ITU-T G.719 is a low-complexity full-band (20 Hz to 20 kHz) audio codec for high-quality speech and audio, which operates from 32 to 128 kbps [7]. As with most transform audio codecs, G.719 uses the modified discrete cosine transform (MDCT) to realize the time-frequency transform and to avoid artifacts stemming from the block boundaries. In the MDCT domain [8], statistical and subjective redundancies of the signals can be better understood, exploited, and removed in most cases. After the lossy compression with vector quantization, which removes irrelevancy inaudible to humans, the further compression performance is largely determined by the entropy coding efficiency of the quantized MDCT coefficients. In G.719, Huffman coding is applied, and the coding procedure has to be driven by an estimated probability distribution of the quantized MDCT coefficients along with the norms (frequency band gains).
Although Huffman coding removes some of the redundancy of the quantized MDCT coefficients, it suffers from several shortcomings which limit further coding gains. For instance, in Huffman coding, the distribution of MDCT coefficients is predefined from training statistics, and adaptation mechanisms, such as the switching between different codebooks and the multidimensional codebooks exploited in AAC, are not flexible enough to combat possible statistics mismatch. Furthermore, if the symbols are not grouped into blocks, symbols whose probabilities are greater than 0.5 cannot be efficiently coded due to the intrinsic limit of 1 bit per symbol of the Huffman code. Hence, entropy coding schemes based on adaptive arithmetic coding [9] are used in audio codecs such as MPEG USAC. The adaptive model measures the statistics of the source symbols and is updated continuously during the encoding and decoding processes. In addition, the context formed by the neighboring symbols is taken into account in order to further improve the coding efficiency.
Context modeling was first introduced in image and video coding. Here, context-based adaptive binary arithmetic coding (CABAC) in H.264/AVC [10] is taken as an example. CABAC is one of the two entropy coding methods of the ITU-T/ISO/IEC standard for video coding, i.e., H.264/AVC, and plays a very important role in the efficiency improvement of video coding. By combining an adaptive binary arithmetic coding technique with context modeling of the neighboring symbols in the binary bit stream and macro block, a high degree of adaptation and redundancy reduction is achieved. The encoding process of CABAC consists of three elementary steps: binarization, context model selection, and adaptive binary arithmetic encoding. The last step consists of probability estimation and the binary arithmetic encoder.
In the second step of CABAC [10], a context model is chosen, and a model probability distribution is assigned to the given symbols. In the subsequent coding stage, the binary arithmetic coding engine generates a sequence of bits that represent the symbols. The model determines the coding efficiency in the first place, so it is of paramount importance to design an adequate model that exploits the statistical dependencies to a large degree. At the same time, this model needs to be continuously updated during encoding. Suppose a predefined set T of past symbols, a so-called context template, and a related set C = {0, …, C − 1} of contexts are given, where the contexts are specified by a modeling function F: T → C operating on the template T. For each symbol x to be coded, a conditional probability p(x | F(z)) is estimated by switching between different probability models according to the already coded neighboring symbols z ∊ T. Generally speaking, the context model makes use of the information related to the encoded symbols and describes the mapping between a sequence of symbols and the assignment of the symbols' probability distribution.
Lately, arithmetic coding schemes based on bit-plane context have also been used in the field of audio coding, for example in USAC, as in video coding. The spectral noiseless coding scheme is based on arithmetic coding in conjunction with a dynamically adaptive context. The noiseless coding is fed by the quantized spectral values and uses context-dependent cumulative frequency tables derived from the two previously decoded neighboring two-tuple quantized spectral coefficients. The coding separately considers the sign, the two most significant bits (MSBs), and the remaining least significant bits. The context adaptation is applied only to the two MSBs of the unsigned spectral values. The sign and the least significant bits are assumed to be uniformly distributed.
By now, entropy coding schemes based on arithmetic coding are quite frequently used in the field of non-block-based video coding. The CABAC design is based on the key elements of binarization, context modeling, and binary arithmetic coding. Binarization enables efficient binary arithmetic coding via a unique mapping of nonbinary syntax elements to a sequence of bits, which are called bins. Arithmetic coding as a lossless data compression scheme now also plays an essential role in the processing chain of audio signal coding. The correlation in the bit plane of the quantized MDCT coefficients is employed in USAC [11]. However, the concept of a context model for adaptive arithmetic coding has been neither deeply investigated nor widely used in audio coding, especially for efficient compression with a context model built directly on the quantized audio parameters. When arithmetic coding is used to compress the coding parameters directly, the probability estimation based on the bit-plane context model may not be suitable. In this situation, the correlation of the audio coding parameters, which leads to lower information entropy, can be considered in both the time and the frequency domain, investigated in theory, and carefully designed in practice. Thus, a novel time-frequency plane context model is given in this paper, and the adaptive arithmetic coding is applied directly to the audio coding parameters. Furthermore, a variable rate coding scheme is introduced to improve the efficiency.
In our work on arithmetic coding, an entropy coding method combining adaptive arithmetic coding with a time-frequency plane context model (both the time and the frequency domain are taken into account) was developed, which has improved the coding of the quantized MDCT coefficients and the frequency band gains. The adaptive arithmetic coding is applied frame by frame to further compress the coding parameters in the audio codec, and its probability estimation makes use of the inter-frame (time domain) correlation and the intra-frame (frequency domain) correlation of the coding parameters. In fact, most alternative approaches to audio coding are based on the MDCT. One of its main distinguishing features is related to the time-frequency plane: given a source of quantized transform coefficients, for instance, it was found useful to utilize the correlation in the time domain and the frequency domain to increase the probability of the encoded symbol for arithmetic coding. The experiment on G.719 is carried out as an application of the proposed technique, in which compatibility with the G.719 baseline is required, and good compression performance is achieved. With this method, the bits allocated for coding the quantized parameters vary across consecutive analysis frames, while the quality of the decoded audio remains constant. Therefore, the average bit rate is lower than that of the fixed bit rate codec while the same audio quality is sustained. Hence, a variable rate operation is introduced into the novel context-based adaptive arithmetic coding algorithm, which achieves better coding efficiency.
This paper is organized as follows. Section 2 outlines the novel adaptive arithmetic coding of the parameters produced in the audio encoding. Section 3 describes in detail the novel techniques and the underlying ideas of our entropy coding modules. Section 4 presents the experimental results and the performance comparison. Section 5 concludes this paper with a summary.
2. Modules of the novel adaptive entropy coding
2.1. Preliminary principle
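Assuming the standard Shannon definition (the symbol set x_{0}, …, x_{I−1} and the label Equation 1 are taken from the way this section refers to them later), the entropy of a source X can be written as

$$H(X) = -\sum_{i=0}^{I-1} p(x_i)\,\log_2 p(x_i) \qquad (1)$$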
where p(x_{ i }) is the probability of the symbol x_{ i }.
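When the source is described by a set of states, the corresponding conditional entropy takes the following standard form. It is given here as an assumption that matches the symbols defined next, with the index ranges chosen for illustration:

$$H(X \mid S) = -\sum_{j=0}^{J-1} p(s_j) \sum_{i=0}^{I-1} p(x_i \mid s_j)\,\log_2 p(x_i \mid s_j)$$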
where s_{ j }, a so-called context, is a specific state of the source and J represents the total number of the considered states. When a context is applied, the distribution of the symbols (x_{0}, …, x_{I−1}) is more concentrated in the vicinity of the encoded symbol, which means that the probability of the encoded symbol can be increased by establishing the context model. Consequently, a suitable context design that considers the correlation of the source leads to lower entropy. In audio coding applications, because of the similarity of sequential frames as well as adjacent frequency bands, some audio parameters such as frequency band gains and frequency spectral values are correlated in the time and frequency domains. A context model built on the neighboring parameters can therefore be designed to lower the entropy of the coding source and thus raise the compression efficiency. In Sections 2.3 and 2.4, the proposed context model and the way to utilize it are described in theory. The practical behavior and design in the case of the G.719 codec are investigated in Section 3.
2.2 Integer arithmetic coding
The performance of arithmetic coding is optimal without the need for blocking of input data. It encourages a clear separation between the probability distribution model and the encoding of information. For example, the model may assign a predetermined probability to each symbol. These probabilities can be determined by counting frequencies of representative samples to be transmitted. Such a fixed model is communicated in advance for both encoder and decoder. Alternatively, the probabilities that an adaptive model assigns may change as each symbol is transmitted. The encoder's model changes as each symbol is transmitted and the decoder's model changes as each symbol is received. If the context is involved, the adaptive model is based on the context.
Before rescaling, the subintervals are nested, so 0 ≤ l_{ i } ≤ l_{i + 1} ≤ u_{i + 1} ≤ u_{ i } < N. The expression $\frac{c\left({x}_{i}\right)-c\left({x}_{i-1}\right)}{c\left({x}_{I-1}\right)}$ is equivalent to p(x_{ i }) in Equation 1. To produce incremental output, i.e., coded bits, during the encoding process and to avoid the need for high-precision computations, the algorithm is performed through the three mappings below. ‘Scale’ is an intermediate counter that records how many times mapping III has been applied since the last output bit; it determines how many opposite bits follow the next bit emitted in steps I and II.

I: If the subinterval [l, u] lies entirely in the lower half of [0, N − 1], i.e., [0, N/2 − 1], then the coder emits a bit 0 followed by one bit 1 for each count held in scale, and linearly expands [l, u] to [2l, 2u + 1]. Scale is then reset to 0.

II: If the subinterval [l, u] lies entirely in the upper half of [0, N − 1], i.e., [N/2, N − 1], then the coder emits a bit 1 followed by one bit 0 for each count held in scale, and linearly expands [l, u] to [2l − N, 2u − N + 1]. Scale is then reset to 0.

III: If the subinterval [l, u] lies entirely in the interval [N/4, 3N/4 − 1], then the coder linearly expands [l, u] to [2l − N/2, 2u − N/2 + 1] and increases the value of scale by 1.
The three mapping steps are repeated until the interval [l, u] meets none of the above conditions. As the subinterval shortens, the number of loops increases, which leads to more output bits. Thus, the larger the subinterval is, the fewer bits the coder outputs. Since the context model can be established to increase the probability of the encoded symbol, the subinterval representing that probability is enlarged correspondingly.
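The interval update and the three mappings can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the codec's implementation: the function name `encode`, the 16-bit register width, the fixed (non-adaptive) count table, and the termination rule at the end are all choices made for this sketch, and the total of the counts is assumed not to exceed N/4.

```python
def encode(symbols, counts, bits=16):
    """Encode integer symbols with a fixed count table (illustrative sketch)."""
    N = 1 << bits                      # size of the integer interval [0, N - 1]
    half, quarter = N // 2, N // 4
    cum = [0]                          # cum[k] = sum of counts[0..k-1]
    for f in counts:
        cum.append(cum[-1] + f)
    total = cum[-1]                    # assumed to be at most N/4
    low, high, scale = 0, N - 1, 0
    out = []
    for s in symbols:
        width = high - low + 1
        # Narrow [low, high] to the part that belongs to symbol s.
        high = low + width * cum[s + 1] // total - 1
        low = low + width * cum[s] // total
        while True:
            if high < half:                               # mapping I
                out.append(0)
                out.extend([1] * scale)
                scale = 0
                low, high = 2 * low, 2 * high + 1
            elif low >= half:                             # mapping II
                out.append(1)
                out.extend([0] * scale)
                scale = 0
                low, high = 2 * low - N, 2 * high - N + 1
            elif low >= quarter and high < 3 * quarter:   # mapping III
                scale += 1
                low, high = 2 * low - half, 2 * high - half + 1
            else:
                break
    # Termination (an assumption of this sketch): emit enough bits to
    # select a point inside the final interval.
    scale += 1
    if low < quarter:
        out.append(0)
        out.extend([1] * scale)
    else:
        out.append(1)
        out.extend([0] * scale)
    return out


uniform = encode([0] * 20, [1, 1])   # p(0) = 0.5
skewed = encode([0] * 20, [9, 1])    # p(0) = 0.9
print(len(uniform), len(skewed))
```

With the skewed table the subinterval of symbol 0 is larger, so fewer rescaling loops occur and fewer bits are written, which is the effect the context model is meant to produce.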
2.3. Timefrequency context model
A family of contexts is defined by means of the function T(m). The parameter m represents the number of symbols lying in the vicinity of the currently coded symbol, with 0 ≤ m ≤ 2. For each symbol C to be coded, the conditional probability p(C | T(m)) is estimated by switching between different probability models according to the already coded neighboring symbols. In Figure 1, T(0) represents no context, T(1) = A or B, and T(2) = A and B. A represents the context in the frequency domain, while B represents the context in the time domain, and they correspond to the quantized parameters in the transform audio codec. Their conditional probabilities are estimated by different methods, which are introduced in the following sections.
2.3.1. Context model in the frequency domain
The length of the context-based sequence is defined as the order of the context model. A key issue in context modeling for the input symbol sequence is to balance the model order against the model cost, since a higher order means a higher computational cost. To solve this problem, a first-order context model [17] can be chosen in the frequency domain because of its good compression and low complexity in the audio coding application.
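A first-order context model can be pictured as one adaptive count table per preceding symbol A. The Python sketch below illustrates that idea only; the class name `FirstOrderContextModel`, its methods, and the initialization of all counts to 1 are assumptions made here and are not taken from the paper or from [17].

```python
class FirstOrderContextModel:
    """One adaptive frequency table per context symbol A (illustrative)."""

    def __init__(self, num_symbols):
        # Counts start at 1 so that no symbol ever has zero probability.
        self.counts = [[1] * num_symbols for _ in range(num_symbols)]

    def cumulative(self, context):
        """Cumulative counts c(. | A) that would drive an arithmetic coder."""
        cum = [0]
        for f in self.counts[context]:
            cum.append(cum[-1] + f)
        return cum

    def probability(self, symbol, context):
        """Estimated conditional probability p(C | A)."""
        row = self.counts[context]
        return row[symbol] / sum(row)

    def update(self, symbol, context):
        """Adapt the table of context A after coding symbol C."""
        self.counts[context][symbol] += 1


model = FirstOrderContextModel(4)
sequence = [1, 2, 1, 2, 1, 2, 3]
for previous, current in zip(sequence, sequence[1:]):
    model.update(current, previous)
print(model.probability(2, 1), model.probability(2, 0))
```

Only the table selected by the previous symbol is touched at each step, so the memory cost grows with the square of the alphabet size while the per-symbol work stays constant.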
2.3.2. Context model in the time domain
When the neighboring elements are correlated and the current symbol C distributes around the encoded symbol B, i.e., C ∈ (B − δ, B + δ), where δ represents the rescaling parameter, the model probability distribution is reassigned to the current symbol C.
For the m-ary (m is the number of symbols) adaptive arithmetic coding, the encoded symbol B is taken as the center; the 2δ symbols located in the vicinity of B are chosen, and a large number λ is added to their original frequencies, which rearranges the distribution of the model. λ is the cumulative count of all symbols, which can change the subinterval adaptively.
As f′(x_{ i }) increases for i = B − δ + 1, …, B, …, B + δ, the inequality $\frac{{c}^{\prime}\left({x}_{i}\right)-{c}^{\prime}\left({x}_{i-1}\right)}{{c}^{\prime}\left({x}_{I-1}\right)}>\frac{c\left({x}_{i}\right)-c\left({x}_{i-1}\right)}{c\left({x}_{I-1}\right)}$ can be obtained. The subinterval u′_{1} − l′_{1} is then larger than u_{2} − l_{2} under the above condition. Consequently, the higher the frequency count of the encoded symbol is, the larger its subinterval becomes and the better the designed coding scheme performs.
As to the context model in time domain, we only consider one state context which models the past symbol B close to the current symbol C because the state before B has a weaker correlation with C while more states mean higher complexity.
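The count rearrangement around B can be sketched as below. The function name `boost_counts` and the clipping of indices that fall outside the alphabet are assumptions of this sketch; the range i = B − δ + 1, …, B + δ and the choice of λ as the sum of all counts follow the description above.

```python
def boost_counts(counts, b, delta):
    """Return a copy of counts with lambda added to the 2*delta symbols near b."""
    lam = sum(counts)                  # lambda: total of all current counts
    boosted = list(counts)
    for i in range(b - delta + 1, b + delta + 1):
        if 0 <= i < len(counts):       # assumption: skip out-of-range indices
            boosted[i] += lam
    return boosted


counts = [1] * 8                       # flat model over 8 symbols
boosted = boost_counts(counts, b=3, delta=1)
before = counts[3] / sum(counts)
after = boosted[3] / sum(boosted)
print(boosted, before, after)
```

In this flat example the probability assigned to a symbol next to B rises from 1/8 to 9/24, so its subinterval is three times larger than without the time-domain context.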
3. Scheme of the novel context adaptive arithmetic coding in G.719
3.1 State-of-the-art techniques of G.719
The ITU-T G.719 codec [7] makes use of the transform coding technique for low-complexity full-band conversational speech and audio, operating from 32 up to 128 kbps. The input signal sampled at 48 kHz is first processed by a transient detector based on the ratio between the short-term energy and the long-term energy. An adaptive window switching technique is used depending on whether a transient or a stationary signal is detected. Then, time domain aliasing and MDCT techniques are applied to process the different kinds of input signal. The transformed spectral coefficients are grouped into subbands of unequal lengths. The gain of each band (i.e., norm) is estimated, and the resulting spectral envelope consisting of the norms of all bands is quantized and encoded. The quantized norms are further adjusted based on adaptive spectral weighting and used as the input for bit allocation. The spectral coefficients are normalized by the quantized norms, and the normalized MDCT coefficients are then lattice vector quantized and encoded based on the allocated bits for each frequency band. In the process of bit allocation, Huffman coding is applied to encode the indices of both the encoded spectral coefficients and the encoded norms. The bits saved by Huffman coding are used for the following bit allocation and the noise adjustment in order to generate better audio quality. Finally, the fixed bit stream is obtained and transmitted to the decoder.
3.2 The novel structure of G.719
In this section, the novel context-based adaptive arithmetic coding is introduced to improve the coding scheme in G.719, and the probability statistics of the entropy coding are established separately for transient and stationary audio. The key elements are discussed in the next section.
When the coding procedure of the quantized norms is over, the coefficients are normalized by the quantized norms, and the normalized spectral coefficients are then lattice vector quantized according to the bit allocation, which leads to different dynamic ranges in the subbands. In the bit allocation, the maximum number of bits assigned to each normalized transform coefficient is set to R_{max} = 9 in G.719 by default. Thus, nine adaptively updated statistical models are employed for the arithmetic coding, and all bands are rearranged in order from low band to high band so that the quantized coefficients in the subbands with the same allocated bits are encoded consecutively. Considering that the 1-bit subbands, the 2- to 4-bit subbands, and the 5- to 9-bit subbands have different correlations in the time domain and in the frequency domain, we use different context models for different bit allocations. The subbands of 5 to 9 bits exploit the correlation in the time domain for compression, while the subbands of 2 to 4 bits make use of the correlation in the frequency domain. Finally, the subbands of 1 bit use the normal adaptive arithmetic coding.
3.3 Time-frequency context model in G.719
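The selection rule and the regrouping of subbands can be sketched as follows. The function names, the string labels, and the decision to skip subbands that receive no bits are assumptions made for this illustration; the thresholds 1, 2 to 4, and 5 to 9 bits come from the paragraph above.

```python
R_MAX = 9  # maximum bits per normalized coefficient, as stated for G.719


def context_mode(bits):
    """Map the allocated bits of a subband to the context model it uses."""
    if not 1 <= bits <= R_MAX:
        raise ValueError("allocated bits must be between 1 and R_MAX")
    if bits == 1:
        return "none"        # plain adaptive arithmetic coding
    if bits <= 4:
        return "frequency"   # context A: previously coded symbol in the frame
    return "time"            # context B: co-located symbol of the previous frame


def group_subbands(allocation):
    """Group subband indices by allocated bits, keeping low-to-high band order."""
    groups = {}
    for index, bits in enumerate(allocation):
        if bits > 0:         # assumption: subbands without bits are not coded
            groups.setdefault(bits, []).append(index)
    return groups


allocation = [2, 0, 9, 2, 1]             # hypothetical bits per subband
for bits, bands in sorted(group_subbands(allocation).items()):
    print(bits, context_mode(bits), bands)
```

Each key of the returned dictionary would correspond to one of the nine statistical models, so coefficients coded with the same model are handled together.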
Through a large number of experiments, we have found that the quantized norms and the quantized MDCT coefficients with 2 to 4 bits have the context statistical characteristic in the frequency domain, while the quantized norms and the quantized MDCT coefficients with 5 to 9 bits have the characteristic in the time domain, as is discussed in Section 2.3.
Thus, for the 2- to 4-bit subbands, the context in the frequency domain is defined as the encoded symbol A before the input symbol C, as shown in Figure 1. The conditional cumulative counts c(C | A) can then be obtained, and c(C | A) is used as the estimated conditional cumulative counts to drive the integer arithmetic coder.
where ${D}_{i,j}^{\left(t\right)}$ represents subband index with 1 ≤ j ≤ 44, where j is the subband number. The subbands have different sizes n = 8, 16, 24, 32 that increase with increasing frequency. The character b represents the bits allocated for the current frame, and 2^{ b } is the number of symbols for the m-ary (m symbols) adaptive arithmetic coding, i.e., m = 2^{ b }.
If γ_{ j }(n) ≥ 0.5, then the context in the time domain is employed in the present adjacent subbands with the same bit allocation. By statistical analysis, we have found that the audio coding parameters for music signal have higher correlation than the speech signal in time domain. As to the quantized norms in G.719, a large percentage, 98.9%, of all the frames have the correlation (i.e., the correlation coefficient is higher than 0.5) between the adjacent frames which enables larger compression.
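As an illustration of such a test, the sketch below computes a Pearson correlation coefficient between the quantized values of one subband in two adjacent frames and compares it with the 0.5 threshold. The exact definition of γ_{ j }(n) is not reproduced here, so the Pearson form, the function names, and the handling of constant vectors are assumptions.

```python
import math


def correlation(previous, current):
    """Pearson correlation between the same subband in two adjacent frames."""
    n = len(current)
    mean_p = sum(previous) / n
    mean_c = sum(current) / n
    cov = sum((p - mean_p) * (c - mean_c) for p, c in zip(previous, current))
    var_p = sum((p - mean_p) ** 2 for p in previous)
    var_c = sum((c - mean_c) ** 2 for c in current)
    if var_p == 0 or var_c == 0:
        return 0.0           # assumption: a constant vector counts as uncorrelated
    return cov / math.sqrt(var_p * var_c)


def use_time_context(previous, current, threshold=0.5):
    """Decide whether the time-domain context is worth using."""
    return correlation(previous, current) >= threshold


frame_a = [1, 2, 3, 4, 4, 3, 2, 1]       # hypothetical quantized values
frame_b = [1, 2, 3, 5, 4, 3, 2, 1]
print(correlation(frame_a, frame_b), use_time_context(frame_a, frame_b))
```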
Given the encoded symbol in the previous frame, referred to as B, there is a large possibility of the input symbol C distributing around B. In G.719, for the m-ary (m symbols) adaptive arithmetic coding, the encoded symbol B is the center; m/2 symbols, which are located in the range of B and m − B (provided by − B to avoid negative symbol), would be chosen to add $\lambda ={\displaystyle \sum}_{i=1}^{m}f\left(i\right)$ on the basis of the original frequency, and δ = m/8, which can guarantee that the probability of half of all symbols is increased.
3.4 Variable rate in G.719
The bit rate is determined through three steps. The Huffman coding module is kept to calculate the saved bits and to prepare for the bit allocation. Let Sum be the total bits at a fixed bit rate. Firstly, the norms are coded simultaneously by the original Huffman coding, consuming h_{1} bits, and by the context-based adaptive arithmetic coding, consuming a_{1} bits. Compared to the Huffman coding, the context-based adaptive arithmetic coding saves L_{1} = h_{1} − a_{1} bits. The remaining bits num_{1} = Sum − h_{1} are used for the bit allocation of the quantized MDCT coefficients. In the second step, the subbands with different bits assigned by the bit allocation are encoded by the proposed adaptive arithmetic coding. The quantized MDCT coefficients are also encoded by Huffman coding, consuming h_{2} bits, to calculate the remaining bits num_{2} = Sum − h_{1} − h_{2} used for the noise level adjustment. The number of bits used for coding the quantized MDCT coefficients with the context-based adaptive arithmetic coding is a_{2}, which saves L_{2} = h_{2} − a_{2} bits compared to the Huffman coding. Finally, the noise level is adjusted according to num_{2}. The total bits and the bits used for the bit allocation and noise level adjustment in the improved encoder remain the same as those in the original fixed rate G.719; hence, the saved bits L_{1} + L_{2} (provided by the context-based adaptive arithmetic coding compared to the original Huffman coding) lead to the variable rate of G.719. To ensure correct decoding, the header in G.719 [7], which specifies the number of bits used for encoding, is changed to indicate variable bits instead of fixed bits.
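The bit accounting of the three steps can be written down as a small Python sketch. The function name, the dictionary keys, and the example numbers are hypothetical; the reading that a frame then occupies Sum − (L_{1} + L_{2}) bits is an interpretation of the paragraph above rather than a formula quoted from it.

```python
def variable_rate_bits(total, h1, a1, h2, a2):
    """Bit budget of one frame (illustrative).

    total: Sum, the bits per frame at the fixed rate
    h1/a1: bits for the norms with Huffman / context arithmetic coding
    h2/a2: bits for the MDCT coefficients with Huffman / context arithmetic coding
    """
    saved_norms = h1 - a1              # L1
    num1 = total - h1                  # left for MDCT bit allocation
    saved_mdct = h2 - a2               # L2
    num2 = total - h1 - h2             # left for noise level adjustment
    frame_bits = total - (saved_norms + saved_mdct)
    return {
        "L1": saved_norms,
        "num1": num1,
        "L2": saved_mdct,
        "num2": num2,
        "frame_bits": frame_bits,
    }


# Hypothetical frame at 32 kb/s (640 bits per frame).
print(variable_rate_bits(total=640, h1=147, a1=120, h2=430, a2=417))
```

Because num_{1} and num_{2} are still derived from the Huffman bit counts, the bit allocation and the noise level adjustment behave exactly as in the fixed rate coder, and only the transmitted frame size changes.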
4. Experimental results
4.1 Bit rate comparison
Table 1 Average bit rate of different signal types
Signal type  Average bit rate in fixed rate G.719 (kb/s)  Average bit rate in variable rate G.719 (kb/s) 

Music  32  29.4817 
Mixed music  32  29.6640 
Speech  32  29.9606 
Total  32  29.6512 
As shown in Table 1, our scheme achieves an average bit rate from 29.4817 to 29.9606 kb/s at the low bit rate coding mode, compared with the fixed rate of 32 kb/s. The coding gains of the three types of signal range from 6.4% to 7.9%, with an average coding gain of 7.2% over all the test samples. In particular, the bit rate saving for the music signal is the largest compared with the mixed music signal and the speech signal because of its good correlation in the time domain and the frequency domain.
Table 2 Coding modes in G.719
Coding mode  Fixed rate (kb/s)  Variable rate (kb/s) 

1  32  29.6512 
2  48  44.8484 
3  64  59.8988 
4  80  74.5056 
5  96  88.6831 
6  112  102.5974 
7  128  116.3237 
4.2 Investigation of the short-term coding efficiency
Table 3 The performance of bit allocation of each frame
Signal type  Bits of each frame in fixed rate G.719 (bits/frame)  The minimum bits in variable rate G.719 (bits/frame)  The maximum bits in variable rate G.719 (bits/frame) 

Music  640  495  725 
Mixed music  640  540  718 
Speech  640  550  714 
As can be seen from Table 3, the minimum number of bits per frame in the variable rate G.719 is lower than that in the fixed rate G.719. The maximum number of bits per frame in the variable rate G.719 exceeds that in the fixed rate G.719 only because the context model needs the first several input frames to become stable. Statistical analysis shows that a very large percentage, 99.1%, of all the frames need fewer than the fixed 640 bits, which guarantees the short-term coding efficiency of the proposed variable rate arithmetic coding. Owing to the good correlation in the time domain and in the frequency domain, the minimum number of bits in the variable rate G.719 is lowest for the music signal.
4.3 The performance comparison of different entropy coding
Table 4 The average number of bits when coding the quantized norms
Coding bits for the quantized norms  Huffman coding  Adaptive arithmetic coding  Context-based adaptive arithmetic coding 

Modes 1 to 7  147.1909  132.0607  119.8233 
Table 5 The average number of bits when coding the quantized MDCT coefficients
Coding bits for the quantized MDCT coefficients  Huffman coding  Adaptive arithmetic coding  Context-based adaptive arithmetic coding 

Mode 1  429.8683  418.1427  417.3993 
Mode 2  720.0894  695.6415  694.4265 
Mode 3  1,011.484  972.396  970.7839 
Mode 4  1,305.252  1,250.046  1,247.866 
Mode 5  1,604.597  1,532.503  1,530.054 
Mode 6  1,860.674  1,773.074  1,770.578 
Mode 7  2,239.736  2,131.121  2,128.619 
Table 6 The compression percentage of the quantized norms
Compression percentage (%)  Adaptive arithmetic coding  Context-based adaptive arithmetic coding 

Modes 1 to 7  10.27932  18.59329 
Table 7 The compression percentage of the quantized MDCT coefficients
Compression percentage (%)  Adaptive arithmetic coding  Context-based adaptive arithmetic coding 

Mode 1  2.727712  2.900664 
Mode 2  3.395124  3.563846 
Mode 3  3.864423  4.023802 
Mode 4  4.229515  4.396522 
Mode 5  4.492942  4.645568 
Mode 6  4.707953  4.842141 
Mode 7  4.849440  4.961144 
As can be seen from Tables 6 and 7, the compression percentage of the quantized norms is higher than that of the quantized MDCT coefficients. Since the variation of the quantized norms is smaller than that of the quantized MDCT coefficients, the conditional probability of the encoded symbol is larger for the quantized norms than for the quantized MDCT coefficients. Moreover, the correlation in the time domain of the quantized norms is higher than that of the quantized MDCT coefficients because the norms vary less. As a result, the context-based adaptive arithmetic coding performs better on the quantized norms than on the quantized MDCT coefficients.
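The compression percentages appear to be the relative bit saving with respect to Huffman coding. The short Python check below uses that assumed formula (it is inferred from the tabulated numbers, not quoted from the text) together with the average bit counts listed for the norms and for Mode 1 of the MDCT coefficients.

```python
def compression_percentage(huffman_bits, coded_bits):
    """Relative saving over Huffman coding, in percent (assumed definition)."""
    return 100.0 * (huffman_bits - coded_bits) / huffman_bits


# Quantized norms, modes 1 to 7: Huffman, adaptive, context-based adaptive.
norms = (147.1909, 132.0607, 119.8233)
# Quantized MDCT coefficients, mode 1.
mdct_mode1 = (429.8683, 418.1427, 417.3993)

for name, (huffman, adaptive, context) in (("norms", norms), ("MDCT mode 1", mdct_mode1)):
    print(name,
          round(compression_percentage(huffman, adaptive), 4),
          round(compression_percentage(huffman, context), 4))
```

The results agree with the tabulated percentages to within rounding, which supports reading the tables as savings relative to the Huffman baseline.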
4.4 Audio quality
The proposed context-based arithmetic coding is performed directly on the quantized audio parameters, and the technique is lossless, so the parameters decoded with the proposed arithmetic coding method should have no distortion. In the quality tests to evaluate the arithmetic coding, objective comparison tests were first used to verify the lossless coding. In the objective comparison, i.e., PEAQ [21] over a large number of speech and music samples, all samples generated by the proposed variable rate G.719 are the same as those of the fixed rate G.719. Secondly, we carried out preference listening tests to verify that the proposed scheme does not introduce any undesirable effects, although subjective listening tests are not needed if the sample values are unchanged. It is thus verified that the proposed variable rate coder has the same audio quality as the original G.719 under the different coding modes. In addition, we used the audio comparison tool ‘CompAudio’ [22] to check that all the sample values are equal before and after the arithmetic coding. The audio quality evaluation and the value comparison confirm that the proposed context-based adaptive arithmetic coding provides lossless compression of the quantized audio parameters, so the detailed test results need not be reported. As to the audio quality of the full codec (e.g., ITU-T G.719), the formal test results can be found in [23, 24].
4.5 Complexity test
Table 8 Average complexity comparison test results in terms of WMOPS
Signal type  Fixed rate G.719 encoder  Fixed rate G.719 decoder  Variable rate G.719 encoder  Variable rate G.719 decoder  Proposed modules encoder  Proposed modules decoder 

Music  6.5986  6.2988  9.6376  8.8378  3.0390  2.5390 
Mixed music  6.6356  6.3944  9.6641  8.9429  3.0285  2.5485 
Speech  6.6854  6.3825  9.7223  9.2516  3.0369  2.8691 
Total  6.6298  6.3288  9.6643  8.9808  3.0345  2.6520 
5. Conclusions
The novel context-based adaptive arithmetic coding technique proposed in this paper is promising for lossless compression when both the time and the frequency plane of the audio coding parameters are considered. The proposed technique has been introduced to compress the quantized MDCT coefficients and the quantized norms in G.719. A variable rate coding structure has also been investigated and adopted to obtain high coding efficiency compared with the original fixed rate G.719. Experiments have shown that the new technique achieves a coding gain of 6% to 10% over all coding modes for different types of signals, which is advantageous over conventional Huffman coding. To evaluate the performance of the proposed algorithm, objective and subjective quality tests have been carried out for a variety of speech and audio samples. The average bit rates and the computational complexity have also been computed at different coding modes. It is verified that the proposed variable rate coder with the adaptive arithmetic coding based on the time-frequency context produces the same audio quality as the original G.719 coder while achieving a high coding gain. The proposed method can be easily used in other audio codecs which need to lower the coding bit rate by means of entropy coding.
Declarations
Acknowledgements
The authors would like to thank the reviewers for their suggestions, which have contributed greatly to the improvement of the manuscript. The work in this paper is supported by the National Natural Science Foundation of China (no. 11161140319) and the cooperation between BIT and Ericsson.
Authors’ Affiliations
References
1. Fenwick PM: Huffman code efficiencies for extensions of sources. IEEE Trans. Commun. 1995, 43(2/3/4):163-165. 10.1109/26.380027
2. Huffman DA: A method for the construction of minimum-redundancy codes. Proc. IRE 1952, 40(9):1098-1101. 10.1109/JRPROC.1952.273898
3. Langdon GG: An introduction to arithmetic coding. IBM J. Res. Dev. 1984, 28(2):135-149. 10.1147/rd.282.0135
4. Hyungjin K, Jiangtao W, Villasenor JD: Secure arithmetic coding. IEEE Trans. Signal Process. 2007, 55(5):2263-2272. 10.1109/TSP.2007.892710
5. Information technology: Coding of Audio-Visual Objects - Part 3, Audio, Subpart 4: Time/Frequency Coding. International Organization for Standardization ISO/IEC 14496-3:1999, 1999
6. Neuendorf M, Gournay P, Multrus M, Lecomte J, Bessette B, Geiger R, Bayer S, Fuchs G, Hilpert J, Rettelbach N, Salami R, Schuller G, Lefebvre R, Grill B: Unified speech and audio coding scheme for high quality at low bitrates. Proc of IEEE Int Conf Acoustics, Speech and Signal Processing 2009, 1-4. 10.1109/ICASSP.2009.4959505
7. ITU-T Recommendation: G.719 (06/08), Low-complexity full-band audio coding for high-quality conversational applications. Geneva: Int Telecomm Union; 2008.
8. Zhang L, Wu X, Zhang N, Gao W, Wang Q, Zhao D: Context-based arithmetic coding reexamined for DCT video compression. In IEEE International Symposium on Circuits and Systems. New Orleans; 2007:3147-3150. 10.1109/ISCAS.2007.378098
9. Ryabko B, Rissanen J: Fast adaptive arithmetic code for large alphabet sources with asymmetrical distributions. IEEE Commun. Lett. 2003, 7(1):33-35. 10.1109/LCOMM.2002.807424
10. Marpe D, Schwarz H, Wiegand T: Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard. IEEE T Circ Syst Vid 2003, 13(7):620-636. 10.1109/TCSVT.2003.815173
11. Information technology - MPEG audio technologies. International Organization for Standardization ISO/IEC 23003-3:2012, 2012
12. Shannon CE: A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27(3):379-423.
13. Fuchs G, Subbaraman V, Multrus M: Efficient context adaptive entropy coding for real-time applications. Proc of IEEE Int Conf Acoustics, Speech and Signal Processing 2011, 493-496. 10.1109/ICASSP.2011.5946448
14. Moradmand H, Payandeh A, Aref MR: Joint source-channel coding using finite state integer arithmetic codes. In IEEE International Conference on Electro/Information Technology. Windsor; 2009:19-22. 10.1109/EIT.2009.5189577
15. Huang YM, Liang YC: A secure arithmetic coding algorithm based on integer implementation. In International Symposium on Communications and Information Technologies. Hangzhou; 2011:518-521. 10.1109/ISCIT.2011.6092162
16. Witten IH, Neal RM, Cleary JG: Arithmetic coding for data compression. Communications of the ACM 1987, 30(6):520-540. 10.1145/214762.214771
17. Chen Y, Zhu H, Jin H, Sun XH: Improving the effectiveness of context-based prefetching with multi-order analysis. San Diego: International Conference on Parallel Processing Workshops; 2010:428-435. 10.1109/ICPPW.2010.64
18. Pasi O: Toll quality variable-rate speech codec. Int Conf Acoust Spee 1997, 2:747-750.
19. Dong E, Zhao H, Li Y: Low bit and variable rate speech coding using local cosine transform. Proceedings of TENCON on Computers, Communications, Control and Power Engineering 2002, 1:28-31.
20. McClellan S, Gibson JD: Variable rate CELP based on subband flatness. IEEE T Speech Audi P 1997, 5(2):120-130. 10.1109/89.554774
21. ITU-R Recommendation: BS.1387-1 (11/01), Method for Objective Measurements of Perceived Audio Quality. Geneva: Int Telecomm Union; 2001.
22. Kabal P: CompAudio. 1996. http://www.csee.umbc.edu/help/sound/AFspV2R1/html/audio/CompAudio.html. Accessed 20 January 2013
23. Xie M, Chu P, Taleb A, Briand M: ITU-T G.719: A new low-complexity full-band (20 kHz) audio coding standard for high-quality conversational applications. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. New Paltz; 2009:265-268. 10.1109/ASPAA.2009.5346487
24. Taleb A, Karapetkov S: G.719: The first ITU-T standard for high-quality conversational fullband audio coding. IEEE Communications Magazine 2009, 47(10):124-130. 10.1109/MCOM.2009.5273819
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.