Current high-quality audio coding techniques focus mainly on coding efficiency, which makes them extremely sensitive to channel noise, especially in wireless channels with high error rates. In our previous work, we developed a progressive high-quality audio codec, which was shown to outperform the MPEG-4 version 2 scalable audio codec. In this work, we extend the error-free progressive audio codec to an error-resilient scalable audio codec by reorganizing the bitstream and modifying the noiseless coding module. A dynamic segmentation scheme divides the audio bitstream into several segments. Each segment contains independently decodable data, so errors do not propagate across segment boundaries. An unequal error protection scheme is then adopted to improve the error resilience of the final bitstream. The proposed algorithm is tested under different error patterns of WCDMA channels with several test audio materials. Our experimental results show that the proposed approach achieves excellent error resilience at a regular user bit rate of 64 kb/s.
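The idea that errors stay inside one segment can be illustrated with a toy example. The sketch below is not the codec's dynamic segmentation scheme or its noiseless coder: it assumes simple delta coding as a stand-in for any coder whose decoder carries state from one symbol to the next, and the function names are hypothetical. Each segment restarts from an absolute value, so a corrupted symbol damages only the rest of its own segment.

```python
def encode_segments(samples, segment_len):
    """Delta-code samples in independent segments.

    Each segment starts with an absolute value, followed by differences,
    so it can be decoded without any state from earlier segments.
    """
    segments = []
    for start in range(0, len(samples), segment_len):
        chunk = samples[start:start + segment_len]
        deltas = [chunk[0]] + [b - a for a, b in zip(chunk, chunk[1:])]
        segments.append(deltas)
    return segments


def decode_segments(segments):
    """Decode each segment on its own; the running value resets per segment."""
    out = []
    for deltas in segments:
        value = 0
        for d in deltas:
            value += d
            out.append(value)
    return out


def damaged_positions(original, decoded):
    """Indices at which the decoded signal differs from the original."""
    return [i for i, (a, b) in enumerate(zip(original, decoded)) if a != b]


if __name__ == "__main__":
    samples = list(range(10, 22))          # 12 samples

    segmented = encode_segments(samples, 4)
    segmented[1][1] += 5                   # simulate a channel error
    print(damaged_positions(samples, decode_segments(segmented)))
    # prints [5, 6, 7]: the error stops at the segment boundary

    single = encode_segments(samples, 12)  # one long segment
    single[0][5] += 5                      # same error, same position
    print(damaged_positions(samples, decode_segments(single)))
    # prints [5, 6, 7, 8, 9, 10, 11]: the error runs to the end
```

Shorter segments confine errors more tightly but spend more bits on restart information, which is one reason a segmentation scheme might adapt segment boundaries to the data.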
Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine-grain bit rate scalability in the Generic Audio Coder (GAC) through its Bit-Sliced Arithmetic Coding (BSAC) tool, which provides scalability in steps of 1 kbit/s per audio channel. However, this fine-grain scalability tool is only available for mono and stereo audio material, and little work has been done on progressively transmitting multichannel audio sources. MPEG Advanced Audio Coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop a progressive syntax-rich multichannel audio codec in this work. It not only supports fine-grain bit rate scalability for the multichannel audio bitstream, but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm outperforms MPEG-4 BSAC at several different bit rates for mono audio sources.
In our previous work, we proposed the MAACKL method, a modified MPEG Advanced Audio Coding (AAC) scheme that uses the Karhunen-Loeve transform (KLT) to remove inter-channel redundancy. However, a straightforward coding of the elements of the KLT matrix generates about 240 bits per matrix for typical five-channel audio content. This overhead is expensive enough to prevent MAACKL from updating the KLT dynamically over short periods of time. In this research, we study the de-correlation efficiency of the adaptive KLT as well as an efficient way to encode the elements of the KLT matrix via vector quantization. The effects of different quantization accuracies and adaptation periods are examined carefully. It is demonstrated that, with the smallest possible number of bits per matrix and a moderately long KLT adaptation time, the MAACKL algorithm can still deliver very good coding performance.
An embedded high-quality multichannel audio coding algorithm is proposed in this research. The Karhunen-Loeve Transform is applied to multichannel audio signals in the preprocessing stage to remove inter-channel redundancy. Then, after several audio coding blocks have been processed, the transformed coefficients are quantized in layers and the bit stream is ordered according to the importance of its bits. The multichannel audio bit stream generated by the proposed algorithm is fully progressive, which is highly desirable for audio multicast applications in heterogeneous networks. Experimental results show that, compared with the MPEG Advanced Audio Coding algorithm, the proposed algorithm achieves better performance in both the objective Mask-to-Noise-Ratio measurement and the subjective listening test at several different bit rates.
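A generic way to picture layered quantization with importance ordering is bit-plane coding, sketched below. This is only an illustration of the progressive property, not the algorithm's actual quantizer or bit stream syntax: it assumes non-negative integer coefficient magnitudes, and the function names are hypothetical. Planes are sent from most to least significant, so a receiver that stops early still gets a coarser version of every coefficient.

```python
def to_bit_planes(magnitudes, num_planes):
    """Split non-negative integers (< 2**num_planes) into bit planes.

    The returned list is ordered from the most significant plane to the
    least significant one, i.e. from most to least important.
    """
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, num_planes):
    """Rebuild magnitudes from the first len(planes) planes received.

    Planes that were never received are treated as all zeros, which gives
    a coarser approximation of every coefficient. At least one plane is
    needed so the number of coefficients is known.
    """
    values = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = num_planes - 1 - k
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values


if __name__ == "__main__":
    coeffs = [13, 6, 1, 9]
    planes = to_bit_planes(coeffs, 4)
    print(from_bit_planes(planes[:2], 4))  # prints [12, 4, 0, 8]
    print(from_bit_planes(planes, 4))      # prints [13, 6, 1, 9]
```

Truncating the plane list at any point yields a valid, lower-quality reconstruction, which is the behavior a progressive bit stream needs when receivers have different bandwidths.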
KEYWORDS: Computer programming, Algorithm development, FDA class II medical device development, Signal processing, Associative arrays, Rhodium, Fourier transforms, System integration, Evolutionary algorithms, Algorithms
A new quality-scalable multichannel audio compression algorithm based on MPEG-2 Advanced Audio Coding (AAC) is developed in this work. The Karhunen-Loeve Transform (KLT) is applied to multichannel audio signals in the preprocessing stage to remove inter-channel redundancy. Then, the signals in the de-correlated channels are compressed with a modified AAC main profile encoder. Finally, a channel transmission control mechanism reorganizes the bit stream so that the multichannel audio bit stream is quality-scalable when it is transmitted over a heterogeneous network. Experimental results show that, compared with AAC, the proposed algorithm achieves better performance in the objective Mask-to-Noise-Ratio (MNR) measurement while maintaining similar computational complexity at the regular bit rate of 64 kbit/s per channel. When the bit stream is transmitted to narrowband end users at a lower bit rate, packets of some channels can be dropped and audio for all channels can still be reconstructed reasonably well.
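The inter-channel KLT step can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder described above: it estimates one covariance matrix over the whole signal, finds its eigenvectors with a plain cyclic Jacobi iteration, and uses hypothetical helper names and synthetic test channels. Rows of the resulting matrix are eigenvectors sorted by decreasing variance, and because the matrix is orthogonal, its transpose inverts the transform.

```python
import math
import random


def covariance(channels):
    """Covariance matrix of equal-length channels (one row per channel)."""
    n, k = len(channels[0]), len(channels)
    means = [sum(ch) / n for ch in channels]
    return [[sum((channels[i][t] - means[i]) * (channels[j][t] - means[j])
                 for t in range(n)) / n
             for j in range(k)] for i in range(k)]


def jacobi_eigen(matrix, sweeps=30, tol=1e-24):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix."""
    k = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(k)] for i in range(k)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(k) for j in range(k) if i != j)
        if off < tol:
            break
        for p in range(k - 1):
            for q in range(p + 1, k):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q], kept within +/- 45 deg.
                num, diff = 2.0 * a[p][q], a[q][q] - a[p][p]
                if diff < 0:
                    num, diff = -num, -diff
                theta = 0.5 * math.atan2(num, diff)
                c, s = math.cos(theta), math.sin(theta)
                for i in range(k):       # a <- a * J (columns p and q)
                    aip, aiq = a[i][p], a[i][q]
                    a[i][p], a[i][q] = c * aip - s * aiq, s * aip + c * aiq
                for j in range(k):       # a <- J^T * a (rows p and q)
                    apj, aqj = a[p][j], a[q][j]
                    a[p][j], a[q][j] = c * apj - s * aqj, s * apj + c * aqj
                for i in range(k):       # accumulate eigenvectors
                    vip, viq = v[i][p], v[i][q]
                    v[i][p], v[i][q] = c * vip - s * viq, s * vip + c * viq
    return [a[i][i] for i in range(k)], v


def klt_matrix(channels):
    """KLT matrix: eigenvectors as rows, sorted by decreasing eigenvalue."""
    eigvals, vecs = jacobi_eigen(covariance(channels))
    order = sorted(range(len(eigvals)), key=lambda i: -eigvals[i])
    return [[vecs[r][i] for r in range(len(vecs))] for i in order]


def apply_matrix(matrix, channels):
    """Multiply a k-by-k matrix with k channels, sample by sample."""
    n, k = len(channels[0]), len(channels)
    return [[sum(row[c] * channels[c][t] for c in range(k)) for t in range(n)]
            for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def demo_channels(n=500, seed=0):
    """Three synthetic, strongly correlated channels (illustrative only)."""
    rng = random.Random(seed)
    src = [rng.gauss(0, 1) for _ in range(n)]
    return [[g * x + 0.05 * rng.gauss(0, 1) for x in src]
            for g in (1.0, 0.8, -0.5)]
```

With `x = demo_channels()`, `m = klt_matrix(x)` and `y = apply_matrix(m, x)`, the channels in `y` are mutually uncorrelated and most of the energy sits in `y[0]`. Zeroing the low-energy rows of `y` before calling `apply_matrix(transpose(m), y)` mimics dropping the packets of some transformed channels while still reconstructing every original channel approximately.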