"Loop dee Loop" & MP3 Code
From: Universe (universe_at_covad.net)
Date: 04/03/04
Date: Sat, 3 Apr 2004 02:09:08 -0500
Here is some material on MP3 from its originator, the German research
institute Fraunhofer IIS. What I find really interesting is how they use
an outer loop with an inner loop nested inside it, each performing a
distinct yet related step in processing the signal.
Enjoy!
Elliott
*****************BEGIN QUOTE
History
In 1987, the Fraunhofer IIS started to work on perceptual audio coding in
the framework of the EUREKA project EU147, Digital Audio Broadcasting (DAB).
In a joint cooperation with the University of Erlangen (Prof. Dieter
Seitzer), the Fraunhofer IIS finally devised a very powerful algorithm that
is standardized as ISO-MPEG Audio Layer-3 (IS 11172-3 and IS 13818-3).
Without data reduction, digital audio signals typically consist of 16 bit
samples recorded at a sampling rate more than twice the actual audio
bandwidth (e.g. 44.1 kHz for Compact Discs). So you end up with more than
1.4 Mbit to represent just one second of stereo music in CD quality. By
using MPEG audio coding, you may shrink down the original sound data from a
CD by a factor of 12, without losing sound quality. Factors of 24 and even
more still maintain a sound quality that is significantly better than what
you get by just reducing the sampling rate and the resolution of your
samples. Basically, this is realized by perceptual coding techniques
addressing the perception of sound waves by the human ear.
Using MPEG audio, one may achieve a typical data reduction of
1:4 by Layer 1 (corresponds to 384 kbps for a stereo signal),
1:6...1:8 by Layer 2 (corresponds to 256..192 kbps for a stereo signal),
1:10...1:12 by Layer 3 (corresponds to 128..112 kbps for a stereo signal),
still maintaining the original CD sound quality.
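
The arithmetic behind these figures can be checked with a short sketch. The uncompressed rate is simply sampling rate x bits per sample x channels. The helper name pcm_kbps is hypothetical, not something from the quoted text. One observation, which is an inference from the numbers and not stated in the source: 384, 256, 192 and 128 kbps are exactly 1536 kbit/s divided by 4, 6, 8 and 12, so the quoted ratios look as if they were computed against a 48 kHz, 16-bit stereo reference rather than the 44.1 kHz CD rate.

```python
def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bitrate in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


cd = pcm_kbps(44_100, 16, 2)     # CD audio: 1411.2 kbit/s, i.e. about 1.4 Mbit/s
ref48 = pcm_kbps(48_000, 16, 2)  # 48 kHz reference (assumption): 1536.0 kbit/s

# Compare the compressed rate each reduction ratio would give for both references.
for ratio in (4, 6, 8, 12):
    print(f"1:{ratio:<2}  from CD: {cd / ratio:6.1f} kbps   "
          f"from 48 kHz: {ref48 / ratio:6.1f} kbps")
```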
By exploiting stereo effects and by limiting the audio bandwidth, the coding
schemes may achieve an acceptable sound quality at even lower bitrates. MPEG
Layer-3 is the most powerful member of the MPEG audio coding family. For a
given sound quality level, it requires the lowest bitrate - or for a given
bitrate, it achieves the highest sound quality.
Sound Quality
Some typical performance data of MPEG Layer-3 are:
sound quality            bandwidth   mode     bitrate          reduction ratio
telephone sound          2.5 kHz     mono     8 kbps *         96:1
better than short wave   4.5 kHz     mono     16 kbps          48:1
better than AM radio     7.5 kHz     mono     32 kbps          24:1
similar to FM radio      11 kHz      stereo   56...64 kbps     26...24:1
near-CD                  15 kHz      stereo   96 kbps          16:1
CD                       >15 kHz     stereo   112...128 kbps   14...12:1
*) Fraunhofer IIS uses a non-ISO extension of MPEG Layer-3 for enhanced
performance ("MPEG 2.5")
In all international listening tests, MPEG Layer-3 impressively proved its
superior performance, maintaining the original sound quality at a data
reduction of 1:12 (around 64 kbit/s per audio channel). If applications may
tolerate a limited bandwidth of around 10 kHz, a reasonable sound quality
for stereo signals can be achieved even at a reduction of 1:24.
For the use of low bit-rate audio coding schemes in broadcast applications
at bitrates of 60 kbit/s per audio channel, the ITU-R recommends MPEG
Layer-3. (ITU-R doc. BS.1115)
Details
Filter bank
The filter bank used in MPEG Layer-3 is a hybrid filter bank which consists
of a polyphase filter bank and a Modified Discrete Cosine Transform (MDCT).
This hybrid form was chosen for reasons of compatibility to its
predecessors, Layer-1 and Layer-2.
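
To make the MDCT half of the hybrid filter bank a little more concrete, here is a minimal sketch of a windowed MDCT with overlap-add reconstruction. It is a textbook formulation, not the Layer-3 reference code: the polyphase stage is left out entirely, the sine window and the block size N = 18 are assumptions chosen for illustration, and all function names are hypothetical. It shows the property that makes the transform useful: each block of 2N samples yields only N coefficients, yet overlapping blocks by 50% lets the aliasing cancel so the signal is recovered.

```python
import math


def sine_window(n_half):
    """Sine window of length 2*n_half; satisfies w[n]^2 + w[n + n_half]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block):
    """Map 2N windowed samples to N coefficients."""
    n_half = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples (2/N scaling)."""
    n_half = len(coeffs)
    return [
        2.0 / n_half * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analyze_synthesize(signal, n_half):
    """Window, transform, inverse-transform, window again, and overlap-add."""
    win = sine_window(n_half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n_half + 1, n_half):
        block = [signal[start + n] * win[n] for n in range(2 * n_half)]
        recon = imdct(mdct(block))
        for n in range(2 * n_half):
            out[start + n] += recon[n] * win[n]
    return out


N = 18  # assumption: illustrative block size
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(4 * N)]
rebuilt = analyze_synthesize(signal, N)

# Only samples covered by two overlapping blocks are fully reconstructed.
max_err = max(abs(a - b) for a, b in zip(signal[N:3 * N], rebuilt[N:3 * N]))
print("max reconstruction error in the overlapped region:", max_err)
```

The first and last N samples are covered by a single block only, so they come back windowed and aliased; a real coder handles this by keeping state across frames.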
Perceptual Model
The perceptual model mainly determines the quality of a given encoder
implementation. It uses either a separate filter bank or combines the
calculation of energy values (for the masking calculations) and the main
filter bank. The output of the perceptual model consists of values for the
masking threshold or the allowed noise for each coder partition. If the
quantization noise can be kept below the masking threshold, then the
compression results should be indistinguishable from the original signal.
Joint Stereo
Joint stereo coding takes advantage of the fact that both channels of a
stereo channel pair contain largely the same information. These stereophonic
irrelevancies and redundancies are exploited to reduce the total bitrate.
Joint stereo is used in cases where only low bitrates are available but
stereo signals are desired.
Quantization and Coding
A system of two nested iteration loops is the common solution for
quantization and coding in a Layer-3 encoder.
Quantization is done via a power-law quantizer. In this way, larger values
are automatically coded with less accuracy and some noise shaping is already
built into the quantization process.
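
A small sketch may help show what "power-law" means here. The exponent 0.75 is an assumption on my part (it is the value I understand Layer-3 to use, but the quoted text does not state it), and the function names are hypothetical. The point is only that the reconstruction levels get further apart as the values grow, so large values are coded more coarsely.

```python
import math

EXPONENT = 0.75  # assumption: 3/4 power law; not given in the quoted text


def quantize(value, step):
    """Compress the magnitude with a power law, then round to an integer index."""
    magnitude = round((abs(value) / step) ** EXPONENT)
    return int(math.copysign(magnitude, value))


def dequantize(index, step):
    """Invert the power law to get the reconstructed value."""
    return math.copysign(abs(index) ** (1 / EXPONENT), index) * step


def level_spacing(index, step=1.0):
    """Distance between the reconstruction levels for index and index + 1."""
    return dequantize(index + 1, step) - dequantize(index, step)


for q in (1, 10, 100):
    print(f"index {q:3d}: reconstructed {dequantize(q, 1.0):8.2f}, "
          f"gap to next level {level_spacing(q):5.2f}")
```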
The quantized values are coded by Huffman coding. As a specific method for
entropy coding, Huffman coding is lossless. This is called noiseless coding
because no noise is added to the audio signal.
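
The lossless part can be illustrated with a generic Huffman coder. This is a sketch under assumptions: as far as I know a Layer-3 encoder picks from predefined tables rather than building one per block, so the table construction below is only there to show the principle. The sample data and all names are made up for the example.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix code in which more frequent symbols get shorter code words."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code-so-far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, table):
    return "".join(table[s] for s in symbols)


def decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return out


# Made-up quantized values: mostly small, as after quantization.
quantized = [0, 0, 0, 0, 0, 0, 1, 1, 1, -1, -1, 2, 0, 0, 1, 5]
table = huffman_table(quantized)
bits = encode(quantized, table)
print(table)
print(len(bits), "bits for", len(quantized), "values")
```

Decoding the bit string returns exactly the input list, which is what "noiseless" means in the quoted text: all the loss happens in the quantizer, none in this stage.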
The process to find the optimum gain and scalefactors for a given block,
bit-rate and output from the perceptual model is usually done by two nested
iteration loops in an analysis-by-synthesis way:
Inner iteration loop (rate loop)
The Huffman code tables assign shorter code words to (more frequent) smaller
quantized values. If the number of bits resulting from the coding operation
exceeds the number of bits available to code a given block of data, this can
be corrected by adjusting the global gain to result in a larger quantization
step size, leading to smaller quantized values. This operation is repeated
with different quantization step sizes until the resulting bit demand for
Huffman coding is small enough. The loop is called rate loop because it
modifies the overall coder rate until it is small enough.
Outer iteration loop (noise control/distortion loop)
To shape the quantization noise according to the masking threshold,
scalefactors are applied to each scalefactor band. The system starts with a
default factor of 1.0 for each band. If the quantization noise in a given
band is found to exceed the masking threshold (allowed noise) as supplied by
the perceptual model, the scalefactor for this band is adjusted to reduce
the quantization noise. Since achieving a smaller quantization noise
requires a larger number of quantization steps and thus a higher bitrate,
the rate adjustment loop has to be repeated every time new scalefactors are
used. In other words, the rate loop is nested within the noise control loop.
The outer (noise control) loop is executed until the actual noise (computed
from the difference of the original spectral values minus the quantized
spectral values) is below the masking threshold for every scalefactor band
(i.e. critical band).
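
The two loops described above can be sketched as follows. This is a toy model under loud assumptions, not Fraunhofer's encoder: bit_cost is a stand-in for real Huffman bit counting, the step multiplier 2**0.25, the scalefactor multiplier 2**0.5, the 0.75 exponent, the iteration cap, and the rule "give up when every band is too noisy" are all choices made for the example, and every name is hypothetical.

```python
import math


def quantize(x, step):
    return int(math.copysign(round((abs(x) / step) ** 0.75), x))


def dequantize(q, step):
    return math.copysign(abs(q) ** (4 / 3), q) * step


def bit_cost(indices):
    """Stand-in for Huffman bit counting: smaller magnitudes cost fewer bits."""
    return sum(1 + 2 * abs(q).bit_length() for q in indices)


def rate_loop(values, bit_budget, step=1.0):
    """Inner loop: enlarge the step size until the block fits the bit budget."""
    if bit_budget < len(values):
        raise ValueError("budget too small for this toy bit_cost")
    while True:
        indices = [quantize(x, step) for x in values]
        if bit_cost(indices) <= bit_budget:
            return step, indices
        step *= 2 ** 0.25


def encode_block(bands, allowed_noise, bit_budget, max_outer=32):
    """Outer loop: amplify bands whose noise exceeds the threshold, then re-run
    the rate loop, until every band is below its allowed noise."""
    factors = [1.0] * len(bands)
    for _ in range(max_outer):
        scaled = [x * f for band, f in zip(bands, factors) for x in band]
        step, indices = rate_loop(scaled, bit_budget)  # nested inner loop
        noisy, pos = [], 0
        for b, (band, f) in enumerate(zip(bands, factors)):
            q_band = indices[pos:pos + len(band)]
            pos += len(band)
            # Noise = energy of (original - reconstructed) within the band.
            noise = sum((x - dequantize(q, step) / f) ** 2
                        for x, q in zip(band, q_band))
            if noise > allowed_noise[b]:
                noisy.append(b)
        if not noisy:
            return factors, step, indices, True
        if len(noisy) == len(bands):
            break  # amplifying every band would just be undone by the rate loop
        for b in noisy:
            factors[b] *= 2 ** 0.5
    return factors, step, indices, False


# Made-up spectral values in three bands, with made-up thresholds.
bands = [[40.0, -35.0, 30.0], [4.0, -3.0, 2.5], [0.6, -0.4, 0.3]]
allowed = [4.0, 0.5, 0.05]
factors, step, indices, ok = encode_block(bands, allowed, bit_budget=40)
print("scalefactors:", factors)
print("step:", step, "bits:", bit_cost(indices), "all bands below threshold:", ok)
```

The shape matches the description in the quote: every change to the scalefactors changes the bit demand, so the rate loop has to run again inside each pass of the noise control loop.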
----------------------------------------------------------------------------
Imprint | Copyright ©1998-2004 Fraunhofer-Gesellschaft
***********************END QUOTE
Available at:
http://www.iis.fraunhofer.de/amm/techinf/layer3/index.htmld