Re: Audio CODEC choices



On 7/27/2011 1:08 PM, upsidedown@xxxxxxxxxxxxx wrote:
On Wed, 27 Jul 2011 00:57:03 -0700, Don Y <nowhere@xxxxxxxx> wrote:

I'm looking for ideas for an audio CODEC to use in
streaming audio to my "network speakers". I've debugged
the system using "raw" 16b samples. But, this wastes a
fair bit of bandwidth "needlessly" (?)

These days, why bother with compression in a real time network ?

Note that this was the approach I initially took! But, in doing
so, I fully planned on testing with the network quiescent, etc.
That's not something I can expect of a deployed system...

For instance 100baseT Ethernet segments are capable of carrying quite
a few uncompressed audio channels.

Using some differential PCM might drop the bit rate to one half
with low latency.

You don't always have control over what *else* is on the wire
(do all clients have QoS guarantees?, etc.) so you have no idea
as to what portion of the total bandwidth might be available to
you at any given time.

Similarly, you have no control over the latency of particular
packets wrt other consumers on the wire.

[unless, of course, you are designing a network *specifically*
for -- and dedicated to -- this purpose]

But, the issue goes beyond just network bandwidth!

Compression acts as a storage magnifier, too. I.e., if you
can efficiently decompress "on the fly", then you can store
compressed packets in your client's elastic store and effectively
increase the depth of that buffer -- with no associated
hardware cost.

E.g., if you are storing 2 channels of 16b data at ~50Ks/sec,
then you need 200KB/sec of stored audio. Probabilistically,
if your CDF indicates that "all" (ahem) of your packets will
arrive with a worst-case latency not exceeding 0.5sec, then
you *must* set aside 100KB *just* for the data buffer
(neglecting any other memory requirements in your device).

If you can get 2:1 compression *in* the CODEC, then that
drops to 50K. Alternatively, your probability of having to
deal with a dropout decreases (though not proportionately as
you're probably out on the tail of the CDF at that point) for
the same size buffer.
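To make the arithmetic above concrete, here's a quick sketch (the
figures are just the example numbers from this post, not any
particular design):

```python
# Buffer sizing for the elastic store, using the example figures above.
channels = 2
bytes_per_sample = 2          # 16b samples
sample_rate = 50_000          # ~50Ks/sec
worst_case_latency = 0.5      # seconds, from the (hypothetical) CDF

# Raw stream rate in bytes/sec.
rate = channels * bytes_per_sample * sample_rate

# Elastic store needed to ride out the worst-case latency, uncompressed.
buffer_raw = int(rate * worst_case_latency)

# Same buffer depth if the CODEC delivers 2:1 compression.
buffer_2to1 = buffer_raw // 2

print(rate)         # 200000 bytes/sec
print(buffer_raw)   # 100000 bytes -- the ~100KB set aside above
print(buffer_2to1)  # 50000 bytes  -- the ~50K figure with compression
```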

If you are trying to put this into something (physically) small
and *inexpensive*, this can make a big difference (e.g., when
you consider the PHY, power conditioning, connectors, audio
amp, etc., you really don't have much room to fit "external
memory"... especially if you're NOT interested in megabytes
of it!)

Any "frames" (if present) should be relatively small
as I may need to synchronize streams to a fraction of
a frame (consequences for space requirements).

Ideally, the decoder should directly support sample
interpolation (though this could be done in a post-processing
step) to synchronize to a fraction of a sample interval.
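By "sample interpolation" I mean something like the following sketch:
a fractional-sample delay via two-point linear interpolation (mono
float buffer assumed; a real decoder would likely use a better
interpolation kernel, this just shows the idea):

```python
def delay_fractional(samples, frac):
    """Shift 'samples' by 'frac' of one sample period (0.0 <= frac < 1.0)
    using two-point linear interpolation. Output is one sample shorter,
    since the last output point would need a sample past the end."""
    out = []
    for i in range(len(samples) - 1):
        # Blend each pair of adjacent samples by the fractional offset.
        out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
    return out

# E.g., a ramp delayed by half a sample interval:
print(delay_fractional([0.0, 1.0, 2.0], 0.5))  # [0.5, 1.5]
```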

Why ?

At 48 kHz sampling rate, during one sample period, the sound moves 7
mm, thus moving the speaker by a few millimeters will have the same
effect.

You're assuming the sample rate is fixed at 48K *and* that you
can "move the speaker". :>

The CODEC shouldn't care what the sample rate is (barring some
constants). So, I should be able to use the same software
to drive infrasonics (trading off frequency response, sample rate
and buffer depth)

This also increases the signal processing possibilities in the
client (e.g., a client could then resample efficiently).
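For illustration, client-side resampling can be as simple as the
linear-interpolation sketch below (again, a toy -- a production
resampler would use a polyphase filter, but it shows why being
sample-rate agnostic helps the client):

```python
def resample_linear(samples, ratio):
    """Resample a mono float sequence by 'ratio' (output_rate / input_rate)
    using linear interpolation between adjacent input samples."""
    n_out = int((len(samples) - 1) * ratio) + 1
    out = []
    for k in range(n_out):
        pos = k / ratio          # position of this output sample in input units
        i = int(pos)
        frac = pos - i
        if i + 1 < len(samples):
            out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        else:
            out.append(samples[i])  # clamp at the final input sample
    return out

# Doubling the rate of a two-sample ramp fills in the midpoint:
print(resample_linear([0.0, 2.0], 2.0))  # [0.0, 1.0, 2.0]
```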