Re: codecs.getencoder encodes entire string ?



nicolas_riesch wrote:

> I just don't understand why it returns the "length consumed".
>
> Does it mean that in some cases the input string can be only partially
> converted?

For an encoder, I believe the answer is "no". For a decoder, it is
a definite yes: if the input does not end with a complete character,
you may have bytes left at the end which did not get decoded.
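A minimal sketch of the decoder case, assuming Python 3 and the stateless
codecs.utf_8_decode function called with final=False (the names data,
truncated and leftover are only for illustration):

```python
import codecs

data = "é€".encode("utf-8")      # b'\xc3\xa9\xe2\x82\xac', 5 bytes
truncated = data[:-1]            # cut the last byte of the 3-byte euro sign

# final=False tells the decoder that more input may follow, so it stops
# before the incomplete sequence instead of raising UnicodeDecodeError.
text, consumed = codecs.utf_8_decode(truncated, "strict", False)

# The bytes that were not decoded; a caller would keep them for later.
leftover = truncated[consumed:]
```

Here consumed is smaller than len(truncated), and leftover holds the
start of the character that has not fully arrived yet.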

For an encoder, the same *might* happen if you want to encode
half-surrogates into, say, UTF-8; the encoder might refuse to
encode the half-surrogate, and wait for the other half. In practice,
though, the current UTF-8 encoder just encodes the surrogate
code point as if it were a proper character.
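How a lone surrogate is treated depends on the Python version. The
sketch below assumes Python 3, where, as far as I know, the UTF-8 encoder
rejects it by default and encodes it only with the "surrogatepass" error
handler (the variable names are illustrative):

```python
import codecs

encode = codecs.getencoder("utf-8")
lone = "\ud800"   # a lone high surrogate, with no low surrogate after it

# With the default "strict" handler the encoder refuses the input.
try:
    encode(lone)
    refused = False
except UnicodeEncodeError:
    refused = True

# "surrogatepass" asks the codec to encode the surrogate like any
# other code point.
passed, consumed = encode(lone, "surrogatepass")
```

In neither case does the encoder consume part of the input and wait for
the rest: it either fails or consumes everything.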

If you extend the notion of "encoding", similar things may happen
all the time. E.g. a DES encoder may only support multiples of
the block size, and leave bytes at the end.

> What can be the use of the "length consumed" value ?

It's primarily intended for stream writers, which may need
to buffer extra characters at the end that did not get encoded,
and wait until more input is provided.
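To illustrate the buffering idea, here is a rough sketch. Both
block_encode and BufferingWriter are hypothetical names made up for this
example, not part of the codecs module; the toy encoder stands in for
something like a block cipher that only accepts whole blocks:

```python
import io

BLOCK = 4  # hypothetical block size, for illustration only


def block_encode(data):
    """Toy codec-style encoder: consumes only whole blocks of input."""
    usable = len(data) - len(data) % BLOCK
    # Return the usual codec pair: (output, length consumed).
    return data[:usable].upper(), usable


class BufferingWriter:
    """Hypothetical writer that keeps unconsumed input for the next write."""

    def __init__(self, stream, encode):
        self.stream = stream
        self.encode = encode
        self.pending = b""

    def write(self, data):
        data = self.pending + data
        output, consumed = self.encode(data)
        self.stream.write(output)
        # Whatever the encoder did not consume waits for more input.
        self.pending = data[consumed:]


sink = io.BytesIO()
writer = BufferingWriter(sink, block_encode)
writer.write(b"abcdef")   # one block is encoded, b"ef" is buffered
writer.write(b"gh")       # the buffered bytes now complete a second block
```

Without the "length consumed" value the writer would have no way to know
which part of the input still needs to be kept.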

For all practical purposes, you can ignore the length on
encoding. If you are paranoid, assert that it equals the
length of the input.
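The paranoid variant might look like this (a small sketch, assuming
Python 3; the sample string is arbitrary):

```python
import codecs

encode = codecs.getencoder("utf-8")

text = "naïve"
encoded, consumed = encode(text)

# The consumed length counts input characters, not output bytes.
assert consumed == len(text)
```

Note that len(encoded) can differ from consumed, since one character may
take several bytes.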

Regards,
Martin