Re: comparing binary strings

Quoth Joost Diepenmaat <joost@xxxxxxxxx>:
> On Mon, 10 Dec 2007 14:27:12 +0000, Ben Morrow wrote:
>
> > You can't. You need to perform *the comparisons* under 'use bytes'.
>
> No, you don't need to. The only time the encoding of the strings is
> important is when you're passing them to external code as a C-style char*
> pointer. Or at least it should be.

I agree, it should be; it isn't, however. For instance, under 5.8.8, a
string containing "\xc1" (capital A acute) will match /\w/ if it is
utf8-encoded internally and will not match if it isn't. I'm not sure
whether this is fixed in 5.10; I'm not sure, either, what the correct
fix would be.
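
A minimal sketch of the kind of test I mean, assuming a perl with no
locale or Unicode-related pragmas in effect. The variable names and the
observe() helper are just for illustration. It builds the same
one-character string twice, upgrades one copy to the internal utf8
representation, and reports whether each copy matches /\w/. What it
prints for the non-upgraded copy may depend on the perl version and the
pragmas in scope, so run it rather than trusting my memory.

```perl
use strict;
use warnings;

# Two copies of the same one-character string, U+00C1.
my $plain    = "\xc1";      # stored internally as a single byte
my $upgraded = "\xc1";
utf8::upgrade($upgraded);   # same character, now utf8-encoded internally

# Hypothetical helper: report the internal flag and the /\w/ result.
sub observe {
    my ($s) = @_;
    return {
        is_utf8 => utf8::is_utf8($s) ? 1 : 0,
        word    => ($s =~ /\w/)      ? 1 : 0,
    };
}

my $p    = observe($plain);
my $u    = observe($upgraded);
my $same = ($plain eq $upgraded) ? 1 : 0;   # character-level comparison

printf "eq: %d\n", $same;
printf "plain:    is_utf8=%d  \\w=%d\n", @{$p}{qw(is_utf8 word)};
printf "upgraded: is_utf8=%d  \\w=%d\n", @{$u}{qw(is_utf8 word)};
```

The two copies compare equal with eq, since that works on characters
rather than on the internal bytes; any difference in the \w column comes
purely from the internal representation.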

Ben
