Re: strange effect with [:lower:] in perl

From: Alan J. Flavell (flavell_at_ph.gla.ac.uk)
Date: 10/28/03


Date: Tue, 28 Oct 2003 18:46:30 +0000

On Mon, 27 Oct 2003, Abigail wrote:

> "" > [a-z0-9] # Lowercase letters *and* digits.
> ""
> "" Surely that only refers to a subset of what Unicode considers to be
> "" "letters"?
>
> Yeah, but that's what [:lower:] seems to do too:
>
> $ perl -wle 'for (0x00 .. 0x80) {

Surely you meant to set the limit at 0xff or so for this
demonstration?

> printf "%02x %s\n", $_, chr if chr () =~ /[[:lower:]]/}'

[snip]

> No lowercase accented letters here.

Curious. No surprise when the limit's set at 0x80, as I'm sure you'd
agree; but I must admit I was surprised at the accented lower-case
letters up to 0xff not being counted, despite the accented lower-case
letters above 0x100 being counted. Prima facie I think there's
something wrong here, no? (This is perl 5.8.0 per RedHat 9).

If I set the upper limit at, say, 0xfff, then I get lots of lower-case
letters reported in the blocks of extended Latin, Greek, Coptic,
Cyrillic and Armenian.

And there are more still, e.g. 0x2149 "DOUBLE-STRUCK ITALIC SMALL J"
;-)
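
For anyone wanting to reproduce the comparison, here is a minimal sketch of the three cases discussed above. The helper name lower_in_range and its $upgrade switch are made up for illustration and are not from the original one-liner. The sketch assumes no locale and no "unicode_strings" feature in effect; utf8::upgrade() is used only to show that the internal representation of the string appears to be what changes the result for 0x80..0xff, not as a recommended fix.

```perl
use strict;
use warnings;

# Hypothetical helper: return the code points in $from..$to whose
# one-character string matches the POSIX class [[:lower:]].
# When $upgrade is true, the string is first converted to Perl's
# internal UTF-8 representation with utf8::upgrade().
sub lower_in_range {
    my ($from, $to, $upgrade) = @_;
    my @hits;
    for my $cp ($from .. $to) {
        my $ch = chr $cp;
        utf8::upgrade($ch) if $upgrade;
        push @hits, $cp if $ch =~ /[[:lower:]]/;
    }
    return @hits;
}

# 0x80..0xff as plain 8-bit strings (the case that looked wrong).
my @native   = lower_in_range(0x80, 0xff, 0);
# The same range, but with each string upgraded before matching.
my @upgraded = lower_in_range(0x80, 0xff, 1);
# Code points from 0x100 up, which Perl always stores as UTF-8.
my @above    = lower_in_range(0x100, 0xfff, 0);

# Only counts and code points are printed, so no wide-character
# warnings are triggered.
printf "0x80..0xff, not upgraded: %d matches\n", scalar @native;
printf "0x80..0xff, upgraded:     %d matches\n", scalar @upgraded;
printf "0x100..0xfff:             %d matches\n", scalar @above;
printf "first upgraded hits: %s\n",
    join ' ', map { sprintf '%02x', $_ } @upgraded[0 .. 4];
```

The exact counts depend on the perl version and its Unicode tables, so none are given here. The point of comparison is simply whether the first and second lines differ.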


