Re: strange effect with [:lower:] in perl
From: Abigail (abigail_at_abigail.nl)
Date: 10/28/03
- Next message: Greg G: "system() rc hosed?"
- Previous message: Ben Morrow: "Re: regex for stripping HTML"
- In reply to: Alan J. Flavell: "Re: strange effect with [:lower:] in perl"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 28 Oct 2003 19:48:44 GMT
Alan J. Flavell (flavell@ph.gla.ac.uk) wrote on MMMDCCX September
MCMXCIII in <URL:news:Pine.LNX.4.53.0310281819040.28979@ppepc56.ph.gla.ac.uk>:
][ On Mon, 27 Oct 2003, Abigail wrote:
][
][ > "" > [a-z0-9] # Lowercase letters *and* digits.
][ > ""
][ > "" Surely that only refers to a subset of what Unicode considers to be
][ > "" "letters"?
][ >
][ > Yeah, but that's what [:lower:] seems to do too:
][ >
][ > $ perl -wle 'for (0x00 .. 0x80) {
][
][ Surely you meant to set the limit at 0xff or so for this
][ demonstration?
Yes, I did. However, it doesn't change the outcome.
][ > printf "%02x %s\n", $_, chr if chr () =~ /[[:lower:]]/}'
][
][ [snip]
][
][ > No lowercase accented letters here.
][
][ Curious. No surprise when the limit's set at 0x80, as I'm sure you'd
][ agree; but I must admit I was surprised at the accented lower-case
][ letters up to 0xff not being counted, despite the accented lower case
][ letters above 0x100 being counted. Prima facie I think there's
][ something wrong here, no? (This is perl 5.8.0 per RedHat 9).
Maybe, maybe not. I'm still confused about what Perl is doing with
Unicode, and judging by all the discussions on p5p, not everyone wants
it to do the same thing.
And considering that the fonts I use are unable to display Unicode,
I'm not that interested anyway.
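
One possible explanation, offered as an assumption and not as a confirmed diagnosis of perl 5.8.0: Perl applies Unicode rules to a string only when it is stored internally as UTF-8. chr() returns a plain byte string for values below 0x100 and a UTF-8 string for values of 0x100 and up, which would account for the split Alan describes. The sketch below compares the two cases on a reasonably modern perl, with no locale and no "unicode_strings" feature in effect. The helper name lower_matches is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the code points in 0x00..0xFF whose
# one-character string matches [[:lower:]]. If $upgrade is true, the
# string is first converted to Perl's internal UTF-8 representation.
sub lower_matches {
    my ($upgrade) = @_;
    my @hits;
    for my $cp (0x00 .. 0xFF) {
        my $ch = chr $cp;
        utf8::upgrade($ch) if $upgrade;   # same characters, different storage
        push @hits, $cp if $ch =~ /[[:lower:]]/;
    }
    return @hits;
}

# Print the matching code points for both storage forms.
for my $upgrade (0, 1) {
    my @hits = lower_matches($upgrade);
    printf "%-8s %s\n", ($upgrade ? 'upgraded' : 'plain'),
        join ' ', map { sprintf '%02x', $_ } @hits;
}
```

If the assumption holds, the "plain" line lists only 61 through 7a, while the "upgraded" line also lists accented lowercase letters such as e9. Exactly which extra code points show up depends on the Unicode tables shipped with the perl in question.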
Abigail
-- 
$_ = "\x3C\x3C\x45\x4F\x54\n" and s/<<EOT/<<EOT/ee and print;
"Just another Perl Hacker"
EOT