Re: Simple pattern matching negation



John W. Krahn:
> Dr.Ruud:
>> Todd de Gruyl:

>>> [You'll also note that I used \d instead of [0-9], same thing... a
>>> couple of characters shorter.]
>>
>> \d is not always the same as [0-9]. See \p{IsDigit} in `man perlre`.
>
> Where in the man page does it say that "\d is not always the same as
> [0-9]"?

I just did. Even \d and \p{IsDigit} aren't always the same test.

Demonstration:

use warnings;
use strict;
use charnames ':full';

# Sample characters that look like, or are named as, digits.
my $text = "\x{00030}"                                    # ASCII digit
    . "\x{00660}\x{006F0}"                                # Arabic-Indic digits
    . "\x{02460}\x{02474}\x{02488}\x{024F5}"              # enclosed alphanumerics
    . "\x{02673}\x{02680}"                                # miscellaneous symbols
    . "\x{02776}\x{02780}\x{0278A}"                       # dingbats
    . "\x{1D7CE}\x{1D7D8}\x{1D7E2}\x{1D7EC}\x{1D7F6}"     # mathematical digits
    . "\x{E0030}";                                        # tag character

my $n = length($text);

print '-' x $n, "\n";

# For each character, print its code point and name, then every test it passes.
for (my $i = 0; $i < $n; $i++) {
    my $c = substr($text, $i, 1);
    printf "\\x\{%5.5X} %s\n", ord($c), charnames::viacode ord $c;
    print ' [0-9]'       , "\n" if $c =~ /[0-9]/;
    print ' \d'          , "\n" if $c =~ /\d/;
    print ' \p{IsNumber}', "\n" if $c =~ /\p{IsNumber}/;
    print '-' x $n, "\n";
}


Output:

------------------
\x{00030} DIGIT ZERO
[0-9]
\d
\p{IsNumber}
------------------
\x{00660} ARABIC-INDIC DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{006F0} EXTENDED ARABIC-INDIC DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{02460} CIRCLED DIGIT ONE
\p{IsNumber}
------------------
\x{02474} PARENTHESIZED DIGIT ONE
\p{IsNumber}
------------------
\x{02488} DIGIT ONE FULL STOP
\p{IsNumber}
------------------
\x{024F5} DOUBLE CIRCLED DIGIT ONE
\p{IsNumber}
------------------
\x{02673} RECYCLING SYMBOL FOR TYPE-1 PLASTICS
------------------
\x{02680} DIE FACE-1
------------------
\x{02776} DINGBAT NEGATIVE CIRCLED DIGIT ONE
\p{IsNumber}
------------------
\x{02780} DINGBAT CIRCLED SANS-SERIF DIGIT ONE
\p{IsNumber}
------------------
\x{0278A} DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE
\p{IsNumber}
------------------
\x{1D7CE} MATHEMATICAL BOLD DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{1D7D8} MATHEMATICAL DOUBLE-STRUCK DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{1D7E2} MATHEMATICAL SANS-SERIF DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{1D7EC} MATHEMATICAL SANS-SERIF BOLD DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{1D7F6} MATHEMATICAL MONOSPACE DIGIT ZERO
\d
\p{IsNumber}
------------------
\x{E0030} TAG DIGIT ZERO
------------------

perl, v5.8.6 built for i386-freebsd-64int
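
In practice the difference matters mostly when validating input. Here is a minimal sketch of that, not part of the script above: the helper names is_ascii_digits and is_unicode_digits are made up for illustration, and the sample strings are my own.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the string consists entirely of
# the ASCII digits 0-9.
sub is_ascii_digits {
    my ($s) = @_;
    return $s =~ /\A[0-9]+\z/ ? 1 : 0;
}

# Hypothetical helper: the same test written with \d. On a character
# string this also accepts decimal digits from other scripts.
sub is_unicode_digits {
    my ($s) = @_;
    return $s =~ /\A\d+\z/ ? 1 : 0;
}

my $ascii  = "2005";
my $arabic = "\x{0662}\x{0660}\x{0660}\x{0665}";   # "2005" in Arabic-Indic digits

printf "%-7s [0-9]: %d  \\d: %d\n", 'ascii',  is_ascii_digits($ascii),  is_unicode_digits($ascii);
printf "%-7s [0-9]: %d  \\d: %d\n", 'arabic', is_ascii_digits($arabic), is_unicode_digits($arabic);
```

The ASCII string passes both tests; the Arabic-Indic string passes only the \d test. So if the value is going to be used as a plain number afterwards, [0-9] is the safer choice. (Newer perls, 5.14 and up, also have an /a regex modifier that restricts \d to [0-9].)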


--
Affijn, Ruud

"Gewoon is een tijger."

