\w in regular expression
From: Marcello Pietrobon (teiffel_at_attglobal.net)
Date: 02/28/04
Date: Sat, 28 Feb 2004 15:14:17 -0500 To: python-list@python.org
Hello,
I am reading
http://www.amk.ca/python/howto/regex/
But there is an inconsistency.
In section 2.1, Matching Characters:
\w
Matches any alphanumeric character; this is equivalent to the class
[a-zA-Z0-9_].
\W
Matches any non-alphanumeric character; this is equivalent to the
class [^a-zA-Z0-9_].
This is fine with me, the same as in Perl, and consistent with:
\d
Matches any decimal digit; this is equivalent to the class [0-9].
\D
Matches any non-digit character; this is equivalent to the class
[^0-9].
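The equivalences quoted above can be checked mechanically. This is only a minimal sketch, assuming a current Python 3 interpreter rather than the Python 2.3 used in this post; the helper name classes_agree is made up for illustration. The re.ASCII flag is passed because in Python 3 \w and \d on str patterns otherwise also match non-ASCII characters.

```python
import re

# Hypothetical helper (not from the HOWTO): compare each shorthand with the
# bracket class the HOWTO says it is equivalent to, over all ASCII characters.
def classes_agree():
    pairs = [
        (r"\w", r"[a-zA-Z0-9_]"),
        (r"\W", r"[^a-zA-Z0-9_]"),
        (r"\d", r"[0-9]"),
        (r"\D", r"[^0-9]"),
    ]
    for shorthand, explicit in pairs:
        for code in range(128):
            ch = chr(code)
            # re.ASCII restricts the shorthands to the ASCII ranges (Python 3).
            a = re.fullmatch(shorthand, ch, re.ASCII) is not None
            b = re.fullmatch(explicit, ch, re.ASCII) is not None
            if a != b:
                return False
    return True

print(classes_agree())
```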
But in section 5.1, Splitting Strings, I find:
>>> p = re.compile(r'\W+')
>>> p.split('This is a test, short and sweet, of split().')
['This', 'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']
At first I thought this was a typo, but I get the same on my Python command line:
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> p = re.compile(r'\W+'); print p
<_sre.SRE_Pattern object at 0x0090DC38>
>>> p.split('This is a test, short and sweet, of split().')
['This', 'is', 'a', 'test', 'short', 'and', 'sweet', 'of', 'split', '']
>>> p = re.compile(r'\w+'); print p
<_sre.SRE_Pattern object at 0x0090D140>
>>> p.split('This is a test, short and sweet, of split().')
['', ' ', ' ', ' ', ', ', ' ', ' ', ', ', ' ', '().']
>>>
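One way to read the two results above: split() uses the pattern as the separator and returns the text between matches, so splitting on \W+ leaves the words, while splitting on \w+ leaves the punctuation and spaces. The sketch below assumes Python 3 and puts split() next to findall(), which returns the matches themselves; the variable names are only for illustration.

```python
import re

text = 'This is a test, short and sweet, of split().'

# split() returns the pieces *between* matches of the pattern.
between_nonword_runs = re.split(r'\W+', text)   # separators are non-word runs
between_word_runs = re.split(r'\w+', text)      # separators are the words

# findall() returns the matches themselves.
words = re.findall(r'\w+', text)

print(between_nonword_runs)
print(between_word_runs)
print(words)
```

The trailing '' in the first list comes from the final '().', which matches \W+ and leaves an empty string after it.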
In other words, is the Python re module not compatible with Perl?
I also noticed that tools\scripts\redemo.py behaves differently from the Python command line (this is not the only case),
because it matches 'This' when I use \w+ and not when I use \W+.
???
Thank you for any comments,
Marcello