Re: PEP 3131: Supporting Non-ASCII Identifiers
- From: Duncan Booth <duncan.booth@xxxxxxxxxxxxxxx>
- Date: 14 May 2007 13:44:04 GMT
Stefan Behnel <stefan.behnel-n05pAM@xxxxxx> wrote:
> Just to confirm that: IronPython does accept non-ascii identifiers.
> From "Differences between IronPython and CPython":
>
>   IronPython will compile files whose identifiers use non-ASCII
>   characters if the file has an encoding comment such as
>   "# -*- coding: utf-8 -*-". CPython will not compile such a file
>   in any case.
>
> Sounds like CPython would better follow IronPython here.

I cannot find any documentation which says exactly which non-ASCII
characters IronPython will accept.
I would guess that it probably follows C# in general, but it doesn't
follow C# identifier syntax exactly (in particular the leading @ to
quote keywords is not supported).

The C# identifier syntax below is taken from
http://msdn2.microsoft.com/en-us/library/aa664670(VS.71).aspx
I think it differs from the PEP only in also allowing the Cf class of
characters in the non-initial positions of an identifier:

identifier:
    available-identifier
    @ identifier-or-keyword

available-identifier:
    An identifier-or-keyword that is not a keyword

identifier-or-keyword:
    identifier-start-character identifier-part-characters(opt)

identifier-start-character:
    letter-character
    _ (the underscore character U+005F)

identifier-part-characters:
    identifier-part-character
    identifier-part-characters identifier-part-character

identifier-part-character:
    letter-character
    decimal-digit-character
    connecting-character
    combining-character
    formatting-character

letter-character:
    A Unicode character of classes Lu, Ll, Lt, Lm, Lo, or Nl
    A unicode-escape-sequence representing a character of classes Lu, Ll, Lt, Lm, Lo, or Nl

combining-character:
    A Unicode character of classes Mn or Mc
    A unicode-escape-sequence representing a character of classes Mn or Mc

decimal-digit-character:
    A Unicode character of the class Nd
    A unicode-escape-sequence representing a character of the class Nd

connecting-character:
    A Unicode character of the class Pc
    A unicode-escape-sequence representing a character of the class Pc

formatting-character:
    A Unicode character of the class Cf
    A unicode-escape-sequence representing a character of the class Cf

For information on the Unicode character classes mentioned above, see
The Unicode Standard, Version 3.0, section 4.5.
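
To make the grammar above concrete, here is a rough sketch of the identifier-or-keyword production written with the standard unicodedata module. It assumes Python 3 string semantics (every str is Unicode), and the function names and the allow_cf switch are my own, purely for illustration. It does not handle the leading @, keywords, or unicode-escape-sequences, and it is not how IronPython or the C# compiler actually implement the check.

```python
import unicodedata

# Character classes taken from the grammar quoted above.
LETTER = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}   # letter-character
COMBINING = {"Mn", "Mc"}                        # combining-character
DIGIT = {"Nd"}                                  # decimal-digit-character
CONNECTING = {"Pc"}                             # connecting-character
FORMATTING = {"Cf"}                             # formatting-character


def is_start_char(ch):
    """identifier-start-character: a letter-character or the underscore."""
    return ch == "_" or unicodedata.category(ch) in LETTER


def is_part_char(ch, allow_cf=True):
    """identifier-part-character; allow_cf=False drops the Cf class
    (hypothetical switch, to show the one difference discussed above)."""
    cat = unicodedata.category(ch)
    if cat in FORMATTING:
        return allow_cf
    return cat in LETTER | COMBINING | DIGIT | CONNECTING


def is_identifier_or_keyword(s, allow_cf=True):
    """True if s matches identifier-start-character followed by zero or
    more identifier-part-characters. Keywords are not excluded."""
    if not s:
        return False
    return is_start_char(s[0]) and all(is_part_char(c, allow_cf) for c in s[1:])


if __name__ == "__main__":
    for candidate in ["na\u00efve", "_x1", "2abc", "a-b", "a\u200db"]:
        print(ascii(candidate),
              is_identifier_or_keyword(candidate),
              is_identifier_or_keyword(candidate, allow_cf=False))
```

The last candidate contains U+200D ZERO WIDTH JOINER, which is in class Cf, so it is the only one where the two columns disagree: accepted under the C# rules as quoted, rejected once Cf is left out.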