Re: PEP 3131: Supporting Non-ASCII Identifiers
- From: gatti@xxxxxxxxx
- Date: 14 May 2007 07:42:37 -0700
On May 13, 5:44 pm, "Martin v. Löwis" <mar...@xxxxxxxxxxx> wrote:
In summary, this PEP proposes to allow non-ASCII letters as
identifiers in Python. If the PEP is accepted, the following
identifiers would also become valid as class, function, or
variable names: Löffelstiel, changé, ошибка, or 売り場
(hoping that the latter one means "counter").
I am strongly against this PEP. The serious problems and huge costs
already explained by others are not balanced by the possibility of
using non-butchered identifiers in non-ASCII alphabets, especially
considering that one can write any language, in its full Unicode
glory, in the strings and comments of suitably encoded source files.
The diatribe about cross-language understanding of Python code is IMHO
off topic; if one doesn't care about international readers, using
annoying alphabets for identifiers has only a marginal impact. It's
the same situation as IRIs (a bad idea) versus HTML text (happily
Unicode).
- should non-ASCII identifiers be supported? why?
No, they are useless.
- would you use them if it was possible to do so? in what cases?
No, never.
Being Italian, I'm sometimes tempted to use accented vowels in my
code, but I restrain myself because of the possibility of annoying
foreign readers and the difficulty of convincing every text editor I
use to preserve them.
Python code is written by many people in the world who are not familiar
with the English language, or even well-acquainted with the Latin
writing system. Such developers often desire to define classes and
functions with names in their native languages, rather than having to
come up with an (often incorrect) English translation of the concept
they want to name.
The described set of users includes linguistically intolerant people
who don't accept the use of suitable languages instead of their own,
and of compromised but readable spelling instead of the one they
prefer.
Most "people in the world who are not familiar with the English
language" are much more mature than that, even when they don't write
for international readers.
The syntax of identifiers in Python will be based on the Unicode
standard annex UAX-31 [1]_, with elaboration and changes as defined
below.
Not providing an explicit listing of allowed characters is inexcusable
sloppiness.
The XML standard is an example of how listings of large parts of the
Unicode character set can be provided clearly, exactly and (almost)
concisely.
``ID_Start`` is defined as all characters having one of the general
categories uppercase letters (Lu), lowercase letters (Ll), titlecase
letters (Lt), modifier letters (Lm), other letters (Lo), letter numbers
(Nl), plus the underscore (XXX what are "stability extensions" listed in
UAX 31).
``ID_Continue`` is defined as all characters in ``ID_Start``, plus
nonspacing marks (Mn), spacing combining marks (Mc), decimal number
(Nd), and connector punctuations (Pc).
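As a rough illustration of the two definitions quoted above, the category
lists can be written down directly with the standard unicodedata module.
This is only a sketch of the draft wording: the helper names are
hypothetical, the "stability extensions" the draft leaves open are ignored,
and no normalization or keyword check is attempted.

```python
import unicodedata

# General categories named in the quoted draft text.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_EXTRA = {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch):
    """True if ch may begin an identifier under the quoted definition."""
    return ch == "_" or unicodedata.category(ch) in ID_START_CATEGORIES


def is_id_continue(ch):
    """True if ch may appear after the first character."""
    return is_id_start(ch) or unicodedata.category(ch) in ID_CONTINUE_EXTRA


def looks_like_identifier(name):
    """Check a whole name, one character at a time (hypothetical helper)."""
    if not name:
        return False
    return is_id_start(name[0]) and all(is_id_continue(c) for c in name[1:])


if __name__ == "__main__":
    for candidate in ["Löffelstiel", "changé", "ошибка", "売り場", "2abc", "a-b"]:
        print(candidate, looks_like_identifier(candidate))
```

Under these assumptions the four example names from the PEP summary pass,
while a name starting with a digit or containing a hyphen does not.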
Am I the first to notice how unsuitable these characters are? Many of
these would be utterly invisible ("variation selectors" are Mn) or
displayed out of sequence (overlays are Mn), or normalized away
(combining accents are Mn) or absurdly strange and ambiguous (roman
numerals are Nl, for instance).
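The categories mentioned above can be checked with the standard unicodedata
module. The sample code points below are my own picks to illustrate each
case (they are not taken from the PEP), and the variable names are
hypothetical.

```python
import unicodedata

# One sample code point per case mentioned above (illustrative choices).
samples = {
    "variation selector": "\uFE00",   # VARIATION SELECTOR-1
    "overlay": "\u0338",              # COMBINING LONG SOLIDUS OVERLAY
    "combining accent": "\u0301",     # COMBINING ACUTE ACCENT
    "roman numeral": "\u2160",        # ROMAN NUMERAL ONE
}

# Map each label to its Unicode general category.
categories = {label: unicodedata.category(ch) for label, ch in samples.items()}

# A combining accent is absorbed into a precomposed letter by NFC.
composed = unicodedata.normalize("NFC", "e\u0301")

# A roman numeral becomes a plain Latin letter under NFKC.
folded = unicodedata.normalize("NFKC", "\u2160")

if __name__ == "__main__":
    for label, ch in samples.items():
        print("U+%04X" % ord(ch), categories[label], label)
    print(len(composed), repr(folded))
```

The first three samples report category Mn and the last reports Nl, so all
four would be accepted in identifiers under the quoted definitions.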
Lorenzo Gatti