Re: Unrecognized escape sequences in string literals
- From: Steven D'Aprano <steven@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: 10 Aug 2009 06:03:05 GMT
On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:
Steven D'Aprano wrote:
Why should a backslash in a string literal be an error?
Because in Python, if my friend sees the string "foo\xbar\n", he has no
idea whether the "\x" is an escape sequence, or if it is just the
characters "\x", unless he looks it up in the manual, or tries it out in
the REPL, or what have you.
Fair enough, but isn't that just another way of saying that if you look
at a piece of code and don't know what it does, you don't know what it
does unless you look it up or try it out?
My friend is adamant that it would be better
if he could just look at the string literal and know. He doesn't want to
be bothered to have to store stuff like that in his head. He wants to be
able to figure out programs just by looking at them, to the maximum
degree that that is feasible.
I actually sympathize strongly with that attitude. But, honestly, your
friend is a programmer (or at least pretends to be one *wink*). You can't
be a programmer without memorizing stuff: syntax, function calls, modules
to import, quoting rules, blah blah blah. Take C as an example -- there's
absolutely nothing about () that says "group expressions or call a
function" and {} that says "group a code block". You just have to
memorize it. If you don't know what a backslash escape is going to do,
why would you use it? I'm sure your friend isn't in the habit of randomly
adding backslashes to strings just to see whether it will still compile.
This is especially important when reading (as opposed to writing) code.
You read somebody else's code, and see "foo\xbar\n". Let's say you know
it compiles without warning. Big deal -- you don't know what the escape
codes do unless you've memorized them. What does \n resolve to? chr(13)
or chr(97) or chr(0)? Who knows?
Unless you know the rules, you have no idea what is in the string.
Allowing \y to resolve to a literal backslash followed by y doesn't
change that. All it means is that some \c combinations return a single
character, and some return two.
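As a rough sketch of that last point: the snippet below evaluates two literals from their source text and compares lengths. The helper name evaluate_literal is made up for illustration, and the warning filter is there only because newer Python versions warn about unrecognized escapes; this assumes the interpreter still accepts them.

```python
import warnings

def evaluate_literal(source):
    """Hypothetical helper: evaluate string-literal source text,
    silencing any warning about unrecognized escapes."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return eval(source)

# The raw strings hold the literal's source text, quotes included.
recognized = evaluate_literal(r'"\t"')    # \t is a known escape: one tab character
unrecognized = evaluate_literal(r'"\y"')  # \y is not: backslash followed by y

print(len(recognized))    # one character
print(len(unrecognized))  # two characters
```

Either way, you only know which case you are in if you know the list of recognized escapes.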
In comparison to Python, in C++, he can just look at "foo\xbar\n" and know
that "\x" is a special character. (As long as it compiles without
warnings under g++.)
So what you mean is, he can just look at "foo\xbar\n" AND COMPILE IT
USING g++, and know whether or not \x is a special character.
[sarcasm] Gosh. That's an enormous difference from Python, where you have
to print the string at the REPL to know what it does. [/sarcasm]
Aside:
\x isn't a special character:
>>> "\x"
ValueError: invalid \x escape
However, \xba is:
>>> "\xba"
'\xba'
>>> len("\xba")
1
>>> ord("\xba")
186
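The aside above can be reproduced as a script rather than at the REPL. This is only a sketch: try_literal is a hypothetical helper, and it catches both SyntaxError and ValueError because the exception type depends on the Python version (the session above shows ValueError; Python 3 reports a SyntaxError for the incomplete escape).

```python
def try_literal(source):
    """Hypothetical helper: evaluate string-literal source text and
    return the value, or the exception's class name if it is rejected."""
    try:
        return eval(source)
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__

# A bare \x with no hex digits is rejected outright.
print(try_literal(r'"\x"'))

# \xba is a complete hex escape: a single character with code 186.
value = try_literal(r'"\xba"')
print(len(value), ord(value))
```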
He's particularly annoyed too, that if he types "foo\xbar" at the REPL,
it echoes back as "foo\\xbar". He finds that to be some sort of annoying
DWIM feature, and if Python is going to have DWIM features, then it
should, for example, figure out what he means by "\" and not bother him
with a syntax error in that case.
Now your friend is confused. This is a good thing. Any backslash you see
in Python's default string output is *always* an escape:
>>> "a string with a 'proper' escape \t (tab)"
"a string with a 'proper' escape \t (tab)"
>>> "a string with an 'improper' escape \y (backslash-y)"
"a string with an 'improper' escape \\y (backslash-y)"
The REPL is actually doing him a favour. It always escapes backslashes,
so there is no ambiguity. A backslash is displayed as \\, any other \c is
a special character.
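A small sketch of the same rule outside the REPL, using repr(), which is what the REPL uses to echo a value. The variable names are just for illustration.

```python
# A real backslash followed by y (written with an explicit \\ in the source).
improper = "improper \\y"
# A real tab character.
proper = "proper \t"

# repr() doubles every real backslash, so a lone \c in its output
# can only be an escape for a special character.
print(repr(improper))  # 'improper \\y'
print(repr(proper))    # 'proper \t'

# print() shows the string's actual contents instead.
print(improper)        # improper \y
```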
Of course I think that he's overreacting a bit.
:)
My point of view is that
every language has *some* warts; Python just has a bit fewer than most.
It would have been nice, I should think, if this wart had been "fixed"
in Python 3, as I do consider it to be a minor wart.
And if anyone had cared enough to raise it a couple of years back, it
possibly might have been.
--
Steven