Re: Unrecognized escape sequences in string literals



On Sun, 09 Aug 2009 17:56:55 -0700, Douglas Alan wrote:

Steven D'Aprano wrote:

Why should a backslash in a string literal be an error?

Because in Python, if my friend sees the string "foo\xbar\n", he has no
idea whether the "\x" is an escape sequence, or if it is just the
characters "\x", unless he looks it up in the manual, or tries it out in
the REPL, or what have you.

Fair enough, but isn't that just another way of saying that if you look
at a piece of code and don't know what it does, you don't know what it
does unless you look it up or try it out?


My friend is adamant that it would be better
if he could just look at the string literal and know. He doesn't want to
be bothered to have to store stuff like that in his head. He wants to be
able to figure out programs just by looking at them, to the maximum
degree that that is feasible.

I actually sympathize strongly with that attitude. But, honestly, your
friend is a programmer (or at least pretends to be one *wink*). You can't
be a programmer without memorizing stuff: syntax, function calls, modules
to import, quoting rules, blah blah blah. Take C as an example -- there's
absolutely nothing about () that says "group expressions or call a
function" and {} that says "group a code block". You just have to
memorize it. If you don't know what a backslash escape is going to do,
why would you use it? I'm sure your friend isn't in the habit of randomly
adding backslashes to strings just to see whether it will still compile.

This is especially important when reading (as opposed to writing) code.
You read somebody else's code, and see "foo\xbar\n". Let's say you know
it compiles without warning. Big deal -- you don't know what the escape
codes do unless you've memorized them. What does \n resolve to? chr(13)
or chr(97) or chr(0)? Who knows?

Unless you know the rules, you have no idea what is in the string.
Allowing \y to resolve to a literal backslash followed by y doesn't
change that. All it means is that some \c combinations return a single
character, and some return two.
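
To make the one-character versus two-character point concrete, here is a rough sketch. It assumes a Python 3 interpreter that still accepts unrecognized escapes; recent versions warn about them, so the literals are evaluated from raw source text with warnings silenced. The helper name literal_value is made up for this illustration.

```python
import ast
import warnings

def literal_value(source):
    """Evaluate the source text of a string literal and return the string."""
    with warnings.catch_warnings():
        # Recent Python 3 versions warn about unrecognized escapes such as \y.
        warnings.simplefilter("ignore")
        return ast.literal_eval(source)

recognized = literal_value(r'"\t"')    # recognized escape: one tab character
unrecognized = literal_value(r'"\y"')  # not recognized: backslash followed by y

print(len(recognized), len(unrecognized))
```

Either way, you only know which case you are in if you know the list of recognized escapes.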



In comparison to Python, in C++, he can just look at "foo\xbar\n" and know
that "\x" is a special character. (As long as it compiles without
warnings under g++.)

So what you mean is, he can just look at "foo\xbar\n" AND COMPILE IT
USING g++, and know whether or not \x is a special character.

[sarcasm] Gosh. That's an enormous difference from Python, where you have
to print the string at the REPL to know what it does. [/sarcasm]

Aside:
\x isn't a special character:

>>> "\x"
ValueError: invalid \x escape

However, \xba is:

>>> "\xba"
'\xba'
>>> len("\xba")
1
>>> ord("\xba")
186
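
The same aside applied to the string from this thread, as a small sketch. It assumes Python 3, where a bare \x is rejected when the literal is compiled (as a SyntaxError rather than the ValueError shown above); the variable names are only for illustration.

```python
s = "foo\xbar\n"     # \xba is one character, followed by a plain "r"

print(len(s))        # 6: f, o, o, \xba, r, newline
print(ord(s[3]))     # 186
print(s[4])          # r

# A bare \x with no hex digits is rejected when the literal is compiled.
try:
    compile(r'"\x"', "<example>", "eval")
    bare_x_ok = True
except (SyntaxError, ValueError):
    bare_x_ok = False

print(bare_x_ok)     # False
```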



He's particularly annoyed too, that if he types "foo\xbar" at the REPL,
it echoes back as "foo\\xbar". He finds that to be some sort of annoying
DWIM feature, and if Python is going to have DWIM features, then it
should, for example, figure out what he means by "\" and not bother him
with a syntax error in that case.

Now your friend is confused. This is a good thing. Any backslash you see
in Python's default string output is *always* an escape:

>>> "a string with a 'proper' escape \t (tab)"
"a string with a 'proper' escape \t (tab)"
>>> "a string with an 'improper' escape \y (backslash-y)"
"a string with an 'improper' escape \\y (backslash-y)"

The REPL is actually doing him a favour. It always escapes backslashes,
so there is no ambiguity. A backslash is displayed as \\; any other \c is
a special character.
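
A short sketch of the difference between what the REPL echoes (the repr) and what the string actually contains, assuming Python 3. The backslash is written explicitly as \\ here so the example does not depend on how unrecognized escapes are handled.

```python
s = "foo\\xbar"      # a literal backslash, spelled explicitly

print(repr(s))       # what the REPL echoes: 'foo\\xbar'
print(s)             # the actual contents: foo\xbar
print(len(s))        # 8, the backslash counts once

tab = "a\tb"
print(repr(tab))     # 'a\tb', a single backslash marks a real escape
```

So the doubled backslash in the echo is not DWIM; it is the only unambiguous way to display a string that contains one.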


Of course I think that he's overreacting a bit.

:)


My point of view is that
every language has *some* warts; Python just has a bit fewer than most.
It would have been nice, I should think, if this wart had been "fixed"
in Python 3, as I do consider it to be a minor wart.

And if anyone had cared enough to raise it a couple of years back, it
might have been.


--
Steven