Re: A critique of cgi.escape



Jon Ribbens <jon+usenet@xxxxxxxxxxxxxxxxx> wrote:

In article <Xns984996E6BABCEduncanbooth@xxxxxxxxx>, Duncan Booth
wrote:
It is generally a principle of Python that new releases maintain
backward compatibility. An incompatible change such as the one proposed
here would probably break many tests for a large number of people.

Why is the suggested change incompatible? What code would it break?
I agree that it would be a bad idea if it did indeed break backwards
compatibility - but it doesn't.

I guess you've never seen anyone write tests which retrieve some generated
html and compare it against the expected value. If the page contains any
unescaped quotes then this change would break it.
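
As a rough sketch of the kind of test I mean: this is written against
Python 3's html.escape, whose quote flag stands in here for the current and
the proposed default of cgi.escape. render_comment and the sample text are
made up for illustration.

```python
import html

def render_comment(text, escape_quotes):
    # Hypothetical page generator. escape_quotes=False mimics the current
    # default (only &, < and > are escaped); True mimics the proposed one.
    return "<p>" + html.escape(text, quote=escape_quotes) + "</p>"

# A test written against the current behaviour stores the page verbatim.
EXPECTED = '<p>She said "R&amp;D" &lt;here&gt;</p>'

current = render_comment('She said "R&D" <here>', escape_quotes=False)
proposed = render_comment('She said "R&D" <here>', escape_quotes=True)

print(current == EXPECTED)   # the existing test passes
print(proposed == EXPECTED)  # the same test fails once quotes are escaped
```

Both outputs are fine as HTML, but a test that compares against the stored
string only accepts the first.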


There should be a one-stop shop where I can take my unicode text and
convert it into something I can safely insert into a generated html
page;

I disagree. I think that doing it in one is muddled thinking and
liable to lead to bugs. Why not keep your output as unicode until it
is ready to be output to the browser, and encode it as appropriate
then? Character encoding and character escaping are separate jobs with
separate requirements that are better off handled by separate code.

Sorry, "convert it into something I can safely insert" wasn't meant to
imply encoding: just entity escaping.

To be clear:

I'm talking about encoding certain characters as entity references. It
doesn't matter whether it's an ampersand or a right double quote: both
need to be converted to entities. It is the same operation.

The resulting string might be a byte string or it might still be unicode:
the point being that the conversion I want is from unescaped to entity
escaped, not from unicode to byte encoded. Right now the only way the
Python library gives me to do the entity escaping properly has a side
effect of encoding the string. I should be able to do the escaping without
having to encode the string at the same time.
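
A minimal sketch of the separation I'm after, in Python 3 terms where str
plays the role of unicode. entity_escape is a hypothetical helper, not
something the library provides, and treating every non-ASCII character as
needing a numeric reference is an assumption made for the example.

```python
import html

def entity_escape(text):
    # Escape the markup characters first, so the ampersands introduced by
    # the numeric references below are not escaped a second time.
    escaped = html.escape(text, quote=True)
    # Replace each non-ASCII character with a numeric character reference.
    # The result is still a text string: nothing has been encoded.
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch)
                   for ch in escaped)

sample = "Fish & \u201cchips\u201d"

escaped_text = entity_escape(sample)
# The library route produces the references only as a side effect of
# encoding to bytes, and it leaves the ampersand untouched.
encoded_bytes = sample.encode("ascii", "xmlcharrefreplace")

print(repr(escaped_text))
print(repr(encoded_bytes))
```

The first result can still be encoded however the output stage likes; the
second has already committed to a byte encoding.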