Re: editing perl script through TEXTAREA
From: Alan J. Flavell (flavell_at_ph.gla.ac.uk)
Date: 08/18/04
- In reply to: Gunnar Hjalmarsson: "Re: editing perl script through TEXTAREA"
Date: Wed, 18 Aug 2004 20:20:19 +0100
On Wed, 18 Aug 2004, Gunnar Hjalmarsson wrote:
> There is a lot of confusion here.
Nothing new there, then... :-}
> A '&' character that is submitted via a textarea control is URI
> encoded, not converted to the corresponding HTML entity. Accordingly,
> after URI decoding, it's still a '&'.
Agreed.
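To make that concrete, here is a minimal sketch of the round trip. It is written in Python purely for illustration (the same holds for a Perl CGI script), and the field name "demo" is simply borrowed from the quoted textarea example. It assumes an ordinary application/x-www-form-urlencoded submission.

```python
from urllib.parse import urlencode, parse_qs

# Text the user typed into the textarea (hypothetical example value).
typed = "Smith & Son Co."

# What a browser would put on the wire for this field: URL encoding,
# not HTML entities. The '&' becomes %26, spaces become '+'.
body = urlencode({"demo": typed})
print(body)

# What the server-side script gets back after URL decoding.
decoded = parse_qs(body)["demo"][0]
print(decoded)
```

At no point does an "&amp;" appear in the submitted data, so there is nothing for the script to convert back.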
> It's when you want to display content initially in a textarea field
> that you should first convert certain characters to HTML entities, but
> if you for instance have:
>
> <textarea name="demo">Smith &amp; Son Co.</textarea>
>
> the browser will convert the '&amp;' to '&' and display it as '&'
> right away, i.e. before submitting.
Spot on.
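The outbound direction can be sketched the same way. This is only an illustration in Python, using the standard html module as a stand-in for whatever escaping routine the real script uses. The stored value is a made-up example, and html.unescape here merely imitates what the browser's HTML parser does on display.

```python
import html

# Value held by the server-side script (hypothetical example).
stored = "Smith & Son Co."

# Escape once, on the way out, when pre-filling the textarea.
markup = '<textarea name="demo">' + html.escape(stored) + "</textarea>"
print(markup)

# The browser parses the entity back to a plain '&' before the user
# ever sees it; html.unescape imitates that step here.
shown = html.unescape(html.escape(stored))
print(shown)
```

The escaping is needed only when writing HTML; what the user sees, edits, and submits is the plain character.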
> So my point, which I also tried to illustrate with a little program
> in another post in this thread, is that there is never a need for the
> Perl program to do any "reverse conversion" of HTML entities.
As a matter of principle you're correct here. But that isn't quite
true in practice, as I'll deal with in a moment.
As usual, it's all a matter of dividing the problem up into its
component parts, and understanding how each one works separately,
before assembling them into a working application.
But, over and above this, if folks go pasting weird characters into
their form submission (and there's no way you can stop them doing so),
then browsers do strange things with them. As the Perl Encode
documentation so engagingly remarks:
    It is beyond the power of words to describe the way HTML
    browsers encode non-ASCII form data.
    - http://www.perldoc.com/perl5.8.0/lib/Encode/Supported.html
And there are some browsers (or should I say "browser-like operating
system components"?) which, when the user feeds into a form a
character which cannot be represented in the prevailing character
encoding, will turn it into &#number; or even into &entityname; format
for submission.
On arrival at the server, of course, the server-side process can have
no idea whether the user typed just a curly-quote character, or typed
the ASCII string &#8220; (ampersand, hash, 8, 2, 2, 0, semicolon). By
that time they are indistinguishable. The behaviour in this situation
is undefined anyway, and browser developers have addressed it in
various different ways as they saw fit.
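A small sketch of why the two cases cannot be told apart, again in Python for illustration only. It assumes the form is being submitted in ISO-8859-1, and it uses Python's "xmlcharrefreplace" error handler as a stand-in for the browser behaviour described above; real browsers differ in the details.

```python
# Case 1: the user typed a left curly quote (U+201C), which
# ISO-8859-1 cannot represent.
curly = "\u201c"

# Case 2: the user literally typed the seven ASCII characters.
literal = "&#8220;"

# Stand-in for a browser that substitutes a numeric character
# reference for anything the form's encoding cannot hold.
sent_for_curly = curly.encode("iso-8859-1", errors="xmlcharrefreplace")
sent_for_literal = literal.encode("iso-8859-1")

print(sent_for_curly)
print(sent_for_literal)

# Identical bytes: the server has no way to tell which was typed.
print(sent_for_curly == sent_for_literal)
```

Whatever the script then decides to do with such a sequence is a guess about the user's intent, not a decoding step.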
But Perl is only a small part of this problem - the major issues
really need to be hammered out on a suitable WWW-related group.
Where one might even get referred to my no-longer-quite-new
tutorial-ish page on the topic,
http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
have fun