Re: urlencode and $_GET



M. Trausch said the following on 18/11/2005 01:49:
Oli Filth wrote:

They aren't represented the same internally at all. A literal hash in a
URL delimits an HTML reference to a named anchor, whereas %23 does not;
it's treated as part of the query string in the HTTP GET request. Try
this simple test to demonstrate this:


That's very much like saying the character # on the right side of a hex dump and the '23' on the left side of a hex dump aren't represented the same internally at all. It's just a character reference, either way. Just because one may receive a flag that the other doesn't in one instance or several instances does not mean that it will in *all* instances.

%23 is a character reference, yes, but not an *HTML* character reference/entity. It's merely a way of representing # in an HTTP GET string, and means nothing in the context of HTML.


The browser treats %23 as exactly that: the literal characters %, 2 and 3. In the context of a clicked hyperlink, these exact characters are transmitted in the corresponding HTTP GET request string. For example, the following link:

	<A href="http://example.com/file.php?%23xyz">...</A>

will result in the following HTTP request:

	GET /file.php?%23xyz HTTP/1.1
	Host: example.com

At no point between the server delivering the original HTML to the browser and the server receiving the GET request has %23 been decoded.

On the other hand, the browser treats the literal # as a delimiter (as defined by HTML specs), and strips it (and everything after it) from the URL before the HTTP request is made. For example, the following link:

	<A href="http://example.com/file.php?#xyz">...</A>

will result in the following HTTP request:

	GET /file.php? HTTP/1.1
	Host: example.com

Entirely different behaviour, working at a different layer (HTML vs. HTTP), completely defined by the specs (W3C HTML specs, and RFC 1738).
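The two cases above can be reproduced outside a browser. The following is only a minimal sketch in Python (not the demo code referred to in this thread): it uses the standard `urllib.parse.urlsplit` to show where a URL parser puts `%23xyz` versus `#xyz`, and a hypothetical helper, `request_target`, that approximates what ends up on the GET line by cutting the URL at the first literal `#`.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Hypothetical helper: approximate the target a browser sends on the GET line."""
    # Everything from the first literal '#' onwards is the fragment;
    # it stays on the client and is not sent to the server.
    without_fragment = url.partition("#")[0]
    parts = urlsplit(without_fragment)
    # Drop scheme and host, keep the path and query exactly as written.
    prefix = f"{parts.scheme}://{parts.netloc}"
    return without_fragment[len(prefix):] or "/"


encoded_url = "http://example.com/file.php?%23xyz"
literal_url = "http://example.com/file.php?#xyz"

# %23 stays in the query string, still percent-encoded.
print(urlsplit(encoded_url).query)     # %23xyz
print(urlsplit(encoded_url).fragment)  # (empty string)
print(request_target(encoded_url))     # /file.php?%23xyz

# A literal # starts the fragment, so the query is empty.
print(urlsplit(literal_url).query)     # (empty string)
print(urlsplit(literal_url).fragment)  # xyz
print(request_target(literal_url))     # /file.php?
```

The helper is deliberately naive (absolute http URLs only) and is meant to mirror the two request lines shown above, not to model a real user agent.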

If you had tried the demo code I posted earlier, you would see this in action.


Where is it defined as "unsafe", except in RFC 1738 where it states that
it's unsafe to use # unless to delimit a named anchor reference?

Show me an example where it doesn't work...


The fact is that the published standard which addresses the issue states that it's unsafe.

No, it states that it's unsafe to use # in cases other than where you mean it to be a delimiter for an HTML anchor identifier.


In cases where you do not intend it as a delimiter, you should encode it with the alternative, %23, because this *is* safe (defined as such in RFC 1738), and when received by the agent processing the HTTP GET request (i.e. the server), it is translated into the originally intended character, i.e. #.
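As a rough illustration of that round trip, here is a sketch using Python's standard `urllib.parse` as a stand-in for what `urlencode()` and `$_GET` do in PHP. The parameter name `tag` is made up for the example.

```python
from urllib.parse import parse_qs, quote, unquote

raw_value = "#xyz"

# Sender side: percent-encode the value so '#' cannot be taken as a delimiter.
encoded_value = quote(raw_value, safe="")
print(encoded_value)  # %23xyz

# This is the query string that travels in the GET request ('tag' is hypothetical).
query_string = "tag=" + encoded_value
print(query_string)  # tag=%23xyz

# Receiver side: decoding restores the originally intended character.
decoded_value = unquote(encoded_value)
print(decoded_value)  # #xyz

# Parsing the whole query string decodes the value as well.
params = parse_qs(query_string)
print(params)  # {'tag': ['#xyz']}
```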


It is wise to be cautious and write defensively,
towards something you can refer to rather than away from it, even if it
does work on 98% of the browsers.  My point was that you cannot make a
blanket assumption about something when it's already known that it's
unsafe and the behavior of an action is undefined.

However, the behaviour *is* *completely* defined, so any agent (browser, server, or otherwise) that behaves differently is in explicit breach of the specs, i.e. a bug.




--
Oli