Re: urllib interpretation of URL with ".."



John Nagle wrote:
> Here's a URL, found in a link, which gives us trouble
> when we try to follow the link:
>
> http://sportsbra.co.uk/../acatalog/shop.html
>
> Browsers immediately turn this into
>
> http://sportsbra.co.uk/acatalog/shop.html
>
> and go from there, but urllib tries to open it explicitly, which
> results in an HTTP error 400.
>
> Is "urllib" wrong?

I can't see how. HTTP/1.1 says that the parameter to the GET
request should be an abs_path; RFC 2396 says that
/../acatalog/shop.html is indeed an abs_path, as ".." is a valid
segment. That RFC also has a section on relative identifiers
and normalization; it defines what ".." means *in a relative path*.
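
To see which abs_path ends up on the request line when the URL is
sent exactly as written, here is a minimal sketch. It assumes
Python 3's urllib.parse (which postdates this thread), and
request_target is a made-up helper name used only for illustration:

```python
from urllib.parse import urlsplit

def request_target(url):
    """Hypothetical helper: the abs_path (plus query, if any) that a
    client sends when it does no normalization of its own."""
    parts = urlsplit(url)          # splitting does not touch ".." segments
    target = parts.path or "/"     # an empty path is requested as "/"
    if parts.query:
        target += "?" + parts.query
    return target

print(request_target("http://sportsbra.co.uk/../acatalog/shop.html"))
# prints: /../acatalog/shop.html
```

The ".." segment survives parsing untouched, so a client that does
no normalization passes it on to the server as-is.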

Section 4 is explicit about .. in absolute URIs:
# The syntax for relative URI is a shortened form of that for absolute
# URI, where some prefix of the URI is missing and certain path
# components ("." and "..") have a special meaning when, and only when,
# interpreting a relative path.

Notice the "and only when": browsers that modify the above
URL before sending it seem to be in clear violation of
RFC 2396.
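
If you nevertheless want to follow such links the way browsers do,
one option is to collapse the dot segments yourself before opening
the URL. A rough sketch, assuming Python 3's urllib.parse;
collapse_dot_segments is a hypothetical helper, not part of urllib,
and it deliberately departs from the reading of RFC 2396 given above:

```python
from urllib.parse import urlsplit, urlunsplit

def collapse_dot_segments(url):
    """Hypothetical helper: drop "." segments and resolve ".." segments
    in an absolute path, never climbing above the root."""
    parts = urlsplit(url)
    if not parts.path.startswith("/"):
        return url                      # nothing to normalize
    segments = parts.path.split("/")
    kept = []
    for seg in segments[1:]:            # skip the empty string before the leading "/"
        if seg == "..":
            if kept:
                kept.pop()              # step up one level, but stop at the root
        elif seg != ".":
            kept.append(seg)
    if segments[-1] in (".", ".."):
        kept.append("")                 # keep the trailing "/" of a directory path
    return urlunsplit(parts._replace(path="/" + "/".join(kept)))

print(collapse_dot_segments("http://sportsbra.co.uk/../acatalog/shop.html"))
# prints: http://sportsbra.co.uk/acatalog/shop.html
```

Whether the server treats the rewritten URL as the same resource is
up to the server; this only mimics what the browsers appear to do.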

Regards,
Martin