Re: urllib interpretation of URL with ".."
- From: Duncan Booth <duncan.booth@xxxxxxxxxxxxxxx>
- Date: 25 Jun 2007 08:46:22 GMT
"Martin v. Löwis" <martin@xxxxxxxxxxx> wrote:
Is "urllib" wrong?
I can't see how. HTTP 1.1 says that the parameter to the GET
request should be an abs_path; RFC 2396 says that
/../acatalog/shop.html is indeed an abs_path, as .. is a valid
segment. That RFC also has a section on relative identifiers
and normalization; it defines what .. means *in a relative path*.
Section 4 is explicit about .. in absolute URIs:
# The syntax for relative URI is a shortened form of that for absolute
# URI, where some prefix of the URI is missing and certain path
# components ("." and "..") have a special meaning when, and only when,
# interpreting a relative path.
Notice the "and only when": the browsers that modify the above
URL before sending it seem to be in clear violation of
RFC 2396.
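As a rough illustration of the point that ".." is just another segment of an abs_path, here is a small sketch using Python 3's urllib.parse (an assumption; the thread itself predates it). The host www.example.com is a placeholder, not the site under discussion. urlsplit only splits the URL into components and does not normalize the path, so the ".." survives untouched.

```python
from urllib.parse import urlsplit

# Placeholder host; only the path matters for this illustration.
url = "http://www.example.com/../acatalog/shop.html"

# urlsplit separates the components without rewriting the path.
path = urlsplit(url).path

# Drop the empty string before the leading "/" to get the path segments.
segments = path.split("/")[1:]

print(path)
print(segments)
```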
Section 5.2 is also relevant here. In particular:
g) If the resulting buffer string still begins with one or more
complete path segments of "..", then the reference is
considered to be in error. Implementations may handle this
error by retaining these components in the resolved path (i.e.,
treating them as part of the final URI), by removing them from
the resolved path (i.e., discarding relative levels above the
root), or by avoiding traversal of the reference.
The common practice seems to be for client-side implementations to handle
this with the second option (removing the ".." segments) and for servers to
use the third (avoiding traversal of the reference). urllib uses the first
option (retaining them), which is also correct but not as useful as it
might be.
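The three ways of handling a leading ".." that step g) allows can be sketched roughly as below. This is only an illustration, not urllib's code: the function name resolve_dotdot and the policy names "retain", "remove" and "reject" are made up here, and the handling of a trailing "." or ".." is simplified compared with the full algorithm in section 5.2.

```python
def resolve_dotdot(path, policy="remove"):
    """Resolve "." and ".." in an absolute path.

    policy controls what happens to ".." segments that would climb
    above the root (hypothetical names, chosen for this sketch):
      "retain" - keep them at the front of the resolved path
      "remove" - silently discard them
      "reject" - refuse the path by raising ValueError
    """
    if policy not in ("retain", "remove", "reject"):
        raise ValueError("unknown policy: %r" % (policy,))
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")

    resolved = []  # segments kept so far
    excess = 0     # number of ".." segments that point above the root

    for segment in path.split("/")[1:]:
        if segment == ".":
            continue                # "." refers to the current level
        if segment == "..":
            if resolved:
                resolved.pop()      # step back one level
            else:
                excess += 1         # nothing left to step back from
            continue
        resolved.append(segment)

    if excess and policy == "reject":
        raise ValueError("path climbs above the root: %r" % (path,))
    if excess and policy == "retain":
        resolved = [".."] * excess + resolved

    return "/" + "/".join(resolved)


for policy in ("retain", "remove"):
    print(policy, resolve_dotdot("/../acatalog/shop.html", policy))
```

With the path from this thread, "retain" leaves the path as it was written, "remove" gives the path that the browsers end up requesting, and "reject" raises an error instead of returning a path.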