Re: urllib interpretation of URL with ".."



Duncan Booth wrote:
"Martin v. Löwis" <martin@xxxxxxxxxxx> wrote:


Is "urllib" wrong?

Section 5.2 is also relevant here. In particular:


g) If the resulting buffer string still begins with one or more
complete path segments of "..", then the reference is
considered to be in error. Implementations may handle this
error by retaining these components in the resolved path (i.e.,
treating them as part of the final URI), by removing them from
the resolved path (i.e., discarding relative levels above the
root), or by avoiding traversal of the reference.


The common practice seems to be for client-side implementations to handle this using option 2 (removing them) and for servers to use option 3 (avoiding traversal of the reference). urllib uses option 1, which is also correct but not as useful as it might be.
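
As a rough illustration of the three options in (g), here is a minimal sketch. It is not urllib's actual code: the function name resolve_dotdot and the policy names are made up for this example, and it only merges paths (no scheme, host, query, or trailing-dot handling).

```python
def resolve_dotdot(base_path, rel_path, policy="retain"):
    """Merge rel_path onto base_path (hypothetical helper, paths only).

    policy picks how ".." segments that climb above the root are handled:
    "retain" keeps them, "remove" discards them, "reject" raises ValueError.
    """
    if policy not in ("retain", "remove", "reject"):
        raise ValueError("unknown policy: %r" % policy)

    # Drop the last segment of the base path; relative references are
    # resolved against the base's "directory".
    segments = base_path.split("/")[1:-1]
    excess = 0  # count of ".." segments that would climb above the root

    for seg in rel_path.split("/"):
        if seg == ".":
            continue
        if seg == "..":
            if segments:
                segments.pop()
            else:
                excess += 1
        else:
            segments.append(seg)

    if excess and policy == "reject":
        raise ValueError("reference climbs above the root")
    if policy == "remove":
        excess = 0
    return "/" + "/".join([".."] * excess + segments)


print(resolve_dotdot("/a/b", "../../c", "retain"))  # /../c
print(resolve_dotdot("/a/b", "../../c", "remove"))  # /c
```

With "retain" the extra ".." stays in the result, which is the behaviour described for urllib above; "remove" is what most clients appear to do, and "reject" corresponds to a server refusing to traverse the reference.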

That's helpful. Thanks.

In Python, of course, "urlparse.urlparse", the main function used to
disassemble a URL, has no idea whether it's being used by a client or a
server, so it, reasonably enough, takes option 1.

(Yet another hassle in processing real-world HTML.)

John Nagle