Re: Bugs in http

tom.rmadilo wrote:
On Nov 12, 4:01 pm, "Donal K. Fellows"
<donal.k.fell...@xxxxxxxxxxxxxxxx> wrote:

However, I'm arguing something less: the data is binary/opaque during
transfer. If the application can then figure out how to do a
transform, great, but it isn't part of http and the http part of the
application should preserve the exact data it received.
I disagree. HTTP is not just a binary download protocol.

Just point me to the relevant part of the protocol, because I'm not
the only one who has missed this completely.

Since you asked: RFC 1945, Section 3.6.1, specifically the second and third paragraphs, which read:

Media subtypes of the "text" type use CRLF as the text line break when in canonical form. However, HTTP allows the transport of text media with plain CR or LF alone representing a line break when used consistently within the Entity-Body. HTTP applications must accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP.

In addition, if the text media is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the Entity-Body; a bare CR or LF should not be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries).
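To make the "must accept CRLF, bare CR, and bare LF" requirement concrete, here is a minimal Python sketch of how a receiver might split a text entity body into lines. The function name split_text_lines is hypothetical, and the sketch assumes a character set in which CR and LF really are octets 13 and 10 (it does not cover the multi-byte case from the second paragraph).

```python
import re

# CRLF is listed first so that a CR LF pair counts as one line break,
# not as a bare CR followed by a bare LF.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def split_text_lines(body: bytes) -> list:
    """Split a text entity body on CRLF, bare CR, or bare LF.

    Hypothetical helper for illustration only; assumes CR and LF are
    encoded as octets 13 and 10 in the body's character set.
    """
    return _LINE_BREAK.split(body)
```

For example, split_text_lines(b"one\r\ntwo\rthree\nfour") yields the four lines regardless of which convention the sender used.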

And please explain why Mozilla failed your interpretation. Why didn't
it convert <LF> to <CR><LF>?

It has a bug -- file a bug report with Mozilla.

But it isn't even clear when you think such a transformation should
take place. Http is a transport protocol, not a file saving protocol.

To be exact, it is the Hypertext Transfer Protocol (note the second part of the compound first word in the name).

My example simply proves that no conversion takes place. If the save
operation had performed eol conversions, I could not so easily
demonstrate that the http protocol did not perform the conversion. But
the saved document did not contain <CR><LF> conversions, proving that
at least mozilla on windows does not follow your interpretation.

Again, sounds like a Mozilla bug -- file a bug report with them.

The http stuff is way over before the "save a copy" operation. If http
required or allowed any conversion they would already have been done.

It does require it -- for text media subtypes. Please read the RFC, and pay particular attention to the sections I quoted above.

Binary/opaque/octet download means that "no interpretation" is forced
on the content. Interpretation is up to the application. Conversion
destroys the possibility of user applied interpretation.

You seem to forget that the protocol allows for multiple media types -- the conversion rules *ONLY APPLY TO TEXT MEDIA SUBTYPES*!!! (yes, I meant to yell!) -- you are using bait-and-switch arguments here!
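As a rough illustration of that distinction, the following Python sketch converts line breaks only when the media type is a "text" subtype and passes everything else through untouched. The name to_local_text and its eol parameter are hypothetical, it is not how any particular http library behaves, and it again assumes CR and LF are octets 13 and 10.

```python
import re

# CRLF first, so a CR LF pair is replaced once rather than twice.
_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def to_local_text(body: bytes, content_type: str, eol: bytes = b"\n") -> bytes:
    """Normalize line breaks for text/* bodies; leave other types as-is.

    Hypothetical helper for illustration only.
    """
    # Drop parameters such as "; charset=..." and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if not media_type.startswith("text/"):
        return body  # opaque data: preserve the exact octets received
    # A function replacement avoids any escape processing of eol.
    return _LINE_BREAK.sub(lambda _match: eol, body)
```

With this, a body served as application/octet-stream comes back byte for byte, while one served as text/plain comes back with the local line ending.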

There are just endless examples of when your "perfect world" model

Imagine a tcl source file. You set up a server that allows files
ending in .tcl to be served as text/plain. Problem: tcl files allow
binary data to follow a ^Z or eof. How do you configure a server to
handle such a situation? So you have now vastly complicated the
ability to support source code browsing.

Sounds like you have a misconfigured server -- the type should be application/tcl.

Not sure why the http protocol needs to be burdened with all these
complications. It is already stupidly complex because of the different
platform eol conventions, you want to extend that madness to the body

Go argue with the W3C and ISO committee -- we are just telling you the way it *IS*.

| Gerald W. Lester |
|"The man who fights for his ideals is the man who is alive." - Cervantes|