Re: Bugs in http
- From: "tom.rmadilo" <tom.rmadilo@xxxxxxxxx>
- Date: Fri, 13 Nov 2009 08:18:30 -0800 (PST)
On Nov 12, 11:51 pm, "Gerald W. Lester" <Gerald.Les...@xxxxxxx> wrote:
tom.rmadilo wrote:
Still can't figure out when the translation is supposed to take place.
For text media types the on-the-wire format is supposed to use <CR><LF> as
the EOL marker. In the case of Tcl's http package, it should translate those
to the Tcl internal standard of <LF>. If the string is then written out to
the file system, it should translate the <LF> to whatever that channel has
been fconfigured to (which by default is the standard for the platform the
program is running on).
Okay, from my reading of the standard, it only applies to text/plain,
but even that is limited.
In fact, the meaning is almost the opposite of what you claim when you
take into account all media types.
Backing up a little, to include more of the standard:
2.3.1. Canonicalization and Text Defaults
Internet media types are registered with a canonical form. An
entity-body transferred via HTTP messages MUST be represented in the
appropriate canonical form prior to its transmission except for
"text" types, as defined in the next paragraph.
When in canonical form, media subtypes of the "text" type use CRLF as
the text line break. HTTP relaxes this requirement and allows the
transport of text media with plain CR or LF alone representing a line
break when it is done consistently for an entire entity-body. HTTP
applications MUST accept CRLF, bare CR, and bare LF as being
representative of a line break in text media received via HTTP. In
addition, if the text is represented in a character set that does not
use octets 13 and 10 for CR and LF respectively, as is the case for
some multi-byte character sets, HTTP allows the use of whatever octet
sequences are defined by that character set to represent the
equivalent of CR and LF for line breaks. This flexibility regarding
line breaks applies only to text media in the entity-body; a bare CR
or LF MUST NOT be substituted for CRLF within any of the HTTP control
structures (such as header fields and multipart boundaries).
If an entity-body is encoded with a content-coding, the underlying
data MUST be in a form defined above prior to being encoded.
....
3.2.1. Type
When an entity-body is included with a message, the data type of that
body is determined via the header fields Content-Type and
Content-Encoding. These define a two-layer, ordered encoding model:
entity-body := Content-Encoding( Content-Type( data ) )
Content-Type specifies the media type of the underlying data. Any
HTTP/1.1 message containing an entity-body SHOULD include a
Content-Type header field defining the media type of that body, unless
that information is unknown. If the Content-Type header field is not
present, it indicates that the sender does not know the media type of
the data; recipients MAY either assume that it is
"application/octet-stream" ([RFC2046], Section 4.5.1) or examine the
content to determine its type.
So it is clear from the first section that CR, LF, or CRLF, or any
charset-defined equivalent of octets 13 and 10, must be accepted. And
the best way to accept them is to just take the text "as-is".
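To make the "accept all three" requirement concrete, here is a minimal sketch in Python (not Tcl, and not what the http package does). The function name split_text_lines is hypothetical, and the sketch assumes the body has already been decoded from a charset in which CR and LF are the usual characters 13 and 10. It recognizes every allowed line break without modifying the body itself.

```python
import re

# CRLF is listed first so that a CRLF pair is treated as one break,
# not as a bare CR followed by a bare LF.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_text_lines(body):
    """Split decoded text on CRLF, bare CR, or bare LF.

    The input string is left untouched; only the returned list
    reflects the line structure.
    """
    return _LINE_BREAK.split(body)


if __name__ == "__main__":
    sample = "first\r\nsecond\rthird\nfourth"
    for line in split_text_lines(sample):
        print(repr(line))
```

The point of the sketch is that recognizing a line break and rewriting it are separate steps; the quoted text only requires the first.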
The second section says that the server does not have to understand
and should not guess at the media types it serves, and the client is
under no obligation to figure out the type either. But if the client
wants to figure out the type, it can look at the content, somewhat
like an XML application would, or the way Unix applications do.
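The two-layer model quoted above can be sketched roughly as follows, again in Python and only as an illustration. The function name decode_entity_body and its parameters are assumptions made for this example; only gzip and identity codings are handled, and a missing Content-Type falls back to "application/octet-stream", which is one of the two options the quoted text permits.

```python
import gzip

# Fallback the quoted text allows when no Content-Type is present.
DEFAULT_MEDIA_TYPE = "application/octet-stream"


def decode_entity_body(raw, content_type=None, content_encoding=None):
    """Undo Content-Encoding, then report the media type of the data.

    Returns a (media_type, data) tuple. The data bytes are returned
    exactly as they were before the content-coding was applied.
    """
    # Outer layer: remove the content-coding first.
    coding = (content_encoding or "identity").strip().lower()
    if coding in ("gzip", "x-gzip"):
        data = gzip.decompress(raw)
    elif coding == "identity":
        data = raw
    else:
        raise ValueError("unsupported content-coding: " + coding)

    # Inner layer: the media type, with parameters such as charset dropped.
    if content_type:
        media_type = content_type.split(";", 1)[0].strip().lower()
    else:
        media_type = DEFAULT_MEDIA_TYPE
    return media_type, data
```

Note that nothing in this step touches line endings: the coding is removed and the type is reported, but the underlying bytes come back unchanged.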
But there is still nothing here that talks about transforms or
substitutions of EOL chars in text. If the client receives a text
media type which contains a mixture of CR, LF and CRLF as EOL, it
seems to me that the client should notify the user and refuse to do
any translation during a save operation. It seems to me this would
require several passes over the data, so using [fcopy] with
translations on the first pass would cause problems. Maybe it should
save to a temp file on the first pass.
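A rough sketch of that check, in Python for illustration only: the first pass counts which EOL styles occur, and translation happens in a second pass only if a single style was used. The names count_eol_styles and translate_if_consistent are hypothetical, the whole body is assumed to fit in memory (standing in for the temp file), and the charset is assumed to use octets 13 and 10 for CR and LF.

```python
import re

# CRLF first, so a CRLF pair is never counted as CR plus LF.
_EOL = re.compile(rb"\r\n|\r|\n")
_NAMES = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def count_eol_styles(body):
    """First pass: count how often each EOL style occurs in the raw bytes."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in _EOL.finditer(body):
        counts[_NAMES[match.group()]] += 1
    return counts


def translate_if_consistent(body, target=b"\n"):
    """Second pass: translate EOLs only if one style was used throughout.

    Raises ValueError for mixed line endings so the caller can notify
    the user and save the body untranslated instead.
    """
    used = [name for name, n in count_eol_styles(body).items() if n]
    if len(used) > 1:
        raise ValueError("mixed line endings: " + ", ".join(used))
    # A function replacement avoids backslash processing of the target.
    return _EOL.sub(lambda m: target, body)
```

A caller would catch the ValueError, warn the user, and write the original bytes out unchanged.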