Re: Zero terminated strings



On 2009-07-31, jacob navia <jacob@xxxxxxxxxx> wrote:
> Zero terminated strings are a continuing security nightmare.
>
> Slashdot reports this today:
>
> "Two researchers, Dan Kaminsky and Moxie Marlinspike, came up with exact
> same way to fake being a popular website with authentication from a
> certificate authority.
>
> Wired has the details: 'When an attacker who owns his own domain —
> badguy.com — requests a certificate from the CA, the CA, using contact
> information from Whois records, sends him an email asking to confirm his
> ownership of the site. But an attacker can also request a certificate
> for a subdomain of his site, such as Paypal.com\0.badguy.com, using the
> null character \0 in the URL.

Obviously, this bug was caused by idiots who thought that they could solve some
imaginary problem by using a ``better'' string library that can represent a
null byte in the middle of a string.
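
To make the mismatch concrete, here is a minimal sketch in C. The struct
counted_name and both helper functions are hypothetical names invented for
illustration; they are not taken from any real certificate or TLS library.
It only shows how the same bytes can yield two different answers depending on
whether the reader stops at the first null byte or trusts the stored length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-counted name, e.g. as a certificate field might carry it. */
struct counted_name {
    const char *bytes;
    size_t len;
};

/* View the name the way zero-terminated code does: everything after the
 * first null byte (if any) is invisible. */
static int matches_as_c_string(const struct counted_name *n, const char *host)
{
    assert(n != NULL && n->bytes != NULL && host != NULL);
    const char *nul = memchr(n->bytes, '\0', n->len);
    size_t visible = nul ? (size_t)(nul - n->bytes) : n->len;
    return visible == strlen(host) && memcmp(n->bytes, host, visible) == 0;
}

/* View the name the way length-counted code does: all len bytes count,
 * including any embedded null. */
static int matches_as_counted(const struct counted_name *n, const char *host)
{
    assert(n != NULL && n->bytes != NULL && host != NULL);
    return n->len == strlen(host) && memcmp(n->bytes, host, n->len) == 0;
}
```

Given the 22 bytes of "paypal.com\0.badguy.com", the first function reports a
match against "paypal.com" and the second does not. Two components that
disagree in this way about one and the same name are what the attack relies on.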

A null byte has absolutely no place in character (i.e. text) strings. If an
array of bytes contains nulls, it's not a character string, but a binary
string, or blob if you will. Null is not really a character: it has no glyph,
and no signaling action for printing control.

There is no legitimate need, ever, in a data representation for text, to
support an embedded null byte. It's not text; it's a special code which says
``I am not text''. So, implicitly, if a null byte follows text, it means either
that the text has ended, or the text is corrupt with the repugnant inclusion of
non-text data.

The moral of this story is that if your language or string library allows nulls
in the middle of a string, it's wrong, and you should fix it such that the null
is treated as a terminator, or such that an exception is triggered if it
occurs.
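
A rough sketch of the "trigger an exception" option in C, where the closest
equivalent is an error return. The names is_clean_text and text_to_c_string
are made up for this example, and the policy shown (reject rather than
truncate) is just one of the two choices mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the len-byte buffer contains no null byte, 0 otherwise. */
static int is_clean_text(const char *bytes, size_t len)
{
    assert(bytes != NULL);
    return memchr(bytes, '\0', len) == NULL;
}

/* Convert a length-counted buffer to a zero-terminated string in dst.
 * Returns 0 on success, or -1 if the input holds an embedded null or
 * does not fit in dst_size bytes (terminator included). */
static int text_to_c_string(char *dst, size_t dst_size,
                            const char *bytes, size_t len)
{
    assert(dst != NULL && bytes != NULL);
    if (len >= dst_size || !is_clean_text(bytes, len))
        return -1;
    memcpy(dst, bytes, len);
    dst[len] = '\0';
    return 0;
}
```

With this check at the boundary, "paypal.com\0.badguy.com" is refused outright
instead of being silently shortened to "paypal.com" by later code.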

There are good reasons for working with strings in a representation other than
the null-terminated array, but being able to represent a null in the middle of
a string is not one of those good reasons. Strings that know their own length
should still banish the null byte from being a constituent.