Re: Null pointers

From: Chris Torek (nospam_at_torek.net)
Date: 6 Aug 2004 18:40:23 GMT


>In article <bxJQc.397$QJ3.43@newssvr21.news.prodigy.com>, "Mabden"
><mabden@sbc_global.net> writes:
>> All my critics, please just write a simple program to read from location
>> zero in memory. If you can do it, post what you find there.

I have a number of embedded-systems boards accessible to me that have
ordinary RAM at *(char *)0. Put something in there and it is in there;
read from it and you get the same data back:

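    #include <stdio.h>

    void showZeros(void) {
        char *p = 0;    /* null pointer constant */
        int i = 0;      /* ordinary int that happens to hold zero */

        *p = 2;
        printf("p = %p, *p = %d\n", (void *)p, *p);
        p = (char *)i;  /* int-to-pointer conversion needs a cast */
        *p = 3;
        printf("p = %p, *p = %d\n", (void *)p, *p);
    }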

Compiled and run (from the target shell) under vxWorks on various
single-board machines, this produces:

    -> showZeros()
    p = 0x0, *p = 2
    p = 0x0, *p = 3
    ->

(Both Diab and GNU use all-bits-zero for NULL, even on these
machines that have RAM at location 0.) On other single-board
machines RAM starts "above" 0; where there is ROM at 0 this
produces things like:

    p = 0x0, *p = -72
    p = 0x0, *p = -72

and where there is nothing at all at zero the task gets a fault
and gets suspended.

In article <news:cf0aif0hq2@news3.newsguy.com>
Michael Wojcik <mwojcik@newsguy.com> writes:
>Try it on an older VAX VMS C implementation, and you'll find a zero
>there. Because some old C programs relied on this, there are more
>recent implementations which support the "zero-address hack" as an
>option - an option which does NOT render them non-conforming. IBM's
>C for AIX had this option (actually implemented by the linker); I
>don't know if it still does, but I have a machine sitting around with
>a sufficiently old version of AIX to still provide it.

4.1BSD on the VAX also had *(char *)0 == 0 (in fact there was a
short-word, 16 bits long, of all-zero-bits at 0). I thought most
VMS systems mapped page zero away, though.

Version 6 Unix on the PDP-11 (other than split I&D) had the magic
number for the a.out format at 0; for OMAGIC binaries (mode 0407)
we had *(char *)0 == 7 (remember that the PDP-11 was "PDP-endian",
little-endian within 16-bit words and big-endian for "long"s stored
as two 16-bit little-endian words). Since C had its main development
on the PDP-11, one might even say that *(char *)0 == 7 was the
*expected* result. :-)
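
To make the byte-order remark concrete, here is a small sketch of how
a 16-bit word and a 32-bit "long" would be laid out in memory under
the scheme described above. The helper names store_word_le() and
store_long_pdp() are invented for this example, and the code merely
models the layout with fixed-width types; it is not taken from any
PDP-11 source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit word little-endian: low-order byte first. */
void store_word_le(uint16_t w, unsigned char out[2]) {
    assert(out != NULL);
    out[0] = (unsigned char)(w & 0xFF);
    out[1] = (unsigned char)(w >> 8);
}

/* Store a 32-bit value in "PDP-endian" order: high-order 16-bit word
   first, with each word itself stored little-endian. */
void store_long_pdp(uint32_t v, unsigned char out[4]) {
    assert(out != NULL);
    store_word_le((uint16_t)(v >> 16), out);
    store_word_le((uint16_t)(v & 0xFFFF), out + 2);
}
```

Octal 0407 is 0x0107, so store_word_le(0407, buf) puts 7 in buf[0]
and 1 in buf[1], which is why the first byte of such a binary read
as 7. Likewise store_long_pdp(0x0A0B0C0D, buf) yields the byte
sequence 0B 0A 0D 0C.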

Most peculiar of all, I think, was some code that crept into
System III or System V Unix (not sure which). This is a paraphrase
(I have no recollection of the actual second argument to strcmp()):

    if (strcmp(p, "#\307x") == 0)
        ...

The reason this code got in was that the 3B system on which
the programmer wrote it happened to have an odd sequence of bytes
at *(char *)0, and he wrote that strcmp() call instead of the
correct test:

    if (p == NULL)

In other words, whoever wrote this Unix utility, whichever utility
it was (cpio?), thought that *(char *)0 was supposed to contain a
weird string! (I believe I heard this story from Doug Gwyn, who
might remember which utility it was and what the code was, though it
may have been Guy Harris. Whoever it was found the problem by porting
the utility and discovering that it did not work on the
new machine.)

The history and "current state of the world" are clear enough, and
mabden@sbc_global.net is simply wrong. The C Standards are a
little tougher to read and interpret, but overall, the facts are:

  - The NULL macro, and the null pointer constants, are source
    code constructs.

  - A compiler's job is to convert source code constructs to
    suitable machine code. This allows the compiler to change
    "what you see in the source" to "what you will see if you
    disassemble the machine code". That is, there is some
    mapping between "external" representation -- what you type
    into a C program, or see when you printf() -- and "internal"
    representation, as used by machine-level code.

  - The Standards allow the machine code to use any particular bit
    pattern(s) the implementor chooses to represent various null
    pointers internally, *provided* that no valid C object's address
    (nor function pointer) compares equal to any such null pointer.

  - Most implementors use all-bits-zero for internal null pointers.
    It is usually the easiest thing to do, and most people usually
    do the easiest thing. But other bit patterns are allowed, and
    some machines have particularly complicated pointers (e.g.,
    IBM AS/400) so that all-bits-zero is not easiest after all.

  - Many machines without virtual memory, and even some with, have
    ROM or RAM at physical address 0.

  - C implementations that use all-bits-zero for all their internal
    null pointers *and* that have useable RAM at address 0 must
    make sure not to put any C object or function at address zero,
    which is typically easily achieved by putting some "non-C"
    thing there, such as startup code or a "shim".

  - Many implementations with virtual memory simply map out address
    0 so that improper attempts to access it are caught right away.
    This is a good thing, but is not required by the C Standards.

  - C implementations that use something other than "hardware
    address 0" for their internal null pointers *and* that have
    useful stuff at "hardware address zero" are not actually
    obligated to let you get at the useful stuff -- nobody ever
    said C *has* to be useful for systems programmers -- but will
    likely have some trick(s) you can use to do that, because
    most systems programmers *like* their C systems to be useful
    to them. (After all, why buy a C compiler if you cannot *use*
    it?)
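
The first few points above can be sketched in C. This is only an
illustration under stated assumptions: is_null(), is_all_bits_zero()
and check_null_spellings() are names made up for the example, not
part of the Standards or of any particular compiler. The sketch
contrasts the source-level test, which works whatever bit pattern
the implementation picks, with a test that peeks at the internal
representation and is therefore not portable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable test: a comparison of pointer values. The compiler
   supplies whatever internal bit pattern it uses for null. */
int is_null(const void *p) {
    return p == NULL;
}

/* Non-portable test (hypothetical helper): looks at the stored
   bytes of the pointer. Returns 1 only if every byte is zero;
   the result for a null pointer is implementation-specific. */
int is_all_bits_zero(const void *p) {
    unsigned char zeros[sizeof p];
    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}

/* The source-level spellings of "null" must all agree with each
   other, regardless of the internal representation. */
void check_null_spellings(void) {
    char *a = NULL;
    char *b = 0;
    char *c = (char *)0;
    assert(a == b && b == c);
    assert(is_null(a) && !a);
}
```

On the common all-bits-zero implementations is_all_bits_zero(NULL)
returns 1, but nothing in the Standards promises that, which is why
the sketch never asserts it.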

-- 
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
email: forget about it   http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

