Re: pointer and storage



Ark <akhasin@xxxxxxxxxxxxxxxxxxxx> writes:
Keith Thompson wrote:
[...]
The following:
char *a = "hello";
*a = 'a';
(assuming it appears in an appropriate context) is legal (it violates
no syntax rules or constraints), and a conforming compiler must accept
it. But, as always, a compiler is free to issue any diagnostics it
likes. The standard requires diagnostics in certain cases; it never
forbids them.
If the assignment is executed, it invokes undefined behavior. The
undefined behavior is unconditional, though the effects of the
undefined behavior can be literally anything. There is no
implementation-defined behavior involved (implementation-defined
behavior must be documented by the implementation, and there is no
documentation requirement here).

That's exactly where my comprehension fails me.
After
char *a = "hello";
the pointer /is/ initialized, and if, as Keith writes,
*a = 'a';
produces the UB unconditionally, it means that the initialization of
the pointer is unconditionally bad (for the type), isn't it?

No, it isn't, but it's a bad idea.

Initializing a char* object ("a" in this case) to point to the first
character of a string literal is perfectly legal. For example,
reading the elements of the array through the pointer will work just
fine. Undefined behavior occurs only if you try to *modify* elements
of the array.
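
To make the read-versus-modify distinction concrete, here is a minimal
sketch. The names count_char and count_l_in_hello are made up for
illustration and are not from the original discussion. Everything in it
only reads through the pointer, so the behavior is well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts how often c occurs in s.
   It only reads through s and never writes. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == c) {
            n++;
        }
    }
    return n;
}

/* Points a plain char* at a string literal, as in the post,
   and then performs read-only accesses through it. */
size_t count_l_in_hello(void)
{
    char *a = "hello";   /* legal initialization */

    /* Reading *a, a[1], ... is fine. Writing, as in *a = 'a',
       would be undefined behavior and is deliberately not done. */
    return count_char(a, 'l');
}
```

If a is declared as const char * instead, an attempted write such as
*a = 'a' becomes a constraint violation, so the compiler is required to
diagnose it.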

There must be a reason (like "old broken code"? or something else?) why
the type of "hello" is not const char *.

It's to avoid breaking old code that may have been written before
"const" was introduced to the language (a *long* time ago). For example:

#include <stdio.h>

void print_string(char *s)
{
    printf("print_string(\"%s\")\n", s);
}

int main(void)
{
    char *message = "hello";
    print_string(message);
    return 0;
}

In old versions of the C language, before "const" was introduced, this
kind of thing was common. The language didn't provide a way to have
the compiler warn you if you tried to modify something that shouldn't
be modified.

Once "const" was introduced, it might have made sense to make string
literals const, but that would have broken existing code, which was
considered unacceptable. All the existing code would have had to be
modified by adding "const" qualifiers, which in turn would have meant
it would fail to compile under old compilers. That was considered too
high a price to pay.

OK, I can drill this case down my brain, but this leaves the following
question:
What are (all) legal initializations of char *a such that assigning to
*a is UB-free?

There are infinitely many such initializations. As long as a points
to modifiable memory, you can modify it.

Here's one example:

char str[] = "hello";
char *s = str;

The first line creates str as a non-const array. The second
initializes s to point to the first character of the array.
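
A few more cases, as a rough sketch rather than an exhaustive list. The
function names set_first and demo_modifiable are invented for this
example. In each case the pointer refers to storage the program is
allowed to modify: an array, a single char object, and memory obtained
from malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: overwrites the char that a points to.
   The caller must pass a pointer to modifiable storage. */
void set_first(char *a, char c)
{
    assert(a != NULL);
    *a = c;
}

/* Returns 1 if all three writes took effect, 0 otherwise
   (including the case where malloc fails). */
int demo_modifiable(void)
{
    char str[] = "hello";                 /* array initialized from the literal */
    char single = 'x';                    /* a single char object */
    char *heap = malloc(sizeof "hello");  /* allocated storage, 6 bytes */
    int ok;

    if (heap == NULL) {
        return 0;
    }
    strcpy(heap, "hello");

    set_first(str, 'a');      /* modifies the array, not the literal */
    set_first(&single, 'a');
    set_first(heap, 'a');

    ok = strcmp(str, "aello") == 0
        && single == 'a'
        && strcmp(heap, "aello") == 0;

    free(heap);
    return ok;
}
```

Note that char str[] = "hello"; copies the characters of the literal
into the array, so writing to str never touches the literal itself.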

--
Keith Thompson (The_Other_Keith) kst-u@xxxxxxx <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.