Re: pointer and storage



On Sun, 17 Sep 2006 21:23:46 -0400, Ark <akhasin@xxxxxxxxxxxxxxxxxxxx>
wrote in comp.lang.c:

Keith Thompson wrote:
Ark <akhasin@xxxxxxxxxxxxxxxxxxxx> writes:
Richard Heathfield wrote:
[...]
The behaviour resulting from modifying a constant string is
undefined. A segmentation fault is one possible result. The absence
of a segmentation fault is another possible result. And the
destruction of Rome by fire is another possible result.

It would be clearer to use the term "string literal" rather than
"constant string".

I am confused profoundly.
I always thought that where the string literals are stored (RO vs. RW)
is implementation-defined

Yes.

(and decent compilers would allow me to
choose my way with a command-line switch).

That's debatable. I don't see much advantage in allowing string
literals to be modifiable (except *maybe* to handle old and broken
code).

However, the /type/ of (a pointer to) a string literal is char *,
regardless of the switch, or so I read the standard a while ago.

Yes.

So the statement
*a='a';
must compile OK *without diagnostics* and then cause or not cause
undefined behavior depending on implementation-defined behavior.

The following:
char *a = "hello";
*a = 'a';
(assuming it appears in an appropriate context) is legal (it violates
no syntax rules or constraints), and a conforming compiler must accept
it. But, as always, a compiler is free to issue any diagnostics it
likes. The standard requires diagnostics in certain cases; it never
forbids them.

If the assignment is executed, it invokes undefined behavior. The
undefined behavior is unconditional, though the effects of the
undefined behavior can be literally anything. There is no
implementation-defined behavior involved (implementation-defined
behavior must be documented by the implementation, and there is no
documentation requirement here).
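
As a minimal sketch of the point above (the helper names literal_size and first_char_of_literal are hypothetical, not from the thread): in C a string literal denotes an array of plain char, so pointing a char * at it violates no constraint, and reading through that pointer is well defined. Only the write is undefined, which is why it is left commented out here.

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of the array that "hello" denotes. */
size_t literal_size(void)
{
    /* sizeof sees the array type char[6] (five letters plus '\0'),
       not a pointer. */
    return sizeof "hello";
}

/* Hypothetical helper: reads through a plain char * aimed at a literal. */
char first_char_of_literal(void)
{
    char *a = "hello";   /* accepted: no syntax rule or constraint is violated */
    /* *a = 'a'; */      /* would also be accepted, but is undefined behavior
                            if executed */
    return *a;           /* reading through the pointer is well defined */
}
```

A compiler may still warn about the initialization (some have an option for exactly that); the standard permits extra diagnostics but does not require one here.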

That's exactly where my comprehension fails me.
After
char *a = "hello";
the pointer /is/ initialized, and if, as Keith writes,
*a = 'a';
produces the UB unconditionally, it means that the initialization of the
pointer is unconditionally bad (for the type), isn't it? There must be a
reason (like "old broken code"? or something else?) why the type of
"hello" is not const char *.

The simple fact is that string literals existed in the early C
language long before the const keyword appeared. So sufficiently old
code that assigned the address of a string literal to a plain old
ordinary pointer to char is not necessarily "broken", it was the only
character pointer type available at the time.

Having the const keyword available officially now for almost 17 years
does make it easier to avoid accidental errors, if it is used
properly. Attempting to write through a "pointer to const type" is a
constraint violation requiring a diagnostic.
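
A small sketch of that usage, assuming nothing beyond standard C (count_chars and literal_length are hypothetical names chosen for illustration): declaring the pointer as const char * lets the compiler catch an accidental write at translation time instead of leaving it as run-time undefined behavior.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters through a pointer to const char. */
size_t count_chars(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    /* s[0] = 'a'; */   /* constraint violation: the compiler must issue
                           a diagnostic if this line is enabled */
    return n;
}

/* Hypothetical helper: points a const-qualified pointer at a literal. */
size_t literal_length(void)
{
    const char *msg = "hello";   /* const documents that the data is read-only */
    return count_chars(msg);
}
```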

OK, I can drill this case into my brain, but this leaves the following
question:
What are (all) legal initializations of char *a such that assigning to
*a is UB-free?

I'm too lazy to think hard about it right now, but assigning the
address of a modifiable array and using dynamic allocation come to
mind, without getting into type punning.

char ok [] = "hello";
char *a = ok;

...results in the pointer a pointing to characters that can be modified.
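
Both approaches mentioned above can be sketched as follows. The function names are hypothetical, and the replacement character 'j' is only for illustration. In each case the pointer refers to a modifiable copy of the characters rather than to the string literal itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: modifies a copy of "hello" held in an array.
   Returns 1 if the array now holds "jello". */
int modify_array_copy(void)
{
    char ok[] = "hello";   /* the array is initialized with a copy of the literal */
    char *a = ok;
    *a = 'j';              /* well defined: ok is a modifiable object */
    return strcmp(ok, "jello") == 0;
}

/* Hypothetical helper: the same idea with dynamic allocation.
   Returns 1 on success, 0 on mismatch, -1 if malloc fails. */
int modify_heap_copy(void)
{
    char *a = malloc(sizeof "hello");   /* room for 5 characters plus '\0' */
    int result;

    if (a == NULL) {
        return -1;
    }
    strcpy(a, "hello");    /* copy the literal's characters into the block */
    *a = 'j';              /* well defined: allocated storage is modifiable */
    result = strcmp(a, "jello") == 0;
    free(a);
    return result;
}
```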

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html