Re: Strange strcmp() action?



Eric <answer.to.newsgroup@xxxxxxxxxx> writes:
On Thu, 1 Jan 2009 14:30:50 -0800 (PST), JC <jason.cipriani@xxxxxxxxx>
wrote:

No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The strcmp()
function returns 0 only if the strings are equal, and "wertyu" and "wer"
are not equal. It returns 1 here (any positive value would be allowed)
because "wertyu" comes after "wer" alphabetically.
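
To see this for yourself, here is a minimal sketch. It assumes the original
array was something like "qwertyu" (the declaration isn't quoted here, so
that is a guess), and sign_of() and show() are made-up helper names. Only
the sign of the result is portable, so the helper reduces it to -1, 0 or 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a reconstruction of the original array, so that
   &abc[1] is "wertyu" as described in the post. */
static const char abc[] = "qwertyu";

/* Reduce a strcmp() result to -1, 0 or 1; only the sign is portable. */
static int sign_of(int r)
{
    return (r > 0) - (r < 0);
}

/* Hypothetical helper: compare two strings, print the sign of the
   result, and return that sign. */
static int show(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    int r = sign_of(strcmp(s1, s2));
    printf("strcmp(\"%s\", \"%s\") has sign %d\n", s1, s2, r);
    return r;
}
```

Calling show(&abc[1], "wer") from main() should report a positive sign,
show("wer", &abc[1]) a negative one, and show("wer", "wer") zero.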

Hmmm... OK.

http://www.cplusplus.com/reference/clibrary/cstring/strcmp.html says:

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string. If
they are equal to each other, it continues with the following pairs
until the characters differ or until a terminating null-character is
reached."

To me, that seems to say that if a null terminator is found, as it
would be at the end of "wer" in str2, then if the strings are equal up
to that point, they are equal.

It doesn't say anything about comparing the terminating null character
in str2 with the corresponding character in str1 and finding that they
are not equal (because the corresponding character in str1 is "t").

I know it doesn't say this explicitly, and I can't find anything else,
including "man strcmp" for both Linux and FreeBSD, that is explicit
about it.
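
For what it's worth, the "equal as soon as str2 runs out" behaviour
described above is a prefix test, and strncmp() with the length of str2
gives that. A rough sketch, where starts_with() is a made-up name and not
a standard function:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if s begins with prefix.
   strncmp() compares at most strlen(prefix) characters, so the
   terminating null of prefix is never compared against s. */
static int starts_with(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

With this, starts_with("wertyu", "wer") is true even though
strcmp("wertyu", "wer") is nonzero.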

Here's what the C99 standard says.

7.21.4:

The sign of a nonzero value returned by the comparison functions
memcmp, strcmp, and strncmp is determined by the sign of the
difference between the values of the first pair of characters
(both interpreted as unsigned char) that differ in the objects
being compared.

7.21.4.2:

The strcmp function returns an integer greater than, equal to, or
less than zero, accordingly as the string pointed to by s1 is
greater than, equal to, or less than the string pointed to by s2.

A "string" is, by the definition in 7.1.1p1:

a contiguous sequence of characters terminated by and including
the first null character.

So the terminating null character is part of the string, and is
compared if a mismatch isn't found sooner. In comparing the strings
"abc" and "abcd", the characters compared are:

'a' vs. 'a' (equal, keep looking)
'b' vs. 'b' (equal, keep looking)
'c' vs. 'c' (equal, keep looking)
'\0' vs. 'd' (less, return a negative result)

In my opinion it would be nice if this were made a bit more explicit.

--
Keith Thompson (The_Other_Keith) kst-u@xxxxxxx
<http://www.ghoti.net/~kst> Nokia "We must do something. This is
something. Therefore, we must do this." -- Antony Jay and Jonathan
Lynn, "Yes Minister"