Re: || putchar(ch == '\177' ? '?' : ch | 0100) == EOF)



c gordon liddy <grumpy196884@xxxxxxxxxxx> writes:
"Keith Thompson" <kst-u@xxxxxxx> wrote in message news:
87hcerqseb.fsf@xxxxxxxxxxxxxxxxxx
"c gordon liddy" <c@xxxxxxxxxxxxx> writes:
[...]
Similarly, I'm out of my depth with what follows the double pipe in the
second if clause.
|| putchar(ch == '\177' ? '?' : ch | 0100) == EOF)

Wouldn't \177 be a tri-graph? A perfectly-acceptable explanation might be
that it's beyond the scope of my present endeavor and can be omitted.

No, it's not a trigraph; trigraphs are introduced by a double question
mark. It's a character constant that uses an escape sequence. '\177'
is the character whose integer value is 177 in octal, or 127 in
decimal; it's the ASCII DEL character. "ch | 0100" yields the value
of ch with a certain bit forced on; it's a terse way of mapping
control-A (1) to 'A' and so forth. The conditional expression is used
to handle the fact that mapping DEL to "^?" is a special case.

I think I could study the above for a long time and not really get
it. It's interesting but not germane to something that can be done in
standard C. I have a double problem with the double pipe here. Not
only is that which is on the right hand side of it obfuscated C, I
don't get the control mechanism. To me, it looks like
if this then that or the other.

The quoted code, as far as I can tell, *is* standard C. I don't
believe it's deliberately obfuscated; rather, it's unusually terse,
written in a style that favors packing lots of information into
complex expressions rather than breaking it down into separate
statements.

You can skip it and go on to something easier if you like, but you
might consider taking one more stab at it.

Let's take a look at the statement:

    if (iscntrl(ch)) {
        if (putchar('^') == EOF ||
            putchar(ch == '\177' ? '?' :
                    ch | 0100) == EOF)
            break;
        continue;
    }

    if ch is a control character then
        if printing '^' fails *or* printing another character fails then
            break out of the loop (give up)
        end if
        Printing succeeded; nothing more to do here: "continue"
    end if

iscntrl(ch) returns true if ch is a "control character". In this
context, it tells us that it's a non-printable character that we want
to represent as a '^' followed by another character (^G for the ASCII
BEL character, ^? for DEL).
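
To see which values iscntrl() actually accepts, here is a minimal
sketch. The function name count_control_chars is made up for
illustration, and the result assumes the default "C" locale on an
ASCII system, where the control characters should be 0 through 31
plus 127 (DEL).

```c
#include <ctype.h>

/*
 * Count the values in 0..127 that iscntrl() reports as control
 * characters.  Assumes the default "C" locale and ASCII.
 */
int count_control_chars(void)
{
    int ch;
    int count = 0;

    for (ch = 0; ch <= 127; ch++) {
        if (iscntrl(ch))
            count++;
    }
    return count;
}
```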

Within the if statement we see two calls to putchar(), one to print
the '^' character and one to print whatever follows it. Both results
are compared against EOF (which indicates failure); if either
putchar() fails, we break out of the loop.

The part before the "||" is reasonably clear: try to print a '^'
character and check whether the attempt failed. "||" is a
short-circuit operator, evaluating its right operand only if the left
operand is false, so if the first putchar call fails we won't attempt
the second one.
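
The short-circuit behavior can be checked with a small sketch. The
names step and steps_evaluated are hypothetical; step() merely stands
in for a putchar() call that either fails or succeeds.

```c
/* Number of times step() has run since the last reset. */
static int calls;

/* Hypothetical stand-in for a putchar() call; nonzero means "failed". */
static int step(int fails)
{
    calls++;
    return fails;
}

/* Evaluate step(first) || step(second) and report how many steps ran. */
int steps_evaluated(int first_fails, int second_fails)
{
    int failed;

    calls = 0;
    failed = step(first_fails) || step(second_fails);
    (void)failed;   /* only the call count matters here */
    return calls;
}
```

If the first step fails, only one call is made; if it succeeds, the
second step runs as well.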

Now let's look at the part after the "||":

putchar(ch == '\177' ? '?' : ch | 0100) == EOF

We've covered the higher level control flow, so we're down to figuring
out what the heck

ch == '\177' ? '?' : ch | 0100

means. Some parentheses might make it clearer:

(ch == '\177') ? ('?') : (ch | 0100)

If ch is equal to '\177' (character 177 octal, 127 decimal, ASCII
DEL), the expression yields '?'. The result is that we print a '?'
after the '^'.

Otherwise (for any other control character), the result is (ch |
0100). 0100, since it begins with '0', is an octal constant, equal to
64, a power of 2. "|" is the bitwise "or" operator.
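
As a quick check of that parse, here is a minimal sketch with the
expression written both ways. The function names second_char and
second_char_parens are made up for illustration, and an ASCII
character set is assumed.

```c
/* Unparenthesized, exactly as the expression appears in the quoted code. */
int second_char(int ch)
{
    return ch == '\177' ? '?' : ch | 0100;
}

/* Fully parenthesized form; it should yield the same value for any ch. */
int second_char_parens(int ch)
{
    return (ch == '\177') ? ('?') : (ch | 0100);
}
```

Both forms give '?' for DEL and, for example, 'A' for control-A (1).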

The binary value of 0100 is 01000000. Suppose the value of ch is 7
(ASCII BEL, which we're going to want to print as "^G"). 7 is
00000111. Applying bitwise or to these two operands gives us
01000111, which is 0107 in octal, or 71 in decimal, or 'G' in ASCII.
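
The bit patterns above can be reproduced with a small helper. This is
only a sketch; to_binary8 is a hypothetical name, and it looks at just
the low 8 bits of its argument.

```c
/*
 * Write the low 8 bits of value into buf as '0' and '1' characters,
 * most significant bit first.  buf must have room for 9 chars.
 */
void to_binary8(int value, char buf[9])
{
    int i;

    for (i = 0; i < 8; i++)
        buf[i] = (value & (0x80 >> i)) ? '1' : '0';
    buf[8] = '\0';
}
```

Calling it with 7, 0100, and 7 | 0100 should give the three strings
shown in the paragraph above.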

0100 (octal) is being used as a bit mask; it has a single bit set to
1, and all others set to 0. (ch | 0100) yields the value of ch with
the bit in that particular position turned on. As it happens, that's
a terse way to specify a transformation from a control character to
the corresponding letter.

Note that (ch + 64) would have worked just as well in this context
(since we know the bit we want to turn on isn't already on). The
author probably chose to write "ch | 0100" because he thought of the
operation as setting a bit, not as the equivalent addition.
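
To check that claim, here is a small sketch that compares the two
forms over a range of values. The function name or_matches_add is made
up for illustration. The two should agree for 0 through 31, where the
0100 bit is clear, and differ once that bit is already set, as it is
for DEL (0177).

```c
/* Return 1 if (ch | 0100) equals (ch + 64) for every ch in [lo, hi]. */
int or_matches_add(int lo, int hi)
{
    int ch;

    for (ch = lo; ch <= hi; ch++) {
        if ((ch | 0100) != (ch + 64))
            return 0;
    }
    return 1;
}
```

Note that 0177 | 0100 is still 0177, so the bit-setting trick cannot
produce a printable character for DEL, which fits with the special
case for '\177' in the original expression.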

Here's a much more verbose chunk of code that does the same thing.
I've kept the "ch | 0100" idiom, but expanded everything else. The
original code is more terse than I tend to like; the following is much
too verbose for my taste, but it might be clearer. (I've compiled it,
but I haven't tested it.)

    if (iscntrl(ch)) {
        /* ch is a control character */
        int result;

        /*
         * The two characters we want to print. The first is '^';
         * we don't know yet what the second is.
         */
        int ch1 = '^';
        int ch2;

        /* Try to print the first character. */
        result = putchar(ch1);
        if (result == EOF) {
            /* Failed, terminate the loop */
            break;
        }

        if (ch == '\177') {
            /* ch is DEL, we want "^?" */
            ch2 = '?';
        }
        else {
            /*
             * ch is another control character.
             * Transform 1 to 'A', 2 to 'B', etc. using
             * our intimate knowledge of ASCII encoding.
             */
            ch2 = ch | 0100;
        }

        /* Print as above */
        result = putchar(ch2);
        if (result == EOF) {
            break;
        }

        /* Both characters printed; nothing more to do for this ch. */
        continue;
    }
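
For anyone who wants to try the fragment in context, here is a
minimal sketch of a surrounding loop. The enclosing loop is not shown
in the quoted code, so everything outside the if statement is an
assumption: the function name copy_visible, the FILE * parameters, and
the use of getc/putc in place of getchar/putchar. It also treats
newline and tab like any other control character (printing ^J and
^I), which the original program may well handle differently.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: copy `in` to `out`, showing control
 * characters in caret notation.  Returns 0 on success, or EOF if a
 * write to `out` failed.  Assumes ASCII.
 */
int copy_visible(FILE *in, FILE *out)
{
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (iscntrl(ch)) {
            if (putc('^', out) == EOF ||
                putc(ch == '\177' ? '?' : ch | 0100, out) == EOF)
                break;
            continue;
        }
        /* Assumed: anything else is copied through unchanged. */
        if (putc(ch, out) == EOF)
            break;
    }
    return ferror(out) ? EOF : 0;
}
```

Calling copy_visible(stdin, stdout) from main() would turn this into a
simple filter.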

--
Keith Thompson (The_Other_Keith) <kst-u@xxxxxxx>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"