Re: casting to unsigned char for is*() and to*() functions

James Daughtry wrote:
>>The reason you need the cast is that converting directly
>>from plain `char' to `int' might not produce what toupper()
>>needs.
>
> Does the standard guarantee that casting signed char with a negative,
> non-EOF value, to unsigned char will produce the expected character?

The Standard does not govern your expectations. ;-)

> It
> seems to me that unless this guarantee is provided, the cast would give
> you defined behavior but garbage results. That's only marginally better
> than undefined behavior. As such, wouldn't it be better to simply avoid
> the operation if the value is out of range?
>
> if (c == EOF || (c >= 0 && c <= UCHAR_MAX))
>     c = toupper(c);
> else {
>     /* Special treatment for c */
> }

What would the "special treatment" be? Without toupper()
to aid you, how would you know to transform -25 to -57 so as to
turn ç into Ç? And how would you know that -65 should remain
unchanged because ¿ has no upper-case equivalent?
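For illustration, here is a minimal sketch of the cast under discussion.
The helper names ctype_arg() and upper_char() are hypothetical, not part
of any library, and the -25 / 231 figures assume an 8-bit char holding
ISO 8859-1 text. Converting a negative value to unsigned char is defined
as reduction modulo UCHAR_MAX + 1, so the argument always lands in the
range toupper() is defined on.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper: the value handed to <ctype.h> for a plain char. */
static int ctype_arg(char c)
{
    /* Conversion to unsigned char reduces modulo UCHAR_MAX + 1, so a
       negative char such as -25 becomes 231 when UCHAR_MAX is 255. */
    return (unsigned char)c;
}

/* Hypothetical wrapper: upper-case a plain char in the current locale. */
static char upper_char(char c)
{
    int arg = ctype_arg(c);

    /* The range toupper() is defined on (EOF aside). */
    assert(arg >= 0 && arg <= UCHAR_MAX);

    /* Converting the result back to a signed plain char is
       implementation-defined when it exceeds CHAR_MAX; common
       implementations wrap, giving 199 -> -57. */
    return (char)toupper(arg);
}
```

In the default "C" locale toupper() changes only a..z, so upper_char()
returns ç and ¿ unchanged. Getting Ç from ç requires selecting a suitable
ISO 8859-1 locale with setlocale() first, and the locale's name is
platform-dependent.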

IMHO (and with benefit of hindsight, a luxury not afforded
to pioneers) it was a mistake to define the <ctype.h> functions
on "all values getc() can return," and things would have been
better if they'd been defined on "all values a `char' can have."
But done's done, the moving finger writes, and there's no use
crying over spilt milk. I'm sure that if someone today were to
redesign the C library from scratch and without the need to
accommodate existing code, the outcome would be nicer in many
ways than what we have now. (Maybe I shouldn't be so sure; look
at the insalubrities of C that have been perpetuated in Java, a
"from scratch" effort whose designers might have been expected
to have known better!) However, it's surpassingly unlikely that
the library will change except by addition and extension; the
fundamental decisions about things like <ctype.h> will stay.

Do you have (or were you born with) a vermiform appendix
whose only discernible purpose is to put you at risk of
peritonitis? It's your heritage -- and C has heritage, too.

--
Eric.Sosman@xxxxxxx
