Re: Is C99 the final C? (some suggestions)
From: Sidney Cadot (sidney_at_jigsaw.nl)
Date: 12/06/03
- Next message: Sidney Cadot: "Re: Is C99 the final C? (some suggestions)"
- Previous message: Uncle: "Re: Can I send char as array argument?"
- In reply to: Paul Hsieh: "Re: Is C99 the final C? (some suggestions)"
- Next in thread: Paul Hsieh: "Re: Is C99 the final C? (some suggestions)"
- Reply: Paul Hsieh: "Re: Is C99 the final C? (some suggestions)"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sat, 06 Dec 2003 02:02:53 +0100
Paul Hsieh wrote:
> Sidney Cadot <sidney@jigsaw.nl> wrote:
>>>Sure. Vendors are waiting to see what the C++ people do, because they
>>>are well aware of the unreconcilable conflicts that have arisen. Bjarne
>>>and crew are going to be forced to take the new stuff C99 in the bits and
>>>pieces that don't cause any conflict or aren't otherwise stupid for other
>>>reasons. The Vendors are going to look at this and decide that the
>>>subset of C99 that the C++ people chose will be the least problematic
>>>solution and just go with that.
>>
>>Ok. I'll give you 10:1 odds; there will be a (near-perfect) C99 compiler
>>by the end of this decade.
> A single vendor?!?! Ooooh ... try not to set your standards too high.
One has to be conservative when engaging in bets.
> Obviously, it's well known that the gnu C++ people are basically converging
> towards C99 compliance and are most of the way there already. That's not my
> point. My point is that will Sun, Microsoft, Intel, MetroWerks, etc join the
> fray so that C99 is ubiquitous to the point of obsoleting all previous C's for
> all practical purposes for the majority of developers?
I think they will. Could take a couple of years though.
> Maybe the Comeau guy
> will join the fray to serve the needs of the "perfect spec compliance" market
> that he seems to be interested in.
>
> If not, then projects that have a claim of real portability will never
> embrace C99 (like Lua, or Python, or the JPEG reference implementation, for
> example.) Even the average developers will forgo the C99 features for fear
> that someone will try to compile their stuff on an old compiler.
Sure, there'll be market inertia, but this also happened with the
transition of K&R -> ANSI fifteen years ago.
> Look, nobody uses K&R-style function declarations anymore. The reason is
> because the ANSI standard obsoleted them, and everyone picked up the ANSI
> standard. That only happened because *EVERYONE* moved forward and picked up
> the ANSI standard. One vendor is irrelevant.
Ok. Can't speak for MW, but I think that by the end of 2007 we'll have
near-perfect C99 compilers from GNU, Sun, Microsoft, and Intel. Odds now
down to 5:1; are you in?
>>>No, that's not what I am proposing. I am saying that you should not use
>>>structs at all, but you can use the contents of them as a list of comma
>>>separated entries. With a more beefed up preprocessor one could find the
>>>offset of a packed char array that corresponds to the nth element of the list
>>>as a sum of sizeof()'s and you'd be off to the races.
>>
>>Perhaps I'm missing something here, but wouldn't it be easier to use the
>>offsetof() macro?
> It would be, but only if you have the packed structure mechanism. Other
> people have posted indicating that in fact _Packed is more common than I
> thought, so perhaps my suggestion is not necessary.
Ok. I agree with you on extending the capabilities of the preprocessor
in general, although I can't come up with actual things I miss in
everyday programming.
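
To make the offsetof() versus sum-of-sizeof() distinction concrete, here is a minimal sketch. The struct record and the PACKED_* names are made up for illustration; nothing like them appears in the proposal itself.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct record {
    char  tag;
    int   count;
    short flags;
};

/* Offset of `count` as the compiler actually lays the struct out.
   This may include padding after `tag`. */
size_t record_count_offset(void)
{
    return offsetof(struct record, count);
}

/* Offsets inside a hand-packed char array holding the same fields:
   each offset is the sum of the sizes of the preceding fields. */
enum {
    PACKED_TAG_OFF   = 0,
    PACKED_COUNT_OFF = PACKED_TAG_OFF + sizeof(char),
    PACKED_FLAGS_OFF = PACKED_COUNT_OFF + sizeof(int),
    PACKED_SIZE      = PACKED_FLAGS_OFF + sizeof(short)
};
```

On most platforms record_count_offset() will be larger than PACKED_COUNT_OFF because of alignment padding, which is why offsetof() only replaces the sum of sizeof()'s when the struct itself is packed.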
>>>C is a language suitable for
>>>and highly encouraging of writing extremely unsound and poor code. Fixing it
>>>would require a major overhaul of the language and library.
>>
>>That's true. I don't quite see how this relates to the preceding
>>statement though.
> I'm saying that trying to fix C's intrinsic problems shouldn't start or end
> with some kind of resolution of call stack issues. Anyone who understands
> machine architecture will not be surprised about call stack depth limitations.
It's the task of a standard to spell these out, I think.
> There are far more pressing problems in the language that one would like to
> fix.
Yes, but most things that relate to the "encouraging of writing
extremely unsound and poor code", as you describe C, would be better
fixed by using another language. A lot of the inherently unsafe things
in C are sometimes needed, when doing low-level stuff.
>> [powerful heap manager]
>>>But a third party library can't do this portably.
>>
>>I don't see why not?
>
> Explain to me how you implement malloc() in a *multithreaded* environment
> portably. You could claim that C doesn't support multithreading, but I highly
> doubt you're going to convince any vendor that they should shut off their
> multithreading support based on this argument.
Now you're shifting the goalposts.
> By dictating its existence in
> the library, it would put the responsibility of making it work right in the
> hands of the vendor without affecting the C standard's stance on not
> acknowledging the need for multithreading.
Obviously, you cannot write a multithreading heap manager portably if
there is no portable standard on multithreading (doh). If you need this
you can always presume POSIX, and be on your way.
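
As a rough sketch of what "presume POSIX" could look like, here is a toy fixed-size arena allocator whose bookkeeping is protected by a pthread mutex. The names arena_alloc and ARENA_SIZE are invented for this example, and a real heap manager would of course also need free() and much more; only the locking pattern is the point.

```c
#include <pthread.h>
#include <stddef.h>

#define ARENA_SIZE 4096u   /* arbitrary size chosen for the example */

/* The union forces a conservative alignment on the byte array. */
static union {
    unsigned char bytes[ARENA_SIZE];
    long double   align;
} arena;

static size_t arena_used;
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hands out n bytes from the arena, or NULL when it is exhausted.
   The mutex makes the update of arena_used safe across threads. */
void *arena_alloc(size_t n)
{
    void  *p = NULL;
    size_t a = sizeof(long double);

    if (n > ARENA_SIZE)
        return NULL;               /* also keeps the rounding below from wrapping */
    n = (n + a - 1) / a * a;       /* round up to keep later blocks aligned */

    pthread_mutex_lock(&arena_lock);
    if (n <= ARENA_SIZE - arena_used) {
        p = arena.bytes + arena_used;
        arena_used += n;
    }
    pthread_mutex_unlock(&arena_lock);
    return p;
}
```

Everything except the two pthread_mutex_* calls is plain standard C; the threading part is exactly the bit the C standard says nothing about.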
>>>It's actual useful functionality that you just can't get from the C
>>>language, and there's no way to reliably map such functionality to the C
>>>language itself. One is forced to know the details of the underlying
>>>platform to implement such things. It's something that really *should* be
>>>in the language.
I disagree. POSIX is for things like this.
>>Well, it looks to me you're proposing to have a feature-rich heap
>>manager. I honestly don't see why this couldn't be implemented portably
>>without platform-specific knowledge. Could you elaborate?
>
> See my multithreading comment above. Also, efficient heaps are usually
> written with a flat view of memory in mind. This is kind of impossible in
> non-flat memory architectures (like segmented architectures.)
What does the latter remark have to do with C's suitability for doing it?
>>[...] I want this more for reasons of orthogonality in design than anything
>>else.
> You want orthogonality in the C language? You must be joking ...
Not at all.
>>>My proposal allows the programmer to decide what is or is not useful to them.
>>I'm all for that.
>
> Well, I'm a programmer, and I don't care about binary output -- how does your
> proposal help me decide what I think is useful to me?
It already did, it seems - You just stated your decision, with regard to
binary output. Fine by me.
>>[...] . Without being belligerent: why not use that if you want this
>>kind of thing?
>
> Well, when I am programming in C++ I will use it. But I'm not going to move
> all the way to using C++ just for this single purpose by itself.
>
>
>>I used "%x" as an example of a format specifier that isn't defined ('x'
>>being a placeholder for any letter that hasn't been taken by the
>>standard). The statement is that there'd be only about 15 letters left
>>for this kind of thing (including 'x' by the way -- it's not a hex
>>specifier). Sorry for the confusion, I should've been clearer.
>
> Well what's wrong with %@, %*, %_, %^, etc?
%* will clash with already legal format specifiers like %*d. All the
others are just plain ugly :-)
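
For anyone who hasn't run into it: the '*' in "%*d" takes the field width from the argument list. A minimal sketch (format_padded is just a name picked for this example):

```c
#include <stdio.h>

/* Writes `value` right-aligned in a field of `width` characters.
   The '*' consumes one int argument (the width) before the value itself. */
int format_padded(char *buf, size_t size, int width, int value)
{
    return snprintf(buf, size, "%*d", width, value);
}
```

Calling format_padded(buf, sizeof buf, 5, 42) leaves three spaces followed by "42" in buf, so '*' already has a meaning directly after '%'.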
>>A first-class citizen string wouldn't be a pointer; neither would you
>>necessarily be able to get its address (although you should be able to
>>get the address of the characters it contains).
>
> But a string has variable length. If you allow strings to be mutable, then
> the actual sequence of characters has to be put into some kind of dynamic
> storage somewhere. Either way, the base part of the string would in some way
> have to be storable into, say, a struct. But you can copy a struct via
> memcpy or however. But this then requires a count increment since there is
> now an additional copy of the string. So how is memcpy supposed to know that
> its contents contain a string that it needs to increase the ref count for?
> Similarly, memset needs to know how to *decrease* such a ref count.
It's not very practical, is it..... Hmmmm. Instead of thinking of
perverse ways of circumventing these rather fundamental problems, I'll
just concede the point. Good show :-)
>>>I'm saying that you could have &&&, |||, but just don't define what they
>>>actually do. Require that the programmer define what they do. C doesn't have
>>>type-specific functions, and if one were to add in operator overloading in a
>>>consistent way, then that would mean that an operator overload would have to
>>>accept only its defined type.
>>
>>Ok, so the language should have a big bunch of operators, ready for the
>>taking. Incidentally, Mathematica supports this, if you want it badly.
>
> Hey, it's not me -- apparently it's people like you who want more operators.
Just a dozen or so! But _standardized_ ones.
Seriously, though, your "operation introduction" idea is something
different than "operator overloading" altogether. We should try not to
mix up these two.
> My point is that no matter what operators get added to the C language, you'll
> never satisfy everyone's appetites. People will just want more and more,
> though almost nobody will want all of what could be added.
>
> My solution solves the problem once and for all. You have all the operators
> you want, with whatever semantics you want.
That's too much freedom for my taste. If I wanted this kind of
thing, I would yank out lex and yacc and code my own language.
>>>For this to be useful without losing the
>>>operators that already exist in C, the right answer is to *ADD* operators. In
>>>fact I would suggest that one simply defined a grammar for such operators, and
>>>allow *ALL* such operators to be definable.
>>
>>This seems to me a bad idea for a multitude of reasons. First, it would
>>complicate most stages of the compiler considerably. Second, a
>>maintenance nightmare ensues: while the standard operators of C are
>>basically burnt into my soul, I'd have to get used to the Fantasy
>>Operator Of The Month every time I take on a new project, originally
>>programmed by someone else.
> Yes, but if instead of actual operator overloading you only allow redefinition
> of these new operators, there will not be any of the *surprise* factor.
I don't know if you've ever experienced the displeasure of having to
maintain code that's not written by yourself, but it's difficult enough
as it is. Adding new operators might be interesting from a theoretical
perspective, but it surely is a bad idea from a broader software
engineering perspective.
> If you see one of these new operators, you can just view it like you view an
> unfamiliar function -- you'll look up its definition obviously.
There is an important difference: functions have a "name" that has a
mnemonic function. Operators are just a bunch of pixels with no link to
anything else. It's only by a lot of repetition that you get used to
weird things like '<' and '>'. I don't know about you, but I used to do
Pascal before I switched to C. It took me quite some time before I got
used to "!=".
>>There's a good reason that we use things like '+' and '*' pervasively,
>>in many situations; they are short, and easily absorbed in many
>>contexts. Self-defined operator tokens (consisting, of course, of
>>'atomic' operators like '+', '=', '<' ...) will lead to unreadable code,
>>I think; perhaps something akin to a complicated 'sed' script.
> And allowing people to define their own functions with whatever names they
> like doesn't lead to unreadable code? It's just the same thing.
Nope. See above.
> What makes your code readable is adherence to an agreed upon coding
> standard that exists outside of what the language defines.
There are several such standards for identifier names. No such standard
exists for operator names, except: use familiar ones; preferably, steal
them from other languages. The common denominator of all the identifier
standards is: "use meaningful names". I maintain that there is no
parallel for operators; there's no such thing as a "meaningful"
operator, except when you have been drilled to know their meaning. Your
proposal is in direct collision with this rather important fact of how
the human mind seems to work.
>>>>3) because operator overloading is mostly a bad idea, IMHO
>>>Well, Bjarne Stroustrup has made a recent impassioned request to *REMOVE*
>>>features from C++.
>>Do you have a reference? That's bound to be a fun read, and he probably
>>missed a few candidates.
>
> It was just in the notes to some meeting Bjarne had in the last year or so to
> discuss the next C++ standard. His quote was something like that: while
> adding a feature for C++ can have value, removing one would have even more
> value. Maybe someone who is following the C++ standardization threads can
> find a reference -- I just spent a few minutes on google and couldn't find it.
Ok. I appreciate the effort.
>>>I highly doubt that operator overloading is one that has
>>>been made or would be taken seriously. I.e., I don't think a credible
>>>population of people who have been exposed to it would consider it a bad idea.
>>
>>I can only speak for myself; I have been exposed, and think it's a bad
>>idea. When used very sparingly, it has its uses. However, introducing
>>new user-definable operators as you propose would be folly; the only way
>>operator overloading works in practice is if you maintain some sort of
>>link to the intuitive meaning of an operator. User defined operators
>>lack this by definition.
> But so do user definable function names. Yet, functionally they are almost
> the same.
"names" refer to (often tangible) objects, whereas "operators" refer to
abstract ideas. I'm no psychologist, but I would guess they could back
up my claim that it's easier for us to handle names than symbols. For
one thing, I have yet to see the first 2-year old that utters "greater
than" as first words.
>><<< and @ are nice though. I would be almost in favour of adding them,
>>were it not for the fact that this would drive C dangerously close in
>>the direction of APL.
>
> You missed the "etc., etc., etc." part.
In a sense, I truly missed it. Your suggestions were rather interesting! ;-)
> I could keep coming up with them
> until the cows come home: a! for factorial, a ^< b for "a choose b" (you want
> language support for this because of overflow concerns of using the direct
> definition), <-> a for endian swapping, $% a for the fractional part of a
> floating point number, a +>> b for the average (there is another overflow
> issue), etc., etc.
Golly! You truly are good at this :-)
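
For what it's worth, the two overflow concerns mentioned (the average and "a choose b") can be handled with ordinary functions today. A sketch under my own naming (avg_u32 and choose_u64 are illustrative, not anything proposed in this thread):

```c
#include <stdint.h>

/* Floor of (a + b) / 2 without ever forming a + b, which could wrap.
   Bits common to both operands count fully, differing bits count half. */
uint32_t avg_u32(uint32_t a, uint32_t b)
{
    return (a & b) + ((a ^ b) >> 1);
}

/* "n choose k" built up incrementally instead of via n! / (k! (n-k)!).
   After step i, r holds C(n-k+i, i), so every division is exact.
   The product r * (n - k + i) can still wrap for very large results. */
uint64_t choose_u64(unsigned n, unsigned k)
{
    uint64_t r = 1;
    unsigned i;

    if (k > n)
        return 0;
    if (k > n - k)
        k = n - k;                 /* use the smaller of k and n-k */
    for (i = 1; i <= k; i++)
        r = r * (n - k + i) / i;
    return r;
}
```

The direct factorial definition would overflow 64 bits for n as small as 21, while the incremental form handles things like choose_u64(52, 5) comfortably.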
>>Again I wonder, seriously: wouldn't you be better off using C++ ?
> No because I want *MORE* operators -- not just the ability to redefine the
> ones I've got (and therefore lose some.)
Ok. Your opinion on this is quite clear. I disagree for technical
(implementability) and psychological (names versus symbols) reasons. We
could just leave it at that.
>> [snipped a bit...]
>>I find the idea freaky, yet interesting. I think C is not the place for
>>this (really, it would be too easy to compete in the IOCCC) but perhaps
>>in another language... Just to follow your argument for a bit, what
>>would an "operator definition" declaration look like for, say, the "?<"
>>min operator in your hypothetical extended C?
>
> This is what I've posted elsewhere:
>
> int _Operator ?< after + (int a, int b) {
>     if (a < b) return a;
>     return b;
> }
I already saw that and reacted. Will come to that in another post.
>>>Yes I'm sure the same trick works for chars and shorts. So how do you
>>>widen a long long multiply?!?!? What compiler trick are you going to
>>>hope for to capture this? What you show here is just some trivial
>>>*SMALL* multiply, that relies on the whims of the optimizer.
>>
>>Well, I'd show you, but it's impossible _in principle_. Given that you
>>are multiplying two expressions of the widest type supported by your
>>compiler, where would it store the result?
>
> In two values of the widest type -- just like how just about every
> microprocessor which has a multiply does it:
>
> high *% low = a * b;
Hmmm. I hate to do this again, but could you provide semantics? Just to
keep things manageable, I'd be happy to see what happens if high, low,
a, and b are any possible combinations of bit-widths and signedness.
Could you clearly define the meaning of this?
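
For the one case where the semantics are uncontroversial (all four operands unsigned and 64 bits wide), this is roughly what one has to write in portable C today. It is only a sketch; widemul_u64 is a name I picked, and it says nothing about the mixed-width or signed cases asked about above.

```c
#include <stdint.h>

/* Computes the full 128-bit product of two unsigned 64-bit values,
   returned as two 64-bit halves, using four 32x32->64 partial products. */
void widemul_u64(uint64_t a, uint64_t b, uint64_t *high, uint64_t *low)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;     /* contributes to bits   0..63  */
    uint64_t p1 = a_lo * b_hi;     /* contributes to bits  32..95  */
    uint64_t p2 = a_hi * b_lo;     /* contributes to bits  32..95  */
    uint64_t p3 = a_hi * b_hi;     /* contributes to bits  64..127 */

    /* Sum of the three pieces that land in bits 32..63.
       At most 3 * (2^32 - 1), so it cannot wrap a 64-bit value. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    *low  = (mid << 32) | (p0 & 0xFFFFFFFFu);
    *high = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

Whether a compiler turns this into a single widening multiply instruction is entirely up to its optimizer.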
>>>PowerPC, Alpha, Itanium, UltraSPARC and AMD64 all have widening multiplies that
>>>take two 64 bit operands and returns a 128 bit result in a pair of 64 bit
>>>operands. They all invest a *LOT* of transistors to do this *ONE* operation.
>>>They all *KNOW* you can't finagle any C/C++ compiler to produce the operation,
>>>yet they still do it -- it's *THAT* important (hint: SSL, and therefore *ALL* of
>>>e-commerce, uses it.)
>>Well, I don't know if these dozen-or-so big-number 'powermod' operations
>>that are needed to establish an SSL connection are as big a deal as
>>you make them out to be.
> It's not me -- it's Intel, IBM, Motorola, Sun and AMD who seem to be obsessed
> with these instructions.
I don't see them working the committees to get these supported in
non-assembly languages. I guess they're pretty satisfied with the bignum
libs that exist, that provide assembly implementations for all important
platforms (and even a slow fallback for others). The reality is that
no-one seems to care except you, on this.
> Of course Amazon, Yahoo and Ebay and most banks are
> kind of obsessed with them too, even if they don't know it.
I think you would find that bignum operations are a small part of the
load on e-commerce servers. All RSA-based protocols just do a small
amount of bignum work at session-establishment time to agree to a key
for a shared-secret algorithm.
>>>>Many languages exists where this is possible, they are called
>>>>"assembly". There is no way that you could come up with a well-defined
>>>>semantics for this.
>>
>>>carry +< var = a + b;
>>
>>It looks cute, I'll give you that. Could you please provide semantics?
>>It may be a lot less self evident than you think.
> How about:
>
> - carry is set to either 1 or 0, depending on whether or not a + b overflows
> (just follow the 2s complement rules if one of a or b is negative.)
Hang on, are we talking about "overflow" or "carry" here? These are two
different things with signed numbers.
What happens if a is signed and b is unsigned?
> - var is set to the result of the addition; the remainder if a carry occurs.
What happens if the signedness of var, a, and b are not equal?
What happens if the bit-widths of var, a, and b are not equal?
> - The whole expression (if you put the whole thing in parentheses) returns
> the result of carry.
.... So this would presume the actual expression is: "+< var = a + b" .
There's no need to introduce a mandatory "carry" variable, then.
In fact, if I were only interested in the carry, I'd be out of luck:
I'd still need the 'var'. That's a bit ugly.
Basically, this is a C-esque syntax for a tuple assignment which
unfortunately is lacking in C:
(carry, value) = a+b
> +< would not be an operator in and of itself -- the whole syntax is required.
> For example: c +< v = a * b would just be a syntax error. The "cuteness" was
> stolen from an idea I saw in some ML syntax. Obviously +< - would also be
> useful.
I would think you don't need the "c" as well, to make a valid
expression. But I would still need to know what happens with all the
bit-widths and signedness issues.
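
To illustrate why "carry" and "overflow" need separate definitions, here is a sketch of how each is detected in today's C. The function names are mine, and the signed check is done before the addition because signed overflow itself is undefined behaviour.

```c
#include <limits.h>

/* Unsigned addition: returns the carry out (1 or 0) and stores the
   wrapped sum. Unsigned wraparound is well defined in C. */
unsigned add_carry(unsigned a, unsigned b, unsigned *sum)
{
    *sum = a + b;
    return *sum < a;   /* the sum is smaller than an operand only if it wrapped */
}

/* Signed addition: returns 1 if a + b would not fit in an int.
   Nothing is actually added, so no undefined behaviour is triggered. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

On a two's complement machine, -1 + 1 produces a carry when the operands are viewed as unsigned bit patterns, yet it does not overflow as a signed addition, so the two flags really do answer different questions.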
>>>>Did you know that a PowerPC processor doesn't have a shift-right where
>>>>you can capture the carry bit in one instruction? Silly but no less true.
>>
>>>What has this got to do with anything? Capturing carries coming out of
>>>shifts don't show up in any significant algorithms that I am aware of
>>>that are significantly faster than using what we have already.
>>
>>Ah, I see you've never implemented a non-table-driven CRC or a binary
>>greatest common divisor algorithm.
>
> You can find a binary gcd algorithm that I wrote here:
>
> http://www.pobox.com/~qed/32bprim.c
That's not the "binary GCD algorithm", that's just Knuth's version that
avoids modulo operations. Below is a binary GCD.
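unsigned bgcd(unsigned a, unsigned b)
{
    unsigned c, e;

    /* gcd(x, 0) == x; without this, a zero operand would loop forever below */
    if (a == 0 || b == 0)
        return a | b;

    /* e = number of factors of two shared by a and b */
    c = a | b;
    for (e = 0; c % 2 == 0; e++) c /= 2;
    a >>= e;
    b >>= e;

    /* strip the remaining factors of two so that both are odd */
    while (a % 2 == 0) a /= 2;
    while (b % 2 == 0) b /= 2;

    while (a != b)
    {
        if (a < b)
        {
            b -= a;                         /* odd - odd is even */
            do b /= 2; while (b % 2 == 0);
        }
        else
        {
            a -= b;
            do a /= 2; while (a % 2 == 0);
        }
    }
    return a << e;                          /* restore the shared factors of two */
}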
> You will notice how I don't use or care about carries coming out of a right
> shift. There wouldn't be enough of a savings to matter.
Check bgcd().
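
The other case mentioned, a non-table-driven CRC, shows the same pattern: shift right by one and act on the bit that fell out. A sketch, assuming the common reflected CRC-32 polynomial 0xEDB88320 (nobody in this thread named a specific CRC, and crc32_bitwise is just an illustrative name):

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 (reflected form, initial value and final XOR of
   0xFFFFFFFF). No lookup table is used. */
uint32_t crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= data[i];
        for (bit = 0; bit < 8; bit++) {
            uint32_t carry = crc & 1u;   /* the bit about to be shifted out */
            crc >>= 1;
            if (carry)
                crc ^= 0xEDB88320u;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

In C the shifted-out bit has to be saved with an explicit mask before the shift; a shift instruction that leaves it in the carry flag would fold those two steps into one.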
>>>The specific operations I am citing make a *HUGE* difference and have billion
>>>dollar price tags associated with them.
>>
>>These numbers you made up from thin air, no? Otherwise, I'd welcome a
>>reference.
>
> Widening multiplies cost transistors on the CPU. The hardware algorithms are
> variations of your basic public school multiply algorithm -- so it takes n^2
> transistors to perform the complete operation, where n is the largest bit
> word that the machine accepts for the multiplier. If the multiply were not
> widened they could save half of those transistors. So multiply those extra
> transistors by the number of CPUs shipped with a widening multiply (PPC,
> x86s, Alphas, UltraSparcs, ... etc) and you easily end up in the billion
> dollar range.
This is probably the most elaborate version of "yes, I made these
numbers up from thin air" I've ever come across :-)
>>>I understand the need for the C language standard to be applicable to as
>>>many platforms as possible. But unlike some right shift detail that you
>>>are talking about, the widening multiply hardware actually *IS* deployed
>>>everywhere.
Yup. And it is used too. From machine language.
>>Sure is. Several good big-number libraries are available that have
>>processor-dependent machine code to do just this.
>
> And that's the problem. They have to be hand written in assembly. Consider
> just the SWOX Gnu multiprecision library. When the Itanium was introduced,
> Intel promised that it would be great for e-commerce.
Correction: the Intel marketing department promised that it would be
great for e-commerce.
> The problem is that the SWOX guys were having a hard time with IA64 assembly
> language (as apparently lots of people are.)
Yes, it's close to a VLIW architecture. Hard to code manually.
> So they projected performance results for
> the Itanium without having code available to do what they claim. So people
> who wanted to consider using an Itanium system based on its performance for
> e-commerce were stuck -- they had no code, and had to believe Intel's claims,
> or SWOX's as to what the performance would be.
The only thing your example shows is that a marketing angle sometimes
doesn't rhyme well with technical realities.
> OTOH, if instead, the C language had exposed a carry propagating add, and a
> widening multiply in the language, then it would just be up to the Intel
> *compiler* people to figure out how to make sure the widening multiply was
> used optimally, and the SWOX/GMP people would just do a recompile for baseline
> results at least.
I would guess that Intel, being both a compiler maker and the IA64
manufacturer, could have introduced a macro widemul(hr,lr,a,b) to do
this, and help the SWOX guys out a bit?
My guess is that they have problems with raw performance and/or compiler
technique. I have some experience with a VLIW compiler, and these things
need a compiler pass to do instruction to execution-pipeline allocation.
This is a very active area of research, and notoriously difficult. My
guess is that there are inherent problems of getting high performance
out of IA64 for this kind of algorithm. VLIW and VLIW-like
architectures can do wonders on high-throughput, low-branching type of
work, but they tend to break down on some very simple algorithms, if
there is a lot of branching.
I don't know SWOX; what do they use for bignum multiplication?
Karatsuba's algorithm?
Best regards, Sidney