Re: Is C99 the final C? (some suggestions)
From: Paul Hsieh (qed_at_pobox.com)
Date: 12/12/03
- Next message: Les Cargill: "Re: SOT: Turbo C 2 Question"
- Previous message: Barry Schwarz: "Re: Global initialization of function pointer array"
- In reply to: Sidney Cadot: "Re: Is C99 the final C? (some suggestions)"
- Next in thread: pete: "Re: Is C99 the final C? (some suggestions)"
- Reply: pete: "Re: Is C99 the final C? (some suggestions)"
- Reply: Richard Heathfield: "Re: Is C99 the final C? (some suggestions)"
- Reply: CBFalconer: "Using the standard (was: Is C99 the final C? (some suggestions))"
- Reply: Mark F. Haigh: "Re: Is C99 the final C? (some suggestions)"
- Reply: Johan Aurer: "Re: Is C99 the final C? (some suggestions)"
- Reply: Sidney Cadot: "Re: Is C99 the final C? (some suggestions)"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Fri, 12 Dec 2003 03:01:49 GMT
Sidney Cadot wrote:
> Paul Hsieh wrote:
> >Sidney Cadot wrote:
> >>Paul Hsieh wrote:
> >> [snip]
> >>I think there are relatively few excuses for any professional C
> >>programmer not to have access to the standard (and reading it once
> >>or twice wouldn't hurt either). It's only 18 bucks in electronic
> >>form, and you don't even have to go out to get it.
> >
> > Care to take a quick survey of comp.lang.c patrons who own their
> > own copy of the standard?
>
> Sure, I think you'd be surprised.
Well the few people in here who have the standard have chimed in. Now
how do you supposed we convince everyone else in the group to chime
about the fact that they don't have it? (Look up the term "push poll"
if you don't understand what I mean.)
> What is your excuse for not having the standard? A free draft (N869)
> is also available.
The standard does not tell me how to do the job of coding. Although
it may cause the occasional portability problem, having the raw source
of the C library is surprisingly more useful. For example, please
locate for me where in the standard it says that fgets() ignores '\0'
that are consumed -- then locate where in the standard it says its the
only string function which does this. Then tell me where in the
standard it says that strtok is not re-enterable.
> >>>I had to look this one up -- that's one of the classic useless
> >>>non-solutions (especially if you want to call malloc more than 4
> >>>billion times, for example), which explains why I've never
> >>>committed it to memory.
> >>We have long longs nowadays. 2^32 is not an important limit.
> >
> > And long long's are portable?!?! Are you even following your own
> > train of thought?
>
> You really seem a bit out of touch with current events... Since you
> don't seem to have access to the C99 standard, here's the relevant
> quote:
Excuse me?!?! *C99* is not portable.
> >>>You still need to be able to identify process instances, which *BY
> >>>ITSELF* is not portable. Process instances identifiers would
> >>>count as something *BEYOND* just malloc/free/etc.
> >>
> >>Why do you need to distinguish process instances?
> >
> > Reread the Lamport's bakery algorithm. One of the crucial steps is
> > "for each process, check their number, and pick the max, then add
> > 1". Another is "for each process that has a lower number, wait in
> > line behind them." This means you have a way of knowing how many
> > processes there are, and how to index them. This means you have
> > to hook out or wrap whatever "spawn" mechanism you have in your
> > language (or use some other platform specific mechanism to iterate
> > through all running threads.)
>
> Could you provide a reference to the description of the algorithm
> you're using?
Typed it into google and felt lucky. Its the first hit. Notice
language like "Process i", and statements like for (j=0; j < n; j++),
where each j indexes a process id, and n is the number of processes.
I tried to think up a way of implementing that algorithm without
specifically requiring an enumeration of processes, but it seems that
that is central to the way the algorithm works. Which is only *one*
of the reasons why that algorithm is worthless.
> [...] I'm not saying it's incorrect or anything, it's just that I'd
> like to respond in the same terminology.
>
> >>[...] In your API, each call carries the heap it refers to as a
> >>parameter.
> >
> > Yes, but if two threads simultaneously call heap functions on the same
> > heap (heaps are not restricted to being distinct across threads) then
> > there is a race condition.
>
> So we need to put a mutex around the critical section, no big deal.
It *is* a big deal. Mutexes, and other similar multithreading objects
are not expressible is a portable way in language like C or C++.
> [...] All hinges now on whether the Lamport algorithm needs process
> id's. Fortunately, there's no subjective element in that, we should
> be able to sort that out.
What I don't understand is how is it that *YOU* brought up the
algorithm, and yet you are unaware of its fundamental functionality.
It only took me only 30 seconds to understand it.
> >>>>I would say that writing a more powerful heap manager is possible in
> >>>>principle, even in a multithreaded environment, as long as you have a
> >>>>well-implemented malloc() (and friends) at your disposal. [...]
> >>>>It would be a drag to do so and it would be slow, but it is possible.
> >>>
> >>>Impossible. Slow, fast or otherwise. Its just impossible.
> >>
> >>If you can explain to me why you need to explicitly distinguish threads
> >>in the memory manager, I'll concede the point.
>
> > Go look up the source code for *ANY* multithreading supporting
> > malloc/free. I absolutely guarantee that every single one of them
> > uses some kind of platform specific exclusion object to make sure that
> > two simultaneous calls to the heap do not collide.
>
> Sure. Mutexes would suffice and I contend the bakery algorithm can
> provide them.
Well, but that's not my contention, and I would appreciate that you not put
words in my mouth, or follow the discussion (whichever one is causing
you do write things like this) -- my contention is that Lamport's
Bakery algorithm is not portable (I mean relative to C, which doesn't
expose and multithreading functionality, even if implementations do.)
Then again, once you open it up to non-portable code, of course,
nobody will touch Lamport's Bakery algorithm with a 10 ft pole -- its
worthless when you have far superior mechanisms for creating Mutexes
at your disposal.
> [...] I feel rather silly getting dragged in this line of
> discussion (multithreading has little to do with the C execution
> model) but we'll see where this leads us.
Then let me review the facts for you. Multithreading is inextricably
intertwined with the implementation of malloc. Thus if a compiler
vendor wants to provide a multithreading extension to C (which they
pretty much all do) discussions of implementation of malloc, or
malloc-like functions have to at least acknowledge this as a
consideration. There is no such thing as a portable mutex in pure C
that actually encodes all the expected notions of a mutex unless tied
specifically to the extension mechanisms that are used to provide the
multitasking in the first place. The inevitable conclusion is that
its impossible to make a portable library for extending the heap
functions that are multithreading safe.
> > One way or another the heap is some kind of quasi-linked list of
> > memory entries. You should have learned from your OS classes
> > (assuming you passed it and/or comprehended the material)
>
> Could you refrain from the silly ad hominems? I am trying to take you
> seriously, even though you seem to deny the importance of standards,
> and even though you are advocating an (in my opinion) confused
> "operator introduction" feature. Maintaining a basic level of
> respect helps in any discussion.
Well, if you are sensitive about being admonished in public, then can
I ask that you have a little bit of intellectual honesty? I am trying
to put forward hard core analytical ideas, and I have made efforts to
distill your own proposals is a very serious way. The response is not
supposed to be "oh but you can't prove its impossible to implement".
This is like Ari Fleischer telling the anti-war sympathizers that its
up to them to prove that there are no WMDs in Iraq. If you have a
problem with what I am saying why can't you limit it to an analytical
response?
See the problem with your suggestion of using the Lamport's bakery
algorithm, is not just that it obviously doesn't satisfy the
conditions of the original problem (portably extending malloc/etc),
but that the next thing you are going to do is throw some other
algorithm at it that also won't work in some vain attempt to throw me
off yet again. You did it full well expecting that I wouldn't chase
it down and find the flaw. Are you going to do it again, or are you
going at least take my arguments more seriously?
You see its not *UP* to me to analyze bogus algorithms to prove that
you can't implement a mutex portably. That's retarded -- even if I do,
its not actual proof of anything. If you have a serious contention
against my very technical claim, then don't you think that its at
least somewhat incumbant upon your to wade through the details
yourself?
> >>>What if an implementation uses multiple stacks depending on the
> >>>*type* of the parameter?
> >>
> >>The hypothetical implementation [...]
> >
> > That the ANSI C standard currently does not have any problem with ...
> >
> >>[...] that does this will be out of luck. Not because the proposal
> >>wouldn't work, mind you, but because the ULMU is a very loose bound.
> >>But hey, it beats having no bound whatsoever.
> >
> > No it doesn't. Dealing with the stack is an area that clearly
> > should be platform specific. For example, it could be that an
> > embedded system has fast and slow ram. Certain data types might
> > be best suited to being mapped into fast RAM, whether they are
> > local or not, but you might have a lot more slow RAM, so you'd put
> > all the rest of the stuff in there.
>
> To what does "no it doesn't" refer?
The last statement "... it beats having no bound ..."
> How does your example invalidate the "upper bound" idea?
Look, if you have two stacks, then the right way to bound it is to
have two bounds. If one of your stacks has very limited amount of
memory (fast RAM is likely to be more expensive) then in order to
properly implement a single bound (you proposal) you need to assume
the worst case and say that you only have as much stack space of the
smaller stack (otherwise it wouldn't be a correct bound.) This is
completely worthless to systems that want to expose really really deep
call stacks but fairly shallow floating point stacks, for example
(look up the Ackerman function again, and try to imagine attempts at
mathematical simplifications, to be able to output at least *some* of
the reasonable computable values in reasonable time.)
> >>>> - For every parameter/local variable of basic type there is a
> >>>> corresponding constant macro of type size_t:
> >>>
> >>>You missed all the implicit temporaries required. Homework
> >>>exercise: for any fixed number n, create a function which requires
> >>>at least n implicit temporaries (not explicitely declared) to
> >>>implement.
> >>
> >>Ah yes! So it seems. An interesting problem. Have to think about that
> >>for a bit.
> >
> > Why don't you think about the implication of the results of this
> > exercise with respect to your proposal while you are at it.
>
> If no way can be found to address this, we can no longer give an upper
> bound, and the proposal is dead in the water. However, I'm not conviced
> it can't be handled.
I can see once again I'll be trying to prove that the are no WMDs in
Iraq ...
> [...] For one thing, your proposal to make macro's available outside
> the function scope that give the functions stack usage upper bound
> may come in handy.
See how that worked out? You had a proposal, I found something wrong
with it, so I offered something *CONSTRUCTIVE* to try to save it,
rather than pretending it would fail because of an issue that I can
work around even if you might have missed it. Its called intellectual
honesty.
> >>>And what about the number of alloca() calls? alloca() is a common
> >>>extension which just eats extra space off the stack dynamically --
> >>>the big advantage being that you don't have to call free or
> >>>anything like it, for it to be properly cleaned up. It will be
> >>>cleaned up upon the function's return (the simplest implementation
> >>>of a "freeall".)
> >>
> >>Ok, we can add alloca'ed space is each function invocation, plus an
> >>overhead constant per active alloca.
> >
> > The alloca() calls can be in a loop. Knowing how much memory a set of
> > alloca's will perform in any function is equivalent to solving the
> > halting problem.
>
> So what? Why is this relevant? Most of the time you can still give an
> upper bound.
Perhaps you do not know the true nature of the halting problem. You
can't know if a piece of code ends, which is equivalent to knowing if
it takes one branch or another, or how many times it executes a given
code fragment. There's no shortage of open mathematical problems for
which the output of a given function is just completely unboundable,
calculatable, or estimatable (the famous "+3 on odd, /2 if even"
problem which I can't recall the details of comes to mind) with any
current knowledge.
> [...] In cases where you can't (few and far inbetween) you don't
> have the guarantee, which is a pity; the code can no longer be
> written in a way that's fully portable. It would still be much
> better than what happens now (standard-compliant programs
> segfaulting without so much as a hint in the standard).
It means you are expecting compilers to come as close to solving the
halting problem as they can. And you think variable length operators
based on a grammar is an impossible, intractible problem in
practice?!?!
> >>Unless you allow that I'm trying to define an upper bound of total
> >>use, which may be loose as far as I am concerned. The idiomatic
> >>use would be to compare this to the ULMU_TOTAL to see if it fits.
> >>This may yield the number of bytes on the smallest check for all I
> >>care. On those silly platforms, you will grossly underestimate
> >>stack capacity now and then. [...] Still beats the current
> >>situation.
> >
> > This versus just living with runtime stack checking? I'll take the
> > runtime stack checking.
>
> Then define a proposal to formalize "runtime stack checking" in
> standard-like language.
The implementation may exit or take any other defined action which halts
the execution on any function call. But it may not create undefined
behaviour from the action of performing the call itself alone.
There's a proposal that's compatible with many implementations today
(if we presume that turning stack checking *off* would be an
extension, and that leaving it on is the default conforming behavior)
and should be easy to understand and implement.
> >> [...]
> >> My opionion is determined largely by the psychological objections (to
> >> which you don't subscribe.)
>
> > Says who? You see, unlike some people, by idea isn't driven by an
> > opinion. Its driven by a consideration for design. Since you *CAN'T*
> > add just any old operator to satisfy everyone, then how do you satisfy
> > the demand for more operators? Some variation on redefinable
> > operators or operator overloading is the logical conclusion -- just
> > fix its problems (since type-index functions don't really exist in C,
> > so you just add in a bunch of them so that you don't lose any of the
> > ones you have). For all I know, I might not ever use such a feature,
> > even if it *WERE* to be endorsed by the standard.
>
> I fail to see the point you want to make here.
Just that I am trying analyze problems, and you are just spouting
opinion.
> > > My technical problem is that I can see no way that this could be
> > > implemented.
> >
> > Well I don't know what to make of this -- you can't see a race
> > condition when its staring you in the face, and you can't imagine
> > problems with knowing the size of a function's stack usage.
>
> Be careful not to presume things that are not there. [...]
Yeah, thanks for the warning. I'll be sure to purchase some duct tape
to deal with my neighbors who might be terrorists too.
The race is there. I reduced it to the standard notions that you are
supposed to learn about in an OS class. I've explained everything you
need to know to understand its there. Any research you do into the
subject will confirm what I'm saying. But I'm supposed to worry that
I might be wrong based on your flimsy non-arguments.
> > > There is a possibility of error caused by bad design (which isn't
> > > trimmed to the type of mistakes we fallible humans make) in your
> > > operator introduction scheme.
> >
> > Same is true with free form function names, of course.
>
> The risks of errors are minute, in comparison.
An opinion you will back up how ... ?
> >>>>You haven't shown me that it can be implemented in a compiler,
> >>>>even in theory. That should count for something.
> >>>
> >>>I don't think there's a credible assertion that my idea is
> >>>impossible. You want me to describe a whole compiler
> >>>implementation just to support my idea?
> >>
> >>I listed some specific objections with regards to disambiguation of
> >>the grammer that the parser would have to be able to handle.
>
> > Even without a full grammar being specified? That's a pretty good trick.
>
> Yes, it's called "abstract thinking". It's a cousin of "thought
> experiment".
Yeah, well believing that there are parsable grammars doesn't take very much
"abstract thinking", there millions of examples.
> > [smartcards]
> > Yes they do. That's the point of creating a pico-java assembly
> > language. Its so that you just have to upload a minimal java kernel to
> > it, and all of a sudden you have the whole world's Java libraries at your
> > disposal.
>
> How much would these combined libraries consume, in terms of memory on
> the smartcard? I mean, the bignum-multiply code has to reside somewhere.
If you have a widening multiply, a bignum multiply is tiny. As a WAG,
maybe 30 instructions or so? Obviously the don't *include* every
library -- I imagine they need the Java GUI for a smart card. Look
I'm not going to defend them, *they* did it, not me. Go look this stuff up
yourself:
http://www.google.com/search?hl=en&ie=UTF-
8&c2coff=1&safe=off&q=Michael+Montgomery+Java+smart+card&spell=1
> > > > It all just comes down to the carry flag. The other flags are
> > > > not g necessary for bignum libraries. Maybe there's a way to
> > > > transform the OF into something that can be used in a bignum
> > > > library, but I am not aware of it.
> > >
> > > Because, we're talking about expanding the C language here (not
> > > about your fairy-tale operator introduction hocus-pocus). You
> > > need decent semantics.
>
> > CPU companies have laid out the decent semantics. I and others have
> > written bignum routines -- the carry flag is where its at as far as
> > this is concerned.
>
> That's fine. Now here's a big idea: hardware design and general-purpose
> language design are two different things.
And so which category do you think "bignum libraries" falls into? I
mean given that every platform supports it, and that in fact an
implementation (the GMP) purports to be a software library with a
portable interface up to software, supporting just about every popular
desktop/server/workstation platform under the sun.
> >>>[...]
> >>>OF is never the important flag. Its not relevant at all.
> >>I guess that's why everything since the 4004 supports it.
>
> > That's because OF is used for <,>,<=,>= comparisons. Outside of
> > this, I have never seen OF being used for any purpose. I.e., I would
> > surmise that the only relevant usage of OF is already encoded in the
> > language semantics of C (and just about any other language with
> > integers.) But I could be wrong -- can you name at least one other
> > purpose which isn't really just an encoding of <,>,<=,>= ?
>
> Adding, subtracting, multiplying, dividing. Basically anything
> arithmetic you can do that can map the result of an operation on signed
> ints to a value that doesn't correspond to the mathematical
> interpretation of what the result should be.
No, we all understand what produces an OF, I mean, name an actual
agorithm that uses it, that can't extract the semantics from one of
<,>,<=,>=.
> >>[...]
> >>>In this case, I am just copying the x86's MUL -> EDX:EAX or
> >>>AMD64's MUL -> RDX:RAX semantics.
> >>
> >>So again (as with the carry vs. overflow), you're going to perform an
> >>instruction that works on UNsigned integers even on signed integers.
> >
> > Yes.
>
> That doesn't rhyme well with C.
Well that's right -- *bignums* don't rhyme well with C. *STRINGS*
don't rhyme well with C. *BITS* don't rhyme well with C. That
doesn't mean its not worth having them.
> [...] There are many subtle issues with
> signed/unsigned representation and operations, and the standard is
> careful to find a compromise; not assuming too much on the
> architecture side, while at the same time maintaining a sound
> semantics for both signed and unsigned operations. Your semantics
> just doesn't fit in.
There's your deep analysis again "... just doesn't fit in ...". I
just can't argue with that. Wow, I guess I am just wrong.
> >>Pardon my French, but that's just plain stupid. Perhaps you should
> >>be a little less confident about your own understanding of
> >>bit-twiddling...
>
> > Perhaps you should consider that there's a reason every assembly
> > language in existence associated signedness with particular
> > *OPERATIONS* not the *OPERANDS*.
>
> Are you aware that the C standard allows for other representations
> than 2-complement? Assume 1-complement representation for a moment
> (the standard is specifically worded to allow this).
Yes, but the semantics of what a carry is, is not crucially tied to
this. Even if you use ones complement, or something else, you can
still *compute* the carry, even if you don't automatically have it from
the built-in representation.
> > Having signed and non-signed versions of every operator may be
> > something that fit C when it was first designed, but that doesn't
> > make it necessarily correct for all relevant computations.
>
> Well, if (and that's a big if) something resembling your wide-mul
> would ever make it in the standard, I'd be disappointed if it didn't
> for instance allow me to multiply two big signed integers. It would
> be messy to only support unsigned ones.
But I didn't precisely say that. In fact losing the (un)signedness
semantics is really something you need for the carry flag idea. The
"almost like widening multiplies", that exist in C retain the sign as
you would expect, and obviously you probably wouldn't want to change
that in an extension to C.
-- Paul Hsieh http://www.pobox.com/~qed/ http://bstring.sf.net/
- Next message: Les Cargill: "Re: SOT: Turbo C 2 Question"
- Previous message: Barry Schwarz: "Re: Global initialization of function pointer array"
- In reply to: Sidney Cadot: "Re: Is C99 the final C? (some suggestions)"
- Next in thread: pete: "Re: Is C99 the final C? (some suggestions)"
- Reply: pete: "Re: Is C99 the final C? (some suggestions)"
- Reply: Richard Heathfield: "Re: Is C99 the final C? (some suggestions)"
- Reply: CBFalconer: "Using the standard (was: Is C99 the final C? (some suggestions))"
- Reply: Mark F. Haigh: "Re: Is C99 the final C? (some suggestions)"
- Reply: Johan Aurer: "Re: Is C99 the final C? (some suggestions)"
- Reply: Sidney Cadot: "Re: Is C99 the final C? (some suggestions)"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|