Re: Is C99 the final C? (some suggestions)

From: Sidney Cadot (sidney_at_jigsaw.nl)
Date: 12/12/03

  • Next message: Chris Torek: "Re: simple algorithm for finding primes"
    Date: Fri, 12 Dec 2003 21:35:00 +0100
    
    

    Paul Hsieh wrote:

    >>>Care to take a quick survey of comp.lang.c patrons who own their
    >>>own copy of the standard?
    >>
    >>Sure, I think you'd be surprised.

    > Well the few people in here who have the standard have chimed in. Now
    > how do you supposed we convince everyone else in the group to chime
    > about the fact that they don't have it? (Look up the term "push poll"
    > if you don't understand what I mean.)

    Now you're just being silly. Please feel free to start a top-level
    thread about who has access to the standard, and who doesn't.

    >>What is your excuse for not having the standard? A free draft (N869)
    >>is also available.

    > The standard does not tell me how to do the job of coding.

    That is a strange excuse. The C standard is not supposed to do that,
    that's what tutorials/books on different levels are for. The standard
    addresses the features you can (and cannot) rely on to work on a
    compiler that adheres to the standard.

    > Although it may cause the occasional portability problem, having the raw source
    > of the C library is surprisingly more useful.

    "The" C library?

    > For example, please
    > locate for me where in the standard it says that fgets() ignores '\0'
    > that are consumed -- then locate where in the standard it says its the
    > only string function which does this. Then tell me where in the
    > standard it says that strtok is not re-enterable.

    As for the re-entrancy, that's covered. As for the fgets(), please
    provide a code snippet and some behavior you didn't expect, then we can
    look why it doesn't work as you expect. It will surely be covered.

    >>You really seem a bit out of touch with current events... Since you
    >>don't seem to have access to the C99 standard, here's the relevant
    >>quote:

    > Excuse me?!?! *C99* is not portable.

    Well, that's true, given a practical definition of "portable".

    Would gcc be portable enough (according to this practical definition)
    for you? Or would we have to define portable in a way that includes the
    Microsoft compiler, or do you draw the line at C89?

    >>>>Why do you need to distinguish process instances?
    >>>
    >>>Reread the Lamport's bakery algorithm. One of the crucial steps is
    >>>"for each process, check their number, and pick the max, then add
    >>>1". Another is "for each process that has a lower number, wait in
    >>>line behind them." This means you have a way of knowing how many
    >>>processes there are, and how to index them. This means you have
    >>>to hook out or wrap whatever "spawn" mechanism you have in your
    >>>language (or use some other platform specific mechanism to iterate
    >>>through all running threads.)
    >>
    >>Could you provide a reference to the description of the algorithm
    >>you're using?
    >
    >
    > Typed it into google and felt lucky. Its the first hit. Notice
    > language like "Process i", and statements like for (j=0; j < n; j++),
    > where each j indexes a process id, and n is the number of processes.
    > I tried to think up a way of implementing that algorithm without
    > specifically requiring an enumeration of processes, but it seems that
    > that is central to the way the algorithm works. Which is only *one*
    > of the reasons why that algorithm is worthless.

    Typed -what- into google? Is it so hard to provide a link?

    I'm not doing this to piss you off... Like with many important
    algorithms, there are different formulations of the bakery algorithm
    floating around. I'll look up the original Lamport papers later on, and
    see if the process-id thing is truly essential.

    >>>Yes, but if two threads simultaneously call heap functions on the same
    >>>heap (heaps are not restricted to being distinct across threads) then
    >>>there is a race condition.
    >>
    >>So we need to put a mutex around the critical section, no big deal.

    > It *is* a big deal. Mutexes, and other similar multithreading objects
    > are not expressible is a portable way in language like C or C++.

    I already established that if you cannot portably get mutexes in C, I
    will concede. This very specific point of contention is *not* decided
    yet, in this discussion.

    >>[...] All hinges now on whether the Lamport algorithm needs process
    >>id's. Fortunately, there's no subjective element in that, we should
    >>be able to sort that out.

    > What I don't understand is how is it that *YOU* brought up the
    > algorithm, and yet you are unaware of its fundamental functionality.

    Hells bells, I used it 5 years ago. What I remember is that it can be
    used to provide mutexes without hardware support like atomic
    read/update/write cycles. At this point, I'm not sure if the process id
    stuff is essential. I'll look into it.

    > It only took me only 30 seconds to understand it.

    Well, that would surely make you quite brilliant. Took me a couple of
    hours, the first time I read it, to get my head around it, and to see
    that it is correct.

    >>>Go look up the source code for *ANY* multithreading supporting
    >>>malloc/free. I absolutely guarantee that every single one of them
    >>>uses some kind of platform specific exclusion object to make sure that
    >>>two simultaneous calls to the heap do not collide.
    >>
    >>Sure. Mutexes would suffice and I contend the bakery algorithm can
    >>provide them.

    > Well, but that's not my contention, and I would appreciate that you not put
    > words in my mouth, or follow the discussion (whichever one is causing
    > you do write things like this) --

    What's the problem?

    > my contention is that Lamport's
    > Bakery algorithm is not portable (I mean relative to C, which doesn't
    > expose and multithreading functionality, even if implementations do.)

    Yes, and for now, I contend otherwise, ...

    > Then again, once you open it up to non-portable code, of course,
    > nobody will touch Lamport's Bakery algorithm with a 10 ft pole -- its
    > worthless when you have far superior mechanisms for creating Mutexes
    > at your disposal.

    Sure. Read a few posts back, I already said that it would be an
    interesting idea to put your stuff in the standard library for
    performance reasons. I was quite ready to leave it at that, but you
    *INSISTED* that it is completely impossible to do, portably. That's the
    one and only thing we're discussing now.

    At this point, I'm not sure whether it can be done but my hunch is that
    it is possible, based on something like the Bakery algorithm. If you
    think this question is too academic, I'd be happy /not/ to spend the
    effort to get to the bottom of this. Then again, in this case, the onus
    would be on me to give at least an outline of how it could be done.

    >>[...] I feel rather silly getting dragged in this line of
    >>discussion (multithreading has little to do with the C execution
    >>model) but we'll see where this leads us.

    > Then let me review the facts for you. Multithreading is inextricably
    > intertwined with the implementation of malloc. Thus if a compiler
    > vendor wants to provide a multithreading extension to C (which they
    > pretty much all do) discussions of implementation of malloc, or
    > malloc-like functions have to at least acknowledge this as a
    > consideration. There is no such thing as a portable mutex in pure C
    > that actually encodes all the expected notions of a mutex unless tied
    > specifically to the extension mechanisms that are used to provide the
    > multitasking in the first place. The inevitable conclusion is that
    > its impossible to make a portable library for extending the heap
    > functions that are multithreading safe.

    Ok, except that "There is no such thing as a portable mutex in pure C
    that actually encodes all the expected notions of a mutex unless tied
    specifically to the extension mechanisms that are used to provide the
    multitasking in the first place" is not a fact. That's what we're
    arguing about, now.

    >>>One way or another the heap is some kind of quasi-linked list of
    >>>memory entries. You should have learned from your OS classes
    >>>(assuming you passed it and/or comprehended the material)
    >>
    >>Could you refrain from the silly ad hominems? I am trying to take you
    >>seriously, even though you seem to deny the importance of standards,
    >>and even though you are advocating an (in my opinion) confused
    >>"operator introduction" feature. Maintaining a basic level of
    >>respect helps in any discussion.
    >
    >
    > Well, if you are sensitive about being admonished in public, then can
    > I ask that you have a little bit of intellectual honesty? I am trying
    > to put forward hard core analytical ideas, and I have made efforts to
    > distill your own proposals is a very serious way. The response is not
    > supposed to be "oh but you can't prove its impossible to implement".
    > This is like Ari Fleischer telling the anti-war sympathizers that its
    > up to them to prove that there are no WMDs in Iraq. If you have a
    > problem with what I am saying why can't you limit it to an analytical
    > response?

    I acknowledge that with the portable mutex stuff, the burden of proof
    rests with me. However, you did't seem to think this was the case with
    your "operator introduction" parser, expecting to get away with the
    premise that it could surely be solved (which is not at all obvious).
    Why the double standard?

    > See the problem with your suggestion of using the Lamport's bakery
    > algorithm, is not just that it obviously doesn't satisfy the
    > conditions of the original problem (portably extending malloc/etc),

    It's not obvious. Perhaps you're right, perhaps I'm right. We'll get to
    that.

    > but that the next thing you are going to do is throw some other
    > algorithm at it that also won't work in some vain attempt to throw me
    > off yet again. You did it full well expecting that I wouldn't chase
    > it down and find the flaw.

    That's quite a strange accusation, given that we didn't establish yet
    that the bakery algorithm or something similar can't provide mutexes,
    using portable C.

    > Are you going to do it again, or are you
    > going at least take my arguments more seriously?

    I don't do cheap rhetorical tricks, and I will take seriously-looking
    arguments serious.

    > You see its not *UP* to me to analyze bogus algorithms to prove that
    > you can't implement a mutex portably. That's retarded -- even if I do,
    > its not actual proof of anything. If you have a serious contention
    > against my very technical claim, then don't you think that its at
    > least somewhat incumbant upon your to wade through the details
    > yourself?

    Will do. Will you do the same with the operator-introduction parser
    problems? Like provide a semantics that you can show not to lead to
    parsing ambiguities? Quid pro quo.

    >>>>[...] that does this will be out of luck. Not because the proposal
    >>>>wouldn't work, mind you, but because the ULMU is a very loose bound.
    >>>>But hey, it beats having no bound whatsoever.
    >>>
    >>>No it doesn't. Dealing with the stack is an area that clearly
    >>>should be platform specific. For example, it could be that an
    >>>embedded system has fast and slow ram. Certain data types might
    >>>be best suited to being mapped into fast RAM, whether they are
    >>>local or not, but you might have a lot more slow RAM, so you'd put
    >>>all the rest of the stuff in there.
    >>
    >>To what does "no it doesn't" refer?

    > The last statement "... it beats having no bound ..."

    Well, in that case: "oh yes it does!" :-)

    >>How does your example invalidate the "upper bound" idea?

    > Look, if you have two stacks, then the right way to bound it is to
    > have two bounds.

    That's a better way, that (unfortunately) seems to be rather hard to
    formalize. I am proposing to forego the difficulties be defining (for
    these rare cases) a very loose upper bound.

    > If one of your stacks has very limited amount of
    > memory (fast RAM is likely to be more expensive) then in order to
    > properly implement a single bound (you proposal) you need to assume
    > the worst case and say that you only have as much stack space of the
    > smaller stack (otherwise it wouldn't be a correct bound.)

    Yes.

    > This is completely worthless to systems that want to expose really really deep
    > call stacks but fairly shallow floating point stacks, for example

    For such architectures it would be difficult to write code that honours
    the upper bound metric. I don't think that's a problem; rather the
    opposite. It would force programmers to consider the issue.

    > (look up the Ackerman function again, and try to imagine attempts at
    > mathematical simplifications, to be able to output at least *some* of
    > the reasonable computable values in reasonable time.)

    I don't understand this, why would you have to do this?

    >>If no way can be found to address this, we can no longer give an upper
    >>bound, and the proposal is dead in the water. However, I'm not conviced
    >>it can't be handled.

    > I can see once again I'll be trying to prove that the are no WMDs in
    > Iraq ...

    I agree that this would have to be fixed. I don't see an easy way,
    except perhaps:

    >>[...] For one thing, your proposal to make macro's available outside
    >>the function scope that give the functions stack usage upper bound
    >>may come in handy.

    > See how that worked out? You had a proposal, I found something wrong
    > with it, so I offered something *CONSTRUCTIVE* to try to save it,
    > rather than pretending it would fail because of an issue that I can
    > work around even if you might have missed it. Its called intellectual
    > honesty.

    Why thank you. I think I dislike the proposition that I would not do the
    same thing, though.

    >>>The alloca() calls can be in a loop. Knowing how much memory a set of
    >>>alloca's will perform in any function is equivalent to solving the
    >>>halting problem.
    >>
    >>So what? Why is this relevant? Most of the time you can still give an
    >>upper bound.
    >
    >
    > Perhaps you do not know the true nature of the halting problem. You
    > can't know if a piece of code ends, which is equivalent to knowing if
    > it takes one branch or another, or how many times it executes a given
    > code fragment. There's no shortage of open mathematical problems for
    > which the output of a given function is just completely unboundable,
    > calculatable, or estimatable (the famous "+3 on odd, /2 if even"
    > problem which I can't recall the details of comes to mind) with any
    > current knowledge.

    It's called the Collatz problem, which asks if this list ends at 1 for
    all n. As I said, "most of the time" you can still give an upper bound,
    which is fine the vast majority of practical recursive functions. For
    example, in the count_nodes() example, you could establish that it would
    be guaranteed to work if the tree were not to exceed a certain depth (as
    expressed in a number of macro's that are available at runtime). I think
    it would be like this for 99% of practical recursive problems.

    >>[...] In cases where you can't (few and far inbetween) you don't
    >>have the guarantee, which is a pity; the code can no longer be
    >>written in a way that's fully portable. It would still be much
    >>better than what happens now (standard-compliant programs
    >>segfaulting without so much as a hint in the standard).

    > It means you are expecting compilers to come as close to solving the
    > halting problem as they can.

    No. I don't expect the compiler to do anything with this information.
    It's there for the programmer to use, for checks at runtime (e.g. at the
    start of main() in heavy recursive programs). It can't be done at
    compile time since, in general, you don't know the stack size at that point.

    > And you think variable length operators based on a grammar is an
    > impossible, intractible problem in practice?!?!

    I think it's very difficult, and it may be even impossible, yes. But
    surely, you're going to demonstrate that I am wrong? :-)

    >>>This versus just living with runtime stack checking? I'll take the
    >>>runtime stack checking.
    >>
    >>Then define a proposal to formalize "runtime stack checking" in
    >>standard-like language.

    > The implementation may exit or take any other defined action which halts
    > the execution on any function call. But it may not create undefined
    > behaviour from the action of performing the call itself alone.

    Ok, that would be an alternative way of preventing the UB. I'd much
    prefer it to the current situation, and when I think of it, it's
    probably better than my proposal which has a lot of practical problems.
    The 'stack exception' could also occur on alloca().

    > There's a proposal that's compatible with many implementations today
    > (if we presume that turning stack checking *off* would be an
    > extension, and that leaving it on is the default conforming behavior)
    > and should be easy to understand and implement.

    I agree. How's that for intellectual honesty.

    >>>>[...]
    >>>>My opionion is determined largely by the psychological objections (to
    >>>>which you don't subscribe.)
    >>
    >>>Says who? You see, unlike some people, by idea isn't driven by an
    >>>opinion. Its driven by a consideration for design. Since you *CAN'T*
    >>>add just any old operator to satisfy everyone, then how do you satisfy
    >>>the demand for more operators? Some variation on redefinable
    >>>operators or operator overloading is the logical conclusion -- just
    >>>fix its problems (since type-index functions don't really exist in C,
    >>>so you just add in a bunch of them so that you don't lose any of the
    >>>ones you have). For all I know, I might not ever use such a feature,
    >>>even if it *WERE* to be endorsed by the standard.
    >>
    >>I fail to see the point you want to make here.

    > Just that I am trying analyze problems, and you are just spouting
    > opinion.

    My point is that symbols have an order of magnitude less menmonic value
    than symbols. Well yes, you can call that an 'opinion', but it's also a
    testable hypothesis (and it has probably been tested in research). Don't
    have a reference, though. Can we just leave it at that?

    >>>Well I don't know what to make of this -- you can't see a race
    >>>condition when its staring you in the face, and you can't imagine
    >>>problems with knowing the size of a function's stack usage.

    >>Be careful not to presume things that are not there. [...]

    > Yeah, thanks for the warning. I'll be sure to purchase some duct tape
    > to deal with my neighbors who might be terrorists too.

    You don't know me, I don't know you. Probably if we met in person we
    could get along just fine, given the fact that we seem to share some
    interests.... Why is it silly to ask that you don't pass judgement from
    5000 miles away on someone whom you don't know?

    > The race is there. I reduced it to the standard notions that you are
    > supposed to learn about in an OS class. I've explained everything you
    > need to know to understand its there. Any research you do into the
    > subject will confirm what I'm saying. But I'm supposed to worry that
    > I might be wrong based on your flimsy non-arguments.

    We'll see. Have some patience... :-)

    >>>>There is a possibility of error caused by bad design (which isn't
    >>>>trimmed to the type of mistakes we fallible humans make) in your
    >>>>operator introduction scheme.
    >>>
    >>>Same is true with free form function names, of course.
    >>
    >>The risks of errors are minute, in comparison.

    > An opinion you will back up how ... ?

    I won't back it up. It's an opinion that could easily be turned into
    fact, though. Here's an experiment for you:

    count_nodes counts the nodes in the tree
    @* do a matrix multiply
    insert_element insert element in the list
    ?> take the maximum
    is_prime determine if number is prime
    =%= see if two numbers are in the same residue class
    log take the natural logarithm
    <#> count the elements in the list
    event_handler_poll poll the event handler
    <^ left-associative power
    mutexP take the mutex
    %# calculate a percentage

    ...Now turn away from the screen, and have somebody read the
    _descriptions_. Reproduce the function name / operator symbol from
    memory. How many operators did you get? How many symbols? Let's have a
    bit of this intellectual honesty! :-)

    >>>>I listed some specific objections with regards to disambiguation of
    >>>>the grammer that the parser would have to be able to handle.
    >>
    >>>Even without a full grammar being specified? That's a pretty good trick.
    >>
    >>Yes, it's called "abstract thinking". It's a cousin of "thought
    >>experiment".

    > Yeah, well believing that there are parsable grammars doesn't take very much
    > "abstract thinking", there millions of examples.

    Sure, but the question is if the operator introduction feature that you
    loosely proposed will be guaranteed to yield an unambiguous parser, when
    you make the rules for operator intruduction more concrete.

    >> [snipped the smartcard stuff]

    >>>CPU companies have laid out the decent semantics. I and others have
    >>>written bignum routines -- the carry flag is where its at as far as
    >>>this is concerned.
    >>
    >>That's fine. Now here's a big idea: hardware design and general-purpose
    >>language design are two different things.

    > And so which category do you think "bignum libraries" falls into?

    Neither, but if you put a gun to my head I'd say the former.

    > I mean given that every platform supports it, and that in fact an
    > implementation (the GMP) purports to be a software library with a
    > portable interface up to software, supporting just about every popular
    > desktop/server/workstation platform under the sun.

    Hmmmm. Still neither, I'm afraid.

    >>>>>OF is never the important flag. Its not relevant at all.
    >>>>
    >>>>I guess that's why everything since the 4004 supports it.
    >>
    >>>That's because OF is used for <,>,<=,>= comparisons. Outside of
    >>>this, I have never seen OF being used for any purpose. I.e., I would
    >>>surmise that the only relevant usage of OF is already encoded in the
    >>>language semantics of C (and just about any other language with
    >>>integers.) But I could be wrong -- can you name at least one other
    >>>purpose which isn't really just an encoding of <,>,<=,>= ?
    >>
    >>Adding, subtracting, multiplying, dividing. Basically anything
    >>arithmetic you can do that can map the result of an operation on signed
    >>ints to a value that doesn't correspond to the mathematical
    >>interpretation of what the result should be.

    > No, we all understand what produces an OF,
    > I mean, name an actual agorithm that uses it, that can't extract the
    > semantics from one of <,>,<=,>=.

    What is that supposed to prove? The same is true for your carry:

    /* adding with carry, presuming 2-complement repr. */

    unsigned a, b, result;
    int carry;

    result = a+b;
    carry = a>(UINT_MAX-b);

    /******/

    Here's an algorithm that uses overflow:

    int add_signed(int a, int b, int *overflow);

    {
       int a,b,r,ov;

       get_ab(&a, &b);
       r = add_signed(a, b, &ov);
       if (ov)
         printf("sorry, can't add those.\n");
       else
         printf("result = %d\n", r);
    }

    >>>>So again (as with the carry vs. overflow), you're going to perform an
    >>>>instruction that works on UNsigned integers even on signed integers.
    >>>
    >>>Yes.
    >>
    >>That doesn't rhyme well with C.
    >
    > Well that's right -- *bignums* don't rhyme well with C. *STRINGS*
    > don't rhyme well with C. *BITS* don't rhyme well with C. That
    > doesn't mean its not worth having them.

    But the point is: C doesn't have (real) strings. C doesn't provide
    bignums. It provides rather mundane support for bit-twiddling, but only
    on a per-byte-or-more basis. That's the essence of my "rhyme" claim:
    things that collide with the core language shouldn't be forced in there,
    just out of convenience for some specific application.

    >>[...] There are many subtle issues with
    >>signed/unsigned representation and operations, and the standard is
    >>careful to find a compromise; not assuming too much on the
    >>architecture side, while at the same time maintaining a sound
    >>semantics for both signed and unsigned operations. Your semantics
    >>just doesn't fit in.

    > There's your deep analysis again "... just doesn't fit in ...". I
    > just can't argue with that. Wow, I guess I am just wrong.

    Perhaps you should download the standard and read it (especially the
    naughty parts on signed and unsigned integers), to see what I mean.

    >>Are you aware that the C standard allows for other representations
    >>than 2-complement? Assume 1-complement representation for a moment
    >>(the standard is specifically worded to allow this).
    >
    >
    > Yes, but the semantics of what a carry is, is not crucially tied to
    > this. Even if you use ones complement, or something else, you can
    > still *compute* the carry, even if you don't automatically have it from
    > the built-in representation.

    Hold on a minute... Where's your famed "intellectual honesty" now? You
    made a claim that I disproved, and now you're just moving along as if it
    didn't happen? At least an acknowledgement would be nice.

    To address your point though.

    I'm really interested to see how you would write a big number multiply
    that doesn't presuppose 2-complement, but works in 1-complement as well.
    For the sake of argument, feel free to introduce a wide-mul operation
    (but let's have a semantics for it... The x86 MUL obviously won't do).

    >>>Having signed and non-signed versions of every operator may be
    >>>something that fit C when it was first designed, but that doesn't
    >>>make it necessarily correct for all relevant computations.
    >>
    >>Well, if (and that's a big if) something resembling your wide-mul
    >>would ever make it in the standard, I'd be disappointed if it didn't
    >>for instance allow me to multiply two big signed integers. It would
    >>be messy to only support unsigned ones.
    >
    >
    > But I didn't precisely say that. In fact losing the (un)signedness
    > semantics is really something you need for the carry flag idea. The
    > "almost like widening multiplies", that exist in C retain the sign as
    > you would expect, and obviously you probably wouldn't want to change
    > that in an extension to C.

    Ok.

    Best regards,

       Sidney


  • Next message: Chris Torek: "Re: simple algorithm for finding primes"

    Relevant Pages

    • Re: Is C99 the final C? (some suggestions)
      ... Well the few people in here who have the standard have chimed in. ... it may cause the occasional portability problem, ... >> Reread the Lamport's bakery algorithm. ... Dealing with the stack is an area that clearly ...
      (comp.lang.c)
    • Re: E96 Series Computation
      ... >>>3) multiply remaining fraction by 64 ... >>>absolute closest standard value to arbitrary input R are hopelessly lost ... >>>as noise compared to tolerance. ... >> I guess I don't understand what your algorithm is supposed to do. ...
      (sci.electronics.design)
    • Re: AES with SslStream
      ... I can't remember which version of Windows is supposed to get that support, ... my understanding is that the AES types are still ... Aes The Advanced Encryption Standard algorithm. ...
      (microsoft.public.dotnet.security)
    • EFS and System Cryptography Group Policy - Windows XP SP2
      ... Windows XP uses the Data Encryption Standard algorithm with a 56-bit ...
      (microsoft.public.windowsxp.security_admin)
    • Re: a rand array
      ... the standard deviation of the sample mean will be the standard ... sequences as a naive "fill and sort" algorithm, ... A similar call to a naive "fill and sort" algorithm gives: ...
      (comp.lang.c)

    Loading